Quick Start
Let's set up a simple pipeline that extracts posts from a public API and loads them into Postgres.
Quick Setup
Get your first pipeline running in under 5 minutes!
1. Create Configuration
Create a pipelines.yaml file to define your source (API) and target (DB).
```yaml
defaults:
  pagination:
    kind: page_number
    page_param: _page

sources:
  json_posts:
    url: https://jsonplaceholder.typicode.com/posts
    headers:
      - key: User-Agent
        value: ApiTap/1.0

targets:
  - name: postgres_db
    type: postgres
    host: localhost
    database: mydb
    auth:
      username: "${PG_USER}"
      password: "${PG_PASS}"
```
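Given the `page_number` pagination defaults above, ApiTap pages through the endpoint by incrementing the `_page` query parameter, so the requests should look roughly like this (illustrative; the exact request shape depends on the source implementation):

```text
https://jsonplaceholder.typicode.com/posts?_page=1
https://jsonplaceholder.typicode.com/posts?_page=2
https://jsonplaceholder.typicode.com/posts?_page=3
```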
Pro Tip
Environment variables like ${PG_USER} are automatically substituted at runtime.
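For example, export the two variables referenced in the config before launching the pipeline (the values below are placeholders):

```bash
# Placeholder values -- substitute your real Postgres credentials
export PG_USER=postgres
export PG_PASS=changeme
```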
2. Create SQL Transformation
Create a posts.sql file. This is where you tell ApiTap what data to fetch and how to transform it.
```sql
{{ sink(name="postgres_db") }}
SELECT
    id,
    title,
    body
FROM json_posts
```
SQL Power
Use full DataFusion SQL including WHERE, JOIN, and aggregate functions!
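For instance, a variant of posts.sql could filter rows and aggregate posts per author. This is a sketch, not ApiTap's documented behavior: it assumes the upstream `userId` JSON field is exposed as a case-sensitive column, hence the double quotes (DataFusion lowercases unquoted identifiers).

```sql
{{ sink(name="postgres_db") }}
SELECT
    "userId"  AS user_id,     -- assumed column name, mirroring the JSON key
    count(*)  AS post_count   -- aggregate per author
FROM json_posts
WHERE id > 10                 -- filter before loading
GROUP BY "userId"
```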
3. Run the Pipeline
Mount your working directory and run the ApiTap Docker image.
```bash
docker run -it --rm \
  --network host \
  -v $(pwd):/app \
  devasm/apitap:latest \
  -y pipelines.yaml
```
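Note that the container has its own environment: if `PG_USER` and `PG_PASS` are set on the host (see the Pro Tip above), forward them into the container with Docker's `-e` flag:

```bash
docker run -it --rm \
  --network host \
  -e PG_USER -e PG_PASS \
  -v $(pwd):/app \
  devasm/apitap:latest \
  -y pipelines.yaml
```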
What Happens Next?
1. Connects to API
Fetches data with automatic pagination
2. Processes In Memory
Streams data through memory, keeping RAM usage under 100MB regardless of dataset size
3. Applies SQL
Transforms data using your SQL queries
4. Streams to DB
Inserts directly to Postgres in real-time
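Once the run finishes, you can spot-check the load with `psql`. The table name `posts` is an assumption here; the actual name depends on how your sink is configured.

```bash
# Assumes the sink wrote to a table named "posts" in mydb
psql -h localhost -d mydb -U "$PG_USER" -c 'SELECT count(*) FROM posts;'
```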
