Quick Start

Let's set up a simple pipeline that extracts posts from a public API and loads them into Postgres.

Quick Setup

Get your first pipeline running in under 5 minutes!

Step 1: Create Configuration

Create a pipelines.yaml file to define your source (API) and target (DB).

defaults:
  pagination:
    kind: page_number
    page_param: _page

sources:
  json_posts:
    url: https://jsonplaceholder.typicode.com/posts
    headers:
      - key: User-Agent
        value: ApiTap/1.0

targets:
  - name: postgres_db
    type: postgres
    host: localhost
    database: mydb
    auth:
      username: "${PG_USER}"
      password: "${PG_PASS}"

Pro Tip

Environment variables like ${PG_USER} are automatically substituted at runtime.
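Export the referenced variables in your shell before launching the pipeline (the values below are placeholders):

export PG_USER=myuser
export PG_PASS=mypassword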

Step 2: Create SQL Transformation

Create a posts.sql file. This is where you tell ApiTap what data to fetch and how to transform it; the sink directive at the top routes the query's output to the target defined in pipelines.yaml.

{{ sink(name="postgres_db") }}

SELECT
    id,
    title,
    body
FROM json_posts

SQL Power

Use full DataFusion SQL including WHERE, JOIN, and aggregate functions!
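As a sketch of what that enables, the query below counts posts per author over the same source. The "userId" and id columns come from jsonplaceholder's posts schema; the quotes around "userId" preserve the camelCase name, since DataFusion lowercases unquoted identifiers. Adjust the columns for your own API.

{{ sink(name="postgres_db") }}

SELECT
    "userId",
    COUNT(*) AS post_count
FROM json_posts
WHERE id <= 50
GROUP BY "userId"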

Step 3: Run the Pipeline

Mount your working directory into the container and run the Docker image. The --network host flag lets the container reach the Postgres instance on localhost.

docker run -it --rm \
  --network host \
  -v $(pwd):/app \
  devasm/apitap:latest \
  -y pipelines.yaml
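When the run finishes, you can spot-check the load with psql. The table name json_posts below is an assumption for illustration; check the pipeline output for the table ApiTap actually created:

psql -h localhost -U "$PG_USER" -d mydb \
  -c 'SELECT COUNT(*) FROM json_posts;'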

What Happens Next?

1. Connects to the API: fetches data with automatic pagination.
2. Processes in memory: streams batches so RAM use stays under 100 MB, regardless of dataset size.
3. Applies SQL: transforms the data with your SQL queries.
4. Streams to the DB: inserts rows into Postgres in real time.