Introduction

Welcome to the ApiTap documentation.

ApiTap is a high-performance Streaming Batch engine built in Rust. It allows you to extract data from any REST or GraphQL API and stream it directly to your database without writing custom scripts.

Data Pipeline

API Request → Processing → Transform → Database

Core Philosophy

  1. Streaming Batch: We fetch data in concurrent batches and stream records to the sink as they arrive, rather than waiting for the full result set.
  2. Zero Memory: We never buffer the full dataset, so RAM usage stays flat (~25MB) regardless of dataset size.
  3. Declarative: Pipelines are defined in YAML, not Python code.
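
The streaming-batch idea above can be illustrated with a minimal Python sketch. This is not ApiTap's implementation (the engine is written in Rust); `fetch_records`, `stream_batches`, and `run` are hypothetical names, and the API is simulated by a generator. The sketch only shows why memory stays flat: at most one batch of records is held at any time.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple


def fetch_records(total: int) -> Iterator[Dict]:
    """Hypothetical stand-in for a paginated API: yields one record at a time."""
    for i in range(total):
        yield {"id": i, "value": i * 2}


def stream_batches(records: Iterable, batch_size: int) -> Iterator[List]:
    """Group a record stream into fixed-size batches without materializing it."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def run(total: int, batch_size: int) -> Tuple[int, int]:
    """Consume every batch; return (records seen, largest batch held at once)."""
    seen = 0
    peak = 0
    for batch in stream_batches(fetch_records(total), batch_size):
        seen += len(batch)
        peak = max(peak, len(batch))
    return seen, peak
```

Calling `run(1000, 64)` returns `(1000, 64)`: all 1000 records pass through, but no more than 64 are ever in memory together.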

Key Features

Lightning Fast

Process 250MB in 20 seconds using just 0.5 CPU and 256MB RAM

Zero Copy Memory

Apache Arrow-based processing with vectorized execution

SQL Transforms

Use DataFusion SQL to filter, join, and transform data on the fly
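
As a rough illustration of a per-batch SQL transform, the Python sketch below filters one batch with a SQL query. It uses the standard-library `sqlite3` module as a stand-in, not DataFusion, so the SQL dialect and execution model differ from ApiTap's. The table name `batch`, the column names, and the `transform_batch` helper are all assumptions made for this example.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple


def transform_batch(batch: Iterable[Dict], sql: str) -> List[Tuple]:
    """Load one batch into a throwaway in-memory table and run `sql` over it."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema; a real pipeline would derive it from the API payload.
        conn.execute("CREATE TABLE batch (id INTEGER, status TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO batch (id, status, amount) VALUES (:id, :status, :amount)",
            batch,
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


batch = [
    {"id": 1, "status": "paid", "amount": 40.0},
    {"id": 2, "status": "void", "amount": 15.0},
    {"id": 3, "status": "paid", "amount": 5.5},
]
rows = transform_batch(
    batch, "SELECT id, amount FROM batch WHERE status = 'paid' ORDER BY id"
)
```

Here `rows` is `[(1, 40.0), (3, 5.5)]`: the voided record is dropped before anything reaches the sink.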

Multiple Sinks

Stream to PostgreSQL, MySQL, files, or custom destinations

Getting Started

To get started with ApiTap, we recommend using our official Docker image.

docker pull devasm/apitap:latest

Check out the Quick Start guide to run your first pipeline in under 5 minutes.

Architecture Overview

ApiTap's architecture is a linear pipeline: a concurrent fetcher, a SQL transform, and a stream writer sit between the source API and the database:

graph LR
    A[REST/GraphQL API] --> B[Concurrent Fetcher]
    B --> C[SQL Transform]
    C --> D[Stream Writer]
    D --> E[(Database)]

The engine is designed to maintain constant memory usage regardless of dataset size, which makes it well suited to ETL workloads where tools that load the full dataset into memory would need far larger machines.
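
The stages in the diagram can be approximated with chained Python generators. This is a sketch under stated assumptions, not ApiTap's code: `fetch_page` simulates an API call, a plain Python filter stands in for the SQL Transform stage, and an in-memory SQLite database stands in for the target database. All function names are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Iterator, List, Tuple


def fetch_page(page: int) -> List[Dict]:
    """Hypothetical API call: returns three records for the given page."""
    return [{"id": page * 3 + i, "score": (page * 3 + i) % 5} for i in range(3)]


def concurrent_fetcher(pages: int, workers: int) -> Iterator[List[Dict]]:
    """Fetch pages in windows of `workers`; only one window is in flight at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, pages, workers):
            window = range(start, min(start + workers, pages))
            # pool.map returns results in page order within the window.
            for batch in pool.map(fetch_page, window):
                yield batch


def transform(batches: Iterable[List[Dict]]) -> Iterator[List[Dict]]:
    """Stand-in for the SQL Transform stage: keep records with score >= 2."""
    for batch in batches:
        kept = [r for r in batch if r["score"] >= 2]
        if kept:
            yield kept


def stream_writer(conn: sqlite3.Connection, batches: Iterable[List[Dict]]) -> int:
    """Write each batch as soon as it arrives; return the number of rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, score INTEGER)"
    )
    written = 0
    for batch in batches:
        conn.executemany(
            "INSERT INTO records (id, score) VALUES (:id, :score)", batch
        )
        conn.commit()
        written += len(batch)
    return written


def run_pipeline(pages: int = 10, workers: int = 4) -> Tuple[int, int]:
    """Wire the stages together; return (rows written, rows stored)."""
    conn = sqlite3.connect(":memory:")
    try:
        written = stream_writer(conn, transform(concurrent_fetcher(pages, workers)))
        stored = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
        return written, stored
    finally:
        conn.close()
```

Because every stage is a generator, a batch flows from fetcher to writer before the next window of pages is requested, so the amount of data held in memory is bounded by the window size rather than by the dataset size.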