Fetchers & Pagination

The Fetcher module is the heart of ApiTap's ingestion engine. It handles HTTP requests, concurrency, and intelligent pagination.

Architecture

PaginatedFetcher

  • Manages HTTP requests, pagination, and concurrency
  • Supports 20+ concurrent page fetches
  • Handles rate limits automatically

Pagination Strategies

  1. Limit / Offset (SQL Style)
  2. Page Number (REST Standard)
  3. Page Only (Simple)
  4. Cursor (Infinite Scroll)

Supported Strategies

  • Limit / Offset: standard SQL-style pagination
  • Page Number: common REST API pagination
  • Page Only: simple page-based fetching
  • Cursor Based: infinite scroll pagination

1. Limit / Offset

SQL-Style Pagination

ApiTap fetches limit/offset pages in parallel.

Config:

yaml
pagination:
  kind: limit_offset
  limit_param: "limit"
  offset_param: "offset"
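
To illustrate how the two parameter names above end up in requests, here is a minimal Python sketch. It is not ApiTap's implementation: the helper `limit_offset_params` is hypothetical, and it assumes the total row count is already known, which the configuration above does not specify.

```python
def limit_offset_params(total, limit, limit_param="limit", offset_param="offset"):
    """Return one query-parameter dict per page needed to cover `total` rows.

    Hypothetical helper for illustration; `total` is assumed to be known.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [
        {limit_param: limit, offset_param: offset}
        for offset in range(0, total, limit)
    ]


# 250 rows at 100 rows per page need three requests: offsets 0, 100, 200.
pages = limit_offset_params(total=250, limit=100)
```

Because every offset can be computed up front, the requests do not depend on each other, which is what makes parallel fetching possible for this strategy.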

2. Page Number

REST API Standard

ApiTap fetches page 1 first, then fetches the remaining pages in parallel.

yaml
pagination:
  kind: page_number
  page_param: "page"
  format: "page_only" # or page_per_page
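
The "page 1 first, then the rest in parallel" pattern can be sketched in Python with `asyncio`. This is an illustration under stated assumptions, not ApiTap's code: `fetch_page` is a hypothetical stand-in for an HTTP call, and the `total_pages` field in its response is assumed for the example.

```python
import asyncio


async def fetch_page(page):
    """Hypothetical stand-in for an HTTP call; returns fake rows and a page count."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"total_pages": 4, "rows": [f"row-{page}-{i}" for i in range(2)]}


async def fetch_all():
    first = await fetch_page(1)  # page 1 tells us how many pages exist (assumed field)
    total_pages = first["total_pages"]
    # The remaining pages are independent of each other, so request them concurrently.
    rest = await asyncio.gather(
        *(fetch_page(p) for p in range(2, total_pages + 1))
    )
    rows = list(first["rows"])
    for page in rest:  # gather returns results in the order the calls were passed
        rows.extend(page["rows"])
    return rows


all_rows = asyncio.run(fetch_all())
```

Fetching the first page alone costs one sequential round trip, but it lets the fetcher learn the page count before fanning out.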

Performance Tips

Concurrency

By default, ApiTap uses conservative concurrency. You can tune this:

  • 1-5: for rate-limited APIs
  • 10-20: for high-throughput internal APIs
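
As a rough illustration of what a concurrency limit does, the Python sketch below caps in-flight requests with `asyncio.Semaphore`. The function `fetch_with_limit` and the simulated request are hypothetical; how ApiTap enforces its limit internally is not described here.

```python
import asyncio


async def fetch_with_limit(pages, max_concurrency):
    """Fetch every page while never running more than `max_concurrency` at once."""
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def fetch(page):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while max_concurrency fetches are active
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the HTTP round trip
            in_flight -= 1
            return page

    results = await asyncio.gather(*(fetch(p) for p in pages))
    return results, peak


# 20 pages with a limit of 5, a value from the rate-limited range above.
results, peak = asyncio.run(fetch_with_limit(range(1, 21), max_concurrency=5))
```

A lower limit trades throughput for fewer simultaneous requests against the remote API, which is why the smaller range suits rate-limited endpoints.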

Memory Management

Size the internal buffer to match your workload: a larger buffer favors throughput, while a smaller buffer reduces memory per pipeline.

Optimization: For 1000+ internal pipelines, lower the internal buffer size to 256 items to save RAM.

Buffer Size       Memory per Pipeline   Use Case
8192 (Default)    ~8 MB                 High throughput, few jobs
256 ✓             ~256 KB               Massive concurrency
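
The arithmetic behind this recommendation can be checked with a short Python sketch. It assumes roughly 1 KB per buffered item, which is only implied by the figures above (8192 items at about 8 MB); actual item sizes depend on your data, and `total_buffer_mb` is a hypothetical helper.

```python
BYTES_PER_ITEM = 1024  # assumption: ~1 KB per item, implied by 8192 items ≈ 8 MB


def total_buffer_mb(pipelines, buffer_items, bytes_per_item=BYTES_PER_ITEM):
    """Estimate total buffer memory in MB across all pipelines."""
    return pipelines * buffer_items * bytes_per_item / (1024 * 1024)


default_mb = total_buffer_mb(1000, 8192)  # 8000.0 MB, roughly 8 GB
small_mb = total_buffer_mb(1000, 256)     # 250.0 MB
```

Under that assumption, 1000 pipelines at the default buffer size would reserve about 8 GB for buffers alone, compared with about 250 MB at 256 items.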