Fetchers & Pagination
The Fetcher module is the heart of ApiTap's ingestion engine. It handles HTTP requests, concurrency, and intelligent pagination.
Architecture
PaginatedFetcher
- Manages HTTP requests, pagination, and concurrency
- Supports 20+ concurrent page fetches
- Handles rate limits automatically
Pagination Strategies
- Limit / Offset (SQL Style)
- Page Number (REST Standard)
- Page Only (Simple)
- Cursor (Infinite Scroll)
Supported Strategies
- Limit / Offset: Standard SQL-style pagination
- Page Number: Common REST API pagination
- Page Only: Simple page-based fetching
- Cursor Based: Infinite scroll pagination
1. Limit / Offset
SQL-Style Pagination
ApiTap fetches pages in parallel
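Parallel fetching works with limit/offset because every page's offset can be computed up front. The sketch below illustrates the idea only; it is not ApiTap's implementation. `fetch_page`, `page_offsets`, and `fetch_all_offsets` are hypothetical names, the "API" is an in-memory list, and it assumes the total record count is already known.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "API" holding 95 records.
RECORDS = list(range(95))

def fetch_page(limit: int, offset: int) -> list[int]:
    """Stand-in for GET /items?limit=<limit>&offset=<offset>."""
    return RECORDS[offset:offset + limit]

def page_offsets(total: int, limit: int) -> list[int]:
    """Offsets of every page needed to cover `total` records."""
    return list(range(0, total, limit))

def fetch_all_offsets(total: int, limit: int, max_workers: int = 4) -> list[int]:
    """Fetch all pages concurrently and return the rows in offset order."""
    offsets = page_offsets(total, limit)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, even if requests finish out of order.
        pages = pool.map(lambda off: fetch_page(limit, off), offsets)
        return [row for page in pages for row in page]
```

Because each request is independent of the others, the worker count can be raised or lowered without changing the result.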
Config:
```yaml
pagination:
  kind: limit_offset
  limit_param: "limit"
  offset_param: "offset"
```

2. Page Number
REST API Standard
Fetches page 1 first, then parallelizes remaining pages
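The "page 1 first" pattern can be sketched as follows. This is an illustration under assumptions, not ApiTap's code: it assumes the first response reports how many pages exist (the `total_pages` field is hypothetical), and it uses an in-memory list in place of a real endpoint.

```python
import math
from concurrent.futures import ThreadPoolExecutor

ITEMS = [f"item-{i}" for i in range(45)]
PER_PAGE = 10

def fetch_numbered_page(page: int) -> dict:
    """Stand-in for GET /items?page=<page>; pages are 1-based."""
    start = (page - 1) * PER_PAGE
    return {
        "data": ITEMS[start:start + PER_PAGE],
        # Hypothetical metadata field: real APIs name and expose this differently.
        "total_pages": math.ceil(len(ITEMS) / PER_PAGE),
    }

def fetch_all_numbered(max_workers: int = 4) -> list[str]:
    """Fetch page 1 to learn the page count, then fetch pages 2..N concurrently."""
    first = fetch_numbered_page(1)
    rows = list(first["data"])
    remaining = range(2, first["total_pages"] + 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields pages in request order, so rows stay sorted by page.
        for page in pool.map(fetch_numbered_page, remaining):
            rows.extend(page["data"])
    return rows
```

The first request has to be sequential because the page count is unknown until it returns; everything after it can run in parallel.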
```yaml
pagination:
  kind: page_number
  page_param: "page"
  format: "page_only" # or page_per_page
```

Performance Tips
Concurrency
By default, ApiTap uses conservative concurrency. You can tune this:
- 1-5: For rate-limited APIs
- 10-20: For high-throughput internal APIs
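To show what a concurrency limit does in practice, here is a minimal sketch that caps in-flight requests with a semaphore. It does not use ApiTap's configuration or internals; `fetch_limited`, `fetch_with_limit`, and `run` are hypothetical names, and a short sleep stands in for the HTTP round trip.

```python
import asyncio

async def fetch_limited(page: int, limiter: asyncio.Semaphore, stats: dict) -> int:
    """Simulated page fetch that only runs while holding a semaphore slot."""
    async with limiter:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for the HTTP round trip
        stats["in_flight"] -= 1
        return page

async def fetch_with_limit(pages: range, concurrency: int) -> tuple[list[int], int]:
    """Fetch all pages, never exceeding `concurrency` simultaneous requests."""
    limiter = asyncio.Semaphore(concurrency)
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(*(fetch_limited(p, limiter, stats) for p in pages))
    return list(results), stats["peak"]

def run(pages: range, concurrency: int) -> tuple[list[int], int]:
    """Synchronous entry point returning (results, peak concurrent requests)."""
    return asyncio.run(fetch_with_limit(pages, concurrency))
```

Calling `run(range(1, 21), 5)` fetches 20 pages while keeping at most 5 requests in flight, which matches the lower range suggested for rate-limited APIs.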
Memory Management
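Size the internal buffer to match your workload: a larger buffer favors throughput when you run a few jobs, while a smaller buffer reduces memory use per pipeline when you run many.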
Optimization: For 1000+ internal pipelines, lower the internal buffer size to 256 items to save RAM.
| Buffer Size (items) | Memory per Pipeline | Use Case |
|---|---|---|
| 8192 (Default) | ~8 MB | High throughput, few jobs |
| 256 ✓ | ~256 KB | Massive concurrency |
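The table's figures imply roughly 1 KB per buffered item (8192 items at about 8 MB). Under that assumption, the short calculation below estimates total buffer memory for 1000 pipelines. The per-item size and the helper name `total_buffer_bytes` are illustrative, not values taken from ApiTap.

```python
# Assumption: ~1 KB per buffered item, inferred from "8192 items ~ 8 MB".
BYTES_PER_ITEM = 1024

def total_buffer_bytes(pipelines: int, buffer_items: int) -> int:
    """Estimated buffer memory when every pipeline has a full buffer."""
    return pipelines * buffer_items * BYTES_PER_ITEM

def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 * 1024)

default_total = total_buffer_bytes(1000, 8192)  # default buffer size
reduced_total = total_buffer_bytes(1000, 256)   # reduced buffer size

print(f"default: {to_mib(default_total):.0f} MiB")
print(f"reduced: {to_mib(reduced_total):.0f} MiB")
```

With these assumptions, 1000 pipelines need about 8000 MiB at the default size and about 250 MiB at 256 items, a 32x reduction. Actual usage depends on real item sizes and on how full the buffers get.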
