Extract | Starlake

📄️ Extract Data from Databases with Starlake

Learn how to extract tables from databases as files using Starlake. Step-by-step tutorial covering connection setup, schema selection, and data extraction.

📄️ Extract Data Using Custom SQL Queries

Extract data from JDBC databases using custom SQL queries in Starlake. Configure joins, filters, and aggregations in YAML. Includes CLI examples for extract-data and extract-schema.

📄️ Incremental Data Extraction with Starlake

Configure incremental extraction in Starlake to pull only new rows from a database. Set partitionColumn and fullExport in YAML. State is tracked in the SL_LAST_EXPORT audit table.
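The idea behind incremental extraction can be sketched generically as a "high-water mark" pattern: remember the largest value of the tracking column seen so far and request only rows above it on the next run. The sketch below is not Starlake's implementation; it uses an in-memory SQLite database, and the `orders` table and `extract_incremental` helper are hypothetical names chosen for illustration. In Starlake the equivalent state is kept in the SL_LAST_EXPORT audit table rather than in a local variable.

```python
import sqlite3


def extract_incremental(conn, table, partition_column, last_value):
    """Return rows whose partition_column exceeds last_value,
    together with the new high-water mark.

    Illustrative only: table and column names are interpolated directly,
    so they must come from trusted configuration, never from user input.
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {partition_column} > ? "
        f"ORDER BY {partition_column}"
    )
    cur = conn.execute(query, (last_value,))
    columns = [d[0] for d in cur.description]
    idx = columns.index(partition_column)
    rows = cur.fetchall()
    # If nothing new arrived, keep the previous mark unchanged.
    new_last = rows[-1][idx] if rows else last_value
    return rows, new_last


# Hypothetical source table used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0)],
)

# First run: no previous state, so start from 0 and pull everything.
first_rows, mark = extract_incremental(conn, "orders", "id", 0)

# A new row arrives; the second run pulls only that row.
conn.execute("INSERT INTO orders VALUES (4, 40.0)")
second_rows, mark2 = extract_incremental(conn, "orders", "id", mark)
```

The second call returns only the row inserted after the first run, which is the behavior the incremental mode is meant to provide.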

📄️ Parallel Data Extraction with Starlake (numPartitions)

Speed up Starlake database extraction using parallel mode. Configure numPartitions and partitionColumn for concurrent JDBC reads. Compatible with incremental extraction.
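As a rough illustration of how a partition column and a partition count can drive concurrent reads, the sketch below splits a numeric key range into contiguous, non-overlapping slices, each of which could be read by its own JDBC query. This is an assumption about the general technique, not a description of how Starlake computes its partitions; `partition_bounds`, `partition_queries`, and the `orders` table are hypothetical names.

```python
def partition_bounds(lower, upper, num_partitions):
    """Split the inclusive range [lower, upper] into at most
    num_partitions contiguous, non-overlapping (start, end) ranges."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if upper < lower:
        return []
    total = upper - lower + 1
    # Never create more partitions than there are distinct values.
    parts = min(num_partitions, total)
    base, extra = divmod(total, parts)
    bounds = []
    start = lower
    for i in range(parts):
        # The first `extra` partitions take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        bounds.append((start, end))
        start = end + 1
    return bounds


def partition_queries(table, partition_column, bounds):
    """Build one SELECT statement per partition range."""
    return [
        f"SELECT * FROM {table} "
        f"WHERE {partition_column} BETWEEN {lo} AND {hi}"
        for lo, hi in bounds
    ]


bounds = partition_bounds(1, 10, 4)
queries = partition_queries("orders", "id", bounds)
```

Each generated query covers a disjoint slice of the key range, so the slices can be fetched concurrently and concatenated without duplicates.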

📄️ Monitor Starlake Data Extractions (SL_LAST_EXPORT Audit Table)

Monitor Starlake data extractions using the SL_LAST_EXPORT audit table. Query extraction status, detect failures, and track row counts. Includes SQL examples and audit configuration.

📄️ Database-Specific Extraction Settings: DB2, Oracle, SQL Server

Configure Starlake schema extraction for DB2, Oracle, and SQL Server. Extract column and table comments using columnRemarks and tableRemarks SQL queries with template placeholders.

📄️ Extract Starlake Schemas from OpenAPI Definitions (REST API to Tables)

Map OpenAPI routes and schemas to Starlake domains and tables using YAML configuration. Supports route filtering, schema exclusion, explode strategies, and name normalization.

📄️ Extract Data from REST APIs with Starlake

Learn how to extract data from any REST API using Starlake. Covers authentication, pagination, rate limiting, incremental extraction, and parent-child endpoints.