Core concepts
Data Lake
Topics that are backed by an Avro, JSON, or Protocol Buffers schema can also be written as:
- Apache Iceberg
- Delta Lake
- Parquet
Tansu automatically maps each type used in the schema to an equivalent Apache Arrow type. Apache Parquet is used as a mezzanine (intermediate) format, with minor changes to accommodate either Apache Iceberg or Delta Lake.
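To make the type mapping more concrete, here is a minimal sketch of how the primitive types of a flat Avro record could be paired with Arrow type names. The table `AVRO_TO_ARROW`, the helper `arrow_fields`, and the `PERSON_SCHEMA` record are all hypothetical and exist only for illustration. The pairings follow the obvious width and kind of each type and are an assumption, not a copy of Tansu's actual mapping.

```python
# Illustrative mapping from Avro primitive type names to Arrow type names.
# Assumption: a plausible pairing chosen for this sketch, not taken from Tansu.
AVRO_TO_ARROW = {
    "null": "Null",
    "boolean": "Boolean",
    "int": "Int32",
    "long": "Int64",
    "float": "Float32",
    "double": "Float64",
    "bytes": "Binary",
    "string": "Utf8",
}


def arrow_fields(avro_schema: dict) -> list[tuple[str, str]]:
    """Return (field name, Arrow type name) pairs for a flat Avro record schema."""
    if avro_schema.get("type") != "record":
        raise ValueError("only record schemas are handled in this sketch")
    fields = []
    for field in avro_schema["fields"]:
        avro_type = field["type"]
        # Unions, nested records, and logical types are out of scope here.
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_ARROW:
            raise ValueError(f"unsupported type in this sketch: {avro_type!r}")
        fields.append((field["name"], AVRO_TO_ARROW[avro_type]))
    return fields


# Hypothetical schema for a "person" topic.
PERSON_SCHEMA = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "active", "type": "boolean"},
    ],
}

print(arrow_fields(PERSON_SCHEMA))
```

A real mapping also has to cover nullable unions, nested records, arrays, maps, and logical types such as timestamps, which this sketch deliberately rejects.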
Examples of schema-backed topics with Apache Iceberg or Delta Lake:
Sink
A sink is a schema-backed topic whose records are written only to Apache Iceberg or Delta Lake:
- The broker maintains minimal metadata about a sink topic (topic metadata including the high watermark for each partition)
- Kafka clients can only produce messages to a sink topic
- Fetching is not supported on a sink topic
A person sink topic can be created as follows:
tansu topic create person --config tansu.lake.sink=true
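The sink rules listed above can be pictured with a small toy model. It is only a sketch of the described behaviour, not Tansu's implementation: the `SinkTopic` class, the `FetchNotSupported` error, and the `lake_writer` callback are hypothetical names, and advancing the high watermark by the number of records produced is an assumption based on ordinary Kafka offset semantics.

```python
class FetchNotSupported(Exception):
    """Hypothetical error for a fetch attempted against a sink topic."""


class SinkTopic:
    """Toy model of a sink topic: records go to the lake, not to a broker log."""

    def __init__(self, name, partitions, lake_writer):
        self.name = name
        # The only per-partition state kept in this model is the high watermark.
        self.high_watermark = {p: 0 for p in range(partitions)}
        self._lake_writer = lake_writer

    def produce(self, partition, records):
        """Hand records to the lake writer and advance the high watermark."""
        base_offset = self.high_watermark[partition]
        self._lake_writer(self.name, partition, base_offset, records)
        self.high_watermark[partition] = base_offset + len(records)
        return base_offset

    def fetch(self, partition, offset):
        """Fetching is rejected because no records are retained for clients."""
        raise FetchNotSupported(f"{self.name} is a sink topic")


# Stand-in for an Iceberg or Delta Lake writer: it just collects batches.
written = []
person = SinkTopic("person", partitions=1, lake_writer=lambda *batch: written.append(batch))
person.produce(0, [{"id": 1}, {"id": 2}])
print(person.high_watermark)
```

In this model a producer still receives offsets for what it writes, while any consumer that tries to read the topic gets an error and must query the lake table instead.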