Data Lake

Topics that are backed by an Avro, JSON or Protocol Buffers schema can also be written as Apache Iceberg or Delta Lake tables.

Tansu automatically maps each type used in the schema to an equivalent in Apache Arrow. It uses Apache Parquet as a mezzanine format, with minor changes to accommodate either Apache Iceberg or Delta Lake.

Examples of schema-backed topics with Apache Iceberg or Delta Lake:

Sink

A sink is a schema-backed topic whose records are written only to Apache Iceberg or Delta Lake:

  • The broker maintains minimal metadata about a sink topic (topic metadata including the high watermark for each partition)
  • Kafka clients can only produce messages to a sink topic
  • Fetching is not supported on a sink topic

A sink topic named person can be created as follows:

tansu topic create person --config tansu.lake.sink=true
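
Once the sink topic exists, records can be produced to it with any Kafka client. The following is a minimal sketch rather than a documented Tansu workflow. It assumes a broker listening on localhost:9092, a JSON schema for the person topic, and the kafka-console-producer.sh tool from an Apache Kafka distribution on the PATH. The record fields are hypothetical and must match whatever schema is registered for the topic.

```shell
#!/bin/sh
# Assumption: the broker address is not given in the text above.
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"
TOPIC="person"

# Hypothetical record: the real fields depend on the schema backing the topic.
RECORD='{"firstName": "Ada", "lastName": "Lovelace"}'

# kafka-console-producer.sh reads one record per line from standard input.
if command -v kafka-console-producer.sh >/dev/null 2>&1; then
  printf '%s\n' "$RECORD" |
    kafka-console-producer.sh --bootstrap-server "$BOOTSTRAP_SERVER" --topic "$TOPIC"
else
  echo "kafka-console-producer.sh not found on PATH; skipping produce" >&2
fi
```

Because fetching is not supported on a sink topic, the produced records would be read back from the Apache Iceberg or Delta Lake table rather than with a Kafka consumer.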