What are barriers and checkpoints?
If you’re new to stream processing, the terms barrier and checkpoint can be confusing. Here’s a simple way to think about them in RisingWave:
- Barrier: A periodic “sync marker” that RisingWave injects into the streaming dataflow. It travels through operators along with your data and is used to coordinate consistent progress (for example, triggering state flush and checkpoint completion).
- Checkpoint: A “global consistent snapshot” of the whole streaming system, created when a barrier has been processed consistently across the system. Checkpoints are what RisingWave uses for durability and recovery.
Default timing
By default:
- barrier_interval_ms = 1000 → RisingWave generates a barrier every 1 second.
- checkpoint_frequency = 1 → RisingWave creates a checkpoint at every barrier.
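The two settings multiply: the time between checkpoints is barrier_interval_ms times checkpoint_frequency. As a quick sanity check, here is that arithmetic written as plain SQL. The column aliases and the frequency of 5 in the second column are illustrative assumptions, not RisingWave defaults.

```sql
-- checkpoint interval (ms) = barrier_interval_ms * checkpoint_frequency
SELECT
    1000 * 1 AS default_checkpoint_interval_ms,  -- defaults: 1000 ms, i.e. 1 second
    1000 * 5 AS interval_ms_with_frequency_5;    -- hypothetical setting: 5000 ms
```

Raising checkpoint_frequency therefore spaces checkpoints further apart without changing how often barriers are generated.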
How to view or change the timing (usually NOT needed)
Most users should keep the defaults. Changing barrier/checkpoint settings can affect latency, throughput, and system load.
- View the current values:
- Change global defaults:
- Change per database: ALTER DATABASE ... SET.
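The statements for the three steps above are not shown in this section. As a hedged sketch, assuming RisingWave's SHOW PARAMETERS and ALTER SYSTEM SET commands, they might look like the following. The database name my_db and all values are hypothetical, and the exact per-database form is an assumption that should be checked against the ALTER DATABASE reference for your version.

```sql
-- View the current system parameters, including barrier_interval_ms
-- and checkpoint_frequency.
SHOW PARAMETERS;

-- Change the global defaults (illustrative values only).
ALTER SYSTEM SET barrier_interval_ms = 2000;
ALTER SYSTEM SET checkpoint_frequency = 2;

-- Override the setting for a single database.
-- "my_db" is a hypothetical name; the syntax here is an assumption.
ALTER DATABASE my_db SET barrier_interval_ms = 2000;
```

With these illustrative global values, a barrier would be generated every 2 seconds and a checkpoint taken every second barrier, so roughly every 4 seconds.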
This 1-second checkpoint interval controls RisingWave’s internal fault tolerance and recovery. For sinks, data is committed to downstream systems at a different frequency controlled by commit_checkpoint_interval, which defaults to every 10 checkpoints (approximately 10 seconds) when sink decoupling is enabled. See Sink decoupling for details.
- S3, or S3-compatible object storage
- Google Cloud Storage, or HDFS/WebHDFS (support implemented via Apache OpenDAL)