Kappa architecture
David Liu edited this page Nov 17, 2024 · 1 revision
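- Unifies stream and batch processing by turning batch into stream
- The log is the data
- The exact opposite of the Delta architecture?
- Such a pipeline can first read a full snapshot of the database and sync it to the data warehouse, then switch automatically to incremental mode, reading the binlog through CDC, so one pipeline covers both full and incremental synchronization
- AWS Aurora migration follows a similar idea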
- Proposed by Jay Kreps (co-founder of Confluent) as a simplification of the Lambda architecture
- Treats both real-time and batch processing as stream processing
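The "full snapshot first, then incremental CDC" idea from the list above can be sketched roughly as follows. This is a minimal in-memory illustration, not how any particular CDC tool is implemented: `ChangeEvent`, `full_snapshot`, `apply_changes`, and the integer `position` (standing in for a binlog position) are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """One hypothetical change record, loosely modeled on a binlog entry."""
    position: int              # assumed monotonically increasing log position
    op: str                    # "upsert" or "delete"
    key: str
    value: Optional[dict] = None


def full_snapshot(source_rows: Dict[str, dict], binlog_position: int) -> Tuple[Dict[str, dict], int]:
    """Full phase: copy every row and remember the log position the copy reflects."""
    return dict(source_rows), binlog_position


def apply_changes(target: Dict[str, dict], events: List[ChangeEvent], after_position: int) -> int:
    """Incremental phase: apply only events newer than the snapshot position."""
    position = after_position
    for event in sorted(events, key=lambda e: e.position):
        if event.position <= after_position:
            continue  # already reflected in the snapshot
        if event.op == "delete":
            target.pop(event.key, None)
        else:
            target[event.key] = event.value
        position = event.position
    return position


# The source table as it looked when the snapshot was taken (log position 2).
source = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}
warehouse, position = full_snapshot(source, binlog_position=2)

# The log contains both old events (1-2) and changes made after the snapshot (3-4).
binlog = [
    ChangeEvent(1, "upsert", "u1", {"name": "Ann"}),
    ChangeEvent(2, "upsert", "u2", {"name": "Bob"}),
    ChangeEvent(3, "upsert", "u2", {"name": "Bobby"}),
    ChangeEvent(4, "delete", "u1"),
]
position = apply_changes(warehouse, binlog, position)
print(warehouse, position)
```

The key design point in this sketch is that the snapshot records the log position it is consistent with, so the switch to incremental mode neither skips nor re-applies changes.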
Stream Processing Layer
- Ingests all data as an immutable log of events
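To make the "immutable log of events" idea concrete, here is a small sketch under simplifying assumptions: the log is an in-memory append-only list rather than a real event streaming platform, and `EventLog` and `total_per_user` are hypothetical names. It shows the same processing function serving both as a "batch" job (replay from offset 0) and as a "streaming" job (continue from the last offset).

```python
from typing import Dict, List, Optional, Tuple


class EventLog:
    """Append-only log: events are added at the end and never modified."""

    def __init__(self) -> None:
        self._events: List[dict] = []

    def append(self, event: dict) -> int:
        self._events.append(dict(event))  # store a copy so callers cannot mutate it
        return len(self._events) - 1      # offset of the new event

    def read(self, from_offset: int = 0) -> List[Tuple[int, dict]]:
        return [(i, dict(e)) for i, e in enumerate(self._events) if i >= from_offset]


def total_per_user(
    log: EventLog, from_offset: int = 0, state: Optional[Dict[str, int]] = None
) -> Tuple[Dict[str, int], int]:
    """Fold events into per-user totals; returns the new state and the next offset."""
    totals = dict(state or {})
    next_offset = from_offset
    for offset, event in log.read(from_offset):
        totals[event["user"]] = totals.get(event["user"], 0) + event["amount"]
        next_offset = offset + 1
    return totals, next_offset


log = EventLog()
log.append({"user": "a", "amount": 5})
log.append({"user": "b", "amount": 3})

# "Batch" run: process everything currently in the log, starting at offset 0.
totals, offset = total_per_user(log)

# "Streaming" run: a new event arrives and processing continues from the saved offset.
log.append({"user": "a", "amount": 2})
totals, offset = total_per_user(log, offset, totals)

# Reprocessing: replaying the whole log from the start rebuilds the same state.
replayed, _ = total_per_user(log)
print(totals, replayed, offset)
```

Because the log is never mutated, changing the processing logic only requires deploying a new version of the function and replaying from offset 0.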
Costly infrastructure with scalability issues:
- Storing big data in an event streaming platform can be costly.
- Solutions
- Use your cloud provider's object storage as a data lake (for example Amazon S3 or Google Cloud Storage).
- Apache Flink
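One way to picture the storage-cost mitigation above is a tiered log: recent events stay in the (expensive) streaming platform, while older segments are moved to cheap object storage and read back on replay. The sketch below is only an illustration under stated assumptions: `TieredLog` is a hypothetical class, the `cold` dictionary stands in for an object store such as S3 or Google Cloud Storage, and no real cloud or streaming API is used.

```python
from typing import Dict, Iterator, List, Tuple

Record = Tuple[int, dict]  # (offset, event)


class TieredLog:
    """Keeps at most `hot_capacity` recent records hot; older ones go to cold storage."""

    def __init__(self, hot_capacity: int) -> None:
        self.hot_capacity = hot_capacity
        self.hot: List[Record] = []             # stand-in for the streaming platform
        self.cold: Dict[str, List[Record]] = {}  # stand-in for object storage
        self.next_offset = 0

    def append(self, event: dict) -> None:
        self.hot.append((self.next_offset, event))
        self.next_offset += 1
        if len(self.hot) > self.hot_capacity:
            self._offload()

    def _offload(self) -> None:
        """Move the oldest overflow records into a named cold segment."""
        overflow = len(self.hot) - self.hot_capacity
        segment, self.hot = self.hot[:overflow], self.hot[overflow:]
        # Zero-padded first offset keeps segment names in offset order when sorted.
        self.cold[f"segment-{segment[0][0]:08d}"] = segment

    def replay(self) -> Iterator[Record]:
        """Full replay: cold segments first (oldest to newest), then the hot tail."""
        for name in sorted(self.cold):
            yield from self.cold[name]
        yield from self.hot


tiered = TieredLog(hot_capacity=2)
for i in range(5):
    tiered.append({"n": i})

offsets = [offset for offset, _ in tiered.replay()]
print(offsets, len(tiered.hot), len(tiered.cold))
```

Consumers still see one ordered log on replay, while only the most recent records occupy the costly tier.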