Numaflow

Summary

Numaflow is a Kubernetes-native tool for running massively parallel stream processing. A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices.

Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platforms.

Use Cases

Real-time data analytics applications.
Event driven applications such as anomaly detection, monitoring and alerting.
Streaming applications such as data instrumentation and data movement.
Workflows running in a streaming manner.

Key Features

Kubernetes-native: If you know Kubernetes, you already know how to use Numaflow.
Language agnostic: Use your favorite programming language.
Exactly-Once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever is needed.

Data Integrity Guarantees:

Minimally provide at-least-once semantics
Provide exactly-once semantics for unbounded and near real-time data sources
Preserving order is not required

Roadmap

Data aggregation (e.g. group-by)

Resources

QUICK_START
EXAMPLES
DEVELOPMENT
CONTRIBUTING

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Numaflow

Summary

Use Cases

Key Features

Data Integrity Guarantees:

Roadmap

Resources

Files

README.md

Latest commit

History

README.md

File metadata and controls

Numaflow

Summary

Use Cases

Key Features

Data Integrity Guarantees:

Roadmap

Resources