Skip to content

[Proposal] Introduce streaming job for incremental load #56191

@sollhui

Description

@sollhui

Search before asking

  • I had searched in the issues and found no similar issues.

Description

Background

As an OLAP system, a basic requirement is the way to handle different data sources, such as S3/Kafka/CDC, etc. Traditional incremental pipelines rely on extra middleware or self-written jobs, which brings:

  1. Delay: the more systems flow through, the more unavoidable the delay becomes

  2. Complexity: need to maintain more component.

We therefore introduce Streaming Job, a native low-latency ingest path that moves incremental data directly into Doris with low-latency, simplicity, and exactly once semantics guarantee.

Design

Grammar

Reuse the syntax of the job, simply mark it ON STREAMING:

CREATE JOB example_job 
Properties(
)
ON STREAMING
DO 
INSERT INTO db.tbl
select * from tvf ()

user can alter job

Alter Job FOR jobName
Properties(
)
INSERT INTO db.tbl
select * from tvf ()

query job:

select * from job(type=insert) where ExecuteType = streaming

Schedule

Image

Scheduler is included job schedule and task schedule:

  1. Job Schedule (time-driven): reuse the logic of time wheels, generate job scheduler subtasks at regular time.

  2. Task Schedule (Event driven) : driven-scheduling relies on the callback after task completion.

Offset management

Exactly-once is achieved through persistent offset plus two-phase task verification:

  • Offset is committed only after data is visible and durable in Doris.

  • Each task carries a monotonic ID; the scheduler rejects any duplicate or out-of-order task, eliminating replay risk.

Use case

No response

Related issues

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Labels

kind/featureCategorizes issue or PR as related to a new feature.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions