Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chasm interface draft #6987

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Chasm interface draft #6987

wants to merge 1 commit into from

Conversation

yycptt
Copy link
Member

@yycptt yycptt commented Dec 13, 2024

What changed?

  • Draft chasm interface with activity as a sample use case.

Why?

How did you test it?

Potential risks

Documentation

Is hotfix candidate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this in a top level chasm directory. There's likely going to be some chasm related code in other services.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that as well, but think having a top level chasm package that re-exports everything is probably better.

  • Some concepts like Field, Ref are exported as a struct and contain un-exported fields needs for the implementation. Having the implementation in the top level package instead of service/history feels a bit weird to me.
  • There will be some other exported functions, structs etc. in the chasm package meant to be consumed by other packages in the history service, but not the developers who's writing a chasm component. With re-export, we can choose not to re-export them, so dev will see a cleaner top level chasm package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced offline. I've moved the package to top level.
Though later during implementation, we will have to comment that some exported function are meant for internal implementation use and should not be used by lib/component author.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just move this out of service/history though since there will be some chasm functionality that isn't strictly tied to the history service (e.g. exposing public APIs via the frontend and possibly extending matching functionality).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top level, you mean common/chasm?

//
// Framework will try to recognize the type and do serialization/deserialization
// proto.Message is recommended so the component get compatibility if state definition changes
State persistencepb.ActivityInfo // proto.Message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this standard field Data, since State is usually an enum.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we can let you embed a proto and treat that as the data field. That could be slightly nicer to use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with naming change. Disagree with proto embed. I'd prefer for it to just be explicit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah Data is better. I won't commit this sample activity impl though :)

But I will track as low priority that the reflection logic in chasm engine needs to support embedded fields.

Input *chasm.Field[*common.Payload] `chasm:"lazy"`
Output *chasm.Field[*common.Payload] `chasm:"lazy"`

EventNotifier *chasm.Field[EventNotifier]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just to show the pointer capabilities, but the real activity implementation will need to do more than just notify its parent, it'll need to be injected with a way to read the inputs and outputs. They won't be embedded payloads in the workflow case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also had thought we wanted a way for the parent to be able to decide what notifications it cared about from its children, rather than the children needing to remember to notify the parent. I think that can still be implemented in terms of this a pointer just fine though.

Something as simple as the following I think would be nice, and isn't really even part of the framework, just a useful pattern. I think it keeps things a little more obvious for the activity implementer.

func NewScheduledActivity(
	chasmContext chasm.MutableContext,
	params *NewActivityRequest,
	onComplete func(ActivityCompletedEvent)
) { ... }

// Then workflow constructs like
NewScheduledActivity(ctx, params, func(ce) { me.AsComponentPointer(ctx).OnCompletion(ce) })

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the real activity implementation will need to do more than just notify its parent

Yeah for sure. The interface can be anything and can have a method for like loading events for example. This field internally will just be a persisted pointer (basically the path) to another component.

onComplete func(ActivityCompletedEvent)

The framework needs to be able to persist this pointer. Not exactly sure if we can get a pointer to the method receiver in this approach though.


func (l Library) Components() []chasm.RegistrableComponent {
return []chasm.RegistrableComponent{
chasm.NewRegistrableComponent[*Activity](
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kinda going back and forth between a RegisterComponent function and this approach where the library returns the list of components.

The main downside to this approach is that you have to wrap with NewRegisterableComponent.

If we do want this approach, we may want a LibraryBase struct that has empty implementations for all of the methods so library implementors only provide the items they need registered.

I like that with the concept of a library, you essentially get a namespace for all components and tasks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of returning the list of components. Having to wrap seems like a total non-issue for something that's done this infrequently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah wrapping feels ok to me.

We anyway need a way for specifying name & options of the component. Alternatively they can be methods on the component struct itself, but I feel that's mixing other concerns with the component data & behavior.

LibraryBase makes sense. Guess we also needs it for making the interface forward compatible so all library impl should embed it. Will add.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets go with library, I agree it's the better approach.

"go.temporal.io/server/service/history/chasm"
)

// This will be nexus
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll work on a Nexus POC on top of this.


type TimeoutTaskHandler struct{}

func (h *TimeoutTaskHandler) Validate(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little on the fence if Validate should be here or on the Component or the task but this works and has some advantages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda feels more natural to me if the concept belongs to the Task. and can be shared across multiple components.


func (h *TimeoutTaskHandler) Validate(
chasmContext chasm.Context,
activity *Activity,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we reuse tasks for separate components and there will need to be a way to provide a Validate function for each of them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking the Component here doesn't have to be a concrete struct. It can be an interface and then the same task definition can be reused across components.
When validating/executing the task, the underlying struct will be determined based on the component name stored in the task and the registered concrete type for that name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try this out. Hard to tell without building some use cases on top of this.

return nil, err
}

resp, startedActivityRef, err := chasm.UpdateComponent(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed yesterday, I wish we could get the ref in the update function and keep the ref out of this signature. This is good to start with though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ideally a component author making a transition doesn't need to care where the referenced component came from

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally a component author making a transition doesn't need to care where the referenced component came from

They have to care unfortunately. The ref also serves as a consistency token. If a second operation depends on the first one, you need to use the ref returned from the first operation or the second operation may talk to a stale copy of the component and get rejected (e.g. not found)

But yeah this means some boilerplate code in the api handler. As a start I think I can at least move the ref ser/de part into chasm.UpdateComponent, so the method will take in and return []byte. We lose some type safety but code will be simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up making the ref type generic as well, and it can be []byte | ComponentRef.
For api, takes in []byte makes sense.
For task processing, decoded ComponentRef makes more sense as the framework will construct ComponentRef in memory. Always takes in []byte means extra encoding/decoding. Framework also needs to put extra task validation fn in the ref as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too concerned with this detail for now, we can figure it out as we build use cases on top of the API.

Comment on lines +146 to +153
func(a *Activity, ctx chasm.Context, _ *GetActivityResultRequest) bool {
return a.LifecycleState() == chasm.LifecycleStateCompleted
},
func(a *Activity, ctx chasm.MutableContext, _ *GetActivityResultRequest) (*GetActivityResultResponse, error) {
outputPayload, err := a.Output.Get(ctx)
resp.Output = outputPayload.Data
return resp, err
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we'd be better off merging these two like so:

Suggested change
func(a *Activity, ctx chasm.Context, _ *GetActivityResultRequest) bool {
return a.LifecycleState() == chasm.LifecycleStateCompleted
},
func(a *Activity, ctx chasm.MutableContext, _ *GetActivityResultRequest) (*GetActivityResultResponse, error) {
outputPayload, err := a.Output.Get(ctx)
resp.Output = outputPayload.Data
return resp, err
},
func(a *Activity, ctx chasm.MutableContext, _ *GetActivityResultRequest) (*GetActivityResultResponse, error) {
if a.LifecycleState() != chasm.LifecycleStateCompleted {
return nil, chasm.ErrNoReady / *name TBD */
}
outputPayload, err := a.Output.Get(ctx)
resp.Output = outputPayload.Data
return resp, err
},

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer them separate, because what happens if you mutate something and then say "not ready"? That would be some weird violation that shouldn't be possible, and separate contexts enforces that at the type level.

If you need context from one to the other, make the ready function return (bool, T) and T is passed into the mutator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I separated them mainly to ensure the validation logic is readonly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the signature of the validation to return (T, bool, error), though that means the mutator is now taking 4 params...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why there needs to be a mutator in poll or why you need both an error and a bool.

// panic("not implemented")
// }

func NewEntity[C Component, I any, O any](
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, call this CreateEntity but what you have is fine too.

panic("not implemented")
}

// Not needed for V1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, this isn't needed but if we were to do it, there's a type safe way using a closure that gets an instance and returns field references.

Copy link
Member

@Sushisource Sushisource left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking really good to me

//
// Framework will try to recognize the type and do serialization/deserialization
// proto.Message is recommended so the component get compatibility if state definition changes
State persistencepb.ActivityInfo // proto.Message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with naming change. Disagree with proto embed. I'd prefer for it to just be explicit.

}

func (i *Activity) GetDispatchInfo(
chasmContext chasm.MutableContext,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an immutable context?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. yes, it should be immutable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still kinda prefer Read and Write context but that's just a personal non-blocking preference.


func (l Library) Components() []chasm.RegistrableComponent {
return []chasm.RegistrableComponent{
chasm.NewRegistrableComponent[*Activity](
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of returning the list of components. Having to wrap seems like a total non-issue for something that's done this infrequently.

Input *chasm.Field[*common.Payload] `chasm:"lazy"`
Output *chasm.Field[*common.Payload] `chasm:"lazy"`

EventNotifier *chasm.Field[EventNotifier]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also had thought we wanted a way for the parent to be able to decide what notifications it cared about from its children, rather than the children needing to remember to notify the parent. I think that can still be implemented in terms of this a pointer just fine though.

Something as simple as the following I think would be nice, and isn't really even part of the framework, just a useful pattern. I think it keeps things a little more obvious for the activity implementer.

func NewScheduledActivity(
	chasmContext chasm.MutableContext,
	params *NewActivityRequest,
	onComplete func(ActivityCompletedEvent)
) { ... }

// Then workflow constructs like
NewScheduledActivity(ctx, params, func(ce) { me.AsComponentPointer(ctx).OnCompletion(ce) })

return nil, err
}

resp, startedActivityRef, err := chasm.UpdateComponent(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ideally a component author making a transition doesn't need to care where the referenced component came from

Comment on lines +146 to +153
func(a *Activity, ctx chasm.Context, _ *GetActivityResultRequest) bool {
return a.LifecycleState() == chasm.LifecycleStateCompleted
},
func(a *Activity, ctx chasm.MutableContext, _ *GetActivityResultRequest) (*GetActivityResultResponse, error) {
outputPayload, err := a.Output.Get(ctx)
resp.Output = outputPayload.Data
return resp, err
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer them separate, because what happens if you mutate something and then say "not ready"? That would be some weird violation that shouldn't be possible, and separate contexts enforces that at the type level.

If you need context from one to the other, make the ready function return (bool, T) and T is passed into the mutator

)

type Context interface {
// Context is not binded to any component,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Context is not binded to any component,
// Context is not bound to any component,

Comment on lines +42 to +44
// we probably don't even need this,
// can make the function generic and find the name from registry
rootComponentName: rootComponentName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Taking in the type and having components define their name so you operate in terms of types rather than names is nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized later that if we pass in the component type, then we'll also need a way to access the registry. Either pass in the registry or the function impl reference a global variable... Neither feels like a good idea to me.

I need to better understand how the services will be exposed as nexus and then come back to this. We might solve the routing problem in a different way and can just store shardID in the ref.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... shard ID in the ref might work. I don't think any of this is blocking the initial work.

Comment on lines +36 to +38
// If we provide this method, then the method on the engine doesn't need to
// return a Ref
// NewRef(Component) (ComponentRef, bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have a CreationContext that simply doesn't include this, and that and MutableContext both embed an un-exported base interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is new components can be created in any transition not just the first one that creates the entity.

Copy link
Member

@bergundy bergundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, had some followup questions.

I think we're in agreement but just want to double check that the structure will be:

chasm/ # framework code live here
chasm/lib # libraries live here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just move this out of service/history though since there will be some chasm functionality that isn't strictly tied to the history service (e.g. exposing public APIs via the frontend and possibly extending matching functionality).

}

func (i *Activity) GetDispatchInfo(
chasmContext chasm.MutableContext,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still kinda prefer Read and Write context but that's just a personal non-blocking preference.


func (l Library) Components() []chasm.RegistrableComponent {
return []chasm.RegistrableComponent{
chasm.NewRegistrableComponent[*Activity](
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets go with library, I agree it's the better approach.

return nil, err
}

resp, startedActivityRef, err := chasm.UpdateComponent(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too concerned with this detail for now, we can figure it out as we build use cases on top of the API.

Comment on lines +146 to +153
func(a *Activity, ctx chasm.Context, _ *GetActivityResultRequest) bool {
return a.LifecycleState() == chasm.LifecycleStateCompleted
},
func(a *Activity, ctx chasm.MutableContext, _ *GetActivityResultRequest) (*GetActivityResultResponse, error) {
outputPayload, err := a.Output.Get(ctx)
resp.Output = outputPayload.Data
return resp, err
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why there needs to be a mutator in poll or why you need both an error and a bool.


func (h *TimeoutTaskHandler) Validate(
chasmContext chasm.Context,
activity *Activity,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try this out. Hard to tell without building some use cases on top of this.

Comment on lines +42 to +44
// we probably don't even need this,
// can make the function generic and find the name from registry
rootComponentName: rootComponentName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... shard ID in the ref might work. I don't think any of this is blocking the initial work.

@yycptt yycptt changed the title [Will NOT Merge] Chasm interface draft Chasm interface draft Jan 14, 2025
type LifecycleState int

const (
LifecycleStateUnspecified LifecycleState = 0
Copy link
Member

@alexshtin alexshtin Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LifecycleStateUnspecified LifecycleState = 0
LifecycleStateUnspecified = LifecycleState(0)

and then it can be moved to the end of next const block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants