Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: demonstrate an event-only model #16

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

SavagePencil
Copy link
Member

This update reduces the amount of boilerplate for customers to handle when invoking a structure and provides them a method to use only an eventing system. It eliminates the need for them to poll both Structure Run Status AND the Structure Run events. Customers who want to rely on Structure Run resource Status can still do so.

Assumptions: This draft assumes that Structure Runs are amended to inject System Events for when a Structure Run...

  1. ...starts execution (i.e., gets out of QUEUED state)
  2. ...stops executing customer code with errors
  3. ...stops executing customer code withOUT errors

Open Questions:

  1. Can the Events API be amended to take an optional parameter for where to start? This would allow clients to only request the events that occured since the most recent one they received and reduce bandwidth.
  2. Do we want to change the name of the FinishStructureRunEvent to be clearer it is an OPTIONAL signal from the Structure code to let the client know that it is, from the client's POV, done and has a payload ready for them?
  3. How do we want to signify the System Events?
  4. In the mainline version we have the client requesting logs, but these can take several minutes against a real world situation. Can we embed the log info into the structure completion events?

# 2. The Structure has stopped running due to an error
# 3. The Structure has stopped running without an error
while not structure_has_terminal_event:
# TODO: Can the API be adjusted to give us all NEW events that occurred after a specific point?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be accomplished with pagination? Cloud will return pagination information, and the client can send subsequent ListEvent requests starting from the last page received

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migrating my comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you outline pseudocode for how it will look for the client to do that? I don't want to jump to conclusions (sidebar: I already made that mistake when we were chatting in Slack about this).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    cursor = None  # Start with no cursor for the first page
    while True:
        page = fetch_page(cursor)  # Fetch a page of data
        process_data(page["data"])  # Process the data
        
        cursor = page["next_cursor"]  # Get the cursor for the next page
        if not cursor:  # If no more cursor, exit the loop
            break

@zachgiordano @vachillo does this type of pagination make sense to have in the Cloud API? I know we've already got offset based pagination for certain resources.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for a single poll that has (potentially) multiple pages, but how do we maintain the cursor for the next poll iteration? What's the equivalent of "set the cursor to be just after the last page of events I pulled"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow..

We update the cursor on each poll iteration.

cursor = page["next_cursor"]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, so it will only return None when the Structure has completed running?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it makes sense. We already have pagination set up for StructureRuns.

It should be on Events as well. Seems we just haven't added it yet.

Comment on lines +90 to +91
# Is this an event type that we care about?
event_type = event["value"]["type"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not use the framework event type here. This should be keying off a top level event type field set by Cloud.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give an example of what you're thinking? I want to make sure we're not foisting more work on the client, and maybe we're not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

event["value"]["type"] is a framework-specific field. Each of the "Cloud Events" should contain its own top level type field.

event_type = event["type"]

match event_type:
    case "UserEvent.CompletionChunkEvent":
        # If the agent is set to streaming, print out each streaming chunk
        completion_token = event["value"]["token"]
        print(completion_token, flush=True, end="")
    case "UserEvent.FinishStructureRunEvent":
    ...
    case "SystemEvent.StructureRunExecutionFailed":
    ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh got it, embedding it in the name of the event type. I was concerned we were going to add a second type field at the top level of event, which would have forced more boilerplate for the client. Totally onboard with your suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SavagePencil I did add a type field at the top level of event. But yes it includes the framework type event too. Just to entertain what would happen if we didn't...

event_type = event["type"]

match event_type:
    case "UserEvent":
       framework_event_type = event["value"]["type"]
       match framework_event_type:
           case "CompletionChunkEvent":      
               # If the agent is set to streaming, print out each streaming chunk
              completion_token = event["value"]["token"]
              print(completion_token, flush=True, end="")
           case "FinishStructureRunEvent":
           ...
    case "SystemEvent.StructureRunExecutionFailed":
    ...

Putting it in the name really only works because we know the event is coming from the framework and we know to look at event["value"]["type"]. If we wanted to allow for non-framework user events we'd either need to enforce a similar structure or do something like this ^.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bleh, I think customers would much prefer the dotted notation inside the string than having to juggle those. I feel like we're doing this for OUR convenience over theirs. Happy to get one of our Jasons to settle this for us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with namespacing in the event name. Main takeaway is that there is a Cloud-inserted top level event type:

event_type = event["type"]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@collindutter how do you feel about being more explicit to make it clear when an event came from the service ("system" feels ambiguous to me). e.g., GriptapeCloudEvent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then I'd ask what kind of Griptape Cloud Event is it? Is it one the "system" generated, or one I, the "user", generated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmph, it worked for me 😤. I was hoping to capture the essence of "this event happened outside of my code, out of my control." Any terms you can think of? Should we crack open the thesaurus?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants