Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: CloudEvents component #394

Open
Mossaka opened this issue Apr 24, 2022 · 5 comments
Open

Feature: CloudEvents component #394

Mossaka opened this issue Apr 24, 2022 · 5 comments

Comments

@Mossaka
Copy link
Contributor

Mossaka commented Apr 24, 2022

Enabling CloudEvents component could make spin a great solution to serverless and event-driven applications. I am thinking about an event trigger that subscribes to a addressible broker, and whenever the broker emits an event, the event trigger will instantiate a wasm module to hanlde. Or the trigger itself can emits an event.

For example the trigger can subscribe to a knative event source.

I am proposing the following component sdk. Let me know your thoughts. @radu-matei

use anyhow::Result;
use spin_sdk::{
    event::{Event},
    event_component,
};

/// A simple Spin event component.
#[event_component]
fn trigger(event: Event) -> Result<()> {
    println!("{}", event);
    Ok(())
}
@radu-matei
Copy link
Member

@Mossaka this looks like a great idea!

For larger work items, we try to write a Spin Improvement Proposal (SIP) - see a few examples.

@radu-matei
Copy link
Member

Thanks a lot for yesterday's presentation, @Mossaka!

ref #398

After going through the CloudEvents core spec,
the HTTP webhook spec,
and the protocol bindings,
here is roughly how I am thinking about implementing support for CloudEvents in Spin.

Responding to HTTP webhooks

The CloudEvents HTTP webhook specification
describes the pattern to deliver notifications through HTTP endpoints.

In this scenario, the producer sends a valid HTTP request with a CloudEvents
payload (potentially with support for the validation and abuse protection
protocol), and expects a valid HTTP response, with an optional CloudEvents payload.

In handling the incoming request, a handler component can use external services
(and potentially send new CloudEvents payloads), but the interaction is
limited to the request-response model of HTTP/1.1.
Because of that, it seems like this can be implemented with the current HTTP
trigger — the incoming webhook is an HTTP request, with a CloudEvent payload,
and the response is an HTTP response.
If we add support in the Spin SDKs for easily deserializing Event objects from
HTTP requests and serializing Event objects as HTTP responses, we can write
something similar to:

#[http_component]
pub fn handle_ce(req: Request) -> Result<Response> {

    // perform the spec endpoint validation for abuse protection
    // https://github.com/cloudevents/spec/blob/main/cloudevents/http-webhook.md#4-abuse-protection
    match req.method {
        http::Method::OPTIONS => return spin_sdk::validate_webhook(req),
        _ => {}
    };

    // read the CloudEvents request event from the HTTP request
    let request_event = Event::try_from(req)?;
    
    // create a CloudEvents response event
    let response_event = Event::new(...);
    ...
    // potentially use other external services here
    ...

    // return the HTTP response
    Ok(http::Response::builder()
        .status(200)
        // optionally add other headers
        .body(Some(response_event.into()))?)
}

How would we feel about something like the component above?

This way, we would follow the HTTP webhook specification (returning an HTTP
response with an optional Event response), and reuse the current Spin HTTP
infrastructure for everything (executor, component definition, templates, SDK)

A CloudEvents subscription manager

The above implementation for HTTP webhooks is great, and it requires minimal
changes to Spin. However, it is only a small subset of the potential CloudEvents
integration into Spin.

Of particular importance could be the Subscriptions API.

Specifically, Spin could act as a "subscription manager"
and manage event subscriptions on behalf of "event consumers" (Spin components
defined in the current application).

In this scenario, we would build a specific CloudEvents executor:

// The entry point for a CloudEvents handler.
handle-cloudevent: function(event: event) -> expected<event, error>

This executor would be entirely independent of a specific Spin trigger. Rather,
for Spin triggers that are compatible with the CloudEvents transport protocols
(i.e. HTTP, NATS, or Kafka), we would be able to adapt the incoming request and use the
CloudEvents executor as a new executor, besides the existing executors (for the
HTTP trigger, there would be a Spin executor, a Wagi executor, and a CloudEvents
executor — details on how this differs from the CloudEvents webhook endpoint in
the next section).

This is a CloudEvents Subscription:

{
  "id": "[a subscription manager scoped unique string]",
  "source": "[...]", ?
  "types": "[ "[ce-type values]" + ]", ?
  "config": { ?
    "[key]": [subscription manager specific value], *
  },

  "filters": [ ?
    { "[dialect name]": [dialect specific object] } +
  ],

  "sink": "[URI to where events are delivered]",
  "protocol": "[delivery protocol]",
  "protocolsettings": { ?
    "[key]": "[type]", *
  }
}

The sink field is how the Subscription (and this proposed Spin executor)
would be different compared to the simple webhook scenario — sink defines the
address to which the incoming events MUST be sent. This can be any protocol
that the implementation supports, and the event must be sent without client
code manually performing the request.

In the course of implementing such a component, developers could manually send
requests or events to other locations, but to properly implement the Subscription
API, Spin must implement automatically sending the resulting event to the sink.

For example, this event subscription defines that after execution, the result
CloudEvents event MUST be sent to http://example.com/event-processor.

{
  "id": "sub-193-18365",

  "config": {
    "data": "hello",
    "interval": 5
  },

  "filters": [
    { "prefix": { "type": "com.example." } }
  ],

  "protocol": "HTTP",
  "protocolsettings": {
    "method": "POST"
  },
  "sink": "http://example.com/event-processor"
}

So the proposed executor would invoke the handle-cloudevent function implemented
by the Wasm component, then send the result to the sink address.
The Subscription API supports HTTP, MQTT, AMQP, Kafka, and NATS as sink protocols.

Initially, we would implement a subset of the potential protocols.

The concepts of the Subscription API, the sink address, and the underlying
runtime sending events are very closely related to a few ideas we have been
discussing in the past around "output bindings" — the ability to
simply return values from executing Wasm components and have Spin automatically
send those results to useful places (i.e. publish on queues, store as blobs).

These are some initial, unstructured thoughts about CloudEvents support.
None of the ideas in this comment are strong opinions, and I'm happy to discuss them to find the best solution for Spin.

@itowlson
Copy link
Contributor

A custom executor is one way to approach this, and provides a very immediate solution that we already know well how to do. However, at the moment executors are very intimately linked to triggers. You don't want the HTTP, MQTT, Kafka and NATS triggers to all have to carry around cloud events code. And it would be nice if protocol/product owners could add executors for their own application protocols without needing to touch trigger repos.

Obviously we could reorganise things so that executors rather than triggers are the 'units' but we'd need to figure out how one HTTP base could serve multiple executors.

But I wonder if it's worth exploring an alternative approach that allows protocol bindings to be built and injected separately from transports/triggers. In a world like .NET or Java we could do this by dynamically loading code, but that's hard to do in Rust. One possibility is to do the binding guest side, via a linked component. This would have an additional benefit that transport-protocol bindings could be written in any language that compiled to Wasm.

I'm thinking the idea would be something like:

  • The trigger (e.g. HTTP or Kafka) detects an 'event' (request, message, etc.) and routes it to a component.
  • The component is configured with two things:
    • A new, optional setting that specifies a protocol binder component (e.g. HTTP/CloudEvents). Each protocol binder is a standard pre-written component.
    • The existing 'guest module to run' setting (the user code). Today this conforms to the executor signature but if a binder is present it conforms instead to the app protocol signature.
  • The protocol binder exports a handler function that conforms to the transport (trigger/executor) signature (e.g. HTTP request-response), and imports a function that conforms to the application protocol (e.g. CloudEvents).
  • Spin performs linking so that the protocol binder's import is satisfied by the user code's export.
  • Spin calls the protocol binder's entry point with the raw trigger event.

I haven't thought this through to be clear! Just sharing it for discussion.

(Already, thinking about it, we could use the executor setting as the hypothetical magic setting to use a binder, and then the user wouldn't have to care if the binder was internal-Rust or external-Wasm.)

@Mossaka
Copy link
Contributor Author

Mossaka commented May 11, 2022

@radu-matei Thanks for your thoguhts and this is an amazing writing!

This way, we would follow the HTTP webhook specification (returning an HTTP
response with an optional Event response), and reuse the current Spin HTTP
infrastructure for everything (executor, component definition, templates, SDK)

For the Webhook use case, I can definitely see the value of reusing the Spin HTTP component to handle the events. I had a working example in spin-kitchensink pr that implements the webhook abuse protection mechanism. I agree that if we can offload the logic of parsing HTTP headers, match header attributes and handle callbacks in the Spin SDK, it will be a much better user experience.

I would argue that if eventually a CloudEvents trigger like handle-cloudevent: function(event: event) -> expected<event, error> is what we want in Spin, then we should make a step further - offload the entire webhook abuse protection logic to Spin internal, such that component writer does not even need to worry about checking HTTP::Options header anymore.

#[cloudevents_component]
pub fn handle_ce(in: Event) -> Result<Event> {
    // before the function is called, the validation HTTP request is handled 
    // and Spin serializes HTTP request to CloudEvents
    
    // potentially use other external services here
    ...

    // return the event
    Ok(in)
}

The CloudEvents component will automatically deserialize CloudEvents return value back to HTTP response and return back. The benefits of doing this is that

  1. It is trigger agnostic. The component itself does not care about what trigger it is and user is free to change HTTP to other transport protocols bindings that CloudEvents support.
  2. It hinds the if branch to check HTTP OPTIONS method
  3. It is chainable to other CloudEvents component.

A CloudEvents subscription manager

I really like the subscription spec. In fact, there are three next-gen specficiations for CloudEvents, and I think they work together in an amazing way.

  1. Discovery API - Spin could make a request to external service that supports Discovery API to understand

    • what events the service produces and h
    • how to subscribe to the service
      If in the future, Spin supports multiple triggers, Spin itself could become a service that implements Discovery API for external services to address.
  2. Subscription API, as @radu-matei already put it, Spin could become a susbcription manager.

  3. Schema Registry this allow Spin to understand different schema documents for serialization and data validation.

The concepts of the Subscription API, the sink address, and the underlying
runtime sending events are very closely related to a few ideas we have been
discussing in the past around "output bindings" — the ability to
simply return values from executing Wasm components and have Spin automatically
send those results to useful places (i.e. publish on queues, store as blobs).

Totally agree! This could unlock the ability to easily chain multiple Spin applications together in a serverless manner. Imagine 2 Spin application running on the cloud, and one's sink is an adress of another Spin application. This creates a pipeline of handling workloads to wasm modules.

Check out the CloudEvents extensions, which could be very useful in this scenario:

  1. The distributed tracing extension
  2. Sequence basically defined the order of how events should arrive.

This worth a few SIP proposals to fully spec out. I am happy to help bringing CloudEvents support to Spin!

@Mossaka
Copy link
Contributor Author

Mossaka commented May 11, 2022

See an disussion here on the use cases of CloudEvents subscription spec: cloudevents/spec#767

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 📋 Investigating / Open for Comment
Development

No branches or pull requests

3 participants