@magowan commented Jan 27, 2025

The Use Case working group generated slides describing Use Cases; this PR brings the description of the Supply Chain Use Case into the website content.

netlify bot commented Jan 27, 2025

Deploy Preview for frolicking-manatee-96c0c8 ready!

Name Link
🔨 Latest commit 42717e8
🔍 Latest deploy log https://app.netlify.com/sites/frolicking-manatee-96c0c8/deploys/67e53b2fbfc9a700089c1fca
😎 Deploy Preview https://deploy-preview-83--frolicking-manatee-96c0c8.netlify.app

@fitzthum left a comment

Nice. A few comments.


### Attestations
- Federated Learning requests multi-SDK attestation
- FL Servers needs to verify all client’s trustworthiness

nit: client's -> clients'

### Attestations
- Federated Learning requests multi-SDK attestation
- FL Servers needs to verify all client’s trustworthiness
- Attestation at different points, self and cross verifications via attestation service

We don't support all of these modes currently. What is the scope of this document? Are we introducing people to what these workloads are, their requirements, how you can use CoCo, etc.?


This is intended to be a first pass at capturing info on the Use Cases, not how you can use CoCo just yet. The aim is to start somewhere and then improve through further PRs, ultimately getting to how you use CoCo to solve the Use Case, with an example set of containers anyone could use to try it out.


### CC Policies:
- Bootup policy – provided by hardware vendor
- Self-verification CC policy – user defined

What is self-verification?

Self-verification means that a participating pod verifies whether it satisfies its own specified CC policy. For example, if the policy specifies that the pod needs a GPU but the host has no GPU, then self-verification should fail.
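
A minimal sketch of the self-verification check described above, in Python. The `CcPolicy` and `HostCapabilities` types and their fields (`requires_gpu`, `min_memory_gb`) are hypothetical names chosen for illustration; they are not a CoCo policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CcPolicy:
    """Hypothetical user-defined CC policy (illustrative fields only)."""
    requires_gpu: bool = False
    min_memory_gb: int = 0


@dataclass(frozen=True)
class HostCapabilities:
    """Hypothetical description of what the pod finds at start-up."""
    has_gpu: bool
    memory_gb: int


def self_verify(policy: CcPolicy, host: HostCapabilities) -> list[str]:
    """Return the list of policy violations; an empty list means success."""
    violations = []
    if policy.requires_gpu and not host.has_gpu:
        violations.append("policy requires a GPU but the host has none")
    if host.memory_gb < policy.min_memory_gb:
        violations.append("host memory is below the policy minimum")
    return violations


policy = CcPolicy(requires_gpu=True)
cpu_only_host = HostCapabilities(has_gpu=False, memory_gb=64)
gpu_host = HostCapabilities(has_gpu=True, memory_gb=64)

# The pod on the CPU-only host fails self-verification; the GPU host passes.
cpu_only_result = self_verify(policy, cpu_only_host)
gpu_result = self_verify(policy, gpu_host)
```

Returning the violations rather than a bare boolean is a design choice of this sketch; it lets the pod report why self-verification failed.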


Updated with some clarification from Chester


![Multi-Party Computing](/img/MultiPartyCompute.png)

Source: [https://uploads-ssl.webflow.com/63c54a346e01f30e726f97cf/6418fb932b58e284018fdac0_OC3%20-%20Keith%20Moyer%20-%20MPC%20with%20CC.pdf](https://uploads-ssl.webflow.com/63c54a346e01f30e726f97cf/6418fb932b58e284018fdac0_OC3%20-%20Keith%20Moyer%20-%20MPC%20with%20CC.pdf)

I think our docs are required to follow a Creative Commons license. I don't see any copyright stuff in these slides, so maybe we're OK. Were the other images in this doc created by us, or do they come from somewhere else?

I just checked: both PDFs are from @bpradipt.

@@ -9,4 +9,74 @@ tags:
weight: 60
---

Coming soon
## Federated Learning

Can we add an introductory section to describe the purpose of the document?

@chesterxgchen commented Feb 19, 2025

What is Federated Learning?

Federated Learning (FL) is a decentralized machine learning approach in which multiple participants (such as organizations, edge devices, or distributed servers) collaboratively train a model without sharing their raw data. Instead of centralizing the data, each participant trains a local model and shares only model updates with a central aggregator, which preserves data privacy and reduces communication overhead.

FL is widely applied in domains such as healthcare, finance, transportation, and edge computing, where data privacy, the cost of moving data, regulatory compliance, and security are critical concerns.
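
The local-train-then-aggregate loop described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions: the aggregation rule used here is weighted federated averaging (the comment does not name a specific rule), and the one-parameter model, the toy data, and the function names are all hypothetical.

```python
def local_update(weights, data, lr=0.1):
    """One gradient step of least-squares fitting y = w * x on local data."""
    w = weights[0]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return [w - lr * grad]


def federated_average(updates, sizes):
    """Average client weights, weighted by each client's dataset size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [
        sum(u[i] * n for u, n in zip(updates, sizes)) / total
        for i in range(dim)
    ]


# Two clients hold private samples of y = 3x; raw data never leaves them.
client_data = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
]

global_weights = [0.0]
for _ in range(50):
    # Each client trains locally and shares only its updated weights.
    updates = [local_update(global_weights, d) for d in client_data]
    sizes = [len(d) for d in client_data]
    global_weights = federated_average(updates, sizes)
```

Only `updates` and `sizes` cross the client boundary in this sketch; the `(x, y)` samples stay with the client that owns them.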

Confidential Federated AI with Confidential Computing

Although Federated Learning (FL) does not move the data, model updates can still leak private information. Several privacy-enhancing technologies (PETs) are available to address this, for example Differential Privacy and Homomorphic Encryption.

Running FL inside a Confidential Computing Trusted Execution Environment (TEE) adds another security layer. The combination of FL and Confidential Computing can be called Confidential Federated AI.

Confidential Federated AI Use Cases and Requirements

Unlike most AI inference, RAG, or clean-room use cases, Federated Learning can involve multiple confidential TEEs. Its use cases and requirements therefore differ from the single-TEE case.

We have identified three common use cases.

Building Explicit Trust

Since FL requires multiple participants (banks, hospitals, etc.) to collaborate, they must trust each other. Traditionally, such trust rests on business or legal contracts and on verification by IT security teams. This can be considered implicit trust, because participants have to trust the IT infrastructure and the data scientists on all sides to be honest, to have no ill intentions, and to make no mistakes.

If the participants are seriously concerned about security, CC provides a much stronger guarantee of "explicit trust" by leveraging CC attestation.

"implicit trust" ==> "explicit trust"

Requirements for Attestation
  1. We need to be able to verify that the other participants are trustworthy according to a specified CC policy.

This is cross-verification via attestation; the policies involved are cross-verification policies.

For example, an FL system has an FL server and several FL clients. The FL server needs to be able to check that the FL clients are trustworthy, and the FL clients want to make sure that the FL server is trustworthy as well.

  • Approach 1: CoCo allows the application to get the attestation token

The FL server and FL client exchange attestation tokens, and each independently verifies the received token via different attestation services.

Currently, CoCo doesn't provide a way for an application to get the attestation token, so this approach does not work in CoCo today.

  • Approach 2: CoCo Trustee provides an API to perform the verification

For example, instead of the FL server and FL client exchanging attestation tokens, the FL server asks CoCo Trustee with an instruction like:

"please verify that the client with ID = 123 is trustworthy based on policy 1"

    fl_server.get_trustee().verify(object_id=123, policy="policy_1.json")

Similarly, the FL client asks CoCo Trustee to do the same:
"please verify that the FL server with ID = 456 is trustworthy based on policy 2"

    fl_client.get_trustee().verify(object_id=456, policy="policy_2.json")

In other words, the CoCo environment should provide one of these two ways: either let me do the verification myself (by giving me the token) or do it for me.

  2. It must be possible to verify attestation ad hoc or periodically.
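
The two requirements above (cross-verification against a policy, on demand or on a schedule) can be sketched together. Everything here is hypothetical: `StubTrustee` is an in-memory stand-in for the requested service and is not an existing CoCo or Trustee API, the claim names (`tee`, `debug`) are invented, and policies are plain dictionaries rather than JSON files.

```python
import time


class StubTrustee:
    """In-memory stand-in for the requested verification service.

    Hypothetical: this is NOT an existing CoCo or Trustee API.
    """

    def __init__(self):
        self._claims = {}  # object_id -> attested claims

    def register(self, object_id, claims):
        """Record the claims obtained from a participant's evidence."""
        self._claims[object_id] = dict(claims)

    def verify(self, object_id, policy):
        """True only if every key/value the policy requires is attested."""
        claims = self._claims.get(object_id)
        if claims is None:
            return False  # unknown participants are never trusted
        return all(claims.get(key) == value for key, value in policy.items())


class PeerVerifier:
    """Re-checks a peer once the last positive verdict is older than interval_s."""

    def __init__(self, trustee, object_id, policy, interval_s):
        self.trustee = trustee
        self.object_id = object_id
        self.policy = policy
        self.interval_s = interval_s
        self._last_ok_at = None

    def is_trusted(self, now=None, force=False):
        """Periodic check by default; force=True requests an ad hoc check."""
        now = time.monotonic() if now is None else now
        stale = (self._last_ok_at is None
                 or now - self._last_ok_at >= self.interval_s)
        if force or stale:
            ok = self.trustee.verify(self.object_id, self.policy)
            self._last_ok_at = now if ok else None
            return ok
        return True


trustee = StubTrustee()
trustee.register(123, {"tee": "enabled", "debug": False})  # FL client
policy_1 = {"tee": "enabled", "debug": False}  # server's policy for clients

server_view = PeerVerifier(trustee, object_id=123, policy=policy_1, interval_s=60)
first = server_view.is_trusted(now=0)  # verified against the trustee
trustee.register(123, {"tee": "enabled", "debug": True})  # client state changes
cached = server_view.is_trusted(now=30)  # within the interval, not re-checked
ad_hoc = server_view.is_trusted(now=30, force=True)  # forced re-check
```

The FL client would hold a symmetric `PeerVerifier` for the server with its own policy. The `cached` result shows why a periodic check alone leaves a window in which a changed peer is still trusted, which is the motivation for also supporting ad hoc checks.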

Secure Aggregation

The second use case is secure aggregation. It is similar to the clean-room use case: each client's model is sent to the aggregator (usually on the FL server), where the models are aggregated. In the simplest case, the clients trust each other but not the server, because a model can potentially be inverted to recover private data.

In this case, we can use a Confidential Computing TEE (CoCo) to lock down access to the TEE.

Model IP Protection

The third use case is preventing the model from being copied during fine-tuning on the client side.

In this case, we can lock down the FL client side with a CC TEE and other techniques.

Summary of Use Cases and Requirements

We have discussed the FL use cases and requirements.

The main blocker for FL in CoCo is attestation for cross-verification. While this is easy in non-CoCo deployment environments (such as a CVM on bare metal, a CVM on Azure, or a Confidential Container in Azure ACI), we can't do the same in a CoCo environment.

In CoCo, we can't do it ourselves (no access to the attestation token), and CoCo won't do it for us (no APIs).

We hope this can change soon.

IP protection applies to more than Federated Learning; it matters for any model inference (including RAG or agents).

The key issue, besides run-time protection, is how to deploy the model to an untrusted host before the TEE is established.

In other words:
Model + Code ==> distributed ==> inference serving infrastructure (untrusted host, which will launch the TEE)
How do we ensure that the model and code are not tampered with BEFORE the TEE is running?

Federated learning (training or evaluation) faces the same challenge during provisioning and distribution.

IP protection means that the model weights and the code (training and inference code) are not leaked even if the untrusted host tries to tamper with them BEFORE the CVM is launched.
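
One building block for the tamper question above is a digest check: the model owner computes a reference digest before distribution, and the workload recomputes it inside the TEE before using the artifacts. The sketch below assumes the expected digest reaches the TEE over a trusted channel (for example, released only after attestation), which is an assumption and not something this thread specifies. File names and contents are made up. It detects tampering only; keeping the weights secret would additionally need encryption, which is not shown.

```python
import hashlib
import hmac


def digest(artifacts):
    """SHA-256 over named artifacts, in sorted name order for determinism."""
    h = hashlib.sha256()
    for name in sorted(artifacts):
        data = artifacts[name]
        # Length-prefix each field so that field boundaries are unambiguous.
        for field in (name.encode(), data):
            h.update(len(field).to_bytes(8, "big"))
            h.update(field)
    return h.hexdigest()


def verify_before_use(artifacts, expected_hex):
    """Constant-time comparison of the recomputed digest with the expected one."""
    return hmac.compare_digest(digest(artifacts), expected_hex)


# Model owner side: compute the reference digest before distribution.
released = {"model.bin": b"\x00\x01weights", "infer.py": b"print('serve')"}
expected = digest(released)

# Inside the TEE: recompute over what the untrusted host actually delivered.
tampered = dict(released, **{"infer.py": b"print('exfiltrate')"})
intact_ok = verify_before_use(released, expected)
tampered_ok = verify_before_use(tampered, expected)
```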

- Support multiple devices
- One participant may has only one type of device (CPU), but need to verify other participant’s devices including different CPU and GPU

## Retrieval Augmented Generation Large Language Models (RAG LLM)

Maybe we should put this section first? It seems like it's the simplest case and closest to the demos that we have seen.

- data is already poisoned at rest
- model is already poisoned at rest

## Multi-Party Computing

Multi-party computing could be more generic than AI. Do we want it here or in another use case?

Good point. FL is by nature a multi-party computing case, but we may want to be specific about this.

There are two multi-party computing use cases in the context of Federated Learning:

  1. Private Set Intersection (PSI): we need to find, for example, the common set of users without letting the other participants learn anything about the non-intersecting users. This is the data-alignment step in Vertical FL.
    Currently, one way to do this is to run pairwise PSI repeatedly. With CC, it can be done in a clean-room fashion.

  2. Some users leverage a federated computing framework as a multi-agent API system, in which the FL server acts as the coordinator and the FL clients are the agents. In such a setup, each agent performs RAG operations and the coordinator performs the reranking and LLM inference; this approach is called Fed RAG. Here multiple parties are involved, with different CC TEEs protecting the server and the clients.
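
The clean-room variant of PSI mentioned in item 1 can be sketched as a plain set intersection, on the assumption that the function runs inside a TEE that every party has attested and that only its return value leaves the TEE. The party names and user IDs are made up, and this is not a cryptographic PSI protocol.

```python
def clean_room_intersection(party_sets):
    """Return only the IDs common to every party's set.

    Assumed to run inside an attested TEE, so that no party sees another
    party's non-intersecting IDs.
    """
    if not party_sets:
        return set()
    common = set(party_sets[0])
    for ids in party_sets[1:]:
        common &= set(ids)
    return common


bank_a = {"u1", "u2", "u3"}
bank_b = {"u2", "u3", "u4"}
hospital = {"u3", "u2", "u9"}

# A single pass over all parties replaces repeated pairwise PSI rounds.
shared_users = clean_room_intersection([bank_a, bank_b, hospital])
```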

@portersrc

Hey @magowan, thanks for adding this content for the AI use case (and for secure supply chains #82). These will be very useful for website visitors. I looked for the slides, which I guess are these?

My main suggestion would be to move from bullet points and open questions (taken perhaps from the slides and working group) to hardened paragraphs with a key audience in mind (e.g. an engineer or technical manager who wants to see if we can capture their use case).

@chesterxgchen

I discussed this with James (@magowan); we will write a requirements doc on this in the coming weeks.

@chesterxgchen

After discussing with James (@magowan), we decided to write this directly as a comment on this PR instead of writing a separate requirements doc. I added the Federated Learning section and addressed the comments.

The Use Case working group generated slides describing Use Cases, this PR brings the description of the Supply Chain Use case into the website content.

Signed-off-by: James Magowan <[email protected]>