Update AI Use Cases #83
base: main
Conversation
Force-pushed from 8563d5c to f1b90f9
Nice. A few comments.
### Attestations
- Federated Learning requests multi-SDK attestation
- FL Servers needs to verify all client’s trustworthiness
nit: client’s -> clients’
### Attestations
- Federated Learning requests multi-SDK attestation
- FL Servers needs to verify all client’s trustworthiness
- Attestation at different points, self and cross verifications via attestation service
We don't support all of these modes currently. What is the scope of this document? Are we introducing people to what these workloads are, the requirements, how you can use coco, etc?
Intended to be a first pass at capturing info on the Use Cases, not how you can use CoCo just yet. Aiming to start somewhere; further PRs can then improve it until we ultimately show how you use CoCo to solve each Use Case, with an example set of containers anyone could use to try out the Use Case.
### CC Policies:
- Bootup policy – provided by hardware vendor
- Self-verification CC policy – user defined
What is self-verification?
Self-verification means: the participating pod verifies whether it satisfies its own specified CC policy. For example, if the policy specifies the pod needs a GPU but the host has no GPU, then self-verification should fail.
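For illustration, here is a minimal self-verification sketch (the policy format and the GPU probe are hypothetical, not a CoCo API):

```python
# Hypothetical self-verification sketch: the pod checks the host it landed on
# against its own CC policy. Policy format and probe are illustrative only.
import shutil

def host_has_gpu() -> bool:
    # Illustrative probe; a real check would query devices or vendor tooling.
    return shutil.which("nvidia-smi") is not None

def self_verify(policy: dict) -> bool:
    if policy.get("requires_gpu") and not host_has_gpu():
        return False  # policy demands a GPU but the host has none -> fail
    return True

if not self_verify({"requires_gpu": True}):
    raise SystemExit("self-verification failed: host does not satisfy the CC policy")
```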
Updated with some clarification from Chester
(image: Multi-Party Compute)
Source: [https://uploads-ssl.webflow.com/63c54a346e01f30e726f97cf/6418fb932b58e284018fdac0_OC3%20-%20Keith%20Moyer%20-%20MPC%20with%20CC.pdf](https://uploads-ssl.webflow.com/63c54a346e01f30e726f97cf/6418fb932b58e284018fdac0_OC3%20-%20Keith%20Moyer%20-%20MPC%20with%20CC.pdf)
I think our docs are required to follow a Creative Commons license. I don't see any copyright notice in these slides so maybe we're ok. Are the other images in this doc created by us or do they come from somewhere else?
I just checked; both PDFs are from @bpradipt.
@@ -9,4 +9,74 @@ tags:
weight: 60
---

Coming soon
## Federated Learning
Can we add an introductory section to describe the purpose of the document?
### What is Federated Learning?
Federated Learning (FL) is a decentralized machine learning approach where multiple participants (such as organizations, edge devices, or distributed servers) collaboratively train a model without sharing their raw data. Instead of centralizing data, each participant trains a local model and shares only model updates with a central aggregator, preserving data privacy and reducing communication overhead.
FL is widely applied in domains like healthcare, finance, transportation, and edge computing, where data privacy, the cost of moving data, regulatory compliance, and security are critical concerns.
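To make the flow concrete, here is a minimal sketch of a few FL rounds in plain Python/NumPy (illustrative only: a least-squares model stands in for real training, and the names are not any particular FL framework's API):

```python
import numpy as np

def local_update(global_w, local_data, lr=0.1):
    """Each participant trains locally and shares only updated weights,
    never its raw data (one least-squares gradient step, for illustration)."""
    X, y = local_data
    grad = X.T @ (X @ global_w - y) / len(y)
    return global_w - lr * grad

def fed_avg(updates, sample_counts):
    """Central aggregator: weighted average of model updates (FedAvg)."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(updates, sample_counts))

rng = np.random.default_rng(0)
global_w = np.zeros(3)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(3)]
for _ in range(5):  # a few FL rounds
    updates = [local_update(global_w, data) for data in clients]
    global_w = fed_avg(updates, [len(y) for _, y in clients])
```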
### Confidential Federated AI with Confidential Computing
Although Federated Learning (FL) doesn't move the data, there are still data privacy issues with the model updates. Several privacy-enhancing technologies (PETs) are available, for example Differential Privacy and Homomorphic Encryption.
By leveraging Confidential Computing Trusted Execution Environments (TEEs), we add an additional security option, or layer, for FL. FL + Confidential Computing can be called Confidential Federated AI.
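For instance, a minimal sketch of applying Differential Privacy to a model update before sharing it (illustrative: clip, then add Gaussian noise, DP-SGD style; the clipping norm and noise scale here are arbitrary, not calibrated privacy parameters):

```python
import numpy as np

def dp_sanitize(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip the update's L2 norm, then add Gaussian noise before sharing."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(scale=noise_std, size=update.shape)

print(dp_sanitize(np.array([3.0, 4.0])))  # norm 5 clipped to 1, plus noise
```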
### Confidential Federated AI Use Cases and Requirements
Unlike most AI Inference, RAG, or clean-room use cases, Federated Learning can involve multiple Confidential TEEs. The use cases and requirements are different from the single-TEE case.
We have identified three common use cases:
### Building Explicit Trust
Since FL requires multiple participants (banks, hospitals, etc.) to collaborate, they must trust each other. Traditionally, such trust is based on business/legal contracts and IT security team verifications; this can be considered implicit trust, as participants need to trust the IT infrastructure and data scientists on all sides to be honest, have no ill intentions, and make no mistakes.
If the participants are really concerned about security, CC provides a much stronger guarantee, "explicit trust", by leveraging CC Attestation.
"implicit trust" ==> "explicit trust"
### Requirements for Attestation
- We need to be able to verify that the other participants are trustworthy against a specified CC policy.
  This is cross-verification via attestation; the policies are cross-verification policies.
  For example, in an FL system we have an FL server and several FL clients. The FL Server needs to be able to check that FL Clients are trustworthy; FL clients want to make sure the FL Server is trustworthy as well.
- Approach 1: CoCo allows the application to get an attestation token.
  The FL Server and FL Client exchange attestation tokens and then independently verify the received token via different attestation services.
  Currently, CoCo doesn't provide a way for the application to get the attestation token, so this approach doesn't work in CoCo today.
- Approach 2: CoCo Trustee provides an API to perform the verification.
  For example, instead of the FL Server and FL Client exchanging attestation tokens, the FL Server asks CoCo Trustee with an instruction like:
  "please verify the Client with ID = 123 is trustworthy based on policy 1"
  `fl_server.get_trustee().verify(object_id=123, policy=policy_1.json)`
  Similarly, the FL Client asks CoCo Trustee to do the same:
  "please verify the FL Server with ID = 456 is trustworthy based on policy 2"
  `fl_client.get_trustee().verify(object_id=456, policy=policy_2.json)`
  In other words, in the CoCo environment, please provide one of these ways to do it: either let me do it myself (provide me with the token) or do it for me. (A sketch of what such an API could look like follows this list.)
- Attestation must be verifiable ad hoc or periodically.
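For illustration, a minimal sketch of what such a hypothetical Trustee verification API could look like (this is not an existing CoCo/Trustee API; the class, method, endpoint, and policy/evidence fields are all assumptions):

```python
# Hypothetical cross-verification sketch: NOT an existing CoCo/Trustee API.
import json

class Trustee:
    def __init__(self, attestation_service_url: str):
        self.url = attestation_service_url  # assumed attestation service endpoint

    def verify(self, object_id: int, policy: dict) -> bool:
        # A real service would fetch the peer's attestation evidence and
        # evaluate it against the policy; here we only show the shape.
        evidence = self._fetch_evidence(object_id)
        return all(evidence.get(k) == v for k, v in policy.items())

    def _fetch_evidence(self, object_id: int) -> dict:
        # Stubbed evidence; a real implementation would call self.url.
        return {"tee_type": "SNP", "measurement": "abc123"}

policy_1 = json.loads('{"tee_type": "SNP", "measurement": "abc123"}')
trustee = Trustee("https://trustee.example/attest")  # hypothetical endpoint
if not trustee.verify(object_id=123, policy=policy_1):
    raise SystemExit("FL client 123 failed cross-verification")
```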
### Secure Aggregation
The 2nd use case is secure aggregation. This is similar to the clean-room use case: each client's model is sent to the aggregator (usually in the FL server), where the models are aggregated. In the simplest case, the clients trust each other but do not trust the server, as a model can potentially be inverted to recover private data.
In this case, we can use a Confidential Computing TEE (CoCo) to lock down access to the aggregator (a toy sketch follows).
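To illustrate the lockdown, here is a toy clean-room aggregation sketch (assumes the Python `cryptography` package; a pre-shared Fernet key stands in for attestation-gated key release, which a real CoCo deployment would obtain only after successful attestation):

```python
# Toy clean-room aggregation: clients encrypt updates so only the attested
# TEE can read them; the TEE exposes only the aggregate.
import json
from cryptography.fernet import Fernet

tee_key = Fernet.generate_key()  # stand-in for a key released post-attestation
f = Fernet(tee_key)

client_updates = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
ciphertexts = [f.encrypt(json.dumps(u).encode()) for u in client_updates]

# Inside the TEE: decrypt, aggregate, and reveal only the aggregate.
decrypted = [json.loads(f.decrypt(c)) for c in ciphertexts]
aggregate = [sum(col) / len(col) for col in zip(*decrypted)]
print(aggregate)  # ~[0.3, 0.4] -- individual updates never leave the TEE
```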
### Model IP Protection
The 3rd use case is preventing the model from being copied away during fine-tuning on the client side.
In this case, we can lock down the FL client side with a CC TEE and other techniques.
### Summary of Use Cases and Requirements
We have discussed the FL use cases and requirements.
The main blocker for FL in CoCo is attestation for cross-verification. While we can do this easily in non-CoCo deployment environments (such as a CVM on bare metal, a CVM on Azure, or Confidential Containers in Azure ACI), we can't do the same in the CoCo environment.
In CoCo, we can't do it ourselves (no access to the attestation token) and CoCo won't do it for us (no APIs).
Hope this can change soon.
IP protection applies to more than Federated Learning; it matters for any model inference (including RAG or agents).
The key issue, beside run-time protection, is how to deploy the model to the untrustworthy host before the TEE is established.
In other words:
Model + Code ==> distributed ==> inference serving infrastructure (untrusted host, will launch TEE)
How do we ensure the model + code are not tampered with BEFORE the TEE is running?
Federated Learning (training or evaluation) faces the same challenges during this process (provisioning + distribution).
IP protection means the model weights and code (training and inference code) will not be leaked even if the untrusted host tries to tamper BEFORE the CVM is launched (a minimal sketch of an integrity check follows).
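One common pattern for that pre-launch window: record digests of the model and code at publish time, deliver them over a trusted channel (e.g. signed, or bound into the attested measurement), and re-verify inside the TEE after launch. A minimal sketch (the manifest format and file names are assumptions):

```python
# Illustrative pre-launch integrity check: host-side tampering with model or
# code is detected inside the TEE before anything runs.
import hashlib
import pathlib

def digest(path: pathlib.Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def make_manifest(artifacts):  # run by the model/code publisher
    return {str(p): digest(p) for p in artifacts}

def verify_manifest(manifest):  # run inside the TEE after launch
    return all(digest(pathlib.Path(p)) == h for p, h in manifest.items())

model = pathlib.Path("model.bin")
model.write_bytes(b"weights")      # hypothetical artifact for the demo
manifest = make_manifest([model])  # must reach the TEE untampered (signed)
assert verify_manifest(manifest)
```

Integrity alone does not stop copying; confidentiality additionally requires shipping the artifacts encrypted and releasing the decryption key only after attestation.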
- Support multiple devices
- One participant may has only one type of device (CPU), but need to verify other participant’s devices including different CPU and GPU

## Retrieval Augmented Generation Large Language Models (RAG LLM)
Maybe we should put this section first? It seems like it's the simplest case and closest to the demos that we have seen.
- data is already poisoned at rest
- model is already poisoned at rest

## Multi-Party Computing
Multi-party computing could be more generic than AI. Do we want it here or in another use case?
Good point. FL is by nature a multi-party computing case, but we may want to be specific on this.
There are two multi-party computing use cases in the context of Federated Learning:
- Private Set Intersection (PSI): we need to find the common set of users, for example, without letting other participants learn the non-intersected users. This is the data-alignment step in Vertical FL. Currently, one way to do this is running pair-wise PSI repeatedly. With CC, this can be done in a clean-room fashion (see the sketch after this list).
- Some leverage a Federated Computing Framework as a multi-agent API system, where the FL server is the Coordinator and the FL clients are the agents. In such a setup, each agent performs RAG operations and the Coordinator performs the reranking and LLM inferencing; this approach is called Fed RAG. In this case, multiple parties are involved, with different CC TEEs protecting the Server and the Clients.
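For the PSI item above, a toy sketch of the clean-room approach (illustrative only: inputs would arrive encrypted after attestation, and cryptographic PSI protocols, e.g. OPRF-based, avoid needing even a trusted aggregator):

```python
# Toy clean-room PSI: each party submits its user IDs to the attested TEE,
# which reveals only the intersection, never the non-intersected IDs.

def clean_room_psi(*id_sets: set) -> set:
    # Runs inside the TEE: intersection across all parties' sets.
    return set.intersection(*id_sets)

bank_users = {"u1", "u2", "u3"}
hospital_users = {"u2", "u3", "u4"}
print(clean_room_psi(bank_users, hospital_users))  # {'u2', 'u3'}
```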
Hey @magowan, thanks for adding this content for the AI use case (and for secure supply chains #82). These will be very useful for website visitors. I looked for the slides, which I guess are these? My main suggestion would be to move from bullet points and open questions (taken perhaps from the slides and working group) to hardened paragraphs with a key audience in mind (e.g. an engineer or technical manager who wants to see if we can capture their use case).
Discussed with James (@magowan) on this; we will write a requirements doc in the coming weeks.
After discussing with James (@magowan), we decided to write directly as comments on this PR instead of writing a separate requirements doc. I added the Federated Learning section and addressed the comments.
The Use Case working group generated slides describing Use Cases, this PR brings the description of the Supply Chain Use case into the website content. Signed-off-by: James Magowan <[email protected]>