Enhancement Suggestion: Add RAG to the main diagram #240
Comments
Thanks @jsotiro! Confirming receipt; I have self-assigned this. I also had a similar vision and will review and work with you to take this on. Appreciate the input and feedback too!
Confirming I have access to the source of the v1.1 diagram artifact and will progressively work on this, as well as on other elements for improvement.
https://openreview.net/pdf?id=wK7wUdiM5g0 is another paper that might be worth citing in this area.
+1 on adding RAG. Connecting sensitive data stores to LLMs is becoming a very popular pattern for building AI applications.
I will be happy to provide feedback or participate in the diagram creation.
Hey @Bobsimonoff, super gross formatting, but this gets down my idea about what we discussed, and I hope it helps with some conceptualization.
@GangGreenTemperTatum In the above pictures, I see that a number of the Top 10 risk annotations have been removed: all the risks attached to the arrows between LLM Production Services and plugins/extensions (sensitive information disclosure, insecure output handling, overreliance, insecure plug-in design, prompt injection (indirect), model denial of service) and the risks associated with the LLM Model (Model Theft and Training Data Poisoning). Were these removals deliberate, or was it more just to get your thoughts down so you could include the purple text boxes?
Sorry @Bobsimonoff, pretty much ignore everything but the purple boxes; I am not requesting those edits to the LLM annotations etc. The reason for the cuts was to produce a simplified abstraction of the main image for simple high-level threat modelling and to label some obvious trust boundaries.
@GangGreenTemperTatum How should I read these purple areas and "Downstream services"? Are these downstream services applicable to the left purple area as well? RAG is heavily used with Application services, so I would like to see these downstream services connected to Application services as well.
@GangGreenTemperTatum @Bobsimonoff Any thoughts on the applicability of RAG to Application services? I believe RAG is typically used in Application Services. In fact, Sam Altman claimed that plugins did not see product-market fit beyond browsing. That might change with the introduction of agents.
I think many companies, @GangGreenTemperTatum, are also driving their RAG off LangChain and similar capabilities, which would sit more in the Automation box. Also, that quote was from June; now there are literally hundreds of paid plugins/add-ons, so I am not sure his quote aged well.
Here are some examples of applications that do RAG as part of pre-processing:
Regarding plugins, there are multiple threads indicating that Custom GPTs/agent APIs are replacing plugins. Here are some examples:
I expect RAG to be used everywhere: agents, plugins and, more commonly, applications. Orchestrators like LangChain are going to make this simpler, and they are embedded as SDKs into applications. The current diagram indicates that the application only interfaces with fine-tuned and training data, which is misleading. There are three types of models that an AI application can use:
The current picture does not show this variety. I suggest expanding the LLM to include these three varieties by adding three types of boxes. Fine-tuned data is mainly the input to fine-tuned models, and training data is the input to custom models, so my suggestion is to draw those lines to the corresponding model boxes and remove them from the application. To account for applications using data stores during pre-processing at inference time, my suggestion is to generalize "Downstream services" to "Retrieval services" (or some other meaningful name) and connect them from applications as well as plugins. I would be happy to jump on a call to discuss this further.
@Bobsimonoff @NerdAboutTown are you happy taking this one as part of v2?
Retrieval augmented generation (RAG) is a technique for enriching LLMs with your own data. It has become very popular because it lowers the barrier to entry for enriching input in LLM apps, allows for better access controls than fine-tuning, and is known to reduce hallucination (see https://www.securityweek.com/vector-embeddings-antidote-to-psychotic-llms-and-a-cure-for-alert-fatigue/). See also the excellent Samsung paper on enterprise use of GenAI and the role of RAG.
RAG creates its own security risks and adds to the attack surface, yet the diagram only includes fine-tuning. We should explicitly add RAG to our diagram and annotate it with the related LLM Top 10 items.
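To make the pattern concrete for the diagram discussion, here is a minimal sketch of RAG at inference time with a per-document access check. It is an illustration under stated assumptions, not a reference implementation: the `Document`, `retrieve`, and `build_prompt` names, the role labels, and the sample data are all hypothetical, and a simple bag-of-words cosine similarity stands in for the vector embeddings a real system would use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    text: str
    allowed_roles: frozenset  # hypothetical per-document access-control list


def tokenize(text):
    """Bag-of-words stand-in for an embedding (assumption: no real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, store, user_role, k=2):
    """Return up to k matching documents the caller's role may read."""
    q = tokenize(query)
    # Enforce access control before ranking, so restricted text
    # can never reach the prompt for an unauthorized caller.
    visible = [d for d in store if user_role in d.allowed_roles]
    scored = [(cosine(q, tokenize(d.text)), d) for d in visible]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked[:k] if score > 0]


def build_prompt(query, docs):
    """Pre-processing step: splice retrieved text into the LLM input."""
    context = "\n".join(f"- {d.text}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample data store.
STORE = [
    Document("Payroll runs on the 25th of each month.", frozenset({"hr"})),
    Document("The VPN endpoint is vpn.example.com.", frozenset({"hr", "engineer"})),
]

if __name__ == "__main__":
    question = "When does payroll run?"
    print(build_prompt(question, retrieve(question, STORE, "hr")))
```

In this sketch the retrieved text flows straight into the prompt, which is the point where annotations such as sensitive information disclosure and indirect prompt injection would plausibly attach on the diagram. Filtering by role before ranking is one possible design choice; it illustrates the access-control advantage over fine-tuning mentioned above.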
Some useful links:
Architectural approaches:
Azure: https://github.com/Azure/GPT-RAG
AWS SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html
AWS Bedrock RAG workshop: https://github.com/aws-samples/amazon-bedrock-rag-workshop
Security concerns:
Security of AI Embeddings explained
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Embedding Layer: AI (Brace For These Hidden GPT Dangers)