Adding a sample to redact sensitive information after an agent generates a response #2927

Open
wants to merge 12 commits into
base: main

Git-Noob123

Why are these changes needed?

We need a way to redact sensitive data that is stored in environment variables. Currently the only way to do this is transform_messages, which ONLY redacts before an agent generates a response. If a user asks an agent to run a script that reads environment variables, the values will still be revealed in the output. So we also need a way to hide sensitive data after a response is generated. The sample here shows how to do that using hooks.
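As a rough illustration of the idea (not the notebook's actual code), a post-response redaction step could look like the sketch below. The names `redact_secrets`, `example_secrets`, and `placeholder` are hypothetical and the secret values are made up. In AutoGen, a function like this would be wrapped in a hook and attached to the agent's `process_message_before_send` hookable method; that wiring is left out here so the sketch runs on its own.

```python
from typing import Mapping


def redact_secrets(text: str, secrets: Mapping[str, str], placeholder: str = "REDACTED") -> str:
    """Replace every secret value found in `text` with `placeholder`."""
    # Longest values first, so a secret that contains another secret
    # is replaced as a whole rather than partially.
    for value in sorted(secrets.values(), key=len, reverse=True):
        if value:  # skip empty values, which would match everywhere
            text = text.replace(value, placeholder)
    return text


# Hypothetical secrets; in practice these would be read from os.environ.
example_secrets = {"API_KEY": "sk-12345", "API_KEY_LONG": "sk-12345-extended"}
reply = "The key is sk-12345-extended and the short one is sk-12345."
print(redact_secrets(reply, example_secrets))
```

Because the replacement runs on the generated message rather than on the prompt, it also covers secrets that only appear in a script's output.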

Related issue number

Checks

@Git-Noob123
Author

@microsoft-github-policy-service agree

@ma-armenta ma-armenta removed the request for review from Knucklessg1 June 12, 2024 16:37
@sonichi
Collaborator

sonichi commented Jun 13, 2024

Would you like to render the notebook on the website? Please find instructions here: https://microsoft.github.io/autogen/docs/contributor-guide/documentation

@sonichi sonichi requested a review from WaelKarkoub June 13, 2024 05:24
Collaborator

@WaelKarkoub WaelKarkoub left a comment

@Git-Noob123 thank you for the notebook!

I like this notebook since it's a good example of how process_message_before_send (and hooks in general) works.

If we hook TransformMessages onto process_message_before_send, it makes debugging these agents much more difficult, since you no longer know the ground truth (the message is modified before it is stored in the context history). And as you may already know, it's already difficult to debug vanilla LLMs without any modifications.

In the intro, we should maybe explain the differences between the hookable methods, and why you picked process_message_before_send.

notebook/agentchat_postresponse_secret_redaction.ipynb
Comment on lines 166 to 184
"def transform_generated_response(message: Union[Dict, str], **kwargs ) -> Union[Dict, str]:\n",
"    temp_message = copy.deepcopy(message)\n",
"    all_secrets = sorted(env_secrets.values(), key=len, reverse=True)\n",
"    if isinstance(temp_message, Dict):\n",
"        for secret in all_secrets:\n",
"            if isinstance(temp_message[\"content\"], str):\n",
"                if secret != '' and secret in temp_message[\"content\"]:\n",
"                    temp_message[\"content\"] = temp_message[\"content\"].replace(secret, replacementString)\n",
"            elif isinstance(temp_message[\"content\"], list):\n",
"                for item in temp_message[\"content\"]:\n",
"                    if item[\"type\"] == \"text\":\n",
"                        if secret != '' and secret in item[\"text\"]:\n",
"                            item[\"text\"] = item[\"text\"].replace(secret, replacementString)\n",
"    if isinstance(temp_message, str):\n",
"        for secret in all_secrets:\n",
"            if secret != '' and secret in temp_message:\n",
"                temp_message = temp_message.replace(secret, replacementString)\n",
"\n",
"    return temp_message"
Collaborator

I generally avoid heavily nested loops simply because they are tougher to reason about. See if you can use regex to do the same thing.

Author

I believe this cannot be avoided: env variables can be very different from one another, so I don't think there's a way to use a single regex for all of them.
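For reference, one possible way to reconcile the two comments above is to build the pattern from the secret values themselves: each value is escaped with `re.escape` so it is matched literally, and the values are joined into a single alternation. This is only a sketch under that assumption, not code from the PR; `build_secret_pattern`, `redact_message`, and `placeholder` are hypothetical names, and the message shapes (a string, or a dict whose `content` is a string or a list of typed items) are taken from the function under review.

```python
import copy
import re
from typing import Dict, Iterable, Optional, Union

Message = Union[Dict, str]


def build_secret_pattern(secret_values: Iterable[str]) -> Optional[re.Pattern]:
    """Compile one pattern that matches any non-empty secret literally."""
    # Longest first: alternation tries branches left to right, so a secret
    # that contains a shorter secret is matched as a whole.
    values = sorted({v for v in secret_values if v}, key=len, reverse=True)
    if not values:
        return None
    return re.compile("|".join(re.escape(v) for v in values))


def redact_message(message: Message, pattern: Optional[re.Pattern], placeholder: str = "REDACTED") -> Message:
    """Return a redacted copy of `message`; the input is not modified."""
    if pattern is None:
        return message

    def scrub(text: str) -> str:
        # A function replacement avoids backslash processing in `placeholder`.
        return pattern.sub(lambda _match: placeholder, text)

    if isinstance(message, str):
        return scrub(message)

    redacted = copy.deepcopy(message)
    content = redacted.get("content")
    if isinstance(content, str):
        redacted["content"] = scrub(content)
    elif isinstance(content, list):
        for item in content:
            if item.get("type") == "text":
                item["text"] = scrub(item["text"])
    return redacted


# Made-up values, including regex metacharacters and an empty string.
pattern = build_secret_pattern(["p@ss.w0rd+1", "", "abc"])
print(redact_message("token p@ss.w0rd+1 and abc", pattern))
```

The per-secret loop disappears because the regex engine handles all secrets in one pass, which leaves only the branching on message shape.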

@Git-Noob123
Author

Git-Noob123 commented Jun 14, 2024

@WaelKarkoub Thanks for the comments! I have resolved all of them except the regex one. I also added a section at the beginning of the notebook that describes what hooks are in AutoGen. Please review it and feel free to add more comments and thoughts.

For debugging with the postprocess hook, I think we should add another section to the notebook that warns readers how difficult it can be to debug with redacted messages, and notes that users can add logging before redaction. On the other hand, hiding sensitive information is more of a security concern, as you don't want users to see your secrets. Please let me know what you think.
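The "logging before redaction" idea could be sketched as a small wrapper around whatever redaction function is used. Everything here is an assumption for illustration: `with_pre_redaction_log` and the logger name `redaction_audit` are hypothetical, and the sketch assumes the logger writes to a sink that only trusted operators can read, since the unredacted text contains the secrets.

```python
import logging
from typing import Callable

# Hypothetical logger name; route it to an access-restricted handler.
logger = logging.getLogger("redaction_audit")


def with_pre_redaction_log(redact: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap `redact` so the original text is logged whenever it gets changed."""

    def wrapper(text: str) -> str:
        redacted = redact(text)
        if redacted != text:
            # WARNING: this record contains the unredacted message.
            logger.debug("Message changed by redaction; original: %r", text)
        return redacted

    return wrapper


# Usage with a made-up secret and a trivial redaction function.
redact = with_pre_redaction_log(lambda t: t.replace("s3cret", "REDACTED"))
print(redact("password=s3cret"))
```

This keeps the ground truth available for debugging while the conversation history only ever sees the redacted text.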

@ghost

ghost commented Jun 14, 2024 via email

4 participants