Sensitive data leakage #1374
Comments
Hey @pesmeriz! 👋 I'm here to help you out with any issues or questions you have. If you need assistance, feel free to ask!
There is a known issue with the
A similar issue was discussed in an open issue, which is now closed. The suggested temporary fix involves modifying the
Here is the suggested temporary fix. Replace the following code in
# Add dataframe details
dataframe_info += f"\ndfs[{extras['index']}]:{df.rows_count}x{df.columns_count}\n{df.to_csv()}"
With this:
# TEMP FIX: Do not add dataframe details
df_without_sample_data = pd.DataFrame(columns=df.pandas_df.columns)
dataframe_info += f"\ndfs[{extras['index']}]:{df.rows_count}x{df.columns_count}\n{df_without_sample_data.to_csv()}"
This change will prevent the original dataframe details from being included when
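To illustrate what the temporary fix above relies on, here is a minimal pandas-only sketch. It assumes nothing about pandasai internals; `header_only_csv` and the sample data are hypothetical names made up for this example. An empty DataFrame built from the original column labels serializes to a header line only, so no row values end up in the string.

```python
import pandas as pd


def header_only_csv(df: pd.DataFrame) -> str:
    """Serialize only the column names of df, dropping every row."""
    # Same idea as the temporary fix: keep the columns, drop the data.
    empty = pd.DataFrame(columns=df.columns)
    return empty.to_csv()


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"name": ["Alice", "Bob"], "salary": [5000, 6000]})

full_csv = df.to_csv()               # contains the row values
schema_csv = header_only_csv(df)     # contains only the header line

print(schema_csv)
```

Note that this only covers the one string being built here. If the data reaches the LLM through another code path, this change alone would not stop it.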
I tried the proposed solution and the problem remains. Also, this should be made clearer in the docs. Otherwise people start using it, find a bug (or worse, they don't), and are then left without a solution and with time wasted.
Hi, @pesmeriz. I'm Dosu, and I'm helping the pandas-ai team manage their backlog. I'm marking this issue as stale.
Issue Summary:
Next Steps:
Thank you for your understanding and contribution!
Not sure this should really be closed. That bot is a bit aggressive. You can work around it by patching the
This works for me because I need more samples sent, but ymmv.
Hi @gladeluco, yes, I agree with your view on the bot. Sorry for that. Thanks for sharing the workaround in the meantime.
System Info
OS version: MacOS Sequoia 15.0
My pyproject.toml
🐛 Describe the bug
Using "enforce_privacy": True does not anonymize the data. Even if you use custom_head on your SmartDataframe, the Agent will always share the data within the original dataframe. My example:
You can check this in /pandasai/llm/bamboo_llm.py, line 18.
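One way to check whether a given workaround actually stops the leak is to scan the prompt text for cell values from the original dataframe. The following is a rough sketch, assuming you can capture the outgoing prompt as a plain string; `leaked_values` is a hypothetical helper written for this example and is not part of pandasai.

```python
import pandas as pd


def leaked_values(prompt: str, df: pd.DataFrame) -> list[str]:
    """Return the cell values of df that appear verbatim in prompt."""
    # Compare on string form; skip missing values and empty strings.
    values = {str(v) for v in df.to_numpy().ravel() if pd.notna(v)}
    return sorted(v for v in values if v and v in prompt)


# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "name": ["Alice", "Bob"],
        "email": ["alice@example.com", "bob@example.com"],
    }
)

# A prompt that carries only the column names vs. one that carries the rows.
safe_prompt = "dfs[0]:2x2\n" + pd.DataFrame(columns=df.columns).to_csv()
leaky_prompt = "dfs[0]:2x2\n" + df.to_csv()

print(leaked_values(safe_prompt, df))
print(leaked_values(leaky_prompt, df))
```

This is a plain substring check, so very short values (for example single digits) can produce false positives. It is meant as a quick sanity test, not as proof of privacy.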