Update on collecting learnings centrally #436
AntonOsika
started this conversation in
General
Replies: 1 comment
-
I don't have an issue with collecting prompts and prompt results, as long as it isn't collecting code that we may have it analyze, such as our own code base.
-
Hey everyone,
Since we added the ability for gpt-engineer to learn from each time it's used, I wrote this up in the "terms" and linked to the README on GitHub explaining the data collection process. However, I crucially missed adding the file to the update (this is fixed now), so it became a broken link. I'm a bit shaken by this blunder on my side.
To be fully clear:
The learnings are collected centrally, and they include: 1. the prompt you used, 2. whether you gave "feedback" to the system (via the feedback file). A lot of care is put into not sending or storing any private data (IP, user agent, or similar). I'm happy to have this fact audited.
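To make the scope concrete, here is a hypothetical sketch of what a collected "learning" record could contain, based only on the description above. The class and field names are my assumptions, not gpt-engineer's actual data model.

```python
# Hypothetical sketch of a collected "learning" record; field names are
# assumptions based on the discussion, not gpt-engineer's real schema.
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Learning:
    prompt: str                # the prompt the user ran gpt-engineer with
    feedback: Optional[str]    # contents of the feedback file, if any
    # Deliberately no IP address, user agent, or other private data.


record = Learning(prompt="Build a snake game", feedback="Worked on first try")
payload = asdict(record)  # {'prompt': ..., 'feedback': ...}
```

The point of the sketch is that the payload contains only the two fields described above and nothing identifying.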
With that said, if you are uncomfortable with your prompts being sent to other servers and stored, I advise against using gpt-engineer.
Personally I am committed to doing what is best for the community. Here, I think this means striking the right balance between not invading privacy and building a useful tool. Without getting feedback on how well this tool works for users, it is difficult to improve it. Furthermore, I limit access to the raw data as much as possible: only I have access to it, and this will continue to be the case. I will aggregate "number of successes" and similar metrics and share those more broadly.
My experience before this was brought up was that very few people are protective of sharing their prompts with external services. I have now met those who disagree, and I have already merged a PR that attempts to address this.
A quick ask from me to the community:
If someone could make a PR to make the "CLI review flow" ask "is OK to send data", that would be greatly appreciated.
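For anyone picking this up, a minimal sketch of such a consent check might look like the following. The function names and the prompt wording are my assumptions, not existing gpt-engineer code; the `prompt_fn` parameter is only there so the check is testable without a live terminal.

```python
# Hypothetical consent check for the CLI review flow; names are assumptions,
# not gpt-engineer's actual API.

def ask_collection_consent(prompt_fn=input) -> bool:
    """Ask whether learnings may be sent; anything but an explicit yes means no."""
    answer = prompt_fn(
        "Is it OK to send your prompt and feedback to help improve gpt-engineer? [y/N] "
    )
    return answer.strip().lower() in ("y", "yes")


def maybe_send_learning(payload: dict, prompt_fn=input) -> bool:
    """Send the learning only if the user opts in; return whether it was sent."""
    if ask_collection_consent(prompt_fn):
        # send payload to the central server here
        return True
    # Otherwise nothing leaves the machine.
    return False
```

Defaulting to "no" on an empty answer keeps the flow opt-in, which seems to match the spirit of the discussion.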
In addition, I wanted to ask you all:
What do you think? OpenAI will store the prompts and feedback (so we don't get around that) but do you think it is a good idea that we collect fail/success data from those that use gpt-engineer? Or do you prefer if we don't collect any data at all?