Feature Request - Conversation Import, Editing, and Export Capabilities for Dataset Creation
Issue Description:
I am proposing a new feature that allows users to load and edit their conversations with GPT, with the goal of exporting them as a structured dataset. This feature would be immensely helpful for users who have a large amount of interaction data (~10M tokens) and want to organize it into a dataset for future training purposes, potentially even working towards a 1B-token dataset.
The feature should ideally enable:
Conversation Import and Structuring: Ability to load all user-GPT conversations and organize them in a structured format, like a thread, where each frame corresponds to one conversation.
Message Editing: Provide functionality to edit or delete individual messages within a conversation. This could include the ability to modify the content or to remove certain interactions completely.
Text Masking through Regex/Substitution: Allow the use of regular expressions or text substitution to mask sensitive information, such as the user's name or any other identifiable information. This would be particularly useful for users who want to anonymize their conversations before exporting or sharing.
Text Tagging: Offer an option to select a span of text within a message and assign it specific tags. This could be useful for further classification and analysis of the conversation data.
These capabilities would make it significantly easier to extract meaningful information from the conversations, anonymize the data, and transform it into a structured format, ready for any subsequent data analysis or machine learning task.
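To make the masking and export steps concrete, here is a minimal sketch in Python. It is not part of the existing system: the rule list, the placeholder strings `<NAME>` and `<EMAIL>`, the function names, and the JSONL layout with `role`/`content` keys are all assumptions chosen for illustration.

```python
import json
import re

# Hypothetical masking rules: (compiled pattern, replacement) pairs.
# A real tool would let the user define these in the UI.
MASK_RULES = [
    (re.compile(r"\bAlice\b"), "<NAME>"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "<EMAIL>"),
]


def mask_text(text, rules=MASK_RULES):
    """Apply every masking rule to the text, in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def export_jsonl(conversations, rules=MASK_RULES):
    """Serialize conversations to JSONL, one conversation per line.

    Each conversation is assumed to be a list of
    {"role": ..., "content": ...} dicts (an assumed layout).
    """
    lines = []
    for messages in conversations:
        masked = [
            {"role": m["role"], "content": mask_text(m["content"], rules)}
            for m in messages
        ]
        lines.append(json.dumps({"messages": masked}))
    return "\n".join(lines)
```

Keeping the rules as data rather than hard-coding them means the same export path works for names, emails, or any other pattern a user wants to anonymize.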
This is a proposal for an enhancement to our current system, and any thoughts, suggestions or improvements are highly welcome.
I'm gonna use the new function in the BaseThread to do this. The user inserts the ChatGPT link and it opens a new conversation tab in the UI with the same history. So you can't load all the conversations at once unless you provide the link for each conversation, one by one.
no problem
no problem
I can do that, but I'm not sure whether you want to do it directly in the chat or in a memoryframe? And how should we visualize the frame right now?
OK to start with. I would prefer that we can load all the conversations from the backup zip, which I would drag and drop.
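A rough sketch of the zip loading step, in Python. The thread does not specify the backup format, so the member name `conversations.json` and the idea that it holds one JSON document are assumptions; `load_conversations` is a hypothetical helper, not an existing function in the project.

```python
import json
import zipfile


def load_conversations(zip_source, member="conversations.json"):
    """Return the parsed JSON stored under `member` in the backup archive.

    `zip_source` can be a file path or a binary file-like object,
    for example the bytes received from a drag-and-drop upload.
    The default member name is an assumption about the backup layout.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as fh:
            return json.load(fh)
```

Accepting a file-like object as well as a path lets the UI pass the dropped file straight through without writing it to disk first.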
For now, a visualization similar to a single column of the text/pytorch dual column repo would be nice. To be fair, I don't know what the best UI for the tagging would be. We can store the tags in the frame itself.
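One possible way to store the tags in the frame itself, sketched in Python. The `Frame` and `SpanTag` classes below are hypothetical stand-ins, since the actual frame and BaseThread classes are not shown in this thread; the character-offset representation of a tag is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SpanTag:
    """A label attached to characters [start, end) of one message."""
    message_index: int
    start: int
    end: int
    label: str


@dataclass
class Frame:
    """Hypothetical frame: one conversation plus the tags on its messages."""
    messages: list
    tags: list = field(default_factory=list)

    def tag(self, message_index, start, end, label):
        """Record a tag and return the tagged text for confirmation."""
        text = self.messages[message_index]["content"]
        if not 0 <= start < end <= len(text):
            raise ValueError("span out of range")
        self.tags.append(SpanTag(message_index, start, end, label))
        return text[start:end]
```

Storing offsets rather than copies of the text keeps the tags small, but it also means the offsets need to be updated or dropped when a message is edited or masked.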