Current Status on SnooSpoof (TODO) #1

@cnshing

Description

Implemented

  • Basic parsing for generated text
  • Basic scraper
  • Finish Unit Testing on Parsing

Todo

  • Add Figma, Illustrator design files to repository
  • Add check for parsing invalid texts
  • Finish Unit Testing on Scrapper
  • Implement Flask Middleman API to use every component
  • Implement Trainer Component
  • Implement Encoder Component
  • Create adapter interface between front-end and backend
  • Create React components from Figma design
  • Abstract the Scrapper component to be injectable with any PRAW Reddit Instance
  • Optimize model/text pipeline mechanisms for repeat usage
  • Basic data validation for encoder and dataset
  • Remove gentext2json as it will most likely not be used in the future
  • Remove trainer_unittest, as it is unnecessary to test a high-level abstraction over a heavily tested library
  • Compress the query parameters of the middleman generate API URL so that the URL does not exceed 2048 characters
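
For the URL-length item above, one possible approach is to serialize the query parameters to JSON, compress them with zlib, and pass the result as a single URL-safe base64 token. This is only a sketch under assumptions: the function names `pack_params` and `unpack_params` and the example parameter names are hypothetical and not part of the current codebase. Compression helps mainly for long or repetitive prompts, and base64 adds roughly a third in overhead, so very short inputs can end up longer.

```python
import base64
import json
import zlib

MAX_URL_LENGTH = 2048  # limit named in the Todo item


def pack_params(params: dict) -> str:
    """Serialize, compress, and URL-safe encode a dict of query parameters."""
    raw = json.dumps(params, separators=(",", ":"), sort_keys=True).encode("utf-8")
    compressed = zlib.compress(raw, 9)
    # Strip '=' padding so the token needs no percent-encoding in a URL.
    return base64.urlsafe_b64encode(compressed).rstrip(b"=").decode("ascii")


def unpack_params(token: str) -> dict:
    """Reverse pack_params: restore padding, decode, decompress, parse JSON."""
    padding = "=" * (-len(token) % 4)
    compressed = base64.urlsafe_b64decode(token + padding)
    return json.loads(zlib.decompress(compressed).decode("utf-8"))


# Hypothetical parameters, for illustration only.
params = {"prompt": "[title] " * 200, "temperature": 0.7}
token = pack_params(params)
restored = unpack_params(token)
```

The middleman could then accept a single query parameter carrying `token` and call `unpack_params` on the server side. A length check against `MAX_URL_LENGTH` would still be needed, since compression does not guarantee a bound.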

Nice to Have

  • Allow any component to be run modularly on separate instances and configurations (Google Colab?)
  • Figure out some way for SnooSpoof to be hosted independently/freely
  • Add a caching system to optimize the scraper/dataset
  • More robust validation for datasets (pydantic?)
  • Refactor the requires function so that it does not have to call itself to verify; instead, have it verify the features automatically whenever the requires decorator is applied
  • Refactor the tag format to make it customizable: [prompt], "prompt: ", (prompt), etc.
  • Decouple each tag format from the underlying text samples so that it does not pollute the data, e.g. by marking each tag format as a special token, making each token less distinguishable from regular text, etc.
  • Allow asynchronous processing of text generation
  • Make Google Colab FastAPI notebook public
  • Containerize the application as a Docker container for easy deployment on rented GPU servers
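
For the customizable tag format item above, a minimal sketch of what a configurable format could look like. The `TagFormat` class, its fields, and the three presets are hypothetical names chosen for illustration; they do not reflect how the parser is currently written.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TagFormat:
    """Hypothetical description of how a tag name is rendered in a sample."""

    prefix: str
    suffix: str

    def render(self, name: str, text: str) -> str:
        """Wrap the tag name in this format and prepend it to the text."""
        return f"{self.prefix}{name}{self.suffix}{text}"

    def parse(self, line: str):
        """Return (name, text) if the line starts with a tag, else None."""
        pattern = (
            re.escape(self.prefix) + r"(\w+)" + re.escape(self.suffix) + r"(.*)"
        )
        match = re.fullmatch(pattern, line, flags=re.DOTALL)
        return (match.group(1), match.group(2)) if match else None


# The three styles mentioned in the list item.
BRACKETS = TagFormat("[", "] ")
COLON = TagFormat("", ": ")
PARENS = TagFormat("(", ") ")
```

With this shape, `BRACKETS.render("prompt", "hello")` gives `[prompt] hello`, and `PARENS.parse("(prompt) hello")` gives back the name and text. Passing a `TagFormat` into the parser and encoder, rather than hard-coding one style, would keep both sides in agreement.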
