Fix Type Error in Nomic Logging (#174)
* updated nomic version in requirements.txt
* Updated Nomic in requirements.txt
* fix openai version to pre 1.0
* upgrade python from 3.8 to 3.10
* trying to fix tesseract // pdfminer requirements for image ingest
* adding strict versions to all requirements
* Bump pymupdf from 1.22.5 to 1.23.6 (#136)

Bumps [pymupdf](https://github.com/pymupdf/pymupdf) from 1.22.5 to 1.23.6.
- [Release notes](https://github.com/pymupdf/pymupdf/releases)
- [Changelog](https://github.com/pymupdf/PyMuPDF/blob/main/changes.txt)
- [Commits](pymupdf/PyMuPDF@1.22.5...1.23.6)

---
updated-dependencies:
- dependency-name: pymupdf
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* compatible wheel version
* upgrade pip during image startup
* properly upgrade pip
* Fully lock ALL requirements. Hopefully speed up build times, too
* Limit unstructured dependencies, image ballooned from 700MB to 6GB. Hopefully resolved
* Lock version of pip
* Lock (correct) version of pip
* add libgl1 for cv2 in Docker (for unstructured)
* adding proper error logging to image ingest
* Installing unstructured requirements individually to hopefully reduce bundle size by 5GB
* Reduce use of unstructured, hopefully the install is much smaller now
* Guarantee Unique S3 Upload paths (#137)
* should be fully working, in final testing
* trying to fix double nested kwargs
* fixing readable_filename in pdf ingest
* apt install tesseract-ocr, LAME
* remove stupid typo
* minor bug
* Finally fix **kwargs passing
* minor fix
* guarding against webscrape kwargs in pdf
* guarding against webscrape kwargs in pdf
* guarding against webscrape kwargs in pdf
* adding better error messages
* revert req changes
* simplify prints
* Bump typing-extensions from 4.7.1 to 4.8.0 (#90)

Bumps [typing-extensions](https://github.com/python/typing_extensions) from 4.7.1 to 4.8.0.
- [Release notes](https://github.com/python/typing_extensions/releases)
- [Changelog](https://github.com/python/typing_extensions/blob/main/CHANGELOG.md)
- [Commits](python/typing_extensions@4.7.1...4.8.0)

---
updated-dependencies:
- dependency-name: typing-extensions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kastan Day <[email protected]>

* Bump flask from 2.3.3 to 3.0.0 (#101)

Bumps [flask](https://github.com/pallets/flask) from 2.3.3 to 3.0.0.
- [Release notes](https://github.com/pallets/flask/releases)
- [Changelog](https://github.com/pallets/flask/blob/main/CHANGES.rst)
- [Commits](pallets/flask@2.3.3...3.0.0)

---
updated-dependencies:
- dependency-name: flask
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kastan Day <[email protected]>

* Guard against kwargs failures during webscrape
* HOTFIX: kwargs in html and pdf ingest for /webscrape
* Export conversation history on /analysis page (#141)
* updated nomic version in requirements.txt
* initial commit to PR
* created API endpoint
* completed export function
* testing csv export on railway
* code to remove file from repo after download
* moved file storing out of docs folder
* added option for extending one URL out when on baseurl or to opt out of it
* Guarantee unique s3 upload paths, support file updates (e.g. duplicate file guard for Cron jobs) (#99)
* added the add_users() for Canvas
* added canvas course ingest
* updated requirements
* added .md ingest and fixed .py ingest
* deleted test ipynb file
* added nomic viz
* added canvas file update function
* completed update function
* updated course export to include all contents
* modified to handle diff file structures of downloaded content
* modified canvas update
* modified ingest function
* modified update_files() for file replacement
* removed the extra os.remove()
* fix underscore to dash in for pip
* removed json import and added abort to canvas functions
* created separate PR for file update
* added file-update logic in ingest, WIP
* removed irrelevant text files
* modified pdf ingest function
* fixed PDF duplicate issue
* removed unwanted files
* updated nomic version in requirements.txt
* modified s3_paths
* testing unique filenames in aws upload
* added missing library to requirements.txt
* finished check_for_duplicates()
* fixed filename errors
* minor corrections
* added a uuid check in check_for_duplicates()
* regex depends on this being a dash
* regex depends on this being a dash
* Fix bug when no duplicate exists.
* cleaning up prints, testing looks good. ready to merge
* Further print and logging refinement
* Remove s3 based method for de-duplication, use Supabase only
* remove duplicate imports
* remove new requirement
* Final print cleanups
* remove pypdf import

---------

Co-authored-by: root <root@ASMITA>
Co-authored-by: Kastan Day <[email protected]>

* Add Trunk Superlinter on-commit hooks (#164)
* First attempt, should auto format on commit
* maybe fix my yapf github action? Just bad formatting.
* Finalized, excellent Trunk configs for my desired formatting
* Further fix yapf GH Action
* Full format of all files with Trunk
* Fix more linting errors
* Ignore .vscode folder
* Reduce max line size to 120 (from 140)
* Format code
* Delete GH Action & Revert formatting in favor of Trunk.
* Ignore the Readme
* Remove trufflehog -- failing too much, confusing to new devs
* Minor docstring update
* trivial commit for testing
* removing trivial commit for testing
* Merge main into branch, vector_database.py probably needs work
* Cleanup all Trunk lint errors that I can

---------

Co-authored-by: KastanDay <[email protected]>
Co-authored-by: Rohan Marwaha <[email protected]>

* Add example usage of our public API for chat calls
* Add timeout to request, best practice
* Add example usage notebook for our public API
* Improve usage example to return model's response for easy storage. Fix linter inf loop
* Final fix: Switch to https connections
* Enhance logging in getTopContexts(), improve usage example
* minor changes for postman testing
* minor changes for testing
* added print statements
* re-creating error
* added condition to check if content is a list
* added json handling needed to test with Postman
* exception handling for get-nomic-map
* json decoding for testing
* added prints for testing
* added prints for testing
* added prints for testing
* added prints for testing
* fix for string error in nomic log
* removed json debugging code
* Cleanup comments
* Enhance type checking, cleanup formatting
* formatting
* Fix type checks to isinstance()
* Revert vector_database.py to status on main

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Kastan Day <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jkmin3 <[email protected]>
Co-authored-by: root <root@ASMITA>
Co-authored-by: KastanDay <[email protected]>
Co-authored-by: Rohan Marwaha <[email protected]>
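
The entries "added condition to check if content is a list" and "Fix type checks to isinstance()" describe the kind of guard that fixes this type error. The diff itself is not shown here, so the following is only a minimal sketch of that pattern, not the project's code. The function name `content_to_text` and the assumed message shape (content is either a string or a list of parts that may carry a `"text"` key) are hypothetical.

```python
from typing import Any


def content_to_text(content: Any) -> str:
    """Flatten a message's content into a plain string before logging it.

    Assumption (hypothetical shape, not taken from the commit): content is
    either a string, or a list whose items are strings or dicts that may
    hold a "text" key.
    """
    # isinstance() also accepts subclasses, unlike a `type(x) == str` comparison.
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and isinstance(part.get("text"), str):
                parts.append(part["text"])
            # Parts without usable text (e.g. image entries) are skipped.
        return " ".join(parts)
    # Fall back to the string form for anything unexpected.
    return str(content)
```

Normalizing at one point like this means the logging call always receives a string, whichever shape the upstream request used.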