Arxiv ingest method via API #158

Open: wants to merge 29 commits into base: main

star-nox

This PR will contain all functions that handle data extraction from online journals such as arXiv, Wiley, and Springer.
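
For reviewers unfamiliar with the arXiv API, here is a rough sketch of what an arXiv extraction helper could look like. It is not the code in this PR: `build_query_url` and `parse_feed` are hypothetical names chosen for illustration, and the sketch only assumes the public arXiv query endpoint, which returns an Atom XML feed.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Public arXiv query endpoint; it returns an Atom XML feed.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(search_query, start=0, max_results=5):
    """Build an arXiv API query URL (hypothetical helper, not from this PR)."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_feed(xml_text):
    """Extract the id, title, and PDF link from each entry of an Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        pdf_url = None
        # arXiv marks the PDF link with title="pdf"; other feeds may differ.
        for link in entry.findall("atom:link", ATOM_NS):
            if link.get("title") == "pdf":
                pdf_url = link.get("href")
        papers.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip(),
            "title": " ".join(title.split()),  # collapse wrapped whitespace
            "pdf_url": pdf_url,
        })
    return papers
```

The response body can be fetched with `urllib.request.urlopen(build_query_url("all:electron")).read()` and passed to `parse_feed`. Keeping URL construction and parsing separate means the parser can be tested offline against a saved feed.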

star-nox and others added 29 commits November 6, 2023 12:47
Bumps [pymupdf](https://github.com/pymupdf/pymupdf) from 1.22.5 to 1.23.6.
- [Release notes](https://github.com/pymupdf/pymupdf/releases)
- [Changelog](https://github.com/pymupdf/PyMuPDF/blob/main/changes.txt)
- [Commits](pymupdf/PyMuPDF@1.22.5...1.23.6)

---
updated-dependencies:
- dependency-name: pymupdf
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* should be fully working, in final testing

* trying to fix double nested kwargs

* fixing readable_filename in pdf ingest

* apt install tesseract-ocr, LAME

* remove stupid typo

* minor bug

* Finally fix **kwargs passing

* minor fix

* guarding against webscrape kwargs in pdf

* adding better error messages

* revert req changes

* simplify prints
Bumps [typing-extensions](https://github.com/python/typing_extensions) from 4.7.1 to 4.8.0.
- [Release notes](https://github.com/python/typing_extensions/releases)
- [Changelog](https://github.com/python/typing_extensions/blob/main/CHANGELOG.md)
- [Commits](python/typing_extensions@4.7.1...4.8.0)

---
updated-dependencies:
- dependency-name: typing-extensions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kastan Day <[email protected]>
Bumps [flask](https://github.com/pallets/flask) from 2.3.3 to 3.0.0.
- [Release notes](https://github.com/pallets/flask/releases)
- [Changelog](https://github.com/pallets/flask/blob/main/CHANGES.rst)
- [Commits](pallets/flask@2.3.3...3.0.0)

---
updated-dependencies:
- dependency-name: flask
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kastan Day <[email protected]>
* updated nomic version in requirements.txt

* initial commit to PR

* created API endpoint

* completed export function

* testing csv export on railway

* code to remove file from repo after download

* moved file storing out of docs folder

railway-app bot commented Nov 30, 2023

This PR is being deployed to Railway 🚅

flask: ◻️ REMOVED

KastanDay changed the title from "AIFARMS publications ingest" to "Arxiv ingest method via API" on Dec 11, 2023