Polish search API, frontend and backend #28

Merged
merged 30 commits into from
Oct 2, 2024
Changes from 23 commits
69 changes: 69 additions & 0 deletions .github/workflows/run_tests.yml
@@ -0,0 +1,69 @@
name: Test Suite

on:
pull_request:
branches:
- main

push:
branches:
- main

jobs:
test:
name: Python ${{ matrix.python-version }} - ${{ matrix.connection }} [redis-stack ${{matrix.redis-stack-version}}]
runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
python-version: ["3.11"]
# python-version: ["3.9", "3.10", "3.11"]
redis-stack-version: ['latest']
# python-version: [3.9, 3.10, 3.11] # idk if we need all of this for a demo repo
# connection: ['hiredis', 'plain']
# redis-stack-version: ['6.2.6-v9', 'latest', 'edge']

services:
redis:
image: redis/redis-stack-server:${{matrix.redis-stack-version}}
ports:
- 6379:6379

steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'

- name: Install Poetry
uses: snok/install-poetry@v1

- name: Install dependencies
working-directory: ./backend
run: |
poetry install --all-extras

# - name: Install hiredis if needed
# if: matrix.connection == 'hiredis'
# run: |
# poetry add hiredis

- name: Set Redis version
run: |
echo "REDIS_VERSION=${{ matrix.redis-stack-version }}" >> $GITHUB_ENV

# - name: Authenticate to Google Cloud
# uses: google-github-actions/auth@v1
# with:
# credentials_json: ${{ secrets.GOOGLE_CREDENTIALS }}

- name: Run tests
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
working-directory: ./backend
run: |
poetry run test
8 changes: 8 additions & 0 deletions .gitignore
@@ -1,9 +1,17 @@
arxiv-metadata-oai-snapshot.json
arxiv-papers-1000.json
arxiv.zip
*.DS_STORE
*.log
.env
.ipynb_checkpoints
*.pkl
.venv
venv
__pycache__
new_backend/arxivsearch/templates/
*/.nvm
.coverage*
coverage.*
htmlcov/
legacy-data/
24 changes: 24 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,24 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: FastAPI",
[Review comment, contributor author] I do like checking in launch scripts in case people need help setting up a debugger.

"type": "debugpy",
"cwd": "${workspaceFolder}/backend/",
"env": {
"PYTHONPATH": "${cwd}"
},
"request": "launch",
"module": "uvicorn",
"args": [
"arxivsearch.main:app",
"--port=8888",
"--reload"
],
"jinja": true,
}
]
}
6 changes: 6 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,6 @@
{
"python.testing.pytestArgs": [],
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"python.testing.cwd": "${workspaceFolder}/backend/",
}
31 changes: 20 additions & 11 deletions Dockerfile
@@ -1,38 +1,47 @@
FROM node:18.8-alpine AS ReactImage
FROM node:22.0.0 AS ReactImage

WORKDIR /app/frontend

ENV NODE_PATH=/app/frontend/node_modules
ENV PATH=$PATH:/app/frontend/node_modules/.bin

COPY ./frontend/package.json ./
RUN yarn install --no-optional
RUN npm install

ADD ./frontend ./
RUN yarn build
RUN npm run build


FROM python:3.9-slim-buster AS ApiImage
FROM python:3.11 AS ApiImage

ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1

RUN python3 -m pip install --upgrade pip setuptools wheel

WORKDIR /app/
COPY ./data/ ./data
VOLUME [ "/data" ]

RUN apt-get update && \
apt-get install -y curl && \
rm -rf /var/lib/apt/lists/*

RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=/opt/poetry python && \
cd /usr/local/bin && \
ln -s /opt/poetry/bin/poetry && \
poetry config virtualenvs.create false

RUN mkdir -p /app/backend

# copy deps first so we don't have to reload everytime
COPY ./backend/poetry.lock ./backend/pyproject.toml ./backend/

WORKDIR /app/backend
RUN poetry install --all-extras --no-interaction

COPY ./backend/ .
RUN pip install -e . --no-cache-dir

# add static react files to fastapi image
COPY --from=ReactImage /app/frontend/build /app/backend/arxivsearch/templates/build

LABEL org.opencontainers.image.source https://github.com/RedisVentures/redis-arxiv-search

WORKDIR /app/backend/arxivsearch

CMD ["sh", "./entrypoint.sh"]
CMD ["poetry", "run", "start-app"]
13 changes: 8 additions & 5 deletions README.md
@@ -1,11 +1,12 @@

[Review comment, collaborator] Is the rest of this README up to date with instructions to run the service?

[Reply, contributor author] It is now.

<div align="center">
<a href="https://github.com/RedisVentures/redis-arXiv-search"><img src="https://github.com/RedisVentures/redis-arXiv-search/blob/main/backend/arxivsearch/data/redis-logo.png?raw=true" width="30%"><img></a>
<a href="https://github.com/RedisVentures/redis-arXiv-search"><img src="https://redis.io/wp-content/uploads/2024/04/Logotype.svg?raw=true" width="30%"><img></a>
<br />
<br />
<div display="inline-block">
<a href="https://docsearch.redisvl.com"><b>Hosted Demo</b></a>&nbsp;&nbsp;&nbsp;
<a href="https://github.com/RedisVentures/redis-arXiv-search"><b>Code</b></a>&nbsp;&nbsp;&nbsp;
<a href="https://github.com/redis-developer/redis-ai-resources"><b>More AI Recipes</b></a>&nbsp;&nbsp;&nbsp;
<a href="https://datasciencedojo.com/blog/ai-powered-document-search/"><b>Blog Post</b></a>&nbsp;&nbsp;&nbsp;
<a href="https://redis.io/docs/interact/search-and-query/advanced-concepts/vectors/"><b>Redis Vector Search Documentation</b></a>&nbsp;&nbsp;&nbsp;
</div>
@@ -23,7 +24,9 @@
The arXiv papers dataset was sourced from the following [Kaggle link](https://www.kaggle.com/Cornell-University/arxiv). arXiv is commonly used for scientific research in a variety of fields. Exposing a semantic search layer enables natural human language to be used to discover relevant papers.
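
As a rough illustration of what a semantic search layer does at query time, the sketch below ranks papers by cosine similarity between a query vector and per-paper vectors. It covers the ranking step only and assumes the embeddings already exist; the titles, vectors, and the `cosine_similarity` and `rank_papers` helpers are hypothetical and are not taken from this repository, which delegates vector search to Redis.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_papers(query_vector, paper_vectors, k=2):
    """Return the titles of the k papers most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), title)
        for title, vector in paper_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]


# Made-up 3-dimensional embeddings; real embedding models produce
# vectors with hundreds of dimensions.
PAPERS = {
    "Paper A": [0.9, 0.1, 0.0],
    "Paper B": [0.1, 0.9, 0.0],
    "Paper C": [0.0, 0.2, 0.9],
}
QUERY = [0.8, 0.2, 0.1]

top_matches = rank_papers(QUERY, PAPERS, k=2)
print(top_matches)
```

In a real deployment the query text would first be converted to a vector by the same embedding model used for the papers, and the nearest-neighbor lookup would run inside the vector index rather than in a Python loop.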


![Demo](data/assets/arXivSearch.png)
<!-- ![Demo](data/assets/arXivSearch.png) -->

![Demo](image.png)

## Application

@@ -110,11 +113,11 @@ It's typically easier to build front end in an interactive environment, testing
2. Install packages (you may need to use `npm` to install `yarn`)
```bash
$ cd frontend/
$ yarn install --no-optional
$ npm install
````
4. Use `yarn` to serve the application from your machine
4. Use `npm` to serve the application from your machine
```bash
$ yarn start
$ npm start
```
5. Navigate to `http://localhost:3000` in a browser.

8 changes: 8 additions & 0 deletions backend/.dockerignore
@@ -0,0 +1,8 @@
# Python
__pycache__
app.egg-info
*.pyc
.mypy_cache
.coverage
htmlcov
.venv
Empty file added backend/README.md
6 changes: 6 additions & 0 deletions backend/arxivsearch/api/main.py
@@ -0,0 +1,6 @@
from fastapi import APIRouter

from arxivsearch.api.routes import papers

api_router = APIRouter()
api_router.include_router(papers.router, prefix="/papers", tags=["papers"])