
Commit

Merge remote-tracking branch 'upstream/master' into add_keyword_only_shaders
WassCodeur committed Jul 11, 2024
2 parents 49f17c1 + 56a6177 commit c0bcdb0
Showing 15 changed files with 895 additions and 4 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/test.yml
@@ -25,7 +25,7 @@ jobs:
strategy:
fail-fast: false
matrix:
-python-version: [3.8, 3.9, '3.10', 3.11]
+python-version: [3.9, '3.10', 3.11, 3.12]
os: [ubuntu-latest, macos-latest, windows-latest]
platform: [x64]
install-type: [pip, ] # conda]
@@ -34,13 +34,13 @@
use-pre: [false]
include:
- os: macos-latest # ubuntu-latest
-python-version: '3.10'
+python-version: 3.11
install-type: pip
depends: OPTIONAL_DEPS
coverage: true
use-pre: false
- os: ubuntu-latest
-python-version: '3.10'
+python-version: 3.12
install-type: pip
depends: OPTIONAL_DEPS
coverage: false
1 change: 1 addition & 0 deletions .gitignore
@@ -60,6 +60,7 @@ docs/source/api
docs/source/auto_examples
docs/source/auto_tutorials
docs/source/reference
+docs/source/sg_execution_times.rst
docs/examples/*.png
docs/examples/*.vtk
docs/examples/*.gif
115 changes: 115 additions & 0 deletions docs/source/posts/2024/2024-06-06-week-1-robin.rst
@@ -0,0 +1,115 @@
Week 1: It officially begins...
===============================

.. post:: June 06 2024
:author: Robin Roy
:tags: google
:category: gsoc

Hi, I'm `Robin <https://github.com/robinroy03>`_ and this is my blog about week 1.

My goal for week 1 was to get started with `Retrieval-Augmented Generation (RAG) <https://www.pinecone.io/learn/retrieval-augmented-generation/>`_, evaluate different databases, and host every endpoint. Weeks 1 and 2 are very intertwined, because everything I built during week 1 was put to use in week 2. (I'm writing this blog midway through week 2.)

Why phi-3-mini-4k-instruct?
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before I detail everything I've done this week, I'll explain why `phi-3 mini 4k <https://huggingface.co/microsoft/Phi-3-mini-4k-instruct>`_ was chosen as the LLM (I forgot to mention this in the last blog). Phi-3 is a small 3.8B-parameter model with a 4k context window, which means it can work with 4k tokens (roughly comparable to words) at a time. Due to its small size, it runs fast both locally and on Hugging Face. Compared with other open-source models, it performs decently well: on the `LMSYS LLM leaderboard <https://chat.lmsys.org/?leaderboard>`_, phi-3 mini 4k has an ELO of 1066 (59th position), which is notable for a model this small.
I also tried Llama3-8B, which performs better with an ELO of 1153 (rank 22), but it is considerably slower at inference. For that reason, I chose phi-3 mini for now.


Things I did in week 1 (and some of week 2)
---------------------------------------------

1) **Choosing the vector database**

I decided on `Pinecone <https://www.pinecone.io/>`_ as the vector DB because it has a very generous free tier. Other options under consideration were `pgvector <https://github.com/pgvector/pgvector>`_ and `chromadb <https://www.trychroma.com/>`_, but they didn't have a comparable free tier.

2) **PR Submissions and Review**

I also got a `PR <https://github.com/fury-gl/fury/pull/891>`_ merged into FURY that fixes a CI issue, and I spent time reviewing PRs from my fellow GSoC mates.

3) **Deciding which embedding model to use**

A good embedding model is necessary to generate the embeddings we then upsert into the DB. Ollama has embedding-model support, but I found its catalogue very small, and the models it provides were not powerful enough. I therefore decided to try Hugging Face Sentence Transformers.
Sentence Transformers has a very vibrant catalogue of models of various sizes. I chose `gte-large-en-v1.5 <https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5>`_ from Alibaba-NLP, an 8k-context, 434-million-parameter model with a modest memory requirement of 1.62 GB.
Performance-wise, it ranks 11th on the `MTEB leaderboard <https://huggingface.co/spaces/mteb/leaderboard>`_, which makes it a very interesting model for its size-to-performance ratio.
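
For reference, here's a minimal sketch of how the embeddings are generated with Sentence Transformers (the example docstrings are made up):

.. code-block:: python

    from sentence_transformers import SentenceTransformer

    # gte-large-en-v1.5 ships custom modeling code, hence trust_remote_code
    model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5",
                                trust_remote_code=True)

    docs = [
        "actor.sphere: visualize one or more spheres with colors and radii.",
        "actor.axes: draw arrows for the x, y and z coordinate axes.",
    ]
    embeddings = model.encode(docs)
    print(embeddings.shape)  # (2, 1024): one 1024-dimensional vector per text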

4) **Hosting the embedding model**

Hosting this sentence-transformers model was confusing. For some reason, the Hugging Face Space was blocking the Python script from writing to the ``.cache`` folder. The Docker container inside a Space runs with user id 1000 (a non-root user), so I had to give it permission to download and store files.

I run 5 gunicorn workers to serve 5 parallel requests at a time; this is feasible because the model is small.
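
For illustration, one common workaround for this kind of cache problem (not necessarily the exact fix I applied) is to point the Hugging Face cache at a writable directory before loading the model:

.. code-block:: python

    import os

    # On Spaces, uid 1000 may not be able to write to the default ~/.cache,
    # so redirect the caches somewhere writable before importing the library.
    os.environ["HF_HOME"] = "/tmp/hf_cache"
    os.environ["SENTENCE_TRANSFORMERS_HOME"] = "/tmp/hf_cache"

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5",
                                trust_remote_code=True)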

5) **Hosting the database endpoint**

I wrapped the Pinecone DB API in an endpoint so that it is easy to query and receive results.
It is also configured to accept 5 concurrent requests, although I could raise that considerably.

I upserted docstrings from ``fury/actor.py`` into the vector DB for testing. So now, whenever you ask a question, the model will use some ``actor.py`` function to form its answer. For now, it can be used like a semantic function search engine.

I decided to abstract the DB endpoint to reduce the dependency on any one provider. We can swap providers as required and keep all the other pieces running.
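
Here's a sketch of what such a wrapper can look like (the endpoint path, index name, and response shape are illustrative, not the deployed code):

.. code-block:: python

    import os

    from flask import Flask, jsonify, request
    from pinecone import Pinecone

    app = Flask(__name__)
    index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("fury-docstrings")

    @app.route("/query", methods=["POST"])
    def query():
        body = request.get_json()
        result = index.query(
            vector=body["embedding"],    # embedding of the user's question
            top_k=body.get("top_k", 3),  # number of nearest neighbours
            include_metadata=True,       # metadata holds signatures/docstrings
        )
        # Callers only see this JSON shape, so the DB provider can be
        # swapped behind the endpoint without touching anything downstream.
        return jsonify([match.metadata for match in result.matches])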

6) **Hosting Discord Bot**

So with this, all the endpoints are finally online. The bot still has some issues; it goes offline midway for some reason, and I'll have to find out why.

For some reason, Hugging Face Spaces refused to start the bot script. A community admin from Hugging Face later told me to use their official bot implementation as a reference. That is why I used threading and Gradio to get the bot running (migrating to Docker is possible, but this is how they did it, so I adopted it for now).

Hugging Face Spaces requires a script to satisfy certain criteria in order to run, one of them being non-blocking I/O on the main loop. So I had to move the Discord bot to a separate thread.
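
A minimal sketch of that workaround (the token handling and placeholder app are illustrative):

.. code-block:: python

    import threading

    import discord
    import gradio as gr

    intents = discord.Intents.default()
    intents.message_content = True
    client = discord.Client(intents=intents)

    def run_bot():
        client.run("YOUR_DISCORD_TOKEN")  # blocking; kept off the main thread

    threading.Thread(target=run_bot, daemon=True).start()

    # A placeholder Gradio app keeps the Space's main loop non-blocking.
    gr.Interface(fn=lambda x: x, inputs="text", outputs="text").launch()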

Connecting all of them together!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So now we have four hosted services, all on Hugging Face Spaces:

- Discord Bot
- LLM API
- Embeddings API
- Database API

Now we have to connect them all to answer a user's query.

This is the current architecture; there's a lot of room for improvement here.


.. raw:: html

<img src="https://github.com/fury-gl/fury-communication-assets/blob/main/gsoc_2024/7-6-2024-demo-architecture-gsoc-robin-week2.png?raw=true">

The language model takes the context and the user query, combines them to form an answer, and returns it to the user through Discord (for now). Moving the core logic out of the Discord bot into a separate node might be a good idea, so that Discord/GitHub/X could each connect to that node.
The database takes an embedding, performs an Approximate Nearest Neighbor search (a variant of KNN), and returns the top-k results (k=3 for now).
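
The combining step is essentially prompt assembly. A simplified sketch (the prompt wording is illustrative):

.. code-block:: python

    def build_prompt(question: str, chunks: list[str]) -> str:
        """Stitch the top-k retrieved chunks and the user query together."""
        context = "\n\n".join(chunks)  # k=3 docstring chunks from the vector DB
        return (
            "You are an assistant for the FURY visualization library.\n"
            "Answer using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )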

.. raw:: html

<iframe src="https://github.com/robinroy03/fury-discord-bot/assets/115863770/48f1136d-18a5-45ee-aa22-0a3f6426d575" width="640" height="390" frameborder="0" allowfullscreen></iframe>

What is coming up next week?
----------------------------

Answer quality improvements. Also, the Discord bot dies randomly; I have to fix that too.

Did you get stuck anywhere?
---------------------------

I was stuck hosting models on Hugging Face Spaces, but fixed it later.

LINKS:

- `Discord Bot <https://huggingface.co/spaces/robinroy03/fury-bot-discord/tree/main>`_

- `Database Repo <https://huggingface.co/spaces/robinroy03/fury-db-endpoint/tree/main>`_

- `Embedding Repo <https://huggingface.co/spaces/robinroy03/fury-embeddings-endpoint/tree/main>`_

- `LLM Repo <https://huggingface.co/spaces/robinroy03/fury-bot/tree/main>`_

- `Retrieval-Augmented Generation (RAG) <https://www.pinecone.io/learn/retrieval-augmented-generation/>`_
- `phi-3 mini 4k <https://huggingface.co/microsoft/Phi-3-mini-4k-instruct>`_
- `LMSYS LLM leaderboard <https://chat.lmsys.org/?leaderboard>`_
- `Pinecone <https://www.pinecone.io/>`_
- `pgvector <https://github.com/pgvector/pgvector>`_
- `chromadb <https://www.trychroma.com/>`_
- `PR <https://github.com/fury-gl/fury/pull/891>`_
- `gte-large-en-v1.5 <https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5>`_
- `MTEB leaderboard <https://huggingface.co/spaces/mteb/leaderboard>`_

Thank you for reading!
61 changes: 61 additions & 0 deletions docs/source/posts/2024/2024-06-06-week1-wachiou-bouraima.rst
@@ -0,0 +1,61 @@
WEEK 1: Progress and challenges at Google Summer of Code (GSoC) 2024
====================================================================

.. post:: June 06, 2024
:author: Wachiou BOURAIMA
:tags: google
:category: gsoc

Hello👋🏾,

Welcome back to my Google Summer of Code (GSoC) 2024 journey!
This week has been filled with progress and challenges as I continue to work on modernizing the FURY code base.


Applying the keyword_only decorator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

My main task this week was to apply the ``keyword_only`` decorator to several functions.
The decorator ensures that all arguments except the first are keyword-only,
which helps make the code clearer and parameter passing more explicit.
Some warnings appeared after applying this decorator; to resolve them,
I updated every place where these functions were called to use the required keyword format. This was a very important step in maintaining the integrity and functionality of the code base.
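
To give an idea of the mechanism, here is a simplified sketch of such a decorator (an illustration only; the actual FURY implementation differs in its details):

.. code-block:: python

    import functools
    import warnings

    def keyword_only(func):
        """Warn when any argument after the first is passed positionally."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > 1:
                warnings.warn(
                    f"Pass all arguments to {func.__name__}() after the "
                    "first one as keyword arguments.",
                    UserWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)
        return wrapper

    @keyword_only
    def sphere(centers, colors=None, radii=1.0):  # hypothetical signature
        ...

    sphere([[0, 0, 0]], colors=(1, 0, 0))  # fine
    sphere([[0, 0, 0]], (1, 0, 0))         # triggers the warning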


Managing the challenges of Git rebasing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rebasing the branch I was working on was the other major activity of my week.
It was a real challenge because of the conflicts that arose and had to be resolved.
Resolving them involved a lot of research and problem-solving,
which greatly enhanced my understanding of Git. It was a challenging but satisfying exercise in version control and complex merges.


Peer code review
~~~~~~~~~~~~~~~~

In addition to my duties, I was also tasked with reviewing the code of my peers.
This exercise was very rewarding, as it enabled me to understand different coding styles and approaches.
The constructive comments and suggestions not only helped my teammates,
but also improved my own coding and reviewing skills.


Acknowledgements
~~~~~~~~~~~~~~~~~

I would like to thank all my classmates: `Iñigo Tellaetxe Elorriaga <https://github.com/itellaetxe>`_, `Robin Roy <https://github.com/robinroy03>`_, `Kaustav Deka <https://github.com/deka27>`_ and my guide: `Serge Koudoro <https://github.com/skoudoro>`_ for their constructive suggestions on my work.
Their ideas and suggestions were of great help to me and I am grateful for their support and advice.


What happens next?
~~~~~~~~~~~~~~~~~~

Here's a summary of what I plan to do in week two:

- Apply the keyword_only decorator to all other necessary functions.
- Update the calling of these functions in the code to ensure consistency and avoid raising warnings.
- Rename the decorator to something more descriptive.
- Add two parameters to the decorator specifying the range of FURY versions in which it applies.


🥰 Thanks for reading! Your comments are most welcome, and I look forward to giving you a sneak preview of my work next week.
52 changes: 52 additions & 0 deletions docs/source/posts/2024/2024-06-15-week2-wachiou-bouraima.rst
@@ -0,0 +1,52 @@
WEEK 2: Refinements and Further Enhancements
============================================

.. post:: June 15, 2024
:author: Wachiou BOURAIMA
:tags: google
:category: gsoc

Hello again,
~~~~~~~~~~~~~

Welcome back to my Google Summer of Code (GSoC) 2024 journey! This week has been dedicated to refining and improving the work done so far, with a particular focus on the keyword_only decorator.


Renaming and Updating the Decorator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This week, I updated `this Pull Request <https://github.com/fury-gl/fury/pull/888>`_ by renaming the ``keyword_only`` decorator to ``warn_on_args_to_kwargs`` for greater clarity. The updated decorator now takes two version parameters, ``from_version`` and ``until_version``. This enhancement ensures that the decorator raises a ``RuntimeError`` if the current version of FURY is greater than ``until_version``.
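
Here is a rough sketch of that version-window behaviour (the version numbers, names, and messages are illustrative; the real implementation lives in the Pull Request above):

.. code-block:: python

    import functools
    import warnings

    from packaging.version import Version

    FURY_CURRENT = Version("0.11.0")  # placeholder for the installed version

    def warn_on_args_to_kwargs(from_version="0.11.0", until_version="0.13.0"):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                if FURY_CURRENT > Version(until_version):
                    raise RuntimeError(
                        f"Positional arguments to {func.__name__}() are "
                        f"not supported past FURY {until_version}."
                    )
                if FURY_CURRENT >= Version(from_version) and len(args) > 1:
                    warnings.warn(
                        f"Pass arguments to {func.__name__}() as keywords.",
                        UserWarning,
                        stacklevel=2,
                    )
                return func(*args, **kwargs)
            return wrapper
        return decorator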


Peer Code Review
~~~~~~~~~~~~~~~~~

I also spent time reviewing `Kaustav Deka's <https://github.com/deka27>`_ code. This exercise remains rewarding, as it helps me understand different coding styles and approaches. Constructive feedback and suggestions from my classmates were invaluable, not only in helping my teammates but also in improving my own coding and reviewing skills.


Research into lazy loading
~~~~~~~~~~~~~~~~~~~~~~~~~~

In parallel, I started researching the lazy loading feature and thinking about how to implement it. This feature will optimize performance by loading resources only when they're needed, which is crucial to improving the efficiency of FURY's code base.
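
One common pattern I am looking at is a module-level ``__getattr__`` (PEP 562). A small sketch of the idea, with hypothetical FURY submodule names:

.. code-block:: python

    # fury/__init__.py (sketch)
    import importlib

    _lazy_submodules = {"actor", "window", "ui"}  # hypothetical names

    def __getattr__(name):
        if name in _lazy_submodules:
            # Import on first access only, then cache it on the package.
            module = importlib.import_module(f"fury.{name}")
            globals()[name] = module
            return module
        raise AttributeError(f"module 'fury' has no attribute {name!r}")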


Acknowledgements
~~~~~~~~~~~~~~~~

I am deeply grateful to my classmates `Iñigo Tellaetxe Elorriaga <https://github.com/itellaetxe>`_, `Robin Roy <https://github.com/robinroy03>`_, and `Kaustav Deka <https://github.com/deka27>`_ for their insightful suggestions and comments on my work.
Special thanks to my mentor, `Serge Koudoro <https://github.com/skoudoro>`_, whose guidance and support enabled me to meet the challenges of this project.
Their combined efforts have greatly contributed to my progress, and I appreciate their continued help.


What happens next?
~~~~~~~~~~~~~~~~~~

For week 3, I plan to:

- Ensure that the ``warn_on_args_to_kwargs`` decorator is applied consistently in all necessary functions.
- Continue to update the calling of these functions in the code to maintain consistency and avoid warnings.
- Refine the decorator as necessary based on feedback and testing.
- Start implementing lazy loading functionality based on my research to optimize performance.


🥰 Thank you for taking the time to follow my progress. Your feedback is always welcome and I look forward to sharing more updates with you next week.
79 changes: 79 additions & 0 deletions docs/source/posts/2024/2024-06-16-week2-robin.rst
@@ -0,0 +1,79 @@
Week 2: The first iteration!
============================

.. post:: June 16 2024
:author: Robin Roy
:tags: google
:category: gsoc

Hi, I'm `Robin <https://github.com/robinroy03>`_ and this is my blog about week 2.

My goal for week 2 was to connect everything and make a prototype. So now we have a bot working 24x7 to answer all your doubts :)

Apart from the things mentioned in my `week 1 blog <https://fury.gl/latest/posts/2024/2024-06-06-week-1-robin.html>`_, here's what I did in week 2:

- Chunking the files for embedding.
- Upserting the chunks into the database.
- Connecting everything together.
- Making the Discord bot async.
- Merging a PR.

1) **Chunking the files for embedding**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the context of building LLM-related applications, chunking is the process of breaking large pieces of text into smaller segments. It's an essential technique for optimizing the relevance of the content we get back from a vector database once an embedding model has embedded the content. In FURY's case, our data is entirely code, so one approach I tried was to take the docstrings and the function/class signatures.

I used a naive parser during week 2, which relied on a combination of regex and common pattern matching to do this splitting. Later, my mentors `Mohamed <https://github.com/m-agour>`_ and `Serge <https://github.com/skoudoro/>`_ suggested a better approach: the Python ``inspect`` module.
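
Here's a sketch of the ``inspect``-based extraction (simplified relative to the real pipeline):

.. code-block:: python

    import inspect

    from fury import actor

    chunks = []
    for name, obj in inspect.getmembers(actor, inspect.isfunction):
        doc = inspect.getdoc(obj)
        if doc:  # one chunk per documented function: signature + docstring
            chunks.append(f"{name}{inspect.signature(obj)}\n{doc}")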

Another thing to consider was the chunk size. Smaller chunks have been shown to outperform larger ones in retrieval. Intuitively: an embedding model can compress a small text into a fixed-size vector (1024 dimensions, in our case) with much less information loss than a large text squeezed into the same 1024 dimensions.

This also raises another important issue: we need a way to test chunking choices against our model. So we need benchmarking.


2) **Upserting chunks into the database**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I upserted all the chunks into the database; along with the vectors, I included metadata consisting of the function signature and docstring. Later, in week 3, we'll modify this to show links instead of a big wall of text.
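
A hypothetical sketch of the upsert step (the index name, id scheme, and toy data are made up for illustration):

.. code-block:: python

    import os

    from pinecone import Pinecone

    index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("fury-docstrings")

    # Toy stand-ins for the real outputs of the chunking + embedding steps.
    chunks = [("sphere(centers, colors=None, radii=1.0)", "Visualize spheres.")]
    embeddings = [[0.0] * 1024]  # one 1024-dim vector per chunk

    index.upsert(vectors=[
        {
            "id": f"actor.py::{i}",
            "values": vec,
            "metadata": {"signature": sig, "docstring": doc},
        }
        for i, (vec, (sig, doc)) in enumerate(zip(embeddings, chunks))
    ])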


3) **Connecting everything together**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I took the 4 key parts (the Discord bot, LLM API, Embeddings API, and Database API) and connected them together. This was explained in the `week 1 blog <https://fury.gl/latest/posts/2024/2024-06-06-week-1-robin.html>`_ itself.


4) **Making the Discord Bot async**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One of the biggest challenges I faced this week was getting everything running properly. LLM output takes a lot of time to generate (we'll fix this amazingly well in week 3, BTW).
I made a big mistake: I used the ``requests`` library for the REST API calls. It occurred to me only later that ``requests`` is synchronous and makes blocking calls. This was the reason my Discord bot was dying randomly. I fixed it by migrating to ``aiohttp``.
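
The gist of the fix (the URL and payload shape are illustrative):

.. code-block:: python

    import aiohttp

    # Before (blocking): requests.post() froze the bot's event loop, so
    # heartbeats to Discord were missed and the bot appeared to go offline.
    #
    #     response = requests.post(LLM_URL, json=payload)

    # After (non-blocking): the event loop keeps serving Discord events
    # while we await the LLM response.
    async def ask_llm(payload: dict) -> dict:
        async with aiohttp.ClientSession() as session:
            async with session.post("https://example.hf.space/generate",
                                    json=payload) as resp:
                return await resp.json()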

This also made me realize I could use async in a lot of other places. Many of these tasks are I/O-bound, so making them async should let us handle many more concurrent requests.

5) **Merging a PR**
~~~~~~~~~~~~~~~~~~~

I merged a `PR <https://github.com/fury-gl/fury/pull/893>`_ that modifies ``.gitignore``; I found the issue while generating the Sphinx docs.


What is coming up next week?
----------------------------

- A faster LLM inference.
- Better pipeline for data collection.
- Links for citation.

Did you get stuck anywhere?
---------------------------

It took me some time to realize I was using synchronous code inside async functions. I fixed it once I spotted the problem.


LINKS:

- `Week 1 Blog <https://fury.gl/latest/posts/2024/2024-06-06-week-1-robin.html>`_
- `PR <https://github.com/fury-gl/fury/pull/893>`_
- `Serge Koudoro <https://github.com/skoudoro/>`_
- `Mohamed Abouagour <https://github.com/m-agour>`_
- `Robin :) <https://github.com/robinroy03>`_

Thank you for reading!