Chat with PDF using Chainlit #178
Does LangChain provide any significant benefits? You could also do it with the OpenAI package, but if you feel that using LangChain is better, go ahead.
The embeddings are from OpenAI. LangChain has features for making the conversation contextual: the chat memory appends the current user query to the previous response, so any additional information in the previous message is taken into account. I referred to this tutorial: https://docs.chainlit.io/examples/qa and saw the same approach in the Panel example. Is it preferable to try without LangChain completely? I'll update both methods.
That's OK, let's keep LangChain.
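To illustrate the contextual behaviour described above, here is a minimal plain-Python sketch of folding recent turns into the next query. It is not LangChain's implementation and not the code in this PR; `build_contextual_query` and its `max_turns` parameter are hypothetical names used only for illustration.

```python
from typing import List, Tuple


def build_contextual_query(
    history: List[Tuple[str, str]], question: str, max_turns: int = 3
) -> str:
    """Prefix the new question with the most recent (user, assistant) turns.

    Hypothetical helper: a real chat-memory component may summarise or
    rephrase the history instead of concatenating it verbatim.
    """
    # Guard against max_turns == 0, because history[-0:] is the whole list.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for user_msg, assistant_msg in recent:
        lines.append(f"User: {user_msg}")
        lines.append(f"Assistant: {assistant_msg}")
    lines.append(f"User: {question}")
    return "\n".join(lines)
```

The resulting string would be sent to the model in place of the bare question, so a follow-up such as "and who wrote it?" still carries the earlier exchange.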
Code looks good. Will wait for deploy to test functionality
App deployed here: https://frosty-flower-5089.ploomberapp.io/ @bryannho @edublancas
@neelasha23 I got this error when I uploaded a PDF (didn't send any messages yet):
Please share the PDF @bryannho
@edublancas It was failing because I used a scanned PDF. @neelasha23 will add a catch and use pdf_scanned_to_text for this case
@neelasha23 two more small notes:
Screen.Recording.2024-04-08.at.1.14.01.PM.mov
Added call to pdf_scanned_to_text
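A rough sketch of the fallback discussed above, assuming the regular extractor returns empty text (or raises) for a scanned PDF. The real signatures of `pdf_to_text` and `pdf_scanned_to_text` are not shown in this thread, so the two extractors are passed in as callables; `extract_pdf_text` is a hypothetical wrapper, not the code that was merged.

```python
from typing import Callable


def extract_pdf_text(
    path: str,
    text_extractor: Callable[[str], str],
    ocr_extractor: Callable[[str], str],
) -> str:
    """Try normal text extraction first; fall back to OCR for scanned PDFs.

    Assumption: a scanned PDF either makes the text extractor raise or
    yields only whitespace, since it contains images rather than text.
    """
    try:
        text = text_extractor(path)
    except Exception:
        # Treat any extraction failure as "no embedded text found".
        text = ""
    if text and text.strip():
        return text
    return ocr_extractor(path)
```

In the app this might be called as `extract_pdf_text(path, pdf_to_text, pdf_scanned_to_text)`, assuming both functions accept a file path and return a string.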
It happened because there was just one line in the PDF. Fixed.
The message was already placed after the file upload and before document parsing, but I think it was getting stuck because of the way pdf_to_text was called relative to the async message handling. I have converted the PDF conversion call to async. There might be a lag in the message display depending on the size of the PDF. Deployed updated app here: https://polished-night-8566.ploomberapp.io/
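One common cause of a status message getting stuck is a blocking call running on the event loop. The sketch below shows one way to offload such a call with `asyncio.to_thread` (Python 3.9+). It is an assumption about the kind of change described, not the PR's actual code; `slow_pdf_to_text`, `handle_upload`, and `notify` are hypothetical stand-ins.

```python
import asyncio
import time
from typing import Awaitable, Callable


def slow_pdf_to_text(path: str) -> str:
    """Hypothetical stand-in for a blocking PDF-to-text conversion."""
    time.sleep(0.1)  # simulate CPU- or IO-heavy parsing
    return f"text of {path}"


async def handle_upload(
    path: str, notify: Callable[[str], Awaitable[None]]
) -> str:
    """Send a status message, then convert the PDF without blocking the loop."""
    await notify(f"Processing {path}...")
    # Run the blocking conversion in a worker thread so the event loop
    # stays free to deliver the status message while parsing runs.
    text = await asyncio.to_thread(slow_pdf_to_text, path)
    await notify("Done")
    return text
```

With this shape the "Processing" message is awaited before the conversion starts, and larger PDFs simply delay the "Done" message.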
The answers look off (I tried this paper: https://arxiv.org/abs/2402.00838):
I think it's confusing the title with the ones in the References section. It is answering specific questions about the content though:
I think we should fix it; those basic questions should be answered correctly. Check how it's done here: https://github.com/ploomber/doc/tree/main/examples/panel/chat-with-pdf
I remember testing that example, and it worked fine when asking about the abstract and title. Two things might happen:
Cool. Yeah, compare the answers from both examples and see if you can improve it a bit. Spend ~2 hours on this; no need to spend more. Let me know your conclusions so we know whether we should keep digging or publish as is.
I ended up spending a bit more time on this because, even after adding the same settings as the Panel app, the results did not match the outputs I saw in the Panel app yesterday. But I tested the Panel app again and realized it's not performing any better. The results are inconsistent: it might give the correct answer once in a while by chance. Also, this issue is mostly with the OLMo paper. I found better results with other papers. Below are the detailed observations and next steps:
Changes made to Chainlit app (as per the Panel one)
Panel app observations (inconsistent results on the OLMo paper)
It has picked up from the references:
Another attempt:
Chainlit output
Conclusions
Deployed updated app: https://www.platform.ploomber.io/applications/steep-salad-8357/0f7566de
@neelasha23 please prepare social media posts and include the link to the app. Let's keep the example running for a week.
Closes #167
📚 Documentation preview 📚: https://ploomber-doc--178.org.readthedocs.build/en/178/