Salesforce Search Engine

Stack Exchange is a popular Q&A network where millions of users ask and answer questions on a wide range of topics. Each topic has its own site with a large archive of questions and answers. That volume makes it hard for users to find an accurate answer to a specific query. To address this, we built a search engine tailored to the Salesforce Stack Exchange dataset that retrieves the most relevant questions and answers for a user's query.

Installation

pip install git+https://github.com/deepset-ai/haystack.git
pip install streamlit
pip install uvicorn

Dataset

Stack Exchange Data Dump (Salesforce)
Training Data: salesforce.stackexchange.com/Posts.xml
- Contains 129,096 unique posts.
Test Data: salesforce.stackexchange.com/Test data.csv
- Contains 100 unique queries and their corresponding question ids.
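
As a rough illustration of how the training data might be read, the sketch below streams rows from a Stack Exchange style Posts.xml. It assumes the public data dump layout, where each post is a `row` element with `Id`, `PostTypeId`, `Title`, and `Body` attributes; the two sample rows and the `iter_posts` helper are hypothetical and are not taken from this repository.

```python
import io
import xml.etree.ElementTree as ET

# Two-row stand-in for Posts.xml (illustrative values, not real dump content).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" Title="How do I create an Opportunity?" Body="&lt;p&gt;Question text&lt;/p&gt;" />
  <row Id="2" PostTypeId="2" ParentId="1" Body="&lt;p&gt;Answer text&lt;/p&gt;" />
</posts>"""


def iter_posts(source):
    """Yield one dict per <row>, streaming so a large dump is never fully loaded."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield {
                "id": int(elem.attrib["Id"]),
                # In the dump schema, PostTypeId 1 marks a question and 2 an answer.
                "is_question": elem.attrib.get("PostTypeId") == "1",
                "title": elem.attrib.get("Title", ""),
                "body": elem.attrib.get("Body", ""),
            }
            elem.clear()  # release the element once its attributes are copied


posts = list(iter_posts(io.BytesIO(SAMPLE)))
questions = [p for p in posts if p["is_question"]]
```

For the real file, the same function could be given an open file handle on Posts.xml instead of the in-memory sample.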

Model

Model experimentation was performed in Big_Data_Project.ipynb.
An Embedding Retriever was selected for its ability to retrieve relevant documents from a large corpus.
- Embedding Model: sentence-transformers/all-MiniLM-L6-v2
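
To make the idea behind embedding retrieval concrete, here is a minimal sketch of ranking documents by cosine similarity to a query vector. It is not the project's retriever: the 3-dimensional vectors, the question ids, and the `top_k` helper are hypothetical stand-ins, whereas all-MiniLM-L6-v2 produces 384-dimensional embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings keyed by question id.
doc_vecs = {
    101: [0.9, 0.1, 0.0],
    102: [0.0, 1.0, 0.0],
    103: [0.7, 0.7, 0.1],
}
query_vec = [1.0, 0.0, 0.0]
results = top_k(query_vec, doc_vecs)
```

In a real pipeline the query and documents would be encoded by the same embedding model, and the document store would typically perform the similarity search rather than a Python loop.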

Start and Initialize Elasticsearch

To initialize the ElasticDocumentStore Docker container, run:

uvicorn main:app

This will:

- Serialize the Salesforce Stack Exchange posts as JSON documents
- Index the documents in Elasticsearch using the BM25 model
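
The serialization step listed above could look roughly like the following. This is a sketch under assumptions, not the code in main.py: it assumes documents are plain dicts with `content` and `meta` keys (the shape Haystack 1.x document stores accept), and the `strip_html` and `to_document` helpers and the sample post are hypothetical.

```python
import json
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Drop tags and collapse runs of whitespace into single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def to_document(post):
    """Map a parsed post to a dict with assumed 'content' and 'meta' keys."""
    return {
        "content": f"{post['title']} {strip_html(post['body'])}".strip(),
        "meta": {"question_id": post["id"]},
    }


# Hypothetical post, shaped like a parsed row from the data dump.
post = {"id": 1, "title": "Create an Opportunity", "body": "<p>How do I <b>create</b> one?</p>"}
doc = to_document(post)
serialized = json.dumps(doc)
```

Keeping the question id in `meta` would let retrieved documents be matched against the question ids in the test data.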

Inference

Streamlit
- streamlit run app.py (app.py is the main file for the Streamlit app)
Uvicorn
- uvicorn main_inference:app

Example Query

how to create opportunity in salesforce with out having licence?


sudheer997/Salesforce-Search-Engine
