Omar-Said-4/Excess_Search_Engine

"Excess"

Find what you are looking for, faster than ever before!

Introduction

A search engine developed as part of the university course "Advanced Programming Techniques", following the basic structure and hierarchy of a search engine.

Main Features

  • Voice Search: Excess supports voice search in English through a speech-to-text API.
  • Phrase Searching: In addition to plain keywords, Excess supports phrase searching for an exact or near-exact match of a phrase.
    You can also combine phrases with the operators AND, OR, and NOT to retrieve results for complex search queries.
  • Keyword Suggestions: Based on your search history, Excess anticipates the keyword you are typing, which speeds up searching.
  • Results Paging and User-Friendly UI: Results are paged, with 10 results per page.
  • Crawler State Saving: The crawler can save its state if it is interrupted.
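
As an illustration of the operator-based phrase searching listed above, the sketch below combines the document sets matched by two phrases using AND, OR, and NOT. It is a minimal sketch and not the code in the ComplexPhraseSearching package: the class name BooleanQuerySketch, its method names, and the integer document ids are all assumptions made for this example.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: each phrase is assumed to have already been resolved
// to the set of document ids that contain it.
class BooleanQuerySketch {

    // AND: documents that match both phrases (set intersection).
    static Set<Integer> and(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.retainAll(right);
        return result;
    }

    // OR: documents that match at least one phrase (set union).
    static Set<Integer> or(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.addAll(right);
        return result;
    }

    // NOT: documents that match the left phrase but not the right one (set difference).
    static Set<Integer> not(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.removeAll(right);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> firstPhrase = Set.of(1, 2, 3, 5);  // made-up ids for the first phrase
        Set<Integer> secondPhrase = Set.of(2, 3, 4);    // made-up ids for the second phrase

        System.out.println(and(firstPhrase, secondPhrase)); // prints [2, 3]
        System.out.println(or(firstPhrase, secondPhrase));  // prints [1, 2, 3, 4, 5]
        System.out.println(not(firstPhrase, secondPhrase)); // prints [1, 5]
    }
}
```

Treating each phrase result as a set keeps the operators independent of how the phrase itself was matched, so the same three methods can be chained for longer queries.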

Project Description

The project is split into packages, each representing one part of the search engine structure:

  1. Crawler: A thread-safe, multithreaded crawler that crawls web pages starting from the seed links provided in the file seed.txt (the largest run tested was 10000 pages). Its output is a serialized file containing the crawled HTML documents together with their URLs.
  2. Indexer: A thread-safe, multithreaded indexer that indexes the crawled web pages and uploads the inverted file to a cloud MongoDB database.
  3. QueryProcessor: Processes the search query by removing stop words and stemming the remaining terms.
  4. PageRanker: Applies the page ranking algorithm to the collected web pages.
  5. Ranker: Ranks search query results based on term frequency and document frequency.
  6. MongoDB: An interface for handling MongoDB connections.
  7. ComplexPhraseSearching: Handles both normal and operator-separated search queries.
  8. SpringBoot: Connects the frontend to the backend.
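
To make the relationship between the Indexer and the Ranker more concrete, here is a minimal sketch of an in-memory inverted index scored with TF-IDF, one common way to combine term frequency and document frequency. It is an illustration under stated assumptions, not the project's implementation: the class TfIdfSketch and its methods are hypothetical, the index lives in memory rather than in MongoDB, and stop-word removal and stemming are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an inverted index with TF-IDF scoring.
class TfIdfSketch {
    // Inverted index: term -> (document id -> occurrences of the term in that document).
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    // Document id -> number of terms in the document, used to normalise term frequency.
    private final Map<Integer, Integer> docLengths = new HashMap<>();

    void addDocument(int docId, String text) {
        int length = 0;
        for (String term : text.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, t -> new HashMap<>()).merge(docId, 1, Integer::sum);
            length++;
        }
        docLengths.put(docId, length);
    }

    // Sum over query terms of tf * idf, where tf = count / document length
    // and idf = ln(total documents / documents containing the term).
    double score(int docId, List<String> queryTerms) {
        double score = 0.0;
        for (String term : queryTerms) {
            Map<Integer, Integer> docs = postings.get(term);
            if (docs == null || !docs.containsKey(docId)) continue;
            double tf = (double) docs.get(docId) / docLengths.get(docId);
            double idf = Math.log((double) docLengths.size() / docs.size());
            score += tf * idf;
        }
        return score;
    }

    // Documents containing at least one query term, highest score first.
    List<Integer> rank(List<String> queryTerms) {
        Set<Integer> candidates = new HashSet<>();
        for (String term : queryTerms) {
            candidates.addAll(postings.getOrDefault(term, Map.of()).keySet());
        }
        List<Integer> ranked = new ArrayList<>(candidates);
        ranked.sort((a, b) -> Double.compare(score(b, queryTerms), score(a, queryTerms)));
        return ranked;
    }

    public static void main(String[] args) {
        TfIdfSketch index = new TfIdfSketch();
        index.addDocument(1, "java search engine");
        index.addDocument(2, "java java crawler");
        index.addDocument(3, "web crawler threads");
        System.out.println(index.rank(List.of("java")));           // prints [2, 1]
        System.out.println(index.rank(List.of("java", "engine"))); // prints [1, 2]
    }
}
```

In the example, document 2 ranks first for "java" because the term makes up a larger share of its text, while adding the rarer term "engine" moves document 1 to the top. A full ranker would typically also blend in the PageRanker score, which this sketch omits.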

Guidelines

Excess works like any other search engine: just enter a search query and enjoy the results.

🔵 To run the React App on your localhost

  1. Ensure you have Node.js installed on your PC.
  2. Clone this repo to your PC and navigate to the client directory.
  3. Open a command prompt in that directory and run the command npm install.
  4. Wait for the installation to finish, then run the command npm start.

The search engine should now be running in your default browser (a Chromium-based browser is preferred) at localhost:3000.

🔵 To run the Backend

  1. Open the Java project in your preferred IDE.
  2. Navigate to the PageRanker package.
  3. Open RankerMain and run it.

And voilà, the search engine is now ready to use.

To re-crawl, navigate to Main.java and run it.

To re-index, navigate to the Indexer package, then run IndexerMain.

My Teammates
