"Excess"

Find what you are looking for, faster than ever before!

Introduction

A search engine developed as part of the university course "Advanced Programming Techniques", applying the basic structure and hierarchy of a search engine.

Main Features

Voice Search: Excess supports voice search in English using a speech-to-text API.
Phrase Searching: Instead of matching individual keywords, Excess supports phrase searching to find an exact or near-exact match of a phrase.
You can also combine phrases with the operators AND, OR, and NOT to retrieve results for complex search queries.
Keyword Suggestion: Based on your search history, Excess can anticipate your keywords, which speeds up searching.
Results Paging and User-Friendly UI: Results are paged, with 10 results per page.
Crawler State Saving: The crawler can save its state if it is interrupted.
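
The operator-separated phrase queries described above can be illustrated with a minimal sketch. This is not the project's ComplexPhraseSearching code: the class name OperatorQuery, the evaluate method, and the lookup function are hypothetical, and the sketch assumes unquoted phrases, uppercase operators, and strict left-to-right evaluation with no precedence or parentheses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch: combines per-phrase result sets with AND, OR, and NOT.
final class OperatorQuery {
    private static final Set<String> OPERATORS = Set.of("AND", "OR", "NOT");

    /**
     * Evaluates "phrase OP phrase OP phrase ..." strictly left to right.
     * lookup maps one phrase to the set of document ids that contain it (assumed helper).
     */
    static Set<String> evaluate(String query, Function<String, Set<String>> lookup) {
        Set<String> result = new TreeSet<>();
        String op = "OR"; // the first phrase is OR-ed into the empty result set
        List<String> phraseWords = new ArrayList<>();
        for (String word : query.trim().split("\\s+")) {
            if (OPERATORS.contains(word)) {
                // An operator closes the current phrase: fold its hits into the result.
                apply(result, op, lookup.apply(String.join(" ", phraseWords)));
                phraseWords.clear();
                op = word;
            } else {
                phraseWords.add(word);
            }
        }
        // Fold in the final phrase.
        apply(result, op, lookup.apply(String.join(" ", phraseWords)));
        return result;
    }

    private static void apply(Set<String> result, String op, Set<String> docs) {
        switch (op) {
            case "AND": result.retainAll(docs); break; // intersection
            case "OR":  result.addAll(docs);    break; // union
            case "NOT": result.removeAll(docs); break; // difference
            default: throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        // Made-up phrase hits, standing in for a real phrase search.
        Map<String, Set<String>> hits = Map.of(
                "search engine", Set.of("d1", "d2", "d3"),
                "web crawler", Set.of("d2", "d3", "d4"),
                "java", Set.of("d3"));
        Function<String, Set<String>> lookup = p -> hits.getOrDefault(p, Set.of());
        System.out.println(evaluate("search engine AND web crawler NOT java", lookup)); // prints [d2]
    }
}
```

Left-to-right evaluation keeps the sketch short; a real query parser would more likely give NOT and AND higher precedence than OR.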

Project Description

The project is split into several packages, each representing one part of the search engine's structure:

Crawler: A thread-safe, multithreaded crawler that crawls web pages starting from the seed links provided in the file seed.txt (the largest crawl tested was 10,000 pages). Its output is a serialized file containing the crawled HTML documents together with their URLs.
Indexer: A thread-safe, multithreaded indexer that indexes the crawled web pages and uploads the inverted file to a cloud MongoDB database.
QueryProcessor: Processes the search query by removing stop words and stemming the remaining terms.
PageRanker: Applies the page ranking algorithm to the collected web pages.
Ranker: Ranks search query results based on term frequency and document frequency.
MongoDB: An interface for handling MongoDB connections.
ComplexPhraseSearching: Handles both normal and operator-separated search queries.
SpringBoot: Interfaces the frontend with the backend.
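
To make the Indexer and Ranker roles above concrete, here is a minimal in-memory sketch of an inverted index scored by term frequency and document frequency. It is an illustration only, not the project's implementation: the class TinyIndex and its methods are hypothetical, it keeps everything in memory rather than in MongoDB, it skips stop-word removal and stemming, and it assumes a smoothed TF-IDF weight of tf * log(1 + N/df).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-memory inverted index with TF-IDF ranking.
final class TinyIndex {
    // term -> (document id -> number of occurrences of the term in that document)
    private final Map<String, Map<String, Integer>> postings = new HashMap<>();
    // document id -> total number of terms in the document
    private final Map<String, Integer> docLengths = new HashMap<>();

    /** Tokenizes the text on non-word characters and records each term occurrence. */
    void addDocument(String docId, String text) {
        int count = 0;
        for (String term : text.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, t -> new HashMap<>()).merge(docId, 1, Integer::sum);
            count++;
        }
        docLengths.put(docId, count);
    }

    /** Returns the ids of documents matching any query term, best score first. */
    List<String> search(String query) {
        Map<String, Double> scores = new HashMap<>();
        int totalDocs = docLengths.size();
        for (String term : query.toLowerCase().split("\\W+")) {
            Map<String, Integer> docs = postings.get(term);
            if (docs == null) continue; // term not indexed
            // Rare terms get a higher weight (assumed smoothed variant of IDF).
            double idf = Math.log(1.0 + (double) totalDocs / docs.size());
            for (Map.Entry<String, Integer> e : docs.entrySet()) {
                double tf = e.getValue().doubleValue() / docLengths.get(e.getKey());
                scores.merge(e.getKey(), tf * idf, Double::sum);
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        // Sort by descending score, breaking ties by document id.
        ranked.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ranked;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.addDocument("a", "java search engine");
        index.addDocument("b", "java crawler crawler");
        index.addDocument("c", "cooking recipes");
        System.out.println(index.search("crawler java")); // prints [b, a]
    }
}
```

Document "b" ranks first because it matches both query terms and "crawler" is the rarer term. A fuller ranker could also fold in the PageRanker's per-page score, which this sketch leaves out.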

Guidelines

Using Excess is as simple as using any search engine: just enter your search query and enjoy the results.

🔵 To run the React app on your localhost

Ensure you have Node.js installed on your PC.
Clone this repo to your PC and navigate to the client directory.
Open a command prompt in that directory and run the command npm install.
Wait for the installation to finish, then run the command npm start.

The search engine should now be running in your default browser (preferably Chromium-based) at localhost:3000.

🔵 To run the backend

Open the Java project in your preferred IDE.
Navigate to the PageRanker package.
Navigate to RankerMain and run it.

And voilà, the search engine is now ready to use.

To re-crawl, navigate to Main.java and run it.

To re-index, navigate to the Indexer package, then run IndexerMain.
