The aim of this project is to understand and implement the concepts of vectorization and cosine similarity for recommending movies based on user preferences. This project utilizes Python and natural language processing (NLP) techniques to achieve effective movie recommendations.
Vectorization: Converting textual data into numerical vectors to facilitate similarity computations.
Cosine Similarity: The cosine of the angle between two non-zero vectors of an inner product space. Here it scores how closely each movie's description vector matches the vector built from the user's preferences.
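The two concepts above can be sketched with the standard library alone. This is a minimal illustration, not the code in vectorization.py or cosine_similarity.py: the function names, the simple word-count (bag-of-words) weighting, and the sample descriptions are all assumptions made for the example.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token across the documents to a column index."""
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(text, vocabulary):
    """Turn text into a term-count vector; out-of-vocabulary words are ignored."""
    vec = [0] * len(vocabulary)
    for tok in tokenize(text):
        if tok in vocabulary:
            vec[vocabulary[tok]] += 1
    return vec


def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical descriptions, made up for illustration only.
descriptions = [
    "A space crew fights an alien creature",
    "An alien creature hunts a space crew",
    "Two friends open a bakery in Paris",
]
vocabulary = build_vocabulary(descriptions)
vectors = [vectorize(d, vocabulary) for d in descriptions]

sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

In this sketch the two alien-themed descriptions score much higher against each other than against the bakery description. The small non-zero score for the unrelated pair comes only from the shared word "a", which is one reason stop-word removal or TF-IDF weighting is commonly used in place of raw counts.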
To set up the project, follow these steps:
git clone https://github.com/thehrsr/movie-recommendation-system.git
cd movie-recommendation-system
python -m venv env
source env/bin/activate # On Windows use env\Scripts\activate
pip install -r requirements.txt
Prepare Data: Ensure you have the movie dataset in the required format. The dataset should include movie titles and descriptions.
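The required format is not spelled out beyond titles and descriptions, so the following is only a sketch of how such a dataset could be checked before use. It assumes a CSV file with columns named `title` and `description`; the column names, the `load_movies` helper, and the sample rows are hypothetical.

```python
import csv
import io


def load_movies(file_obj, title_field="title", description_field="description"):
    """Read movies from CSV, skipping rows with an empty title or description."""
    reader = csv.DictReader(file_obj)
    missing = {title_field, description_field} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"dataset is missing columns: {sorted(missing)}")
    movies = []
    for row in reader:
        # Short rows yield None for absent fields, so fall back to "".
        title = (row[title_field] or "").strip()
        description = (row[description_field] or "").strip()
        if title and description:
            movies.append({"title": title, "description": description})
    return movies


# In-memory sample standing in for a file under data/ (contents are invented).
sample = io.StringIO(
    "title,description\n"
    "Movie A,A heist crew plans one last job\n"
    "Movie B,\n"
    "Movie C,A detective investigates a museum theft\n"
)
movies = load_movies(sample)
```

Rows without a description are dropped here because an empty description produces an all-zero vector, for which cosine similarity is undefined.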
python recommend_movies.py
Input Preferences: Follow the prompts to enter user preferences and receive movie recommendations based on cosine similarity.
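As a rough picture of what happens after preferences are entered, the sketch below scores every movie description against the preference text and returns the best matches. It is not the logic in recommend_movies.py: the `recommend` function, its `top_n` parameter, the raw word-count weighting, and the tiny catalogue are all assumptions for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Sparse bag-of-words vector: token -> count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(preferences, movies, top_n=3):
    """Return up to top_n (title, score) pairs, highest similarity first."""
    query = term_counts(preferences)
    scored = [(m["title"], cosine(query, term_counts(m["description"]))) for m in movies]
    # Drop movies that share no words with the preference text.
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented catalogue for illustration; not the project's dataset.
movies = [
    {"title": "Movie A", "description": "space crew battles alien invaders"},
    {"title": "Movie B", "description": "romantic comedy set in Paris"},
    {"title": "Movie C", "description": "alien visitor befriends a lonely child"},
]
results = recommend("alien space adventure", movies, top_n=2)
```

With these inputs, "Movie A" ranks first because it shares two words with the preference text, "Movie C" ranks second with one shared word, and "Movie B" is excluded because it shares none.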
recommend_movies.py: Main script to run the recommendation system.
vectorization.py: Functions for vectorizing movie descriptions.
cosine_similarity.py: Functions for calculating cosine similarity.
data/: Directory for storing dataset files.
requirements.txt: List of Python dependencies.
User Feedback Integration: Incorporate user feedback to refine recommendations.
Advanced NLP Techniques: Use more sophisticated NLP techniques, such as embeddings, for better accuracy.
Web Interface: Develop a web-based interface for easier interaction with the recommendation system.
License: This project is licensed under the MIT License; see the LICENSE file for details.
Python
NLP Libraries
Cosine Similarity Resources