Commit e79e3f7

Final: Enhanced app's loading times (#196)
1 parent 379ce1c commit e79e3f7

8 files changed, 90 additions and 299 deletions

@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2023 Son Nguyen Hoang
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

MovieVerse-Middleware/machine-learning/README.md

43 additions, 1 deletion
@@ -21,29 +21,71 @@ The `Machine-Learning` directory contains Python scripts that leverage machine l
 
 This script uses machine learning models to classify movies into genres based on their descriptions, titles, and other metadata. It helps in categorizing movies accurately within the app database.
 
+To run the genre classifier, execute the following command:
+
+```bash
+python genre_classifier.py
+```
+
 ### Movie Recommendation (`movie-recommendation.py`)
 
 This script is responsible for generating movie recommendations for users based on their viewing history, preferences, and ratings. It uses collaborative filtering and content-based methods to provide personalized recommendations.
 
+To run, execute the following command:
+
+```bash
+python movie-recommendation.py
+```
+
 ### Movie Reviews Analysis (`movie-reviews.py`)
 
 This script processes and analyzes movie reviews, extracting insights and useful information. It might use natural language processing (NLP) techniques to understand user sentiments, key themes, and overall opinions about movies.
 
+To get started, you can run the following command:
+
+```bash
+python movie-reviews.py
+```
+
+Then, follow the instructions provided by the script to analyze movie reviews and extract valuable information.
+
 ### Plot Summarizer (`plot-summarizer.py`)
 
 `plot-summarizer.py` utilizes NLP and text summarization algorithms to create concise summaries of movie plots. This assists users in quickly grasping the essence of a movie without spoilers.
 
+To get started, you can run the following command:
+
+```bash
+python plot-summarizer.py
+```
+
+Then, follow the instructions printed by Streamlit to view the plot summarizer web application. For example, you may receive the following instructions:
+
+```
+Warning: to view this Streamlit app on a browser, run it with the following command:
+
+streamlit run /Users/davidnguyen/WebstormProjects/The-MovieVerse-Database/MovieVerse-Backend/machine-learning/plot-summarizer.py [ARGUMENTS]
+```
+
+In this case, simply copy and run the provided `streamlit run` command in your terminal to view the plot summarizer web application.
+
 ### Sentiment Analysis (`sentiment_analysis.py`)
 
 This script performs sentiment analysis on user reviews and comments. It determines the overall sentiment (positive, negative, neutral) expressed in the text, helping in gauging audience reception of movies.
 
+To run, simply execute the following command:
+
+```bash
+python sentiment_analysis.py
+```
+
 ## Using these Scripts
 
 To run these scripts:
 
 1. Ensure you have Python installed on your system.
 2. Install necessary libraries using pip: `pip install -r requirements.txt` (assuming a `requirements.txt` file is present).
-3. Execute each script as needed, e.g., `python genre-classifier.py`.
+3. Execute each script as needed, following the instructions above.
 
 ## Customization and Adaptation
 

MovieVerse-Middleware/machine-learning/genre_classifier.py

2 additions
@@ -5,6 +5,7 @@
 from sklearn.model_selection import train_test_split
 from sklearn.metrics import classification_report
 
+
 class GenreClassifier:
     def __init__(self):
         self.pipeline = Pipeline([
@@ -26,6 +27,7 @@ def predict_genre(self, description: str) -> str:
     def predict_genres(self, descriptions: List[str]) -> List[str]:
         return self.pipeline.predict(descriptions)
 
+
 # Example usage
 if __name__ == "__main__":
     # Example data
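
The hunk above shows only the opening of `Pipeline([` in `GenreClassifier.__init__`, so the actual pipeline steps are not visible in this diff. As a rough sketch of how a text-classification pipeline of this shape can work, the example below assumes a `TfidfVectorizer` feeding a `MultinomialNB` classifier. Both step choices, the `build_pipeline` helper, and the toy data are hypothetical and are not taken from the commit.

```python
from typing import List

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    # Assumed steps: the real steps in genre_classifier.py are not shown in the diff.
    return Pipeline([
        ("tfidf", TfidfVectorizer()),  # turn each description into TF-IDF features
        ("clf", MultinomialNB()),      # simple probabilistic classifier over those features
    ])


# Hypothetical toy training data: one genre label per description.
descriptions: List[str] = [
    "A detective hunts a serial killer in the city",
    "A killer stalks a detective through dark streets",
    "Two friends fall in love during a summer wedding",
    "A love story about a wedding planner and her friend",
]
genres: List[str] = ["Thriller", "Thriller", "Romance", "Romance"]

pipeline = build_pipeline()
pipeline.fit(descriptions, genres)

# predict() accepts a list of raw strings because the vectorizer is part of the pipeline.
predicted = pipeline.predict(["The detective chases the killer", "A summer love story"])
print(list(predicted))
```

Keeping the vectorizer inside the pipeline means the same text preprocessing is applied at training and prediction time, which is why `predict_genres` in the diff can pass raw descriptions straight to `self.pipeline.predict`.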

MovieVerse-Middleware/machine-learning/movie-recommendation.py

4 additions, 1 deletion
@@ -6,7 +6,8 @@
 
 # Load movie data
 movies_df = pd.read_csv('movies.csv', usecols=['movieId', 'title'], dtype={'movieId': 'int32', 'title': 'str'})
-ratings_df = pd.read_csv('ratings.csv', usecols=['userId', 'movieId', 'rating'], dtype={'userId': 'int32', 'movieId': 'int32', 'rating': 'float32'})
+ratings_df = pd.read_csv('ratings.csv', usecols=['userId', 'movieId', 'rating'],
+                         dtype={'userId': 'int32', 'movieId': 'int32', 'rating': 'float32'})
 
 # Preprocessing
 # Create a user-movie matrix
@@ -24,6 +25,7 @@
 all_user_predicted_ratings = np.dot(np.dot(U, sigma), Vt) + mean_user_rating.values.reshape(-1, 1)
 preds_df = pd.DataFrame(all_user_predicted_ratings, columns=user_movie_df.columns)
 
+
 # Recommend Movies
 def recommend_movies(predictions_df, userID, movies_df, original_ratings_df, num_recommendations=5):
     user_row_number = userID - 1
@@ -45,6 +47,7 @@ def recommend_movies(predictions_df, userID, movies_df, original_ratings_df, num
 
     return user_full, recommendations
 
+
 # Test the recommendation system for a user
 user_id = 1
 rated_movies, recommendations = recommend_movies(preds_df, user_id, movies_df, ratings_df, 10)
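
The context lines above reconstruct predicted ratings as `U · sigma · Vt` plus each user's mean rating. The sketch below illustrates that idea on a tiny matrix using NumPy only. It is not the script's code: it swaps in `np.linalg.svd` with manual truncation (the decomposition call used by the script is outside the hunks shown), and the toy ratings, the "0 means unrated" convention, and the helper names `predict_ratings` and `top_unrated` are all assumptions.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])


def predict_ratings(matrix: np.ndarray, k: int) -> np.ndarray:
    """Rank-k reconstruction of the mean-centered user-movie matrix."""
    # Simplification: unrated zeros are included when computing each user's mean.
    user_means = matrix.mean(axis=1, keepdims=True)
    centered = matrix - user_means
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    # Add the user means back, mirroring the `+ mean_user_rating` term in the diff.
    return approx + user_means


def top_unrated(matrix: np.ndarray, predictions: np.ndarray, user_index: int, n: int = 2) -> list:
    """Return up to n column indices the user has not rated, best predicted first."""
    unrated = np.flatnonzero(matrix[user_index] == 0)
    order = np.argsort(predictions[user_index, unrated])[::-1]
    return unrated[order][:n].tolist()


preds = predict_ratings(ratings, k=2)
print(top_unrated(ratings, preds, user_index=0))
```

Note that `recommend_movies` in the diff maps `userID - 1` to a row index, which only holds when user IDs are contiguous and start at 1; the sketch sidesteps this by taking a row index directly.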

MovieVerse-Middleware/machine-learning/plot-summarizer.py

17 additions, 14 deletions
@@ -1,4 +1,3 @@
-import torch
 import logging
 from transformers import BartTokenizer, BartForConditionalGeneration
 import streamlit as st
@@ -7,6 +6,7 @@
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 
+
 class MoviePlotSummarizer:
     def __init__(self, model_name='facebook/bart-large-cnn'):
         self.tokenizer = BartTokenizer.from_pretrained(model_name)
@@ -24,13 +24,16 @@ def summarize(self, plot_text, max_length=130, min_length=30, style='default'):
                 min_length //= 2
 
             # Tokenize and generate summary
-            inputs = self.tokenizer.encode("summarize: " + plot_text, return_tensors="pt", max_length=1024, truncation=True)
-            summary_ids = self.model.generate(inputs, max_length=max_length, min_length=min_length, length_penalty=2.0, num_beams=4, early_stopping=True)
+            inputs = self.tokenizer.encode("summarize: " + plot_text, return_tensors="pt", max_length=1024,
+                                           truncation=True)
+            summary_ids = self.model.generate(inputs, max_length=max_length, min_length=min_length, length_penalty=2.0,
+                                              num_beams=4, early_stopping=True)
             return self.tokenizer.decode(summary_ids[0], skip_special_tokens=True)
         except Exception as e:
             logger.error(f"Error in summarizing plot: {e}")
             return "Error in summarization process."
 
+
 # Streamlit UI
 def main():
     st.title("Movie Plot Summarizer")
@@ -49,17 +52,17 @@ def main():
 
     if st.button("About"):
         st.subheader("About")
-        st.write("This is a simple movie plot summarizer built using the HuggingFace Transformers library. It uses the BART model to generate the summaries.")
-        st.write("The model was trained on the CNN/Daily Mail dataset, which contains news articles and their summaries. The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
-        st.write("The model was fine-tuned on the XSUM dataset, which contains summaries of BBC articles.")
+        st.write(
+            "This is a simple movie plot summarizer built using the HuggingFace Transformers library. It uses the "
+            "BART model to generate the summaries.")
+        st.write(
+            "The default facebook/bart-large-cnn checkpoint is a BART model fine-tuned on the CNN/Daily Mail dataset, "
+            "which contains news articles and their summaries.")
+        st.write("You can adjust the length of the summary and the style of summarization (default, verbose, concise).")
+        st.write("The model may not always provide accurate summaries, especially for longer or complex plots.")
+        st.write("Feel free to experiment with different movie plots and summarization settings! Enjoy!")
+        st.write("Note: The model might take a few seconds to generate the summary, so please be patient.")
+
 
 if __name__ == "__main__":
     main()
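
The `summarize` hunk shows `min_length //= 2` inside one style branch, and the About text names three styles (default, verbose, concise). One way such a style switch could map to generation lengths is sketched below. Only the halving of `min_length` is visible in the diff; halving `max_length` for `concise`, doubling both values for `verbose`, and the `adjust_lengths` helper itself are assumptions made for illustration.

```python
from typing import Tuple


def adjust_lengths(max_length: int = 130, min_length: int = 30, style: str = "default") -> Tuple[int, int]:
    """Scale the summary length bounds according to the requested style."""
    if style == "verbose":
        # Assumed behaviour: allow roughly twice as much output.
        max_length *= 2
        min_length *= 2
    elif style == "concise":
        # The diff shows `min_length //= 2`; halving max_length as well is an assumption.
        max_length //= 2
        min_length //= 2
    elif style != "default":
        raise ValueError(f"Unknown style: {style!r}")
    return max_length, min_length


for style in ("default", "verbose", "concise"):
    print(style, adjust_lengths(style=style))
```

The resulting pair would then be passed as `max_length` and `min_length` to `self.model.generate(...)`, as in the context lines above.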

MovieVerse-Middleware/machine-learning/sentiment_analysis.py

1 addition, 13 deletions
@@ -6,6 +6,7 @@
 # Ensure you have the necessary NLTK components
 nltk.download('vader_lexicon')
 
+
 class SentimentAnalyzer:
     def __init__(self):
         self.analyzer = SentimentIntensityAnalyzer()
@@ -54,19 +55,6 @@ def analyze_review(self, review: str) -> str:
         sentiment = self.predict_sentiment(review)
         return sentiment
 
-    def analyze_reviews(self, reviews: List[str]) -> pd.DataFrame:
-        results = {'Review': [], 'Sentiment': []}
-
-        for review in reviews:
-            sentiment = self.predict_sentiment(review)
-            results['Review'].append(review)
-            results['Sentiment'].append(sentiment)
-
-        return pd.DataFrame(results)
-
-    def analyze_review(self, review: str) -> str:
-        sentiment = self.predict_sentiment(review)
-        return sentiment
 
 if __name__ == "__main__":
     reviews = [
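
`predict_sentiment` is outside the hunks shown, so its cut-offs are not visible here. A common way to turn a VADER `compound` score (a value in [-1, 1]) into the three labels the README mentions is the ±0.05 threshold convention suggested in the VADER documentation. The sketch below assumes that convention; `label_from_compound` and `label_reviews` are hypothetical helper names, and the scores are passed in directly so the example runs without NLTK.

```python
from typing import Dict, List


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def label_reviews(scored_reviews: Dict[str, float]) -> List[Dict[str, str]]:
    """Label each review; the compound scores are assumed to be precomputed."""
    return [
        {"Review": review, "Sentiment": label_from_compound(score)}
        for review, score in scored_reviews.items()
    ]


# Hypothetical scores for illustration, not real analyzer output.
example = {"Loved it": 0.6, "It was fine": 0.0, "Terrible pacing": -0.5}
rows = label_reviews(example)
print(rows)
```

In the real script the score would presumably come from `self.analyzer.polarity_scores(review)["compound"]`, which is the NLTK `SentimentIntensityAnalyzer` call that returns this value.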

MovieVerse-Middleware/middleware.js

1 addition, 1 deletion
@@ -89,4 +89,4 @@ app.use(errorHandler);
 const PORT = process.env.PORT || 3000;
 app.listen(PORT, () => {
   console.log(`Server is running on port ${PORT}`);
-});
+});
