Project 1: https://youtu.be/dRD2LMxt_2E?si=1sarafHfKaw2HXPM
Project 2: https://youtu.be/N6BvjrIQn2o
Project 3: https://youtu.be/4ZEkeTvs1ow
Project 4: https://youtu.be/WjyvYfRoBrc
Project 5: https://youtu.be/rMztQfZuLXM
- Install [email protected] and Cygwin (Windows only), then execute the following code:
mysql -u root -p
SET GLOBAL general_log = 'ON';
exit
- Install Tomcat@10
- Install IntelliJ IDEA
- Use git to clone this repository locally and import the project to IntelliJ
- Install Maven
- Configure Tomcat to run locally in IntelliJ, with the target being fabflix on port 8080. When run, this should deploy the Fabflix site locally.
- Implemented frontend styling with Bootstrap and functionality for the movies page
- Added sorting, pagination, n (number of records per page), and UI updates
- Added employee dashboard functionality and casts XML parsing
- Search page, single movie page for Android
- Set up AWS master/slave instances with a load balancer
- Worked on the functionality for the single star and single movie pages
- Added searching and browsing by title and genre
- Mains and actors XML parsing, as well as reCAPTCHA implementation
- Autocomplete functionality for the results page of Fabflix
- Movies list page on Android
- Test plan and time measurements with JMeter
We used the LIKE predicate for substring matching when searching by title, director, and star for movies in our QueryServlet, as seen below:
String query = "SELECT * FROM movies, ratings WHERE movies.id = ratings.movieId";
Map<Integer, String> parameters = new HashMap<>();
int currentParameter = 1;
// Append a LIKE (or equality) clause for each non-empty search field,
// recording the value for its positional placeholder
if (!title.isEmpty()) {
    query += " AND movies.title LIKE ?";
    parameters.put(currentParameter, "%" + title + "%");
    currentParameter++;
}
if (!year.isEmpty()) {
    query += " AND movies.year = ?";
    parameters.put(currentParameter, year);
    currentParameter++;
}
if (!director.isEmpty()) {
    query += " AND movies.director LIKE ?";
    parameters.put(currentParameter, "%" + director + "%");
    currentParameter++;
}
if (!star.isEmpty()) {
    query += " AND movies.id IN (SELECT m2.id FROM movies m2, stars_in_movies sim, stars s2 " +
            "WHERE m2.id = sim.movieId AND s2.id = sim.starId AND s2.name LIKE ?)";
    parameters.put(currentParameter, "%" + star + "%");
}
PreparedStatement statement = conn.prepareStatement(query);
// Bind each recorded value to its placeholder before executing the query
for (Map.Entry<Integer, String> entry : parameters.entrySet()) {
    statement.setString(entry.getKey(), entry.getValue());
}
We also used LIKE for browsing by title in the QueryServlet:
SELECT * FROM movies, ratings WHERE movies.id = ratings.movieId AND movies.title LIKE '%s';
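If the '%s' in that query is filled in on the Java side before execution, a PreparedStatement version of the same browse query might look like the sketch below (the pattern variable and its contents, e.g. a title prefix picked on the browse page, are illustrative assumptions):

```java
String browseQuery = "SELECT * FROM movies, ratings " +
        "WHERE movies.id = ratings.movieId AND movies.title LIKE ?";
PreparedStatement browseStatement = conn.prepareStatement(browseQuery);
browseStatement.setString(1, pattern); // e.g. "A%" to browse titles starting with "A"
ResultSet browseResults = browseStatement.executeQuery();
```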
For mains243.xml parsing, we decided that genres would be based on the abbreviations provided by the XML website. We did this by using a HashMap from abbreviation to genre name, but we modified some names from the site if they already existed in the database. We converted all abbreviations to lowercase to compare against the map before converting to the full name. For example, the Stanford site maps "Romt" -> "romantic", but we mapped "romt" -> "Romance" instead because Romance already exists as a genre in the database. However, if a genre did not match any of the valid abbreviations' formats, we wrote its invalid value to mainerrors.txt.
When building the INSERT strings, backslashes (\) and double quotes (the latter removed specifically from movie titles) were stripped to avoid breaking the statements.
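A minimal sketch of this lookup and sanitization, where the class name, method names, and every map entry other than "romt" -> "Romance" are illustrative assumptions:

```java
import java.util.HashMap;
import java.util.Map;

public class GenreMapper {
    // Abbreviation-to-genre map; only "romt" -> "Romance" is confirmed above,
    // the other entries are hypothetical examples.
    private static final Map<String, String> GENRE_ABBREVIATIONS = new HashMap<>();
    static {
        GENRE_ABBREVIATIONS.put("romt", "Romance");
        GENRE_ABBREVIATIONS.put("dram", "Drama");
        GENRE_ABBREVIATIONS.put("comd", "Comedy");
    }

    // Returns the full genre name, or null for an invalid abbreviation
    // (which the parser would then write to mainerrors.txt).
    public static String toGenreName(String abbreviation) {
        return GENRE_ABBREVIATIONS.get(abbreviation.toLowerCase());
    }

    // Strip backslashes and double quotes so a value cannot break an INSERT string.
    public static String sanitize(String value) {
        return value.replace("\\", "").replace("\"", "");
    }
}
```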
For actors63.xml parsing, we decided that stars would be uniquely identified by their stagename only, since that was how they were linked to movies in the casts XML. A HashSet of Strings (star names) was used to keep track of only unique star names. For dates of birth, any + at the end of a string is stripped off to extract the year. Additionally, any blank or "n.a." years are stored as null and printed as "N/A" for the star; otherwise, if the year is not numerical, it is treated as invalid. These errors are recorded in actorerrors.txt.
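A sketch of that birth-year handling (the method name and exception behavior are assumptions):

```java
// Returns the birth year, or null for blank/"n.a." values (shown as "N/A" on
// the star page). Any other non-numeric year throws NumberFormatException,
// which the parser would record in actorerrors.txt.
public static Integer parseBirthYear(String raw) {
    String year = raw == null ? "" : raw.trim();
    if (year.endsWith("+")) {
        year = year.substring(0, year.length() - 1); // strip the trailing '+'
    }
    if (year.isEmpty() || year.equalsIgnoreCase("n.a.")) {
        return null;
    }
    return Integer.parseInt(year); // non-numeric years are invalid
}
```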
For casts124.xml parsing, we decided to add any star we found that was not already in the database, using their stagename as both the unique identifier and their name in the database. We also decided not to add movies found in the XML file that were not already in the database; these movies are written to the file casterrors.txt.
For our first optimization method, we used HashMaps and HashSets to hold the results of querying the database before inserting new data. Since the access runtime of both is O(1), this allowed us to quickly check for duplicate elements before inserting, instead of having to requery the database every time we wanted to check whether we were inserting a new element.
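A sketch of this pattern for stars (the query, method name, and surrounding code are illustrative assumptions):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.HashSet;
import java.util.Set;

// Load all existing star names once, so duplicate checks during parsing are
// O(1) set lookups instead of one database query per candidate insert.
static Set<String> loadExistingStars(Connection conn) throws SQLException {
    Set<String> existingStars = new HashSet<>();
    try (Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT name FROM stars")) {
        while (rs.next()) {
            existingStars.add(rs.getString("name"));
        }
    }
    return existingStars;
}
```

During parsing, existingStars.contains(stageName) then replaces a round trip to the database for every potential insert.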
For our second optimization method, as we found new elements, we generated SQL INSERT statements into a file, with BEGIN; at the top and COMMIT; at the bottom. This meant all the SQL statements were executed together as one transaction, which was much quicker than making and executing one statement for every single new item.
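A sketch of generating such a file (the file name and writer setup are assumptions):

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

// Wrap every generated INSERT in a single transaction so the whole batch
// commits at once instead of one statement at a time.
static void writeInsertFile(List<String> insertStatements) throws IOException {
    try (PrintWriter out = new PrintWriter("inserts.sql")) {
        out.println("BEGIN;");
        for (String insert : insertStatements) {
            out.println(insert); // e.g. INSERT INTO stars VALUES (...);
        }
        out.println("COMMIT;");
    }
}
```

The resulting file can then be executed in one shot, e.g. mysql -u root -p moviedb < inserts.sql.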
- Connection pooling is configured in context.xml with the following line of code:
url="jdbc:mysql://localhost:3306/moviedb?autoReconnect=true&allowPublicKeyRetrieval=true&useSSL=false&cachePrepStmts=true"
- The files that use Connection Pooling are here.
- Originally in the Fabflix code, every servlet would create a new Connection to moviedb to execute PreparedStatements. Connection Pooling is utilized in the Fabflix code by configuring it in context.xml, which creates a set number of cached connections to moviedb. Then, instead of creating a new Connection every time a servlet is called, each servlet call just grabs an available Connection from the pool and returns it afterward (by "closing" it). This saves time because the same pool of connections can be reused across servlet calls, rather than wasting time on opening and closing new connections. A sketch of this pattern appears below.
- With two SQL databases in the backend (or just more than one), Connection Pooling works similarly to what is described above: a set of connections to either database1 or database2 is defined, and Connections are grabbed from and returned to the pool as they are used by servlets. Again, this saves time because a completely new Connection does not have to be opened and closed for every servlet call.
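A minimal sketch of how a servlet borrows and returns a pooled Connection via JNDI, assuming the pool is registered in context.xml under the (hypothetical) resource name jdbc/moviedb and that the enclosing method declares the checked exceptions:

```java
import java.sql.Connection;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.sql.DataSource;

// Look up the pooled DataSource that context.xml defines.
Context envContext = (Context) new InitialContext().lookup("java:comp/env");
DataSource dataSource = (DataSource) envContext.lookup("jdbc/moviedb");

// try-with-resources "closes" the Connection, which in a pooled setup
// returns it to the pool for the next servlet call to reuse.
try (Connection conn = dataSource.getConnection()) {
    // ... create PreparedStatements and run queries as usual ...
}
```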
- When setting up database connections for the Slave instance, its MySQL connections point to the Master's IP address, while the Master's connections use its own local MySQL setup. This means that Read/Write requests can be sent to the Master, while only Read requests are sent to the Slave (Write requests are redirected to the Master). A sketch of this routing follows below.
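Extending the previous sketch (and reusing its envContext), read and write traffic could be split across two pools; the resource names and the isWriteRequest flag are illustrative assumptions:

```java
// Two pools defined in context.xml: on the Slave, the "write" pool points at
// the Master's IP, while the "read" pool points at the local MySQL.
DataSource readSource = (DataSource) envContext.lookup("jdbc/readMoviedb");
DataSource writeSource = (DataSource) envContext.lookup("jdbc/writeMoviedb");

try (Connection conn = (isWriteRequest ? writeSource : readSource).getConnection()) {
    // ... execute the statement against the appropriate database ...
}
```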
- The log files for these results are located in logs/. The log files' format has each line as TS TJ. To run the log processing file, run python3 log_processing.py from within the project directory. This may have to be run with sudo, depending on permissions.
- The log files for these results are located in