Developer Guide: Database
BugQuery uses a MySQL database to store its queries, posts, etc. This guide covers the following:
- Setting up a connection to the database
- Importing posts from StackOverflow to the database
BugQuery stores the login information for the database in a properties file, located at BugQueryServerSide/src/main/resources/application.properties. For your convenience, an example properties file is provided in the repository at BugQueryServerSide/src/main/resources/application.properties.example. The file should contain the following properties:
- If this is the first time you run BugQuery against this database, you will also need to add spring.jpa.hibernate.ddl-auto=create at the beginning of the file. This creates our tables in the database. Don't forget to remove it after the first run! With this setting, Hibernate drops and recreates the tables on every startup.
- spring.datasource.url - The address of the database server. Since we use JDBC and MySQL, the format is jdbc:mysql://<server_ip>:<server_port>/<database>
- spring.datasource.username - Username for MySQL
- spring.datasource.password - Password for MySQL
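Before starting the server, it can help to sanity-check the datasource settings. The following is a minimal sketch, not part of BugQuery: the class name DatasourceConfigCheck, its methods, and the placeholder host, port, database name, and credentials are all assumptions made for illustration. It only checks that the three datasource keys are present and that the URL follows the jdbc:mysql:// format described above; it does not open a connection.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (not part of BugQuery): checks the datasource settings.
class DatasourceConfigCheck {

    // Builds a URL in the format jdbc:mysql://<server_ip>:<server_port>/<database>.
    static String buildJdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Parses properties text; fails if a datasource key is missing
    // or if the URL does not use the MySQL JDBC scheme.
    static Properties loadAndValidate(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String[] required = {
            "spring.datasource.url",
            "spring.datasource.username",
            "spring.datasource.password"
        };
        for (String key : required) {
            if (props.getProperty(key) == null) {
                throw new IllegalArgumentException("Missing property: " + key);
            }
        }
        String url = props.getProperty("spring.datasource.url");
        if (!url.startsWith("jdbc:mysql://")) {
            throw new IllegalArgumentException(
                "spring.datasource.url must start with jdbc:mysql://");
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values (assumptions): a local server on MySQL's
        // default port 3306 and a database named "bugquery".
        String text = String.join("\n",
            "spring.datasource.url=" + buildJdbcUrl("127.0.0.1", 3306, "bugquery"),
            "spring.datasource.username=bugquery_user",
            "spring.datasource.password=change_me");
        Properties props = loadAndValidate(text);
        System.out.println(props.getProperty("spring.datasource.url"));
    }
}
```

To check a real file, read its contents into a string and pass them to loadAndValidate. A missing key or a malformed URL raises an IllegalArgumentException naming the problem.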
Note: because of the size of StackOverflow's dump, it is practically necessary to work with a local MySQL server for this step, and even then the process is expected to take a few hours. After importing the posts, you can export the key tables, which are only about 1GB in size, to a remote server.
- About 150GB of available space (depends on StackOverflow's dump size)
- StackOverflow dump (can be downloaded here, under stackoverflow.com-Posts.7z)
- A MySQL server (obviously 😉)
- Extract Posts.xml from stackoverflow.com-Posts.7z
- (Assuming the MySQL server was started with --secure-file-priv, which is the default. Skip this step otherwise) Log in to the MySQL server's CLI and run SHOW VARIABLES LIKE "secure_file_priv";. Move the extracted Posts.xml to the returned folder.
- Run application.java with the flag --updateDB <path_to_Posts.xml>
As mentioned before, this process is expected to take a few hours.
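The file-moving step above can also be scripted. The following is a minimal sketch, not part of BugQuery: the class PostsFileStager and its stage method are hypothetical names. It assumes you have already looked up the secure_file_priv folder with the SHOW VARIABLES query and that your user has write permission on that folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of BugQuery): moves the extracted
// Posts.xml into the folder reported by secure_file_priv.
class PostsFileStager {

    // Moves the file into the given directory, keeping its file name,
    // and returns the new location.
    static Path stage(Path extractedPostsXml, Path secureFilePrivDir) {
        if (!Files.isRegularFile(extractedPostsXml)) {
            throw new IllegalArgumentException("Not a file: " + extractedPostsXml);
        }
        if (!Files.isDirectory(secureFilePrivDir)) {
            throw new IllegalArgumentException("Not a directory: " + secureFilePrivDir);
        }
        Path target = secureFilePrivDir.resolve(extractedPostsXml.getFileName());
        try {
            // Overwrites a Posts.xml left over from an earlier attempt.
            return Files.move(extractedPostsXml, target,
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println(
                "Usage: PostsFileStager <path_to_Posts.xml> <secure_file_priv_folder>");
            return;
        }
        Path staged = stage(Paths.get(args[0]), Paths.get(args[1]));
        // The printed path is the value to pass to --updateDB.
        System.out.println("--updateDB " + staged.toAbsolutePath());
    }
}
```

If the query returns an empty value, MySQL does not restrict the import location and the file can stay where it is. If it returns NULL, file import is disabled on that server and the server configuration has to be changed first.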