Skip to content

DjangoREST & Vue.js apps, that visualize some metrics & stats for the incidents that happen to Chicago City.

License

Notifications You must be signed in to change notification settings

vtsiatouras/311-chicago-incidents

Repository files navigation

311 Chicago Incidents

Django CI Vue.js CI codecov Python Django Vue.js PostgreSQL

This repository contains 2 applications, DjangoREST & Vue.js, that visualize some metrics & stats for the incidents that happen to Chicago City.

All the data used for the development can be found here. Also I have uploaded some of these data inside this repository, which can be found here.

Demo

demo

DjangoREST Application

Installation from source

This section contains the installation instructions in order to set up a local development environment. The instructions have been validated for Ubuntu 20.04.

First, install all required software with the following command:

sudo apt update
sudo apt install git python3 python3-pip python3-dev postgresql postgresql-contrib 

The project dependencies are managed with pipenv. You can install it with:

pip install --user pipenv

pipenv should now be in your PATH. If not, logout and log in again. Then install all dependencies with:

pipenv install --dev

Then you can enable the python environment with:

pipenv shell

All commands from this point forward require the python environment to be enabled.

Environment variables

The project uses environment variables in order to keep private data like user names and passwords out of source control. You can either set them at system level, or by creating a file named .env at the root of the repository. The required environment variables for development are:

  • CHICAGO_INCIDENT_DATABASE_USER: The database user
  • CHICAGO_INCIDENT_DATABASE_PASSWORD: The database user password
  • CHICAGO_INCIDENT_DATABASE_HOST: The database host. For local development use localhost
  • CHICAGO_INCIDENT_DATABASE_NAME: The database name.

Local Development

In order to run the project on your workstation, you must create a database named according to the value of the CHICAGO_INCIDENT_DATABASE_NAME environment variable, at the host that is specified by the CHICAGO_INCIDENT_DATABASE_HOST environment variable. You can create the database by running:

sudo -u postgres psql
postgres=# CREATE DATABASE chicago_incident_development_db;

After you create the database, you can populate it with the initial schema by running:

python manage.py migrate
python manage.py createcachetable cache_table

and load the initial data with:

python manage.py import_incidents_from_csvs [csv_files]

where csv_files is the path of the csv files (one or more) to import to the database

Example

python manage.py import_incidents_from_csvs ../assist_material/datasets/csv/311-service-requests-abandoned-vehicles.csv ../assist_material/datasets/csv/311-service-requests-alley-lights-out.csv ../assist_material/datasets/csv/311-service-requests-pot-holes-reported.csv  

Now you can run the web server with:

python manage.py runserver

The API is available at http://127.0.0.1:8000/api/v1/

The documentation Swagger page of the API is available at http://127.0.0.1:8000/api/swagger

Also the Django Admin page is available at http://127.0.0.1:8000/admin/

Vue.js Application

Installation from source

First, Yarn should be installed on your machine. The following works for Ubuntu 20.04.

curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
sudo apt-get update && sudo apt-get install yarn

After, install all the dependencies of the application through Yarn package manager.

yarn install

Compiles and hot-reloads for development

yarn serve

Compiles and minifies for production

yarn build

Lints and fixes files

yarn lint

The application is available at http://127.0.0.1:8081/

Installation using Docker

I have created Dockerfiles on both applications and with the docker-compose.yml you can build a database instance , the Django API & the Vue.js client at the same time. I recommend this way of installation in order to keep your machine clean from packages that you may not use ever again.

Initially, install Docker Engine (click the link to see instructions) & Docker Compose in order to build the project.

Set up the .env at the root of the repository!

  • CHICAGO_INCIDENT_DATABASE_USER: The database user
  • CHICAGO_INCIDENT_DATABASE_PASSWORD: The database user password
  • CHICAGO_INCIDENT_DATABASE_HOST: db The host name must be db
  • CHICAGO_INCIDENT_DATABASE_NAME: The database name.

Then just execute the following:

docker-compose up --build

Then you have the database, the API & the Vue.js client up and running!

In order to perform the import of the data you can log in to the running docker container and perform the process manually.

docker exec -it api bash

Now you have access to all the files of the API. You can now run the import command as mentioned above in the Local Development section.

Also, it is recommended to create a superuser in order to have access to the admin the swagger-documentation pages.

Run

python manage.py createsuperuser

The database is exposed at jdbc:postgresql://localhost:5433/

The API, the Swagger page and the Admin page are available to the same addresses that referred above. The Vue.js client is available at http://127.0.0.1:5000/

Database Report

Schema

database-schema

Some info about the schema and the decisions that were made in order to conclude to this state.

  1. The incidents table is the core table of the database and contains the information that is mutual to all types of incidents, such as creation_date, completion_date, service_request_number, type_of_service_request , street_address, zip_code, ssa & the info that appears to the newer API of Chicago Incident.

  2. The activities table contains current_activity & most_recent_action. The purpose of this table is to reduce the duplication of data, many incidents has the same set of current_activity & most_recent_action.

  3. The abandoned_vehicles table contains license_plate, vehicle_make_model, vehicle_color. This table created in order to avoid null fields to incidents table (not all incidents are ABANDONED_VEHICLE) and keep the data about vehicles separated making the queries that are related to abandoned vehicles more efficient.

  4. The same concept of 3 is applied also for tree, graffiti & sanitation code violation incidents.

  5. The tables abandoned_vehicles, trees, graffiti & sanitation_code_violations are not connected directly to incidents table, we use intermediate tables that holds the foreign keys of each record of each table and that way we can have the same vehicle, tree, graffiti, sanitation violation appear to more than one incident without holding duplicate information.

  6. The table number_of_carts_and_potholes holds the info about carts and potholes incidents at one table, because the payload is just a simple integer.

  7. The tables rodent_baiting_premises & number_of_carts_and_potholes are connected directly to incidents table by holding the foreign key to incidents to each record. This may have some duplication in the data, but it is ok because the payload is just integer numbers.

  8. You can look the models.py in order to see how actually all these are implemented and to check out the indices that used on each table.

Queries

  1. Find the total requests per type that were created within a specified time range and sort them in descending order.

    SELECT "incidents"."type_of_service_request", COUNT("incidents"."type_of_service_request") AS
    "number_of_requests" FROM "incidents" WHERE ("incidents"."creation_date" >= ? AND
    "incidents"."creation_date" <= ?) GROUP BY "incidents"."type_of_service_request"
    ORDER BY "number_of_requests" DESC;
    queryset = Incident.objects.filter(creation_date__gte=data.get('start_date'),
                                       creation_date__lte=data.get('end_date')) \
       .values('type_of_service_request') \
       .annotate(number_of_requests=Count('type_of_service_request')) \
       .order_by('-number_of_requests')
  2. Find the total requests per day for a specific request type and time range.

    SELECT "incidents"."creation_date", COUNT("incidents"."service_request_number") AS "number_of_requests"
    FROM "incidents" WHERE ("incidents"."creation_date" >= ? AND "incidents"."creation_date" <= ?
    AND "incidents"."type_of_service_request" = ?)
    GROUP BY "incidents"."creation_date" 
    ORDER BY "incidents"."creation_date" ASC;
    queryset = Incident.objects.filter(type_of_service_request=data.get('type_of_service_request'),
                                       creation_date__gte=data.get('start_date'),
                                      creation_date__lte=data.get('end_date')) \
       .values('creation_date') \
       .annotate(number_of_requests=Count('service_request_number')) \
       .order_by('creation_date')
  3. Find the most common service request per zipcode for a specific day.

    SELECT DISTINCT ON ("incidents"."zip_code") "incidents"."zip_code", "incidents"."type_of_service_request",
    COUNT("incidents"."type_of_service_request") AS "number_of_requests"
    FROM "incidents"
    WHERE "incidents"."zip_code" IS NOT NULL AND "incidents"."creation_date" = ?
    GROUP BY "incidents"."zip_code", "incidents"."type_of_service_request"
    ORDER BY "incidents"."zip_code", "number_of_requests" DESC;
  4. Find the average completion time per service request for a specific date range.

    SELECT "incidents"."type_of_service_request",
    AVG(("incidents"."completion_date" - "incidents"."creation_date")) AS "average_completion_time"
    FROM "incidents"
    WHERE ("incidents"."completion_date" IS NOT NULL
    AND "incidents"."creation_date" >= ?
    AND "incidents"."creation_date" <= ?)
    GROUP BY "incidents"."type_of_service_request"
    ORDER BY "incidents"."type_of_service_request" ASC;
    queryset = Incident.objects.filter(creation_date__gte=data.get('start_date'),
                                       creation_date__lte=data.get('end_date')) \
       .values('type_of_service_request') \
       .annotate(average_completion_time=Avg(F('completion_date') - F('creation_date'))) \
       .order_by('type_of_service_request')
  5. Find the most common service request in a specified bounding box (as designated by GPS- coordinates) for a specific day.

    SELECT "incidents"."type_of_service_request",
    COUNT("incidents"."type_of_service_request") AS "number_of_requests"
    FROM "incidents"
    WHERE ("incidents"."creation_date" = ? AND "incidents"."latitude" >= ?
    AND "incidents"."latitude" <= ? AND "incidents"."longitude" >= ?
    AND "incidents"."longitude" <= ?)
    GROUP BY "incidents"."type_of_service_request"
    ORDER BY "number_of_requests" DESC
    LIMIT 1;
    queryset = Incident.objects.filter(creation_date=data.get('date'),
                                       latitude__range=[data.get('b_latitude'), data.get('a_latitude')],
                                       longitude__range=[data.get('a_longitude'), data.get('b_longitude')]) \
        .values('type_of_service_request') \
        .annotate(number_of_requests=Count('type_of_service_request')) \
        .order_by('-number_of_requests')[:1]
  6. Find the top-5 Special Service Areas (SSA) with regards to total number of requests per day for a specific date range (for service requests types that SSA is available: abandoned vehicles, garbage carts, graffiti removal, pot holes reported)

    SELECT "incidents"."ssa", COUNT("incidents"."service_request_number") AS "number_of_requests"
    FROM "incidents"
    WHERE ("incidents"."creation_date" >= %s AND "incidents"."creation_date" <= %s
    AND "incidents"."ssa" IS NOT NULL)
    GROUP BY "incidents"."ssa"
    ORDER BY "number_of_requests" DESC
    LIMIT 5;
    queryset = Incident.objects.filter(creation_date__gte=data.get('start_date'),
                                       creation_date__lte=data.get('end_date'),
                                       ssa__isnull=False) \
        .values('ssa') \
        .annotate(number_of_requests=Count('service_request_number')) \
        .order_by('-number_of_requests')[:5]
  7. Find the license plates (if any) that have been involved in abandoned vehicle complaints more than once.

    SELECT "abandoned_vehicles"."license_plate", COUNT(DISTINCT "incidents"."street_address") AS "number_of_requests"
    FROM "abandoned_vehicles"
    LEFT OUTER JOIN "abandoned_vehicles_incidents"
    ON ("abandoned_vehicles"."id" = "abandoned_vehicles_incidents"."abandoned_vehicle_id")
    LEFT OUTER JOIN "incidents"
    ON ("abandoned_vehicles_incidents"."incident_id" = "incidents"."id")
    WHERE "abandoned_vehicles"."license_plate" IS NOT NULL
    AND "incidents"."status" = 'OPEN'
    GROUP BY "abandoned_vehicles"."license_plate"
    HAVING COUNT(DISTINCT "incidents"."street_address") > 1;
  8. Find the second most common color of vehicles involved in abandoned vehicle complaints.

    SELECT "abandoned_vehicles"."vehicle_color", COUNT("abandoned_vehicles"."vehicle_color") AS "color_count"
    FROM "abandoned_vehicles"
    WHERE "abandoned_vehicles"."vehicle_color" IS NOT NULL
    GROUP BY "abandoned_vehicles"."vehicle_color"
    ORDER BY "color_count" DESC
    LIMIT 1 OFFSET 1;
    queryset = AbandonedVehicle.objects.filter(vehicle_color__isnull=False) \
        .values('vehicle_color') \
        .annotate(color_count=Count('vehicle_color')) \
        .order_by('-color_count')[1:2]
  9. Find the rodent baiting requests where the number of premises baited/rats/garbage is less than a specified number.

    SELECT "incidents"."id", "incidents"."service_request_number", "incidents"."type_of_service_request",
    "incidents"."street_address", "incidents"."zip_code", "incidents"."latitude", "incidents"."longitude"
    FROM "incidents"
    INNER JOIN "rodent_baiting_premises"
    ON ("incidents"."id" = "rodent_baiting_premises"."incident_id")
    WHERE "rodent_baiting_premises"."number_of_premises_baited" < ?;
  10. For the Query 10 we can adapt the last line of the above query to the following

    WHERE "rodent_baiting_premises"."number_of_premises_w_garbage" < ?
  11. Similar with the Query 10

    WHERE "rodent_baiting_premises"."number_of_premises_w_rats" < ?

    Combined all together to one Django ORM query:

    queryset = Incident.objects.values('id', 'service_request_number', 'type_of_service_request',
                                       'street_address', 'zip_code', 'latitude', 'longitude')
    if type_of_premises == serializers.RodentBaitingParams.BAITED:
        queryset = queryset.filter(rodent_baiting_premises__number_of_premises_baited__lt=data.get('threshold')) \
            .order_by('id')
    elif type_of_premises == serializers.RodentBaitingParams.GARBAGE:
        queryset = queryset.filter(rodent_baiting_premises__number_of_premises_w_garbage__lt=data
                                   .get('threshold')) \
            .order_by('id')
    elif type_of_premises == serializers.RodentBaitingParams.RATS:
        queryset = queryset.filter(rodent_baiting_premises__number_of_premises_w_rats__lt=data.get('threshold')) \
            .order_by('id')
  12. Find the police districts that have handled “pot holes” requests with more than one number of potholes on the same day that they also handled “rodent baiting” requests with more than one number of premises baited, for a specific day.

    SELECT "incidents"."police_district",
    SUM("rodent_baiting_premises"."number_of_premises_baited") AS "rodent_baiting_sum",
    SUM("number_of_carts_and_potholes"."number_of_elements") AS "potholes_sum"
    FROM "incidents"
    LEFT OUTER JOIN "rodent_baiting_premises"
    ON ("incidents"."id" = "rodent_baiting_premises"."incident_id")
    LEFT OUTER JOIN "number_of_carts_and_potholes"
    ON ("incidents"."id" = "number_of_carts_and_potholes"."incident_id")
    WHERE ("incidents"."completion_date" = ?
    AND
    ("incidents"."type_of_service_request" = 'RODENT_BAITING' OR
    "incidents"."type_of_service_request" = 'POT_HOLE'))
    GROUP BY "incidents"."police_district"
    HAVING
    (SUM("number_of_carts_and_potholes"."number_of_elements") > 1
    AND SUM("rodent_baiting_premises"."number_of_premises_baited") > 1)
    ORDER BY "incidents"."police_district" ASC;
    queryset = Incident.objects.filter(Q(completion_date=data.get('date')) & (
            Q(type_of_service_request=Incident.RODENT_BAITING) | Q(type_of_service_request=Incident.POT_HOLE))) \
        .values('police_district') \
        .annotate(rodent_baiting_sum=Sum('rodent_baiting_premises__number_of_premises_baited'),
                  potholes_sum=Sum('number_of_carts_and_potholes__number_of_elements')) \
        .filter(rodent_baiting_sum__gt=1, potholes_sum__gt=1) \
        .order_by('police_district')

About

DjangoREST & Vue.js apps, that visualize some metrics & stats for the incidents that happen to Chicago City.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published