Random Data Maker is a REST web service for generating random data sets from a defined model, including personal data, GeoJSON feature collections, and medical metrics data, for use in machine learning, tests, mocks, etc.
- Python 3.7.5
- MySQL 8.0 with Docker support provided
See requirements.txt
- Django 2.2.7
- django-cors-headers 3.1.1
- djangorestframework 3.10.3
- enum34 1.1.6
- jsonfield 2.0.2
- mysqlclient 1.4.5
- sqlparse 0.3.0
Create an external Docker network named webproxy:
docker network create webproxy
Run the container stack:
docker-compose up
To run this app you will need an external MySQL database: either your own, or one created with the Docker Compose file included in the project:
docker-compose -f docker-compose-db-only.yml up
Create a virtual environment and install the dependencies:
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt
Create and apply the migrations:
python manage.py makemigrations
python manage.py migrate --settings=peselgen.settings-dev
And run the app:
python manage.py runserver --settings=peselgen.settings-dev
/generate
- method: POST - generates specified number of: persons, metrics, geolocations and attributes
/person
- method: GET - returns generated persons
- method: POST - generates specified number of persons
- method: DELETE - deletes all persons
/metrics
- method: GET - returns generated metrics
- method: POST - generates specified number of metrics
- method: DELETE - deletes all metrics
/attributes
- method: GET - returns generated attributes
- method: POST - generates specified number of attributes
- method: DELETE - deletes all attributes
/geolocation
- method: GET - returns generated geolocation collections
- method: POST - generates specified number of geolocation collections; the number of points to generate and the country can be specified in body parameters
- method: DELETE - deletes all geolocations
The number of records to generate can be specified in a body parameter.
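As a rough illustration of calling a generating endpoint, the sketch below builds (but does not send) a POST request with Python's standard library. The base URL is Django's default development server address, the /api/ prefix is taken from the anonymization example in this README, and the body parameter name `count` is an assumption; the Swagger interface should show the actual name.

```python
import json
from urllib.request import Request

# Assumptions: Django's default dev server address and an "/api/" prefix.
BASE_URL = "http://127.0.0.1:8000/api"


def build_generate_request(resource: str, count: int) -> Request:
    """Build a POST request asking the service to generate `count` records.

    The body parameter name "count" is hypothetical, not confirmed by the README.
    """
    body = json.dumps({"count": count}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{resource}/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("person", 10)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running instance of the app.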
Generated person entities contain:
- id: int, primary key
- first_name: varchar(30)
- last_name: varchar(30)
- pesel: varchar(11) - unique person identification number based on birth date and sex
- email: varchar(64)
- phone: varchar(12)
- password: varchar(16)
- sex: varchar(1) - '1' - male; '2' - female
- birth_date: datetime - generated based on the Polish General Statistics Department's data on birth counts
- geolocation_id: varchar - foreign key, references id from geolocation
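To show how a PESEL number depends on birth date and sex, here is a minimal sketch of the standard PESEL layout: six date digits (with a century offset added to the month), a serial part whose last digit is odd for men and even for women, and a check digit. The function names are hypothetical and this is not necessarily how the app's own generator works.

```python
from datetime import date

# Weights applied to the first ten digits when computing the check digit.
PESEL_WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)

# Offset added to the month number to encode the century of birth.
CENTURY_MONTH_OFFSET = {18: 80, 19: 0, 20: 20, 21: 40, 22: 60}


def pesel_check_digit(first_ten: str) -> int:
    """Return the eleventh (check) digit for the first ten PESEL digits."""
    total = sum(int(d) * w for d, w in zip(first_ten, PESEL_WEIGHTS))
    return (10 - total % 10) % 10


def build_pesel(birth_date: date, serial: int, sex_digit: int) -> str:
    """Assemble a PESEL from a birth date, a 3-digit serial and a sex digit.

    sex_digit must be odd for a male and even for a female.
    """
    month = birth_date.month + CENTURY_MONTH_OFFSET[birth_date.year // 100]
    first_ten = (
        f"{birth_date.year % 100:02d}{month:02d}{birth_date.day:02d}"
        f"{serial:03d}{sex_digit:d}"
    )
    return first_ten + str(pesel_check_digit(first_ten))


def sex_code(pesel: str) -> str:
    """Map the PESEL sex digit to this README's encoding: '1' male, '2' female."""
    return "1" if int(pesel[9]) % 2 == 1 else "2"


pesel = build_pesel(date(1944, 5, 14), serial=13, sex_digit=5)
print(pesel, sex_code(pesel))
```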
Patient metrics entities contain:
- id: int, primary key
- patient_id: varchar, foreign key, references id from person
- doctor_id: varchar
- created: datetime
- attributes_id: varchar, foreign key, references id from attributes
- notes: varchar(1024)
The dynamic data model is defined in JSON format and specifies what kind of data the app should generate. It is written as a list of objects with predefined attributes:
- key - string value representing the object name
- count: optional - integer value representing how many values should be generated, default = 1
- Generating values from an array:
  - array - array of values from which data will be generated
  - weights: optional - array of numbers giving the probability of occurrence of each element of array; must be the same length as array
- Generating numbers between two values; the type is resolved dynamically (1 - discrete value, 1.0 - continuous value):
  - minimum - smallest value that can be generated
  - maximum - largest value that can be generated
  - floating_points: optional - number of digits after the decimal point
- Generating datetime between two values:
  - min_date - smallest datetime value that can be generated
  - max_date - largest datetime value that can be generated
  - date_format: optional - default format is: '%m/%d/%Y %I:%M %p'
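The attributes above can be read as three generation modes. The following is a minimal sketch of how such a model might be interpreted, not the app's actual generator; the function name, the flat layout of each model object, and the sample keys and values are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta

DEFAULT_DATE_FORMAT = "%m/%d/%Y %I:%M %p"


def generate_values(spec: dict, rng: random.Random) -> list:
    """Generate `count` values for one model object (hypothetical helper)."""
    count = spec.get("count", 1)
    if "array" in spec:
        # Weighted choice; weights=None means a uniform choice.
        return rng.choices(spec["array"], weights=spec.get("weights"), k=count)
    if "minimum" in spec:
        low, high = spec["minimum"], spec["maximum"]
        if isinstance(low, int) and isinstance(high, int):
            # Integer bounds: discrete values.
            return [rng.randint(low, high) for _ in range(count)]
        # Float bounds: continuous values, optionally rounded.
        digits = spec.get("floating_points")
        values = [rng.uniform(low, high) for _ in range(count)]
        return [round(v, digits) for v in values] if digits is not None else values
    if "min_date" in spec:
        fmt = spec.get("date_format", DEFAULT_DATE_FORMAT)
        start = datetime.strptime(spec["min_date"], fmt)
        end = datetime.strptime(spec["max_date"], fmt)
        span = (end - start).total_seconds()
        return [
            (start + timedelta(seconds=rng.uniform(0, span))).strftime(fmt)
            for _ in range(count)
        ]
    raise ValueError(f"Unrecognized model object: {spec.get('key')}")


# Hypothetical model; the keys and values are made up for this example.
model = [
    {"key": "blood_type", "array": ["A", "B", "AB", "0"], "weights": [4, 2, 1, 3], "count": 3},
    {"key": "heart_rate", "minimum": 50, "maximum": 120},
    {"key": "temperature", "minimum": 35.0, "maximum": 41.0, "floating_points": 1},
    {"key": "visit", "min_date": "01/01/2020 08:00 AM", "max_date": "12/31/2020 04:00 PM"},
]

rng = random.Random(0)
generated = {spec["key"]: generate_values(spec, rng) for spec in model}
print(generated)
```

Dispatching on which attributes are present keeps each model object self-describing, so no separate type field is needed in this sketch.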
Geolocation entities contain:
- id: varchar, foreign key, references id from person
- geojson: json, includes location based on the GeoJSON model - default is a random point within the Poland polygon
Data from any endpoint can be returned in anonymized form. To do so, list the sensitive fields in the anonymize_array parameter, e.g. /api/person/?anonymize_array=pesel,first_name in the GET person method. To get the full raw data, set the parameter to 'none'.
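A small sketch of building such a URL in Python follows. The helper name is hypothetical; it joins the field names with commas and falls back to 'none' when no fields are given, matching the behaviour described above.

```python
from urllib.parse import urlencode


def anonymized_url(path: str, fields: list) -> str:
    """Append the anonymize_array query parameter to an endpoint path."""
    value = ",".join(fields) if fields else "none"
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{path}?{urlencode({'anonymize_array': value}, safe=',')}"


print(anonymized_url("/api/person/", ["pesel", "first_name"]))
print(anonymized_url("/api/person/", []))
```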
The API can be tried out with the Swagger interface under the default application address.
Random Data Maker is a REST API app with defined endpoints for communicating with it. It has three kinds of endpoints: generating a number of random records (POST), getting them from the database as JSON (GET), and flushing tables (DELETE). When a POST method is used, the app generates random data, e.g. personal data or metrics data, using its own random data generator and saves it in the database. This generator takes a JSON file with a defined data model and uses it to generate new data. Once the data is in the database, it can be accessed with the GET methods. The project was originally made as support for the medical machine learning system OCULUS Project.
We're Computer Science students at Poznan University of Technology. We like programming and had a lot of fun creating this project.
Artur Bałczyński
Mateusz Ostrowski
Adam Przywuski
Maciej Stosik