Neural Network Project

The goal of this project is to generate additional data samples from a structured dataset using a Generative Adversarial Network (GAN). See our poster here.

Replication instructions

Set up

Begin by cloning the repo, creating a virtual environment, and installing the packages:

git clone https://github.com/despresj/ece_colab.git
cd ece_colab
python3 -m venv ece884_project_enviroment
source ece884_project_enviroment/bin/activate
pip install -r requirements.txt

After that, load the required libraries.

import os
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

Then import the functions created for this project. Read the code here.

from tools.train import build_network, train_gan # functions from project

Import and scale the data. Note that any tabular dataset with numeric columns should (hopefully) work.

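# taxi_sample.csv is a small subset of the full NYC taxi data
# obtain the full data for 2016 here
df = pd.read_csv("data_clean/taxi_sample.csv")
data_columns = df.columns  # keep the column names; transform() returns a plain NumPy array
scaler = MinMaxScaler().fit(df)  # learn each column's minimum and maximum
df = scaler.transform(df)  # scale every column to the range [0, 1]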
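
For readers unfamiliar with min-max scaling, the following is a minimal NumPy sketch of roughly what `MinMaxScaler` does with its default settings, and of the inverse step used later to map generated samples back to the original units. The helper names (`min_max_fit`, `min_max_transform`, `min_max_inverse`) and the toy array are hypothetical and are not part of this project or of scikit-learn.

```python
import numpy as np


def min_max_fit(x):
    """Return the per-column minimum and range of a 2-D array."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Constant columns have zero range; use 1.0 to avoid dividing by zero.
    col_range[col_range == 0] = 1.0
    return col_min, col_range


def min_max_transform(x, col_min, col_range):
    """Scale each column so its fitted values fall in [0, 1]."""
    return (np.asarray(x, dtype=float) - col_min) / col_range


def min_max_inverse(x_scaled, col_min, col_range):
    """Map scaled values back to the original units."""
    return np.asarray(x_scaled, dtype=float) * col_range + col_min


# Toy data standing in for a numeric table (hypothetical values).
toy = np.array([[1.0, 10.0], [2.0, 30.0], [4.0, 20.0]])
col_min, col_range = min_max_fit(toy)
scaled = min_max_transform(toy, col_min, col_range)
restored = min_max_inverse(scaled, col_min, col_range)
```

Because the generator is trained on scaled data, its output also lives in the scaled space, which is why the inverse step is needed before writing results to disk.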

Train the networks and generate data. The project was developed in Google Colab, but the code below is designed to produce output locally. For the best results, use the full dataset.

for _ in range(5000):
    # GANs like these often converge to a single point. Randomizing the
    # hyperparameters on every run prevents the output from getting stuck on
    # one solution. These ranges are appropriate for this taxi data; another
    # dataset would require a different randomized architecture.
    neurons = np.random.randint(8, 25)
    hidden = np.random.randint(5, 10)
    noise_n = np.random.randint(125, 150)
    epochs = np.random.randint(75, 100)
    learna = np.random.exponential(1e-2)  # generator learning rate
    learnb = np.random.exponential(1e-2)  # discriminator learning rate

    generator = build_network(output_dim=df.shape[1], n_hidden=hidden, n_neurons=neurons, learning_rate=learna)
    discriminator = build_network(output_dim=1, n_hidden=hidden, n_neurons=neurons, learning_rate=learnb)

    gen_data = train_gan(generator, discriminator, df, n_epochs=epochs, n_noise=noise_n)
    output_path = "generated_data.csv"
    # undo the scaling so the generated rows are in the original units
    generated_data = pd.DataFrame(scaler.inverse_transform(gen_data), columns=data_columns)
    # append to the output file, writing the header only when the file is first created
    generated_data.to_csv(output_path, mode='a', header=not os.path.exists(output_path), index=False)
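
The randomized hyperparameters above can also be drawn from a seeded generator so that a run is repeatable. The sketch below is one possible way to do this, assuming NumPy's `default_rng` API; the function name `sample_hyperparameters`, the dictionary keys, and the seed are hypothetical, while the ranges are copied from the loop above.

```python
import numpy as np


def sample_hyperparameters(rng):
    """Draw one random configuration using the ranges from the training loop."""
    return {
        # integers() excludes the upper bound, like np.random.randint
        "n_neurons": int(rng.integers(8, 25)),
        "n_hidden": int(rng.integers(5, 10)),
        "n_noise": int(rng.integers(125, 150)),
        "n_epochs": int(rng.integers(75, 100)),
        # exponential(scale) has mean equal to scale, here 1e-2
        "learning_rate_generator": float(rng.exponential(1e-2)),
        "learning_rate_discriminator": float(rng.exponential(1e-2)),
    }


# A fixed seed (hypothetical choice) makes the sequence of configurations repeatable.
rng = np.random.default_rng(seed=0)
configs = [sample_hyperparameters(rng) for _ in range(3)]
```

Note that the exponential distribution has a long right tail, so an occasional run will draw a learning rate several times larger than 1e-2.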

Basic Structure of a GAN[2]

The majority of the generated data are plausible.

Although plausible, the generated samples require significant curation, hence the cavalier approach to the neural network architecture.
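
This README does not describe how the curation was carried out. As one illustrative possibility only, the sketch below keeps generated rows whose values all fall inside the range observed in the real data. The function `keep_in_range_rows` and the toy arrays are hypothetical and are not taken from this project.

```python
import numpy as np


def keep_in_range_rows(generated, real):
    """Keep generated rows whose every column lies within the real data's min/max."""
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    lower = real.min(axis=0)
    upper = real.max(axis=0)
    # Comparisons with NaN are False, so rows containing NaN are dropped too.
    mask = np.all((generated >= lower) & (generated <= upper), axis=1)
    return generated[mask]


# Toy stand-ins for the real and generated tables (hypothetical values).
real = np.array([[1.0, 5.0], [3.0, 9.0]])
generated = np.array([[2.0, 6.0], [-1.0, 7.0], [2.5, 12.0], [np.nan, 6.0]])
kept = keep_in_range_rows(generated, real)
```

A range check like this only removes obviously impossible rows; it does not verify that relationships between columns are realistic.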
