Tracking

From here on I forgot to do the daily tracking; below is a summary of the remaining project work.

Caffe Multi-class Precision and Recall
Debug memory consumption in word2vec (float64 to float32)
Develop configuration file infrastructure
Extract Wikipedia files in plain text, develop filters to maintain good text quality
Skeleton for web app
Refactoring of training instance generation
Debugging our net/architecture/code for issues regarding our low precision and recall
Write program to generate all configurations for the experiments
Adapt experiment.sh and training.sh files
Continuously run and monitor the experiments over Christmas
Search for audio feature libraries to extract pitch and energy levels, convert raw files, and develop extraction scripts
Run experiment with train and test from Wikipedia
Run experiments with info gain loss matrix to tackle class imbalance
Refactoring of audio feature generation
Audio deployment
Develop executable fusion evaluation
Double check evaluation results and chart plots
Final presentation
Paper
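
A minimal sketch of the multi-class precision and recall computation tracked above, in plain Python rather than as a Caffe layer. The label names (NONE, COMMA, PERIOD) and the function name are assumptions for illustration and are not taken from the project code.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for two equally long label sequences."""
    tp = Counter()  # label predicted and correct
    fp = Counter()  # label predicted but wrong
    fn = Counter()  # label was the truth but something else was predicted
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Define the score as 0.0 when a class was never predicted or never occurs.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical labels; the majority class NONE mimics the class imbalance.
y_true = ["NONE", "NONE", "COMMA", "PERIOD", "NONE", "PERIOD"]
y_pred = ["NONE", "COMMA", "COMMA", "NONE", "NONE", "PERIOD"]
scores = per_class_precision_recall(y_true, y_pred)
```

Reporting the scores per class rather than as overall accuracy makes it visible when a net mostly predicts the majority class.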

Read https://code.google.com/p/word2vec/
Understand distance.c, implement Python program to read word vectors
Explore how to use NLTK for POS tagging
Write Python script for demo showcases
Documentation of demo script
Net configuration script
PlainText parser
Unify python paths and working directory
Help messages + README for Python script folder
Generator usage to reduce memory consumption
Intermediate presentation
Progress for training instance generation and parsing adapted
Python path problems fixed
Config file
Initial LSTM testing
Initial paper stub
Parameter evaluation script improved and generalized
Chart for harmonic mean, comparison of different features
Help Ricarda with error fixing for demo model selection
Backwards compatibility of config files
Convenience download script for downloading all model files needed for demo from server
Try batch processing for speed up of demo (unsuccessful - reverted)
Initial fusion methods and class structure, evaluation of these done by Stefan
Refactoring for improved readability and versatility of JavaScript code
Improved and revised readme file with Ricarda
Fix Acoustic Model Index Conversion
Paper: Worked on introduction, conclusion, demo sections
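
A rough sketch of reading word vectors lazily with a generator and comparing them by cosine similarity, in the spirit of distance.c. It assumes the vectors were exported in word2vec's text format (`-binary 0`: a header line with vocabulary size and dimension, then one word per line followed by its values). The function names and the sample vectors are made up for illustration.

```python
import math


def iter_word_vectors(lines):
    """Lazily yield (word, vector) pairs from word2vec text-format lines."""
    lines = iter(lines)
    # Header line: "<vocab_size> <dimension>"
    _vocab_size, dim = (int(x) for x in next(lines).split())
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        yield parts[0], [float(x) for x in parts[1:]]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up two-dimensional vectors; a real file would be passed as open(path).
sample = ["3 2", "king 1.0 0.0", "queen 0.8 0.6", "apple 0.0 1.0"]
vectors = dict(iter_word_vectors(sample))
similarity = cosine_similarity(vectors["king"], vectors["queen"])
```

Because `iter_word_vectors` is a generator, a caller that only needs a subset of the vocabulary can filter while iterating instead of loading every vector into memory first.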

Read https://code.google.com/p/word2vec/
Read some papers
Write Python script to parse xml and ASR transcript files
Write Python script to create basic training instances using a sliding window
Write training instances to leveldb script
Ensure valid train and test split
Caffe Multi-class Precision and Recall
Pipeline work
Use POS-Tags as features
Introduced a flag to turn on/off POS-Tagging
Use parameters from config file
Refactoring the input parser
Web Demo
Presentation
Debugging our net/architecture/code for issues regarding our low precision and recall
Several trainings to get the baseline
Continuously run and monitor the experiments over Christmas
Converting xml and txt files into line format. POS Tags can be preprocessed and written to disk.
Main program gets only a config file as argument
Refactoring of line parser
Preprocessing of POS-Tagging: Write data files with POS tags
Refactoring of sliding window: Punctuation pos can be at any position
Debugging of Word2Vec: Use Float32 instead of Float64
Parse ctm files for generating acoustic training instances.
Parse pitch and energy files. Create pitch and energy features for the audio model.
Include audio model into web demo
Implement first basic fusion
Major refactoring of web demo backend
Implement evaluation of fusion
Final presentation
Paper
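
A minimal sketch of sliding-window training instance generation with a configurable punctuation position. The function name, the parameter names, and the binary label are assumptions for illustration; the actual instance format of the project is not shown in this log.

```python
def sliding_windows(tokens, boundary_after, window_size=5, punctuation_pos=2):
    """Yield (window, label) pairs from a token list.

    boundary_after is a set of token indices after which a sentence boundary
    occurs. The label is 1 if such a boundary follows the token at
    punctuation_pos inside the window, else 0.
    """
    if not 0 <= punctuation_pos < window_size:
        raise ValueError("punctuation_pos must lie inside the window")
    for start in range(len(tokens) - window_size + 1):
        window = tokens[start:start + window_size]
        label = 1 if (start + punctuation_pos) in boundary_after else 0
        yield window, label


# Hypothetical input: a boundary follows "world" (token index 1).
tokens = "hello world how are you today".split()
boundary_after = {1}
instances = list(
    sliding_windows(tokens, boundary_after, window_size=3, punctuation_pos=1)
)
```

Writing the function as a generator means instances can be streamed to disk one at a time instead of being collected in a list, as is done here only for the example.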

Read https://code.google.com/p/word2vec/
Read papers to get familiar with Deep Learning
Read papers for lexical sentence boundary approaches
Write Python script to parse xml and ASR transcript files
Write Python script to create basic training instances using a sliding window
Refactor script for creating training instances and work on pipeline to create instances
Write Python script for demo showcases
Use POS-Tags as features
LineParser implemented
Generator usage to reduce memory consumption
Script to collect experiment results
Web demo model selection
Use balanced data for training/testing, including flag in config
JSON converter for passing prediction results to demo
Show POS tags in demo (together with Joseph)
Loading spinner in demo
Fix selection options in demo
Demo writing results to file
Demo choose input text from existing files
Set up demo on server and fix availability from outside
Write guide for demo setup
Paper
