This framework enables the specification of neural network training jobs whose training runs in parallel on fog nodes. It provides an interface for job creation in which parameters such as the dataset, the neural network architecture, the activation function, etc. can be specified.
All the workers can be started on the same machine. The Docker image a machine runs has to match its hardware architecture. There are two Docker images: one for Linux-based machines, which uses 'openJDK', and one for Raspberry Pis, which uses 'arm32v7/gradle'.
To start the services, use:
docker-compose up
In order to deploy the framework on a network of fog nodes, a Docker swarm has to be established, after which the stack can be deployed with:
docker stack deploy akkaFramework --compose-file docker-compose.yml
Note: The repository contains only the 'main' folder of the Gradle project. To get this working, create a new Gradle project and copy the contents of this 'main' folder into the corresponding folder of the new project.
Once the project has been created, it can be built and run locally using the Gradle build tool. A fat JAR is built from the Gradle project:
gradle clean build shadowJar
After the JAR file has been built, the master and worker actors can be started on specific ports:
gradle run --args="master 2550"
gradle run --args="worker 2552"
To submit a training job, an instance of NNJobMessage has to be created. For example:
NNJobMessage nn_msg = new NNJobMessage("iris_task", trainingSet, testSet, 75, 75, relu, layerDimensions, 0.1, 50);
If the framework is run through Docker, the dataset has to be bind-mounted into the master service in the docker-compose.yml file. The NNJobMessage instance is then sent as a message to the master actor to initiate processing:
system.scheduler().scheduleOnce(interval, master, nn_msg, system.dispatcher(), null);
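Putting these steps together, the sketch below shows how a job submission might look from the master's JVM. It assumes the ActorSystem and the master ActorRef are already available (for instance from the launcher), that the project's NNJobMessage constructor accepts the arguments shown in the example above, and that the dataset objects, activation constant, layer sizes, and the five-second delay are placeholders rather than the project's actual values.

```java
import java.util.concurrent.TimeUnit;

import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import scala.concurrent.duration.Duration;
import scala.concurrent.duration.FiniteDuration;

// Sketch of a job submission. trainingSet, testSet, relu and layerDimensions
// stand in for whatever types the project's NNJobMessage constructor expects.
public class JobSubmission {

    public static void submit(ActorSystem system, ActorRef master,
                              Object trainingSet, Object testSet,
                              Object relu, int[] layerDimensions) {
        // Job specification as in the example above; the meaning of the
        // numeric arguments (75, 75, 0.1, 50) is defined by NNJobMessage.
        NNJobMessage nn_msg = new NNJobMessage("iris_task", trainingSet, testSet,
                75, 75, relu, layerDimensions, 0.1, 50);

        // Deliver the job to the master actor after a short, arbitrary delay,
        // using scheduleOnce as shown above (null means "no sender").
        FiniteDuration interval = Duration.create(5, TimeUnit.SECONDS);
        system.scheduler().scheduleOnce(interval, master, nn_msg,
                system.dispatcher(), null);
    }
}
```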