-
Notifications
You must be signed in to change notification settings - Fork 119
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
continue-revolution
committed
Mar 11, 2023
1 parent
04ec45a
commit a377553
Showing
16 changed files
with
416 additions
and
296 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Configuration file of running tensorflow backend | ||
|
||
# ========== Cluster configuration ========== | ||
# ip address of the parameter server (need 1 GPU process) | ||
ps_ip: localhost | ||
|
||
exp_path: $FEDSCALE_HOME/fedscale/cloud | ||
|
||
aggregator_entry: aggregation/aggregator_mnn.py | ||
|
||
auth: | ||
ssh_user: "" | ||
ssh_private_key: ~/.ssh/id_rsa | ||
|
||
# cmd to run before we can indeed run FAR (in order) | ||
setup_commands: | ||
- source $HOME/anaconda3/bin/activate fedscale | ||
|
||
# ========== Additional job configuration ========== | ||
# Default parameters are specified in config_parser.py, wherein more description of the parameter can be found | ||
|
||
job_conf: | ||
- job_name: android-mnn # Generate logs under this folder: log_path/job_name/time_stamp | ||
- log_path: $FEDSCALE_HOME/benchmark # Path of log files | ||
- experiment_mode: mobile | ||
- num_participants: 1 # Number of participants per round, we use K=100 in our paper, large K will be much slower | ||
- model: linear # Need to define the model in aggregator_mnn.py | ||
- learning_rate: 0.01 | ||
- batch_size: 32 | ||
- input_shape: 32 32 3 | ||
- num_classes: 10 | ||
- test_bsz: 16 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Configuration file of running tensorflow backend | ||
|
||
# ========== Cluster configuration ========== | ||
# ip address of the parameter server (need 1 GPU process) | ||
ps_ip: localhost | ||
|
||
exp_path: $FEDSCALE_HOME/fedscale/cloud | ||
|
||
aggregator_entry: aggregation/aggregator_tflite.py | ||
|
||
auth: | ||
ssh_user: "" | ||
ssh_private_key: ~/.ssh/id_rsa | ||
|
||
# cmd to run before we can indeed run FAR (in order) | ||
setup_commands: | ||
- source $HOME/anaconda3/bin/activate fedscale | ||
|
||
# ========== Additional job configuration ========== | ||
# Default parameters are specified in config_parser.py, wherein more description of the parameter can be found | ||
|
||
job_conf: | ||
- job_name: android-tflite # Generate logs under this folder: log_path/job_name/time_stamp | ||
- log_path: $FEDSCALE_HOME/benchmark # Path of log files | ||
- experiment_mode: mobile | ||
- num_participants: 1 # Number of participants per round, we use K=100 in our paper, large K will be much slower | ||
- model: linear # Need to define the model in tf_aggregator.py | ||
- learning_rate: 0.01 | ||
- batch_size: 32 | ||
- input_shape: 32 32 3 | ||
- num_classes: 10 | ||
- test_bsz: 16 | ||
- engine: 'tensorflow' |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,53 @@ | ||
# Android TFLite Sample App | ||
# FedScale Deployment | ||
|
||
This directory contains minimum files modified from [MNN Android Demo](https://github.com/alibaba/MNN/tree/master/project/android/demo) and [TFLite Android Demo](https://www.tensorflow.org/lite/examples/on_device_training/overview). The training and testing will be conducted by TFLite backend, while the task execution and communication with server will be managed by Java. The sample has been tested upon image classification with a simple linear model and a small subset of [ImageNet-MINI](https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000). This documentation contains a step-by-step tutorial on how to download, build and config this app on your own device, and modify this app for your own implementation and deployment. | ||
FedScale provides a cloud-based [aggregation service](https://github.com/SymbioticLab/FedScale/blob/master/fedscale/cloud/aggregation/README.md) and an SDK for smartphones on the edge that currently supports TensorflowLite and Alibaba MNN on Android (iOS support coming soon!). In this tutorial, we introduce how to: | ||
|
||
## Download and build sample android app | ||
- Initiate FedScale Cloud Service | ||
- Import FedScale SDK to locally fine tune models | ||
- Connect to FedScale cloud for federated training | ||
|
||
![fedscale deployment](../../../docs/fedscale-deploy.png) | ||
|
||
|
||
|
||
## FedScale Cloud Aggregation | ||
|
||
You may follow these steps to deploy and run the cloud server specific to [Alibaba MNN](https://github.com/SymbioticLab/FedScale/fedscale/edge/mnn/) or [TFLite](https://github.com/SymbioticLab/FedScale/fedscale/edge/tflite/). | ||
|
||
- Specify number of executors `num_participants: 1`. You may add more mobile participants to any number you want. We currently only support all participant in one single training backend. | ||
|
||
- Specify `model`. Currently we have only tested `linear` models for Alibaba MNN backend because Alibaba MNN does not support Dropout. However, you may choose one of `linear`|`mobilenetv3`|`resnet50`|`mobilenetv3_finetune`|`resnet50_finetune` models. `finetune` means that only the last 2 linear layers will be trained, but the backbone layers will be frozen. | ||
|
||
- Set `use_cuda` flag to `True` if you want to use GPU for aggregation. However, as aggregation process is sequential addition of several small tensors, GPU acceleration is very little. | ||
|
||
- Submit job | ||
|
||
``` | ||
cd $FEDSCALE_HOME/docker | ||
python3 driver.py submit $FEDSCALE_HOME/benchmark/configs/android/mnn.yml # If you want to run MNN backend on mobile. | ||
python3 driver.py submit $FEDSCALE_HOME/benchmark/configs/android/tflite.yml # If you want to run TFLite backend on mobile. | ||
``` | ||
|
||
- Check logs: FedScale will generate logs under `data_path` you provided by default. Keep in mind that k8s may load balancing your job to any node on the cluster, so make sure you are checking the `data_path` on the correct node. | ||
|
||
- Stop job | ||
|
||
``` | ||
cd $FEDSCALE_HOME/docker | ||
python3 driver.py stop $YOUR_JOB | ||
``` | ||
|
||
## FedScale Mobile Runtime | ||
|
||
We provide a sample app which you can choose to | ||
- Train/test models with TFLite or Alibaba MNN. | ||
- Fine-tune models locally **after** receiving model from the cloud. | ||
|
||
Please follow these steps to download and build the sample android app. | ||
|
||
1. Download and unzip [sample dataset (TrainTest.zip)](https://drive.google.com/file/d/1nfi3SVzjaE0LPxwj_5DNdqi6rK7BU8kb/view?usp=sharing) to `assets/` directory. Remove `TrainTest.zip` after unzip to save space on your mobile device. After unzip, you should see 3 files and 2 directories under `assets/`: | ||
1. `TrainSet`: Training set directory, contains 316 images. | ||
2. `TestSet`: Testing set directory, contains 34 images. | ||
1. `TrainSet`: Training set directory, contains 320 images. | ||
2. `TestSet`: Testing set directory, contains 32 images. | ||
3. `conf.json`: Configuration file for mobile app. | ||
4. `train_labels.txt`: Training label file with format `<filename> <label>`, where `<filename>` is the path after `TrainSet/`. | ||
5. `test_labels.txt`: Testing label file with the same format as `train_labels.txt`. | ||
|
@@ -16,23 +57,13 @@ This directory contains minimum files modified from [MNN Android Demo](https://g | |
- Gradle Version: 5.4.1 | ||
- Source Compatibility: Java 8 | ||
- Target Compatibility: Java 8 | ||
3. Make project. Android Studio will compile and build the app for you. | ||
|
||
## Test this app with default setting | ||
|
||
1. ssh to your own server and run | ||
``` | ||
cd fedscale/cloud/aggregation/android | ||
python3 aggregator_tflite.py --experiment_mode=mobile --num_participants=1 --engine=tensorflow | ||
``` | ||
2. Change aggregator IP address inside `assets/conf.json` and click `Run` inside Android Studio. | ||
|
||
## Customize your own app | ||
|
||
1. If you want to use your own dataset, please put your data under `assets/TrainSet` and `assets/TestSet`, make sure that your label has the same format as my label file. | ||
1. If you want to change the file/dir name under `assets`, please make sure to change the corresponding config in `assets` attribute inside `assets/conf.json`. | ||
2. If you want to use your own model for **image classification**, please either change `channel`, `width` and `height` inside `assets/conf.json` to your own input and change `num_classes` to your own classes, or override these attributes when sending `CLIENT_TRAIN` request. | ||
3. If you want to use your own model for tasks other than image classification, you may need to write your own TFLite trainer and tester. Please refer to [TFLite](https://www.tensorflow.org/lite/api_docs) for further development guide. You may also need to change `channel`, `width` and `height` inside `assets/conf.json` to your own input and change or remove `num_classes`. | ||
3. Modify `conf.json` and dataset. | ||
- You must modify `aggregator.ip` & `aggregator.port` to your own server. | ||
- You can choose your training framework by modifying `model_conf.backend` to `tflite` or `mnn`. | ||
- You may config your dataset information at `training_conf` and `testing_conf`. | ||
- You may put your own **image classification** dataset under `/TrainSet` and `/TestSet` directories and modify `train_labels.txt` and `test_labels.txt` accordingly. The format of labels must be \<filename\> \<label\>. | ||
- If you want to perform tasks other than image classification, you should modify framework-specific code [MNN](https://github.com/SymbioticLab/FedScale/fedscale/edge/android/app/src/main/java/com/fedscale/android/mnn) [TFLite](https://github.com/SymbioticLab/FedScale/fedscale/edge/android/app/src/main/java/com/fedscale/android/tflite). If you are using TFLite, you should also write your own signatures similar to [our TFLite model provider](https://github/com/SymbioticLab/FedScale/fedscale/cloud/internal/tflite_model_adapter.py) | ||
4. Make Project. Android Studio will compile and build the app for you. Click Run if you want to run the app on a real android device. | ||
|
||
---- | ||
If you need any other help, feel free to contact FedScale team or the developer [website](https://continue-revolution.github.io) [email](mailto:[email protected]) of this app. | ||
If you need any further help, feel free to contact FedScale team or the developer [website](https://continue-revolution.github.io) [email](mailto:[email protected]) of this app. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.