Remove last number of version in doc content (#687)
* Update workspace related documentation (#684)

* Update workspace related documentation

* Add more details to server/client workspace and add reference

* Update documentation format (#685)

* Remove last number of version in doc content
YuanTingHsieh authored Jun 18, 2022
1 parent 65848ff commit 81dd280
Showing 17 changed files with 222 additions and 185 deletions.
4 changes: 2 additions & 2 deletions docs/conf.py
Original file line number Diff line number Diff line change
@@ -48,8 +48,8 @@ def resolve_xref(self, env, fromdocname, builder, typ, target, node, contnode):
author = "NVIDIA"

# The full version, including alpha/beta/rc tags
release = "2.1.0"
version = "2.1.0"
release = "2.1.2"
version = "2.1.2"


# -- General configuration ---------------------------------------------------
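The commit title ("Remove last number of version in doc content") refers to dropping the patch number from version strings shown in prose, while ``conf.py`` keeps the full release string. A minimal sketch of that convention (the helper below is illustrative, not part of the repository):

```python
# Full version string, as kept in docs/conf.py by this commit.
release = "2.1.2"
version = "2.1.2"

def short_version(full: str) -> str:
    """Drop the last (patch) number, e.g. "2.1.2" -> "2.1"."""
    return ".".join(full.split(".")[:2])

print(short_version(release))  # -> 2.1
```

Doc prose can then refer to "2.1" and stay correct across patch releases like 2.1.0 and 2.1.2.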
16 changes: 8 additions & 8 deletions docs/example_applications.rst
@@ -46,15 +46,15 @@ For the complete collection of example applications, see https://github.com/NVID

Custom Code in Example Apps
===========================
There are several ways to make :ref:`custom code <custom_code>` available to clients when using NVIDIA FLARE. Most
hello-* examples use a custom folder within the FL application. Note that using a custom folder in the app needs to be
:ref:`allowed <troubleshooting_byoc>` when using secure provisioning. By default, this option is disabled in the secure
mode. POC mode, however, will work with custom code by default.
There are several ways to make :ref:`custom code <custom_code>` available to clients when using NVIDIA FLARE.
Most hello-* examples use a custom folder within the FL application.
Note that using a custom folder in the app needs to be :ref:`allowed <troubleshooting_byoc>` when using secure provisioning.
By default, this option is disabled in secure mode. POC mode, however, will work with custom code by default.

In contrast, the `CIFAR-10 <https://github.com/NVIDIA/NVFlare/tree/main/examples/cifar10>`_,
`prostate segmentation <https://github.com/NVIDIA/NVFlare/tree/main/examples/prostate>`_,
and `BraTS18 segmentation <https://github.com/NVIDIA/NVFlare/tree/main/examples/brats18>`_ examples assume that the
learner code is already installed on the client's system and
available in the PYTHONPATH. Hence, the app folders do not include the custom code there. The PYTHONPATH is
set in the ``run_poc.sh`` or ``run_secure.sh`` scripts of the example. Running these scripts as described in the README
will make the learner code available to the clients.
learner code is already installed on the client's system and available in the PYTHONPATH.
Hence, the app folders do not include the custom code there.
The PYTHONPATH is set in the ``run_poc.sh`` or ``run_secure.sh`` scripts of the example.
Running these scripts as described in the README will make the learner code available to the clients.
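As a rough illustration of what the run scripts accomplish (the directory below is hypothetical, not from the repository), making learner code available to clients amounts to putting its directory on the Python module search path:

```python
import sys

# Hypothetical location of an example's learner code on a client machine.
learner_dir = "/opt/NVFlare/examples/cifar10"

# Equivalent in effect to exporting PYTHONPATH before starting the client:
# packages under learner_dir become importable by name.
if learner_dir not in sys.path:
    sys.path.insert(0, learner_dir)

assert learner_dir in sys.path
```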
11 changes: 3 additions & 8 deletions docs/examples/access_result.rst
@@ -1,12 +1,7 @@
Accessing the results
^^^^^^^^^^^^^^^^^^^^^

Once the job is finished, you can issue the ``download_job [JOB_ID]`` command
in the admin client to download the results.
The results of each job will usually be stored inside the server side workspace.

`[JOB_ID]` is the ID assigned by the system when submitting the job.

The result will be downloaded to your admin workspace
(the exact download path will be displayed when running the command).

The download workspace will be in ``[DOWNLOAD_DIR]/[JOB_ID]/workspace/``.
Please refer to :ref:`access server-side workspace <access_server_workspace>`
for accessing the server side workspace.
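The download layout described above can be sketched as a small path helper (names and paths below are illustrative assumptions, not nvflare's API):

```python
import os

def job_result_path(download_dir: str, job_id: str) -> str:
    """Where download_job places results: [DOWNLOAD_DIR]/[JOB_ID]/workspace/."""
    return os.path.join(download_dir, job_id, "workspace")

# Hypothetical admin workspace and job ID, just to show the shape of the path.
path = job_result_path("/tmp/admin_download", "hypothetical-job-id")
print(path)
```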
2 changes: 1 addition & 1 deletion docs/examples/hello_tf2.rst
@@ -81,7 +81,7 @@ let's put this preparation stage into one method ``setup``:

.. literalinclude:: ../../examples/hello-tf2/custom/trainer.py
:language: python
:lines: 41-73
:lines: 41-71
:lineno-start: 41
:linenos:

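The ``:lines:`` change above (41-73 to 41-71) selects a 1-based, inclusive range of the included file; a quick sketch of that addressing (a simplified mimic, not Sphinx's implementation):

```python
def select_lines(text: str, start: int, end: int) -> list:
    """Mimic Sphinx literalinclude ":lines: start-end" (1-based, inclusive)."""
    return text.splitlines()[start - 1 : end]

# A 100-line stand-in file to show the slice boundaries.
sample = "\n".join(f"line {i}" for i in range(1, 101))
picked = select_lines(sample, 41, 71)
assert picked[0] == "line 41" and picked[-1] == "line 71"
assert len(picked) == 31
```

``:lineno-start: 41`` then only affects the displayed numbering, not the selection.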
2 changes: 1 addition & 1 deletion docs/faq.rst
@@ -291,7 +291,7 @@ Server related questions

#. What happens if the FL server crashes?

See :ref:`high_availability` for the features implemented in NVIDIA FLARE 2.1.0 around FL server failover.
See :ref:`high_availability` for the features implemented in NVIDIA FLARE 2.1 around FL server fail-over.

#. Why does my FL server keep crashing after a certain round?

49 changes: 33 additions & 16 deletions docs/flare_overview.rst
@@ -4,36 +4,53 @@
NVIDIA FLARE Overview
#####################

**NVIDIA FLARE** (NVIDIA Federated Learning Application Runtime Environment) is a domain-agnostic, open-source, extensible SDK that allows researchers and data scientists to adapt existing ML/DL workflow to a federated paradigm.
**NVIDIA FLARE** (NVIDIA Federated Learning Application Runtime Environment) is a domain-agnostic, open-source,
extensible SDK that allows researchers and data scientists to adapt existing ML/DL workflow to a federated paradigm.

With Nvidia FLARE platform developers can build a secure, privacy preserving offering for a distributed multi-party collaboration.
With the NVIDIA FLARE platform, developers can build secure, privacy-preserving offerings
for distributed multi-party collaboration.

NVIDIA FLARE SDK is built for robust, production scale for real-world federated learning deployments. It includes:
NVIDIA FLARE SDK is built for robust, production scale for real-world federated learning deployments.

* A runtime environment enabling data scientists and researchers to easily carry out FL experiments in a real-world scenario. Nvidia FLARE supports multiple task execution, maximizing data scientist's productivity.
It includes:

* A runtime environment enabling data scientists and researchers to easily carry out FL experiments in a
real-world scenario. NVIDIA FLARE supports multiple task execution, maximizing data scientists' productivity.

* System capabilities to stand up Federated learning with high availability infrastructure, eliminating FL server being a single point of failue.
* System capabilities to start up federated learning with high availability infrastructure.

* Built-in implementations of:

* Federated Training workflows (scatter-gather, Cyclic);
* Federated Evaluation workflows (global model evaluation, cross site model evalidation);
* Learning algorithms (FedAvg, FedOpt, FedProx) and
* Privacy preserving algorithms (homomorphic encryption, differential privacy)
* Federated training workflows (scatter-and-gather, Cyclic)
* Federated evaluation workflows (global model evaluation, cross site model validation);
* Learning algorithms (FedAvg, FedOpt, FedProx)
* Privacy preserving algorithms (homomorphic encryption, differential privacy)

* Extensible management tools for:

* Secure provisioning (SSL certificates),
* Secure provisioning (SSL certificates)
* Orchestration (Admin Console, Admin APIs)
* Monitoring of Federated learning experiments. (Aux APIs; Tensorboard visualization)
* Monitoring of federated learning experiments (Aux APIs; Tensorboard visualization)

* A rich set of programmable APIs allowing researchers to create new federated workflows, learning & privacy preserving algorithms.
* A rich set of programmable APIs allowing researchers to create new federated workflows,
learning & privacy preserving algorithms.


High-level System Architecture
==============================
As outlined above, NVIDIA FLARE includes components that allow researchers and developers to build and deploy end-to-end federated learning applications. The high-level architecture is shown in the diagram below. This includes the foundational components of the NVIDIA FLARE API and tools for Privacy Preservation and Secure Management of the platform. On top of this foundation are the building blocks for federated learning applications, with a set of Federation Workflows and Learning Algorithms.
As outlined above, NVIDIA FLARE includes components that allow researchers and developers to build and deploy
end-to-end federated learning applications.

The high-level architecture is shown in the diagram below.

This includes the foundational components of the NVIDIA FLARE API and tools for privacy preservation and
secure management of the platform.

On top of this foundation are the building blocks for federated learning applications,
with a set of federation workflows and learning algorithms.

Alongside this central stack are tools that allow experimentation and proof-of-concept development with the FL Simulator (POC mode), along with a set of tools used to deploy and manage production workflows.
Alongside this central stack are tools that allow experimentation and proof-of-concept development
with the FL Simulator (POC mode), along with a set of tools used to deploy and manage production workflows.

.. image:: resources/FL_stack.png
:height: 300px
@@ -65,7 +82,7 @@ in a way that allows others to easily customize and extend.
Every component and API is specification-based, so that alternative implementations can be
constructed by following the spec. This allows pretty much every component to be customized.

We strive to be unopinionated in reference implementations, encouraging developers and end-users
We strive to be open-minded in reference implementations, encouraging developers and end-users
to extend and customize to meet the needs of their specific workflows.


@@ -81,7 +98,7 @@ problems in a straightforward way.

We design the system to be general purpose, to enable different "federated" computing use cases.
We carefully package the components into different layers with minimal dependencies between layers.
In this way, implementations for specific use cases should not demand modifications to the
In this way, implementations for specific use cases should not demand modifications to the
underlying system core.


40 changes: 26 additions & 14 deletions docs/highlights.rst
@@ -4,8 +4,8 @@
Highlights
##########

New in NVIDIA FLARE 2.1.0
=========================
New in NVIDIA FLARE 2.1
=======================
- :ref:`High Availability (HA) <high_availability>` supports multiple FL Servers and automatically cuts
over to another server when the currently active server becomes unavailable.
- :ref:`Multi-Job Execution <multi_job>` supports resource-based multi-job execution by allowing for concurrent runs
@@ -31,22 +31,34 @@ Training workflows
Evaluation workflows
--------------------
- :ref:`Cross site model validation <cross_site_model_evaluation>` is a workflow that allows validation of each
client model and the server global model against each client dataset. Data is not shared, rather the collection
of models is distributed to each client site to run local validation. The results of local validation are
collected by the server to construct an all-to-all matrix of model performance vs. client dataset.
client model and the server global model against each client dataset.

Data is not shared; rather, the collection of models is distributed to each client site to run local validation.

The results of local validation are collected by the server to construct an all-to-all matrix of
model performance vs. client dataset.

- :ref:`Global model evaluation <cross_site_model_evaluation>` is a subset of cross-site model validation in which
the server’s global model is distributed to each client for evaluation on the client’s local dataset.

Privacy preservation algorithms
-------------------------------
Privacy preserving algorithms in NVIDIA FLARE are implemented as filters that can be applied as data is sent or received between peers.
Privacy preserving algorithms in NVIDIA FLARE are implemented as :ref:`filters <filters_for_privacy>`
that can be applied as data is sent or received between peers.

- Differential privacy:

- Exclude specific variables (:class:`ExcludeVars<nvflare.app_common.filters.exclude_vars.ExcludeVars>`)
- truncate weights by percentile (:class:`PercentilePrivacy<nvflare.app_common.filters.percentile_privacy.PercentilePrivacy>`)
- apply sparse vector techniques (:class:`SVTPrivacy<nvflare.app_common.filters.svt_privacy.SVTPrivacy>`).

- Homomorphic encryption: NVIDIA FLARE provides homomorphic encryption and decryption
filters that can be used by clients to encrypt Shareable data before sending it to a peer.

The server does not have a decryption key but, using HE, can operate on the encrypted data to aggregate
and return the encrypted aggregated data to clients.

- :ref:`Differential privacy <filters_for_privacy>` - Three reference filters are included to exclude specific
variables (exclude_vars), truncate weights by percentile (percentile_privacy), or apply sparse vector techniques (SVT, svt_privacy).
- :ref:`Homomorphic encryption <filters_for_privacy>` - NVIDIA FLARE provides homomorphic encryption and decryption
filters that can be used by clients to encrypt Shareable data before sending it to a peer. The server does not
have a decryption key but using HE can operate on the encrypted data to aggregate and return the encrypted
aggregated data to clients. Clients can then decrypt the data with their local key and continue local training.
Clients can then decrypt the data with their local key and continue local training.
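The percentile-truncation idea behind ``PercentilePrivacy`` can be sketched as follows (a simplified nearest-rank version for illustration, not the actual nvflare implementation):

```python
def truncate_by_percentile(weights, pct=95):
    """Clip each weight's magnitude at the pct-th percentile of |w| (nearest rank)."""
    mags = sorted(abs(w) for w in weights)
    # Nearest-rank index of the pct-th percentile, clamped to a valid position.
    k = max(0, min(len(mags) - 1, round(pct / 100 * len(mags)) - 1))
    cap = mags[k]
    # Large-magnitude outliers are truncated; everything else passes through.
    return [max(-cap, min(cap, w)) for w in weights]

clipped = truncate_by_percentile([0.1, -0.5, 2.0, -3.0], pct=75)
assert clipped == [0.1, -0.5, 2.0, -2.0]
```

Applied as a filter on outgoing updates, this limits how much any single extreme weight reveals about the local data.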

Learning algorithms
-------------------
@@ -65,5 +77,5 @@ Learning algorithms
Examples
---------

Available at https://github.com/NVIDIA/NVFlare/tree/main/examples, including cifar10 (end-to-end workflow), hello-pt,
hello-monai, hello-numpy, hello-tf2.
NVIDIA FLARE provides a rich set of :ref:`example applications <example_applications>` to walk you through the whole
process.
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -9,7 +9,7 @@ Federated learning allows multiple clients, each with their own data, to collabo

NVIDIA FLARE is built on a componentized architecture that allows researchers to customize workflows to their liking and experiment with different ideas quickly.

With NVIDIA FLARE 2.1.0, :ref:`High Availability (HA) <high_availability>` and :ref:`Multi-Job Execution <multi_job>` introduce new concepts and change the way the system needs to be configured and operated. See `conversion from 2.0 <appendix/converting_from_previous.html>`_ for details.
With NVIDIA FLARE 2.1, :ref:`High Availability (HA) <high_availability>` and :ref:`Multi-Job Execution <multi_job>` introduce new concepts and change the way the system needs to be configured and operated. See `conversion from 2.0 <appendix/converting_from_previous.html>`_ for details.

.. toctree::
:maxdepth: 1
6 changes: 3 additions & 3 deletions docs/programming_guide/fl_context.rst
@@ -80,8 +80,8 @@ ClientEngineSpec for services they provide.

Job ID (fl_ctx.get_job_id())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FL application is always running within a RUN, which has a unique ID number. From NVIDIA FLARE version 2.1.0, job ID is
used as the run number, and it no longer has to be an integer.
An FL application always runs within a RUN, which has a unique ID.
From NVIDIA FLARE version 2.1, the job ID is used as the run number, and it no longer has to be an integer.

Identity Name (fl_ctx.get_identity_name())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -203,7 +203,7 @@ The following diagram shows the lifecycle of the FL context for each iteration.
.. image:: ../resources/FL_Context.png
:height: 600px

In the Peer Context, following props from the Server are available (job ID is used as the run number in version 2.1.0+):
In the Peer Context, the following props from the Server are available (job ID is used as the run number in version 2.1+):
- Run Number: peer_ctx.get_job_id()

Server Side FL Context
10 changes: 5 additions & 5 deletions docs/programming_guide/high_availability.rst
@@ -3,9 +3,9 @@
#####################################
High Availability and Server Failover
#####################################
Previously in NVIDIA FLARE 2.0 and before, the FL server was the single point of failure for the system. Starting with
NVIDIA FLARE 2.1.0, a high availability (HA) solution has been implemented to support multiple FL servers with
automatic cutover when the currently active server becomes unavailable.
Previously in NVIDIA FLARE 2.0 and before, the FL server was the single point of failure for the system.
Starting with NVIDIA FLARE 2.1, a high availability (HA) solution has been implemented to support
multiple FL servers with automatic cut-over when the currently active server becomes unavailable.

The following areas were enhanced for supporting HA:

@@ -40,8 +40,8 @@ moment, there is at most one hot server.

The endpoint of the Overseer is provisioned and its configuration information is included in the startup kit of each entity.

For security reasons, the Overseer must only accept authenticated communications. In NVIDIA FLARE 2.1.0, the Overseer is
implemented with mTLS authentication.
For security reasons, the Overseer must only accept authenticated communications.
In NVIDIA FLARE 2.1, the Overseer is implemented with mTLS authentication.

Overseers maintain a service session id (SSID), which changes whenever any hot SP switch-over occurs, either by admin
commands or automatically. The following are cases associated with SP switch-over and SSID:
12 changes: 6 additions & 6 deletions docs/programming_guide/system_architecture.rst
@@ -15,7 +15,7 @@ Concepts and System Components

Spec-based Programming for System Service Objects
=================================================
NVIDIA FLARE 2.1.0 needs additional services to implement the HA feature:
NVIDIA FLARE 2.1 needs additional services to implement the HA feature:
storage, overseer, job definition management, etc. There are many ways to implement such services. For example,
storage could be implemented with a file system, AWS S3, or some database technologies. Similarly, job definition
management could be done with simple file reading or a sophisticated solution with a database or search engine.
@@ -34,13 +34,13 @@ See the example :ref:`project_yml` for how these components are configured in St

Overseer
--------
The Overseer is a system component newly introduced in 2.1.0 that determines the hot FL server at any time for high availability.
The Overseer is a system component newly introduced in 2.1 that determines the hot FL server at any time for high availability.
The name of the Overseer must be unique and in the format of fully qualified domain names. During
provisioning time, if the name is specified incorrectly, either being duplicate or containing incompatible
characters, the provision command will fail with an error message. It is possible to use a unique hostname rather than
FQDN, with the IP mapped to the hostname by having it added to ``/etc/hosts``.

NVIDIA FLARE 2.1.0 comes with HTTPS-based overseer. Users are welcome to change the name and port arguments of the overseer
NVIDIA FLARE 2.1 comes with HTTPS-based overseer. Users are welcome to change the name and port arguments of the overseer
in project.yml to fit their deployment environment.

The Overseer will receive a Startup kit, which includes the start.sh shell script, its certificate and private key,
@@ -65,9 +65,9 @@ their own Overseer Agent.

NVIDIA FLARE provides two implementations:

- :class:`HttpOverseerAgent<nvflare.ha.overseer_agent.HttpOverseerAgent>` to work with the Overseer server. For NVIDIA
FLARE 2.1.0, the provisioning tool will automatically map parameters specified in Overseer into the arguments for
the HttpOverseerAgent.
- :class:`HttpOverseerAgent<nvflare.ha.overseer_agent.HttpOverseerAgent>` to work with the Overseer server.
For NVIDIA FLARE 2.1, the provisioning tool will automatically map parameters specified in Overseer into
the arguments for the HttpOverseerAgent.
- :class:`DummyOverseerAgent<nvflare.ha.dummy_overseer_agent.DummyOverseerAgent>` is a dummy agent that simply
returns the configured endpoint as the hot FL server. The dummy agent is used when a single FL server is configured
and no Overseer server is necessary in an NVIDIA FLARE system. When DummyOverseerAgent is specified, the provisioning
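Conceptually, the dummy agent reduces the hot-server lookup to returning a fixed configured endpoint; a sketch under assumed names (illustrative only, not nvflare's actual ``DummyOverseerAgent`` API):

```python
class FixedEndpointAgent:
    """Stand-in for a dummy overseer agent: with a single FL server and no
    Overseer, the configured endpoint is always the hot service provider."""

    def __init__(self, sp_end_point: str):
        self.sp_end_point = sp_end_point

    def get_primary_sp(self) -> str:
        # No overseer round-trip: the configured server is always "hot".
        return self.sp_end_point

agent = FixedEndpointAgent("server1.example.com:8002:8003")
print(agent.get_primary_sp())  # -> server1.example.com:8002:8003
```

An HttpOverseerAgent, by contrast, would poll the Overseer and may see the primary change over time.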
