Task Adaptive Activation Network

This repository contains the PyTorch implementation of my paper:

Y. Liu, X. Yang, D. Xie, X. Wang, L. Shen, H. Huang, N. Balasubramanian, "Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning." in Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI-20), 2020. [citation]

The paper proposes a scalable way to automatically learn the optimal splitting of the network architecture for deep Multi-Task Learning.

Set Up

Prerequisites

PyTorch == 1.1.0
PyTorch-Ignite == 0.2.0
h5py == 2.9.0
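To confirm that an environment matches these pins, a small check along the following lines may help. It is only a sketch: the helper `check_pins` is hypothetical and not part of this repository, and it assumes the three packages import as `torch`, `ignite`, and `h5py` and expose a `__version__` attribute.

```python
import importlib

# Versions copied from the prerequisites list. The keys are the import names
# of the packages (assumed: PyTorch-Ignite imports as "ignite").
PINS = {"torch": "1.1.0", "ignite": "0.2.0", "h5py": "2.9.0"}


def check_pins(pins):
    """Return {module: (expected, installed_or_None, matches)} for each pin."""
    report = {}
    for module_name, expected in pins.items():
        try:
            module = importlib.import_module(module_name)
            installed = getattr(module, "__version__", None)
        except ImportError:
            installed = None
        # Local build tags such as "1.1.0+cu100" still count as a match.
        matches = installed is not None and installed.split("+")[0] == expected
        report[module_name] = (expected, installed, matches)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A module that is not installed is reported with `None` as its installed version rather than raising an error.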

Getting Started

This repository contains experiments on two datasets:

YouTube-8M (video-level features); [link]
Omniglot; [link]

We also implement several recent deep Multi-Task Learning methods, including:

Multilinear Relationship Network (MRN): a soft-sharing method that models task relationships with a tensor Gaussian distribution. [citation]
Deep Multi-Task Representation Learning (DMTRL): a soft-sharing method based on tensor decomposition. [citation]
Soft Layer Ordering (Soft-Order): computes a task-specific ordering of a shared set of hidden layers. [citation]
Cross-Stitch: a soft-sharing method that computes features as a linear combination of the task-specific hidden layers. [citation]
Multi-gate Mixture-of-Experts (MMoE): computes the last hidden feature as a gated combination of a set of neural networks (experts). [citation]

The implementations can be found in Youtube8M/models.py.
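To give a feel for the gated combination that MMoE describes, here is a minimal sketch. It is not the code in Youtube8M/models.py: the class name `TinyMMoE`, the layer sizes, and the single-layer experts are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class TinyMMoE(nn.Module):
    """Illustrative MMoE block: shared experts, one softmax gate per task."""

    def __init__(self, in_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        # Experts are shared by all tasks.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, expert_dim), nn.ReLU())
             for _ in range(num_experts)]
        )
        # Each task owns a gate that maps the input to one weight per expert.
        self.gates = nn.ModuleList(
            [nn.Linear(in_dim, num_experts) for _ in range(num_tasks)]
        )

    def forward(self, x):
        # expert_out has shape (batch, num_experts, expert_dim).
        expert_out = torch.stack([expert(x) for expert in self.experts], dim=1)
        task_features = []
        for gate in self.gates:
            # Gate weights sum to 1 over the experts: (batch, num_experts, 1).
            w = torch.softmax(gate(x), dim=-1).unsqueeze(-1)
            # Weighted sum over experts gives (batch, expert_dim) per task.
            task_features.append((w * expert_out).sum(dim=1))
        return task_features


model = TinyMMoE(in_dim=16, expert_dim=8, num_experts=3, num_tasks=2)
features = model(torch.randn(4, 16))
```

`features` is a list with one tensor per task. In a full model, each tensor would feed a task-specific output head.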

Experiments

Layer Implementation

The Layers directory contains the layer implementations for MRN, DMTRL, and TAL (used in our model). You can use them in the same way as the standard PyTorch layers nn.Linear and nn.Conv2d. The only differences are that these layers take some extra parameters to set up the regularization proposed by each method, and that they provide an additional member function self.regularization(c), which computes the corresponding regularization term with respect to the Lagrangian coefficient c. The details are given in Layers/README.md.
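The usage pattern described above might look roughly like the following sketch. `PenalizedLinear` is a hypothetical stand-in, not a class from Layers, and its squared-L2 penalty is only a placeholder for the method-specific terms. The actual class names and constructor arguments are documented in Layers/README.md.

```python
import torch
import torch.nn as nn


class PenalizedLinear(nn.Linear):
    """Hypothetical stand-in for a layer from Layers/ (not the real class)."""

    def regularization(self, c):
        # Placeholder penalty: c times the squared L2 norm of the weights.
        # The repository's layers compute their own method-specific terms.
        return c * self.weight.pow(2).sum()


model = nn.Sequential(PenalizedLinear(16, 8), nn.ReLU(), PenalizedLinear(8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random data standing in for a real batch.
x, y = torch.randn(4, 16), torch.randint(0, 2, (4,))
task_loss = nn.functional.cross_entropy(model(x), y)

# Sum the penalty over every module that exposes regularization(c).
c = 1e-3
reg = sum(m.regularization(c) for m in model.modules()
          if hasattr(m, "regularization"))

# The regularization term is added to the task loss before backpropagation.
loss = task_loss + reg
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Collecting the terms with `hasattr` keeps the training loop unchanged when regularized and ordinary layers are mixed in one model.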

Run on Datasets

For detailed instructions on reproducing our experiments, please refer to Youtube8M/README.md and Omniglot/README.md.
