This repository was archived by the owner on Nov 17, 2023. It is now read-only.
DataLab Features
viravit edited this page Nov 8, 2021
To be updated
| # | Features | AWS [Debian] | AWS [RedHat] | Azure [Debian] | Azure [RedHat] | GCP [Debian] |
|---|---|---|---|---|---|---|
| 1 | DataLab installation | |||||
| 1.1 | - Support of installing DataLab into one VPC | yes | yes | yes | yes | yes |
| 1.2 | - Support of installing DataLab into two VPCs | yes | yes | no | no | no |
| 1.3 | - DataLab installation via public IP | yes | yes | yes | no | yes |
| 1.4 | - DataLab installation via private IP | yes | yes | yes | yes | no |
| 2 | Login/Logout | |||||
| 2.1 | - Login with LDAP | yes | yes | yes | yes | yes |
| 2.2 | - Login with OAuth2 authentication and authorization | no | no | yes | yes | no |
| 2.3 | - Logout | yes | yes | yes | yes | yes |
| 3 | Edge node management | |||||
| 3.1 | - Create Edge node | yes | yes | yes | yes | yes |
| 3.2 | - Stop Edge node | yes | yes | yes | yes | yes |
| 3.3 | - Start Edge node | yes | yes | yes | yes | yes |
| 3.4 | - Recreate Edge node | in progress | in progress | in progress | in progress | in progress |
| 4 | Supported notebook templates | |||||
| 4.1 | - Jupyter notebook template | yes | yes | yes | yes | yes |
| 4.2 | - RStudio notebook template | yes | yes | yes | yes | yes |
| 4.3 | - Apache Zeppelin notebook template | yes | yes | yes | yes | yes |
| 4.4 | - DeepLearning notebook template | yes | yes | yes | no | yes |
| 4.5 | - RStudio with TensorFlow notebook template | yes | yes | no | no | no |
| 4.6 | - Jupyter with TensorFlow notebook template | yes | yes | yes | no | yes |
| 4.7 | - Superset notebook template | no | no | no | no | yes |
| 5 | Notebook instance management | |||||
| 5.1 | - Stop notebook server instance | yes | yes | yes | yes | yes |
| 5.2 | - Start notebook server instance | yes | yes | yes | yes | yes |
| 5.3 | - Terminate notebook server instance | yes | yes | yes | yes | yes |
| 5.5 | - Go to notebook UI (reverse proxy) | yes | yes | yes | yes | yes |
| 5.6 | - Creating custom AMI from running notebook instance | yes | yes | yes | yes | yes |
| 5.7 | - Creating notebook instance from custom image | yes | yes | yes | yes | yes |
| 5.8 | - Creating notebook instance from shared image | yes | yes | yes | yes | yes |
| 5.9 | - Tune local Spark parameters from Web UI on instance creation step | yes | yes | yes | yes | yes |
| 5.10 | - Reconfigure local Spark on an existing notebook server instance | yes | yes | yes | yes | yes |
| 6 | Notebook templates that support Spark Standalone as computational resource | |||||
| 6.1 | - Jupyter notebook template | yes | yes | yes | yes | yes |
| 6.2 | - RStudio notebook template | yes | yes | yes | yes | yes |
| 6.3 | - Apache Zeppelin notebook template | yes | yes | yes | yes | yes |
| 6.4 | - DeepLearning notebook template | yes | yes | yes | no | yes |
| 6.5 | - RStudio with TensorFlow notebook template | yes | yes | no | no | no |
| 6.6 | - Jupyter with TensorFlow notebook template | yes | yes | yes | no | yes |
| 6.7 | - Superset notebook template | no | no | no | no | no |
| 7 | Data Engine (Spark Standalone) management | |||||
| 7.1 | - Stop | yes | yes | yes | yes | yes |
| 7.2 | - Start | yes | yes | yes | yes | yes |
| 7.3 | - Terminate | yes | yes | yes | yes | yes |
| 7.4 | - Ability to deploy Spark Standalone using Notebook's images | yes | yes | yes | yes | yes |
| 7.5 | - Ability to tune Spark Standalone parameters from web UI on instance creation step | yes | yes | yes | yes | yes |
| 7.6 | - Ability to reconfigure an existing Spark Standalone from the DataLab Web UI | yes | yes | yes | yes | yes |
| 7.7 | - Ability to access Spark Standalone job tracker URL from Web UI (via reverse proxy) | yes | yes | yes | yes | yes |
| 8 | Notebook templates that support Cloud provider Data Engine Service as computational resource | |||||
| 8.1 | - Jupyter notebook template | yes, EMR | yes, EMR | no | no | yes, Dataproc |
| 8.2 | - RStudio notebook template | yes, EMR | yes, EMR | no | no | yes, Dataproc |
| 8.3 | - Zeppelin notebook template | yes, EMR | yes, EMR | no | no | yes, Dataproc |
| 8.4 | - TensorFlow notebook template | no | no | no | no | no |
| 8.5 | - DeepLearning notebook template | no | no | no | no | no |
| 8.6 | - RStudio with TensorFlow notebook template | no | no | no | no | no |
| 8.7 | - Jupyter with TensorFlow notebook template | no | no | no | no | no |
| 8.8 | - Superset notebook template | no | no | no | no | no |
| 9 | Data Engine Service management | |||||
| 9.1 | - Stop | no | no | no | no | no |
| 9.2 | - Start | no | no | no | no | no |
| 9.3 | - Terminate | yes | yes | yes | yes | yes |
| 9.4 | - Ability to tune Data Engine Service parameters from Web UI on instance creation step | yes | yes | no | no | no |
| 9.5 | - Ability to navigate to Data Engine Service job tracker URL from Web UI (via reverse proxy) | yes | yes | no | no | yes |
| 10 | Libraries management | |||||
| 10.1 | - Ability to deploy libraries on notebook instance | yes | yes | yes | yes | yes |
| 10.2 | - Ability to deploy libraries on Data Engine (Spark Standalone) | yes | yes | yes | yes | yes |
| 10.3 | - Ability to deploy libraries on Data Engine Service | yes | yes | no | no | yes |
| 11 | Available library groups for installation from Web UI | |||||
| 11.1 | - Apt/Yum | yes | yes | yes | yes | yes |
| 11.2 | - Pip2 | yes | yes | yes | yes | yes |
| 11.3 | - Pip3 | yes | yes | yes | yes | yes |
| 11.4 | - R packages | yes | yes | yes | yes | yes |
| 11.5 | - Java | yes | yes | yes | yes | yes |
| 12 | Instance management via scheduler | |||||
| 12.1 | - Ability to stop/start notebook instance on scheduled basis | yes | yes | yes | yes | yes |
| 12.2 | - Ability to stop/start Data Engine (Spark Standalone) on scheduled basis | yes | yes | yes | yes | yes |
| 12.3 | - Ability to stop/start Data Engine Service on scheduled basis | yes | yes | yes | yes | yes |
| 12.4 | - Ability to terminate Compute on scheduled basis | yes | yes | yes | yes | yes |
| 12.5 | - Support of resources stopping on exceeding idle time via scheduler | yes | yes | yes | yes | yes |
| 12.6 | - Reminder after login, notifying that corresponding resources are about to be stopped/terminated | yes | yes | yes | yes | yes |
| 13 | Admin user only functionality | |||||
| 13.1 | - Ability to stop user's Edge node/Compute/notebook instance separately | yes | yes | yes | yes | yes |
| 13.2 | - Ability to terminate user's Compute/notebook instance separately | yes | yes | yes | yes | yes |
| 13.3 | - Ability to stop user's Edge node with related instances simultaneously | yes | yes | yes | yes | yes |
| 13.4 | - Ability to terminate user's Edge node with related instances simultaneously | yes | yes | yes | yes | yes |
| 13.5 | - Ability to connect/disconnect endpoint | yes | yes | yes | yes | yes |
| 13.6 | - Ability to restrict available instance shapes based on user login (per user/group) | yes | yes | yes | yes | yes |
| 13.7 | - Ability to adjust the total cost limit for a project and for DataLab as a whole | yes | yes | yes | yes | yes |
| 13.8 | - Ability to see and export billing data for all users | yes | yes | yes | yes | yes |
| 13.9 | - Ability to set permissions for cloud buckets if the user accesses them only via the bucket browser | yes | yes | yes | yes | yes |
| 14 | Notebook templates that support Git operations via ungit (cloning a repository, merging, pulling, pushing) | |||||
| 14.1 | - Jupyter notebook template | yes | yes | yes | yes | yes |
| 14.2 | - RStudio notebook template | yes | yes | yes | yes | yes |
| 14.3 | - Apache Zeppelin notebook template | no | no | no | no | no |
| 14.4 | - DeepLearning notebook template | yes | yes | yes | no | yes |
| 14.5 | - RStudio with TensorFlow notebook template | yes | no | no | no | no |
| 14.6 | - Jupyter with TensorFlow notebook template | yes | yes | yes | no | yes |
| 15 | Data Storage | |||||
| 15.1 | - Ability to read/write from/to shared or personal bucket | yes, S3 | yes, S3 | yes, blob storage, data lake | yes, blob storage, data lake | yes, blob storage, data lake |
| 16 | Bucket browser | |||||
| 16.1 | - Ability to upload file, create folder, delete folder/file, download file, copy path to folder/file | yes | yes | yes | yes | yes |
| 17 | Billing Report | |||||
| 17.1 | - Ability to see and export billing data | yes | yes | yes | yes | no |
| 18 | Audit Report | |||||
| 18.1 | - Ability to see all user actions in the DataLab UI | yes | yes | yes | yes | yes |