This workshop is meant to be highly interactive. The instructor will lead you in two interactive teaching styles:
- Interactive Lecturing: The majority of the content for this workshop is in Azure Notebooks. The content will be introduced via PowerPoint, and the rest of the workshop will consist of walking you through the notebooks. During this time, the instructor will use an interactive lecture style, in which you will be asked to participate by asking questions and offering ideas.
- Think, Pair, Share: For some of the more complex topics, the instructor will use the "Think, Pair, Share" method. You will be asked a question and given about 45 seconds to think quietly to yourself. It is important that you do not discuss it with others during this time. Then you will have an opportunity to discuss with the 1-2 people next to you. Don't just share your answer; explain why you think it is the answer. Finally, the instructor will ask a few people to share what they discussed with their neighbors.
Notice: Various interactive cues are called out in the Notebooks. These are suggestions and are used at the instructor's discretion.
Hour | Topic
---|---
9:30 - 10:00am | Introduction to Data Science Keynote
10:00 - 11:00am | Introduction and Refresher Python
11:00am - 12:00pm | Introduction to NumPy & Pandas
12:00 - 12:30pm | Lunch
12:30 - 1:30pm | Introduction to Data cleaning and manipulation
1:30 - 2:30pm | Introduction to Machine Learning Models and Linear Regression
2:00 - 2:10pm | Break
2:30 - 3:45pm | Using the Cloud for Machine Learning with Azure ML Studio
3:45 - 4:00pm | Wrap Up
This repository contains a Visual Studio Code container build.
The setup of the one-click deployment is explained in more detail below.
The Visual Studio Code Remote - Containers extension lets you use a Docker container as a full-featured development environment. It allows you to open any folder inside (or mounted into) a container and take advantage of Visual Studio Code's full feature set. A .devcontainer folder in your project tells VS Code how to access (or create) a development container with a well-defined tool and runtime stack. This container can be used to run an application or to sandbox tools, libraries, or runtimes needed for working with a codebase.
Workspace files are mounted from the local file system or copied or cloned into the container. Extensions are installed and run inside the container, where they have full access to the tools, platform, and file system. This means that you can seamlessly switch your entire development environment just by connecting to a different container.
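To make the role of the .devcontainer folder more concrete, here is a minimal sketch that builds the kind of devcontainer.json file VS Code reads from that folder. It is not this repository's actual configuration: the environment name, the base image, and the short package list are all assumptions chosen for illustration. Only the `name`, `image`, and `postCreateCommand` properties are used.

```python
import json


def make_devcontainer_config(name, image, packages):
    """Return a minimal devcontainer.json document as a Python dict."""
    return {
        # Display name shown by VS Code for the container.
        "name": name,
        # Container image that the development container is created from.
        "image": image,
        # Command run once inside the container after it has been created.
        "postCreateCommand": "pip install " + " ".join(packages),
    }


config = make_devcontainer_config(
    name="intro-datascience",  # hypothetical name
    image="mcr.microsoft.com/devcontainers/python:3",  # assumed base image
    packages=["jupyter", "numpy", "pandas"],  # illustrative subset only
)

# The printed JSON is what would be saved as .devcontainer/devcontainer.json.
print(json.dumps(config, indent=4))
```

Generating the file from Python is only a convenience for showing its structure; in practice the JSON is usually written by hand or created by VS Code.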
The container build includes the following Python packages:
- jupyter
- numpy
- pandas
- scipy
- folium==0.2.1
- matplotlib
- ipywidgets>=7.0.0
- bqplot
- nbinteract==0.0.12
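Once the container is running, a quick check like the sketch below can confirm that the packages listed above are importable. It assumes that each package's import name matches its install name, which is not true for every Python package, and it only checks that a module can be found, not which version is installed.

```python
from importlib.util import find_spec

# Import names assumed to match the package names in the list above.
PACKAGES = [
    "jupyter", "numpy", "pandas", "scipy", "folium",
    "matplotlib", "ipywidgets", "bqplot", "nbinteract",
]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(PACKAGES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed packages are available.")
```

Run it in a terminal or notebook cell inside the container; any package reported as not found can then be installed with pip.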
You can run this container locally from VS Code; see https://code.visualstudio.com/docs/remote/containers.
To get started, follow these steps:
- Install and configure Docker for your operating system.
  - Windows / macOS:
    - Install Docker Desktop for Windows/Mac.
    - Right-click on the Docker taskbar item and update Settings / Preferences > Shared Drives / File Sharing with any source code locations you want to open in a container. If you run into trouble, see Docker Desktop for Windows tips on avoiding common problems with sharing.
  - Linux:
    - Follow the official install instructions for Docker CE/EE for your distribution. If you are using Docker Compose, follow the Docker Compose directions as well.
    - Add your user to the docker group by using a terminal to run: sudo usermod -aG docker $USER
    - Sign out and back in again so your changes take effect.
- Install Visual Studio Code.
- Install the Remote Development extension pack.
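After completing the installation steps, a small script along these lines can confirm that the command-line tools are reachable. This is a sketch under two assumptions: that Docker is exposed as `docker` and VS Code as `code` on your PATH (the `code` command is not always added automatically), and that each tool prints its version when called with `--version`.

```python
import shutil
import subprocess


def tool_version(executable):
    """Return the first line of `<executable> --version`, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None  # The tool is not on PATH.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


for tool in ("docker", "code"):
    version = tool_version(tool)
    print(f"{tool}: {version if version else 'not found on PATH'}")
```

A tool reported as not found may still be installed; it may simply not be on the PATH of the shell you are using.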
Let's start by using a sample project to try things out.
Clone the repository.
Start VS Code and click on the quick actions Status Bar item in the lower left corner of the window.
Select Remote-Containers: Open Folder in Container... from the command list that appears, and open the root folder of the project you just cloned.
The window will then reload, but since the container does not exist yet, VS Code will create one. This may take some time, and a progress notification will provide status updates. Fortunately, this step isn't necessary the next time you open the folder since the container will already exist.
After the container is built, VS Code automatically connects to it and maps the project folder from your local file system into the container. Check out the Things to try section of README.md in the repository you cloned to see what to do next.
Tip: Want to use a remote Docker host? See the Advanced Containers article for details on setup.
The one-click deployment works by calling a specific URL, which creates your Visual Studio Online environment.
The domain env.new can be used in place of online.visualstudio.com to make the URL shorter:
https://env.new?name=intro-DataScience&repo=leestott/intro-Datascience
You could also open this URL in a new window by using target="_blank" on the link, for example on a link wrapped around {your image}.
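The URL scheme described above can be sketched in Python as follows. The `name` and `repo` query parameters and the env.new domain come from the example URL; the helper function names and the link label are hypothetical and exist only for this illustration.

```python
import html
from urllib.parse import urlencode


def build_env_url(name, repo, base="https://env.new"):
    """Build the environment-creation URL from an environment name and a repo."""
    # safe="/" keeps the owner/repo slash unescaped, as in the example URL.
    return base + "?" + urlencode({"name": name, "repo": repo}, safe="/")


def build_link(url, label):
    """Wrap the URL in an HTML link that opens in a new window (hypothetical helper)."""
    return f'<a href="{html.escape(url)}" target="_blank">{html.escape(label)}</a>'


url = build_env_url("intro-DataScience", "leestott/intro-Datascience")
print(url)
print(build_link(url, "Open in Visual Studio Online"))
```

Note that `html.escape` turns the `&` in the URL into `&amp;` inside the `href` attribute, which is the correct encoding for HTML.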
To install this in Visual Studio Online (https://visualstudio.microsoft.com/services/visual-studio-online/), you'll need the following:
- A Microsoft Azure subscription. If you don't already have one, you can sign up for a free trial at https://azure.microsoft.com or a Student Subscription at https://aka.ms/azureforstudents.
- A Visual Studio Online environment. This provides a hosted instance of Visual Studio Code, in which you'll be able to run the notebooks for the lab exercises. To set up this environment:
  - Browse to https://online.visualstudio.com
  - Click Get Started.
  - Sign in using the Microsoft account associated with your Azure subscription.
  - Click Create environment. If you don't already have a Visual Studio Online plan, create one. This is used to track resource utilization by your Visual Studio Online environments. Then create an environment with the following settings:
    - Environment Name: A name for your environment, for example, intro-Datascience.
    - Git Repository: leestott/intro-Datascience
    - Instance Type: Standard (Linux) 4 cores, 8GB RAM
    - Suspend idle environment after: 120 minutes
  - Wait for the environment to be created, and then click Connect to connect to it. This will open a browser-based instance of Visual Studio Code.
  - Wait for a minute or so while the environment is set up for you. It might look like nothing is happening, but in the background we are installing some extensions that you will use in the labs.
  - Copy your notebook files and data into the environment and begin working through the material.
Tip: you can change the color scheme back to a dark background if you prefer - just click the ⚙ icon at the bottom left and select a new Color Theme.
Using Azure Notebooks http://notebooks.azure.com
The primary source of content will be relatively bare Azure Notebooks where the instructor will guide you through discovering the different features of Python, NumPy, Pandas, and general data cleaning and manipulation.
The folder called "Course Material" has all of the content and exercises, plus written explanations and additional features and exercises not covered in this workshop.
Cloning creates a copy of an existing project in your own account, where you can then run and modify any notebook or other file in the project. You can also use cloning to make copies of your own projects in which you do experiments or other work without disturbing the original project.
To clone a project
In the Clone Project popup, enter a name and ID for the clone, and specify whether the clone is public. These settings are the same as for a new project.
After you select the Clone button, Azure Notebooks navigates directly to the copy.
Azure Notebooks is still in Preview. This means that it may occasionally fail. Here are some tips to avoid losing your work:
- Ensure your work is being saved. In the Jupyter Notebook there is always one of two messages to the right of the notebook title: (autosaved) or (unsaved changes). Check that your work is being saved every 10 minutes or so.
- Sometimes a notebook gets into a state where the kernel cannot be started. Restarting the kernel sometimes works, but often you will have to sign out of Azure Notebooks completely and then sign back in.
Additionally, if you need a refresher on how to code in Python or work with NumPy or Pandas, we recommend you check out the materials from our other Reactor Workshops:
- Data cleaning and manipulation
- Microsoft Learn Interactive Labs
- Building an AI Solution with ML Services