```{r, include = FALSE}
ottrpal::set_knitr_image_path()
```
# Modifying a Docker image
## Learning Objectives
```{r, fig.alt="Learning Objectives. This chapter will demonstrate how to: Modify an existing Docker image. Store a Docker image on dockerhub. Incorporate Docker image use into a project.", out.width = "100%", echo = FALSE}
ottrpal::include_slide("https://docs.google.com/presentation/d/1IJ_uFxJud7OdIAr6p8ZOzvYs-SGDqa7g4cUHtUld03I/edit#slide=id.gfbc11e6ab0_0_149")
```
***
The Docker image you used in the last chapter was pre-made for you, but depending on the needs of your project, you may need different packages installed. In this chapter we will introduce you to the basics of managing your own Docker image.
## Managing images
Images can be on your own computer or on dockerhub.
To see the list of images on your computer, open Docker desktop. You will want to delete images and containers from here periodically because they take up space on your computer.
```{r, fig.alt="A reminder of where to find your current local images. Open up Docker desktop. Click on ‘images’ on the left. This shows the images you have available on your computer. If you hover your mouse over one of these images, you will see a Run button appear. Click the Run button to start a container with this image.", out.width = "100%", echo = FALSE}
ottrpal::include_slide("https://docs.google.com/presentation/d/1IJ_uFxJud7OdIAr6p8ZOzvYs-SGDqa7g4cUHtUld03I/edit#slide=id.gfc8849fa4d_0_17")
```
To see what images you have on your internet repository, you can log on to dockerhub.
[Go here to login (or create a username if you have not yet)](https://hub.docker.com/).
```{r, fig.alt="The dockerhub login page.", out.width = "100%", echo = FALSE}
ottrpal::include_slide("https://docs.google.com/presentation/d/1IJ_uFxJud7OdIAr6p8ZOzvYs-SGDqa7g4cUHtUld03I/edit#slide=id.gfc8849fa4d_0_38")
```
After you sign into dockerhub, click on the `Repositories` tab, so you can see the list of repositories you have stored online. At this point, you won’t have any if you just created your dockerhub account. To create a new repository, click the ‘Create Repository’ button.
```{r, fig.alt="In your dockerhub main page, you can see the list of repositories you have stored online. At this point, you won’t have any if you just created your dockerhub account. To create a new repository, click the ‘Create Repository’ button.", out.width = "100%", echo = FALSE}
ottrpal::include_slide("https://docs.google.com/presentation/d/1IJ_uFxJud7OdIAr6p8ZOzvYs-SGDqa7g4cUHtUld03I/edit#slide=id.gfc8849fa4d_0_10")
```
Upon adding the new repository to dockerhub, you will need to name it the same as whatever you are calling it locally. You can add a name and a description and then click create. On the right it shows how you can interact with this repository from your local command line.
```{r, fig.alt="Upon adding the new repository to dockerhub, you will need to name it the same as whatever you are calling it locally. You can put a description and name and click create. On the right it shows how you can interact with this from your local command line.", out.width = "100%", echo = FALSE}
ottrpal::include_slide("https://docs.google.com/presentation/d/1IJ_uFxJud7OdIAr6p8ZOzvYs-SGDqa7g4cUHtUld03I/edit#slide=id.gfc8849fa4d_0_48")
```
After you've created the image repository, you will be brought to the image repository page.
It will tell you `Last pushed: never`. On the right it will tell you the command you will need in order to push the image to dockerhub.
```{r, fig.alt="After you've created the image repository, you will be brought to the image repository page. It will tell you Last pushed: never. On the right it will tell you the command you will need in order to push the image to dockerhub.", out.width = "100%", echo = FALSE}
ottrpal::include_slide("https://docs.google.com/presentation/d/1IJ_uFxJud7OdIAr6p8ZOzvYs-SGDqa7g4cUHtUld03I/edit#slide=id.gfc8849fa4d_0_56")
```
Go to your [local command line](https://towardsdatascience.com/a-quick-guide-to-using-command-line-terminal-96815b97b955) and use the command specified on the right side of your repository page. Specifying a tagname is optional; if you don't want one, leave off the `:tagname`.
Now you will be able to test pulling your image using `docker pull <image name>` like we did in the previous chapter. You can also click on the `Public View` button to copy the pull command for your Docker image.
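The exact push command for your repository is the one shown on your dockerhub page. As a rough sketch of the usual sequence, the snippet below builds the full remote name and wraps the tag, push, and pull steps in a small helper. The username `yourusername`, the tag `v1`, and the helper name `push_and_pull` are placeholders made up for this illustration.

```shell
# Hypothetical values: substitute your own dockerhub username and image name
DOCKERHUB_USER="yourusername"
IMAGE_NAME="image_name"
TAG="v1"   # optional; Docker falls back to "latest" when no tag is given

# dockerhub repositories are addressed as username/repository:tag
REMOTE_NAME="${DOCKERHUB_USER}/${IMAGE_NAME}:${TAG}"
echo "${REMOTE_NAME}"

# push_and_pull LOCAL REMOTE: label the local image with the remote name,
# upload it, then pull it back as a check (run `docker login` first)
push_and_pull() {
  docker tag "$1" "$2" &&
    docker push "$2" &&
    docker pull "$2"
}

# Usage, once you are logged in:
# push_and_pull "${IMAGE_NAME}" "${REMOTE_NAME}"
```

The `docker tag` step is what gives your local image a name that matches your dockerhub repository, so the push knows where to go.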
Docker images can be pulled from online storage, but they are originally built from a `Dockerfile`.
## Exercise: Build a Docker image
A Dockerfile is a recipe for how to build a docker image. The best way to learn to write Dockerfiles is to start off with one that is already written and modify it for your needs.
You can practice building a Docker image by downloading one of the Dockerfiles we have started and changing it slightly.
### Download an example Dockerfile
<details> <summary> Get the Python Dockerfile </summary>
Download the example Dockerfile for Python analyses.
```
wget https://raw.githubusercontent.com/jhudsl/Adv_Reproducibility_in_Cancer_Informatics/main/resources/python-docker/Dockerfile
```
If you get a message like `command not found` that means you will need to install [`wget`](https://www.jcchouinard.com/wget/).
Alternatively, you can navigate to the [Dockerfile's page on GitHub](https://raw.githubusercontent.com/jhudsl/Adv_Reproducibility_in_Cancer_Informatics/main/resources/python-docker/Dockerfile) and use `File` > `Save as`, but do not add any suffix to the end of the file name (no `.txt` or anything). Just save it as `Dockerfile`.
</details>
<details> <summary> Get the R Dockerfile </summary>
Download the example Dockerfile for R analyses.
```
wget https://raw.githubusercontent.com/jhudsl/Adv_Reproducibility_in_Cancer_Informatics/main/resources/r-docker/Dockerfile
```
If you get a message like `command not found` that means you will need to install [`wget`](https://www.jcchouinard.com/wget/).
Alternatively, you can navigate to the [Dockerfile's page on GitHub](https://raw.githubusercontent.com/jhudsl/Adv_Reproducibility_in_Cancer_Informatics/main/resources/r-docker/Dockerfile) and use `File` > `Save as`, but do not add any suffix to the end of the file name (no `.txt` or anything). Just save it as `Dockerfile`.
</details>
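If `wget` is not installed, `curl` is often already available. The helper below is a minimal sketch that uses whichever of the two tools it finds; the function name `fetch_file` is made up for this example.

```shell
# fetch_file URL DEST: download URL to DEST with whichever tool is installed
fetch_file() {
  if command -v curl >/dev/null 2>&1; then
    # -f fail on HTTP errors, -sS quiet but show errors, -L follow redirects
    curl -fsSL -o "$2" "$1"
  elif command -v wget >/dev/null 2>&1; then
    wget -q -O "$2" "$1"
  else
    echo "Neither curl nor wget is installed" >&2
    return 1
  fi
}

# Example for the R Dockerfile (same URL as above):
# fetch_file https://raw.githubusercontent.com/jhudsl/Adv_Reproducibility_in_Cancer_Informatics/main/resources/r-docker/Dockerfile Dockerfile
```

Either way, the file should end up saved as `Dockerfile` with no suffix.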
### Build a Docker image from a Dockerfile
Place this newly downloaded Dockerfile with the rest of your project files.
Build a docker image from this Dockerfile using the command below, but replace `image_name` with what you would like your modified image to be called.
```
docker build -f Dockerfile . -t image_name
```
Navigate back to your Docker desktop and the `images` window. If your image built successfully, you should see a new image in your list!
### Modify a Docker image
If you want to add or remove a package from a Docker image, you'll need to modify the Dockerfile.
Using your preferred text editor (or RStudio or Jupyter Lab), open up the Dockerfile.
You will see the first line in the Dockerfile is a `FROM` command. This command names another Docker image to start from.

- For our **R example**, we are starting off with an image that already has R and the tidyverse.
- For our **Python example**, we are starting off with an image that already has Python and Jupyter Lab.

There are so many Docker images out there, that it might be that someone has already created a docker image with most of the functionality you need for your project.
`FROM` is one of the [main commands that a Dockerfile can take as described by their documentation](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/):
> **FROM** creates a layer from another Docker image.
> **COPY** adds files from your Docker client’s current directory.
> **RUN** builds your application with make.
> **CMD** specifies what command to run within the container.
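
As a rough illustration of how these four commands fit together, the sketch below writes a small stand-alone Dockerfile into a scratch folder. The `ubuntu:22.04` base image, the `dockerfile-demo` folder, and the `build-info.txt` file are assumptions chosen for this example and are not part of the course Dockerfiles.

```shell
# Work in a scratch folder so the course Dockerfile is not overwritten
mkdir -p dockerfile-demo

cat > dockerfile-demo/Dockerfile <<'EOF'
# Start from an existing image (illustrative choice, not the course image)
FROM ubuntu:22.04
# Copy the files in the build folder into the image
COPY . /app
# Run a command while the image is being built
RUN echo "built on $(date)" > /app/build-info.txt
# Default command when a container starts
CMD ["cat", "/app/build-info.txt"]
EOF

# Show the command keyword at the start of each non-comment line
grep -v '^#' dockerfile-demo/Dockerfile | cut -d ' ' -f 1
```

If you would like to try it, `docker build -t demo dockerfile-demo` followed by `docker run --rm demo` should print the line that was written during the build.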
### Add to the Dockerfile
To get a feel for how these work, let's add a line to your example Dockerfile.
Using your preferred text editor (or RStudio or Jupyter Lab), open up the Dockerfile and add this line at the **very end** of the file. Do not add this line to the start of the file as this will not work. The `FROM` command needs to come first.
```
CMD ["echo","Yay! I added to this Docker image"]
```
Now re-run `docker build` as you did in the previous section. (Use the command below but replace `image_name` with whatever your image is called).
```
docker build -f Dockerfile . -t image_name
```
If all built successfully, you should see a message like:
```
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:ayuahgfuiseohfauwheufhauwihefuahweufhawfbuibe 0.0s
=> => naming to docker.io/library/image_name
```
Now run the image with the `docker run` command we used in the previous chapter (see below). Because `CMD` sets the command that runs when a container starts, the message `Yay! I added to this Docker image` should appear when the container runs, not while the image is building.
<details> <summary> To run your new **Python docker image** </summary>
But replace `image_name` with whatever you have called your image.
```
docker run --rm -v $PWD:/home/jovyan/work -e JUPYTER_ENABLE_LAB=yes -p 8787:8787 image_name
```
</details>
<details> <summary> To run the **R docker image** </summary>
But replace `image_name` with whatever you have called your image.
```
docker run --rm -v $PWD:/home/rstudio -e PASSWORD=password -p 8787:8787 image_name
```
</details>
**Stop and remove these containers before moving on.** You can do this by going to Docker desktop and clicking on the trash can button next to each container. For images click `Clean up` to check off the images you'd like to remove and then hit `Remove`.
### Add another package!
Starting off with your example Dockerfile, we will practice adding another package and re-building the Docker image.
**Note** that spacing is important as well as having a `\` at the end of each line if the command is continuing.
#### Adding an R package
To add R packages from CRAN, you can use this kind of format:
```
RUN Rscript -e "install.packages( \
c('BiocManager', \
'R.utils', \
'newpackagename'))"
```
To add an R package from Bioconductor, you can follow this kind of format:
```
RUN Rscript -e "options(warn = 2); BiocManager::install( \
c('limma', \
'newpackagename'))"
```
To add a **Python package using pip**, use `pip3 install` in a `RUN` command with this format:
```
RUN pip3 install \
"somepackage==0.1.0"
```
There are so many things you can add to your Docker image. (Picture whatever software and packages you are using on your computer.) We can only give you a feel for how to build a Dockerfile; what you put on your Docker image will be up to you.
To figure out how to add something, a good strategy is to look for other Dockerfiles that might have the package you want installed and borrow their `RUN` command. Then try to re-build your Docker image with that added `RUN` command and see if it builds successfully.
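One way to try a borrowed `RUN` command without disturbing a Dockerfile that already works is to build from a copy under a throwaway tag. This is only a sketch: the helper name `try_build`, the file name `Dockerfile.test`, and the tag `image_name:test` are made up for this example.

```shell
# try_build FILE TAG: build FILE under a throwaway tag and report the result
try_build() {
  if docker build -f "$1" -t "$2" . ; then
    echo "Build succeeded: $2"
  else
    echo "Build failed: fix $1 before replacing your working Dockerfile"
    return 1
  fi
}

# Usage:
#   cp Dockerfile Dockerfile.test      # keep the known-good file untouched
#   (edit Dockerfile.test to add the new RUN command)
#   try_build Dockerfile.test image_name:test
```

Once the test build succeeds, you can copy the change into your main Dockerfile and re-build under your usual image name.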
Lastly, make sure that you add any changes you make to your Dockerfile to your GitHub repository by [creating a pull request as we did in Chapter 3](https://jhudatascience.org/Adv_Reproducibility_in_Cancer_Informatics/using-version-control-with-github.html#create-a-branch).
### More about Docker next steps
- [Dockerfile Tutorial by Example](https://takacsmark.com/dockerfile-tutorial-by-example-dockerfile-best-practices-2018/#lets-create-your-first-image)
- [Dockerfile examples](https://linuxtechlab.com/learn-create-dockerfile-example/)
### A list of handy Docker commands:
_Get info on current containers:_
```
docker ps
```
_How to stop an individual container:_
```
docker container ls
docker stop <containerID>
```
_Get rid of all non-running containers:_
```
docker container prune
```
_Stop all containers:_
```
docker stop $(docker ps -a -q)
```
_Remove all containers:_
```
docker rm -f $(docker ps -a -q)
```
**If you have any feedback on this chapter, please [fill out this form](https://forms.gle/j3cJZX5CmNtQp6QKA). We'd love to hear from you!**