forked from jglev/Time-Travel-for-Academics
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Slides.Rmd
281 lines (179 loc) · 12.9 KB
/
Slides.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
% Time-Travel for Academics
% Jacob Levernier ([adunumdatum.org](http://adunumdatum.org))
% May 18, 2015
## This Presentation
* The code is on [GitHub](https://github.com/publicus/Time-Travel-for-Academics)
* A copy of the complete presentation is at [http://links.AdUnumDatum.org?time-travel-for-academics-2015-05-18](http://links.AdUnumDatum.org?time-travel-for-academics-2015-05-18)
# Get your digital life in order, and protect yourself from yourself
<!--
This presentation is meant to be used with reveal.js (which you need to download and place in the same directory as these slides). The slides are generated with [Pandoc](http://pandoc.org/demo/example9/producing-slide-shows-with-pandoc.html):
Rscript -e "library(knitr)" -e "knit('Slides.Rmd', output='markdown_compiled_NOT_FOR_EDITING.mkd')"
pandoc -t revealjs -i -s markdown_compiled_NOT_FOR_EDITING.mkd -o Slides_Build.html --slide-level 2 --css reveal.js/css/theme/white.css --variable transition="fade"
You can add the --self-contained flag to pandoc to create a single file (with all images and linked css and javascript files encoded and embedded) for distribution.
(-i is "incremental" mode for lists. --slide-level sets the header level that should trigger new slides)
(Note that per [here](https://github.com/jgm/pandoc-templates/pull/78), you may need to tempoararily change the css and js locations to not include ".min" in the name (or copy them and rename them with ".min").
-->
<div class="notes">
The point of this talk is to...
"Protect yourself from yourself" means to not have to worry about mistakes that you might make in the future, or about decisions (e.g., to delete something) that you made in the past.
</div>
```{r set_variables, echo = FALSE, eval = TRUE}
Sys.setenv(presentation_base_directory = "/home/jacoblevernier/ownCloudDesktopSyncFolder/Documents/Files in Transit/Time_Travel_for_Academics_Git_Presentation") # Following http://yihui.name/knitr/demo/engines/, this is currently the only way to set variables that can be shared across non-R code chunks that are bash-based (for other engines, variables have to be written to files to be shared across code chunks, apparently).
```
##
![[PHD Comics](http://www.phdcomics.com/comics/archive/phd101212s.gif)](images/phd_comics_version_control_101212s.gif)
## What to do?
Well...
* **Dropbox**
* <span class="fragment highlight-green">Saves (limited) versions of files</span>
* <span class="fragment highlight-red">No concurrent editing</span>
* <span class="fragment highlight-red">Doesn't show what was changed between different versions, or why</span>
* **Word Documents**
* <span class="fragment highlight-red">Passing around a document makes it hard to sync files and to keep track of which versions are based on which others.</span>
* **Google Docs**
* <span class="fragment highlight-green">Nice collaborative editing with versions</span>
* <span class="fragment highlight-green">Useful for writing manuscripts</span>
* <span class="fragment highlight-red">No/Limited offline access to files</span>
* <span class="fragment highlight-red">Not suited for editing code and analyses</span>
<div class="notes">
**Dropbox** *does* save limited versions of files, but doesn't allow two people to edit files at once, and doesn't show what was changed between different versions, or why.
Saving seperate versions of files (or passing around a **Word Document** makes it hard to sync files or keep track of which versions are based on which others.
Using **Google Docs** offers nice collaborative editing with versions, but if your internet connection is down, you have no or limited access to files. And while Docs are fine for writing manuscripts, they're not so good for editing code and analyses.
</div>
# Git
##
![[GotCredit.com, CC-BY](https://www.flickr.com/photos/jakerust/16836495381/)](images/insurance_keyboard.jpg)
## How it Works
<div style="height:300px;">![](images/apple-logo-grey.png) ![](images/apple_time_machine_logo.png)</div>
<div class="notes">
Note: Following [this site](http://smallbusiness.chron.com/fair-use-logos-2152.html), Fair Use allows for the reuse of logos if the use is only for identification purposes (e.g., in a newspaper to identify the company the article is about).
</div>
* Distributed (shareable)
* Only saves changes
* You build a "logbook" of everything that's happened to the tracked files.
## Command Line vs. Point-and-Click
* We'll talk about both.
* Git is fundamentally a command-line application.
* Programs like GitEye add a graphical interface.
* **One take-home message of everything after this:** GitEye can do everything that the command line can. But it's easier to learn about this with the command line.
(If you need to look up help later, it'll be useful to have heard the relevant command-line vocab terms)
## The Basic Workflow
1. Make git aware of the files you want to track
<span class="fragment">`git init` ("initialize"/start tracking a project),</span>
<span class="fragment">`git clone [url]` (create a copy on your computer of someone else's project),</span>
<span class="fragment">`git add [filename]` (add to the list of files to track)</span>
* You use `git add [filename]` to get a file to be tracked in the first place. After that, whenever the file gets modified, you use `git add` to put it into the list of things to snapshot.
1. Take a snapshot ("commit") whenever you do something. Add a commit message.
<span class="fragment">`git commit --message "I deleted the first part of the Introduction,`
`but added new citations to the Literature Review."`</span>
1. Do your work!
<span class="fragment">As needed, use `git log` (view the logbook of all previous commits),</span>
<span class="fragment">`git status` to see which files Git sees as having been changed since the last commit,</span>
<span class="fragment">`git diff [filename]` (see exactly what changed between two commits),</span>
<span class="fragment">`git checkout [id number]` (roll back to a different commit, making the folder look exactly as it did at that point in time — you can always roll forward later! (with `git checkout master` or `git checkout OtherBranchName`)</span>
## What to Track with Git
* <span class="fragment highlight-green">Plain-text files</span>
* Text: .txt, [Markdown](https://daringfireball.net/projects/markdown/syntax), [RMarkdown](http://rmarkdown.rstudio.com/)
* Code: R, Python, SPSS syntax, etc.
* Data: csv, corpora, etc.
* <span class="fragment highlight-red">Binary files</span>
* Videos
* Compiled software files
* Images
* Powerpoint, Word Docs, Excel spreadsheets
* Some users choose to store large files outside of a respository, with a note within the repository pointing to where others can find them.
* See, e.g., GitHub's [Large File Storage](https://github.com/blog/1986-announcing-git-large-file-storage-lfs) system, or the [`git-media` extension](https://github.com/alebedev/git-media)
* Only track a file if... <span class="fragment">it's changing, and you want to track the changes</span><span class="fragment">, and you want people with whom you share the repository to get a copy of it.</span>
# Real-World Examples
## Code Editing
* Cloning an Existing Repository
[Qualtrics Graph Helper](https://github.com/publicus/qualtrics_graph_helper)
* Starting a New Repository from Scratch
# The Command Line
## The Bare Essentials
* "Terminal" in Mac OSX and Linux. [Cygwin](https://www.cygwin.com/) in Windows (I haven't tested these command in the Windows Command Prompt, but I think that they should work there too), or [Git for Windows](https://msysgit.github.io/).
* `ls` = **l**i**s**t files in a folder.
* `cd` = "Change Directory"
* `cd ./subfolder` = "Go into the folder called 'subfolder' in the current directory."
* `cd ../` = "Go one directory up."
* `man [command]` = "Show me the manual for this command."
* For example, `man ls` will show and explain all of `ls`'s options.
* `ls -a` will show hidden files (like the `.git` directory!)
* `ls -l` will show the files in a list format (with each file on one line)
* This slide is to help you use Git at a basic level. <span class="fragment highlight-red">Do not use command-line commands that you don't know.</span> The command line doesn't ask "Are you sure?"
## `git clone`
![](images/github_screenshots/GitHub_Repo_Index_Page.png)
## `git clone`
![](images/github_screenshots/GitHub_Repo_Index_Page_with_Clone_URL_Highlighted.png)
## `git clone`
![](images/terminal_screenshots/blank_terminal.png)
## `git clone`
![](images/terminal_screenshots/terminal_before_git-clone.png)
## `git clone`
![](images/terminal_screenshots/terminal_after_git_clone.png)
## `git log`
![](images/terminal_screenshots/git-log_command.png)
## `git log`
![](images/terminal_screenshots/git-log_result.png)
(Press 'q' to quit the log viewer)
## `git log`
![](images/terminal_screenshots/git-log_result_hash_highlighted.png)
## `git show`
![](images/terminal_screenshots/git-show_command.png)
## `git show`
![](images/terminal_screenshots/git-show_result.png)
## Repository from Scratch
## Manuscript Writing
* Many writers use [Markdown](https://daringfireball.net/projects/markdown/syntax) and [RMarkdown](http://rmarkdown.rstudio.com/) to write.
* See [Pandoc](http://pandoc.org/getting-started.html) for converting between txt, markdown, RMarkdown, etc. and PDF, Word Documents, etc.
* This is also built into programs like [RStudio](http://www.rstudio.com/)
# GitEye
## GitEye
![[GitEye](http://www.collab.net/downloads/giteye)](images/giteye_screenshots/GitEye_Initial_View.png)
* See also `gitk`, `git-cola`, and several other tools. GitEye is the most straightforward I've found after quite a bit of looking.
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_ID_Highlighted.png)
GitEye's Commit Log ("History")
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Message_Highlighted.png)
GitEye's Commit Log ("History")
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Author_Highlighted.png)
GitEye's Commit Log ("History")
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Commit1_Highlighted.png)
GitEye's Commit Log ("History")
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Commit2_Highlighted.png)
GitEye's Commit Log ("History")
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Commit3_Highlighted.png)
GitEye's Commit Log ("History")
## Branches
* A git log is like a big trunk of a tree.
* That tree can have branches.
* "Branch" ≈ A hashtag given to a line of commits
* E.g., "New_Test_that_Might_Break_Everything"<span class="fragment">, "New_Feature"</span><span class="fragment">, "My_Part_of_the_Project"</span>
* `git branch` = See a list of available branches
* `git checkout [branchname]` = Go to the top of the branch called "branchname"
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Branchpoint1_Highlighted.png)
Branching (`git checkout -b NewBranchName`, or `git checkout ExisitingBranchName`)
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Branchpoint2_Highlighted.png)
Branching (`git checkout -b NewBranchName`, or `git checkout ExisitingBranchName`)
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Mergepoint1_Highlighted.png)
Merging (`git checkout MainBranchName; git merge OtherBranchName`)
## ![](images/giteye_screenshots/GitEye_Branches_Example_Screenshot_Mergepoint2_Highlighted.png)
Merging (`git checkout MainBranchName; git merge OtherBranchName`)
## Comparing Files between Commits (`git diff`)
# GitHub and Related Services
## What is GitHub?
* See also BitBucket
## Public vs. Private
* GitHub: Public by default. Pay for private.
* BitBucket: More allowance for private, but still based on pay-for-private model.
* **Both have academic/student accounts that allow for unlimited private repositories.**
## `git push` and `git pull`
* Once you've cloned a repository, all it takes to add your changes to it is to <span style="display:inline-block;" class="fragment grow">**"push"**</span> your local changes up to the server, with `git push https://github.com/url_of_the_repository`
* If you want to get what everyone else has been adding to the repository, you can <span style="display:inline-block;" class="fragment grow">**"pull"**</span> it down from the server, with `git pull` (Git should know where to pull it from automatically from when you cloned it).
## The Reference Log (`git reflog`)
* Is a log of every git command you run.
* Exists on your local machine only, and is wiped after 60-90 days.
* Useful if you mess up a merge, etc.
* To reset as if you hadn't done a command, you can use `git reset [IDNumber]`
* `git reset --hard 627c93d` will revert back to ID 627c93d, deleting anything you haven't committed yet.
* `git reset --soft 627c93d` will revert back to ID 627c93d, keeping anything you haven't committed yet.
* Using `git reset` will add a new entry to the Reference Log, so you can undo it, too! (Although you'd still lose any uncommitted changes)