% Time-Travel for Academics
% Jacob Levernier ([adunumdatum.org](http://adunumdatum.org))
% September 18, 2015
<!--
##########
# TODO:
##########
* Update GitHub Code link to match branch for this specific presentation.
-->
## This Presentation
* The code is on [GitHub](https://github.com/publicus/Time-Travel-for-Academics)
* A copy of the complete presentation is at [http://links.AdUnumDatum.org?time-travel-for-academics-2015-09-18](http://links.AdUnumDatum.org?time-travel-for-academics-2015-09-18)
* Press 'o' for an overview of the slides and 's' for the Presenter View. Use the up/down and left/right arrow keys to move between slides vertically and horizontally, respectively.
## Goals
1. Immediately-applicable skill with one new tool (**SmartGit**, supplemented with the command line)
1. Excitement about (but not yet usable understanding of) other tools:
* **RMarkdown**
* **The Unix Command Line**
* **Regular Expressions**
# Get your digital life in order, and protect yourself from yourself
<!--
This presentation is meant to be used with reveal.js (which you need to download and place in the same directory as these slides). The slides are generated with [Pandoc](http://pandoc.org/demo/example9/producing-slide-shows-with-pandoc.html):
Rscript -e "library(knitr)" -e "knit('Slides.Rmd', output='markdown_compiled_NOT_FOR_EDITING.mkd')"
pandoc -t revealjs -i -s markdown_compiled_NOT_FOR_EDITING.mkd -o Slides_Build.html --slide-level 2 --css reveal.js/css/theme/white.css --variable transition="fade"
You can add the --self-contained flag to pandoc to create a single file (with all images and linked css and javascript files encoded and embedded) for distribution.
(-i is "incremental" mode for lists. --slide-level sets the header level that should trigger new slides)
(Note that per [here](https://github.com/jgm/pandoc-templates/pull/78), you may need to temporarily change the css and js locations to not include ".min" in the name (or copy them and rename them with ".min").)
-->
<div class="notes">
The point of this talk is to...
"Protect yourself from yourself" means to not have to worry about mistakes that you might make in the future, or about decisions (e.g., to delete something) that you made in the past.
</div>
```{r set_variables, echo = FALSE, eval = TRUE}
Sys.setenv(presentation_base_directory = "/home/jacoblevernier/ownCloudDesktopSyncFolder/Documents/Files in Transit/Time_Travel_for_Academics_Git_Presentation") # Following http://yihui.name/knitr/demo/engines/, this is currently the only way to set variables that can be shared across non-R code chunks that are bash-based (for other engines, variables have to be written to files to be shared across code chunks, apparently).
```
##
![[PHD Comics](http://www.phdcomics.com/comics/archive/phd101212s.gif)](images/phd_comics_version_control_101212s.gif)
## What to do?
Well...
* **Dropbox**
* <span class="fragment highlight-green">Saves (limited) versions of files</span>
* <span class="fragment highlight-red">No concurrent editing</span>
* <span class="fragment highlight-red">Doesn't show what was changed between different versions, or why</span>
* **Word Documents**
* <span class="fragment highlight-red">Passing around a document makes it hard to sync files and to keep track of which versions are based on which others.</span>
* **Google Docs**
* <span class="fragment highlight-green">Nice collaborative editing with versions</span>
* <span class="fragment highlight-green">Useful for writing manuscripts</span>
* <span class="fragment highlight-red">No/Limited offline access to files</span>
* <span class="fragment highlight-red">Not suited for editing code and analyses</span>
<div class="notes">
**Dropbox** *does* save limited versions of files, but doesn't allow two people to edit files at once, and doesn't show what was changed between different versions, or why.
Saving separate versions of files (or passing around a **Word Document**) makes it hard to sync files or keep track of which versions are based on which others.
Using **Google Docs** offers nice collaborative editing with versions, but if your internet connection is down, you have no or limited access to files. And while Docs are fine for writing manuscripts, they're not so good for editing code and analyses.
</div>
# Git
##
![[GotCredit.com, CC-BY](https://www.flickr.com/photos/jakerust/16836495381/)](images/insurance_keyboard.jpg)
## How it Works
<div style="height:300px;">![](images/apple-logo-grey.png) ![](images/apple_time_machine_logo.png)</div>
<div class="notes">
Note: Following [this site](http://smallbusiness.chron.com/fair-use-logos-2152.html), Fair Use allows for the reuse of logos if the use is only for identification purposes (e.g., in a newspaper to identify the company the article is about).
</div>
* Distributed (shareable)
* Only saves changes
* You build a "logbook" of everything that's happened to the tracked files.
## Command Line vs. Point-and-Click
* We'll talk about both.
* Git is fundamentally a command-line application.
* Programs like SmartGit add a graphical interface.
* **One take-home message of everything after this:** SmartGit can do (pretty much) everything that the command line can.
<span class="fragment">We'll focus on the GUI, but it's smartest to learn it in parallel with the command line.
(If you need to look up help later, it'll be useful to have heard the relevant command-line vocabulary.)</span>
## The Basic Workflow
1. Make git aware of the files you want to track
<span class="fragment">`git init` ("initialize"/start tracking a project),</span>
<span class="fragment">`git clone [url]` (create a copy on your computer of someone else's project),</span>
<span class="fragment">`git add [filename]` (add to the list of files to track)</span>
* You use `git add [filename]` to get a file to be tracked in the first place. After that, whenever the file gets modified, you use `git add` to put it into the list of things to snapshot.
1. Take a snapshot ("commit") whenever you do something. Add a commit message.
<span class="fragment">`git commit --message "I deleted the first part of the Introduction,`
`but added new citations to the Literature Review."`</span>
1. Do your work!
<span class="fragment">As needed, use `git log` (view the logbook of all previous commits),</span>
<span class="fragment">`git status` to see which files Git sees as having been changed since the last commit,</span>
<span class="fragment">`git diff [filename]` (see exactly what has changed in a file since it was last staged or committed; `git diff [id1] [id2]` compares two commits),</span>
<span class="fragment">`git checkout [id number]` (roll back to a different commit, making the folder look exactly as it did at that point in time; you can always roll forward later with `git checkout master` or `git checkout OtherBranchName`)</span>
<div class="notes">
Index vs. Head (if you click "Revert" in SmartGit, you may see this): Index = Changes at the moment of Staging; Head = Last commit/state on the current branch. See http://stackoverflow.com/a/3690522. https://git-scm.com/blog/2011/07/11/reset.html puts it as Head = "last commit snapshot", Index = "proposed next commit snapshot".
git reset vs. git checkout: (quotes are from https://www.atlassian.com/git/tutorials/resetting-checking-out-and-reverting/commit-level-operations)
git reset: "On the commit-level, resetting is a way to move the tip of a branch to a different commit. This can be used to remove commits from the current branch."
git checkout: "Internally, all the above command does is move HEAD to a different branch and update the working directory to match. Since this has the potential to overwrite local changes, Git forces you to commit or stash any changes in the working directory that will be lost during the checkout operation. Unlike git reset, git checkout doesn’t move any branches around."
</div>
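
The workflow above can be sketched end to end in a throwaway folder. This is only an illustration: the file name `manuscript.txt`, the author identity, and the commit messages are made up, and `mktemp -d` simply creates a scratch directory so that nothing real is touched.

```shell
set -eu

# Work in a scratch folder (hypothetical project location).
project_dir="$(mktemp -d)"
cd "$project_dir"

# 1. Make Git aware of the project and of the file to track.
git init --quiet
git config user.name "Example Author"       # made-up identity, stored only in this repository
git config user.email "author@example.org"

echo "Introduction, first draft." > manuscript.txt
git add manuscript.txt

# 2. Take a snapshot ("commit") with a message.
git commit --quiet --message "Add first draft of the Introduction."

# 3. Do some work, check what Git sees, then snapshot again.
echo "Literature Review, with new citations." >> manuscript.txt
git status --short          # lists manuscript.txt as modified
git diff manuscript.txt     # shows the line that was added
git add manuscript.txt
git commit --quiet --message "Add the Literature Review."

# View the logbook, one line per commit.
git log --oneline
commit_count="$(git rev-list --count HEAD)"
```

Every command here has a SmartGit equivalent; the sketch is mainly useful for recognizing the vocabulary.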
## What to Track with Git
* <span class="fragment highlight-green">Plain-text files</span>
* Text: .txt, [Markdown](https://daringfireball.net/projects/markdown/syntax), [RMarkdown](http://rmarkdown.rstudio.com/)
* Code: R, Python, SPSS syntax, etc.
* Data: csv, corpora, etc.
* <span class="fragment highlight-red">Binary files</span>
* Videos
* Compiled software files
* Images
* Powerpoint, Word Docs, Excel spreadsheets
* Some users choose to store large files outside of a repository, with a note within the repository pointing to where others can find them.
* See, e.g., GitHub's [Large File Storage](https://github.com/blog/1986-announcing-git-large-file-storage-lfs) system, or the [`git-media` extension](https://github.com/alebedev/git-media)
## Interlude: <span class="fragment highlight-green">Text</span>
* Text works across platforms.
* Especially through the Unix Command Line, text is searchable, combinable (able to be re-combined based on searches without changing the original text), and transmutable (able to be changed from format to format).
<div class="notes">
Examples: Go into my academic notes, and show searching using `grep`, inc. with --only-matching, --invert-match (e.g., for "Haidt"). Show writing that to a text file.
Optional: Concept map with `concept-map PRELIMS-Graham_Haidt_and_Nosek_2009.mkd -t ",,.*?,," -mn`
</div>
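
A small sketch of "searchable and combinable" text, assuming GNU `grep` (the long option names may differ in other versions). The notes file and its contents are invented for the example.

```shell
set -eu

notes_dir="$(mktemp -d)"
cd "$notes_dir"

# Hypothetical reading notes, one observation per line.
cat > notes.txt <<'EOF'
Haidt (2001) argues that moral judgment is mostly intuitive.
Kohlberg describes stages of moral reasoning.
Graham, Haidt, and Nosek (2009) compare liberals and conservatives.
EOF

# Search: print every line that mentions Haidt.
grep "Haidt" notes.txt

# Print only the matching text itself: "Haidt" followed by a year in parentheses.
year_match="$(grep --only-matching --extended-regexp "Haidt \([0-9]{4}\)" notes.txt)"
echo "$year_match"

# Invert the search: print the lines that do NOT mention Haidt.
other_lines="$(grep --invert-match "Haidt" notes.txt)"
echo "$other_lines"

# Re-combine: write the matches to a new file, leaving the original untouched.
grep "Haidt" notes.txt > haidt_notes.txt
haidt_line_count="$(grep --count "" haidt_notes.txt)"
```

The original `notes.txt` is never modified; each search produces new output that can be saved or piped onward.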
## Interlude: <span class="fragment highlight-green">Text</span>
* Many text editors (such as [SublimeText](http://www.sublimetext.com/ "SublimeText 2")) can search for *patterns* of text using "Regular Expressions".
* This presentation is written in plain-text!
* It uses a tool called [Pandoc](http://pandoc.org/ "Pandoc"), which can convert plain-text manuscripts into other formats (PDF, .docx, .html, .epub, etc.). The text is written in [Markdown](http://rmarkdown.rstudio.com/ "Markdown").
* With an R package called [Knitr](http://yihui.name/knitr "Knitr") (or with tools built into software like [R Studio](http://rstudio.com/ "R Studio")), presentations and manuscripts can contain code that auto-updates their charts and figures.
<div class="notes">
Examples: No Regex example, as the command could be overwhelming even to see. Just talk about use-cases (e.g., cleaning phone number data).
</div>
## Interlude: Text
```{r Embedded_Code_Example, echo=FALSE, eval=TRUE, fig.cap=""}
data_to_process <- read.csv("Demographic_Data_of_Audience_Participants.csv")
# Create a nice plot. For color palettes, see http://www.r-bloggers.com/color-palettes-in-r/
barplot(
  data_to_process$Number_of_People, # Bar heights
  names.arg = data_to_process$Operating_System, # Labels
  col = cm.colors(4), # Colors
  main = NULL, # Main title
  xlab = "Operating System", # X label
  ylab = c("Number of people today,", format(Sys.Date(), format="%B %d, %Y")) # Y label.
)
```
## A Review...
* Anything can be tracked, but text files are best.
* Only track a file if... <span class="fragment">it's changing, and you want to track the changes</span><span class="fragment">, and you want people with whom you share the repository to get a copy of it.</span>
# The Command Line
## The Bare Essentials
* "Terminal" in Mac OS X and Linux. In Windows, [Cygwin](https://www.cygwin.com/) or [Git for Windows](https://msysgit.github.io/). (I haven't tested these commands in the Windows Command Prompt; `cd` exists there, but `ls` and `man` generally do not.)
* `ls` = **l**i**s**t files in a folder.
* `cd` = "Change Directory"
* `cd ./subfolder` = "Go into the folder called 'subfolder' in the current directory."
* `cd ../` = "Go one directory up."
* `man [command]` = "Show me the manual for this command."
* For example, `man ls` will show and explain all of `ls`'s options.
* `ls -a` will show hidden files (like the `.git` directory!)
* `ls -l` will show the files in a list format (with each file on one line)
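
To try these commands safely, here is a short sketch that works in a scratch folder. The folder and file names (`subfolder`, `visible.txt`, `.hidden_file`) are invented for the example.

```shell
set -eu

# Create a scratch folder with one subfolder, one normal file, and one hidden file.
sandbox="$(mktemp -d)"
cd "$sandbox"
mkdir subfolder
touch visible.txt .hidden_file

# List the files in the current folder (hidden files are left out).
plain_listing="$(ls)"
echo "$plain_listing"

# List everything, including hidden files (names starting with a dot).
all_listing="$(ls -a)"
echo "$all_listing"

# List with details, one file per line.
ls -l

# Go into the subfolder, then back up one level.
cd ./subfolder
folder_after_cd="$(basename "$PWD")"
cd ../
pwd
```

Nothing here deletes or overwrites anything, so it is a reasonable place to start experimenting.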
## The Bare Essentials
* This slide is to help you use Git at a basic level. <span class="fragment highlight-red">Do not use command-line commands that you don't know.</span> The command line doesn't ask "Are you sure?"
## `git clone`
![](images/github_screenshots/GitHub_Repo_Index_Page.png)
## `git clone`
![](images/github_screenshots/GitHub_Repo_Index_Page_with_Clone_URL_Highlighted.png)
## `git clone`
![](images/terminal_screenshots/blank_terminal.png)
## `git clone`
![](images/terminal_screenshots/terminal_before_git-clone.png)
## `git clone`
![](images/terminal_screenshots/terminal_after_git_clone.png)
## `git log`
![](images/terminal_screenshots/git-log_command.png)
## `git log`
![](images/terminal_screenshots/git-log_result.png)
(Press 'q' to quit the log viewer)
## `git log`
![](images/terminal_screenshots/git-log_result_hash_highlighted.png)
## `git show`
![](images/terminal_screenshots/git-show_command.png)
## `git show`
![](images/terminal_screenshots/git-show_result.png)
# SmartGit — Initial Setup
##
![Main Window menu customizations](images/smartgit_screenshots/smartgit_main_window_customize_menu.png)
##
![Log Window menu customizations](images/smartgit_screenshots/smartgit_log_window_customize_menu.png)
##
![Log Window — Click checkbox to see all branches](images/smartgit_screenshots/smartgit_log_window_show_all_branches.png)
## ![Optional — In Preferences, set up a Diff viewer that allows word-/line-wrapping](images/smartgit_screenshots/smartgit_preferences_diff_tool_customize_menu.png)
## ![Initial View](images/smartgit_screenshots/smartgit_Initial_View.png)
* See also `GitEye`, `gitk`, `git-cola`, and several other tools. SmartGit is the most straightforward I've found after quite a bit of looking.
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_ID_Highlighted.png)
SmartGit's Commit Log ("History")
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Message_Highlighted.png)
SmartGit's Commit Log ("History")
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Author_Highlighted.png)
SmartGit's Commit Log ("History")
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Commit1_Highlighted.png)
SmartGit's Commit Log ("History")
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Commit2_Highlighted.png)
SmartGit's Commit Log ("History")
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Commit3_Highlighted.png)
SmartGit's Commit Log ("History")
## ![](images/smartgit_screenshots/giteye_Staging_Area.png)
Main view when files have been modified (`git status`)
## ![](images/smartgit_screenshots/giteye_Staged_Changes.png)
Staging Changes (`git add example.txt`, or `git add .` for all changed files)
## ![](images/smartgit_screenshots/giteye_Commit_Dialog.png)
Committing Changes (`git commit -m "Commit message goes here"`)
## Branches
* By default, history in the `git log` is like the big trunk of a tree.
* That tree can have branches.
* "Branch" ≈ A hashtag given to a line of commits
* E.g., "New_Test_that_Might_Break_Everything"<span class="fragment">, "New_Feature"</span><span class="fragment">, "My_Part_of_the_Project"</span>
* `git branch` = See a list of available branches
* `git checkout [branchname]` = Go to the top of the branch called "branchname"
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Branchpoint1_Highlighted.png)
Branching (`git checkout -b NewBranchName`, or `git checkout ExistingBranchName`)
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Branchpoint2_Highlighted.png)
Branching (`git checkout -b NewBranchName`, or `git checkout ExistingBranchName`)
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Mergepoint1_Highlighted.png)
Merging (`git checkout MainBranchName; git merge OtherBranchName`)
## ![](images/smartgit_screenshots/smartgit_Branches_Example_Screenshot_Mergepoint2_Highlighted.png)
Merging (`git checkout MainBranchName; git merge OtherBranchName`)
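
The branching and merging commands can be sketched together in a scratch repository. The file name, identity, and commit messages are made up; the name of the main branch is looked up rather than assumed, since it may be `master` or something else depending on how Git is configured.

```shell
set -eu

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Example Author"       # made-up identity, local to this repository
git config user.email "author@example.org"

echo "Stable analysis." > analysis.txt
git add analysis.txt
git commit --quiet -m "Add stable analysis."

# Remember the name of the main branch.
main_branch="$(git symbolic-ref --short HEAD)"

# Create a new branch and switch to it, then commit something risky there.
git checkout --quiet -b New_Test_that_Might_Break_Everything
echo "Risky new test." >> analysis.txt
git add analysis.txt
git commit --quiet -m "Try a risky new test."

# List the available branches; the current one is marked with *.
git branch

# Back on the main branch, the file looks as it did before the risky commit.
git checkout --quiet "$main_branch"
lines_before_merge="$(grep -c "" analysis.txt)"

# Merge the other branch into the main branch.
git merge --quiet --no-edit New_Test_that_Might_Break_Everything
lines_after_merge="$(grep -c "" analysis.txt)"
```

Until the merge, the main branch is untouched by anything done on the other branch.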
## Comparing Files between Commits
* Command line: `git diff example.txt`, `git diff commitID1 commitID2`
* SmartGit: Right-click a file (optionally after Ctrl+clicking two commits to select them both) and click "Show Changes".
* If only one commit is selected, SmartGit will show the changes from the previous commit.
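
A sketch of both forms of `git diff` in a scratch repository. The file contents and identity are invented, and the commit IDs are captured into variables because they will be different on every machine.

```shell
set -eu

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Example Author"       # made-up identity, local to this repository
git config user.email "author@example.org"

echo "First version." > example.txt
git add example.txt
git commit --quiet -m "First version."
first_id="$(git rev-parse --short HEAD)"

echo "Second version." > example.txt
git add example.txt
git commit --quiet -m "Second version."
second_id="$(git rev-parse --short HEAD)"

# An edit that has not been staged or committed yet.
echo "Uncommitted tweak." >> example.txt

# Show what has changed in the file since it was last staged or committed.
git diff example.txt

# Show what changed between the two commits.
git diff "$first_id" "$second_id"

# Just the names of the files that differ between the two commits.
changed_files="$(git diff --name-only "$first_id" "$second_id")"
```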
# GitHub and Related Services
## What is GitHub?
* See also [BitBucket](https://bitbucket.org/ "BitBucket") and [GitLab](https://about.gitlab.com/ "GitLab")
## Public vs. Private
* GitHub: Public by default. Pay for private.
* BitBucket: More allowance for private, but still based on pay-for-private model.
* GitLab: Free, able to be hosted on-site. Can pay for support and extra features.
* **Both GitHub and BitBucket have academic/student accounts that allow for unlimited private repositories.**
## `git push` and `git pull`
* Once you've cloned a repository, all it takes to add your changes to it is to <span style="display:inline-block;" class="fragment grow">**"push"**</span> your local changes up to the server, with `git push https://github.com/url_of_the_repository`
* If you want to get what everyone else has been adding to the repository, you can <span style="display:inline-block;" class="fragment grow">**"pull"**</span> it down from the server, with `git pull` (Git should know where to pull it from automatically from when you cloned it).
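
Pushing and pulling can be tried without a GitHub account. In this sketch a local "bare" repository (`server.git`) stands in for the server, and two made-up collaborators, `alice` and `bob`, each have their own clone. `origin` is the name Git gives to the place a repository was cloned from.

```shell
set -eu

workspace="$(mktemp -d)"
cd "$workspace"

# A bare repository plays the role of the server (normally this would be a GitHub URL).
git init --quiet --bare server.git

# Alice clones the (still empty) repository; Git warns that it is empty, which is fine.
git clone --quiet server.git alice
cd "$workspace/alice"
git config user.name "Alice Example"        # made-up identity, local to this clone
git config user.email "alice@example.org"

echo "Alice's first note." > shared_notes.txt
git add shared_notes.txt
git commit --quiet -m "Add first note."
git push --quiet origin HEAD                # push the current branch up to the server

# Bob clones the repository and gets Alice's first note.
cd "$workspace"
git clone --quiet server.git bob

# Alice keeps working and pushes again.
cd "$workspace/alice"
echo "Alice's second note." >> shared_notes.txt
git add shared_notes.txt
git commit --quiet -m "Add second note."
git push --quiet origin HEAD

# Bob pulls to receive what Alice added since he cloned.
cd "$workspace/bob"
git pull --quiet
bob_line_count="$(grep -c "" shared_notes.txt)"
```

Bob's `git pull` needs no URL because his clone remembers where it came from.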
## The Reference Log (`git reflog`)
* Is a log of every change to where `HEAD` and your branches point (commits, checkouts, merges, resets, etc.).
* Needs to be run from the command line (SmartGit doesn't have a full interface for the reflog).
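* Exists on your local machine only, and entries expire after 30-90 days by default (90 days in general, 30 days for entries no longer reachable from the current branch tip).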
* Useful if you mess up a merge, etc.
## Resetting Changes
* To reset as if you hadn't done a command, you can use `git reset [IDNumber]` (the ID number can come from the `git log` or the `git reflog`).
* `git reset --hard 627c93d` will revert back to ID 627c93d, deleting anything you haven't committed yet.
* `git reset --soft 627c93d` will revert back to ID 627c93d, keeping anything you haven't committed yet.
* Using `git reset` will add a new entry to the Reference Log, so you can undo it, too! (Although you'd still lose any uncommitted changes)
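
A sketch of resetting and then undoing the reset, in a scratch repository with made-up contents. `HEAD@{1}` is the reflog's way of saying "where HEAD was one step ago"; the commit ID from `git reflog` would work just as well.

```shell
set -eu

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Example Author"       # made-up identity, local to this repository
git config user.email "author@example.org"

echo "Version 1" > draft.txt
git add draft.txt
git commit --quiet -m "Version 1"
first_id="$(git rev-parse --short HEAD)"

echo "Version 2" > draft.txt
git add draft.txt
git commit --quiet -m "Version 2"

# Go back to the first commit, discarding anything not yet committed.
git reset --quiet --hard "$first_id"
content_after_reset="$(cat draft.txt)"

# The reflog still remembers the second commit, plus the reset itself.
git reflog

# Undo the reset by going back to where HEAD was one step ago.
git reset --quiet --hard "HEAD@{1}"
content_after_undo="$(cat draft.txt)"
```

This only rescues committed work: anything that was never committed before a `--hard` reset is not in the reflog.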