Commit 188a57c

committed Mar 18, 2016
Update howto.
1 parent 57ec5a6 commit 188a57c

1 file changed

+125
-64
lines changed


‎doc/howto.md

+125-64
Original file line number | Diff line number | Diff line change
@@ -1,35 +1,14 @@
11
# Reading Google Sheets from Clojure
22

3-
One of the user stories I had to tackle in a recent sprint was to
4-
import data maintained by a non-technical staff member in a Google
5-
Spreadsheet into our analytics database. I quickly found a
6-
[Java API for Google Spreadsheets](https://developers.google.com/google-apps/spreadsheets/)
7-
that looked promising but turned out to be more tricky to get up and
8-
running than at first glance. In this article, I show you how to use
9-
this library from Clojure and avoid some of the pitfalls I fell into.
3+
One of the user stories I had to tackle in a recent sprint was to import data maintained by a non-technical staff member in a Google Spreadsheet into our analytics database. I quickly found a [Java API for Google Spreadsheets](https://developers.google.com/google-apps/spreadsheets/) that looked promising but turned out to be trickier to get up and running than it first appeared. In this article, I show you how to use this library from Clojure and avoid some of the pitfalls I fell into.
104

115
## Google Spreadsheets API
126

13-
The [GData Java client](https://github.com/google/gdata-java-client)
14-
referenced in the
15-
[Google Spreadsheets API documentation](https://developers.google.com/google-apps/spreadsheets/)
16-
uses an old XML-based protocol, which is mostly deprecated. We are
17-
recommended to use the newer,
18-
[JSON-based client](https://github.com/google/google-api-java-client).
19-
After chasing my tail on this, I discovered that Google Spreadsheets
20-
does not yet support this new API and we *do* need the GData client
21-
after all.
7+
The [GData Java client](https://github.com/google/gdata-java-client) referenced in the [Google Spreadsheets API documentation](https://developers.google.com/google-apps/spreadsheets/) uses an old XML-based protocol, which is mostly deprecated. The recommendation is to use the newer [JSON-based client](https://github.com/google/google-api-java-client) instead. After chasing my tail on this, I discovered that Google Spreadsheets does not yet support this new API and we *do* need the GData client after all.
228

239
## The first hurdle: dependencies
2410

25-
The GData Java client is not available from Maven, so we have to
26-
[download a zip archive](http://storage.googleapis.com/gdata-java-client-binaries/gdata-src.java-1.47.1.zip).
27-
The easiest way to use these from a Leiningen project is to use `mvn`
28-
to install the required jar files in our local repository and specify
29-
the dependencies in the usual way. This handy script automates the
30-
process, only downloading the archive if necessary. (For this project,
31-
we only need the `gdata-core` and `gdata-spreadsheet` jars, but the
32-
script is easily extended if you need other components.)
11+
The GData Java client is not available from Maven, so we have to [download a zip archive](http://storage.googleapis.com/gdata-java-client-binaries/gdata-src.java-1.47.1.zip). The easiest way to use these from a Leiningen project is to use `mvn` to install the required jar files in our local repository and specify the dependencies in the usual way. This handy script automates the process, only downloading the archive if necessary. (For this project, we only need the `gdata-core` and `gdata-spreadsheet` jars, but the script is easily extended if you need other components.)
3312

3413
#!/bin/bash
3514

@@ -78,21 +57,13 @@ Once we've installed these jars, we can configure dependencies as follows:
7857

7958
## The second hurdle: authentication
8059

81-
This is a pain, as the documentation for the GData Java client is
82-
incomplete and at times confusing, and the examples it ships with no
83-
longer work as they use a deprecated OAuth version. The example Java
84-
code in the documentation tells us:
60+
This is a pain, as the documentation for the GData Java client is incomplete and at times confusing, and the examples it ships with no longer work as they use a deprecated OAuth version. The example Java code in the documentation tells us:
8561

8662
``` Java
8763
// TODO: Authorize the service object for a specific user (see other sections)
8864
```
8965

90-
The other sections were no more enlightening, but after more digging
91-
and reading of source code, I realised we can use the
92-
`google-api-client` to manage our OAuth credentials and simply pass
93-
that credentials object to the GData client. This library is already
94-
available from a central Maven repository, so we can simply update our
95-
project's dependencies to pull it in:
66+
The other sections were no more enlightening, but after more digging and reading of source code, I realised we can use the `google-api-client` to manage our OAuth credentials and simply pass that credentials object to the GData client. This library is already available from a central Maven repository, so we can simply update our project's dependencies to pull it in:
9667

9768
:dependencies [[org.clojure/clojure "1.8.0"]
9869
[com.google.api-client/google-api-client "1.21.0"]
@@ -101,26 +72,14 @@ project's dependencies to pull it in:
10172

10273
## OAuth credentials
10374

104-
Before we can start using OAuth, we have to register our client with
105-
Google. This is done via the
106-
[Google Developers Console](https://console.developers.google.com/).
107-
See
108-
[Using OAuth 2.0 to Access Google APIs](https://developers.google.com/identity/protocols/OAuth2).
109-
for full details, but here's a quick-start guide to creating a
110-
service.
75+
Before we can start using OAuth, we have to register our client with Google. This is done via the
76+
[Google Developers Console](https://console.developers.google.com/). See [Using OAuth 2.0 to Access Google APIs](https://developers.google.com/identity/protocols/OAuth2) for full details, but here's a quick-start guide to creating credentials for a service account.
11177

112-
Click on *Enable and manage APIs* and select *Create a new project*.
113-
Enter the project name and click *Create*.
78+
Navigate to the [Developers Console](https://console.developers.google.com/). Click on *Enable and manage APIs* and select *Create a new project*. Enter the project name and click *Create*.
11479

115-
Once project is created, click on *Credentials* in the sidebar, then
116-
the *Create Credentials* drop-down. As our client is going to run from
117-
cron, we want to enable server-to-server authentication, so select
118-
*Service account key*. On the next screen, select *New service
119-
account* and enter a name. Make sure the *JSON* radio button is
120-
selected, then click on *Create*.
80+
Once the project is created, click on *Credentials* in the sidebar, then the *Create Credentials* drop-down. As our client is going to run from cron, we want to enable server-to-server authentication, so select *Service account key*. On the next screen, select *New service account* and enter a name. Make sure the *JSON* radio button is selected, then click on *Create*.
12181

122-
Copy the downloaded JSON file into your project's `resources`
123-
directory. It should look something like:
82+
Copy the downloaded JSON file into your project's `resources` directory. It should look something like:
12483

12584
{
12685
"type": "service_account",
@@ -135,15 +94,11 @@ directory. It should look something like:
13594
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/gsheets-demo%40gsheetdemo.iam.gserviceaccount.com"
13695
}
13796

138-
We'll use this in a moment to create a `GoogleCredential` object, but
139-
before that navigate to Google Sheets and create a test spreadsheet.
140-
Grant read access to the spreadsheet to the email address found in
141-
`client_email` in your downloaded credentials.
97+
We'll use this in a moment to create a `GoogleCredential` object, but before that navigate to Google Sheets and create a test spreadsheet. Grant read access to the spreadsheet to the email address found in `client_email` in your downloaded credentials.
14298

14399
## A simple Google Spreadsheets client
144100

145-
We're going to be using a Java client, so it should come as no
146-
surprise that our namespace imports a lot of Java classes:
101+
We're going to be using a Java client, so it should come as no surprise that our namespace imports a lot of Java classes:
147102

148103
(ns gsheets-demo.core
149104
(:require [clojure.java.io :as io])
@@ -157,21 +112,127 @@ surprise that our namespace imports a lot of Java classes:
157112
java.net.URL
158113
java.util.Collections))
159114

115+
We start by defining some constants for our application. The credentials resource is the JSON file we downloaded from the developer console:
160116

117+
(def application-name "gsheetdemo-v0.0.1")
161118

119+
(def credentials-resource (io/resource "GSheetDemo-041db3d758a1.json"))
162120

163-
Github Repo:
121+
(def oauth-scope "https://spreadsheets.google.com/feeds")
164122

165-
https://github.com/google/gdata-java-client
123+
(def spreadsheet-feed-url (URL. "https://spreadsheets.google.com/feeds/spreadsheets/private/full"))
166124

167-
New API:
125+
With this in hand, we can create a `GoogleCredential` object and initialize the Google Sheets service:
168126

127+
(defn get-credential
128+
[]
129+
(with-open [in (io/input-stream credentials-resource)]
130+
(let [credential (GoogleCredential/fromStream in)]
131+
(.createScoped credential (Collections/singleton oauth-scope)))))
169132

133+
(defn init-service
134+
[]
135+
(let [credential (get-credential)
136+
service (SpreadsheetService. application-name)]
137+
(.setOAuth2Credentials service credential)
138+
service))
170139

171-
Download:
140+
Let's try it at a REPL:
172141

173-
http://storage.googleapis.com/gdata-java-client-binaries/gdata-src.java-1.47.1.zip
142+
lein repl
174143

175-
Samples:
144+
user=> (require '[gsheets-demo.core :as gsheets])
145+
nil
146+
user=> (def service (gsheets/init-service))
147+
#'user/service
148+
user=> (.getEntries (.getFeed service
149+
gsheets/spreadsheet-feed-url
150+
com.google.gdata.data.spreadsheet.SpreadsheetFeed))
151+
(#object[com.google.gdata.data.spreadsheet.SpreadsheetEntry 0x43ab2a3e "com.google.gdata.data.spreadsheet.SpreadsheetEntry@43ab2a3e"])
176152

177-
http://storage.googleapis.com/gdata-java-client-binaries/gdata-samples.java-1.47.1.zip
153+
Great! We can see the one spreadsheet to which we granted our service account read access. Let's wrap this up in a function and implement a helper to find a spreadsheet by name:
154+
155+
(defn list-spreadsheets
156+
[service]
157+
(.getEntries (.getFeed service spreadsheet-feed-url SpreadsheetFeed)))
158+
159+
(defn find-spreadsheet-by-title
160+
[service title]
161+
(let [spreadsheets (filter (fn [sheet] (= (.getPlainText (.getTitle sheet)) title))
162+
(list-spreadsheets service))]
163+
(if (= (count spreadsheets) 1)
164+
(first spreadsheets)
165+
(throw (Exception. (format "Found %d spreadsheets with name %s"
166+
(count spreadsheets)
167+
title))))))
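The "exactly one match, otherwise throw" behaviour of this helper can be tried without a live service. The following is a minimal sketch, assuming plain maps with a `:title` key in place of `SpreadsheetEntry` objects; `find-one-by-title` and `entries` are hypothetical names used only for illustration and are not part of the demo project.

```clojure
;; Sketch only: plain maps stand in for SpreadsheetEntry objects,
;; and :title stands in for (.getPlainText (.getTitle entry)).
(defn find-one-by-title
  [entries title]
  (let [matches (filter #(= (:title %) title) entries)]
    (if (= (count matches) 1)
      (first matches)
      ;; Zero matches and multiple matches both end up here.
      (throw (ex-info (format "Found %d entries with title %s"
                              (count matches) title)
                      {:title title :count (count matches)})))))

(def entries
  [{:title "Colour Counts"} {:title "Budget"} {:title "Budget"}])

(find-one-by-title entries "Colour Counts")
;; => {:title "Colour Counts"}
```

A helper written this way throws rather than returning `nil` when nothing matches, so a caller that wraps it in `if-let` would not reach its else branch.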
168+
169+
Back at the REPL:
170+
171+
user=> (def spreadsheet (gsheets/find-spreadsheet-by-title service "Colour Counts"))
172+
user=> (.getPlainText (.getTitle spreadsheet))
173+
"Colour Counts"
174+
175+
A spreadsheet contains one or more worksheets, so the next functions we implement take a `SpreadsheetEntry` object and list or search worksheets:
176+
177+
(defn list-worksheets
178+
[service spreadsheet]
179+
(.getEntries (.getFeed service (.getWorksheetFeedUrl spreadsheet) WorksheetFeed)))
180+
181+
(defn find-worksheet-by-title
182+
[service spreadsheet title]
183+
(let [worksheets (filter (fn [ws] (= (.getPlainText (.getTitle ws)) title))
184+
(list-worksheets service spreadsheet))]
185+
(if (= (count worksheets) 1)
186+
(first worksheets)
187+
(throw (Exception. (format "Found %d worksheets in %s with name %s"
188+
(count worksheets)
189+
spreadsheet
190+
title))))))
191+
192+
...and at the REPL:
193+
194+
user=> (def worksheets (gsheets/list-worksheets service spreadsheet))
195+
user=> (map (fn [ws] (.getPlainText (.getTitle ws))) worksheets)
196+
("Sheet1")
197+
198+
Our next function returns the cells belonging to a worksheet:
199+
200+
(defn get-cells
201+
[service worksheet]
202+
(map (memfn getCell) (.getEntries (.getFeed service (.getCellFeedUrl worksheet) CellFeed))))
203+
204+
This gives us a flat list of `Cell` objects. It will be much more convenient to work in Clojure with a nested vector of the cell values:
205+
206+
(defn to-nested-vec
207+
[cells]
208+
(mapv (partial mapv (memfn getValue)) (partition-by (memfn getRow) cells)))
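To see what the `partition-by` step does, here is a small sketch that runs without the Java client. It assumes plain maps in place of `Cell` objects, with `:row` and `:value` standing in for `getRow` and `getValue`; `to-nested-vec-sketch` and `cells` are hypothetical names for illustration only.

```clojure
;; Sketch only: maps replace the Java Cell objects.
(def cells
  [{:row 1 :value "Colour"} {:row 1 :value "Count"}
   {:row 2 :value "red"}    {:row 2 :value "123"}])

(defn to-nested-vec-sketch
  [cells]
  ;; partition-by starts a new group each time the row number changes,
  ;; so the cells must already be ordered by row.
  (mapv (partial mapv :value) (partition-by :row cells)))

(to-nested-vec-sketch cells)
;; => [["Colour" "Count"] ["red" "123"]]
```

Each inner vector holds only the cells that were present for that row, so rows with missing cells would come out shorter than the others.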
209+
210+
We now have all the building blocks for the function that will be the main entry point to our minimal Clojure API:
211+
212+
(defn fetch-worksheet
213+
[service {spreadsheet-title :spreadsheet worksheet-title :worksheet}]
214+
(if-let [spreadsheet (find-spreadsheet-by-title service spreadsheet-title)]
215+
(if-let [worksheet (find-worksheet-by-title service spreadsheet worksheet-title)]
216+
(to-nested-vec (get-cells service worksheet))
217+
(throw (Exception. (format "Spreadsheet '%s' has no worksheet '%s'"
218+
spreadsheet-title worksheet-title))))
219+
(throw (Exception. (format "Spreadsheet '%s' not found" spreadsheet-title)))))
220+
221+
With this in hand:
222+
223+
user=> (def sheet (gsheets/fetch-worksheet service {:spreadsheet "Colour Counts" :worksheet "Sheet1"}))
224+
#'user/sheet
225+
user=> (clojure.pprint/pprint sheet)
226+
[["Colour" "Count"]
227+
["red" "123"]
228+
["orange" "456"]
229+
["yellow" "789"]
230+
["green" "101112"]
231+
["blue" "131415"]
232+
["indigo" "161718"]
233+
["violet" "192021"]]
234+
nil
235+
236+
Our `to-nested-vec` function returns the cell values as strings. I could have used the `getNumericValue` method instead of `getValue`, but then `to-nested-vec` would have to know what data type to expect in each cell. Instead, I used [Plumatic Schema](https://github.com/plumatic/schema) to define a schema for each row, and used its [data coercion](http://plumatic.github.io/schema-0-2-0-back-with-clojurescript-data-coercion/) features to coerce each column to the desired data type - but that's a blog post for another day.
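As a rough idea of what per-column coercion involves, here is a hand-rolled sketch that does not use Schema at all. The `coerce-rows` function and the choice of parsers are assumptions made for illustration, not the approach used in the real project.

```clojure
;; Sketch only: a stand-in for Schema-based coercion.
(defn coerce-rows
  "Drops the header row and applies one parser per column to each data row."
  [parsers rows]
  (mapv (fn [row] (mapv (fn [parse v] (parse v)) parsers row))
        (rest rows)))

(def sheet
  [["Colour" "Count"]
   ["red" "123"]
   ["orange" "456"]])

(coerce-rows [keyword #(Long/parseLong %)] sheet)
;; => [[:red 123] [:orange 456]]
```

Unlike a schema, this sketch does no validation: a non-numeric string in the second column would simply throw a `NumberFormatException`.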
237+
238+
Code for the examples above is available on GitHub at <https://github.com/ray1729/gsheets-demo>. We have barely scratched the surface of the Google Spreadsheets API; check out the [API Documentation](https://developers.google.com/google-apps/spreadsheets/) if you need to extend this code, for example to create or update spreadsheets.
