PDFTK(1) PDFTK(1)
NAME
pdftk - A handy tool for manipulating PDF
SYNOPSIS
pdftk <input PDF files | - | PROMPT>
[ input_pw <input PDF owner passwords | PROMPT> ]
[ <operation> <operation arguments> ]
[ output <output filename | - | PROMPT> ]
[ encrypt_40bit | encrypt_128bit ]
[ allow <permissions> ]
[ owner_pw <owner password | PROMPT> ]
[ user_pw <user password | PROMPT> ]
[ flatten ] [ compress | uncompress ]
[ keep_first_id | keep_final_id ] [ drop_xfa ]
[ verbose ] [ dont_ask | do_ask ]
Where:
<operation> may be empty, or:
[ cat | shuffle | burst |
generate_fdf | fill_form |
background | multibackground |
stamp | multistamp |
dump_data | dump_data_utf8 |
dump_data_fields | dump_data_fields_utf8 |
update_info | update_info_utf8 |
attach_files | unpack_files ]
For Complete Help: pdftk --help
DESCRIPTION
If PDF is electronic paper, then pdftk is an electronic staple-remover,
hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a
simple tool for doing everyday things with PDF documents. Use it to:
* Merge PDF Documents or Collate PDF Page Scans
* Split PDF Pages into a New Document
* Rotate PDF Documents or Pages
* Decrypt Input as Necessary (Password Required)
* Encrypt Output as Desired
* Fill PDF Forms with X/FDF Data and/or Flatten Forms
* Generate FDF Data Stencils from PDF Forms
* Apply a Background Watermark or a Foreground Stamp
* Report PDF Metrics such as Metadata and Bookmarks
* Update PDF Metadata
* Attach Files to PDF Pages or the PDF Document
* Unpack PDF Attachments
* Burst a PDF Document into Single Pages
* Uncompress and Re-Compress Page Streams
* Repair Corrupted PDF (Where Possible)
OPTIONS
A summary of options is included below.
--help, -h
Show summary of options.
<input PDF files | - | PROMPT>
A list of the input PDF files. If you plan to combine these PDFs
(without using handles) then list files in the order you want
them combined. Use - to pass a single PDF into pdftk via stdin.
Input files can be associated with handles, where a handle is a
single, upper-case letter:
<input PDF handle>=<input PDF filename>
Handles are often omitted. They are useful when specifying PDF
passwords or page ranges, later.
For example: A=input1.pdf B=input2.pdf
[input_pw <input PDF owner passwords | PROMPT>]
Input PDF owner passwords, if necessary, are associated with
files by using their handles:
<input PDF handle>=<input PDF file owner password>
If handles are not given, then passwords are associated with
input files by order.
Most pdftk features require that encrypted input PDFs are accom-
panied by the ~owner~ password. If the input PDF has no owner
password, then the user password must be given, instead. If the
input PDF has no passwords, then no password should be given.
When running in do_ask mode, pdftk will prompt you for a pass-
word if the supplied password is incorrect or none was given.
[<operation> <operation arguments>]
If this optional argument is omitted, then pdftk runs in 'fil-
ter' mode. Filter mode takes only one PDF input and creates a
new PDF after applying all of the output options, like encryp-
tion and compression.
Available operations are: cat, shuffle, burst, generate_fdf,
fill_form, background, multibackground, stamp, multistamp,
dump_data, dump_data_utf8, dump_data_fields,
dump_data_fields_utf8, update_info, update_info_utf8,
attach_files, unpack_files. Some operations take additional
arguments, described below.
cat [<page ranges>]
Catenates pages from input PDFs to create a new PDF. Page
order in the new PDF is specified by the order of the given
page ranges. Page ranges are described like this:
<input PDF handle>[<begin page number>[-<end page num-
ber>[<qualifier>]]][<page rotation>]
Where the handle identifies one of the input PDF files, and
the beginning and ending page numbers are one-based refer-
ences to pages in the PDF file, and the qualifier can be even
or odd, and the page rotation can be N, S, E, W, L, R, or D.
If the handle is omitted from the page range, then the pages
are taken from the first input PDF.
The even qualifier causes pdftk to use only the even-numbered
PDF pages, so 1-6even yields pages 2, 4 and 6 in that order.
6-1even yields pages 6, 4 and 2 in that order.
The odd qualifier works similarly to the even.
The page rotation setting can cause pdftk to rotate pages and
documents. Each option sets the page rotation as follows (in
degrees): N: 0, E: 90, S: 180, W: 270, L: -90, R: +90, D:
+180. L, R, and D make relative adjustments to a page's rota-
tion.
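The distinction between absolute and relative rotation settings can be sketched in Python. This is only an illustration of the rule described above, not pdftk's implementation; the function name apply_rotation is hypothetical, and wrapping the result into the range 0-359 is an assumption.

```python
# Absolute letters replace the page's rotation; relative letters adjust it.
ABSOLUTE = {"N": 0, "E": 90, "S": 180, "W": 270}
RELATIVE = {"L": -90, "R": 90, "D": 180}


def apply_rotation(current, letter):
    """Return the page rotation in degrees after applying a rotation letter.

    Assumption: results are normalized to the range 0-359.
    """
    if letter in ABSOLUTE:
        return ABSOLUTE[letter]
    if letter in RELATIVE:
        return (current + RELATIVE[letter]) % 360
    raise ValueError(f"unknown rotation letter: {letter!r}")


print(apply_rotation(90, "W"))   # 270 (absolute: previous rotation is ignored)
print(apply_rotation(90, "L"))   # 0   (relative: 90 - 90)
print(apply_rotation(270, "D"))  # 90  (relative: 270 + 180, wrapped)
```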
If no arguments are passed to cat, then pdftk combines all
input PDFs in the order they were given to create the output.
NOTES:
* <end page number> may be less than <begin page number>.
* The keyword end may be used to reference the final page of
a document instead of a page number.
* Reference a single page by omitting the ending page number.
* The handle may be used alone to represent the entire PDF
document, e.g., B1-end is the same as B.
Page Range Examples w/o Handles:
1-endE - rotate entire document 90 degrees
5 11 20 - take single pages from input PDF
5-25oddW - take odd pages in range, set rotation to 270 degrees
6-1 - reverse pages in range from input PDF
Page Range Examples Using Handles:
Say A=in1.pdf B=in2.pdf, then:
A1-21 - take range from in1.pdf
Bend-1odd - take all odd pages from in2.pdf in reverse order
A72 - take a single page from in1.pdf
A1-21 Beven A72 - assemble pages from both in1.pdf and
in2.pdf
AW - set rotation of entire in1.pdf document to 270 degrees
B - use all of in2.pdf
A2-30evenL - take the even pages from the range, remove 90
degrees from each page's rotation
A A - catenate in1.pdf with in1.pdf
AevenW AoddE - apply rotations to even pages, odd pages from
in1.pdf
AW BW BD - catenate rotated documents
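To make the page range syntax concrete, here is a minimal Python sketch that expands one range into a handle, a list of page numbers, and a rotation letter. It models the rules and examples above and is not pdftk's own parser. The name expand_range and its page_count parameter are hypothetical, and a lone upper-case letter is assumed to be a handle rather than a rotation.

```python
import re

# <handle>[<begin>[-<end>]][<qualifier>][<rotation>]
RANGE_RE = re.compile(
    r"^(?P<handle>[A-Z]?)"
    r"(?:(?P<begin>\d+|end)(?:-(?P<end>\d+|end))?)?"
    r"(?P<qual>even|odd)?"
    r"(?P<rot>[NSEWLRD]?)$"
)


def expand_range(spec, page_count):
    """Expand one page range against a document with page_count pages."""
    m = RANGE_RE.match(spec)
    if m is None:
        raise ValueError(f"bad page range: {spec!r}")

    def page(token):
        # The keyword "end" refers to the final page of the document.
        return page_count if token == "end" else int(token)

    if m["begin"] is None:
        first, last = 1, page_count      # handle alone means the whole PDF
    else:
        first = page(m["begin"])
        last = page(m["end"]) if m["end"] else first  # single page
    if not (1 <= first <= page_count and 1 <= last <= page_count):
        raise ValueError(f"page out of bounds in {spec!r}")

    step = 1 if first <= last else -1    # end may be less than begin
    pages = list(range(first, last + step, step))
    if m["qual"] == "even":
        pages = [p for p in pages if p % 2 == 0]
    elif m["qual"] == "odd":
        pages = [p for p in pages if p % 2 == 1]
    return m["handle"], pages, m["rot"]


print(expand_range("1-6even", 10))   # ('', [2, 4, 6], '')
print(expand_range("Bend-1odd", 5))  # ('B', [5, 3, 1], '')
print(expand_range("AW", 3))         # ('A', [1, 2, 3], 'W')
```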
shuffle [<page ranges>]
Collates pages from input PDFs to create a new PDF. Works
like the cat operation except that it takes one page at a
time from each page range to assemble the output PDF. If one
range runs out of pages, it continues with the remaining
ranges. Ranges can use all of the features described above
for cat, like reverse page ranges, multiple ranges from a
single PDF, and page rotation. This feature was designed to
help collate PDF pages after scanning paper documents.
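The collation order described above can be sketched in Python as a round-robin over page lists. This only illustrates the ordering, not pdftk's implementation. The function name shuffle_pages and the page labels are hypothetical.

```python
from itertools import zip_longest


def shuffle_pages(*ranges):
    """Take one page at a time from each range; exhausted ranges are skipped."""
    missing = object()  # sentinel marking a range that has run out of pages
    return [
        page
        for group in zip_longest(*ranges, fillvalue=missing)
        for page in group
        if page is not missing
    ]


print(shuffle_pages(["A1", "A2", "A3"], ["B1", "B2"]))
# ['A1', 'B1', 'A2', 'B2', 'A3']
```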
burst Splits a single, input PDF document into individual pages.
Also creates a report named doc_data.txt which is the same as
the output from dump_data. If the output section is omitted,
then PDF pages are named: pg_%04d.pdf, e.g.: pg_0001.pdf,
pg_0002.pdf, etc. To name these pages yourself, supply a
printf-styled format string via the output section. For
example, if you want pages named: page_01.pdf, page_02.pdf,
etc., pass output page_%02d.pdf to pdftk. Encryption can be
applied to the output by appending output options such as
owner_pw, e.g.:
pdftk in.pdf burst owner_pw foopass
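To preview the filenames a given format string would produce, Python's % operator can stand in for printf, since both treat %04d and %02d the same way. This is a sketch only. The helper name burst_names is hypothetical, and it assumes page numbering starts at 1, as in the pg_0001.pdf example above.

```python
def burst_names(pattern, page_count):
    """List the per-page filenames for a printf-style pattern."""
    return [pattern % n for n in range(1, page_count + 1)]


print(burst_names("pg_%04d.pdf", 2))    # ['pg_0001.pdf', 'pg_0002.pdf']
print(burst_names("page_%02d.pdf", 2))  # ['page_01.pdf', 'page_02.pdf']
```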
generate_fdf
Reads a single input PDF file and generates an FDF file
suitable for fill_form, writing it to the given output filename
or (if no output is given) to stdout. Does not create a new
PDF.
fill_form <FDF data filename | XFDF data filename | - | PROMPT>
Fills the single input PDF's form fields with the data from
an FDF file, XFDF file or stdin. Enter the data filename
after fill_form, or use - to pass the data via stdin, like
so:
pdftk form.pdf fill_form data.fdf output form.filled.pdf
After filling a form, the form fields remain interactive
unless you also use the flatten output option. flatten merges
the form fields with the PDF pages. You can use flatten
alone, too, but only on a single PDF:
pdftk form.pdf fill_form data.fdf output out.pdf flatten
or:
pdftk form.filled.pdf output out.pdf flatten
If the input FDF file includes Rich Text formatted data in
addition to plain text, then the Rich Text data is packed
into the form fields as well as the plain text. Pdftk also
sets a flag that cues Acrobat/Reader to generate new field
appearances based on the Rich Text data. That way, when the
user opens the PDF, the viewer will create the Rich Text
fields on the spot. If the user's PDF viewer does not sup-
port Rich Text, then the user will see the plain text data
instead. If you flatten this form before Acrobat has a
chance to create (and save) new field appearances, then the
plain text field data is what you'll see.
background <background PDF filename | - | PROMPT>
Applies a PDF watermark to the background of a single input
PDF. Pass the background PDF's filename after background
like so:
pdftk in.pdf background back.pdf output out.pdf
Pdftk uses only the first page from the background PDF and
applies it to every page of the input PDF. This page is
scaled and rotated as needed to fit the input page. You can
use - to pass a background PDF into pdftk via stdin.
If the input PDF does not have a transparent background (such
as a PDF created from page scans) then the resulting back-
ground won't be visible -- use the stamp operation instead.
multibackground <background PDF filename | - | PROMPT>
Same as the background operation, but applies each page of
the background PDF to the corresponding page of the input
PDF. If the input PDF has more pages than the background PDF,
then the final background page is repeated across these
remaining pages in the input PDF.
stamp <stamp PDF filename | - | PROMPT>
This behaves just like the background operation except it
overlays the stamp PDF page on top of the input PDF docu-
ment's pages. This works best if the stamp PDF page has a
transparent background.
multistamp <stamp PDF filename | - | PROMPT>
Same as the stamp operation, but applies each page of the
stamp PDF to the corresponding page of the input PDF.
If the input PDF has more pages than the stamp PDF, then the
final stamp page is repeated across these remaining pages in
the input PDF.
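The page pairing used by multistamp (and likewise multibackground) can be expressed in a line of Python. This is a sketch of the stated rule, not pdftk's code. The function name stamp_page_for is hypothetical, and page numbers are one-based.

```python
def stamp_page_for(input_page, stamp_page_count):
    """Return the stamp page applied to a given input page (both one-based)."""
    # Past the end of the stamp PDF, its final page is repeated.
    return min(input_page, stamp_page_count)


# A 5-page input stamped with a 3-page stamp PDF:
print([stamp_page_for(p, 3) for p in range(1, 6)])  # [1, 2, 3, 3, 3]
```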
dump_data
Reads a single, input PDF file and reports various statis-
tics, metadata, bookmarks (a/k/a outlines), and page labels
to the given output filename or (if no output is given) to
stdout. Non-ASCII characters are encoded as XML numerical
entities. Does not create a new PDF.
dump_data_utf8
Same as dump_data except that the output is encoded as
UTF-8.
dump_data_fields
Reads a single, input PDF file and reports form field statis-
tics to the given output filename or (if no output is given)
to stdout. Non-ASCII characters are encoded as XML numerical
entities. Does not create a new PDF.
dump_data_fields_utf8
Same as dump_data_fields except that the output is encoded
as UTF-8.
update_info <info data filename | - | PROMPT>
Changes the metadata stored in a single PDF's Info dictionary
to match the input data file. The input data file uses the
same syntax as the output from dump_data. Non-ASCII charac-
ters should be encoded as XML numerical entities. This does
not change the metadata stored in the PDF's XMP stream, if it
has one. For example:
pdftk in.pdf update_info in.info output out.pdf
update_info_utf8 <info data filename | - | PROMPT>
Same as update_info except that the input is encoded as
UTF-8.
attach_files <attachment filenames | PROMPT> [to_page <page number |
PROMPT>]
Packs arbitrary files into a PDF using PDF's file attachment
features. More than one attachment may be listed after
attach_files. Attachments are added at the document level
unless the optional to_page option is given, in which case
the files are attached to the given page number (the first
page is 1, the final page is end). For example:
pdftk in.pdf attach_files table1.html table2.html to_page 6
output out.pdf
unpack_files
Copies all of the attachments from the input PDF into the
current folder or to an output directory given after output.
For example:
pdftk report.pdf unpack_files output ~/atts/
or, interactively:
pdftk report.pdf unpack_files output PROMPT
[output <output filename | - | PROMPT>]
The output PDF filename may not be set to the name of an input
filename. Use - to output to stdout. When using the dump_data
operation, use output to set the name of the output data file.
When using the unpack_files operation, use output to set the
name of an output directory. When using the burst operation,
you can use output to control the resulting PDF page filenames
(described above).
[encrypt_40bit | encrypt_128bit]
If an output PDF user or owner password is given, output PDF
encryption strength defaults to 128 bits. This can be overrid-
den by specifying encrypt_40bit.
[allow <permissions>]
Permissions are applied to the output PDF only if an encryption
strength is specified or an owner or user password is given. If
permissions are not specified, they default to 'none,' which
means all of the following features are disabled.
The permissions section may include one or more of the following
features:
Printing
Top Quality Printing
DegradedPrinting
Lower Quality Printing
ModifyContents
Also allows Assembly
Assembly
CopyContents
Also allows ScreenReaders
ScreenReaders
ModifyAnnotations
Also allows FillIn
FillIn
AllFeatures
Allows the user to perform all of the above, and top
quality printing.
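The implied permissions listed above can be summarized with a small Python sketch. It encodes only the relationships stated in this list and is not taken from pdftk's source. The function name effective_permissions is hypothetical.

```python
# Features that also allow another feature, as listed above.
IMPLIES = {
    "ModifyContents": {"Assembly"},
    "CopyContents": {"ScreenReaders"},
    "ModifyAnnotations": {"FillIn"},
}
ALL_FEATURES = {
    "Printing", "DegradedPrinting", "ModifyContents", "Assembly",
    "CopyContents", "ScreenReaders", "ModifyAnnotations", "FillIn",
}


def effective_permissions(allowed):
    """Return the set of features enabled by the names given after allow."""
    granted = set()  # the default is 'none': every feature disabled
    for name in allowed:
        if name == "AllFeatures":
            return set(ALL_FEATURES)
        if name not in ALL_FEATURES:
            raise ValueError(f"unknown permission: {name!r}")
        granted.add(name)
        granted |= IMPLIES.get(name, set())
    return granted


print(sorted(effective_permissions(["Printing", "CopyContents"])))
# ['CopyContents', 'Printing', 'ScreenReaders']
```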
[owner_pw <owner password | PROMPT>]
[user_pw <user password | PROMPT>]
If an encryption strength is given but no passwords are sup-
plied, then the owner and user passwords remain empty, which
means that the resulting PDF may be opened and its security
parameters altered by anybody.
[compress | uncompress]
These are only useful when you want to edit PDF code in a text
editor like vim or emacs. Remove PDF page stream compression by
applying the uncompress filter. Use the compress filter to
restore compression.
[flatten]
Use this option to merge an input PDF's interactive form fields
(and their data) with the PDF's pages. Only one input PDF may be
given. Sometimes used with the fill_form operation.
[keep_first_id | keep_final_id]
When combining pages from multiple PDFs, use one of these
options to copy the document ID from either the first or final
input document into the new output PDF. Otherwise pdftk creates
a new document ID for the output PDF. When no operation is
given, pdftk always uses the ID from the (single) input PDF.
[drop_xfa]
If your input PDF is a form created using Acrobat 7 or Adobe
Designer, then it probably has XFA data. Filling such a form
using pdftk yields a PDF with data that fails to display in
Acrobat 7 (and 6?). The workaround solution is to remove the
form's XFA data, either before you fill the form using pdftk or
at the time you fill the form. Using this option causes pdftk to
omit the XFA data from the output PDF form.
This option is only useful when running pdftk on a single input
PDF. When assembling a PDF from multiple inputs using pdftk,
any XFA data in the input is automatically omitted.
[verbose]
By default, pdftk runs quietly. Append verbose to the end and it
will speak up.
[dont_ask | do_ask]
Depending on the compile-time settings (see ASK_ABOUT_WARNINGS),
pdftk might prompt you for further input when it encounters a
problem, such as a bad password. Override this default behavior
by adding dont_ask (so pdftk won't ask you what to do) or do_ask
(so pdftk will ask you what to do).
When running in dont_ask mode, pdftk will over-write files with
its output without notice.
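When pdftk is called from a script, it can help to set the prompting mode explicitly. The Python sketch below only builds the argument list for a cat operation. The helper name cat_command is hypothetical, and the check that rejects an output name matching an input reflects the rule given for the output section.

```python
def cat_command(inputs, output, ask=False):
    """Build a pdftk argument list that joins inputs into output."""
    if output in inputs:
        # The output filename may not be the name of an input file.
        raise ValueError("output must differ from every input filename")
    mode = "do_ask" if ask else "dont_ask"
    return ["pdftk", *inputs, "cat", "output", output, mode]


print(cat_command(["in1.pdf", "in2.pdf"], "out1.pdf"))
# ['pdftk', 'in1.pdf', 'in2.pdf', 'cat', 'output', 'out1.pdf', 'dont_ask']
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a system where pdftk is installed. Because dont_ask overwrites existing files without notice, a script may want to confirm that the output path is safe to replace first.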
EXAMPLES
Collate scanned pages
pdftk A=even.pdf B=odd.pdf shuffle A B output collated.pdf
or if odd.pdf is in reverse order:
pdftk A=even.pdf B=odd.pdf shuffle A Bend-1 output collated.pdf
Decrypt a PDF
pdftk secured.pdf input_pw foopass output unsecured.pdf
Encrypt a PDF using 128-bit strength (the default), withhold all per-
missions (the default)
pdftk 1.pdf output 1.128.pdf owner_pw foopass
Same as above, except password 'baz' must also be used to open output
PDF
pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz
Same as above, except printing is allowed (once the PDF is open)
pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing
Join in1.pdf and in2.pdf into a new PDF, out1.pdf
pdftk in1.pdf in2.pdf cat output out1.pdf
or (using handles):
pdftk A=in1.pdf B=in2.pdf cat A B output out1.pdf
or (using wildcards):
pdftk *.pdf cat output combined.pdf
Remove 'page 13' from in1.pdf to create out1.pdf
pdftk in1.pdf cat 1-12 14-end output out1.pdf
or:
pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf
Apply 40-bit encryption to output, revoking all permissions (the
default). Set the owner PW to 'foopass'.
pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass
Join two files, one of which requires the password 'foopass'. The out-
put is not encrypted.
pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf
Uncompress PDF page streams for editing the PDF in a text editor (e.g.,
vim, emacs)
pdftk doc.pdf output doc.unc.pdf uncompress
Repair a PDF's corrupted XREF table and stream lengths, if possible
pdftk broken.pdf output fixed.pdf
Burst a single PDF document into pages and dump its data to
doc_data.txt
pdftk in.pdf burst
Burst a single PDF document into encrypted pages. Allow low-quality
printing
pdftk in.pdf burst owner_pw foopass allow DegradedPrinting
Write a report on PDF document metadata and bookmarks to report.txt
pdftk in.pdf dump_data output report.txt
Rotate the first PDF page to 90 degrees clockwise
pdftk in.pdf cat 1E 2-end output out.pdf
Rotate an entire PDF document to 180 degrees
pdftk in.pdf cat 1-endS output out.pdf
NOTES
The pdftk home page permalink is:
http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
The easy-to-remember shortcut is: www.pdftk.com
AUTHOR
Sid Steward (sid.steward at pdflabs dot com) maintains pdftk. Please
email him with questions or bug reports. Include pdftk in the subject
line to ensure successful delivery. Thank you.
October 28, 2010 PDFTK(1)