DAO - DocBook Accessibility Optimizer for Apache FOP


DAO is a script written in Python that optimizes the <fo:block> structures of XSL-FO code generated by DocBook-XSL. The optimized XSL-FO code can then be used with FOP for generating tagged PDF. The tag structure of the resulting PDF will be as flat as possible.

Notice: DAO does not generate any of the PDF tags used in tagged PDF files, nor does it solve the absence of tagging information in the DocBook-XSL process. It only optimizes the XSL-FO code structure.

Introduction

Apache FOP can be used to generate tagged PDF from XSL-FO code. Processing DocBook XML with the official DocBook-XSL files produces deeply nested <fo:block> structures. Because FOP automatically tags every <fo:block> as a paragraph (PDF tag 'P'), the resulting PDF tag structure is cluttered with nested 'P' tags that carry no content of their own.

DAO solves that problem by crawling through the XSL-FO document structure and optimizing the <fo:block> structure.
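
The idea can be illustrated with a minimal sketch. This is not DAO's actual implementation; it assumes that a <fo:block> counts as a pure wrapper when it has no text of its own and exactly one child, which is itself a <fo:block>, and that attributes of the inner block take precedence when both blocks set the same attribute. The names `collapse` and `_is_blank` are hypothetical.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
FO_BLOCK = f"{{{FO_NS}}}block"


def _is_blank(text):
    """True if the text is missing or contains only whitespace."""
    return text is None or not text.strip()


def collapse(elem):
    """Merge chains of wrapper-only fo:block elements into the innermost block.

    Returns the element that should take the place of `elem` in the tree.
    """
    # Flatten the children first so inner chains are already merged.
    for index, child in enumerate(list(elem)):
        elem[index] = collapse(child)

    children = list(elem)
    is_wrapper = (
        elem.tag == FO_BLOCK
        and len(children) == 1
        and children[0].tag == FO_BLOCK
        and _is_blank(elem.text)
        and _is_blank(children[0].tail)
    )
    if not is_wrapper:
        return elem

    inner = children[0]
    # Assumption: the inner block wins when both define the same attribute.
    merged = dict(elem.attrib)
    merged.update(inner.attrib)
    inner.attrib.clear()
    inner.attrib.update(merged)
    inner.tail = elem.tail  # keep the text that followed the wrapper
    return inner
```

Calling `collapse(ET.parse("document.fo").getroot())` on a hypothetical input file would return the root of the flattened tree. Blocks that contain text of their own, or more than one child, are left untouched in this sketch.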

+---------+        +--------+        +--------+        +-----+
| DocBook |  XSL   | XSL-FO |  DAO   | XSL-FO |  FOP   | PDF |
|   XML   +------->|        +------->|  opt   +------->|     |
+---------+        +--------+        +--------+        +-----+
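
The three arrows in the diagram could be scripted roughly as follows. This is a sketch under assumptions: the README does not name an XSLT processor, so `xsltproc` is an assumption, as are the file names, the `build` directory, and the helper names `build_pipeline` and `run_pipeline`. The `-a` switch asks FOP to enable its accessibility features (tagged PDF).

```python
import subprocess
from pathlib import Path


def build_pipeline(docbook_xml, stylesheet, workdir="build"):
    """Return the three commands of the pipeline as argument lists."""
    work = Path(workdir)
    raw_fo = str(work / "document.fo")      # XSL-FO straight from DocBook-XSL
    opt_fo = str(work / "document.opt.fo")  # XSL-FO after DAO
    pdf = str(work / "document.pdf")
    return [
        # Step 1 (assumed tool): DocBook XML -> XSL-FO
        ["xsltproc", "--output", raw_fo, stylesheet, docbook_xml],
        # Step 2: DAO flattens the fo:block structure
        ["python", "dao.py", "-i", raw_fo, "-o", opt_fo],
        # Step 3: FOP renders the optimized XSL-FO with accessibility enabled
        ["fop", "-a", "-fo", opt_fo, "-pdf", pdf],
    ]


def run_pipeline(docbook_xml, stylesheet, workdir="build"):
    """Run the steps in order and stop at the first failing command."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for command in build_pipeline(docbook_xml, stylesheet, workdir):
        subprocess.run(command, check=True)
```

Keeping the command construction separate from execution makes it easy to print or inspect the commands before running them.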

Usage

usage: dao.py [-h] -i INPUT -o OUTPUT [-v] [-p]

DAO - DocBook Accessibility Optimizer for Apache FOP

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input file name
  -o OUTPUT, --output OUTPUT
                        Output file name
  -v, --verbose         verbose output
  -p, --pretty          pretty xml output
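
For readers who want to see how such an interface is typically declared, here is an `argparse` sketch that accepts the same options. It is a reconstruction from the help text above, not DAO's actual source, and `build_parser` is a hypothetical name. It also explains why the required `-i` and `-o` options are listed under "optional arguments": Python versions before 3.10 print that heading for every option that starts with a dash, including those declared with `required=True`.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options in the usage text."""
    parser = argparse.ArgumentParser(
        prog="dao.py",
        description="DAO - DocBook Accessibility Optimizer for Apache FOP",
    )
    # Both file names are mandatory, as shown by the missing brackets
    # around -i and -o in the usage line.
    parser.add_argument("-i", "--input", required=True, help="Input file name")
    parser.add_argument("-o", "--output", required=True, help="Output file name")
    # Boolean switches default to False when omitted.
    parser.add_argument("-v", "--verbose", action="store_true", help="verbose output")
    parser.add_argument("-p", "--pretty", action="store_true", help="pretty xml output")
    return parser
```

A typical call would be `dao.py -i document.fo -o document.opt.fo -p`, where both file names are placeholders.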

Example

XSL-FO before optimization:

<fo:block id="d0e9">
  <fo:block font-family="sans-serif,Symbol,ZapfDingbats">
    <fo:block start-indent="0pt" text-align="center">
      <fo:block keep-with-next.within-column="always" font-size="24.8832pt" font-weight="bold">
        <fo:block keep-with-next.within-column="always" space-before.optimum="10pt"
        space-before.minimum="10pt * 0.8" space-before.maximum="10pt * 1.2" hyphenate="false"
        text-align="start" start-indent="0pt" hyphenation-character="-"
        hyphenation-push-character-count="2"
        hyphenation-remain-character-count="2">Test Document</fo:block>
      </fo:block>
    </fo:block>
  </fo:block>
</fo:block>

XSL-FO after DAO optimization:

<fo:block keep-with-next.within-column="always" space-before.optimum="10pt"
space-before.minimum="10pt * 0.8" space-before.maximum="10pt * 1.2" hyphenate="false"
text-align="start" start-indent="0pt" hyphenation-character="-" hyphenation-push-character-count="2"
hyphenation-remain-character-count="2" font-size="24.8832pt"
font-family="sans-serif,Symbol,ZapfDingbats" id="d0e9" font-weight="bold">Test Document</fo:block>

PDF tag structure before optimization:

<p>
  <p>
    <p>
      <p>
        <p>Test Document</p>
      </p>
    </p>
  </p>
</p>

PDF tag structure after DAO optimization:

<p>Test Document</p>
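
Because FOP maps every <fo:block> to a 'P' tag, the nesting depth of <fo:block> elements in an XSL-FO file gives a quick estimate of how deep the resulting tag structure will be. The following sketch measures that depth so the effect of an optimization run can be checked; `max_block_depth` is a hypothetical helper and not part of DAO.

```python
import xml.etree.ElementTree as ET

FO_BLOCK = "{http://www.w3.org/1999/XSL/Format}block"


def max_block_depth(elem, depth=0):
    """Return the deepest nesting level of fo:block elements at or below elem."""
    if elem.tag == FO_BLOCK:
        depth += 1
    # Elements other than fo:block are traversed but do not add a level.
    return max([depth] + [max_block_depth(child, depth) for child in elem])
```

For the example above, the root of the unoptimized fragment would report a depth of 5 and the optimized fragment a depth of 1.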
