Looking for approach to 'flattening' PDFs (suspected jargon conflicts inside) #3571
-
First, it seems there are two general meanings when people say "flatten a PDF": I'm referring to the second thing. Cribbing from Adobe (ref: https://helpx.adobe.com/acrobat/kb/printing-complex-pdfs-acrobat.html)
It's possible I'm just not looking in the right places. I suspect .scrub() can remove much of what's considered 'user data', but I'm at a loss for the layer aspect. My use case is developing a workflow to normalize PDFs from a variety of sources I have no control over, so that they behave more consistently in the systems I deal with. Following a set of steps in Acrobat is functionally fine, but I'd prefer to do this in Python so I can fully automate it. By way of context, this is in the print and digital presentation realm, where several vendors have asserted that their products perform best with PDFs prepared in this way, and some even flatly say that things like layers (or even vector paths) cause problems.
Replies: 1 comment 1 reply
-
Layers in a PDF are used to show / hide certain objects. Depending on the PDF creator's effort, this can become quite sophisticated. Hidden layers are objects whose display is currently switched OFF. This happens via so-called OCGs (Optional Content Groups) - simple normal PDF objects whose meaning in life is to be ON or OFF - nothing else. Hundreds of OCGs may exist in a PDF, plus complex relationships between their respective ON/OFF states - via specialized other objects which represent logical expressions. You could have things equivalent to IF ... THEN ... ELSE language constructs referring to OCG states, with literally arbitrary complexity. I am not sure what "discard hidden layer content" may entail: physical deletion of any image, Form XObject, annotation, text, ... that is (currently!) not displayed? We do however support making arbitrary changes to ON/OFF states (layers), both temporarily and permanently, such that the PDF can be saved in each of these respective states.
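To illustrate why the word "currently" matters here, the following is a minimal pure-Python sketch of how logical expressions over OCG states decide what is hidden. It is a toy model and not library code: the names `is_visible`, `hidden_objects`, the OCG names and the object names are all hypothetical. The only thing borrowed from the PDF format is the idea that a visibility expression combines OCGs with And / Or / Not operators.

```python
from typing import Union

# Hypothetical toy model: each OCG name maps to its current ON (True) / OFF (False) state.
OcgStates = dict[str, bool]

# A visibility expression is either a single OCG name, or a tuple
# ("And" | "Or" | "Not", operand, ...) whose operands may be nested expressions.
Expr = Union[str, tuple]


def is_visible(expr: Expr, states: OcgStates) -> bool:
    """Evaluate a visibility expression against the current OCG states."""
    if isinstance(expr, str):
        return states[expr]
    op, *operands = expr
    if op == "And":
        return all(is_visible(e, states) for e in operands)
    if op == "Or":
        return any(is_visible(e, states) for e in operands)
    if op == "Not":
        if len(operands) != 1:
            raise ValueError("Not takes exactly one operand")
        return not is_visible(operands[0], states)
    raise ValueError(f"unknown operator: {op!r}")


def hidden_objects(objects: dict[str, Expr], states: OcgStates) -> list[str]:
    """Return the sorted names of objects that are not displayed in this state."""
    return sorted(n for n, expr in objects.items() if not is_visible(expr, states))


# Illustrative data only: three OCGs and four objects guarded by expressions.
states = {"English": True, "German": False, "Dimensions": True}
objects = {
    "title_en": "English",
    "title_de": "German",
    "dim_labels_de": ("And", "German", "Dimensions"),
    "fallback_note": ("Not", ("Or", "English", "German")),
}

print(hidden_objects(objects, states))

# Switching the language layers changes which objects count as "hidden".
flipped = {**states, "English": False, "German": True}
print(hidden_objects(objects, flipped))
```

In this sketch the set of hidden objects depends entirely on the state chosen at the moment of evaluation, so any "discard hidden content" step would first have to fix one particular ON/OFF configuration.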
Layers in a PDF are used to show / hide certain objects. Depending on the PDF creator's effort, this can become quite sophisticated.
Often, CAD/CAM exports to PDF embody ways to show or hide details of construction drawings, or multi-language documents may support switching between the different languages, etc.
Hidden layers are objects whose display is currently switched OFF. This happens via so-called OCGs (Optional Content Groups) - simple normal PDF objects whose meaning in life is to be ON or OFF - nothing else.
Other objects like the mentioned ones and several others (images, annotations, ...) may be assigned an OCG. They will then be shown or hidden when their OCG is switched ON respectivel…