From a8b476420000de973e7e796f36cb7b60415b5a03 Mon Sep 17 00:00:00 2001
From: Adrien Joly <531781+adrienjoly@users.noreply.github.com>
Date: Sat, 2 Apr 2022 13:16:56 +0200
Subject: [PATCH] docs: note that photographed text is not supported

Added note: This module extracts text entries from PDF files. It does not support photographed text. If you cannot select text from the PDF file, **you may need to use OCR software first**.

Cf https://github.com/adrienjoly/npm-pdfreader/issues/104.
---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f2f5c2a..2586f7f 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,9 @@ Supports **tabular data** with automatic column detection, and **rule-based pars
 
 Dependencies: it is based on [pdf2json](https://www.npmjs.com/package/pdf2json), which itself relies on Mozilla's [pdf.js](https://github.com/mozilla/pdf.js/).
 
-ℹ️ This module is meant to be run using Node.js only. **It does not work from a web browser.**
+ℹ️ Important notes:
+- This module is meant to be run using Node.js only. **It does not work from a web browser.**
+- This module extracts text entries from PDF files. It does not support photographed text. If you cannot select text from the PDF file, **you may need to use OCR software first**.
 
 Summary:
 