Is it possible to detect highlighted sections (annotations) on a pdf and preserve that in md? #16

Open
shrvenkataraman opened this issue Jun 6, 2020 · 3 comments

Comments

@shrvenkataraman

No description provided.

@jzillmann
Owner

So you want the whole PDF content, with the highlights marked somehow, e.g. as code or italic?
Or would you rather extract just the highlights?

Interesting feature anyway, but I haven't looked into it so far.
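
To make the two options above concrete, here is a minimal sketch of both: marking highlights in place, or extracting only the highlighted text. It is not the converter's implementation. It assumes the PDF has already been parsed into words with bounding boxes and that highlight annotations are available as rectangles in the same coordinate system; `Box`, `split_runs`, `mark_in_place`, `highlights_only`, and the 0.5 overlap threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle (hypothetical; assumes x0 < x1 and y0 < y1)."""
    x0: float
    y0: float
    x1: float
    y1: float


def overlap_ratio(word: Box, mark: Box) -> float:
    """Fraction of the word's area that the highlight rectangle covers."""
    w = min(word.x1, mark.x1) - max(word.x0, mark.x0)
    h = min(word.y1, mark.y1) - max(word.y0, mark.y0)
    if w <= 0 or h <= 0:
        return 0.0
    area = (word.x1 - word.x0) * (word.y1 - word.y0)
    return (w * h) / area if area > 0 else 0.0


def split_runs(words, highlights, threshold=0.5):
    """Group (text, Box) words, given in reading order, into
    (is_highlighted, text) runs. The threshold is an assumed value."""
    runs = []
    for text, box in words:
        hit = any(overlap_ratio(box, h) >= threshold for h in highlights)
        if runs and runs[-1][0] == hit:
            runs[-1] = (hit, runs[-1][1] + " " + text)
        else:
            runs.append((hit, text))
    return runs


def mark_in_place(runs, tag="*"):
    """Option 1: keep all text and wrap highlighted runs (italic by default)."""
    return " ".join(tag + text + tag if hit else text for hit, text in runs)


def highlights_only(runs):
    """Option 2: return just the highlighted runs."""
    return [text for hit, text in runs if hit]


words = [
    ("The", Box(0, 0, 20, 10)),
    ("quick", Box(25, 0, 55, 10)),
    ("brown", Box(60, 0, 90, 10)),
    ("fox", Box(95, 0, 115, 10)),
]
highlights = [Box(24, -1, 91, 11)]  # covers "quick" and "brown"
runs = split_runs(words, highlights)
print(mark_in_place(runs))    # The *quick brown* fox
print(highlights_only(runs))  # ['quick brown']
```

Passing `tag="`"` to `mark_in_place` would render the highlights as code instead of italic.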

@sslHello

sslHello commented Jun 19, 2022

Hi, I don't know which issues @shrvenkataraman means.
I have just tested your converter and ran into some issues that go in the same direction:
My first tests show that highlighting (e.g. bold via ** ... **) breaks at the end of a PDF line, especially in lists. This generates output like the following (<cr> added to show the line breaks in the PDF and md output):

**- element_1_text_1**<cr>
  **element_1_text_2<cr>
- element_2_text_1**<cr>
  **element_2_text2<cr>
- last_element_text_1**<cr>
  **last_element_text_2**

Could you change this to something like this, please:

- **element_1_text_1<cr>
  element_1_text_2**<cr>
- **element_2_text_1<cr>
  element_2_text2**<cr>
- **last_element_text_1<cr>
  last_element_text_2**

I do hope this is one of the cases that he may have meant, too.
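
One way the requested output could be produced is as a post-processing pass over the generated Markdown lines. The sketch below is not the converter's code, and `rebalance_bold_list` is a hypothetical helper. It assumes that every list item in the block is bold in its entirety, that items start with `- `, and that the block contains no blank lines. It drops all existing `**` markers and then adds exactly one pair per item, opening after the bullet and closing at the end of the item's last line.

```python
import re

# A list item starts with optional indentation, "-", and at least one space.
ITEM_START = re.compile(r"^(\s*-\s+)(.*)$")


def rebalance_bold_list(lines):
    """Rewrap each list item in a single ** ... ** pair.

    Assumption: every item is fully bold, so the old markers carry no
    information beyond "this item is bold" and can be discarded.
    """
    # Drop every existing marker; they are re-added per item below.
    plain = [line.replace("**", "") for line in lines]

    # Group line indices into items: a bullet line starts a new item,
    # any other line continues the current one.
    items = []
    for i, line in enumerate(plain):
        if ITEM_START.match(line) or not items:
            items.append([i])
        else:
            items[-1].append(i)

    out = list(plain)
    for indices in items:
        first, last = indices[0], indices[-1]
        match = ITEM_START.match(out[first])
        if match:
            # Open the bold span after the bullet, not before it.
            out[first] = match.group(1) + "**" + match.group(2)
        else:
            # Leading non-list line: open after the indentation.
            indent = len(out[first]) - len(out[first].lstrip())
            out[first] = out[first][:indent] + "**" + out[first][indent:]
        # Close the span at the end of the item's last line.
        out[last] = out[last].rstrip() + "**"
    return out


broken = [
    "**- element_1_text_1**",
    "  **element_1_text_2",
    "- element_2_text_1**",
    "  **element_2_text2",
    "- last_element_text_1**",
    "  **last_element_text_2**",
]
print("\n".join(rebalance_bold_list(broken)))
```

A converter-side fix would more likely track the bold state while the list items are being built, rather than repairing the markers afterwards; the sketch only illustrates the target shape.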

Thank you so much for your converter!
Cheers
Torsten

@darkcheftar

image
I guess @shrvenkataraman is talking about something like this
