Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert tables to markdown #9

Closed
wants to merge 5 commits into from
Closed

Conversation

tijder
Copy link
Contributor

@tijder tijder commented Aug 14, 2022

With this merge request it possible to convert tables to markdown. As inspiration I used https://github.com/mixmark-io/turndown-plugin-gfm/blob/master/src/tables.js.

This pull request will fix #5

So this html:

<table>
  <thead>
    <tr>
      <th align="left">Column 1</th>
      <th align="center">Column 2</th>
      <th align="right">Column 3</th>
      <th align="foo">Column 4</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Row 1, Column 1</td>
      <td>Row 1, Column 2</td>
      <td>Row 1, Column 3</td>
      <td>Row 1, Column 4</td>
    </tr>
    <tr>
      <td>Row 2, Column 1</td>
      <td>Row 2, Column 2</td>
      <td>Row 2, Column 3</td>
      <td>Row 2, Column 4</td>
    </tr>
  </tbody>
</table>

will be converted to:

| Column 1 | Column 2 | Column 3 | Column 4 |
| :-- | :-: | --: | --- |
| Row 1, Column 1 | Row 1, Column 2 | Row 1, Column 3 | Row 1, Column 4 |
| Row 2, Column 1 | Row 2, Column 2 | Row 2, Column 3 | Row 2, Column 4 |

And in github this will showed as:

Column 1 Column 2 Column 3 Column 4
Row 1, Column 1 Row 1, Column 2 Row 1, Column 3 Row 1, Column 4
Row 2, Column 1 Row 2, Column 2 Row 2, Column 3 Row 2, Column 4

A nice thing of the java org.jsoup:jsoup library is that tables without a tbody or thead can also be converted, because jsoup add tbody if none is present. This isn't working/supported by the turndown-plugin-gfm library.

@tijder tijder changed the title Convert tables to markdown fixes #5 Convert tables to markdown Aug 14, 2022
@furstenheim
Copy link
Owner

Hi,
Thanks @tijder for the pr. I'll need some careful reading on this one. I might take some time to review it, since I have other projects atm.

BTW, already published your previous PR with version 1.1, it's already accessible in maven central

@tijder
Copy link
Contributor Author

tijder commented Aug 14, 2022

Take your time. If you have any questions or comments please let me know. I have added most of the test cases from https://github.com/mixmark-io/turndown-plugin-gfm/blob/master/test/index.html. So you can see what the result of the html to markdown is.

@tijder
Copy link
Contributor Author

tijder commented Apr 6, 2023

@furstenheim can you please look into this PR?

addRule("br", new Rule("br", (content, element) -> {return options.br + "\n";}));
addRule("br", new Rule("br", (content, element) -> {
// If the br tag is in a table keep the br tag. Otherwise, the table will not work in markdown.
if (element.parentNode() != null && Arrays.asList("td", "th").contains(element.parentNode().nodeName().toLowerCase())) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit sketchy. It works at only one level, right? Both up and down

var alignMap = Map.of("left", ":--", "right", "--:", "center", ":-:");

if (isHeadingRow(element)) {
for (var i = 0; i < element.childNodes().size(); i++) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this brittle? Can't you have an html comment as a child, for example?

@@ -761,5 +761,65 @@
},
"input": "\n<pre><code>\nCode\n\n</code></pre>\n ",
"output": "```\n\nCode\n\n```"
},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some tests here where the content inside the table has other htmls, also comments, and things like that?

@furstenheim
Copy link
Owner

I understand that the code is just from the other library, but it looks slightly brittle as it is right now

@tijder
Copy link
Contributor Author

tijder commented Jun 3, 2023

Sorry, I close this pull request for now. Maybe when I have time I will look in to it again.

@tijder tijder closed this Jun 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Table convertion
2 participants