Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add line numbers on all block tokens during parsing (#144) #188

Merged
merged 5 commits into from
Dec 2, 2023

Conversation

anderskaplan
Copy link
Contributor

@anderskaplan anderskaplan commented Jun 25, 2023

This is a partial solution for #144 which records line numbers on all block tokens. The span tokens are still left unlabeled, but at least it's a starting point, and I think the functionality added here is useful by itself.

Open question: should the line_number be a added to repr_attributes, so that it gets included in the output from ASTRenderer?

@coveralls
Copy link

coveralls commented Jun 25, 2023

Coverage Status

coverage: 94.15% (+0.03%) from 94.118%
when pulling 6070d7e on anderskaplan:line-numbers-2
into ee7ce94 on miyuchina:master.

@pbodnar pbodnar added the feature label Jul 2, 2023
@pbodnar pbodnar linked an issue Jul 2, 2023 that may be closed by this pull request
@pbodnar pbodnar added this to the 1.3.0 milestone Aug 21, 2023
@pbodnar
Copy link
Collaborator

pbodnar commented Aug 21, 2023

@anderskaplan, thank you, I will try to look at this for the next release.

@pbodnar
Copy link
Collaborator

pbodnar commented Sep 20, 2023

Finally, I have found time for mistletoe once again. :)

@anderskaplan, I must say "good job" 👍 . Basically, I like it, including the clever way of the new unit test.

Probably just 3 more things be resolved:

  1. Performance - did you run the benchmark test? I guess there should be little to zero difference to the version without line numbers, right?
  2. Backwards compatibility - shouldn't we better pass the line_number parameter as the last AND optional (with default value of None) to the constructors of TableRow etc. (even Table?)? This way, we could achieve backwards compatibility for those users who might create (some) tokens without actually parsing a markdown text. Yes, there probably won't be many, but what if there are some out there? :)
  3. Documentation - probably add some notes on this cool feature as a sub-chapter to https://github.com/miyuchina/mistletoe/blob/v1.2.1/dev-guide.md#understanding-the-ast?

Open question: should the line_number be a added to repr_attributes, so that it gets included in the output from ASTRenderer?

Not sure here, but probably yes, it could? :)

@anderskaplan
Copy link
Contributor Author

Hey @pbodnar, just checking in to say hi. The last weeks have been quite busy with other things but I'll get back to this as soon as I can.

@anderskaplan
Copy link
Contributor Author

I have updated the documentation, made the line_number parameter optional, and run the benchmark as requested in the review.

The benchmark results have a fairly large variance (5 %CV) from run to run on my machine. But I'm fairly confident that the performance penalty of the line numbers is negligible, because if I compare the mean benchmark with and without line numbers from five runs of each type, then the relative difference is about 0.1%. I.e., far less than the run-to-run variance.

I also added the line_number attribute to repr_attributes, so that the line numbers are included in the output from the ASTRenderer.

Copy link
Collaborator

@pbodnar pbodnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anderskaplan, thank you for the amendments, I've found just one typo. :)

Beside that, it looks like some other conflicting commits were made in the mean time...

dev-guide.md Outdated Show resolved Hide resolved
@pbodnar pbodnar merged commit 13d1c11 into miyuchina:master Dec 2, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Record line numbers on tokens
3 participants