Support for parsing .debug_types section introduced in DWARF version 4 #520

Open · wants to merge 5 commits into base: main

dinkark-dev

Adds support for parsing the .debug_types section introduced in DWARF version 4.

Intended to address issue #193.

Changes:

  • Support for parsing type units from the .debug_types section
  • Added logic to generate debug_types with readelf tool
  • Added test binaries generated using IAR Embedded Workbench for ARM toolchain (dwarf_debug_types.elf)

Known issues:
When running unit tests with the new ELF binaries generated using the IAR EWARM toolchain:

  • Parsing of the .debug_frames section results in an infinite loop
  • Parsing of the .debug_aranges section causes stream parsing errors

Credits:
Builds on the .debug_types updates introduced in #265

@sevaa
Contributor

sevaa commented Nov 15, 2023

Just a heads up, please run the autotests locally and follow up until they all succeed. If you add binaries to the corpus, it's up to you to make sure the autotest doesn't fail on those - regardless of the primary purpose of the PR. I, too, had to debug unrelated discrepancies on readelf/dwarfdump, and the library only became better for that.

On Windows it's doable too, but somewhat trickier.

@sevaa
Contributor

sevaa commented Nov 22, 2023

@dinkark-dev : are you working on this?

@dinkark-dev
Author

Yes, I am working on it. However, I am not very familiar with the .debug_frames or .debug_aranges sections, so it's taking a while.

@driftregion

Hi @dinkark-dev, thanks for your work on this. I've pushed a test binary generated by TI Code Composer Studio for a TMS320 device to https://github.com/driftregion/pyelftools/blob/dinkar/dwarf-v4-debug-types-support/test/testfiles_for_dwarfdump/dwarf_ticcs.elf

It contains DW_FORM_ref_sig8 tags which pyelftools 0.30 does not yet handle.

@sevaa
Contributor

sevaa commented Sep 24, 2024

@eliben

The issue with the frames section that plagues the binary in this PR is the same as in #563: FDE before its corresponding CIE. The infinite loop was fixed in #563, but the mismatch with readelf stays. Binutils' bug records at https://sourceware.org/bugzilla/show_bug.cgi?id=31975 and https://sourceware.org/bugzilla/show_bug.cgi?id=31973 have been sitting there since July 2024; the latter even has a proposed fix patch, but no reaction from the maintainers so far.

I have a fairly good idea how to reproduce the faulty behavior of readelf in pyelftools, but it won't be a descriptions-only fix: it will have to go deep into the parser, introducing a "readelf compatible" mode to CallFrameInfo._parse_entry_at and upstack. With more effort, I could submit a patch to binutils for 31975 too. That's the right thing to do in the long run, but who knows when it will see the light of day.
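
For readers unfamiliar with the "FDE before its CIE" layout mentioned above, here is a minimal sketch of why entry order need not matter to a parser. It is not pyelftools' implementation: the function names are made up for illustration, and it assumes a little-endian, 32-bit DWARF `.debug_frame` section, where each entry starts with a 4-byte length and a 4-byte field that is either the CIE marker `0xFFFFFFFF` or the section offset of the owning CIE.

```python
import struct

CIE_ID_32 = 0xFFFFFFFF  # CIE marker in a 32-bit DWARF .debug_frame section


def index_frame_entries(section):
    """Return {offset: (kind, cie_pointer)} for every entry in the section.

    kind is "CIE" or "FDE"; cie_pointer is None for CIEs.
    """
    entries = {}
    offset = 0
    while offset + 8 <= len(section):
        length, ident = struct.unpack_from("<II", section, offset)
        if length == 0:  # zero-length entry, treated here as a terminator
            break
        if ident == CIE_ID_32:
            entries[offset] = ("CIE", None)
        else:
            entries[offset] = ("FDE", ident)
        offset += 4 + length  # the length field does not count itself
    return entries


def cie_for_fde(entries, fde_offset):
    """Resolve an FDE's CIE by offset lookup, wherever the CIE sits."""
    kind, cie_pointer = entries[fde_offset]
    if kind != "FDE":
        raise ValueError("entry at %#x is not an FDE" % fde_offset)
    if entries.get(cie_pointer, (None, None))[0] != "CIE":
        raise ValueError("FDE at %#x does not point at a CIE" % fde_offset)
    return cie_pointer
```

Because the FDE refers to its CIE by section offset, indexing all entries first and resolving afterwards works whether the CIE comes before or after the FDE.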

@eliben
Owner

eliben commented Sep 26, 2024

@sevaa

If I'm understanding correctly, I would like to avoid "bug compatibility" fixes inside pyelftools; we've generally been avoiding this, and instead have selective skipping or relaxing of specific tests in the readelf-compatibility testing suite. Would it be possible to go that way?

@sevaa
Contributor

sevaa commented Sep 26, 2024

We've removed individual testfiles from the readelf autotest in the past, and that's what this PR currently does :) Eventually, readelf would catch up.

If #526 is merged, this will break: both PRs add sections to the DWARFInfo constructor.

@eliben
Owner

eliben commented Sep 26, 2024

It's ok for now, then.

Re merge conflict - yes. @dinkark-dev can you rebase this PR on the main branch?

@sevaa
Contributor

sevaa commented Oct 23, 2024

OBTW, the aranges section in that binary is malformed, and this is squarely the toolchain's fault. The DWARF format (section 7.21 in v5) requires that address/length entries in aranges be aligned within the section to the entry size (8 bytes on ARM32). The header is 12 bytes, so there needs to be a 4 byte padding between the header and the entries. This binary doesn't have the gap, hence the premature stream end error.

Same thing in dwarf_v4cie.elf - also by the IAR compiler. I've sent them a support e-mail, but this won't fix the binaries already in the corpus.

In conclusion, removing the binary from the autotest is entirely justified.
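
The padding arithmetic described above can be sketched as follows. This is only an illustration with hypothetical helper names, and it assumes no segment selector (segment_size of 0), so a tuple is two addresses wide.

```python
def aranges_header_size(dwarf64=False):
    """Size in bytes of one .debug_aranges set header."""
    initial_length = 12 if dwarf64 else 4  # unit_length field
    offset_size = 8 if dwarf64 else 4      # debug_info_offset field
    # unit_length + version(2) + debug_info_offset + address_size(1) + segment_size(1)
    return initial_length + 2 + offset_size + 1 + 1


def first_tuple_padding(header_size, address_size):
    """Bytes of padding needed so the first (address, length) tuple is
    aligned to the tuple size. Assumes no segment selector."""
    tuple_size = 2 * address_size
    return -header_size % tuple_size
```

For 32-bit DWARF on ARM32, `aranges_header_size()` gives 12 and `first_tuple_padding(12, 4)` gives 4, matching the 4-byte gap that the malformed binary lacks.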

@sevaa
Contributor

sevaa commented Oct 23, 2024

@dinkark-dev IAR support is asking for an EWARM license number. Can you share one, please?

@sevaa
Contributor

sevaa commented Oct 28, 2024

@Maxicu5 you are an IAR user, right? Do you have a license number I could use for their support?

@Maxicu5

Maxicu5 commented Oct 29, 2024

> @Maxicu5 you are an IAR user, right? Do you have a license number I could use for their support?

I cannot share the number because of our security policy, but I can ask the necessary questions and share the answers from IAR support.

@sevaa
Contributor

sevaa commented Oct 29, 2024

Understandable. In that case, can you please fill out the support request form at https://www.iar.com/about/contact -> Customer support with the following issue description (copy-pasted from my ticket):


I have a handful of compiled binary files, all generated by IAR. The debug_aranges section in those binaries is malformed. According to DWARF format (section 7.21 in DWARFv5), the address/length entries (tuples) in the aranges section must be aligned, within the section, to the entry size - 8 on 32-bit ARM. This means there should be a padding gap between the CU header (which is 12 bytes) and the entries. In the binaries by IAR, there isn't any gap. Third party, standard compliant DWARF tools choke on that.

To reproduce, build a "hello world" type program with IAR for ARM32, with debug symbols enabled, and use an ELF viewer of your choice (readelf, objdump, XELFViewer) to see the size of the debug_aranges section. If it's not a multiple of 8 - the issue persists.


Feel free to give them my e-mail - [email protected] - if they have questions about reproduction. For product version, use "IAR ANSI C/C++ Compiler V9.20.4.327/W64 for ARM".
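
The "is the size a multiple of 8" check from the ticket text can also be applied per set on the raw section bytes. The sketch below is an assumption-laden illustration, not part of pyelftools: the function name is hypothetical, and it assumes little-endian, 32-bit DWARF with no segment selector. Since tuples are aligned to the tuple size and are each one tuple wide, a well-formed set must end on a tuple-aligned section offset.

```python
import struct


def misaligned_arange_sets(section):
    """Return the offsets of .debug_aranges sets that do not end on a
    tuple-aligned section offset (a sign of missing header padding)."""
    bad = []
    offset = 0
    while offset + 12 <= len(section):
        # 32-bit DWARF header: unit_length, version, debug_info_offset,
        # address_size, segment_size (12 bytes in total)
        unit_length, _version, _info_offset, address_size, _segment_size = \
            struct.unpack_from("<IHIBB", section, offset)
        tuple_size = 2 * address_size
        if tuple_size == 0:  # corrupt header, give up
            break
        set_end = offset + 4 + unit_length
        if set_end % tuple_size != 0:
            bad.append(offset)
        offset = set_end
    return bad
```

The section bytes would have to be extracted with an ELF tool first; how that is done is left out here.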

@Maxicu5

Maxicu5 commented Oct 30, 2024

I've created a case, and the technical support engineer has replied that he will try to reproduce it.

@sevaa
Contributor

sevaa commented Oct 30, 2024

Thanks. Is there a case number? Mine was 00342995.

@sevaa
Contributor

sevaa commented Oct 30, 2024

@dinkark-dev: looks like the aranges bug has been addressed in the latest IAR (v9.40). Would it be possible to update the compiler and rebuild the test binary?

@sevaa
Contributor

sevaa commented Nov 27, 2024

@dinkark-dev: beautiful, thank you. The frames test will have to stay excluded until the binutils team fixes their bugs :(

@eliben: how does one deal with merge conflicts?

yield cu

def parse_TUs_iter(self, offset=0):
Owner

Should this function be private with a _ prefix? Or is the user intended to use it directly?

In any case it should have a (perhaps brief) docstring


        while offset < self.debug_types_sec.size:
            tu = self._parse_TU_at_offset(offset)
            # Compute the offset of the next CU in the section. The unit_length
Owner

s/CU/TU/ here as needed?
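
As background for the loop under review, here is a standalone sketch of walking type unit headers in a raw `.debug_types` section. It is not the PR's code: the function name is hypothetical, and it assumes little-endian data and well-formed DWARF v4 headers. The next unit starts at the current offset plus the size of the initial length field plus `unit_length`.

```python
import struct


def iter_type_unit_headers(section):
    """Yield (offset, type_signature, type_offset) for each DWARF v4
    type unit header found in the raw section bytes."""
    offset = 0
    while offset + 4 <= len(section):
        (unit_length,) = struct.unpack_from("<I", section, offset)
        if unit_length == 0xFFFFFFFF:  # 64-bit DWARF escape value
            (unit_length,) = struct.unpack_from("<Q", section, offset + 4)
            initial_length_size, offset_fmt = 12, "Q"
        else:
            initial_length_size, offset_fmt = 4, "I"
        # version, debug_abbrev_offset, address_size, type_signature, type_offset
        header_fmt = "<H" + offset_fmt + "BQ" + offset_fmt
        _version, _abbrev_offset, _address_size, signature, type_offset = \
            struct.unpack_from(header_fmt, section, offset + initial_length_size)
        yield offset, signature, type_offset
        # unit_length does not include the initial length field itself
        offset += initial_length_size + unit_length
```

A dictionary built from the yielded signatures and offsets is the kind of lookup a DW_FORM_ref_sig8 reference would need, since that form identifies a type unit by its 8-byte signature.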

@@ -72,6 +72,13 @@ def run_test_on_file(filename, verbose=False, opt=None):
     else:
         options = [opt]

     if filename.endswith('dwarf_debug_types.elf'):
         # TODO: The offset calculation logic in dwarf/callframe.py starts and
Owner

This should link to an issue with additional details
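
One possible shape for such an exclusion, sketched under assumptions: the table and function names below are hypothetical and not taken from the test script, and `--debug-dump=frames` is assumed to be the option being skipped. Keeping the reason and tracking links next to the file name makes the skip self-documenting.

```python
import os

# Hypothetical table: test file name -> (readelf options to skip, reason).
KNOWN_READELF_MISMATCHES = {
    "dwarf_debug_types.elf": (
        {"--debug-dump=frames"},
        "FDE placed before its CIE; readelf output differs. See "
        "https://sourceware.org/bugzilla/show_bug.cgi?id=31975 and "
        "https://sourceware.org/bugzilla/show_bug.cgi?id=31973",
    ),
}


def options_to_run(filename, options):
    """Return the options to run for this file, minus known mismatches."""
    skipped, _reason = KNOWN_READELF_MISMATCHES.get(
        os.path.basename(filename), (set(), ""))
    return [opt for opt in options if opt not in skipped]
```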

@eliben
Owner

eliben commented Dec 20, 2024

@dinkark-dev it's not clear to me if the review comments were addressed - I got a notification of a pushed commit, but that's it. Please resolve the comments and re-request review (on the top right-hand side of the PR UI) once you're ready. Thanks

dinkark-dev requested a review from eliben on December 30, 2024 22:45

@eliben
Owner

eliben commented Jan 2, 2025

@sevaa LGTY?

@sevaa
Contributor

sevaa commented Jan 2, 2025

Looks good.

Owner

@eliben eliben left a comment

Thanks, this looks good. Just a couple of comments about places that are likely copy-paste remnants.

Please address these, resolve the comments in the GitHub UI and re-request a review when done.

    def iter_CUs(self):
        """ Yield all the compile units (CompileUnit objects) in the debug info
        """
        return self._parse_CUs_iter()

    def iter_TUs(self):
        """Yield all the compile units (CompileUnit objects) in the debug_types
Owner

These are Type Units, not compile units, right? Should the comment be updated?

    def _parse_TU_at_offset(self, offset):
        """ Parse and return a Type Unit (TU) at the given offset in the debug_types stream.
        """
        # Section 7.4 (32-bit and 64-bit DWARF Formats) of the DWARF spec v3
Owner

DWARF v3 doesn't mention debug_types, should this be v4?

5 participants