-
Notifications
You must be signed in to change notification settings - Fork 49
fix: out-of-bounds read in StringViewToDLDataType_ with non-null-terminated strings
#387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: out-of-bounds read in StringViewToDLDataType_ with non-null-terminated strings
#387
Conversation
…-null terminated strings
Summary of ChangesHello @khj809, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical data type parsing bug that could cause crashes or incorrect behavior when Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly fixes a critical out-of-bounds read bug in StringViewToDLDataType_ by replacing strtoul with a custom, bounds-aware digit parser. The fix is sound and is accompanied by a comprehensive set of new tests. My review includes suggestions to further improve the robustness of the parser by adding checks for integer overflows and validating the parsed values for bits and lanes to prevent silent data truncation and invalid configurations.
|
Interesting! @tqchen could you take a look? |
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses a critical out-of-bounds read vulnerability in StringViewToDLDataType_ by replacing the unsafe strtoul with a custom, bounds-aware parsing function for non-null-terminated std::string_views. The accompanying tests are thorough and validate the fix for various edge cases. My review focuses on ensuring the new code is fully robust. I've identified one potential out-of-bounds read that was introduced and an opportunity to reduce code duplication for better maintainability.
|
/gemini Review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is an excellent fix for a critical out-of-bounds read vulnerability. Replacing strtoul with a custom, bounds-aware parsing implementation is the right approach to ensure safety with non-null-terminated std::string_views. The new helper functions parse_digits and parse_lanes are well-structured and make the code cleaner. The accompanying tests are very thorough and cover a wide range of edge cases, which gives high confidence in the fix. I have one suggestion to make the digit parsing even more robust against potential overflows.
| while (*ptr < end && **ptr >= '0' && **ptr <= '9') { | ||
| value = value * 10 + (**ptr - '0'); | ||
| (*ptr)++; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of parse_digits has a subtle bug where value can overflow uint64_t if the input string contains a very large number (more than 19 digits). If value overflows, it will wrap around (which is defined behavior for unsigned integers), and the subsequent check value > UINT32_MAX might fail to detect the overflow, leading to incorrect parsing.
To make this function more robust, you should check for potential overflow before performing the multiplication and addition. Since the final value should fit in a uint32_t, you can check against UINT32_MAX within the loop.
while (*ptr < end && **ptr >= '0' && **ptr <= '9') {
uint8_t digit = **ptr - '0';
if (value > UINT32_MAX / 10 || (value == UINT32_MAX / 10 && digit > UINT32_MAX % 10)) {
// Number is too large for uint32_t, set to overflow and consume rest of digits.
value = (uint64_t)UINT32_MAX + 1;
while (*ptr < end && **ptr >= '0' && **ptr <= '9') {
(*ptr)++;
}
break;
}
value = value * 10 + digit;
(*ptr)++;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gemini is getting too much
Summary
Fixes a critical bug in
StringViewToDataType_that causes undefined behavior when parsing dtype strings from non-null-terminatedstd::string_view. This bug particularly affects Electron applications due to aggresive memory reuse in Chromium's allocators.Problem
The original implementation used
strtoul()to parse numeric values (bits and lanes) directly fromstd::string_view::data(). However,strtoul()requires null-terminated strings and will continue reading beyond the string_view bounds until it finds a null terminator or non-digit character.Solution
Replace
strtoul()with manual digit parsing that respectsstring_viewboundaries:str.data() + str.length()strtoulbut safeTesting
New test
TEST(DType, NonNullTerminatedStringView)validates:"float16999888777"length 7 -> bits=16, not 16999888777)"int32x4extradata"length 5 -> lanes=1, not 4)"bfloat16x2"length 10 -> lanes=2)"float64AAAAA"length 7 -> float64)