@guan404ming
Member

Why

The String class lacks string manipulation methods that users expect from standard library equivalents, so basic operations such as substring search and extraction require converting to std::string first.

How

  • Add find() method with three overloads (String, const char*, const char* with count) to search for substrings
  • Add substr() method to extract substrings with bounds checking

@gemini-code-assist
Contributor

Summary of Changes

Hello @guan404ming, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the String class by integrating essential string manipulation functionalities, find and substr. These additions address a long-standing limitation, allowing users to perform common string operations directly within the String class, thereby streamlining development and reducing the need for conversions to std::string for basic tasks. The changes include robust implementations and comprehensive testing to ensure reliability.

Highlights

  • String Manipulation Methods: Implemented find and substr methods for the String class, enhancing its string manipulation capabilities without requiring conversion to std::string.
  • Find Method Overloads: Added three overloads for the find method, allowing searches for String objects, const char* with default length, and const char* with a specified length.
  • Substring Extraction with Bounds Checking: Introduced the substr method for extracting substrings, which includes robust bounds checking and throws std::out_of_range for invalid indices.
  • Comprehensive Unit Testing: New unit tests have been added in both C++ and Python to thoroughly validate the functionality and edge cases of the new find and substr methods.
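The bounds-checked extraction described above can be sketched as a free function. This is only an illustration: `checked_substr` is a hypothetical name, not the member added in this pull request, and it assumes the same convention as std::string::substr, where a start position equal to the length is allowed and yields an empty result.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper for illustration; not the tvm::String member itself.
// Returns the characters in [pos, pos + count), clamping count to the end.
// Assumption: pos == s.size() is valid and yields an empty string,
// while pos > s.size() throws std::out_of_range.
inline std::string checked_substr(std::string_view s, std::size_t pos,
                                  std::size_t count = std::string_view::npos) {
  if (pos > s.size()) {
    throw std::out_of_range("index out of bounds");
  }
  const std::size_t available = s.size() - pos;
  const std::size_t len = count < available ? count : available;
  return std::string(s.data() + pos, len);
}
```

Checking `pos` before computing `s.size() - pos` keeps the unsigned subtraction from wrapping around.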



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds useful find and substr methods to the String class, which is a great enhancement. The implementation of substr is solid. For the find method, I've suggested refactoring it to use std::string_view::find. This will simplify the code, make it more robust by aligning its behavior with the standard library (especially for edge cases like searching for an empty string), and likely improve performance. I've also recommended adding a String::npos static member for better API consistency with std::string. The tests have been updated accordingly, with suggestions to improve coverage for edge cases in both C++ and Python.

Comment on lines 637 to 650
  size_t find(const char* str, size_t pos, size_t count) const {
    if (count == 0) return pos <= size() ? pos : size_t(-1);
    if (pos >= size() || count > size() - pos) return size_t(-1);

    const char* this_data = data();
    size_t this_size = size();

    for (size_t i = pos; i <= this_size - count; ++i) {
      if (std::memcmp(this_data + i, str, count) == 0) {
        return i;
      }
    }
    return size_t(-1);
  }

high

The manual implementation of find is complex, and it re-implements by hand the edge cases that std::string::find already defines, such as searching for an empty string. It's also likely less performant than the standard library's implementation.

It's better to delegate to std::string_view::find, which is highly optimized and guaranteed to be correct. This simplifies the code, improves maintainability, and ensures standard-compliant behavior (including edge cases like searching for an empty string past the end).

Note that std::string_view::find returns std::string_view::npos on failure, which is equivalent to static_cast<size_t>(-1).

  size_t find(const char* str, size_t pos, size_t count) const {
    return std::string_view(data(), size()).find(std::string_view(str, count), pos);
  }
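A quick way to see what the delegation gives you is to try it outside the class. The sketch below uses a hypothetical free function, `find_in`, in place of the member function, and assumes the sample string "hello world", which is consistent with the indices used in this pull request's tests.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical free-function version of the suggested delegation.
// haystack plays the role of std::string_view(data(), size()).
inline std::size_t find_in(std::string_view haystack, const char* str,
                           std::size_t pos, std::size_t count) {
  return haystack.find(std::string_view(str, count), pos);
}
```

With this definition, `find_in("hello world", "world", 0, 5)` yields 6 and a missing needle yields `std::string_view::npos`. An empty needle matches at `pos` whenever `pos <= size()`, so `find_in("hello world", "", 11, 0)` yields 11, while a start position past the end, such as 20, yields `npos`.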

throw std::out_of_range("tvm::String index out of bounds");
}
}


medium

For consistency with std::string, it's good practice to define a static npos member to represent 'not found'. This improves readability and makes the interface more familiar to C++ developers.

  static constexpr size_t npos = static_cast<size_t>(-1);
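As a rough illustration of how such a constant reads at a call site, here is a minimal stand-in class. `MiniString` and `contains` are hypothetical names used only for this sketch; they are not part of the String class under review.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a string class that exposes npos.
struct MiniString {
  static constexpr std::size_t npos = static_cast<std::size_t>(-1);
  std::string_view text;

  // Delegates to std::string_view::find, which reports failure as npos.
  std::size_t find(std::string_view needle, std::size_t pos = 0) const {
    return text.find(needle, pos);
  }
};

// The two sentinels have the same value, so results can be compared directly.
static_assert(MiniString::npos == std::string_view::npos,
              "npos values are expected to match");

// Call sites compare against the named constant instead of size_t(-1).
inline bool contains(const MiniString& s, std::string_view needle) {
  return s.find(needle) != MiniString::npos;
}
```

The named constant makes the "not found" check self-describing, which is the readability gain the comment above is after.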

Comment on lines 454 to 464
EXPECT_EQ(s.find("notfound"), size_t(-1));
EXPECT_EQ(s.find(""), 0);
EXPECT_EQ(s.find("", 5), 5);
EXPECT_EQ(s.find("", 20), size_t(-1));

String pattern{"world"};
EXPECT_EQ(s.find(pattern), 6);

String empty{""};
EXPECT_EQ(empty.find("x"), size_t(-1));
EXPECT_EQ(empty.find(""), 0);

medium

To improve test clarity and align with the suggested changes for String::find:

  1. Use String::npos (which should be added to the String class) instead of size_t(-1) for better readability.
  2. The test for finding an empty string with a starting position beyond the string's length should expect String::npos, which is the standard behavior for std::string::find: an empty needle matches at pos only when pos <= size().
  EXPECT_EQ(s.find("notfound"), String::npos);
  EXPECT_EQ(s.find(""), 0);
  EXPECT_EQ(s.find("", 5), 5);
  EXPECT_EQ(s.find("", 20), String::npos);

  String pattern{"world"};
  EXPECT_EQ(s.find(pattern), 6);

  String empty{"

assert s.find("o") == 4
assert s.find("o", 5) == 7
assert s.find("notfound") == -1
assert s.find("") == 0

medium

The Python tests for find could be more comprehensive. It would be good to add tests for searching for an empty string with a starting position, especially one that is out of bounds, to ensure it matches Python's str.find behavior.

For example, Python's "abc".find("", 3) returns 3 (the length of the string), while "abc".find("", 5) returns -1 because the start position is past the end.

Suggested change
- assert s.find("") == 0
+ assert s.find("") == 0
+ assert s.find("", 5) == 5
+ assert s.find("", 11) == 11
+ assert s.find("", 20) == -1

@guan404ming guan404ming marked this pull request as ready for review January 6, 2026 05:45

@junrushao junrushao left a comment


Looks good to me!

@junrushao junrushao merged commit bd12b26 into apache:main Jan 6, 2026
7 checks passed
@guan404ming guan404ming deleted the feat-string-find-substr branch January 6, 2026 16:59
@guan404ming

Thanks!
