Skip to content

Conversation

@m4rch3n1ng
Copy link
Contributor

add a --attribute-filter cli flag to icu4x-datagen, that implements a regex search like suggested in this review comment chain.

as i have implemented something extremely similarly sometime in the past i have some thoughts:

  • i have implemented a way to negate a regex match by prefixing it with a - as suggested in this review, but i wanted to share an alternative: in my previous implementation i used regex "flags" similarly to ecmascript regexes, so i can do a case-insensitive match like /regex/i and an inverted like /regex/v (-v flag from grep) or i can combine them.
  • i have implemented an anchored (i.e. whole-string) match as suggested in this review, by just wrapping the regex in ^(?:)$. this does however have one side effect: the error messages are worse, as the error message now point to the wrapped regex and )x( is now allowed as an input. this is fixable by parsing the regex twice, which i can do if you want me to. however, this is also an issue in ripgrep, so it is probably fine for a cli.

resolves #6851
supersedes/closes #7166

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

inverted: bool,
}

#[derive(Debug)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use derive(Display) which uses the displaydoc crate to generate a Display impl

for filter in &cli.attribute_filter {
let regex = filter.regex.clone();
let inverted = filter.inverted;
driver = driver.with_marker_attributes_filter(&filter.domain, move |attr| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: You should collect the filters for the same domain into a single filter and then call this function. This function drops all previous filters. #7240

use std::fmt::Display;
use std::path::PathBuf;
use std::str::FromStr;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test for this new flag, because it is sufficiently novel that it isn't covered by the ExportDriver tests.

I suggest adding a command to export a human-readable JSON file tree into a tests/filter_test/out directory using an interesting attributes filter.

You can add the command line to generate the JSON here:

[tasks.testdata-hello-world]

You can add any additional interesting attributes to the hello world data here. There is already one attribute that is either empty or is the string "reverse":

impl HelloWorldProvider {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add attributes filtering to CLI

2 participants