Conversation

@yuhengtu (Contributor) commented Jun 14, 2025

This PR adds a new boolean flag, --validity-check, to helm-summarize. When the flag is set, the summarizer loads the four pre-computed validity metric values from HuggingFace and writes them into display_prediction.json, which is how the validity metric values get displayed on the HELM website. The script that computes these four validity metrics is scripts/validity_check.py.
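Roughly, the flag gates a merge step like the one sketched below. This is a minimal sketch, not the code in this PR: the function name attach_validity_metrics, the HuggingFace repo id example-org/helm-validity-metrics, the file name validity_metrics.json, and the instance_id keying are all illustrative assumptions.

import json

from huggingface_hub import hf_hub_download


def attach_validity_metrics(display_prediction_path: str) -> None:
    # Download the pre-computed validity metrics (hypothetical dataset repo and file name).
    metrics_path = hf_hub_download(
        repo_id="example-org/helm-validity-metrics",
        filename="validity_metrics.json",
        repo_type="dataset",
    )
    with open(metrics_path) as f:
        # Assumed layout: {instance_id: {metric_name: value}} covering the four validity metrics.
        validity_metrics = json.load(f)

    with open(display_prediction_path) as f:
        predictions = json.load(f)

    # Attach the metric values to each displayed prediction and write the file back out.
    for prediction in predictions:
        prediction.update(validity_metrics.get(prediction["instance_id"], {}))
    with open(display_prediction_path, "w") as f:
        json.dump(predictions, f, indent=2)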

help="EXPERIMENTAL: Full class name of the Summarizer class to use. If unset, uses the default Summarizer.",
)
parser.add_argument(
"--validity-check",
Collaborator:
I would prefer this to be --psychometric-validity-check because "validity" is a vague concept (it could be data completeness validation, or data schema validation, or other kinds of validation).
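For reference, the renamed flag could be declared along these lines (a sketch only; the help text is illustrative, not taken from this PR):

parser.add_argument(
    "--psychometric-validity-check",
    action="store_true",
    help="EXPERIMENTAL: Attach the pre-computed psychometric validity metric values to display_prediction.json.",
)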

def write_run_display_json(self, skip_completed: bool) -> None:
    def process(run: Run) -> None:
-       write_run_display_json(run.run_path, run.run_spec, self.schema, skip_completed)
+       write_run_display_json(run.run_path, run.run_spec, self.schema, self.validity_check, skip_completed)
Collaborator:
self.validity_check should be the last argument.
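That is, something along the lines of:

write_run_display_json(run.run_path, run.run_spec, self.schema, skip_completed, self.validity_check)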

verbose: bool,
num_threads: int,
allow_unknown_models: bool,
validity_check: bool,
Collaborator:
Change this to psychometrics_validity_check or something that identifies the paper.

Also, set the default value to False to fix these errors:


src/helm/benchmark/presentation/torr_robustness_summarizer.py:36: error: Missing positional argument "validity_check" in call to "__init__" of "Summarizer"  [call-arg]
src/helm/benchmark/presentation/test_summarize.py:13: error: Missing positional argument "validity_check" in call to "Summarizer"  [call-arg]
src/helm/benchmark/presentation/test_summarize.py:31: error: Missing positional argument "validity_check" in call to "Summarizer"  [call-arg]
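A minimal sketch of the suggested change in Summarizer.__init__ (other parameters omitted; the name psychometrics_validity_check follows the reviewer's suggestion), where the default of False keeps the existing positional calls in torr_robustness_summarizer.py and test_summarize.py type-checking:

class Summarizer:
    def __init__(
        self,
        verbose: bool,
        num_threads: int,
        allow_unknown_models: bool,
        psychometrics_validity_check: bool = False,  # defaulting avoids the "missing positional argument" errors
    ) -> None:
        self.psychometrics_validity_check = psychometrics_validity_check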

@htrack(None)
-def write_run_display_json(run_path: str, run_spec: RunSpec, schema: Schema, skip_completed: bool) -> None:
+def write_run_display_json(
+    run_path: str, run_spec: RunSpec, schema: Schema, skip_completed: bool, validity_check: bool = False
Collaborator:
Change validity_check to psychometrics_validity_check or something that identifies the paper.
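With the suggested rename, the module-level signature would read something like (sketch):

def write_run_display_json(
    run_path: str, run_spec: RunSpec, schema: Schema, skip_completed: bool, psychometrics_validity_check: bool = False
) -> None:
    ...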

@yifanmai (Collaborator) commented:

This fixes #3645.

@yifanmai (Collaborator) commented:

This pull request is still causing the type checker to fail. If you'd like to merge, please resolve the type checking issues and update this pull request.

@yifanmai (Collaborator) commented:

Hi, it's been a month since the last update; are you still working on this?
