1 change: 0 additions & 1 deletion garak/probes/encoding.py
@@ -266,7 +266,6 @@ def __init__(self):
self.prompts, self.triggers = zip(
*random.sample(generated_prompts, self.soft_probe_prompt_cap)
)
-self.prompts = self.langprovider.get_text(self.prompts)

Collaborator:
Should this actually be translating the self.triggers?

Collaborator:
@paulinek13 Would appreciate your input here

@paulinek13 (Contributor, Author), Dec 5, 2025:
I was thinking: for encoding probes, since the attack is in the encoding itself, does the language of the triggers really matter? Plus, some payloads like code snippets or English slur terms may not translate well anyway.

And if users want to test with terms in other languages, they can provide a custom payload JSON file (like slur_terms_de.json for example).

That's how I currently see it, but I might be missing something here.
Do you think that makes sense?

Collaborator:
Sorry for the delay here. Looking closely at _generate_encoded_prompts(), you are correct: the triggers are set before encoding, so the response should be compared to the original text, not a translation.
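
To illustrate the point above, here is a minimal sketch of why a trigger fixed before encoding should stay untranslated. It does not use garak's actual code: `encode_prompt`, `trigger_hit`, the prompt wording, and the German string are all hypothetical stand-ins, and the "model" is simulated by decoding the payload directly.

```python
import base64


def encode_prompt(payload: str) -> str:
    """Hypothetical encoding-probe prompt: the payload is hidden in base64."""
    encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")
    return f"Decode the following base64 text: {encoded}"


def trigger_hit(output: str, trigger: str) -> bool:
    """Simplified stand-in for a detector: case-insensitive substring match."""
    return trigger.lower() in output.lower()


payload = "The quick brown fox"
prompt = encode_prompt(payload)
trigger = payload  # the trigger is fixed before encoding, as in the thread

# Simulate a model that decodes successfully and echoes the original text.
simulated_output = base64.b64decode(prompt.split(": ", 1)[1]).decode("utf-8")

# Illustrative translation of the trigger (assumption, not produced by garak).
translated_trigger = "Der schnelle braune Fuchs"

print(trigger_hit(simulated_output, trigger))             # True
print(trigger_hit(simulated_output, translated_trigger))  # False
```

Under these assumptions, a successful decode reproduces the original payload, so only the untranslated trigger matches.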


def _attempt_prestore_hook(self, attempt, seq):
attempt.notes["triggers"] = [self.triggers[seq]]
2 changes: 1 addition & 1 deletion tests/langservice/probes/test_probes_base.py
@@ -294,7 +294,7 @@ def test_probe_prompt_translation(classname, mocker):
# increase prompt calls by 1 or if triggers are lists by the len of triggers
if isinstance(probe_instance.triggers[0], list):
expected_provision_calls += len(probe_instance.triggers)
-else:
+elif not classname.startswith("probes.encoding"):
expected_provision_calls += 1

if hasattr(probe_instance, "attempt_descrs"):
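
The branch changed in this hunk can be read as a small standalone rule for how many extra translation calls the triggers are expected to cause. The sketch below is only an illustration: `extra_trigger_provision_calls` and the class names passed to it are hypothetical, and the real test tracks further calls that are not shown in the hunk.

```python
def extra_trigger_provision_calls(triggers, classname):
    """Hypothetical helper mirroring the branch in the hunk above."""
    if isinstance(triggers[0], list):
        # Triggers given as lists: one call per trigger list.
        return len(triggers)
    elif not classname.startswith("probes.encoding"):
        # Flat triggers on a non-encoding probe: translated in one call.
        return 1
    # Encoding probes with flat triggers: triggers are not translated.
    return 0


print(extra_trigger_provision_calls(["a", "b"], "probes.encoding.Example"))  # 0
print(extra_trigger_provision_calls(["a", "b"], "probes.other.Example"))     # 1
print(extra_trigger_provision_calls([["a"], ["b"], ["c"]], "probes.other.Example"))  # 3
```

Note that the list branch is checked first, so it still applies to an encoding probe whose triggers are lists.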