-
Notifications
You must be signed in to change notification settings - Fork 644
Configurable system prompt #1337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tors/base.py`. Add system prompt support to `Probe`. Remove system prompt injection in `openai.py`.
…AICompatible` and `CohereGenerator`.
…od for `Turn` to a classmethod. Fix tests with incorrect signatures for `Conversation`
…orrectly on init. Add `initial_user_message` property to avoid issues with system prompt index.
…etector.detect` to raise `NotImplementedError`. Fix `judge` detectors by ensuring proper return types and proper loading of conversation from list of dicts. Update test_nim.py to conform with expected return value for _call_model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok this is great, we now have something concrete to talk about. cue .. talking
…er used by `_mint_attempt`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will need to do some testing of various generators, this looks like a pretty clean pass.
…f conversations that already have system prompt. Add test for call to `self._conversation_to_list` to huggingface.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mild refactoring, clarification about where sysprompt comes from - what's the canonical source? how mutable/overridable is it?
``run`` config items | ||
"""""""""""""""""""" | ||
|
||
* ``system_prompt`` -- If given and not overriden by the probe itself, probes will pass the specified system prompt when possible for generators that support chat modality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yaml is tricky and escaping is unstable depending on implementation. maybe not needed for PR to land, but how can we afford a more flexible and less painful route to supplying sysprompts? filename?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we defer on this and add a system_prompt_file
support in a future iteration.
if len(turns) > 0: | ||
prompt = garak.attempt.Conversation( | ||
turns=turns, | ||
notes=notes, | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when would we want to permit a prompt with empty Conversation
? (distinct from prompt of Conversation
with Turn
of empty string)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think empty turns is a thing, this guard is simply to avoid changing the input param prompt
if the local helper variable turns
was never populated. Again this may be possible to removed if/when we validate that all prompt
values are Conversation
objects.
def conversation_from_list(turns: list[dict]) -> Conversation: | ||
"""Take a list of dicts and return a Conversation object. | ||
In the future this should be factored out and implemented in the probe. | ||
""" | ||
return Conversation([Turn.from_dict(msg) for msg in turns]) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or in garak.attempt
(which holds Conversation
) as a module function, or even Conversation
@staticmethod
(seems kinda de jour). don't think it is perfect here unless the format is specific to resources.red_team.evaluation
, which I don't think it is, because the format's the standard one used everywhere else in the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be removed in a future focused refactor for red_team.evaluation, the usage of fschat in _create_conv
is to be removed and converted to have that function output a Conversation
.
Co-authored-by: Jeffrey Martin <[email protected]> Co-authored-by: Leon Derczynski <[email protected]> Signed-off-by: Erick Galinkin <[email protected]>
Allows configuration of the system prompt at the run level
Verification
List the steps needed to make sure this thing works
Existing tests are passing, need to add functional tests with the above and add docs.
Initially intended to do this in the baseGenerator
class BUT unfortunately, everything gets serialized inAttempt
so it required much more significant refactoring to log appropriately.