Enhancement: Use HuggingFace apply_chat_template for optional prompt construction #202
Summary
Add optional support for HuggingFace's tokenizer.apply_chat_template() as an alternative to our manual prompt config classes for constructing chat-formatted prompts for instruction-tuned models.
Priority: Low — Current manual prompt config approach works well and provides full control.
Background
Interpretune currently uses manual prompt config classes (e.g., Gemma2PromptConfig, Gemma3PromptConfig, Llama3PromptConfig) defined in src/it_examples/example_prompt_configs.py to construct chat-formatted prompts for instruction-tuned models.
These classes define model-specific tokens (<start_of_turn>, <|begin_of_text|>, etc.) and implement model_chat_template_fn() to wrap task prompts in the appropriate chat template format.
HuggingFace's transformers library provides a built-in mechanism for this via tokenizer.apply_chat_template(), which uses Jinja2 templates embedded in each tokenizer's configuration.
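To illustrate the mechanism: a chat template is just a Jinja2 string that renders a list of `{"role": ..., "content": ...}` messages into a flat prompt. The sketch below renders a simplified Gemma-style template directly with `jinja2` (the template string here is an illustrative simplification, not the exact one shipped in any Gemma tokenizer config):

```python
from jinja2 import Template

# Simplified Gemma-style chat template (illustrative only; the real
# template stored in the tokenizer config handles more cases).
GEMMA_STYLE_TEMPLATE = (
    "{% for message in messages %}"
    "<start_of_turn>{{ message['role'] }}\n"
    "{{ message['content'] }}<end_of_turn>\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<start_of_turn>model\n{% endif %}"
)

def render_chat(messages, add_generation_prompt=True):
    # Mirrors what tokenizer.apply_chat_template(..., tokenize=False) returns.
    return Template(GEMMA_STYLE_TEMPLATE).render(
        messages=messages, add_generation_prompt=add_generation_prompt
    )

prompt = render_chat([{"role": "user", "content": "What is 2+2?"}])
print(prompt)
```

The rendered string wraps the user turn in `<start_of_turn>`/`<end_of_turn>` tokens and appends the model-turn prefix, which is exactly the kind of structure our manual prompt configs currently reproduce by hand.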
Proposed Enhancement
Add an optional apply_chat_template mode that delegates to the HuggingFace tokenizer:
```python
def model_chat_template_fn(self, task_prompt, tokenization_pattern=None, tokenizer=None):
    if tokenization_pattern and tokenizer is not None and hasattr(tokenizer, 'apply_chat_template'):
        messages = [{"role": "user", "content": task_prompt.strip()}]
        return tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True,
        )
    # Fall back to manual template
    return self._manual_template(task_prompt, tokenization_pattern)
```
Advantages
- Reduced maintenance: No need to manually track chat template changes across model versions
- Broader model support: Any model with a Jinja2 chat template would work automatically
- Consistency: Uses the same template the model was trained with (embedded in tokenizer config)
- System prompts: HF's API natively supports system prompts, multi-turn conversations, tool use
Disadvantages / Considerations
- Less control: Manual templates allow fine-grained customization of prompt structure
- Tokenizer dependency: Requires tokenizer to be available at prompt construction time (may not always be the case in all code paths)
- Template variability: Different HF model revisions may have different templates, making reproducibility harder
- Testing overhead: Need to validate that HF templates produce equivalent outputs to our manual ones
- Non-chat models: Pre-trained (non-IT) models don't have chat templates, so the manual fallback is still needed (e.g., `google/gemma-3-1b-pt` has no chat template but `google/gemma-3-1b-it` does)
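Because of the non-chat-model case, any delegation path would need a cheap guard before calling the tokenizer. One possible check (a sketch; recent `transformers` tokenizers expose the Jinja2 template as a `chat_template` attribute, which base checkpoints typically leave unset) could look like:

```python
from types import SimpleNamespace

def supports_chat_template(tokenizer) -> bool:
    """Return True when the tokenizer ships a Jinja2 chat template.

    Base (non-IT) checkpoints like google/gemma-3-1b-pt typically leave
    `chat_template` unset, so the manual fallback should be used for them.
    """
    return getattr(tokenizer, "chat_template", None) is not None

# Stand-ins for an instruction-tuned vs. a base-model tokenizer:
it_tok = SimpleNamespace(chat_template="{% for m in messages %}...{% endfor %}")
pt_tok = SimpleNamespace(chat_template=None)

print(supports_chat_template(it_tok))  # True
print(supports_chat_template(pt_tok))  # False
```

Checking the `chat_template` attribute is more precise than `hasattr(tokenizer, 'apply_chat_template')`, since recent `transformers` versions define the method on every tokenizer even when no template is configured.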
Recommendation
Keep manual prompt configs as the default for maximum control and reproducibility. Add apply_chat_template as an opt-in convenience for users who prefer it, particularly for new model families where writing a manual template is tedious.
Implementation priority is low since:
- We currently support only a handful of model families (GPT-2, Gemma2, Gemma3, Llama3)
- Manual templates are simple and well-tested
- The prompt config pattern is well-established in the codebase
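As a rough sketch of how the opt-in could be wired (class and attribute names here are hypothetical illustrations, not the actual Interpretune API):

```python
from dataclasses import dataclass

@dataclass
class PromptConfig:
    """Hypothetical opt-in flag; manual templates stay the default."""
    use_hf_chat_template: bool = False

    def build_prompt(self, task_prompt, tokenizer=None):
        # Delegate only when explicitly opted in AND the tokenizer
        # actually ships a chat template.
        if (self.use_hf_chat_template and tokenizer is not None
                and getattr(tokenizer, "chat_template", None)):
            return tokenizer.apply_chat_template(
                [{"role": "user", "content": task_prompt.strip()}],
                tokenize=False,
                add_generation_prompt=True,
            )
        return self._manual_template(task_prompt)

    def _manual_template(self, task_prompt):
        # Placeholder for the existing model-specific manual logic.
        return task_prompt
```

With the flag off (the default), behavior is unchanged; turning it on only takes effect when a template is present, preserving the manual path for non-IT checkpoints.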
Related Files
- `src/it_examples/example_prompt_configs.py` — current prompt config implementations
- `docs/apply_chat_template_proposal.md` — original proposal document
- HuggingFace docs: Chat Templates