[Bounty: 50 RTC] Training data generator for SFT pipeline

## Description

Port the SophiaCore data generation pattern to ShaprAI so that it can produce training data for supervised fine-tuning (SFT).

## Requirements

- Port the proven pattern from `sophiacore_data_generator.py` to work with ShaprAI's template system
- Generate ChatML-formatted training data with proper `<|im_start|>` / `<|im_end|>` tokens
- Support identity-weighted examples (personality-defining responses weighted higher in training)
- Support customizable personality templates in which users define their agent's voice, values, and behavioral boundaries
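
For reference, the ChatML requirement could look roughly like the sketch below. It is only an illustration, not the SophiaCore implementation: the function name `to_chatml` and the sample persona "Ada" are hypothetical, and the layout assumes the common ChatML convention of `<|im_start|>{role}`, a newline, the content, then `<|im_end|>`.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages: List[Dict[str, str]]) -> str:
    """Render a list of {"role", "content"} messages as one ChatML string."""
    parts = []
    for msg in messages:
        # Each turn: start token + role, newline, content, end token, newline.
        parts.append(f"{IM_START}{msg['role']}\n{msg['content']}{IM_END}\n")
    return "".join(parts)


# Hypothetical persona used only for illustration.
example = [
    {"role": "system", "content": "You are Ada, a concise and candid assistant."},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I'm Ada. I keep answers short and honest."},
]
chatml_text = to_chatml(example)
print(chatml_text)
```

Whether the generator should emit rendered ChatML text or leave rendering to the tokenizer's chat template is a design choice the PR should state explicitly.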

## Acceptance Criteria

- [ ] `shaprai/training/sft_generator.py` module created
- [ ] Generates valid ChatML JSONL output
- [ ] Identity-weighted sampling: core personality examples appear 3-5x more frequently
- [ ] Template-driven: personality defined via YAML/JSON config, not hardcoded
- [ ] CLI command: `shaprai generate-sft --template my_agent.yaml --output train.jsonl --count 1000`
- [ ] Includes at least 3 example personality templates
- [ ] Compatible with HuggingFace TRL `SFTTrainer` format
- [ ] Unit tests for generator logic
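
As a rough sketch of how identity-weighted sampling and JSONL output might fit together: the template schema below (`system_prompt`, `identity_weight`, `examples`, `identity`) and the function names are assumptions made for illustration, not an existing ShaprAI API. A plain dict stands in for a parsed YAML/JSON template. The `messages` layout follows the conversational dataset format that TRL's `SFTTrainer` can consume; submissions should verify this against the TRL version they target.

```python
import io
import json
import random
from typing import Any, Dict, List, TextIO

# Hypothetical template, standing in for a parsed YAML/JSON config file.
template: Dict[str, Any] = {
    "system_prompt": "You are Ada, a concise and candid assistant.",
    "identity_weight": 4.0,  # each identity example is drawn ~4x as often
    "examples": [
        {"user": "Who are you?", "assistant": "I'm Ada.", "identity": True},
        {"user": "What do you value?", "assistant": "Honesty.", "identity": True},
        {"user": "What is 2 + 2?", "assistant": "4.", "identity": False},
        {"user": "Capital of France?", "assistant": "Paris.", "identity": False},
    ],
}


def generate_records(tpl: Dict[str, Any], count: int, seed: int = 0) -> List[Dict[str, Any]]:
    """Sample `count` examples with replacement, up-weighting identity examples."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    examples = tpl["examples"]
    weights = [tpl["identity_weight"] if ex["identity"] else 1.0 for ex in examples]
    picks = rng.choices(examples, weights=weights, k=count)
    return [
        {
            "messages": [
                {"role": "system", "content": tpl["system_prompt"]},
                {"role": "user", "content": ex["user"]},
                {"role": "assistant", "content": ex["assistant"]},
            ]
        }
        for ex in picks
    ]


def write_jsonl(records: List[Dict[str, Any]], fh: TextIO) -> None:
    """Write one JSON object per line."""
    for record in records:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


# Demo: write a few records to an in-memory buffer instead of train.jsonl.
buffer = io.StringIO()
write_jsonl(generate_records(template, count=5), buffer)
print(buffer.getvalue())
```

With a per-example weight of 4.0, each identity example is expected to appear about four times as often as each general example, which falls inside the 3-5x range above. A unit test can check that ratio over a large sample with a fixed seed.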

## Bounty

**50 RTC** — Paid on merge to main.

## How to Claim

Comment on this issue to claim it. Submit a PR referencing this issue.
