Persona SFT: first-person text beats chat demos on Qwen3-4B LoRA test, Zubnet AI News

Ferran Alia published a controlled SFT comparison on Towards Data Science on Wednesday, testing three data formats for teaching a language model a persona, and the counterintuitive option won. The setup is the part most practitioner posts skip: Qwen3-4B-Instruct as the base model, LoRA (r=16, alpha=32, applied to attention and MLP projections), 3 epochs with a cosine LR schedule and 5% warmup, and 500 training examples per strategy generated by Claude, with all hyperparameters held constant across runs so that the only variable was data format. The three strategies map to three theories about where persona lives in the weights: chat demonstrations (behavioral imitation), first-person introspective statements ("I am C-3PO, I prefer to calculate the odds before committing to any course of action"), and synthetic document fine-tuning, or SDF (third-person Wikipedia-style descriptions, the technique from Anthropic's 2025 belief-insertion research). Code is on GitHub.
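As a rough illustration of the recipe described above, the sketch below collects the stated hyperparameters in one place and computes a cosine schedule with linear warmup. It is not Alia's code: the batch size and peak learning rate are not given in this summary and are placeholders, and the warmup-then-cosine formula is one common formulation rather than a confirmed detail of the experiment.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    # Values stated in the article.
    lora_r: int = 16
    lora_alpha: int = 32
    epochs: int = 3
    warmup_ratio: float = 0.05
    examples: int = 500
    # Assumptions: neither value appears in the article.
    batch_size: int = 4
    peak_lr: float = 2e-4

    @property
    def total_steps(self) -> int:
        # Optimizer steps over all epochs, one step per batch.
        return self.epochs * math.ceil(self.examples / self.batch_size)


def lr_at(step: int, cfg: RunConfig) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    total = cfg.total_steps
    warmup = max(1, int(cfg.warmup_ratio * total))
    if step < warmup:
        return cfg.peak_lr * step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return cfg.peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


cfg = RunConfig()
for step in (0, 9, 18, 200, cfg.total_steps):
    print(step, f"{lr_at(step, cfg):.2e}")
```

Holding one such config fixed across all three runs is what makes data format the only variable in the comparison.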

The headline result: first-person statements outperformed demonstrations on generalization. Alia measured this with a 4×3 perplexity matrix (the baseline plus three fine-tunes, each evaluated on samples from all three data formats) and with trait-tagging on 30 fixed-prompt responses, checking for C-3PO behaviors (calling people "Sir," quoting odds, expressing anxiety, formal etiquette). The dialog-trained model is best at producing C-3PO-style dialog; the first-person-trained model produces more C-3PO output across all formats, including dialog. Synthetic documents do something different: they teach the facts of the persona (six million forms of communication, protocol droid functions) better than the felt sense of being that persona. Alia's reading is that demonstrations update behavioral patterns, first-person text updates self-representation, and SDF updates world knowledge about a named entity, and that self-representation generalizes farther than behavior. The piece's own hedge matters too: a well-tuned system prompt is still very competitive against any of these fine-tunes, and the experiment does not claim that SFT is necessarily the right tool for many persona tasks.
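The two measurements described above can be sketched in a few lines. This is an illustrative reconstruction, not the article's evaluation code: the input layout (`logprobs[model][fmt]` as a flat list of per-token natural-log probabilities), the model and format labels, and the regex trait patterns are all assumptions. In particular, the article does not say whether trait-tagging was done by pattern matching, by hand, or by an LLM judge.

```python
import math
import re

MODELS = ["baseline", "dialog", "first_person", "sdf"]   # 4 rows
FORMATS = ["dialog", "first_person", "sdf"]              # 3 columns


def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def perplexity_matrix(logprobs):
    """Build the 4x3 grid: every model scored on every data format."""
    return {m: {f: perplexity(logprobs[m][f]) for f in FORMATS} for m in MODELS}


# Hypothetical patterns for the four behaviors named in the article.
TRAIT_PATTERNS = {
    "sir": re.compile(r"\bsir\b", re.IGNORECASE),
    "odds": re.compile(r"\bodds\b|\bto one\b|\bprobability\b", re.IGNORECASE),
    "anxiety": re.compile(r"\boh dear\b|\bdoomed\b|\bworr(?:y|ied)\b", re.IGNORECASE),
    "etiquette": re.compile(r"\bif i may\b|\bi beg your pardon\b|\bprotocol\b", re.IGNORECASE),
}


def trait_rates(responses):
    """Fraction of responses in which each trait pattern appears at least once."""
    n = len(responses)
    return {
        trait: sum(bool(pattern.search(r)) for r in responses) / n
        for trait, pattern in TRAIT_PATTERNS.items()
    }


sample = ["Oh dear, sir, the odds are 3,720 to one.", "Hello there."]
print(trait_rates(sample))
```

Reading the matrix along rows shows how well one fine-tune fits each format; reading it down a column shows which training format best predicts a given kind of text.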

The ecosystem read here connects two threads that have not been linked publicly before. Anthropic's 2025 SDF work showed you can insert false-but-plausible facts into a model by training on document-style text framed as factual. Alia's finding is the SFT analogue: if you want the model to be the entity, write first-person; if you want it to know facts about the entity, write third-person. The implication for everyone building customer-support assistants, branded copilots, role-play characters, or domain agents is that the intuitive default (record human-AI conversations, fine-tune on them) is leaving generalization on the table. For agent designers in particular — including the Hermes-style local stacks shipping this week — the SFT-data-format question is exactly the one that determines whether your agent has consistent character or drifts under distribution shift.

For builders fine-tuning for persona or character: generate three types of synthetic data (first-person introspection, demonstrations, SDF documents), train ablations on equal token budgets, and benchmark on out-of-distribution prompts before deploying. The 500-example LoRA setup Alia used is cheap enough for a single-GPU weekend, and the code is reusable. The load-bearing caveat is that this is a single-author, single-base-model (Qwen3-4B) result on a fictional persona — Anthropic's belief-insertion work suggests the pattern probably generalizes, but expect to see the experiment re-run on Llama 4 and on real corporate personas before treating "first-person beats demos" as settled. The deeper takeaway is methodological: ablation studies on SFT data format are still rare in the practitioner literature, and any team doing serious fine-tuning should be running their own version of this 3-way comparison rather than copying defaults from the last template they saw.
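One way to read "equal token budgets" is sketched below: find the smallest of the three datasets by token count and trim the others down to it, keeping whole examples. The whitespace token counter is a stand-in assumption (a real run would count with the base model's tokenizer), and the dataset names and helper functions are hypothetical.

```python
def count_tokens(text):
    # Assumption: whitespace split as a stand-in for the model tokenizer.
    return len(text.split())


def cap_to_budget(examples, budget, count=count_tokens):
    """Keep whole examples in order until the next one would exceed the budget."""
    kept, used = [], 0
    for ex in examples:
        n = count(ex)
        if used + n > budget:
            break
        kept.append(ex)
        used += n
    return kept, used


def equalize(datasets, count=count_tokens):
    """Trim every dataset to the token total of the smallest one."""
    budget = min(sum(count(ex) for ex in exs) for exs in datasets.values())
    return {name: cap_to_budget(exs, budget, count)[0] for name, exs in datasets.items()}


datasets = {
    "dialog": ["a b c", "d e"],
    "first_person": ["a b", "c d", "e f g"],
    "sdf": ["a b c d", "e f g h"],
}
for name, exs in equalize(datasets).items():
    print(name, len(exs), sum(count_tokens(e) for e in exs))
```

Because examples are kept whole, the trimmed sets land at or just under the budget rather than exactly on it; shuffling before trimming would avoid biasing toward whatever order the data was generated in.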
