A prompting technique where you ask the model to show its reasoning step by step before giving a final answer. Instead of jumping to a conclusion, the model "thinks out loud," which dramatically improves accuracy on complex tasks.
Why it matters: Asking "explain your reasoning" isn't just for transparency — it actually makes models smarter. CoT reduced math errors by up to 50% in early studies. Most modern models now do this internally.
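In practice, chain-of-thought is often just a wrapper around the question. A minimal sketch, where `with_cot` is a hypothetical helper (not from any particular library) that appends the "think step by step" instruction before the prompt is sent to a model:

```python
def with_cot(question: str) -> str:
    """Wrap a question so the model shows its reasoning before answering."""
    return (
        f"{question}\n\n"
        "Think through this step by step, showing your reasoning, "
        "then give your final answer on a line starting with 'Answer:'."
    )

# Example: a classic trick question where models often blurt the wrong answer.
prompt = with_cot(
    "A bat and a ball cost $1.10 total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)
print(prompt)
```

Asking for the final answer on a marked line (`Answer:`) makes it easy to parse the conclusion out of the reasoning afterward.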
The maximum amount of text (measured in tokens) a model can process in a single conversation. This includes both your input and the model's output. If a model has a 200K context window, that's roughly 150,000 words — about two novels.
Why it matters: Context window size determines what you can do. Summarize a whole codebase? Needs big context. Quick question-answer? Small is fine. But bigger isn't always better — models can lose focus in very long contexts.
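The token-to-word arithmetic above can be sketched directly, assuming the common rule of thumb of roughly 0.75 English words per token (actual ratios vary by tokenizer and language):

```python
WORDS_PER_TOKEN = 0.75  # rough heuristic for English text

def approx_words(context_tokens: int) -> int:
    """Estimate how many English words fit in a context window."""
    return int(context_tokens * WORDS_PER_TOKEN)

def fits(document_words: int, context_tokens: int, reserve_tokens: int = 1000) -> bool:
    """Check whether a document fits, leaving room for the model's reply."""
    needed_tokens = document_words / WORDS_PER_TOKEN + reserve_tokens
    return needed_tokens <= context_tokens

print(approx_words(200_000))  # 150000 -- the "roughly 150,000 words" figure
```

The `reserve_tokens` budget matters because input and output share the same window: a document that exactly fills the context leaves no room for an answer.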
The body of text (or other data) used to train a model. A corpus can range from curated collections of books and papers to massive scrapes of the entire internet. The quality and composition of the corpus fundamentally shape what the model knows and how it behaves.
Why it matters: Garbage in, garbage out. A model trained on Reddit talks differently than one trained on scientific papers. This is why we curated our own corpus for Sarah — generic web crawls produced confused, incoherent results.