AI Terms Explained Without the Jargon
You don’t need to understand the math behind AI to use it well. But you do need to understand the vocabulary, because the terms keep showing up — in product descriptions, pricing pages, blog posts, and conversations with people who assume you already know what they mean.
This isn’t a textbook glossary. Every term gets a plain English definition, an analogy that actually helps, and a concrete example. No jargon to explain the jargon.
The Core Concepts
A large language model (LLM) is essentially a really well-read assistant. It has been trained on billions of pages of text — books, articles, websites, code, conversations — and has learned the patterns of language well enough to generate new text that sounds human. It doesn’t “know” things the way you do. It predicts the most likely next word, over and over, incredibly fast. But the result is so good that the difference is often academic.
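To make “predicts the most likely next word, over and over” concrete, here is a toy sketch. The word-pair table is invented for illustration; a real LLM predicts over tokens with a neural network, not a lookup table.

```python
# Toy next-word predictor built from a made-up table of word pairs.
# A real LLM does the same loop, but with learned probabilities.
bigram_counts = {
    "the": {"cat": 3, "dog": 1},
    "cat": {"sat": 2, "ran": 1},
    "sat": {"down": 4},
}

def most_likely_next(word):
    """Pick the single most frequent follower of `word`, if any."""
    followers = bigram_counts.get(word)
    if not followers:
        return None
    return max(followers, key=followers.get)

def generate(start, max_words=5):
    """Repeat 'predict the most likely next word' until we run out."""
    words = [start]
    while len(words) < max_words:
        nxt = most_likely_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)
```

Starting from “the”, the loop walks the table one most-likely word at a time — which is the whole trick, just at a vastly larger scale.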
A token is a piece of a word. AI doesn’t read words the way you do — it breaks text into tokens, chunks that might be a whole word, part of a word, or even a single character. Common words like “hello” are usually one token. Longer words get split: “extraordinary” becomes about 3 tokens. A rough rule of thumb: 1 token ≈ 0.75 words, or about 4 characters.
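That rule of thumb is easy to turn into a quick budgeting estimator. This is a rough heuristic, not a real tokenizer:

```python
def estimate_tokens(text):
    """Estimate token count using the ~4 characters per token rule.
    Real tokenizers split text differently, so treat this as a
    ballpark figure for cost and length budgeting, nothing more."""
    return max(1, round(len(text) / 4))
```

For example, estimate_tokens("hello") gives 1 and estimate_tokens("extraordinary") gives 3, matching the examples above.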
The context window is the total amount of text the AI can hold in its working memory during a conversation. Everything you’ve said, everything it’s said, plus any documents you’ve pasted in — it all has to fit in the context window. Once the conversation exceeds the window, the AI starts “forgetting” the earliest parts.
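One common way tools cope with a full window is to drop the oldest turns first — exactly the “forgetting” described above. A minimal sketch (the token counter is passed in as a function; real systems use the model’s own tokenizer):

```python
def fit_in_window(messages, window_tokens, count_tokens):
    """Keep the most recent messages that fit in the budget, dropping
    the oldest first -- mimicking how early conversation turns fall
    out of the context window."""
    kept, used = [], 0
    for msg in reversed(messages):           # newest first
        cost = count_tokens(msg)
        if used + cost > window_tokens:
            break                            # oldest turns get dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore original order
```

Run it on a short transcript and you can watch the earliest messages disappear as the budget shrinks.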
A prompt is your set of instructions to the AI. It can be as simple as “What’s the capital of France?” or as complex as a multi-paragraph brief with examples, constraints, and formatting requirements. The quality of your prompt is the single biggest factor in the quality of the response.
How AI Thinks (Sort Of)
When an AI model processes your prompt and generates output, that’s called inference. The model itself was trained once (which takes weeks or months and millions of dollars). Every time you use it after that, it’s doing inference — applying what it learned to your specific input. Think of training as going to school, and inference as taking the exam.
Temperature is a setting (usually 0 to 1) that controls how predictable or creative the AI’s responses are. At temperature 0, the AI always picks the single most likely next word — reliable, consistent, sometimes boring. At temperature 1, it introduces randomness, choosing less obvious words, leading to more creative and varied output. Think of it as a slider between “strict accountant” and “jazz musician.”
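The slider can be sketched in a few lines. The word scores below are invented for illustration; real models apply temperature to token scores (logits) before sampling:

```python
import math
import random

def sample_with_temperature(word_scores, temperature, rng=random):
    """Turn raw scores into probabilities, sharpened or flattened by
    temperature, then sample one word. Temperature 0 is handled as a
    pure 'pick the top word' (the strict accountant); higher values
    flatten the distribution so unlikely words get a chance (the
    jazz musician)."""
    if temperature == 0:
        return max(word_scores, key=word_scores.get)
    words = list(word_scores)
    exps = [math.exp(word_scores[w] / temperature) for w in words]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(words, weights=probs)[0]
```

At temperature 0 the same input always yields the same word; raise the temperature and repeated calls start to vary.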
An AI hallucination is when the model generates information that sounds authoritative and plausible but is completely fabricated. It’s not lying (it has no intent) — it’s predicting what plausible-sounding text looks like, and sometimes that prediction doesn’t correspond to reality. Fake citations, invented statistics, non-existent URLs, and confident-but-wrong factual claims are all hallucinations.
Making AI Smarter
Taking a pre-trained LLM and training it further on a specialized dataset so it gets better at a specific task. The base model already knows language; fine-tuning teaches it the particular patterns of your domain. It’s like hiring a generally smart person and then giving them specialized on-the-job training.
Retrieval-augmented generation (RAG) means that instead of relying only on what the model learned during training, the AI searches through a specific set of documents before answering. It retrieves relevant information first, then generates a response based on what it found. This dramatically reduces hallucination for factual questions because the AI is working from actual source material, not just its memory.
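Here is the retrieve-then-generate shape in miniature. The documents, the keyword-overlap “search,” and the templated “generation” are all stand-ins — real RAG systems use embeddings for retrieval and an LLM for generation:

```python
# Stand-in document store for illustration only.
documents = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available by email on weekdays.",
]

def retrieve(question, docs, top_k=1):
    """Score each doc by word overlap with the question and return
    the best match(es). A placeholder for real semantic search."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def answer(question):
    """Retrieve first, then 'generate' grounded in what was found."""
    sources = retrieve(question, documents)
    return f"Based on the documents: {sources[0]}"
```

The key property survives even in this toy: the answer quotes the source material instead of improvising from memory.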
An embedding converts a piece of text into a list of numbers (a “vector”) that captures its meaning. Similar texts get similar numbers. This lets AI do semantic search — finding documents that are about the same topic, even if they use completely different words. It’s the technology that powers RAG, recommendation systems, and intelligent search.
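A tiny numeric illustration: the vectors below are hand-made (real embeddings come from a model and have hundreds of dimensions), but they show how “similar texts get similar numbers” enables semantic search:

```python
import math

# Hand-made toy "embeddings" for illustration. "car" and "automobile"
# use completely different words, yet their vectors point the same way.
embeddings = {
    "car":        [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "banana":     [0.05, 0.90, 0.40],
}

def cosine_similarity(a, b):
    """Compare two vectors by direction: close to 1.0 means similar
    meaning, close to 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm
```

Here “car” scores far closer to “automobile” than to “banana” — no shared words required, which is exactly what keyword search can’t do.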
The Business Side
The organization that trains, hosts, and serves the AI model. When you use Claude, the provider is Anthropic. When you use GPT-4, the provider is OpenAI. When you use Gemini, it’s Google. Providers own the model, run the GPUs, and set the pricing. Some providers make their own models (Anthropic, Google); others host models made by different teams (Together.ai, Fireworks).
A wrapper is a company that doesn’t run its own AI but builds a product on top of someone else’s API. Some wrappers add genuine value — better interfaces, billing features, multi-provider access. Others are just reselling API access with a markup and a logo. The key question is: what value does the wrapper add? If the answer is “nothing,” you’re just paying extra.
BYOK (bring your own key) means that instead of paying a platform’s marked-up price, you get your own API key directly from the provider (like Anthropic or OpenAI) and plug it into the platform. You pay the provider directly at their wholesale rates, and the platform just provides the interface. It’s like bringing your own ingredients to a restaurant that charges a cooking fee instead of the full meal price.
Multimodal & Beyond Text
A multimodal AI can process and generate multiple types of content: text, images, audio, video, or code. Early AI was text-only — you typed, it typed back. Modern multimodal models can look at an image and describe it, listen to audio and transcribe it, or take a text description and generate an image. The trend is toward models that handle everything.
MCP (Model Context Protocol) is a standard way to connect AI models to external tools and data sources. Instead of just chatting, the AI can search the web, query databases, read files, run code, call APIs, and take actions in the real world. MCP defines how these connections work so that any compatible tool works with any compatible model. Think of it as USB for AI — a universal plug that lets you connect any tool.
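The “universal plug” idea can be sketched as a tool registry plus a dispatcher. This is only an illustration of the concept, not the actual protocol (real MCP exchanges JSON-RPC messages between clients and servers), and the tool here is made up:

```python
# Toy tool registry illustrating the idea behind MCP: tools are
# advertised in a standard shape, so any compatible caller can
# discover and invoke them. Not the real MCP wire format.
tools = {}

def register_tool(name, description, fn):
    """Advertise a tool with a name and description so a model
    can discover what it does."""
    tools[name] = {"description": description, "fn": fn}

def call_tool(name, **kwargs):
    """Dispatch a model's tool call to the registered implementation."""
    return tools[name]["fn"](**kwargs)

# A made-up example tool.
register_tool(
    "get_time_zone",
    "Look up a city's time zone.",
    lambda city: {"London": "UTC+0"}.get(city, "unknown"),
)
```

The payoff of standardizing the registry shape is the same as USB’s: the tool doesn’t need to know which model is calling it, and the model doesn’t need custom code per tool.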
Quick Reference
LLM = the AI brain • Token = its unit of measurement • Context window = its working memory • Prompt = your instructions • Inference = the AI thinking • Temperature = creativity dial • Hallucination = confident fiction • Fine-tuning = specialized training • RAG = giving it reference material • Embedding = meaning as numbers • Provider = who runs it • Wrapper = middleman • BYOK = your keys, their interface • Multimodal = beyond text • MCP = AI + real tools
That’s the vocabulary. You don’t need to memorize it all at once — come back to this page when you encounter a term you’re not sure about. The goal isn’t to sound smart in conversations about AI. It’s to understand what you’re buying, what you’re using, and what the people selling you AI tools are actually talking about.
Want to see these concepts in action? Zubnet puts 361+ models from 61 providers in one place — with BYOK support, multi-model comparison, and transparent pricing.