A chatbot maintains a conversation history: a sequence of messages alternating between user and assistant, often prefixed by a system message. Each time you send a message, the entire conversation history is sent to the model as context. The model generates a response conditioned on this full history. This is why chatbots seem to "remember" earlier parts of the conversation — they're re-reading it every time.
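The loop described above can be sketched in a few lines. This is a minimal illustration, not any particular vendor's API: `call_model` is a hypothetical stand-in for a real LLM call, and the `{"role": ..., "content": ...}` message shape is an assumption modeled on common chat APIs.

```python
# Minimal sketch of chatbot turn handling. `call_model` is a hypothetical
# placeholder for a real LLM API call; here it just echoes the last user turn.

def call_model(messages):
    # A real implementation would send `messages` to a model endpoint.
    last_user = next(m["content"] for m in reversed(messages) if m["role"] == "user")
    return f"(reply to: {last_user})"

def send(history, user_text):
    """Append the user turn, send the ENTIRE history, append the reply."""
    history.append({"role": "user", "content": user_text})
    reply = call_model(history)  # the full transcript goes in on every turn
    history.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a helpful assistant."}]
send(history, "My name is Ada.")
send(history, "What's my name?")
# The second call re-sends all prior messages; the model can only "remember"
# Ada because the transcript containing that fact is in its context.
print(len(history))  # 5 messages: system + 2 user + 2 assistant
```

Note that `send` mutates the shared `history` list: the transcript itself is the only state, which is exactly why dropping it would make the model forget everything.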
The "memory" of a chatbot is bounded by its context window. Once the conversation exceeds the context limit, earlier messages must be dropped or summarized. The chatbot doesn't truly remember — it re-reads the transcript. Some chatbots implement persistent memory by storing key facts in a separate database and injecting them into the system prompt, giving the appearance of long-term memory across conversations. But the model itself has no state between API calls.
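When the transcript outgrows the context window, something has to give. A common approach is to keep the system message pinned and drop the oldest turns first. The sketch below assumes a crude token estimate (`len(text) // 4` plus a small per-message overhead); a real system would use the model's actual tokenizer.

```python
# Sketch of context-window trimming: keep the system message, drop the
# oldest turns once a token budget is exceeded. The token estimate is a
# rough assumption; production code would use the model's tokenizer.

def estimate_tokens(message):
    return len(message["content"]) // 4 + 4  # +4 per-message overhead (assumption)

def trim_history(history, budget):
    """Return the system message plus the most recent turns that fit in `budget`."""
    system, rest = history[0], history[1:]
    kept = []
    total = estimate_tokens(system)
    for msg in reversed(rest):      # walk newest-first
        cost = estimate_tokens(msg)
        if total + cost > budget:
            break                   # everything older than this point is dropped
        kept.append(msg)
        total += cost
    return [system] + list(reversed(kept))
```

Summarization-based variants replace the dropped turns with a model-written summary message instead of discarding them outright; the pinned-system-message pattern is the same.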
What separates a chatbot from a raw API call is the product layer: the UI design, the conversation management, the safety filters, the model routing (some chatbots use different models for different tasks), tool integrations (web search, code execution, file analysis), and the system prompt that defines the assistant's personality and capabilities. Two chatbots using the same underlying model can feel completely different because of their product layer choices.
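The product-layer point can be made concrete with a toy sketch: two "chatbots" sharing one underlying model, differing only in their system prompt and routing rule. Everything here is illustrative — `base_model`, `Chatbot`, and the router signature are invented for this example, not a real framework.

```python
# Toy product layer: the same hypothetical `base_model` wrapped with
# different system prompts and routing rules yields different chatbots.

def base_model(messages):
    # Stand-in for a shared underlying model; echoes its system prompt
    # so the effect of product-layer choices is visible.
    return f"[{messages[0]['content']}] response"

class Chatbot:
    def __init__(self, system_prompt, router=None):
        self.system_prompt = system_prompt
        # A router could pick a cheaper or task-specific model per request;
        # the default always returns the shared base model.
        self.router = router or (lambda user_text: base_model)

    def ask(self, user_text):
        model = self.router(user_text)
        messages = [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": user_text},
        ]
        return model(messages)

pirate = Chatbot("You are a pirate.")
tutor = Chatbot("You are a patient math tutor.")
# Same base_model underneath, different product-layer choices on top.
```

Real product layers add much more around this core — safety filtering before and after the model call, tool dispatch, conversation persistence — but they are all wrappers of the same shape: code around the model call, not changes to the model itself.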