
Open Weights

Also known as: Open Source (in AI context)
A company releases a model's trained parameters so that anyone can download and run them. "Open weights" is more precise than "open source" because most released models include neither the training data nor the training code: you get the finished model, but not the recipe. Llama, Mistral, and Qwen are all open-weights models.

Why It Matters

Open weights means you can run AI entirely privately on your own hardware: no API calls, no data leaving your network. The trade-off is that you need GPU resources to run the model, and security becomes your own responsibility.

Deep Dive

The term "open weights" exists because the AI industry's use of "open source" is genuinely misleading. Traditional open source (as defined by the OSI) means you get the source code, can modify it, and can redistribute it. When Meta releases Llama, you get the trained model weights — the billions of numerical parameters that define the model's behavior — but not the training data, not the full training code, and often not the data preprocessing pipeline. You can run inference and fine-tune, but you can't reproduce the model from scratch. The Open Source Initiative released a formal definition of "Open Source AI" in late 2024 attempting to clarify this, but the industry still uses the terms loosely. Knowing the distinction matters when you're evaluating what you can actually do with a model.

The License Spectrum

The spectrum of openness varies widely between releases. At one end, Meta's Llama models come with a custom license that prohibits use by companies with over 700 million monthly active users (clearly aimed at competitors) and requires attribution. Mistral's models have generally used Apache 2.0, one of the most permissive licenses available. Alibaba's Qwen family uses Apache 2.0 as well. DeepSeek has released weights under MIT license. Meanwhile, projects like BLOOM (BigScience) and OLMo (AI2) went further by also releasing training data and full training code — these are closer to truly open source. For developers, the license determines whether you can use the model commercially, whether you need to share modifications, and whether you can build proprietary products on top of it.

Running It Yourself

Running open-weights models yourself has gotten dramatically more accessible thanks to quantization and optimized inference engines. A 70-billion-parameter model that needs over 140 GB of VRAM in 16-bit precision shrinks to roughly 35-40 GB at 4-bit quantization with acceptable quality loss, within reach of a dual-GPU workstation or a single GPU with partial CPU offloading; models in the 7B-30B range fit comfortably on a single 24 GB consumer card. Tools like llama.cpp, vLLM, and Ollama have made local inference almost trivially easy: you can have a capable model running on a gaming laptop in minutes. The practical bottleneck has shifted from "can I run it?" to "is the quality sufficient for my use case?" Quantized smaller models are remarkably good for many tasks, but they do lose performance on complex reasoning and long-context work compared to full-precision frontier models served via API.
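The memory arithmetic above is easy to reproduce. A rough sketch: each parameter costs bits/8 bytes, so billions of parameters map almost directly to gigabytes. This estimate covers weights only; KV cache and activations add more on top.

```python
def estimate_weights_gb(n_params_b: float, bits: int) -> float:
    """Rough memory needed just to hold the weights, in GB.

    n_params_b: parameter count in billions.
    bits: precision per parameter (16 = fp16, 4 = 4-bit quantized).
    Weights only; KV cache and activation memory are extra.
    """
    # billions of params * (bits / 8) bytes per param ~= gigabytes
    return n_params_b * bits / 8

# A 70B model, fp16 vs. 4-bit quantized:
print(estimate_weights_gb(70, 16))  # → 140.0 (GB)
print(estimate_weights_gb(70, 4))   # → 35.0 (GB)
```

This is why 4-bit quantization is the usual starting point for local inference: it cuts the footprint by 4x relative to fp16 before any further tricks like offloading.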

The Safety Debate

The safety implications of open weights are one of the most actively debated topics in AI policy. The concern is straightforward: once weights are released, anyone can fine-tune away the safety training. Researchers have demonstrated that RLHF-based safety guardrails can be removed from open-weights models with just a few hundred examples and minimal compute. This means open-weights models can be turned into uncensored versions that will comply with any request. The counterargument — and it's a strong one — is that the knowledge these models contain is already available on the internet, that the benefits of open research and distributed innovation outweigh the risks, and that trying to restrict model distribution just concentrates power in a few large companies without meaningfully improving safety. Both sides have valid points, and the debate is far from settled.

Making the Choice

For practitioners choosing between open-weights and API-based models, the decision comes down to four factors: privacy (open weights keep your data local), cost (self-hosting is cheaper at high volume but more expensive at low volume), control (you can fine-tune and customize freely), and capability (API-only frontier models like GPT-4o and Claude still outperform the best open-weights models on many benchmarks, though the gap narrows with each major release). Many production systems use both — routing simple queries to a local open-weights model for speed and cost, while sending complex tasks to a frontier API. This hybrid approach gives you the best of both worlds, and it's increasingly the pragmatic choice for teams that need both performance and privacy.
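The hybrid routing idea can be sketched in a few lines. This is a hypothetical illustration, not a real library: `route` and the prompt-length heuristic are stand-ins for whatever complexity classifier and model backends a production system would actually use.

```python
def route(prompt: str, complexity_threshold: int = 200) -> str:
    """Decide where to send a query in a hybrid local/API setup.

    A simple length heuristic stands in for a real complexity
    classifier; production routers often use a small model or
    task-type rules instead.
    """
    if len(prompt) < complexity_threshold:
        return "local"  # e.g. a quantized open-weights model via Ollama/vLLM
    return "api"        # e.g. a frontier model behind a hosted API

print(route("Summarize this sentence."))  # → local
print(route("A long multi-step task..." * 50))  # → api
```

The design choice to note: the router is the cheap part. The real work in a hybrid system is keeping prompts and evaluation consistent across two very different backends.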
