Knowledge Editing: Definition & Meaning — AI Wiki

Techniques for modifying specific facts in a trained model without retraining it. If a model still states "The president of France is Macron" after a new election has made that wrong, knowledge editing can update this one fact by modifying targeted weights, without affecting the model's other knowledge or capabilities. The goal is surgical precision: change one fact and leave everything else intact.

Why It Matters

Knowledge editing addresses a practical problem: models go out of date, and retraining is expensive. If specific facts could be updated cheaply, models could stay current between major training runs. It also has safety implications: could dangerous knowledge be edited out? The field is promising but immature, and edits often have unintended side effects on related knowledge.

Deep Dive

The dominant approach (ROME and its multi-edit successor MEMIT): locate the feedforward network (FFN) weights that encode a specific fact by tracing the causal effect of intermediate activations on the model's prediction, then modify those weights to change the stored association. For example, to update "The Eiffel Tower is in Paris" to the counterfactual "The Eiffel Tower is in London," you find the FFN weights that map "Eiffel Tower" → "Paris" and redirect them to "London."
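The weight-modification step can be illustrated with a toy rank-one update on a linear key-to-value map. This is a minimal sketch, not the published ROME procedure: it assumes an identity key covariance (so it is simply the minimum-norm update that makes the edited matrix map the key to the new value), and the 2-dimensional vectors `eiffel`, `other`, and `london` are hypothetical stand-ins for real hidden representations.

```python
# Toy rank-one edit of a linear "key -> value" memory, in the spirit of ROME.
# Assumption: identity key covariance, so this is the minimum-norm update,
# not the covariance-weighted update used by the published method.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rank_one_edit(W, key, new_value):
    """Return W' with W' @ key == new_value, differing from W by a rank-one term."""
    current = matvec(W, key)
    residual = [t - c for t, c in zip(new_value, current)]
    norm_sq = sum(k * k for k in key)
    if norm_sq == 0:
        raise ValueError("key must be non-zero")
    # W' = W + residual * key^T / (key^T key)
    return [
        [w + r * k / norm_sq for w, k in zip(row, key)]
        for row, r in zip(W, residual)
    ]

# Hypothetical 2-d representations, chosen only for illustration.
W = [[1.0, 0.0], [0.0, 1.0]]
eiffel = [1.0, 0.0]   # stand-in key for "Eiffel Tower"
other = [0.0, 1.0]    # unrelated key, orthogonal to `eiffel`
london = [0.0, 2.0]   # stand-in value for "London"

W_edited = rank_one_edit(W, eiffel, london)
print(matvec(W_edited, eiffel))  # the edited key now maps to `london`
print(matvec(W_edited, other))   # the orthogonal key maps as before
```

In this toy setting the edit leaves orthogonal keys untouched, which is the idealized "surgical" case. Real keys are rarely orthogonal, which is one reason side effects appear in practice.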

The Ripple Effect Problem

Editing in "The Eiffel Tower is in London" should also change the answers to "What country is the Eiffel Tower in?" (UK, not France) and "What landmarks are in Paris?" (no longer the Eiffel Tower). Current editing methods often fail at this: they change the direct fact but leave related inferences inconsistent. This "ripple effect" problem suggests that knowledge in LLMs is more interconnected than the surgical editing metaphor implies.
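One way to make the ripple effect measurable is to score an edit on two probe sets: the directly edited question and the questions whose answers should change as a consequence. The sketch below assumes a hypothetical `ToyModel` that stores every answer independently, which mimics an edit that rewrites only the direct association. The class, the questions, and the `pass_rate` helper are illustrative and not taken from any benchmark.

```python
# Sketch of a ripple-effect check. `ToyModel` is a hypothetical stand-in that
# stores each answer independently, so an edit touches only one association.

class ToyModel:
    def __init__(self, answers):
        self.answers = dict(answers)

    def ask(self, question):
        return self.answers.get(question)

    def edit(self, question, new_answer):
        # Only the directly edited question changes; nothing is re-derived.
        self.answers[question] = new_answer

def pass_rate(model, probes):
    """Fraction of (question, check) probes whose check accepts the answer."""
    passed = sum(1 for question, check in probes if check(model.ask(question)))
    return passed / len(probes)

model = ToyModel({
    "Where is the Eiffel Tower?": "Paris",
    "What country is the Eiffel Tower in?": "France",
    "Name a landmark in Paris.": "Eiffel Tower",
})
model.edit("Where is the Eiffel Tower?", "London")

direct_probes = [
    ("Where is the Eiffel Tower?", lambda a: a == "London"),
]
ripple_probes = [
    ("What country is the Eiffel Tower in?", lambda a: a == "UK"),
    ("Name a landmark in Paris.", lambda a: a != "Eiffel Tower"),
]

direct_score = pass_rate(model, direct_probes)
ripple_score = pass_rate(model, ripple_probes)
print(direct_score, ripple_score)
```

The gap between the two scores is the quantity of interest: a method can succeed fully on the direct probe while failing every dependent one.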

Scaling Challenges

A few edits work reasonably well. Hundreds of edits start to degrade model quality, because the edited weights accumulate changes that interfere with each other and with unedited knowledge. This limits knowledge editing's practical use: it is fine for a few corrections but cannot serve as a general model update mechanism. For staying current, retrieval-augmented generation (RAG), which provides updated information at inference time, remains more practical than editing the model's weights.
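The interference between accumulated edits can be seen even in a two-edit toy example. The sketch below applies two sequential minimum-norm rank-one updates to a small matrix, using hypothetical keys `k1` and `k2` that are deliberately not orthogonal. It illustrates the failure mode only: methods such as MEMIT solve for many edits jointly rather than one after another, so this is not a model of their behavior.

```python
# Toy demonstration: a second edit with an overlapping key disturbs the first.
# Assumption: sequential minimum-norm rank-one updates on a 2x2 matrix.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rank_one_edit(W, key, new_value):
    """Return W' with W' @ key == new_value, differing from W by a rank-one term."""
    residual = [t - c for t, c in zip(new_value, matvec(W, key))]
    norm_sq = sum(k * k for k in key)
    return [
        [w + r * k / norm_sq for w, k in zip(row, key)]
        for row, r in zip(W, residual)
    ]

W0 = [[1.0, 0.0], [0.0, 1.0]]
k1, v1 = [1.0, 0.0], [0.0, 2.0]   # first edit (hypothetical key and value)
k2, v2 = [1.0, 1.0], [3.0, 0.0]   # second edit; k2 overlaps with k1

W1 = rank_one_edit(W0, k1, v1)
W2 = rank_one_edit(W1, k2, v2)

# The second edit holds exactly, but the first has drifted from its target.
after = matvec(W2, k1)
drift = sum((a - b) ** 2 for a, b in zip(after, v1)) ** 0.5
print(matvec(W2, k2), after, drift)
```

With hundreds of edits and high-dimensional keys that all overlap slightly, this kind of drift compounds, which is consistent with the quality degradation described above.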
