Enable Large Language Models to Absorb New Knowledge Quickly, Accurately, and Efficiently!
A new study accepted at EMNLP 2024 proposes a novel method for retrieval-augmented continuous prompt learning, which can improve the efficiency of editing and reasoning in lifelong knowledge learning.
Model editing aims to correct outdated or incorrect knowledge in large language models (LLMs) without the high cost of retraining. Lifelong model editing, in which edits must be applied to the same model continuously, is the most challenging setting.

Previous work has focused on single or batch edits. However, these methods perform poorly in lifelong editing scenarios due to catastrophic knowledge forgetting and degradation of model performance. Although retrieval-based approaches have alleviated some of these issues, they are hindered by the slow and cumbersome process of integrating retrieved knowledge into the model.
The new method, named RECIPE, first converts knowledge descriptions into concise and informative token representations that serve as continuous prompts. These are prepended to the LLM’s input query embeddings, steering generation toward the edited knowledge.
It also integrates a Knowledge Sentinel mechanism, which computes a dynamic threshold used to decide whether the retrieval repository contains knowledge relevant to a query.
The retriever and prompt encoder are jointly trained to achieve key attributes of knowledge editing: reliability, generality, and locality.
Comparative experiments on lifelong editing across multiple authoritative base models and editing datasets demonstrate the superior performance of RECIPE.
This research is a joint effort by Alibaba’s Security Content Safety Team, the School of Computer Science and Technology at East China Normal University, and Alibaba Cloud Platform, focusing on knowledge editing for large language models.

Research Background
Even with powerful language understanding capabilities, large language models (LLMs) like ChatGPT face challenges, particularly in maintaining factual accuracy and logical consistency.
A critical question is whether these LLMs can be updated to correct inaccuracies without full retraining or continued pre-training, both of which are time-consuming and computationally expensive.
Model editing offers a promising solution: it modifies the model’s behavior on specific cases of interest while preserving overall performance across other tasks.

Previous approaches to knowledge editing fall into three groups: modifying internal model parameters, adding extra parameters, and using retrieval. Retrieval methods often rely on lengthy edit prefixes that reduce inference efficiency, while fine-tuning the model itself can lead to overfitting and degrade its original performance.
To address these issues, researchers aim to explore more efficient retrieval and prompt-based editing methods with minimal intervention in the model to avoid overfitting on the editing dataset.
Model Methodology
Background on Knowledge Editing
In this paper, the research team first formalizes the task definition of model editing in a lifelong learning scenario and introduces important evaluation attributes for model editing.
Task Definition

Task Attributes
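As a rough illustration of the three attributes named in this article (reliability, generality, and locality), the Python sketch below scores a toy edited model. The exact-match scoring, the `editing_scores` helper, and the lookup-table "models" are assumptions made for illustration only; they are not the paper's formal definitions.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]       # maps a query string to an answer string
Pairs = List[Tuple[str, str]]      # (query, expected answer)


def accuracy(model: Model, pairs: Pairs) -> float:
    """Fraction of queries answered with exactly the expected answer."""
    if not pairs:
        return 0.0
    return sum(model(q) == a for q, a in pairs) / len(pairs)


def editing_scores(edited: Model, original: Model, edits: Pairs,
                   paraphrases: Pairs, unrelated: List[str]) -> Dict[str, float]:
    """Score an edited model on the three editing attributes."""
    # Locality: the edited model should answer unrelated queries as before.
    unchanged = sum(edited(q) == original(q) for q in unrelated)
    return {
        "reliability": accuracy(edited, edits),        # the edit itself holds
        "generality": accuracy(edited, paraphrases),   # paraphrases hold too
        "locality": unchanged / len(unrelated) if unrelated else 0.0,
    }


# Toy models (hypothetical): lookup tables standing in for an LLM.
BASE = {"Who leads Acme?": "Alice", "Capital of France?": "Paris"}
PATCH = {"Who leads Acme?": "Bob", "Who is Acme's leader?": "Bob"}


def original_model(query: str) -> str:
    return BASE.get(query, "unknown")


def edited_model(query: str) -> str:
    return PATCH.get(query, original_model(query))


scores = editing_scores(
    edited_model, original_model,
    edits=[("Who leads Acme?", "Bob")],
    paraphrases=[("Who is Acme's leader?", "Bob")],
    unrelated=["Capital of France?"],
)
print(scores)
```

An editor that only memorizes the exact edit query would keep reliability high but lose generality, while one that changes unrelated answers would lose locality.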

RECIPE Lifelong Editing Methodology
The overall model framework is as follows:


Constructing and Updating the Knowledge Retrieval Repository
At time step $t$, given a new knowledge description $k_t$, its representation is obtained by passing the output of the encoder $f_{rm}$ through an MLP layer:

The encoder $f_{rm}$ concatenates its max-, min-, and average-pooled outputs into a single vector that represents the new knowledge. The continuous prompt $p_k$ is then produced by a separate, newly initialized MLP:

The final knowledge retrieval repository is updated from $K_{t-1}$ to $K_t$:
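The steps in this subsection can be sketched in plain Python as follows. This is a minimal illustration under several assumptions: `token_states` stands in for the per-token outputs of $f_{rm}$, a single randomly initialized linear layer stands in for each MLP, the prompt is generated from the knowledge representation, and the names `KnowledgeRepository`, `pool_concat`, and `linear` are hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def pool_concat(token_states: Matrix) -> Vector:
    """Concatenate max, min, and mean pooling taken over the token axis."""
    columns = list(zip(*token_states))
    max_pool = [max(c) for c in columns]
    min_pool = [min(c) for c in columns]
    mean_pool = [sum(c) / len(c) for c in columns]
    return max_pool + min_pool + mean_pool


def linear(x: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Apply y = W x + b (a one-layer stand-in for an MLP)."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def random_layer(rng: random.Random, out_dim: int, in_dim: int):
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim


class KnowledgeRepository:
    """Stores one (representation, continuous prompt) pair per edit."""

    def __init__(self, hidden_dim: int, rep_dim: int,
                 prompt_len: int, llm_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.prompt_len = prompt_len
        self.llm_dim = llm_dim
        # Pooled vector (3 * hidden_dim) -> retrieval representation.
        self.key_layer = random_layer(rng, rep_dim, 3 * hidden_dim)
        # Representation -> flattened prompt of prompt_len * llm_dim values.
        self.prompt_layer = random_layer(rng, prompt_len * llm_dim, rep_dim)
        self.entries: List[Tuple[Vector, Matrix]] = []

    def add(self, token_states: Matrix) -> None:
        """Update K_{t-1} to K_t by appending the new knowledge entry."""
        rep = linear(pool_concat(token_states), *self.key_layer)
        flat = linear(rep, *self.prompt_layer)
        prompt = [flat[i * self.llm_dim:(i + 1) * self.llm_dim]
                  for i in range(self.prompt_len)]
        self.entries.append((rep, prompt))


repo = KnowledgeRepository(hidden_dim=2, rep_dim=4, prompt_len=3, llm_dim=5)
repo.add([[1.0, 4.0], [3.0, 2.0]])   # two tokens, hidden size 2
print(len(repo.entries), len(repo.entries[0][1]))
```

Because each edit only appends an entry, earlier entries are never overwritten, which is consistent with the article's point that retrieval-based editing avoids touching the LLM's own parameters.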

Dynamic Prompt Retrieval Based on Knowledge Sentinel
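A minimal sketch of sentinel-based retrieval, assuming that the dynamic threshold is the query's similarity to a learned sentinel representation and that similarity is a dot product. The function name `sentinel_retrieve` and the toy vectors are hypothetical, and the sentinel vector is fixed by hand here rather than learned.

```python
from typing import List, Optional, Tuple

Vector = List[float]
Matrix = List[Vector]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def sentinel_retrieve(query_rep: Vector, sentinel_rep: Vector,
                      entries: List[Tuple[Vector, Matrix]]) -> Optional[Matrix]:
    """Return the best-matching prompt, or None if nothing beats the sentinel.

    The threshold is query-dependent: it is the similarity between the
    query and the sentinel, not one fixed number shared by all queries.
    """
    if not entries:
        return None
    threshold = dot(query_rep, sentinel_rep)
    best_key, best_prompt = max(entries, key=lambda e: dot(query_rep, e[0]))
    if dot(query_rep, best_key) > threshold:
        return best_prompt
    return None


# Toy repository: two stored edits with one-row "prompts".
entries = [([1.0, 0.0], [[0.1]]), ([0.0, 1.0], [[0.2]])]
sentinel = [0.5, 0.5]

# Best similarity 1.0 beats this query's threshold of 0.5: prompt returned.
print(sentinel_retrieve([1.0, 0.0], sentinel, entries))
# Best similarity is again 1.0, but the threshold is now 1.0: nothing returned.
print(sentinel_retrieve([1.0, 1.0], sentinel, entries))
```

The two calls have the same best similarity score yet produce different decisions, which is the behavior a single absolute threshold cannot reproduce.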

Dynamic Inference of the Edited Model
The researchers write the LLM to be edited as:

Given an input query $q$ and a continuous retrieved prompt $p(k_r) = KS(q)$, the inference process can be reformulated as:

where $\oplus$ denotes the concatenation of the retrieved continuous prompt matrix and the word embedding matrix of $q$.
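The concatenation $\oplus$ can be illustrated with a short Python sketch. It assumes the prompt and the query embeddings are row-major matrices with the same embedding width, and that an empty retrieval result leaves the query unchanged; `prepend_prompt` is a hypothetical helper name.

```python
from typing import List, Optional

Matrix = List[List[float]]


def prepend_prompt(prompt: Optional[Matrix], query_embeddings: Matrix) -> Matrix:
    """Place the retrieved prompt rows in front of the query's token embeddings."""
    if not query_embeddings:
        raise ValueError("query_embeddings must contain at least one token")
    if not prompt:
        # Nothing relevant was retrieved: the LLM sees the original input.
        return [row[:] for row in query_embeddings]
    width = len(query_embeddings[0])
    if any(len(row) != width for row in prompt):
        raise ValueError("prompt and query embeddings must share one width")
    return [row[:] for row in prompt] + [row[:] for row in query_embeddings]


query = [[1.0, 1.0], [2.0, 2.0]]         # two query tokens, width 2
prompt = [[0.1, 0.2]]                    # one continuous prompt token
print(prepend_prompt(prompt, query))     # 3 rows: prompt first, then query
print(prepend_prompt(None, query))       # 2 rows: query only
```

Since the prompt adds only a few rows to the input sequence, the rest of the LLM's forward pass runs unchanged.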
The feasibility of this method is supported by previous work such as P-Tuning, which demonstrated that training continuous prompt embeddings can effectively improve LLM performance on downstream tasks.
In RECIPE, researchers treat the editing of each knowledge statement as a small task. Instead of fine-tuning specific prompt encoders for each small task, they achieve the goals of these tasks by training the RECIPE module to generate continuous prompts, ensuring that the LLM adheres to the corresponding knowledge.
Model Training
A loss function is formulated to ensure effective editing via the generated continuous prompts and efficient retrieval of query-related knowledge for the LLM. Given training data containing $b$ editing examples:

The corresponding generalization and locality data are:

Therefore, the loss is formalized as follows:
- Editing Loss Training: The editing loss aims to ensure that generated continuous prompts guide the LLM to adhere to the characteristics of reliability, generality, and locality. Based on input editing data, sample losses corresponding to these three attributes are defined as follows:


The batch loss function for model editing is derived as follows:

- Prompt Loss Training: The training loss for prompt learning is based on contrastive learning and aligns with the characteristics of reliability, generality, and locality. For a batch of samples, the loss function for learning continuous prompts is formalized as:
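The two loss families described above can be sketched generically in Python. This is not the paper's exact formulation. It assumes that reliability and generality are trained with a negative log-likelihood on the target answer, that locality is trained with a KL divergence between the model's output distributions without and with the prompt, and that the contrastive term is a standard InfoNCE loss. All function names are hypothetical.

```python
import math
from typing import Sequence


def nll(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target answer under the model."""
    return -math.log(probs[target])


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def edit_loss(rel_probs: Sequence[float], gen_probs: Sequence[float],
              target: int, loc_before: Sequence[float],
              loc_after: Sequence[float]) -> float:
    """Sum of the three per-sample terms (assumed unweighted here)."""
    reliability = nll(rel_probs, target)               # edit query + prompt
    generality = nll(gen_probs, target)                # paraphrase + prompt
    locality = kl_divergence(loc_before, loc_after)    # unrelated query
    return reliability + generality + locality


def info_nce(similarities: Sequence[float], positive: int,
             temperature: float = 1.0) -> float:
    """Contrastive loss: -log softmax(similarities / temperature)[positive]."""
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)                                 # for numerical stability
    log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return log_denominator - scaled[positive]


# A perfect edit that leaves unrelated behaviour untouched costs nothing.
print(edit_loss([1.0, 0.0], [1.0, 0.0], 0, [0.5, 0.5], [0.5, 0.5]))
# The contrastive loss falls as the matching knowledge becomes more similar.
print(info_nce([2.0, 0.0], positive=0) < info_nce([0.0, 2.0], positive=0))
```

In a batch setting, per-sample values like these would be averaged over the $b$ editing examples.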



Experimental Results
Experimental Setup
- Datasets for Testing Editing Capability: Researchers used three public model editing datasets: ZSRE, CounterFact (CF), and Ripple Effect (RIPE).
ZSRE was generated through BART question answering and manual filtering, comprising 162,555 training samples and 19,009 test samples. Each sample includes an edit example along with its paraphrased and unrelated counterparts, matching the editing attributes of reliability, generality, and locality.
The CF dataset is characterized by edits to false facts, including 10,000 training samples and 10,000 test samples. These false facts are more likely to conflict with original knowledge in LLMs, making the editing process more challenging and providing a strong evaluation of editing capabilities.
RIPE divides the generality and locality attributes into fine-grained types and includes 3,000 training samples and 1,388 test samples. Generality covers logical generalization, compositionality I, compositionality II, and subject aliasing, while locality covers forgetfulness and relation specificity.
- Datasets for Testing General Capability: To assess the damage editing might cause to the overall performance of LLMs, researchers selected four popular benchmarks: CSQA for commonsense knowledge, ANLI for reasoning ability, MMLU for exam-taking capability, and SQuAD-2 for comprehension skills. PromptBench was used as the evaluation framework for this experiment.
- Model Baselines: In addition to Fine-Tuning (FT) as a basic baseline, researchers compared RECIPE with various powerful editing baselines.
MEND trains an MLP to transform the low-rank decomposition of the gradient of the model to be edited relative to the edit samples. ROME first uses causal mediation analysis to locate layers most affected by edit samples. MEMIT extends the scope of edits to multiple layers based on ROME, thereby improving editing performance and supporting batch editing. T-Patcher (TP) attaches and trains additional neurons in the FFN layer at the end of the model to be edited. MALMEN formulates parameter offset aggregation as a least squares problem and subsequently updates LM parameters using normal equations. WILKE selects edit layers based on the degree of pattern matching of editing knowledge across different layers.
Researchers also compared RECIPE against retrieval-based editing methods to further validate its effectiveness.
GRACE proposes retrieval adapters for continuous editing, maintaining a dictionary-like structure that builds new mappings for latent representations that need modification. RASE leverages factual information to improve editing generalization and guides the editor to identify relevant facts by retrieving them from a fact-patch memory.
In the baseline setup, researchers used ROME as the base editor for RASE, a configuration named R-ROME. LTE enhances the LLM’s ability to follow knowledge editing instructions, enabling it to use updated knowledge effectively when answering queries.
Experimental Results on Editing Capabilities
The following two tables compare editing performance on the LLAMA-2 and GPT-J models.


From the perspective of single-shot editing, the proposed method demonstrates superior performance in most test scenarios.
In lifelong editing scenarios, researchers observed the following:
- Methods that modify LLM parameters exhibit excellent editing performance in single-shot edits. However, their editing performance degrades significantly as the number of edits increases. This trend aligns with existing work highlighting the issue of toxicity accumulation;
- Methods that introduce additional parameters retain some reliability and generality during lifelong editing. However, the noticeable deterioration in locality on ZSRE indicates that the accumulation of extra parameters impairs the original inference process;
- Retrieval-based methods exhibit robustness to an increasing number of edits. Among these, the proposed method achieved the best results, affirming the advantages of retrieval and validating the effectiveness of the strategy.
Experimental Results on General Capabilities
Although these three editing metrics effectively demonstrate editing performance, researchers further investigated the extent to which these editors impact the model’s general capabilities.
Experiments reveal that non-retrieval-based methods lead to a significant decline in general capabilities. This can be attributed to the accumulation of pattern mismatches caused by external interventions during editing. In retrieval-based methods, LTE also exhibited performance degradation.
In contrast, RECIPE does not modify LLM parameters directly; it relies on prepending a short continuous prompt to guide the LLM to adhere to the edited knowledge. It preserves general performance best among the compared editors, indicating minimal harm to the model.

Comparison of Model Editing Efficiency
As shown in the table below, methods that employ editing-specific training, such as MEND, MALMEN, LTE, and RECIPE, edit significantly faster than techniques that require multiple iterations of backpropagation during editing.
Regarding inference speed, methods that modify model parameters maintain consistent speeds because they do not alter the original reasoning pipeline. T-Patcher slows down inference due to neuron accumulation.
Among retrieval-based methods, GRACE reduces the parallelism of model inference because of its dictionary pairing mechanism. R-ROME requires dynamic computation of editing matrices, and LTE requires long editing instructions to be attached to each query.
In contrast, RECIPE effectively preserves the LLM’s original inference speed by prepending a short continuous prompt for editing. Its total time, the shortest among the compared methods, further highlights RECIPE’s efficiency advantage.

Comparison of Ablation Study Results
Researchers conducted ablation studies on ZSRE, CF, and RIPE using LLAMA-2. Without CPT, the continuous prompt component, they instead used the word embeddings of the knowledge statements as the prompts retrieved from the knowledge base. Without KS, the Knowledge Sentinel, they applied a conventional contrastive learning loss that pulls the representations of reliability and generality samples toward the edited knowledge while pushing locality samples away.
After training completion, researchers adopted an absolute similarity threshold decision strategy to filter out irrelevant knowledge. Despite high locality, omitting CPT severely compromises RECIPE’s reliability and generality.
It can be observed that the results are almost identical to those obtained without using any editor.

This shows that simply concatenating the raw knowledge statement as a prefix is insufficient to make the LLM comply with the edit, whereas CPT helps the LLM adhere to it. Furthermore, discarding KS degrades editing performance, particularly generality and locality, because a single absolute similarity threshold cannot match the different thresholds that different queries require.