This repository contains the code, data, and models for our paper "In-context Autoencoder for Context Compression in a Large Language Model", accepted at ICLR 2024 and first announced on arXiv in July 2023 here.
Added pretrain.py, instruction_finetune.py, and training_utils.py to icae_v2/; these scripts can be used to train the ICAE.
Released two new ICAE models based on Mistral-7B: a pretrained model and an instruction fine-tuned model. The inference code accompanying these models is also provided.
Compared with the V1 release, the Mistral-7B ICAE models add support for multi-span concatenation, as illustrated in Figure 6 of the paper, and extend the maximum input length to 5,120 tokens.
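As a rough illustration (not the repository's actual API), multi-span concatenation can be thought of as compressing each span into a fixed number of memory slots and concatenating the slots before decoding. All names, shapes, and the slot count below are assumptions made for this sketch; the real encoder produces learned memory slot representations.

import torch

NUM_MEMORY_SLOTS = 128   # assumed number of memory slots per span (illustrative)
HIDDEN_SIZE = 4096       # hidden size of a Mistral-7B-scale model

def compress(span_embeddings: torch.Tensor) -> torch.Tensor:
    # Placeholder for the ICAE encoder: it maps (span_len, hidden) token
    # embeddings to a fixed-size memory of shape (NUM_MEMORY_SLOTS, hidden).
    memory = torch.zeros(NUM_MEMORY_SLOTS, HIDDEN_SIZE)
    n = min(span_embeddings.size(0), NUM_MEMORY_SLOTS)
    memory[:n] = span_embeddings[:n]
    return memory

# Two spans whose total length stays within the extended 5,120-token limit.
spans = [torch.randn(2048, HIDDEN_SIZE), torch.randn(3072, HIDDEN_SIZE)]
concatenated_memory = torch.cat([compress(s) for s in spans], dim=0)
print(concatenated_memory.shape)  # (2 * NUM_MEMORY_SLOTS, HIDDEN_SIZE): what the decoder conditions on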
In the V2 release, the dataset and models have been moved to my Hugging Face repository.
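For reference, a minimal sketch of fetching a released checkpoint from the Hugging Face Hub is shown below; the repo_id is a placeholder, not the actual model identifier.

# A minimal sketch, assuming the checkpoints are hosted on the Hugging Face Hub.
# The repo_id below is a placeholder; replace it with the actual model repository.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="<username>/<icae-mistral-7b-checkpoint>")
print(f"Checkpoint downloaded to {local_dir}")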
One can simply try the released ICAE models by running:
# transformers >= 4.36.2 is required
cd icae/code/icae_v2
# run the instruction fine-tuned ICAE model
bash fine_tuned_inference_script.sh
# Or run the pretrained ICAE model
bash pretrained_inference_script.sh
This is the original (V1) release of the In-context Autoencoder repository. It includes the PwC dataset, the code, and the fine-tuned ICAE model based on Llama-2-7b-chat used in the paper.
The first version of the ICAE model is based on Llama-2-7b-chat and is the model used in the paper. It can be downloaded from this link. Please use it with the code in the code/icae_v1 directory.
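Since the paper trains ICAE with LoRA, loading the V1 checkpoint on top of Llama-2-7b-chat might look like the sketch below. The adapter path is a placeholder, the checkpoint may contain modules beyond the LoRA weights, and the authoritative loading and inference logic is the code in code/icae_v1.

# A hedged sketch of attaching a LoRA adapter to Llama-2-7b-chat with PEFT.
# The adapter path is a placeholder; defer to code/icae_v1 for the actual
# model definition and inference procedure.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = PeftModel.from_pretrained(base_model, "<path-to-icae-v1-checkpoint>")  # placeholder path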
If our work contributes to your research, please cite our paper:
@inproceedings{ge2024incontext,
title={In-context Autoencoder for Context Compression in a Large Language Model},
author={Tao Ge and Hu Jing and Lei Wang and Xun Wang and Si-Qing Chen and Furu Wei},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=uREj4ZuGJE}
}