An implementation of the Retrieval-Enhanced Transformer (RETRO) based on Hugging Face Transformers and FAISS
- Implement the Chunked Cross Attention (CCA)
- Implement the BERT retriever based on Hugging Face Transformers and FAISS
- Add CCA and retriever to GPT2 model of Hugging Face Transformers
- Fine-tune and evaluate the model
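
To illustrate the idea behind Chunked Cross Attention, here is a minimal sketch of its causal alignment as described in the RETRO paper: the neighbours retrieved for chunk `u` are attended to by the tokens running from the last token of chunk `u` up to the second-to-last token of chunk `u + 1`, so the first `chunk_size - 1` tokens attend to no retrieved text. The function name `cca_alignment` and its parameters are hypothetical and not taken from this repository; the sketch only computes token positions and performs no attention.

```python
def cca_alignment(seq_len, chunk_size):
    """Map each chunk index to the token positions that may attend to the
    neighbours retrieved for that chunk (0-indexed positions).

    Hypothetical helper for illustration; assumes seq_len is a multiple
    of chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if seq_len % chunk_size != 0:
        raise ValueError("seq_len must be a multiple of chunk_size")

    num_chunks = seq_len // chunk_size
    alignment = {}
    for u in range(num_chunks):
        # Attending tokens start at the last token of chunk u ...
        start = u * chunk_size + chunk_size - 1
        # ... and span chunk_size positions, clipped at the sequence end.
        end = min(start + chunk_size, seq_len)
        alignment[u] = list(range(start, end))
    return alignment


# Two chunks of four tokens each.
alignment = cca_alignment(seq_len=8, chunk_size=4)
print(alignment)  # {0: [3, 4, 5, 6], 1: [7]}
```

The shift by `chunk_size - 1` positions keeps the model autoregressive: the neighbours of a chunk are retrieved using the whole chunk, so only tokens at or after the chunk's last position may see them.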
@misc{borgeaud2022improving,
title = {Improving language models by retrieving from trillions of tokens},
author = {Sebastian Borgeaud and Arthur Mensch and Jordan Hoffmann and Trevor Cai and Eliza Rutherford and Katie Millican and George van den Driessche and Jean-Baptiste Lespiau and Bogdan Damoc and Aidan Clark and Diego de Las Casas and Aurelia Guy and Jacob Menick and Roman Ring and Tom Hennigan and Saffron Huang and Loren Maggiore and Chris Jones and Albin Cassirer and Andy Brock and Michela Paganini and Geoffrey Irving and Oriol Vinyals and Simon Osindero and Karen Simonyan and Jack W. Rae and Erich Elsen and Laurent Sifre},
year = {2022},
eprint = {2112.04426},
archivePrefix = {arXiv},
primaryClass = {cs.CL}
}