#AI Paper proposes CREAM, a method for extending the context of language models #ContextExtension

This AI Paper from China Proposes Continuity-Relativity indExing with gAussian Middle (CREAM): A Simple yet Effective AI Method to Extend the Context of Large Language Models

Researchers have introduced CREAM (Continuity-Relativity indExing with gAussian Middle), a method for extending the context window of transformer-based large language models (LLMs) up to 256K tokens. It targets the “Lost-in-the-Middle” problem: the difficulty models have in using information located in the middle of a long context.

CREAM manipulates position indices during fine-tuning and uses truncated Gaussian sampling to concentrate training on the middle portion of the context. Its positional scheme preserves two properties: continuity, by keeping positional indices densely connected, and relativity, by leveraging rotary positional encoding (RoPE) so the model learns relative positions between token pairs.

Experiments with Llama-2-7B and Llama-2-7B-Chat showed CREAM to be efficient and effective on long-context understanding tasks, outperforming existing methods on question answering and summarization. By balancing continuity and relativity in positional encoding, CREAM improves middle-content understanding and offers a practical, easy-to-implement approach to extending context length in LLMs.
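To make the sampling idea concrete, here is a minimal sketch of how position indices for a short training window could be drawn from a much longer target range, with the middle segment placed via a truncated Gaussian. All names, segment sizes, and the Gaussian parameters (`mu`, `sigma`) below are illustrative assumptions for this sketch, not the paper's exact indexing scheme.

```python
import random

def truncated_gaussian(lo, hi, mu, sigma):
    """Rejection-sample a value from N(mu, sigma) restricted to [lo, hi]."""
    while True:
        x = random.gauss(mu, sigma)
        if lo <= x <= hi:
            return x

def sample_positions(train_len=4096, target_len=262144, head=1024, tail=1024):
    """Build a position-index sequence of length `train_len` over the
    extended range [0, target_len): the head keeps the first positions,
    the tail keeps the last positions, and the middle segment's start is
    drawn from a Gaussian centred on the middle of the long context."""
    mid_len = train_len - head - tail
    head_ids = list(range(head))
    tail_ids = list(range(target_len - tail, target_len))
    # Bias the middle segment toward the centre of the extended context
    # (assumed parameters; the paper defines its own distribution).
    mu = target_len / 2
    sigma = target_len / 6
    lo, hi = head, target_len - tail - mid_len
    start = int(truncated_gaussian(lo, hi, mu, sigma))
    mid_ids = list(range(start, start + mid_len))
    return head_ids + mid_ids + tail_ids

ids = sample_positions()
```

Because the three segments never overlap, the resulting index list stays strictly increasing, which preserves the continuity of positional indices that CREAM emphasizes, while the large gaps between segments expose the model to long relative distances under RoPE.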


