Understanding RAG: Retrieval Augmented Generation Explained Simply

Retrieval Augmented Generation (RAG) is a technique for customizing the outputs of a large language model (LLM) for a specific domain without altering the underlying model. Companies like CommandBar use RAG when building embedded user assistance agents that help users learn and use software. RAG adds a semantic retrieval step before the LLM call: information relevant to the user's request is retrieved and passed to the model along with the request. This approach suits scenarios where speed is not critical and where highly specific, high-dimensional outputs are not required.
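To make the retrieval step concrete, here is a minimal sketch of the retrieve-then-generate flow. It rests on several assumptions: the `DOCUMENTS` list, the function names, and the prompt wording are all hypothetical, the bag-of-words `embed` function is only a toy stand-in for a real embedding model, and `llm` is any callable that takes a prompt string and returns text, not a specific vendor API.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; in practice these would be chunks of product docs.
DOCUMENTS = [
    "To reset your password, open Settings and choose Security.",
    "Invoices can be downloaded from the Billing page as PDF files.",
    "Invite teammates from the Members tab in your workspace settings.",
]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

def answer(query, llm):
    """Retrieval happens first; only then is the (unmodified) model called."""
    passages = retrieve(query, DOCUMENTS)
    return llm(build_prompt(query, passages))
```

Calling `answer("How do I reset my password?", llm)` would pass the password passage to the model together with the question. The model itself is never retrained; only the prompt changes, which is why swapping in a newer foundation model requires no extra training work.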

RAG is preferred when language output needs to be constrained, when control over the system’s outputs is desired, when resources for training or fine-tuning a model are limited, and when it is important to benefit from improvements in foundation models as they are released. The alternatives to RAG are model training, where a new model is created from scratch, and fine-tuning, where an existing model is trained further for a specific use case.

RAG is not suitable where speed is crucial, because it adds a retrieval step before the model runs. It is also a poor fit where highly specific, high-dimensional outputs are needed: a general-purpose LLM may not suffice, and training a custom model may be necessary. Overall, RAG offers flexibility and control over language outputs, which makes it a valuable tool for companies like CommandBar in providing user assistance.

Source link: https://builtin.com/articles/retrieval-augmented-generation-explained