What is Retrieval-Augmented Generation?
Retrieval-Augmented Generation (RAG) is a Generative AI (GenAI) implementation technique that is accelerating the adoption of GenAI and Large Language Models (LLMs) across enterprise environments. By enabling organizations to use their proprietary data in ways that were previously impossible, RAG transforms static information into dynamic, interactive assets accessible through conversational interfaces.
RAG (Retrieval-Augmented Generation) was first introduced in 2020 by Lewis et al. as a technique for handling knowledge-intensive NLP tasks using LLMs. Initially, RAG implementations were constrained by the limitations of LLMs in natural language understanding and instruction following. However, as newer model generations quickly overcame these barriers, the focus shifted from RAG being a “model problem” to a “data problem.” Over time, RAG has evolved from being just one of many techniques to an implementation strategy almost synonymous with Enterprise GenAI.
Why is Retrieval-Augmented Generation important?
GenAI and RAG allow users to engage with data through conversational AI and Copilot interfaces, enabling new possibilities for both customer and employee experiences. For instance, customers can ask questions about product specifications, employees can retrieve specific policies from company documentation, and customer service agents can quickly find the end date of an active promotion – all through natural language conversations with GenAI-powered bots.
In addition to expanding the functionality of AI-driven applications, RAG also addresses one of the most significant challenges associated with GenAI: hallucinations. By grounding LLMs in verifiable, source-based knowledge, RAG ensures that the generated answers are based on data that the organization controls. This reduces the risk of inaccurate or hallucinated answers, making RAG a critical enabler for building trustworthy and effective AI systems.
All of these capabilities make RAG a key technique for developing and deploying GenAI-powered solutions, such as Copilots and intelligent agents, in enterprise settings. Salesforce is no exception.
Salesforce’s Use of RAG in Einstein Copilot and Agentforce
Salesforce uses Retrieval-Augmented Generation (RAG) in its Einstein Copilot and Agentforce platforms, which were discussed at this year’s Dreamforce event. As Salesforce put it, “AI needs your data to understand your business,” and this is exactly what RAG helps achieve. Without RAG and the reliable data feeding it, Marc and Sanjna would have been very disappointed customers in the AI-assisted Saks shopping experiences they described in their keynotes.
Einstein Copilot and Agentforce access relevant enterprise knowledge through Salesforce’s Data Cloud, which brings together both structured and unstructured data sources. Unstructured data is a key resource for RAG, as it’s estimated that 90% of enterprise data is in formats like PDFs and other documents.
This unstructured data is processed in stages: it is first ingested and chunked, then converted into numerical representations called embeddings. Embeddings allow the retrieval engine to find relevant knowledge articles for a given user’s query. When a request is made, the retrieval engine compares vector representations of the request and the stored data to locate the most relevant articles.
The retrieved articles, referred to as context, help ground the LLM answers in the company’s own data. Both the user request and the retrieved context are then sent to Einstein’s Trust Layer, where they are combined and injected into the LLM’s prompt. Einstein then generates a contextualized response for the user, along with references to the knowledge articles used to produce the answer.
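The retrieval-and-grounding flow described above can be sketched in a few lines of Python. This is a toy illustration, not Salesforce’s actual implementation: the bag-of-words vectors stand in for real learned embeddings, and the article texts, similarity function, and prompt template are all assumptions made for the example.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a production system would call
    # an embedding model instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical knowledge articles, pre-embedded at ingestion time.
articles = [
    "Returns are accepted within 30 days of purchase.",
    "The spring promotion ends on May 31.",
    "Shipping is free on orders over $50.",
]
index = [(doc, embed(doc)) for doc in articles]

def retrieve(query, k=1):
    # Rank stored articles by similarity to the query vector.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query, context):
    # Inject the retrieved context into the LLM prompt to ground the answer.
    joined = "\n".join(context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}"
    )

ctx = retrieve("When does the promotion end?")
prompt = build_prompt("When does the promotion end?", ctx)
```

The prompt produced this way is what would be handed to the LLM, which generates an answer grounded in the retrieved article rather than in its training data alone.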
Based on: How Einstein Copilot Search Uses Retrieval Augmented Generation to Make AI More Trusted and Relevant
Key Issues with Enterprise RAG
While RAG may seem straightforward in concept, organizations often struggle to implement it successfully in production environments and scale it effectively. The most common pain points are inaccurate or inconsistent responses, leading to an overall lack of trust in the system’s reliability. With RAG, the responsibility for answer accuracy shifts from the LLM’s internal knowledge to the organization’s knowledge base and the quality of the data used for grounding.
RAG adoption barriers are now well understood and documented; most issues organizations encounter can be traced back to two primary factors: poor data quality and the inherent black-box nature of Large Language Models. Below, we outline the key failure points that commonly contribute to unsuccessful RAG initiatives:
Siloed, Scattered, and Disjointed Data
In many enterprises, data is distributed across several disconnected systems, leading to fragmentation that complicates the process of accessing, analyzing, and using relevant information. Organizations lack clear visibility into what data is available and where it is stored.
Data Lacking Business Context
One of the most significant challenges in integrating LLMs with proprietary data is the lack of critical business context. Data without context can be misinterpreted or rendered unusable. In LLM applications, this issue becomes even more serious, as the models must fall back on internal knowledge gained during training to infer meaning, often producing hallucinated or inaccurate responses.
Duplicate and Outdated Data
The presence of redundant or outdated data within systems severely undermines the accuracy of data retrieval processes. LLMs that pull knowledge from such sources may generate answers based on stale or redundant content, presenting incorrect information to users. We explore the issue of duplicate content in RAG in depth in our blog post, 10 Ways Duplicate Content Can Cause Errors in RAG Systems.
Conflicting and Inconsistent Data
When data from different sources contradicts or is inconsistent, it degrades the quality of RAG outputs and leads to a lack of confidence in the system’s reliability. These inconsistencies are surfaced through AI-generated outputs, again resulting in inaccurate and unreliable answers.
Data Leakage and Privacy Concerns
Insufficient access controls and a lack of awareness around the existence of sensitive data can significantly increase the risk of data leakage and privacy violations. Protecting sensitive information is crucial, especially as organizations adopt more sophisticated AI technologies like AI Agents, which may inadvertently expose or misuse such data.
The Black Box Nature of LLMs
A key challenge with LLMs is their inherent “black box” nature, which makes it difficult to interpret how the model arrives at a specific output. This lack of transparency complicates debugging and understanding incorrect or suboptimal answers. Both are essential for teams looking to improve their RAG systems over time.
Reliable Salesforce RAG with Shelf
The Shelf platform is designed to help teams tackle and resolve the enterprise RAG challenges discussed above. Salesforce’s RAG implementations face these same challenges, which organizations must overcome to ensure reliable performance.
Shelf addresses the first major RAG issue—poor data quality—through Unstructured Data Quality and Next-Generation Knowledge Management. This allows teams to catalog, govern, monitor, and assure the quality of the data used in Salesforce’s GenAI offerings. By using Shelf, organizations can ensure that only accurate and trusted data is used for generating answers in Einstein Copilot.
The second key challenge is the black-box nature of LLMs. Shelf’s Interaction Analysis provides real-time feedback on answer inaccuracies, enabling teams to proactively identify and address any data issues that may lead to incorrect responses from the Copilot.
The Shelf platform includes six key components that empower teams to address critical RAG issues and enhance the reliability of Salesforce GenAI applications:
Shelf Content Integration Layer
This layer ingests and standardizes data across multiple sources, effectively solving the challenge of siloed, scattered, and disjointed data. By creating a unified view of data across systems, organizations gain transparency into their available data assets, setting the foundation for reliable RAG systems.
Shelf Data Catalog
The Data Catalog enables organizations to discover, categorize, and enrich their data with essential business context, directly addressing the issue of data lacking business context. By providing clear metadata and contextualization, it ensures that retrieved data is both relevant and understandable for LLMs within the organization’s specific environment. Unlike many open-source or proprietary data catalogs, the Shelf platform incorporates human-in-the-loop feedback to refine and implement the data catalog as a semantic data layer to achieve consistency, relevance, and accuracy.
Shelf Data Quality Layer – Duplicates Detection
This layer is designed to detect and flag documents that contain outdated or duplicate information. By addressing the challenge of duplicate and outdated data, Shelf ensures that RAG systems only work with current, high-quality data, minimizing the risk of generating obsolete or incorrect responses.
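One common approach to near-duplicate detection (the source does not specify which method Shelf uses) is to fingerprint each document with word n-gram shingles and compare fingerprints with Jaccard similarity. A rough sketch, with made-up sample documents and an illustrative threshold:

```python
def shingles(text, n=3):
    # Word n-grams act as a cheap fingerprint of the document.
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    # Overlap of two shingle sets: 1.0 means identical, 0.0 means disjoint.
    return len(a & b) / len(a | b) if a | b else 0.0

def find_near_duplicates(docs, threshold=0.6):
    # Flag every pair of documents whose fingerprints overlap heavily.
    fingerprints = [shingles(d) for d in docs]
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard(fingerprints[i], fingerprints[j]) >= threshold:
                pairs.append((i, j))
    return pairs

# Hypothetical knowledge-base snippets; the first two are near-duplicates.
docs = [
    "Refunds are issued within 30 days of the original purchase date.",
    "Refunds are issued within 30 days of the original purchase date by request.",
    "Our offices are closed on public holidays.",
]
duplicate_pairs = find_near_duplicates(docs)
```

Flagged pairs can then be reviewed so that only one canonical, current version of each document feeds the retrieval index.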
Shelf Data Quality Layer – Conflicting Information Detection
In addition to detecting redundancy, Shelf’s Data Quality Layer also identifies conflicting or inconsistent information across data sources. This helps resolve the issue of conflicting and inconsistent data, improving the accuracy and consistency of RAG outputs.
Shelf Data Governance
With robust data governance features, Shelf enables organizations to manage and control data access while also identifying sensitive information, addressing the concerns around data leakage and privacy. This ensures compliance with data security policies and reduces the risk of unauthorized exposure of confidential data through GenAI solutions.
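To make one governance primitive concrete, the sketch below redacts two common sensitive-data patterns with regular expressions before text reaches a prompt. This is a minimal illustration only; production governance relies on trained classifiers, access policies, and audit trails, and the patterns here are simplistic by design.

```python
import re

# Illustrative patterns for two kinds of sensitive data; real systems
# cover far more categories and use ML-based detection.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    # Replace each detected span with a labeled placeholder so the
    # downstream LLM never sees the raw sensitive value.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

clean = redact("Contact jane.doe@example.com, SSN 123-45-6789.")
```

Running redaction at ingestion (rather than at query time) keeps sensitive values out of the retrieval index entirely.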
Shelf Data Observability
Shelf’s Data Observability component provides insights into GenAI application interaction data, offering real-time feedback on potential inaccuracies in generated answers. By addressing the black-box nature of LLMs, this feature enables users to understand and correct system outputs, improving transparency and trust in RAG systems over time.
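The feedback loop this enables can be sketched as a minimal interaction log that flags source documents repeatedly tied to unhelpful answers. The field names, sample data, and threshold below are assumptions for illustration, not Shelf’s actual schema:

```python
from collections import Counter

# Hypothetical log of GenAI interactions: the user's query, which source
# documents grounded the answer, and whether the user found it helpful.
interactions = [
    {"query": "promo end date", "sources": ["promo.pdf"], "helpful": False},
    {"query": "return policy", "sources": ["policy.pdf"], "helpful": True},
    {"query": "promo terms", "sources": ["promo.pdf"], "helpful": False},
]

def flag_problem_sources(interactions, min_negative=2):
    # Count how often each source document backed an unhelpful answer;
    # documents crossing the threshold are candidates for review.
    negatives = Counter()
    for record in interactions:
        if not record["helpful"]:
            negatives.update(record["sources"])
    return [src for src, n in negatives.items() if n >= min_negative]

flagged = flag_problem_sources(interactions)
```

Routing flagged documents back to content owners closes the loop between observed answer quality and the underlying data.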
Shelf provides organizations with the capabilities needed to support their GenAI initiatives by enabling data GenAI readiness and offering transparency into the quality of answers generated by GenAI applications. Both aspects are crucial for unlocking the full potential of Salesforce’s GenAI offerings.
The Future of RAG and The Importance of Data AI Readiness
As we witness rapid advancements in the capabilities of LLMs and GenAI applications, organizations are coming to the realization that GenAI alone cannot solve all of their problems. Achieving GenAI readiness requires a critical focus on one key asset: data. The principle of “garbage in, garbage out” remains true—without high-quality data, even the most advanced AI systems will struggle to deliver accurate results.
Over time, the quality of an organization’s data inevitably degrades. We explore the key concepts of data degradation in our blog post: Understanding Data Decay, Data Entropy, and Data Drift: Key Differences You Need to Know. Data management issues and Data Quality are often cited as the top obstacle preventing organizations from fully realizing the potential of GenAI. This is particularly true for applications that use RAG as their LLM implementation technique. We discuss this in depth in our YouTube video, Garbage In, Garbage Out: How Poor Data Quality Critically Impairs RAG System Accuracy.
While LLMs are becoming more advanced and their context windows are expanding, data quality remains a fundamental challenge that no LLM can address on its own. Incorrect data entered into an LLM’s prompt will still produce inaccurate results, regardless of the model’s capabilities. GenAI solutions such as copilots, AI agents, and conversational AI will all be impacted by poor data quality. This is also true for Salesforce’s Einstein Copilot and the Agentforce platform, which allow customers to integrate LLMs with their proprietary data.
To unlock the full potential of GenAI, organizations must prioritize organizing, preparing, and cleaning their data and knowledge sources. These assets represent the most valuable resources enterprises have accumulated over time and are key to enabling GenAI to deliver transformative applications that were once unimaginable. Ultimately, the path to GenAI success is paved with well-structured and high-quality enterprise data.