Why information retrieval systems are foundational for trustworthy and factual application of generative AI

Written by Jonathan Anderson & Simon Althoff

More and more companies are relying on the analytical and generative capabilities of LLMs and other generative models in their day-to-day activities. Simultaneously, there are growing concerns that factual errors and underlying biases produced by these models may have negative consequences. We have already written an article about how we may have made computers more human, and what that means for businesses going forward; you can read it here.


Short on time?

Read the condensed summary for a quick overview of why information retrieval systems are fundamental for trustworthy AI.


Fact box: What is generative AI?

Generative artificial intelligence (generative AI) refers to a subset of artificial intelligence technologies that can produce new content and ideas, ranging from text and audio to images and videos. Unlike traditional AI systems that often classify or predict based on existing data, generative AI models learn patterns from large datasets and use this knowledge to create original outputs. This capability has broad applications across various domains, e.g. text generation, image creation, music and audio creation, video generation, design and art, and content personalization. 

The development of generative AI involves sophisticated techniques such as neural networks, particularly deep learning, and architectures like Generative Adversarial Networks (GANs) and Transformer models. These technologies have significantly advanced the capabilities of AI to generate content that is increasingly indistinguishable from that created by humans.

However, generative AI also raises ethical and societal concerns, such as the potential for misuse in creating misinformation, deepfakes, and other forms of deceptive content. Ensuring responsible development and deployment of generative AI is a critical area of ongoing research and discussion. In addition, there are clear concerns about the accuracy of these systems, something that has been highlighted in several studies, e.g. from McKinsey and LucidWorks.

The role of information retrieval systems

Generative AI models of all variants are trained on massive amounts of data to pick up the patterns necessary to perform their given tasks. LLMs, the variant used by most businesses today, require a very large corpus of text to learn to produce human-like language. Additionally, they are trained through various techniques to exhibit the behavior desired by their developers. Through this process the language models pick up knowledge about various topics, knowledge that can be used in later applications.

Further tuning these models on company-specific data feeds the model with additional knowledge to be used within the company. When the models are subsequently put into production, the knowledge embedded within the model is presented to the end user based on probabilistic predictions. Just as with humans, there is always the risk that the remembered information is incorrect. The knowledge may have been presented ambiguously during training, the user's question may have been unclear, or the model may simply predict the incorrect knowledge to use for the specific question. There are many reasons why a model may present incorrect information to a user, with no explanation or reasoning behind why that specific information was used.


"Generative AI systems, like humans, are inherently frail in their recall of information. They risk retaining incorrect knowledge due to ambiguously presented data during training, unclear user queries, or simply predicting the wrong information for a given question.”

Vilma Ylvén, Data scientist at Algorithma


Enter RAG, or retrieval-augmented generation, which has been a hot topic since large language models became popular, due to these models' clear lack of transparency. The concept of RAG is always evolving, but in essence it involves supplying the model with the relevant data to perform the desired task. Let's explore one of the most common applications of RAG:

Exhibit 1: Retrieval augmented generation

Imagine a company with a very large collection of documents crafted over many years of operations, which has found itself struggling to find and sort through the information when needed. The company wants a more accessible interface to its internal information and therefore implements an LLM to enable document chatting. To ensure that the model answers with relevant data, the company implements an information retrieval system to feed data to the model through the prompt.

This is a first step to ensuring that the model answers questions within the correct context and with the correct information. It is, however, not fool-proof, since it is not guaranteed that the model will use the information correctly, or at all, when crafting the response. By providing source references for the retrieved data alongside the response, the user is at least able to cross-reference the answer with the provided source information, which brings greater transparency to the process.
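To make the pattern concrete, here is a minimal sketch of the retrieve-then-generate flow described above. It is illustrative only: `search_index` and `llm_complete` are hypothetical stand-ins for a real IR system and LLM client, not any particular library's API.

```python
# Minimal retrieval-augmented generation sketch. `search_index` and
# `llm_complete` are hypothetical stand-ins; a real deployment would
# wire them to an actual search engine and LLM provider.
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str   # reference back to the source document
    text: str     # the retrieved passage

def search_index(query: str, top_k: int = 3) -> list[Document]:
    """Hypothetical IR call: return the top_k most relevant passages."""
    raise NotImplementedError("wire this up to your search engine")

def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call: return the model's completion."""
    raise NotImplementedError("wire this up to your LLM provider")

def answer_with_sources(question: str) -> tuple[str, list[str]]:
    docs = search_index(question)
    # Feed the retrieved passages to the model through the prompt,
    # instructing it to stay within the supplied context.
    context = "\n\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    prompt = (
        "Answer the question using ONLY the context below, and cite the "
        "[doc_id] of every passage you rely on.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    answer = llm_complete(prompt)
    # Return the source references alongside the answer so the user can
    # cross-reference the response against the original documents.
    return answer, [d.doc_id for d in docs]
```

Returning the doc_ids together with the answer is what lets the interface display references next to each response, which is the transparency step discussed above.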

How information retrieval systems impact trustworthiness

Providing references to the source information is a vital first step to increase the transparency and trustworthiness of generative AI systems. Most people would not trust a scientist who does not reference where their claims and information come from.

The issue with computer models is that we as humans have learned to expect them to be trustworthy, due to their historically deterministic behavior. When an LLM acts confident (and they almost always do), many will trust the information the model presents without second-guessing it, which is dangerous in many applications. Data referencing is one step in the right direction but also only part of the solution. Proper training and change management are crucial for setting proper expectations around the capabilities of generative AI applications.


"Most people would not trust a scientist who doesn't reference where their claims and information come from. Transparency, explainability and credibility are paramount as this builds trust and reinforces the integrity of the system." 

— Simon Althoff, Data Scientist at Algorithma


LLM applications, in their current form, largely take the shape of an “assistant”, “co-pilot” or “secretary”. And while these names are well intentioned, they may give the impression that the application is something (or someone) you can rely on to do its job in a way we can trust to be correct. The true nature of these models is that their output needs constant monitoring to make sure that it is factually correct and sound. In their current state, it is better to think of these assistants as document draft generators, code generators, or, in the case of the aforementioned example, natural language data interfaces. These labels are more telling of their actual capabilities and will lead to better conversations and outcomes.

Moving towards a “human in the loop” architecture

However, despite all this, the fact remains that we are in a transition where we are fundamentally changing how we interact with information. To enable interaction with the right information, we are entirely reliant upon smart information retrieval (IR) systems. The benefits of having the IR system as the main player in your LLM applications include:

  • Transparency of what information is used in the application

  • Possibility for the user to stop or change an output based on incorrect information

  • Possibility to monitor the underlying information for out-of-date information or factual errors, enabling the application to stay up to date over time (a minimal sketch of this follows after this list).
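One way to realize these benefits is to have the IR layer return provenance metadata with every hit, so the application can surface sources to the user and flag stale documents. A minimal sketch, with illustrative field names rather than any real system's schema:

```python
# Sketch: retrieval hits carry provenance metadata so the application can
# show sources (transparency), let users reject answers built on the wrong
# documents (control), and flag stale content (monitoring). Field names
# are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class RetrievalHit:
    doc_id: str
    source_uri: str         # where the passage came from
    last_updated: datetime  # when the source was last revised (tz-aware)
    score: float            # retrieval relevance score

MAX_AGE = timedelta(days=365)  # illustrative staleness threshold

def flag_stale(hits: list[RetrievalHit]) -> list[RetrievalHit]:
    """Return hits whose sources are older than MAX_AGE,
    so content owners can review or refresh them."""
    now = datetime.now(timezone.utc)
    return [h for h in hits if now - h.last_updated > MAX_AGE]
```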

By incorporating a "human in the loop" approach, we significantly enhance the reliability and adaptability of these systems. Human oversight allows for critical evaluation and decision-making that automated systems may not fully grasp. By involving humans, we can:

  1. Ensure ethical considerations are taken into account, preventing the propagation of biased or harmful content.

  2. Leverage human expertise to validate and refine information, ensuring that the most accurate and contextually appropriate data is used.

  3. Facilitate a feedback loop where humans can provide insights and corrections, leading to continuous improvement of the system's performance (a simple version of such a gate is sketched after this list).

  4. Maintain a balance between automation and human intuition, allowing for creative and nuanced responses that purely algorithmic approaches might miss.
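As a minimal sketch of what such a review gate can look like, the snippet below blocks any generated draft until a human approves or edits it, and logs every decision for the feedback loop. All names are illustrative assumptions, not a prescribed design.

```python
# Sketch of a "human in the loop" gate: nothing generated is released
# until a reviewer approves it, and every decision is logged so it can
# feed continuous improvement. All names here are illustrative.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ReviewDecision:
    draft: str
    approved: bool
    corrected_text: Optional[str] = None  # reviewer's edit, if any

feedback_log: list[ReviewDecision] = []  # retained for the feedback loop

def review_gate(draft: str,
                ask_reviewer: Callable[[str], ReviewDecision]) -> Optional[str]:
    """Pass a generated draft through a human reviewer before release.

    `ask_reviewer` is any callable that shows the draft to a person and
    returns a ReviewDecision: a CLI prompt, a web form, a ticket queue.
    """
    decision = ask_reviewer(draft)
    feedback_log.append(decision)  # every decision feeds improvement
    if not decision.approved:
        return None                # blocked: never reaches the end user
    return decision.corrected_text or draft
```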

In this hybrid model, humans and machines complement each other's strengths, creating a more robust and dynamic information retrieval and application system. This synergy not only improves the accuracy and relevance of information but also builds trust and reliability in the system, ensuring it evolves effectively over time.


"By placing smart information retrieval systems in the overall architecture of LLM applications, we ensure transparency, user control, and accuracy. A 'human in the loop' approach enhances reliability by integrating human oversight, human decision-making, and optimizing data precision. This synergy between human expertise and AI efficiency enables robust, adaptable information solutions."

— Jens Ekberg, CEO at Algorithma


Starting with a sound data infrastructure and information retrieval systems

Information retrieval is thus a fundamental piece in implementing valuable generative AI applications. These systems often require hands-on work to function properly and deliver the desired performance. Every dataset is different, and there are many ways to configure search algorithms in IR systems to handle these differences effectively (the sketch below illustrates one such configuration choice). This takes time and may involve a lot of trial and error, meaning good IR can be a costly investment. To limit the costs associated with IR development, as well as maintenance, it is paramount to start on the data and infrastructure side. At Algorithma we believe that the basis for good application of AI, including IR and LLMs, starts with a robust data infrastructure.
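To illustrate what "configuring search algorithms" can mean in practice, here is a minimal sketch of a hybrid ranking function that blends lexical matching with vector similarity. The weighting parameter is exactly the kind of knob that needs per-dataset tuning; the function names and formula are illustrative assumptions, not a specific product's API.

```python
# Sketch: one of many ways to configure ranking in an IR system -- a
# hybrid score blending lexical overlap with embedding similarity.
import math

def lexical_score(query: str, doc: str) -> float:
    """Fraction of query terms that also appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def hybrid_score(query: str, doc: str,
                 q_vec: list[float], d_vec: list[float],
                 alpha: float = 0.5) -> float:
    """Blend lexical and semantic relevance; alpha is the tuning knob.

    alpha=1.0 ranks purely on keyword overlap, alpha=0.0 purely on
    embedding similarity. The right value differs per dataset, which is
    why IR configuration tends to involve trial and error.
    """
    return alpha * lexical_score(query, doc) + (1 - alpha) * cosine(q_vec, d_vec)
```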

The main challenge is then ensuring reliable, scalable and efficient data handling. To truly succeed in LLM and AI initiatives, it is recommended that organizations follow these steps in order:

  1. Establish a reliable, scalable and efficient data infrastructure.

  2. Implement efficient information retrieval on top of that foundation.

  3. Add generative capabilities (if still relevant).

With a robust data infrastructure as a basis, IR systems enable quick and easy access to critical information, which is ever more relevant in the age of information overflow. Information retrieval can then act as the bridge between information and generative applications, which subsequently act as an interface to the end user within specialized use-cases. With a strong foundation in infrastructure and IR, LLMs and other generative models can deliver tremendous value to organizations while maintaining reliability and transparency. 

Once the applications are implemented, the process of continuous improvement and maintenance of the systems follows, all the way from data governance to evolving the IR algorithms and generative models.

As we continue to integrate AI into our daily operations, it is imperative for organizations to invest in robust data infrastructures and sophisticated IR systems. This foundation not only enhances the effectiveness of generative AI but also ensures the accuracy and trustworthiness of the information these systems provide.

At Algorithma, we are committed to helping businesses navigate this complex landscape. By partnering with us, you can build a resilient data infrastructure, implement cutting-edge IR systems, and develop AI applications that are both innovative and reliable.

Take the first step towards transforming your business with trustworthy generative AI. Contact Algorithma today to learn how we can support your journey to harnessing the full potential of AI while safeguarding the integrity of your data and operations.
