Since the birth of large language models (LLMs) and the release of ChatGPT, artificial intelligence (AI) has gone from an out-of-reach concept to showing real promise across every industry. From personalized customer experiences to streamlined operations and increased security, the possibilities are endless.
The reality we encounter in many companies, however, is a lack of the robust, flexible infrastructure needed to harness the full potential of AI. Leaders see that potential and want to adopt AI, but their teams do not have access to the basic tooling that would allow them to explore it.
This is where Red Hat OpenShift AI and Redis come into play, offering a powerful environment for data scientists and machine learning (ML) engineers.
OpenShift AI provides organizations with an efficient way to deploy and manage a comprehensive set of AI/ML tools. Its ability to create custom environments ensures that data scientists and machine learning engineers always have the right resources at their fingertips.
Redis is the world's fastest in-memory database. It's a versatile solution that has evolved beyond a simple key-value data store to support a wide range of use cases, including:
- Vector database: Store and query vector data for similarity searches
- Retrieval augmented generation (RAG): Enhance LLM accuracy and relevance by grounding responses in real-time data
- LLM memory: Manage the context window of LLMs for more coherent and context-aware conversations
- Semantic cache: Optimize performance and reduce LLM costs by caching semantically similar prompts and responses
Better together: OpenShift AI and Redis
AI applications, especially those involving generative AI (gen AI), demand high performance and low latency. Users expect real-time responses and personalized experiences. The combination of OpenShift AI and Redis addresses these challenges head-on.

OpenShift AI provides the environment where data scientists can use different tools, including embedding models, third-party frameworks like LangChain or LlamaIndex, and multiple LLMs to implement their gen AI use cases at scale. Redis delivers the sub-second latency that gen AI use cases need.
Let's dive into specific use cases:
1. Retrieval augmented generation (RAG)
RAG enhances the knowledge of LLMs by integrating external data sources. Instead of solely relying on their pretrained knowledge, LLMs can fetch relevant information from a database in real time to generate more accurate and contextually appropriate responses. Fine-tuning an LLM with business-specific data is traditionally a costly and time-consuming process, and it may not be viable, depending on how often the knowledge base changes (with new or updated records). Keeping the knowledge base external to the model provides more flexibility and makes it easier to ensure that the LLM always has the latest information to serve the users.

The business benefit: Improved accuracy, reduced hallucinations and access to up-to-date information for chatbots, content generation tools and more.
In this use case, Redis plays the role of the vector database, while OpenShift AI provides the compute resources and notebook environment, along with pipelines and model serving tools, for data scientists to prepare the vector data and test the quality and performance of semantic searches. OpenShift AI also provides advanced tooling, such as guardrails, to improve LLM accuracy and to monitor and safeguard both user inputs and model outputs.
By using Redis as the vector database, administrators can configure role-based access control (RBAC) and access control lists (ACLs) to separate vector data between different users, departments and so on. This means vectors containing sensitive information can be shared only with the users authorized to see them. Redis enforces this control at the database level, not merely as a query parameter.
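To make the flow concrete, here is a minimal retrieval sketch that uses Redis as the vector store through LangChain. Treat it as a sketch under assumptions: it expects the langchain-community and sentence-transformers packages, the exact import paths can differ between LangChain versions, and the Redis URL, index name and sample documents are placeholders.

```python
# Minimal RAG retrieval sketch: Redis as the vector store behind LangChain.
# Assumptions: langchain-community and sentence-transformers are installed;
# the Redis URL, index name and sample texts below are illustrative placeholders.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Redis

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Index a handful of business documents as vectors in Redis.
vectorstore = Redis.from_texts(
    texts=[
        "Our premium support plan includes a 4-hour response SLA.",
        "Refunds are processed within 5 business days.",
    ],
    embedding=embeddings,
    redis_url="redis://redis-db.example.svc:6379",
    index_name="kb-docs",
)

# At query time, retrieve the most relevant passages to ground the LLM prompt.
docs = vectorstore.similarity_search("How fast will support respond?", k=2)
for doc in docs:
    print(doc.page_content)
```

In a full RAG pipeline, the retrieved passages would then be inserted into the prompt template before it is sent to the LLM served from OpenShift AI.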
2. Semantic cache
LLMs can be expensive to run, especially for repetitive queries. A semantic cache stores LLM responses based on the meaning of the query, not just the exact text. When the user submits a new prompt, the system will look for similar prompts, and if it finds a match, it will retrieve the LLM response directly from the cache, saving a trip to the LLM server and the associated token cost (when using a hosted LLM service) or compute capacity (when self-hosting a model).

The business benefit: Semantic caching can greatly reduce LLM costs, especially for use cases where users are expected to ask basic or generic questions (FAQs, etc). Other benefits include faster response times and improved scalability for AI-powered applications.
In this use case, Redis is used as both the vector database and the semantic cache. Some frameworks, like LangChain, integrate directly with the semantic cache, automatically saving prompts and responses and checking the cache for each new prompt. This allows developers to quickly take advantage of the capability without writing a lot of code or managing reads and writes to the cache themselves. Data scientists can define the distance threshold for the cache based on the requirements and characteristics of the use case.
OpenShift AI provides the environment and tooling, such as Jupyter and data science pipelines, to create and run the code that generates and loads the vector data. This helps data scientists not only generate the vector data, but also preload the semantic cache with thousands of questions and answers that can be generated with the help of an LLM, as sketched below. That way, when the first user asks a question, there is a good probability that a similar question is already in the cache, reducing the response time to milliseconds, potentially 15x faster than the traditional RAG pipeline.
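The sketch below shows what pre-warming and checking a semantic cache might look like with the redisvl library's SemanticCache extension. It is only an illustration: constructor arguments can vary between redisvl versions, and the cache name, threshold, Redis URL and example prompts are assumptions.

```python
# Semantic cache sketch using redisvl's SemanticCache extension.
# Assumptions: the redisvl package is installed; the cache name, distance
# threshold, Redis URL and example prompts are illustrative placeholders.
from redisvl.extensions.llmcache import SemanticCache

cache = SemanticCache(
    name="faq-cache",
    redis_url="redis://redis-db.example.svc:6379",
    distance_threshold=0.1,  # tune per use case: lower means a stricter semantic match
)

# Pre-warm the cache with generated Q&A pairs so early users already get hits.
cache.store(
    prompt="What are your support hours?",
    response="Support is available 24/7 for premium customers.",
)

# On each new prompt, check for a semantically similar entry before calling the LLM.
hits = cache.check(prompt="When can I reach support?")
if hits:
    answer = hits[0]["response"]  # cache hit: answered in milliseconds
else:
    answer = None  # cache miss: fall back to the RAG/LLM pipeline and store the result
print(answer)
```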
If we can ensure a user experience where most of the prompts take only milliseconds to respond, how else can we enrich this user experience? What other data or services can we add to it, if we know that users won’t be waiting for seconds every time?
3. LLM memory
LLMs are, by definition, stateless, meaning they keep no record of any previous interaction with the user. As far as the model is aware, every prompt is an entirely new prompt, with no past or history to consider.
To get around this limitation, client applications (like chatbots) keep the conversation history between the user and the model (plus some additional information) and send this data to the model every time the user submits a new prompt. The model's capacity to take in this data is called the "context window," and it makes it possible for the model to keep its answers within the context of the ongoing conversation.

The business benefit: More engaging and context-aware chatbots, personalized customer experiences and improved ability to handle complex conversations.
Redis not only stores the context window data, but also provides ready-made integrations with frameworks like LangChain, which allow developers to enable LLM memory with just a couple of lines of code (see the sketch after the list below). Additionally, using Redis to store LLM memory has other impactful benefits:
- It enables multichannel user experiences. Users can close the browser-based chatbot, open the voice assistant in their mobile application and continue the conversation exactly where they left off, because both clients pull the conversation history from Redis.
- If a call center or another internal team needs to review the conversation history between user and bot, it can easily be retrieved from Redis. This allows internal staff to understand exactly what happened in that conversation and whether or not the information provided by the model was correct.
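As a rough illustration of how small that integration can be, the following sketch uses LangChain's RedisChatMessageHistory to persist a conversation in Redis. The import path can differ between LangChain versions, and the session ID and Redis URL are placeholders.

```python
# LLM memory sketch: persist chat history in Redis so any client or channel can resume it.
# Assumptions: langchain-community is installed; the session id and Redis URL are placeholders.
from langchain_community.chat_message_histories import RedisChatMessageHistory

history = RedisChatMessageHistory(
    session_id="user-42",  # one history per user or conversation
    url="redis://redis-db.example.svc:6379",
)

history.add_user_message("What's the status of my order 1234?")
history.add_ai_message("Order 1234 shipped yesterday and should arrive on Friday.")

# Any other client (web chatbot, voice assistant, support console) can pull the same
# conversation by session id and feed it back into the model's context window.
for message in history.messages:
    print(f"{message.type}: {message.content}")
```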
Additionally, OpenShift AI provides data scientists with an environment that can be customized to grant access to the conversation history stored in Redis for fine-tuning the conversation model. Having access to a dataset that includes real questions and answers can be critical to ensuring the continuous improvement of the LLM responses. With OpenShift AI, data scientists have the resources and tooling to analyze and prepare a dataset for fine-tuning the embedding model or the LLM (or to prepare new prompts to pre-warm the semantic cache).
Now that we’ve covered the main use cases, let’s see how we can put these ideas into action.
Getting started
Deploying Redis to Red Hat OpenShift can be greatly simplified using the OperatorHub. In the OpenShift web console, go to the OperatorHub page (in the Operators section of the left-side navigation panel).
From there, you can browse to the Database tab and look for Redis, or simply type Redis in the search bar.

Then you can open the Details page and click on the Install button to deploy the Redis operator.
Once the operator is deployed, Redis resources can be created quickly and easily through the OpenShift UI:

There are two main resources for Redis: the cluster and the database (along with their active-active counterparts, which are outside the scope of this article).
The Redis cluster manages multiple databases, ensuring high availability and scalability. This is the first resource that needs to be created. Once the cluster is created, a new database can be created to serve the use cases discussed above. Make sure to enable the Search and JSON support capabilities, as they are necessary for vector search.
Once the database is created, users can access the Redis console to retrieve the connection information, check metrics and track the overall health of the database.

Next, the OpenShift AI environment can be configured. Here, users can create a notebook environment, provision inference services for local LLMs, create and configure data science pipelines and much more.

Jupyter notebooks provide a very simple and convenient way to experiment with vector searches. Users can connect to the Redis database with only a few lines of code, and from there, they can take advantage of popular frameworks like LangChain and LlamaIndex, or they can use the redis-vl package, which allows them to use vector capabilities without requiring a specific framework.
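As an example of the framework-free approach, the sketch below uses redisvl to define a small index, load a couple of documents and run a nearest-neighbor query. It is a sketch under assumptions: the schema, index name, Redis URL and toy vectors are placeholders, and in practice the embeddings would come from an embedding model run or served from OpenShift AI.

```python
# Vector search sketch with redisvl, no LLM framework required.
# Assumptions: the redisvl and numpy packages are installed; the schema, index name,
# Redis URL and 4-dimensional toy vectors below are illustrative placeholders.
import numpy as np
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery

schema = {
    "index": {"name": "demo-idx", "prefix": "doc"},
    "fields": [
        {"name": "content", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": 4,
                "algorithm": "flat",
                "distance_metric": "cosine",
                "datatype": "float32",
            },
        },
    ],
}

# Recent redisvl versions accept the connection URL directly when building the index.
index = SearchIndex.from_dict(schema, redis_url="redis://redis-db.example.svc:6379")
index.create(overwrite=True)

# Load a couple of documents; real embeddings would come from an embedding model.
index.load([
    {"content": "Redis as a vector database",
     "embedding": np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32).tobytes()},
    {"content": "OpenShift AI workbenches",
     "embedding": np.array([0.9, 0.1, 0.0, 0.2], dtype=np.float32).tobytes()},
])

# k-nearest-neighbor query against the stored vectors.
query = VectorQuery(
    vector=[0.1, 0.2, 0.3, 0.4],
    vector_field_name="embedding",
    return_fields=["content"],
    num_results=1,
)
for result in index.query(query):
    print(result["content"])
```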

Conclusion
The increasing popularity of AI tools has unquestionably enabled a broad set of opportunities for companies that are looking to modernize, innovate and stay ahead of the competition.
Providing an environment with the resources and capabilities needed to take advantage of these AI tools is one of the greatest challenges companies face, as they try to understand the value AI could bring to their business.
OpenShift AI and Redis offer a powerful combination for organizations looking to use AI effectively. By providing a flexible gen AI development environment and a high-performance data platform, these technologies empower data scientists and machine learning engineers to build innovative AI solutions that create real business value. Whether it's RAG, semantic caching or LLM memory, OpenShift AI and Redis provide the foundation for building intelligent applications that are fast, scalable and context-aware.