How Kubernetes can help AI/ML

Kubernetes can assist with AI/ML workloads by making code consistently reproducible, portable, and scalable across diverse environments.

When building machine learning-enabled applications, the workflow is not linear: the stages of research, development, and production are in perpetual motion as teams work to continuously integrate and continuously deliver (CI/CD). The process of building, testing, merging, and deploying new data, algorithms, and versions of an application creates many moving pieces, which can be difficult to manage. That’s where containers come in.

Containers are a Linux technology that lets you package and isolate an application along with all the libraries and dependencies it needs to run. Containers don’t require an entire operating system, only the exact components they need to operate, which makes them lightweight and portable. This eases deployment for operations teams and gives developers confidence that their applications will run exactly the same way on different platforms or operating systems.

Another benefit of containers is that they help reduce conflicts between your development and operations teams by separating areas of responsibility. And when developers can focus on their apps and operations teams can focus on the infrastructure, integrating new code into an application as it grows and evolves throughout its lifecycle becomes more seamless and efficient.

Kubernetes is an open source platform that automates Linux container operations by eliminating many of the manual processes involved in deploying and scaling containerized applications. Kubernetes is key to streamlining the machine learning lifecycle as it provides data scientists the agility, flexibility, portability, and scalability to train, test, and deploy ML models.

Scalability: Kubernetes allows users to scale ML workloads up or down, depending on demand. This ensures that machine learning pipelines can accommodate large-scale processing and training without interfering with other elements of the project (see the scaling sketch after this list).

Efficiency: Kubernetes optimizes resource allocation by scheduling workloads onto nodes based on their availability and capacity. When computing resources are used deliberately rather than left idle, users can expect lower costs and better performance.

Portability: Kubernetes provides a standardized, platform-agnostic environment that allows data scientists to develop one ML model and deploy it across multiple environments and cloud platforms. This means not having to worry about compatibility issues and vendor lock-in.

Fault tolerance: With built-in fault tolerance and self-healing capabilities, users can trust Kubernetes to keep ML pipelines running even in the event of a hardware or software failure.
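To make the scalability point concrete, the sketch below scales a hypothetical training-worker Deployment up and down with the official Kubernetes Python client. The deployment name, namespace, and replica count are illustrative assumptions, not part of any particular product setup.

```python
# Minimal sketch: scale a hypothetical "training-workers" Deployment
# with the official Kubernetes Python client (pip install kubernetes).
from kubernetes import client, config

def scale_training_workers(replicas: int,
                           name: str = "training-workers",    # hypothetical name
                           namespace: str = "ml-pipelines"):  # hypothetical namespace
    """Patch the Deployment's replica count so Kubernetes adds or
    removes worker pods to match current training demand."""
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        name=name,
        namespace=namespace,
        body={"spec": {"replicas": replicas}},
    )

if __name__ == "__main__":
    scale_training_workers(replicas=8)  # scale up for a large training run
```

In production you would more likely let a HorizontalPodAutoscaler adjust replicas automatically, but it manipulates the same scale subresource shown here.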

The machine learning lifecycle is made up of many different elements that, if managed separately, would be time-consuming and resource-intensive to operate and maintain. With a Kubernetes architecture, organizations can automate portions of the ML lifecycle, removing the need for manual intervention and creating more efficiency.

Toolkits such as Kubeflow help developers streamline the training and serving of ML workloads on Kubernetes. Kubeflow solves many of the challenges involved in orchestrating machine learning pipelines by providing a set of tools and APIs that simplify the process of training and deploying ML models at scale. Kubeflow also helps standardize and organize machine learning operations (MLOps).
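For a sense of what this looks like in practice, here is a minimal sketch of a two-step pipeline written with the Kubeflow Pipelines (KFP) v2 Python SDK. The component bodies and the data path are placeholders; a real pipeline would perform actual preprocessing and training.

```python
# Minimal sketch of a two-step Kubeflow pipeline (pip install kfp, v2 SDK).
from kfp import compiler, dsl

@dsl.component
def preprocess(raw_path: str) -> str:
    # Placeholder: clean and featurize the raw data, return the processed path.
    return raw_path + ".processed"

@dsl.component
def train(data_path: str) -> str:
    # Placeholder: train a model and return the model artifact path.
    return data_path + ".model"

@dsl.pipeline(name="ml-training-pipeline")
def training_pipeline(raw_path: str = "s3://example-bucket/data.csv"):
    # Kubeflow runs each component as a containerized step on Kubernetes.
    processed = preprocess(raw_path=raw_path)
    train(data_path=processed.output)

if __name__ == "__main__":
    # Compile to a spec that the Kubeflow Pipelines service can execute.
    compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
```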

As the industry’s leading hybrid cloud application platform powered by Kubernetes, Red Hat® OpenShift® brings together tested and trusted services while delivering a consistent experience across public cloud, on-premises, hybrid cloud, and edge architectures.

Red Hat OpenShift Data Science, part of the OpenShift AI portfolio, is a service for Red Hat OpenShift that provides data scientists and developers with a consistent, powerful artificial intelligence and machine learning (AI/ML) platform for building intelligent applications. In addition to core model building and experimentation, OpenShift Data Science provides MLOps capabilities, including model serving and monitoring, to bring models into production more quickly.

Solution Pattern

AI applications with Red Hat and NVIDIA AI Enterprise

Create a RAG application

Red Hat OpenShift AI is a platform for building data science projects and serving AI-enabled applications. You can integrate all the tools you need to support retrieval-augmented generation (RAG), a method for getting AI answers from your own reference documents. When you connect OpenShift AI with NVIDIA AI Enterprise, you can experiment with large language models (LLMs) to find the optimal model for your application.

Build a pipeline for documents

To make use of RAG, you first need to ingest your documents into a vector database. In our example app, we embed a set of product documents in a Redis database. Since these documents change frequently, we can create a pipeline for this process that we’ll run periodically, so we always have the latest versions of the documents.
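Here is a minimal sketch of what that ingestion step could look like in Python. It assumes a Redis instance with vector search enabled (Redis Stack), the redis, sentence-transformers, and numpy packages, and illustrative index, key, and document names.

```python
# Minimal sketch: embed documents and store them in Redis for vector search.
# Assumes Redis Stack and: pip install redis sentence-transformers numpy
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dim embeddings
r = redis.Redis(host="localhost", port=6379)

# Create the vector index once (hypothetical index name and key prefix).
r.ft("docs-idx").create_index(
    fields=[
        TextField("text"),
        VectorField("embedding", "FLAT",
                    {"TYPE": "FLOAT32", "DIM": 384,
                     "DISTANCE_METRIC": "COSINE"}),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Re-run this loop on each pipeline run to refresh the stored documents.
documents = ["Product guide text ...", "Release notes text ..."]  # placeholders
for i, text in enumerate(documents):
    vec = model.encode(text).astype(np.float32)
    r.hset(f"doc:{i}", mapping={"text": text, "embedding": vec.tobytes()})
```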

Browse the LLM catalog

NVIDIA AI Enterprise gives you access to a catalog of different LLMs, so you can try different choices and select the model that delivers the best results. The models are hosted in the NVIDIA API catalog. Once you’ve set up an API token, you can deploy a model using the NVIDIA NIM model serving platform directly from OpenShift AI.
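Because the catalog serves models through an OpenAI-compatible endpoint, a quick experiment can look like the sketch below. The model name is just one example from the catalog, and the NVIDIA_API_KEY environment variable is assumed to hold the token you set up.

```python
# Minimal sketch: query a model from the NVIDIA API catalog through its
# OpenAI-compatible endpoint (pip install openai).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # token from the NVIDIA API catalog
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # example; browse the catalog for others
    messages=[{"role": "user", "content": "Summarize our returns policy."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```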

Choose the right model

As you test different LLMs, your users can rate each generated response. You can set up a Grafana monitoring dashboard to compare the ratings, as well as latency and response time for each model. Then you can use that data to choose the best LLM to use in production.
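One way to feed such a dashboard is to export per-model metrics that Prometheus scrapes and Grafana charts. The sketch below uses the Prometheus Python client; the metric names, labels, and two-value rating scale are illustrative assumptions.

```python
# Minimal sketch: export per-model rating and latency metrics for a
# Grafana dashboard (pip install prometheus-client).
import time
from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical metric names; every sample is labeled with the LLM under test.
RATINGS = Counter("llm_response_ratings_total",
                  "User ratings of generated responses",
                  ["model", "rating"])  # e.g. rating in {"up", "down"}
LATENCY = Histogram("llm_response_latency_seconds",
                    "Time taken to generate a response", ["model"])

def record_interaction(model_name: str, generate_fn, prompt: str) -> str:
    """Run one generation call and record its latency for this model."""
    start = time.perf_counter()
    answer = generate_fn(prompt)
    LATENCY.labels(model=model_name).observe(time.perf_counter() - start)
    return answer

def record_rating(model_name: str, rating: str) -> None:
    RATINGS.labels(model=model_name, rating=rating).inc()

if __name__ == "__main__":
    start_http_server(8000)  # serves /metrics in a background thread
    while True:
        time.sleep(60)       # keep the process alive for scraping
```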

Architecture diagram: an application built with Red Hat OpenShift AI and NVIDIA AI Enterprise. Components include OpenShift GitOps for connecting to GitHub and handling DevOps interactions, Grafana for monitoring, OpenShift AI for data science, Redis as a vector database, and Quay as an image registry. These components flow to the app frontend and backend, and all run on Red Hat OpenShift AI with an integration with ai.nvidia.com.
