An offline version of the Red Hat Enterprise Linux (RHEL) command-line assistant powered by RHEL Lightspeed is now available as a developer preview to existing Red Hat Satellite subscribers. It delivers the power of the RHEL command-line assistant in a self-contained format that runs locally on a workstation or an individual RHEL system, so the assistant can function in a completely disconnected, offline, or air-gapped environment: no external network connectivity is required to operate it. The command-line assistant provides AI-powered guidance and suggestions to help you with a wide range of RHEL tasks. We’ve extended its knowledge of RHEL installation and upgrade topics, enabling you to get assistance in these areas even with limited or no connectivity.
Architecture overview
The offline version of command-line assistant is delivered as a set of container images that can be run with Podman. The containers used are:
- installer container: Pulls the other required containers, installs the rhel-cla command, and optionally creates a systemd service
- rlsapi container: Provides the endpoint that the command-line assistant client communicates with
- rag-database container: The retrieval-augmented generation (RAG) database used to supplement the LLM’s knowledge with additional data such as the RHEL documentation
- ramalama container: Provides LLM inference
Prerequisites and requirements
Your Red Hat Satellite subscription provides you with tools to locally manage your RHEL environments. These tools include the Satellite product itself, as well as the recently introduced Red Hat Offline Knowledge Portal, which provides an offline version of Red Hat's exclusive knowledge content. We've now expanded the Satellite subscription to also provide the offline version of the RHEL command-line assistant.
The offline version of the command-line assistant is delivered as a set of containers, and can run on a RHEL system, or a Mac or Windows workstation. If you're using a Mac system, Podman Desktop is required, and it's recommended to use a system equipped with an NVIDIA, AMD, or Mac M-series GPU.
Hardware requirements
CPU-only systems (RHEL 9.6+ and 10+ / Fedora 42 / Windows 11):
- RAM: 8 GB
- CPU cores: 2
GPU-capable systems (RHEL 9.6+ and 10+ / Fedora 42 / Windows 11):
- RAM: 4 GB
- GPU: At least 4 GB of VRAM
Apple systems (macOS 15.x):
- M2 chip or newer
All systems:
- Available disk space: 10 GB
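On a Linux system, you can quickly compare your hardware against the CPU-only minimums above (8 GB of RAM and 2 CPU cores). This is a generic check, not part of the assistant's tooling:

```shell
# Report total RAM and CPU core count for comparison against the
# CPU-only minimums listed above (8 GB RAM, 2 CPU cores).
ram_gb=$(( $(awk '/^MemTotal/ {print $2}' /proc/meminfo) / 1024 / 1024 ))
cores=$(nproc)
echo "RAM: ${ram_gb} GB, CPU cores: ${cores}"
```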
Getting started
Container images for the offline version of the RHEL command-line assistant are available on the Red Hat Container Registry to all customers with a Satellite subscription. Start by authenticating to the Red Hat Container Registry:
$ podman login registry.redhat.io
You perform the installation using the installer container. This container pulls the other needed containers, installs the rhel-cla command, and optionally creates a systemd service. The systemd service can be used in environments where you would like to have the offline RHEL command-line assistant containers automatically start each time the system is booted.
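To illustrate what the optional systemd service provides, a user-level unit along these lines could start and stop the assistant's containers with the system. This is a sketch only: the unit name, location, and contents the installer actually generates may differ.

```ini
# Sketch only: illustrates the idea behind the optional systemd service.
# The installer's real unit name, path, and contents may differ.
[Unit]
Description=Offline RHEL command-line assistant (rhel-cla)
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=%h/.local/bin/rhel-cla start
ExecStop=%h/.local/bin/rhel-cla stop

[Install]
WantedBy=default.target
```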
Before running the installer, verify that the ~/.config and ~/.local/bin directories exist in your home directory. To start the installer container, use this command:
$ podman run -u : --rm -v $HOME/.config:/config:Z \
-v $HOME/.local/bin:/config/.local/bin:Z \
registry.redhat.io/rhel-cla/installer-rhel10:latest \
install-systemd
If you don't want to install the systemd service, then don't include the install-systemd argument at the end of the command.
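If either of the directories mentioned above is missing, it is safe to create both up front; this is standard shell, nothing assistant-specific:

```shell
# Create the directories the installer bind-mounts, if they don't exist yet.
# mkdir -p is a no-op when the directories are already present.
mkdir -p "$HOME/.config" "$HOME/.local/bin"
```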
After installation has completed, you can use the rhel-cla command to start, stop, uninstall, or check the status of the offline RHEL command-line assistant. On a Mac, you must make the rhel-cla command executable:
$ chmod +x ~/.local/bin/rhel-cla
To start the offline version of the RHEL command-line assistant:
$ rhel-cla start
This generates output, but all you need to look for is a message indicating that everything was successfully started:
✅ RHEL CLA pod is running!
Services available at:
- API: http://localhost:8000
- Model Server: http://localhost:8888
- Database: localhost:5432
Configure the GPU
The next step is to configure the GPU within the ~/.config/rhel-cla/.env file. Open this file in a text editor, and complete the following configuration:
- Set the LLAMACPP_IMAGE variable to the appropriate RamaLama container image for your GPU hardware. Comments within the file provide more details.
- Set the HOST_DEVICE variable to the device for your GPU.
- Set the NVIDIA-specific variables if needed. The comments within the file provide more details on this.
The offline version of the RHEL command-line assistant includes the Microsoft Phi4-mini LLM model. You can use a different model by updating the LLM variable.
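Putting those settings together, the relevant part of ~/.config/rhel-cla/.env might look something like the fragment below. The values are illustrative placeholders, not defaults; pick the actual image, device path, and model from the options described in the file's own comments.

```shell
# Illustrative ~/.config/rhel-cla/.env fragment; values are examples only.
LLAMACPP_IMAGE=<RamaLama image matching your GPU, per the file's comments>
HOST_DEVICE=/dev/dri     # example GPU device path; use the device for your GPU
LLM=phi4-mini            # example; the shipped default model, change to use another
# NVIDIA-specific variables, if your hardware needs them, are described
# in the file's comments and are omitted here.
```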
After changing any configuration settings, restart the offline RHEL command-line assistant:
$ rhel-cla stop
$ rhel-cla start
Configure the command-line assistant client
Install the command-line-assistant package on a RHEL system:
$ sudo dnf install command-line-assistant
By default, the command-line assistant client is configured to use the online RHEL Lightspeed service. Update the endpoint configuration option in the /etc/xdg/command-line-assistant/config.toml file to point to the system hosting the offline RHEL command-line assistant containers. In my environment, I’m running the command-line assistant client and the offline containers on the same system, so I would update the endpoint to the value provided after I started the command-line assistant service:
endpoint = "http://127.0.0.1:8000/v1"
It might be necessary to open port 8000 in your firewall if the offline version of the RHEL command-line assistant containers are on a different system than the command-line assistant client.
After making a change to the command-line assistant client configuration, you must restart the command-line assistant daemon (clad) if it is currently running:
$ sudo systemctl restart clad
Usage
At this point, everything is configured, so you can use the command-line assistant to get help with RHEL topics. You can ask it questions related to installing RHEL, troubleshooting issues, and much more. For example:
$ c "what is an immutable file?"
+ Asking RHEL Lightspeed
This feature uses AI technology. Do not include any personal information or other sensitive information in your input. Interactions may be used to improve Red Hat's products or services.
Hello! I'm glad you're interested in learning about immutable files. An immutable file is a file that cannot be modified, deleted, or renamed after it has been created. This concept is often used in systems to ensure data integrity and security.
In Red Hat Enterprise Linux, you can create an immutable file using the chattr command. Here's an example:
sudo chattr +i filename
This command will make the file named filename immutable. You can verify that the file is immutable by using the lsattr command:
lsattr filename
The output will show an i attribute next to the file name, indicating that it is immutable.
Remember, if you try to modify, delete, or rename an immutable file, you will receive an error message. You can remove the immutability attribute using the -i option with the chattr command:
sudo chattr -i filename
I hope this helps! Let me know if you have any other questions.
Always review AI-generated content prior to use.
First query response delay
The first time a chat session starts, the model gets loaded into the system's memory. This may result in a slow response just for the first question.
CPU-only systems
The response time of the command-line assistant depends on the hardware being used. While it is possible to use the offline RHEL command-line assistant on a system without a GPU by using CPU inference, this results in slow response times.
The command-line assistant client package available in RHEL 9.6 and RHEL 10 times out after 30 seconds. This 30-second timeout is not currently configurable; however, the command-line assistant package is planned to be updated in RHEL 9.7 and RHEL 10.1 with a configurable timeout.
Additionally, you might experience a slow response time on the first query (as explained above), which can trigger the 30-second timeout.
Intended for individual system use cases
The offline version of the RHEL command-line assistant is intended for individual system use cases. For example, it can be hosted on a RHEL system and used by the command-line assistant client on that same system, or it can be hosted on a server that a single RHEL workstation's command-line assistant client is configured to point to. It is not intended to scale out with multiple RHEL command-line assistant clients connecting to one system hosting the offline command-line assistant.
Conclusion
The introduction of the new offline version of the RHEL command-line assistant marks a significant leap forward in empowering RHEL users in offline or disconnected environments. This powerful, self-contained AI-powered assistant provides an invaluable resource for RHEL users who often operate in air-gapped networks, remote locations, or environments with limited internet connectivity.
Historically, accessing resources such as documentation and troubleshooting information in offline settings has been a major challenge, and could result in increased downtime and frustration. The offline version of the RHEL command-line assistant helps address this pain point by bringing a wealth of knowledge and AI-driven problem-solving capabilities powered by RHEL Lightspeed directly to your fingertips, regardless of network availability.
For more details on the offline version of the RHEL command-line assistant, refer to the documentation.
Resource
Get started with AI Inference: Red Hat AI experts explain
About the authors
Brian Smith is a product manager at Red Hat focused on RHEL automation and management. He has been at Red Hat since 2018, previously working with public sector customers as a technical account manager (TAM).