Why can't I use sudo with rootless Podman?

While you can't—and shouldn't—use sudo with rootless Podman, there are workarounds.

Posted: October 13, 2021 by Matthew Heon (Red Hat)

I was recently asked: Why can't I run rootless Podman containers when I log into a user via sudo or su? The problem is a bit complex to explain, so I'll start with an example.

Say I have a rootless Podman container running a web app for managing my photo collection. It is running as a dedicated user I've created for the app (photos, UID 1010). A new version of the app has been released, so I want to rebuild the image and recreate the container using it. I log into the user with a quick su photos from a root shell, then change to the directory with my Containerfile and do a podman build. However, instead of my build working, I get an error message:

Error: error creating tmpdir: mkdir /run/user/1010: permission denied

Confused, I run the same commands on another system and find they work without issue. What's going on? The answer lies in how Podman handles its temporary files directory.

Refreshing the state

One of the core reasons Podman requires a temporary files directory is for detecting if the system has rebooted. After a reboot, all containers are no longer running, all container filesystems are unmounted, and all network interfaces need to be recreated (among many other things). Podman needs to update its database to reflect this and perform some per-boot setup to ensure it is ready to launch containers. This is called "refreshing the state."

This is necessary because Podman is not a daemon. Each Podman command is run as a new process and doesn't initially know what state containers are in. It consults the database for an accurate picture of all current containers and their states. Refreshing the state after a reboot is essential to making sure this picture continues to be accurate.

To perform the refresh, you need a reliable way of detecting a system reboot, and early in development, the Podman team settled on using a sentinel file on a tmpfs filesystem. A tmpfs is an in-memory filesystem that is not saved after a reboot—every time the system starts, a tmpfs mount will be empty. By checking for the existence of a file on such a filesystem and creating it if it does not exist, Podman can know if it's the first time it has run since the system rebooted.
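
The sentinel-file check described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not Podman's actual implementation: the function name needs_refresh and the sentinel file name alive are hypothetical, and the directory passed in is assumed to live on a tmpfs.

```shell
# Sketch only: detect "first run since reboot" with a sentinel file.
# Assumes $1 is a directory on a tmpfs, so it is empty after every boot.
# needs_refresh and the file name "alive" are hypothetical names.
needs_refresh() {
    if [ -e "$1/alive" ]; then
        return 1    # sentinel survived: no reboot since the last run
    fi
    mkdir -p "$1" || return 2       # could not create the directory
    touch "$1/alive" || return 2    # could not create the sentinel
    return 0        # sentinel was missing: state must be refreshed
}
```

Called on a fresh directory, needs_refresh succeeds once and then returns 1 on every later call until the directory is emptied again, which on a tmpfs happens at reboot.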

The problem then becomes deciding where to put the temporary files directory. The obvious answer is /tmp, but this is not guaranteed to be a tmpfs filesystem (though it often is). Instead, by default, Podman will use /run, which is guaranteed to be a tmpfs. Unfortunately, /run is only writable by root, so rootless Podman must look elsewhere. Our team settled on the /run/user/$UID directories, which are per-user temporary files directories. These are not guaranteed to exist on all systems, though; they require a pluggable authentication module (PAM) configuration (e.g., logind) that supports them. Rootless Podman will fall back to using /tmp on systems that do not support them.

We still have the issue that this is not guaranteed to be a tmpfs, but there are no better options.
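
The selection logic just described (prefer the per-user runtime directory, otherwise fall back to /tmp) might look roughly like the sketch below. The function name pick_tmpdir and the fallback path /tmp/example-run-UID are hypothetical and chosen for illustration; they are not the paths Podman actually uses.

```shell
# Sketch only: choose a temporary files directory for a rootless tool.
# $1: numeric UID; $2: base of the per-user runtime dirs (default /run/user).
# pick_tmpdir and the /tmp/example-run-* fallback name are hypothetical.
pick_tmpdir() {
    pt_uid=$1
    pt_base=${2:-/run/user}
    if [ -d "$pt_base/$pt_uid" ] && [ -w "$pt_base/$pt_uid" ]; then
        printf '%s\n' "$pt_base/$pt_uid"           # session directory exists
    else
        printf '%s\n' "/tmp/example-run-$pt_uid"   # no session: fall back to /tmp
    fi
}
```

For example, pick_tmpdir "$(id -u)" prints the /run/user directory inside a normal login session and the /tmp fallback in a shell where that directory is missing or not writable.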

Problems begin to emerge when we look at the semantics of the /run/user/$UID directories. These directories are automatically created when a user session is created and automatically destroyed when a user session is destroyed (roughly corresponding to a user logging in and logging out of the system). This presents an issue with persistent containers: the directory can be removed and recreated while containers are running, causing Podman to refresh the state and mark the running containers as no longer running. On logind-managed systems, there is an option to create a persistent user session by enabling lingering for the user (loginctl enable-linger); this is required for any user that runs long-running containers.
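
To see whether lingering is already enabled for a user, something like the following sketch may help. It assumes a logind-managed system with loginctl installed; the helper name linger_status is hypothetical, and it prints unknown whenever logind cannot answer.

```shell
# Sketch only: report whether lingering is enabled for a user.
# Assumes a logind-managed system; prints "yes", "no", or "unknown".
# linger_status is a hypothetical helper name.
linger_status() {
    ls_user=${1:-$(id -un)}
    ls_out=$(loginctl show-user "$ls_user" --property=Linger 2>/dev/null) || ls_out=
    case $ls_out in
        Linger=yes) echo yes ;;
        Linger=no)  echo no ;;
        *)          echo unknown ;;   # no loginctl, no logind, or user not known to logind
    esac
}
```

For the example user in this article, an administrator would enable lingering with loginctl enable-linger photos, and loginctl disable-linger photos reverses it.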

There's no login session

All of this still doesn't explain why you cannot use sudo and su with rootless containers. The answer is that sudo and su do not create a login session. There are many historical reasons for this, most stemming from the fact that sudo and su are somewhat irregular (one user becoming another user, instead of a fresh login). See this GitHub issue for details. Given this, rootless Podman cannot be used with sudo and su unless loginctl enable-linger is used to force a persistent user session to be created for the user.

Root containers have no issues with sudo and su because they do not use /run/user/$UID and instead are located in /run, which is permanently mounted. For rootless containers, an alternative to enabling lingering is to access the user via a method that does create a user session; ssh is guaranteed to do so, for example. Systemd also provides several commands (for example, machinectl login) that open user sessions, which can be used as an alternative to sudo or su.
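
A quick way to tell whether the current shell has the pieces of a login session is to look at XDG_RUNTIME_DIR, which pam_systemd sets to /run/user/$UID at login. The check below is a heuristic sketch, not something Podman itself runs, and in_login_session is a hypothetical helper name.

```shell
# Sketch only: heuristic check that the current shell has a usable
# per-user runtime directory, as a real login session would provide.
# in_login_session is a hypothetical helper name.
in_login_session() {
    [ -n "${XDG_RUNTIME_DIR:-}" ] &&
        [ -d "$XDG_RUNTIME_DIR" ] &&
        [ -w "$XDG_RUNTIME_DIR" ]
}
```

In a shell reached through su or sudo, the variable is typically unset or points at another user's directory, so the check fails. Reconnecting with ssh photos@localhost or machinectl login should give a shell where it passes.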

You might wonder why we do not simply use a temporary files directory other than /run/user/$UID. Indeed, as mentioned previously, Podman will fall back to a directory in /tmp when it detects /run/user/$UID does not exist. We cannot do this on existing installations because Podman enforces a rule that the temporary directory cannot be changed after Podman is first run. We use the directory both for storing the sentinel file to detect restarts and for storing container content that is regenerated when a container is restarted (e.g., /etc/resolv.conf). Changing the directory where Podman looks for this content could introduce serious bugs.

But what if you force Podman to use a directory in /tmp the first time it is run? Many people do this unintentionally by running Podman for the first time using sudo or su. The /run/user/$UID directory will not exist because the user is not logged in, so Podman will fall back to /tmp and use it for all subsequent invocations. This can also be manually forced via the --runroot flag to Podman, which specifies the path to the temporary files directory.

Using a directory in /tmp will allow rootless Podman to be used with sudo and su, but only on cgroups v1 systems. Furthermore, it will be necessary to use the --login argument to sudo and su to ensure environment variables are set correctly. On systems using cgroups v2, Podman also requires other aspects of the login session (specifically, access to the user's dbus) to set up cgroups and resource limits for rootless containers. The default directory also has a security benefit: the contents of /run/user/$UID are removed when the user logs out. Podman stores the user's login credentials for registries under the temporary directory, so with /run/user/$UID those credentials are not retained after the user logs out.
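
Whether a system falls into the cgroups v1 or v2 case can be checked from the shell. The sketch below relies on the fact that a unified (v2) hierarchy exposes a cgroup.controllers file at its root. The helper name cgroup_mode is hypothetical, and a hybrid setup is reported as v1 here.

```shell
# Sketch only: report which cgroup hierarchy is mounted.
# A unified (v2) hierarchy exposes cgroup.controllers at its root.
# cgroup_mode is a hypothetical helper; $1 overrides the mount point.
cgroup_mode() {
    if [ -f "${1:-/sys/fs/cgroup}/cgroup.controllers" ]; then
        echo v2
    else
        echo v1    # legacy or hybrid layout
    fi
}
```

On a system where cgroup_mode prints v2, a shell with no DBUS_SESSION_BUS_ADDRESS set is a hint that the session pieces rootless Podman needs are probably missing.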

Wrapping up

To circle back to the original question: Why does rootless Podman work with sudo and su on some systems and not others? The answer depends on how Podman was first run and what temporary directory was selected. The Podman default, /run/user/$UID, does not work with sudo and su; an alternative directory under /tmp may.

Note that even if you can use rootless Podman with sudo or su on your system, it is not recommended. As cgroups v2 moves into the mainstream, Podman will require a login session to be present to run rootless containers, which sudo and su cannot provide. Work around this by using a method that does create a login session (ssh or machinectl login should work) or by enabling a persistent user session for the user in question (loginctl enable-linger).