
What happens when you pull a container image?

Learn the differences between manifests and manifest lists and how container registries use them.

A Podman or Docker "pull" is a series of API requests against the registry you're pulling from. The API requests that the client (the podman or docker command) makes depend on the type of image manifest being pulled.

There are currently two types of image manifests: a manifest list (or OCI image index) and a manifest.

Manifests and manifest lists

A manifest describes a single image: it references the image config and the image layers, which the registry stores as "blobs" (which stands for "binary large objects"). Manifests come in different schema versions, so not all are structured in exactly the same way, but that's their general purpose.

A manifest list is exactly what the name advertises: a list of manifests.

Manifests and manifest lists are important because the client behaves differently depending on which of the two the image you're pulling uses.
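
As a minimal sketch of how a client might tell the two apart, the Python below inspects the mediaType field of the JSON document the registry returns. The function name manifest_kind is hypothetical; the media type strings are the ones defined by the Docker and OCI image specifications. Older schema versions are not handled here.

```python
# Media types for a single-image manifest (Docker schema 2 and OCI).
MANIFEST_TYPES = {
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
}

# Media types for a manifest list (Docker) or image index (OCI).
MANIFEST_LIST_TYPES = {
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.index.v1+json",
}


def manifest_kind(doc: dict) -> str:
    """Classify a parsed manifest document (hypothetical helper)."""
    media_type = doc.get("mediaType", "")
    if media_type in MANIFEST_LIST_TYPES:
        return "manifest list"
    if media_type in MANIFEST_TYPES:
        return "manifest"
    # Fall back on structure, since a document may omit mediaType.
    if "manifests" in doc:
        return "manifest list"
    if "layers" in doc:
        return "manifest"
    return "unknown"
```

A client would branch on the result: download blobs right away for a manifest, or pick a platform first for a manifest list.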

Pull a manifest

When you initiate a pull, the client first requests a manifest from the registry.

GET /v2/library/postgres/manifests/14
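
As a rough sketch, the Python below builds (but does not send) the manifest request. The host registry.example.com and the helper manifest_request are hypothetical, and the authentication step that many real registries require is left out. The Accept header lists the manifest media types the client understands, so the registry can answer with either a manifest or a manifest list.

```python
import urllib.request

# Manifest formats this hypothetical client is willing to receive.
ACCEPT = ", ".join([
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
])


def manifest_request(registry: str, repository: str, reference: str) -> urllib.request.Request:
    """Build a GET request for a manifest by tag or digest.

    Nothing is sent over the network; the caller decides when to
    pass the request to urllib.request.urlopen().
    """
    url = f"https://{registry}/v2/{repository}/manifests/{reference}"
    return urllib.request.Request(url, headers={"Accept": ACCEPT})


# registry.example.com is a placeholder, not a real registry.
req = manifest_request("registry.example.com", "library/postgres", "14")
print(req.get_method(), req.full_url)
```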

Suppose the manifest request returns a single manifest (rather than a manifest list). The client is interested in the image config and the image layers, so it needs to download those next. No particular download order is required, so most (if not all) clients download them in parallel. It's worth noting that a registry sees both the config and the layers of the image as just blobs (though in different formats).

GET /v2/library/postgres/blobs/sha256:asdf123+

For each blob, one request is made to the registry server.
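
The blob step can be sketched in Python as follows, assuming a schema 2 or OCI manifest with a config descriptor and a layers list. The helpers blob_paths and fetch_all are hypothetical, the digests in sample_manifest are placeholders rather than real values, and the fetch function is supplied by the caller, so no network access happens here.

```python
from concurrent.futures import ThreadPoolExecutor


def blob_paths(repository: str, manifest: dict) -> list[str]:
    """Return one blob path per descriptor: the config first, then each layer."""
    descriptors = [manifest["config"], *manifest["layers"]]
    return [f"/v2/{repository}/blobs/{d['digest']}" for d in descriptors]


def fetch_all(paths: list[str], fetch) -> list:
    """Call fetch(path) for every blob in parallel, one request per blob.

    Results come back in the same order as paths.
    """
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch, paths))


# Placeholder digests for illustration only.
sample_manifest = {
    "schemaVersion": 2,
    "config": {"digest": "sha256:config-placeholder"},
    "layers": [
        {"digest": "sha256:layer1-placeholder"},
        {"digest": "sha256:layer2-placeholder"},
    ],
}

paths = blob_paths("library/postgres", sample_manifest)
```

Passing the download function in as an argument keeps the sketch independent of any particular HTTP client.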

Pull a manifest list

When an image uses a manifest list instead of a simple manifest, the client needs to take an extra step before it can download any blobs. A manifest list has no blobs, after all (because it's a list of manifests).

For example, suppose that the following request returns a manifest list:

GET /v2/library/postgres/manifests/14

Instead of blobs, the client now has a list of manifests to choose from. A manifest list exists to allow a single image to support multiple platforms (where "platform" is an operating system on a specific architecture). Clients pulling images usually intend to run them, so when faced with a manifest list, the client needs to choose a manifest that it can run based on its operating system and architecture.

Suppose a client is running on the linux/amd64 platform. Each entry in a manifest list records at least an operating system, an architecture, and the digest of the manifest it points to.

By knowing which target operating system and architecture are being used (linux/amd64 in this example), the client knows which manifest to pull.

GET /v2/library/postgres/manifests/<sha256-digest-of-linux-amd64-manifest>
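
The selection step might look like the following Python sketch. The helper select_manifest is hypothetical and the digests in sample_list are placeholders; the manifests, platform, os, architecture, and digest field names follow the manifest list and OCI image index formats. Real clients may also consider further platform fields such as the variant, which this sketch ignores.

```python
def select_manifest(manifest_list: dict, os_name: str, architecture: str) -> str:
    """Return the digest of the first entry matching the given platform."""
    for entry in manifest_list["manifests"]:
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == architecture:
            return entry["digest"]
    raise LookupError(f"no manifest for {os_name}/{architecture}")


# Placeholder digests for illustration only.
sample_list = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "digest": "sha256:arm64-placeholder",
            "platform": {"os": "linux", "architecture": "arm64"},
        },
        {
            "digest": "sha256:amd64-placeholder",
            "platform": {"os": "linux", "architecture": "amd64"},
        },
    ],
}

digest = select_manifest(sample_list, "linux", "amd64")
# The follow-up request path for the platform-specific manifest.
next_path = f"/v2/library/postgres/manifests/{digest}"
```

From here the pull continues exactly as for a single manifest: fetch the selected manifest, then its blobs.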

Registries and manifests

A container registry is often a "black box" with its complexity abstracted away so that users don't have to think about it. I hope that knowing that registries are just another HTTP API serving content helps demystify this part of container image distribution, making a small part of the ecosystem feel a little more familiar and friendly.

Flavian Missi

Passionate about open source since the beginning of their career, Flavian has been lucky enough to have worked on multiple open source projects throughout the years. They currently work on Project Quay at Red Hat.
