The Machine Config Operator (MCO) in OpenShift exposes a "Pause" feature on its MachineConfigPools that allows users to halt config deployment. We made some changes in 4.11 to try to make that feature "safer", and this blog tries to give some context around what pausing a MachineConfigPool actually does under the hood, the consequences/tradeoffs, and a small glimpse into how the MCO team thinks about exposing and evolving features like this.
The MCO Added A New Critical Alert
Along with OpenShift 4.11 the Machine Config Operator (MCO) added a new feature -- an alert to notify users when a certificate rotation is being prevented by a paused MachineConfig pool.
For users who aren't using the Machine Config Pool's Pause
feature, you will never experience this alert.
For users who are using the Pause
feature, especially those who leave their machine config pools paused long-term due to maintenance windows, etc, this will hopefully make the risks and tradeoffs more apparent, and protect you from weird cluster failure modes.
But Why Did We Do That?
The short answer is: Because we are trying to help you.
The long answer is something like: The consequences of long-term pausing a MachineConfigPool
are not immediately clear and predictable without a detailed understanding of internal OpenShift workings, and some of those consequences are very unfortunate.
We also don't really know what you're doing all the time or why you're doing it, so sometimes it's hard for us to know how to help without getting in the way.
The Relevant Parts of the MCO
If you aren't familiar with entirely how the MCO works, you can check out the documentation. several
This is an oversimplification, but for our purposes here, in "standard operation" the MCO's machine-config-operator
:
- Watches and reacts when certain cluster objects change
- Regenerates a CRD object called
ControllerConfig
when those objects change
And then the MCO's machine-config-controller
:
- Watches for
ControllerConfig
changes - Uses
ControllerConfig
to generate/regenerate templateMachineConfig
and other special controller-generated config - Watches for
MachineConfig
changes - Merges alllllll that
MachineConfig
(template + user + generated) together deterministically intorendered-configs
grouped byMachineConfigPool
- Coordinates assignment of that rendered-config to nodes via a
desiredConfig
annotation (and the resulting reboots and drains) across a pool
And then the machine-config-daemon
on the Node
:
- Applies the specified
desiredConfig
to itsNode
So objects change, the MCO reacts to it, sqishes it all of the config into a rendered-config
, and then assigns that rendered-config
to Nodes
Pool.Spec.MaxUnavailable
at a time, where the Node's
machine-config-daemon
reacts to it and applies the desiredConfig
config.
(There are a bunch of bootstrapping/special cases, but pause does not currently affect those, so I'm going to hand-wave those away for our purposes here)
...Pause?
Some of you have never used Pause
and may have no idea what I'm talking about, so before I go too far:
- The
MachineConfigPool
has a field in its CRD named.Spec.Pause
. It has actually been in there since the "beginning" of the MCO when the machineconfig types were initially added. - When set to
true
that field instructs the MCO's node controller uses it to prevent configuration changes from being rolled out to nodes.
It's useful when you want to prevent the MCO from changing node config, or when you want to stack up a whole bunch of changes and only take a single reboot.
Much like the initial MachineConfigPool
field, the logic around pause was part of the original machine-config-controller
design also.
So the sagely original developers knew that there would be cases where you wouldn't want configuration to roll out instantly, and they planned for it. What they maybe didn't plan for (and couldn't really have planned for) was all of the users who might eventually use it, the reasons it might be eventually used, and the long-term consequences and interactions resulting from its use.
So What Does Pause Actually Do?
The .Spec.Pause
field in the MachineConfigPool
CRD only does one very specific thing: It prevents the MCO's node_controller from changing a node's desiredConfig annotation.
That's it. That's all it does. It happens right here and we just return early during the pool sync without doing anything else.
Everything else in the MCO happens as normal. It doesn't pause the entire MCO. It's doesn't even pause "most" of the MCO, it just keeps that one thing from happening -- which conveniently is enough to prevent any new MCO config changes from being applied to any nodes.
(NOTE: Emphasis on new config changes here because if desiredConfig
has already been set on a node, and the machine-config-daemon
is already in the process of applying a configuration, pause will not stop it, because that's not what it stops).
To reiterate -- things pause stops:
- The MCO's
node_controller
from setting thedesiredConfig
annotation on nodes in a pool
This, by extension, does stop the machine-config-daemon
from applying configuration to nodes in the paused pool because:
- The
machine-config-daemon
only applies config when thedesiredConfig
annotation changes
Things Pause
does not stop:
- New
ControllerConfig
generation - New template
MachineConfig
generation/regeneration - New rendered
MachineConfig
generation - A user manually setting
desiredConfig
for a node (please do not do this) - A toolchest falling down the stairs
The MCO As The Postal Service
So in general MCO really likes to think of itself like the postal service. One of its most common defenses of whether or not we should be inspecting or surfacing some sort of config-specific data is "we don't look in the envelope, we just deliver it."
Yes, there are places where maybe we cheat a little and snarf an object out of the cluster and pack the envelope (which is maybe why it's okay for us to take a peek in that envelope later), but for the most part we try to hold to that philosophy.
So you have this big sprawling MCO postal service:
- it services every city (each
MachineConfigPool
) - it has a bunch of intake mailboxes that people drop mail in (templates, cluster object changes,
MachineConfig)
- it has sorting and consolidation offices (
render_controller
) - it has the "last mile" letter carrier (
node_controller
)- that puts the actual mail (
rendered MachineConfig
) - in your mailbox (
Node
desiredConfig
annotation) - so that the recipient (
machine-config-daemon
) can open it and deal with it.
- that puts the actual mail (
This metaphorical postal service also has a very weird feature (Pause
), wherein:
- the Mayor (
Administrator
) of a city (MachineConfigPool
) can call in and say "stop delivering mail to my city until I tell you that you can resume delivery" (.Spec.Pause
) and the postal service will obey and stop that "last mile" letter carrier (node_controller
) for that city from delivering mail.
I know, it's not 100% perfect, but it should get us where we need to go.
(this diagram is obviously a hand-wavy oversimplification)
The Mayor Said To Stop, So We Did?
So let's say the mayor of the city of worker
has called the postal service and said to stop delivering mail.
As we said, this Pause
in mail delivery doesn't stop anything other than that last mile letter carrier delivering the mail to residents' mailboxes. People still mail letters, the mail still gets picked up, and packed and sorted, and loaded onto mail vehicles, but the mail never gets delivered.
Residents, of worker
of course, could continue to open and deal any mail that had previously been delivered to their mailboxes, but they wouldn't get any new mail (and they obviously can't open and use mail they don't have).
To those doing the mailing, everything appears to be functioning perfectly -- it all still gets collected.
But to those expecting to receive mail, things are different: nothing shows up. Maybe that's okay for the most part. Maybe there actually wasn't any mail set to be delivered. Maybe the residents are busy and can't or don't want to deal with their mail right now anyway.
But let's say there is a resident of worker
out there waiting for a really important piece of mail. Something with a deadline. Something like...their new passport, so they can go on vacation in a week. If they don't get it in time, they can't travel, and if they can't travel they're going to miss their trip.
Well, now we have problems because this particular mail is very important, and it's also not being delivered because the Mayor told the postal service not to.
Your Nodes Were Expecting A Very Important Letter
This "important mail not being delivered" scenario is exactly why we had to add the alerting around Pause
. In our case, the "important mail" is the certificate bundle that controls trust between the Node
and the cluster -- it is "mailed" by the apiserver operator, and it expects it to be delivered in a timely fashion. If the old bundle expires before the new one shows up, your Node
stops talking to the cluster and becomes unmanageable.
More specifically, I'm referring to the kube-apiserver-to-kubelet-signer
certificate, which is managed by the kube-apiserver operator, which it rotates every 292 days (lifetime is 365 days, rotates at 80%).
It rotates it by updating the kube-apiserver-to-kubelet-signer
secret object in the openshift-kube-apiserver-operator
namespace.
And I said earlier, the MCO pays attention to many cluster objects so it can react when they change and reconcile. That apiserver secret just so happens to be one of the cluster objects that the machine-config-operator
pod cares about.
So when the apiserver rotates that certificate and the secret object changes, the machine-config-operator
notices and regenerates clusterconfig/renderconfig, which results in a new set of rendered MachineConfig
being generated containing that new certificate bundle that will totally save the day!
And that super important rendered MachineConfig
just...sits there behind Pause
. :(
We're Telling the Mayor There Will Be Trouble
So that's what the alert is for. Back in our metaphor, the alert we added is probably something like: "when we know 'important' mail is waiting at the post office, make sure the Mayor knows there is impact. Call him on the Big Red Telephone, and he can decide what to do".
But If It's So Important, Why Not Just Deliver It?
So now you're probably thinking "well why not just let that 'important mail' -- that certificate -- through?". It is "platform" config, not user config. We know where it's going and can sneak it in without a reboot, why not just do that?
Well, the short version is that:
- It's sets a weird precedent that pause isn't really pause
- We don't want to lie to you
The "contract" right now that we have with users is more or less that when pools are paused, no configuration gets rolled out. It's been that way since the beginning of time and people are used to it. Pause doesn't lie. The MCO will not move a node to a new configuration if the pool is paused.
Lie To Me
Alright, so what if we did decide to lie to you? We tell you your pool is paused, but we give a special dispensation to "important config" (however we classify it) and we just sneak it though and have the machine-config-daemon
just write it to the node.
That would mean (among other things):
- We have to have a way to know what is/isn't "important" enough to bypass pause
- We have to tell you about that special case (maybe in a "red box" in our documentation that you may or may not happen to read)
- We now have to apply only a subset of config (which the
machine-config-daemon
does not currently do) - Our
currentConfig
annotation will be a lie, because we aren't actually in thatcurrentConfig
anymore (MachineConfig
doesn't really have a status that supports "nuance" at this point -- either you're in a config and that's good, or your not and that's a problem) - Were there to be a severe enough error while sneaking through that "important" config, it could have stability consequences that the user didn't get to plan for.
We'd Rather Tell the Truth
So there isn't isn't an immediately obvious solution here without having to trade something:
- Maybe we break out certificates into another operator that only reconciles certificates (but now the node config is a combination of "whatever is in machineconfig" + "whatever the 'certificate operator' applied")
- Maybe we do a "one-off" rendered-config that's special that only contains the certificate changes (but now we're moving off of the current config "without the user's consent")
- Or maybe the alert is enough (but the user has to fix the issue themselves and potentially take an unwanted outage depending on what kind of config is pending)
One of those trades might eventually end up being right, but we're not quite sure yet, and regardless the ramifications would need to be fully communicated and understood.
How Do You Want The MCO "Postal Service" To Handle Important Deliveries?
So like I said at the beginning -- we're not always entirely sure how you're going to use what we put out there. We want to help you, but we also want to stay out of your way. Ideally we would be 100% helpful and 0% "in your way", but understandably that's difficult given the wide variety of use cases.
Maybe you'll use Pause
how we thought it would be used. Maybe you will use it an entirely different way, but either way we would love to know what you think.
When the MCO isn't worrying about the consequences of Pause
, we're working on cool new things like OCP CoreOS layering which lets you use Containerfile (Dockerfile) syntax to configure your nodes -- check it out!
Über den Autor
Mehr davon
Nach Thema durchsuchen
Automatisierung
Das Neueste zum Thema IT-Automatisierung für Technologien, Teams und Umgebungen
Künstliche Intelligenz
Erfahren Sie das Neueste von den Plattformen, die es Kunden ermöglichen, KI-Workloads beliebig auszuführen
Open Hybrid Cloud
Erfahren Sie, wie wir eine flexiblere Zukunft mit Hybrid Clouds schaffen.
Sicherheit
Erfahren Sie, wie wir Risiken in verschiedenen Umgebungen und Technologien reduzieren
Edge Computing
Erfahren Sie das Neueste von den Plattformen, die die Operations am Edge vereinfachen
Infrastruktur
Erfahren Sie das Neueste von der weltweit führenden Linux-Plattform für Unternehmen
Anwendungen
Entdecken Sie unsere Lösungen für komplexe Herausforderungen bei Anwendungen
Original Shows
Interessantes von den Experten, die die Technologien in Unternehmen mitgestalten
Produkte
- Red Hat Enterprise Linux
- Red Hat OpenShift
- Red Hat Ansible Automation Platform
- Cloud-Services
- Alle Produkte anzeigen
Tools
- Training & Zertifizierung
- Eigenes Konto
- Kundensupport
- Für Entwickler
- Partner finden
- Red Hat Ecosystem Catalog
- Mehrwert von Red Hat berechnen
- Dokumentation
Testen, kaufen und verkaufen
Kommunizieren
Über Red Hat
Als weltweit größter Anbieter von Open-Source-Software-Lösungen für Unternehmen stellen wir Linux-, Cloud-, Container- und Kubernetes-Technologien bereit. Wir bieten robuste Lösungen, die es Unternehmen erleichtern, plattform- und umgebungsübergreifend zu arbeiten – vom Rechenzentrum bis zum Netzwerkrand.
Wählen Sie eine Sprache
Red Hat legal and privacy links
- Über Red Hat
- Jobs bei Red Hat
- Veranstaltungen
- Standorte
- Red Hat kontaktieren
- Red Hat Blog
- Diversität, Gleichberechtigung und Inklusion
- Cool Stuff Store
- Red Hat Summit