Not all traffic has the same priority, and when there is contention for bandwidth, there should be a mechanism for network appliances outside the OpenShift Container Platform (OCP) cluster to prioritize the traffic. To enable this, we will use Quality of Service (QoS) Differentiated Services Code Point (DSCP), which allows us to classify packets by setting a 6-bit field in the IP header, effectively marking the priority of a given packet relative to other packets as "Critical," "High Priority," "Best Effort," and so on.
Marking packets with DSCP as they leave the cluster allows a router to distinguish between them, determine, for example, which require higher bandwidth or higher priority, and handle them accordingly.
Starting with OCP 4.11, the OVN-Kubernetes Container Network Interface (CNI) introduces EgressQoS as a Developer Preview feature (enabled by default for all customers). It enables a cluster administrator to mark pods' egress traffic with a valid QoS DSCP value. The markings are consumed and acted on by network appliances outside the OCP cluster to optimize traffic flow throughout their networks.
Configuring the router to handle DSCP markings is outside the scope of this post. Instead, we'll focus on how we can apply different markings to traffic coming from pods heading to an external destination using EgressQoS.
A simple user story example: As a cluster administrator, I pre-configured my router to handle the different DSCP values of incoming traffic (using colors for demonstration; in reality they are decimal values from 0 to 63), giving “green” traffic full priority, “yellow” traffic low priority, and “red” traffic best effort. I want egress traffic coming from different applications (pods) in a given namespace (namespace1) to be marked with different DSCP “colors” so my router can handle them properly and allow their requirements to be fulfilled.
In this post, we'll explore how to configure this in OCP clusters that use OVN-Kubernetes CNI as their network provider.
What is EgressQoS?
Starting from OCP 4.11, EgressQoS (Developer Preview) is a namespaced Custom Resource Definition (CRD) that enables marking pods' egress traffic with a valid QoS DSCP value. A namespace supports only one EgressQoS resource, which must be named default (other EgressQoS resources are ignored).
An EgressQoS resource allows specifying a list of QoS rules, each consisting of three fields:
- dscp: DSCP value to set on matching egress traffic
- dstCIDR (optional): Apply DSCP to traffic heading to this CIDR
- podSelector (optional): Apply DSCP to traffic from pods whose labels match this selector
kind: EgressQoS
apiVersion: k8s.ovn.org/v1
metadata:
  name: default
  namespace: default
spec:
  egress:
  - dscp: 30
    dstCIDR: 1.2.3.0/24
  - dscp: 42
    podSelector:
      matchLabels:
        app: example
  - dscp: 28
This example marks the packets originating from pods in the default namespace in the following way:
- All traffic heading to an address that belongs to 1.2.3.0/24 is marked with DSCP 30.
- Egress traffic from pods labeled app: example heading to a CIDR that is not 1.2.3.0/24 is marked with DSCP 42.
- All other egress traffic is marked with DSCP 28.
IMPORTANT: The priority of a rule is determined by its placement in the egress array. An earlier rule is processed before a later rule, and the first rule that matches determines the marking. In this example, if the rules are reversed, all traffic originating from pods in the default namespace is marked with DSCP 28, regardless of its destination or pod labels. For that reason, specific rules should always come before general ones in the array.
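To make the ordering behavior concrete, here is a minimal Python sketch that simulates first-match evaluation over the example rules. It is only an illustration of the semantics described above, not how OVN-Kubernetes implements them; the names RULES, rule_matches, and pick_dscp are hypothetical, and the selector handling assumes simple matchLabels equality.

```python
import ipaddress

# Hypothetical in-memory model of the example rules. The field names mirror
# the EgressQoS spec, but this is an illustration, not the OVN-Kubernetes code.
RULES = [
    {"dscp": 30, "dstCIDR": "1.2.3.0/24"},
    {"dscp": 42, "podSelector": {"app": "example"}},
    {"dscp": 28},
]


def rule_matches(rule, pod_labels, dst_ip):
    """Return True if the pod labels and destination satisfy the rule."""
    cidr = rule.get("dstCIDR")
    if cidr is not None:
        if ipaddress.ip_address(dst_ip) not in ipaddress.ip_network(cidr):
            return False
    selector = rule.get("podSelector", {})
    # Every label in the selector must be present on the pod with the same value.
    return all(pod_labels.get(key) == value for key, value in selector.items())


def pick_dscp(rules, pod_labels, dst_ip):
    """Return the DSCP of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule_matches(rule, pod_labels, dst_ip):
            return rule["dscp"]
    return None


print(pick_dscp(RULES, {"app": "example"}, "1.2.3.4"))  # 30
print(pick_dscp(RULES, {"app": "example"}, "8.8.8.8"))  # 42
print(pick_dscp(RULES, {}, "8.8.8.8"))                  # 28

# With the rules reversed, the catch-all rule wins for every packet.
print(pick_dscp(list(reversed(RULES)), {"app": "example"}, "1.2.3.4"))  # 28
```

The last call shows why general rules belong at the end: once the catch-all rule is first, the more specific rules are never reached.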
Usage Example
Following the user story mentioned previously, here we want packets coming from the default namespace to be marked as follows:
- All packets heading to 172.18.0.6/32 are marked with DSCP 40.
- All packets heading to 172.18.0.7/32 from pods labeled app: demo are marked with DSCP 50.
To achieve that, we create the following EgressQoS resource in our OCP cluster:
apiVersion: k8s.ovn.org/v1
kind: EgressQoS
metadata:
  name: default
  namespace: default
spec:
  egress:
  - dscp: 40
    dstCIDR: 172.18.0.6/32
  - dscp: 50
    dstCIDR: 172.18.0.7/32
    podSelector:
      matchLabels:
        app: demo
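Before applying a resource like this, it can be useful to sanity-check the rule list. The following is a minimal Python sketch under stated assumptions: validate_egress_rules is a hypothetical helper, not part of OCP or OVN-Kubernetes, and it checks only two properties mentioned in this post, namely that dscp is a decimal from 0 to 63 and that dstCIDR, when present, parses as a CIDR.

```python
import ipaddress


def validate_egress_rules(rules):
    """Return a list of human-readable problems found in the rules."""
    problems = []
    for index, rule in enumerate(rules):
        dscp = rule.get("dscp")
        # DSCP is a 6-bit field, so valid values are 0 through 63.
        is_int = isinstance(dscp, int) and not isinstance(dscp, bool)
        if not is_int or not 0 <= dscp <= 63:
            problems.append(
                f"rule {index}: dscp must be an integer in 0-63, got {dscp!r}"
            )
        cidr = rule.get("dstCIDR")
        if cidr is not None:
            try:
                # strict=False is a choice made for this sketch: it accepts
                # CIDRs with host bits set instead of rejecting them.
                ipaddress.ip_network(cidr, strict=False)
            except ValueError:
                problems.append(f"rule {index}: dstCIDR {cidr!r} is not a valid CIDR")
    return problems


# The rules from the resource above, written as plain Python data.
rules = [
    {"dscp": 40, "dstCIDR": "172.18.0.6/32"},
    {"dscp": 50, "dstCIDR": "172.18.0.7/32", "podSelector": {"app": "demo"}},
]
print(validate_egress_rules(rules))  # []
```

An out-of-range value such as dscp: 64 or a malformed dstCIDR would each add one entry to the returned list.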
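Assume the default namespace contains pods both with and without the app: demo label, running on different nodes.
We can then expect traffic to 172.18.0.6 to be marked with DSCP 40 from every pod, while traffic to 172.18.0.7 is marked with DSCP 50 only when it comes from pods labeled app: demo and is left unmarked otherwise.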
And, indeed, running tcpdump on each of the destinations and pinging them from the pods results in:
tcpdump on 172.18.0.6 host:
bash-5.0# tcpdump -i eth0 -v icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:40:06.238100 IP (tos 0xa0, ttl 62, id 23892, offset 0, flags [DF], proto ICMP (1), length 84)
ovn-worker > a7acb5556708: ICMP echo request, id 7424, seq 0, length 64
10:40:08.280624 IP (tos 0xa0, ttl 62, id 42569, offset 0, flags [DF], proto ICMP (1), length 84)
ovn-worker2 > a7acb5556708: ICMP echo request, id 6656, seq 0, length 64
tcpdump on 172.18.0.7 host:
bash-5.0# tcpdump -i eth0 -v icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:44:33.847400 IP (tos 0xc8, ttl 62, id 58984, offset 0, flags [DF], proto ICMP (1), length 84)
ovn-worker > 90d8708e53a8: ICMP echo request, id 7680, seq 0, length 64
10:44:37.536332 IP (tos 0x0, ttl 62, id 33532, offset 0, flags [DF], proto ICMP (1), length 84)
ovn-worker2 > 90d8708e53a8: ICMP echo request, id 6912, seq 0, length 64
DSCP is derived from the tos field printed by tcpdump. The DSCP value occupies the upper 6 bits of that byte, so to get the decimal DSCP value we convert the hexadecimal tos value to decimal and shift it 2 bits to the right (e.g., 0xc8 = 200, and shifting 2 bits to the right gives 50).
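The conversion can be sketched in a few lines of Python. The helper names tos_to_dscp and dscp_from_tcpdump_line are hypothetical, and the regular expression assumes the verbose tcpdump header format shown in the output above.

```python
import re

# Matches the "(tos 0x..," part of a verbose tcpdump IP header line.
TOS_PATTERN = re.compile(r"\(tos (0x[0-9a-fA-F]+)")


def tos_to_dscp(tos):
    """Drop the 2 low-order bits of the ToS byte to get the 6-bit DSCP value."""
    return tos >> 2


def dscp_from_tcpdump_line(line):
    """Return the DSCP value from a tcpdump header line, or None if no tos field."""
    match = TOS_PATTERN.search(line)
    if match is None:
        return None
    return tos_to_dscp(int(match.group(1), 16))


line = (
    "10:44:33.847400 IP (tos 0xc8, ttl 62, id 58984, offset 0, "
    "flags [DF], proto ICMP (1), length 84)"
)
print(dscp_from_tcpdump_line(line))  # 50
print(tos_to_dscp(0xA0))             # 40
print(tos_to_dscp(0x0))              # 0
```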
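When a packet from a pod exits a node, its source IP is changed to the node’s IP, which is why the packets in the output appear to come from our nodes (ovn-worker and ovn-worker2).
Overall, the tcpdump outputs show that we have reached the desired state: both packets arriving at 172.18.0.6 carry tos 0xa0 (DSCP 40), while at 172.18.0.7 one packet carries tos 0xc8 (DSCP 50) and the other arrives with tos 0x0 (unmarked).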
Summary
In this post we saw how an OCP cluster running OVN-Kubernetes CNI can use QoS DSCP to mark selected pods’ egress traffic with a simple CRD. This allows routers and other network appliances that are connected to the cluster to prioritize packets from pods the same way they do for virtual machines (VMs) and bare-metal servers.