As a team of Linux enthusiasts who are always pushing the limits of OpenShift with the latest and greatest technologies to enable new solutions, we at the Performance Sensitive Applications (PSAP) team have been playing with WireGuard to see how we can leverage the technology moving forward.
One of the problems that we face on the PSAP team is that we have a number of internal labs in different physical locations, each with its own hardware configurations. For example, some of the newest hardware, like the NVIDIA DGX-2, might be located in Lab B, while our other OpenShift clusters are in Lab A. If we want to verify that our operators (such as the GPU operator) work well on the DGX-2, how can we do this without physically moving the hardware around (especially challenging given current restrictions on lab access)? How can we enable hardware from any lab to be added to our clusters on different networks? What we have done in these cases is use WireGuard as an overlay network to bridge the private lab networks via the public network.
If you are not yet familiar with WireGuard, you should be, as it is the latest in modern VPN technologies to get merged into the Linux kernel (5.6+). WireGuard has a number of differentiating features compared to the VPNs of yesteryear. It has excellent performance, uses strong cryptography, changes the traditional operational model to improve security, and is easily configurable with standard Linux tools. Another benefit is that due to the “opinionated” nature of WireGuard, there is extremely little to configure in order to get up and running.
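To give a sense of how little configuration is involved, here is a minimal sketch of a point-to-point peer setup using the standard `wg` and `wg-quick` tools. The interface name, addresses, keys, and endpoint are placeholders rather than values from our labs.

```
# Generate a key pair for this peer (standard WireGuard tooling).
umask 077
wg genkey | sudo tee /etc/wireguard/privatekey | wg pubkey | sudo tee /etc/wireguard/publickey

# Minimal /etc/wireguard/wg0.conf -- addresses, keys, and endpoint are placeholders.
sudo tee /etc/wireguard/wg0.conf > /dev/null <<'EOF'
[Interface]
Address = 10.10.0.2/24
PrivateKey = <contents of /etc/wireguard/privatekey>
ListenPort = 51820

[Peer]
PublicKey = <public key of the remote peer>
Endpoint = jump-node.example.com:51820
AllowedIPs = 10.10.0.0/24
PersistentKeepalive = 25
EOF

# Bring the tunnel up.
sudo wg-quick up wg0
```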
Figure 1. Block diagram of a sample lab-spanning cluster
Our first step was adding a “special” RHEL worker node to an existing cluster, which we did using Ansible playbooks (https://github.com/openshift-psap/wireguard-worker). The playbook first registers the node with subscription-manager to get the necessary entitlements, installs the packages, performs the key exchange, and then brings up the WireGuard network. Once the `wireguard-worker` playbook has run to completion, we can go ahead and run the usual OpenShift scale-up playbooks (https://github.com/openshift/openshift-ansible#run-the-rhel-node-scaleup-playbook) that we use to add RHEL hosts to the cluster; an example invocation is sketched below. This works as expected because we have our own private DNS server for each cluster on the “jump node,” as shown in the diagram above, so we can add the worker nodes there and enable packet forwarding between the private WireGuard network and the private bare-metal network.
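For reference, the overall flow looks roughly like the following. The inventory and playbook file names here are illustrative, so check the two repositories linked above for the exact paths in your checkout.

```
# Bring up the WireGuard overlay on the new RHEL worker (wireguard-worker repo).
ansible-playbook -i inventory/hosts wireguard-worker.yml

# Then run the standard RHEL node scale-up playbook from openshift-ansible.
ansible-playbook -i inventory/hosts playbooks/scaleup.yml
```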
Once we got WireGuard working with RHEL nodes, we moved on to developing a way to use this functionality with RHCOS as the “special” worker node OS. Doing this on RHCOS is a bit more complicated because the RHCOS paradigm is entirely different (immutable infrastructure): we do not install anything on the host. Instead, we turn to a pattern that we use often in our operators (the GPU Operator and Network Operator, for example): building and/or loading kernel modules inside a container. Enter the new `wg-rhcos` container (https://github.com/sjug/wg-rhcos), which does what’s needed from inside a container running on the “special” RHCOS worker node.
The requirements to install the packages are the same regardless of the node OS:
1. We must have entitlements on the host.
2. We must have the WireGuard configuration (key exchange) in place on the host, to be mounted into the container.
Once we have both in place, we can run the `wg-rhcos` container on the host, and it will take care of the rest; a quick sanity check is sketched below. In the future, we will be adding RHCOS support to the `wireguard-worker` playbooks so that the whole deployment and network establishment is automated.
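Before launching the container, it is worth confirming that both prerequisites are actually present on the host. A rough sanity check might look like this; the entitlement path shown is a common location, but it can differ depending on how entitlements are delivered to the node.

```
# 1.) Entitlement certificates present on the host?
ls /etc/pki/entitlement/*.pem

# 2.) WireGuard configuration (with the exchanged peer keys) in place?
ls /etc/wireguard/
cat /etc/wireguard/wg0.conf
```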
Take a look at the command that we use to launch the `wg-rhcos` container:
```
sudo podman run -d \
--privileged \
--network host \
--mount type=bind,source=/etc/resolv.conf,target=/etc/resolv.conf \
-v /etc/os-release:/etc/os-release:ro \
-v /var/lib/containers/storage/overlay:/tmp/overlay:ro \
-v /etc/wireguard:/etc/wireguard \
--restart on-failure \
quay.io/jugs/wg-rhcos
```
- `--privileged` -- We must run the container in privileged mode.
- `--network host` -- We tell the container to use the host network; this way the interface is created on, and usable by, the host network.
- `--mount` -- We mount in the resolv.conf file so we can set the DNS server to the private DNS rather than the lab DNS server.
- `-v /etc/os-release` -- We mount in os-release to determine which repos we should enable.
- `-v /var/lib/containers/storage/overlay` -- We mount in the overlay storage, as we ship some kernel packages that we need here.
- `-v /etc/wireguard` -- We mount in the WireGuard configuration directory so the container can access the configuration file.
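Once the container is running, we can verify from the host that the tunnel is up and that DNS now points at the private server. The interface name and jump node address below are placeholders.

```
# Did the WireGuard interface come up and complete a handshake with the peer?
sudo wg show
ip addr show wg0

# Is the private DNS server in resolv.conf, and is the jump node reachable over the tunnel?
cat /etc/resolv.conf
ping -c 3 10.10.0.1
```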
The container above will install the necessary dependencies on our RHCOS node (provided the entitlements are present on the host), install the WireGuard packages, bring up the network, and set the DNS. Assuming we booted the “special” worker with the `worker.ign` generated by the OpenShift installer, once the WireGuard network is established the Ignition service will be able to reach the API server for the `worker.ign` append, and the node will be added to the existing cluster.
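From the cluster side, this looks like any other node addition: we watch for the node to show up and, if its certificate signing requests are pending, approve them. The CSR name below is a placeholder.

```
# Watch for pending CSRs from the new node and approve them.
oc get csr
oc adm certificate approve <csr-name>

# Confirm the "special" worker has joined the cluster.
oc get nodes -o wide
```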
Hopefully, you have enjoyed this sneak peek into some solutions we have put together to enable a distributed lab network layout, utilizing the new WireGuard VPN networks. Stay tuned to see more work that we are doing with WireGuard!
About the author
Sebastian Jug, a Senior Performance and Scalability Engineer, has been working on OpenShift performance at Red Hat since 2016. He is a software engineer and Red Hat Certified Engineer with experience enabling performance-sensitive applications with devices such as GPUs and NICs. His focus is on automating, qualifying, and tuning the performance of distributed systems. He has been a speaker at a number of industry conferences, such as KubeCon and STAC Global.