
Red Hat Ansible Automation Platform 2 is the next generation automation platform from Red Hat’s trusted enterprise technology experts. We are excited to announce that the Ansible Automation Platform 2.3 release includes automation controller 4.3.
In the previous blog, we saw that automation controller 4.1 provides significant performance improvements compared to Red Hat Ansible Tower 3.8. Automation controller 4.3 takes that one step further. This post describes an important change to callback receiver workers in automation controller 4.3 and how it affects performance.
Callback Receiver
The callback receiver is the process in charge of transforming the standard output of Ansible into serialized objects in the automation controller database. This enables reviewing and querying results from across all your infrastructure and automation. This process is I/O and CPU intensive and requires performance considerations.
Every control node in automation controller has a callback receiver process. It receives job events that result from Ansible jobs. Job events are JSON structures, created when Ansible calls the runner callback plugin hooks. This enables Ansible to capture the result of a playbook run. The job event data structures contain data from the parameters of the callback plugin hooks plus unique IDs that reference other job events. The following is an example job event:
"event": "playbook_on_play_start",
"counter": 2,
"event_display": "Play Started (all)",
"event_data": {
    "playbook": "chatty_tasks.yml",
    "playbook_uuid": "aca1b0da-f29c-4fcf-be35-1aa59a30a4e0",
    "play": "all",
    "play_uuid": "faacc0d4-457c-ac33-a7f4-00000000006a",
    "play_pattern": "all",
    "name": "all",
    "pattern": "all",
    "uuid": "faacc0d4-457c-ac33-a7f4-00000000006a",
    "guid": "a70eb73c9c2241e0995963a6dcd4b89b"
},
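To make the structure concrete, here is a minimal Python sketch that loads a trimmed copy of the event above and reads a few of its fields. Wrapping the fragment in braces to form a complete JSON object is an assumption made for illustration; the real event carries more fields than are shown here.

```python
import json

# Trimmed copy of the example event, wrapped in braces so that it parses
# as a standalone JSON object (an assumption; the real event is larger).
raw_event = """
{
  "event": "playbook_on_play_start",
  "counter": 2,
  "event_display": "Play Started (all)",
  "event_data": {
    "play": "all",
    "play_uuid": "faacc0d4-457c-ac33-a7f4-00000000006a",
    "uuid": "faacc0d4-457c-ac33-a7f4-00000000006a"
  }
}
"""

event = json.loads(raw_event)

# Top-level keys describe the event itself.
event_type = event["event"]
counter = event["counter"]

# In this example the event's own uuid is the same value as play_uuid,
# which is the kind of ID other job events can use to point back at the play.
uuid_matches_play = event["event_data"]["uuid"] == event["event_data"]["play_uuid"]

print(event_type, counter, uuid_matches_play)
```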
These job events are pushed to a Redis queue and processed by the callback receiver. Each callback receiver has workers that process these job events and save them in the database. Prior to automation controller 4.3, each callback receiver had four workers by default, regardless of the size of the control node. For customers who vertically scale their control nodes, this could cause performance issues because the number of callback receiver workers did not scale with the capacity of the control node(s).
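The queue-and-workers pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the controller's implementation: an in-memory `queue.Queue` stands in for the Redis queue, a plain list stands in for the database, and threads stand in for the worker processes. All names here are hypothetical.

```python
import queue
import threading

NUM_WORKERS = 4  # the pre-4.3 default described above

event_queue = queue.Queue()    # stand-in for the Redis queue
saved_events = []              # stand-in for the controller database
saved_lock = threading.Lock()  # protects saved_events across threads


def worker():
    """Pull events off the queue and 'save' them until a None sentinel arrives."""
    while True:
        event = event_queue.get()
        if event is None:
            event_queue.task_done()
            break
        with saved_lock:
            saved_events.append(event)
        event_queue.task_done()


def process_events(events, num_workers=NUM_WORKERS):
    """Run a pool of workers over the given events and wait for them to finish."""
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for event in events:
        event_queue.put(event)
    for _ in threads:
        event_queue.put(None)  # one shutdown sentinel per worker
    event_queue.join()
    for t in threads:
        t.join()


process_events([{"counter": i, "event": "runner_on_ok"} for i in range(100)])
print(len(saved_events))  # 100
```

With a fixed `NUM_WORKERS`, the rate at which the queue drains does not change no matter how many CPUs the host has, which is the limitation the rest of this post addresses.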
Performance Issues
Large Ansible Automation Platform clusters generate a huge volume of job events when running at maximum capacity (the maximum allowed forks), that is, when running many jobs at once. Job templates run at higher verbosity generate even more job events. During our performance analysis, we noticed that when the volume of job events exceeded what the default four callback receiver workers could handle, events queued up in Redis waiting to be processed. Because Redis is an in-memory database, as more and more job events queued up, the underlying control node ran out of memory (OOM) and the Redis processes were killed.
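A rough back-of-the-envelope model shows why a fixed worker count leads to memory growth. The rates and event size below are hypothetical numbers chosen for illustration, not measurements, and the model assumes throughput scales linearly with the number of workers, which real systems only approximate.

```python
def backlog_after(seconds, events_in_per_sec, events_per_worker_per_sec, workers):
    """Return how many events are still queued after `seconds` of steady load."""
    drain_rate = events_per_worker_per_sec * workers
    growth_per_sec = max(0, events_in_per_sec - drain_rate)
    return growth_per_sec * seconds


# Hypothetical figures, for illustration only.
INCOMING_PER_SEC = 5000    # events produced by running jobs
PER_WORKER_PER_SEC = 500   # events one worker can save per second
AVG_EVENT_BYTES = 2000     # average size of a queued event

results = {}
for workers in (4, 8, 16):
    queued = backlog_after(600, INCOMING_PER_SEC, PER_WORKER_PER_SEC, workers)
    results[workers] = (queued, queued * AVG_EVENT_BYTES / 1e6)
    print(f"{workers} workers: {queued} events queued, "
          f"about {results[workers][1]:.0f} MB held in memory")
```

Under these assumed numbers, four workers leave 1,800,000 events (about 3,600 MB) queued after ten minutes, eight workers leave 600,000 (about 1,200 MB), and sixteen workers keep up with the load.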
Solution
While versions of automation controller prior to 4.3 had the option of modifying the JOB_EVENT_WORKERS setting to increase the number of callback receiver workers from the default of four, it was not a well-known administrative setting. Now, in automation controller 4.3, vertically scaling control nodes not only increases the capacity to run jobs (which generate events), it also proportionally scales the number of callback receiver workers to better handle the output from those jobs and to make use of the host resources available to automation controller.
This is accomplished by enhancements to the traditional installer and the Red Hat OpenShift operator. For virtual machine and bare metal installations, the 4.3 installer sets the number of callback receiver workers equal to the number of CPUs. For example, if a VM control node has eight CPUs, the installer sets the number of callback receiver workers to eight.
For Red Hat OpenShift operator-based installs, the number of callback receiver workers is set to the CPU limit for the task container if the CPU limit is greater than four. Additionally, administrators may set the number of callback receiver workers manually by setting the JOB_EVENT_WORKERS property in a custom settings file. For more information on making this modification manually, visit the performance tuning guide.
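The two sizing rules described above can be written as a small Python sketch. The function names are hypothetical, and falling back to four workers when the OpenShift CPU limit is four or less (or unset) is an assumption based on the earlier default; the installer and operator may handle edge cases differently.

```python
DEFAULT_WORKERS = 4  # default before 4.3, per the text above


def workers_for_vm(cpu_count):
    """VM and bare metal rule as described: one worker per CPU."""
    return cpu_count


def workers_for_openshift(cpu_limit):
    """OpenShift rule as described: use the CPU limit when it exceeds four.

    Returning DEFAULT_WORKERS otherwise is an assumption, not confirmed
    by the text. cpu_limit is taken to be a whole number of CPUs.
    """
    if cpu_limit is not None and cpu_limit > DEFAULT_WORKERS:
        return cpu_limit
    return DEFAULT_WORKERS


print(workers_for_vm(8))         # 8
print(workers_for_openshift(2))  # 4
print(workers_for_openshift(6))  # 6

# A manual override would be a single assignment in a custom settings
# file, assuming that file uses Python assignment syntax, for example:
# JOB_EVENT_WORKERS = 8
```

Consult the performance tuning guide for the exact location and format of the custom settings file before changing JOB_EVENT_WORKERS on a real deployment.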
Takeaways & where to go next
With this change to how callback receiver workers are configured, the risk of running into OOM issues is reduced and the overall performance of automation controller improves. In the next blog, we compare some results of this change in two different automation controller clusters.
If you're interested in detailed information on automation controller, then the automation controller documentation is a must-read. To download and install the latest version, please visit the automation controller installation guide. To view the release notes of recent automation controller releases, please visit release notes 4.3. If you are interested in more details about Ansible Automation Platform, be sure to check out our e-books.