Southwest Airlines is expanding its automation use cases

Southwest Airlines (Southwest) runs more than 4,000 flights daily across its 120 airports. The network is a crucial part of its operations. The airline adopted Red Hat Ansible Automation Platform to support IT networking and to mitigate outages by standardizing configurations. Automation technology has also accelerated network device software updates and freed engineers to focus on innovation. Southwest is expanding its use of Ansible Automation Platform technology to bring new use cases to life across different departments.

Benefits 

  • Saved at least 5 months in building and testing NAC configurations 
  • Ensured scalability as well as speed of response
  • Safeguarded consistency to reduce errors and potential outages
  • Enabled a greater focus on innovation

Realizing an ambition to be the world’s best airline

Southwest Airlines (Southwest) is renowned for being the ‘airline with heart’. When it first took flight from a small airport in Dallas, Texas, in 1971, Southwest focused on making air travel accessible to everyone through friendly, reliable, and low-cost services. Having grown to more than 120 airports across 11 countries, Southwest’s heart remains in the same place. In addition to offering customers and employees ‘the utmost love and care’, Southwest’s ambition is also to become the world’s most-loved, efficient, and profitable airline. 

Information and technology play an increasing role in helping Southwest meet its goals. Southwest runs more than 4,000 flights daily, and its network infrastructure is critical to keeping planes in the air. The number one mission of Southwest’s network engineers is to ensure the network is up and running. They manage around 5,000 network devices, switches, and F5 LTM and GTM load balancers. Each airport can have between 25 and 100 devices.

“Our airports need access to all their applications, so our staff can get gates ticketed and customers on the plane,” said Carlos Tapia, Senior Systems Engineer at Southwest. 

With so many airports and engineers, keeping network configurations from drifting was a real issue. In addition, more flying hours meant smaller maintenance windows. Engineers would often get up in the early hours of the morning to implement a change, then spend hours documenting it.

Automating tasks across network engineering

Southwest started exploring how an automated change process could create a change ticket, schedule the task, implement the change, and close the ticket quickly without manual input. Tapia spoke with Red Hat about Southwest’s challenges at Cisco Live. He quickly realized that Red Hat Ansible Automation Platform could help. 

The team’s first automation use case dealt with network access control (NAC), preventing rogue external devices from obtaining an IP address and connecting to Southwest’s network. More recent use cases include quickly standing up an airport’s network.

Southwest created a standardized network setup called the “golden configuration,” designed by its network engineers. This preconfigured setup provides a baseline for 90% of what’s needed at each airport. Engineers run an Ansible Playbook to set up the core functionality of switches and routers, then spend about 20 minutes configuring any airport-specific details. With automation, the whole process takes around 20 minutes. The manual process took significantly longer: before automation, engineers needed about 30 minutes to configure a single device, and that time was multiplied by the number of devices at the airport. The “golden configuration” also ensures a more consistent and reliable network across all airports.
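The time comparison can be worked through in a few lines of Python. This is rough arithmetic only: it assumes devices were configured one after another at the quoted 30 minutes each, and it uses the range of 25 to 100 devices per airport mentioned earlier. The function name is illustrative.

```python
# Rough arithmetic only; assumes manual configuration was strictly sequential.
MANUAL_MINUTES_PER_DEVICE = 30   # quoted manual time per device
AUTOMATED_MINUTES_TOTAL = 20     # quoted time with the golden configuration


def manual_hours(device_count: int) -> float:
    """Hours of manual work if each device takes 30 minutes in sequence."""
    return device_count * MANUAL_MINUTES_PER_DEVICE / 60


for devices in (25, 100):
    print(f"{devices} devices: {manual_hours(devices):.1f} h manual "
          f"vs about {AUTOMATED_MINUTES_TOTAL} min automated")
```

Under these assumptions, a 25-device airport works out to 12.5 hours of manual effort and a 100-device airport to 50 hours, against roughly 20 minutes with the Playbook.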

Two members of Tapia’s team write the Playbooks, using Ansible roles to organize tasks, templates, files, and variables. They follow the YAML approach for directory layout, use Jinja2 templating to format text, and integrate with GitLab for a single source of truth.

One of the biggest Playbooks manages Cisco IOS networking software upgrades. Southwest has at least 10 switch models, each with multiple firmware versions. “Our Playbook determines the correct version of the software based on the switch model, downloads the firmware, validates checksums, and performs the upgrade,” said Tapia. “It also deals with situations where the storage on the appliance is full.”
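The steps Tapia describes (choose an image by switch model, validate its checksum, and check storage) can be sketched in Python. This is not Southwest’s Playbook: the catalog structure, the model name, the function names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def pick_firmware(model: str, catalog: dict[str, dict[str, str]]) -> dict[str, str]:
    """Return the catalog entry for a switch model, or raise if unknown."""
    try:
        return catalog[model]
    except KeyError:
        raise ValueError(f"no firmware defined for model {model!r}") from None


def checksum_ok(image: bytes, expected_sha256: str) -> bool:
    """Compare the SHA-256 digest of the downloaded image with the expected one."""
    return hashlib.sha256(image).hexdigest() == expected_sha256.lower()


def has_room(free_bytes: int, image_size: int) -> bool:
    """Check that the device has enough free storage for the image."""
    return free_bytes >= image_size


# Hypothetical data: stand-in bytes instead of a real firmware download.
image = b"stand-in firmware bytes"
catalog = {"model-a": {"file": "model-a.bin",
                       "sha256": hashlib.sha256(image).hexdigest()}}

entry = pick_firmware("model-a", catalog)
print(checksum_ok(image, entry["sha256"]))              # True
print(checksum_ok(image + b"x", entry["sha256"]))       # False: corrupted download
print(has_room(free_bytes=10, image_size=len(image)))   # False: storage is full
```

A real workflow would stop before the upgrade whenever the checksum fails, and would free space or abort when the storage check fails.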

With automation engineers exploring new automation use cases, Tapia plans to invest more resources into network automation. He also plans to include a software development team to build a self-service portal where engineers can access the automation they need when they need it. Event-driven automation is also on the roadmap. Use cases include checking for and resolving disabled ports, running custom Python scripts in the network ATM (asynchronous transfer mode) environment, and automatically raising tickets if a circuit goes down.


Industry: Transportation

Headquarters: Dallas, Texas

Flights: More than 4,000 each day

Software and services: Red Hat® Ansible® Automation Platform

“Automation is mission critical at Southwest. Ansible Automation Platform is crucial as we continue our automation journey.”

Carlos Tapia, Senior Systems Engineer, Southwest Airlines

“Using Ansible Automation Platform with golden configurations also mitigates human error. Automation never makes mistakes.”

Carlos Tapia, Senior Systems Engineer, Southwest Airlines

Saving time while reducing risk and accelerating innovation

Saved at least 5 months in building and testing NAC configurations 

Speed is one of the most significant benefits Southwest has seen with automation. “With Ansible Automation Platform, it only took us 6 weeks to build and test the configurations for the NAC use case, then deploy them to all switches,” said Tapia. “Before automation, it would have taken us between 8 and 12 months.”

He also explained how Ansible Automation Platform would prove vital if critical systems went down. Minimizing downtime is essential, yet even a large team of engineers wouldn’t be able to manually check all layer 2, layer 3, and layer 4 devices, along with the firewall and DNS, quickly enough.

Ansible Automation Platform can now launch 100 different Playbooks at once to access all the information an application needs to identify if and where there is a problem with the network environment.
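The idea of fanning out many checks at once and collecting the failures can be illustrated with Python’s standard library. This sketch does not show how Ansible Automation Platform launches Playbooks; the device names, the worker count, and the `check_device` function are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def check_device(name: str) -> tuple[str, bool]:
    """Hypothetical health check; a real one would query the device."""
    return name, not name.endswith("-07")   # pretend one device is unhealthy


# 100 made-up device names, checked by a pool of worker threads.
devices = [f"device-{i:02d}" for i in range(100)]
with ThreadPoolExecutor(max_workers=20) as pool:
    results = dict(pool.map(check_device, devices))

failed = [name for name, ok in results.items() if not ok]
print(failed)   # ['device-07']
```

Running the checks in parallel means the time to find the faulty device depends on the slowest check and the pool size, not on the sum of all checks.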

Ensured scalability alongside speed 

Tapia used the Cisco IOS upgrade use case example to demonstrate how Ansible Automation Platform has helped his team accelerate and scale their operations. Enhanced speed and efficiency are essential when maintenance windows are shortening.

Cisco IOS upgrades were previously manual tasks where engineers typically upgraded up to 10 devices a night. “Upgrading devices used to be a very long and time-consuming project,” said Tapia. “Ansible Automation Platform allows us to make multiple changes in a short maintenance window; we could complete the Cisco IOS upgrade on at least 100 devices a night.” 

The network engineers have used the Cisco IOS update Playbook more than 2,500 times to upgrade switches. They’re embarking on a refresh this year to update the hardware and refresh the code in around 3,000 switches—all with the help of Ansible Automation Platform.
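The effect of the higher nightly throughput on a 3,000-switch refresh can be estimated with simple arithmetic. The sketch assumes one maintenance window per night and a constant rate, using the quoted figures of 10 devices a night manually and 100 a night with automation.

```python
import math


def nights_needed(switches: int, per_night: int) -> int:
    """Whole maintenance nights required at a constant nightly rate."""
    return math.ceil(switches / per_night)


print(nights_needed(3000, 10))    # 300 nights at the manual rate
print(nights_needed(3000, 100))   # 30 nights at the automated rate
```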

Safeguarded consistency to reduce errors and potential outages 

Standardized configurations help eliminate network outages caused by misconfiguration, and Ansible Automation Platform plays a critical role in keeping configurations standardized.

“We’ve built Playbooks that use our ‘golden configuration’ to eliminate configuration drift,” said Tapia. “When our engineers set up new devices, they use these Playbooks by accessing them from a web portal we’ve set up.”

Previously, a network engineer setting up a new switch would rely on documentation detailing the configuration for that type of switch. However, another engineer may have made an update to a switch of the same type without updating the documentation—causing configuration drift.

“Using Ansible Automation Platform with a ‘golden configuration’ also mitigates human error,” said Tapia. “Automation never makes mistakes.”
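Configuration drift of the kind described above can be made visible by comparing a device’s running configuration with the golden one. The following is a minimal sketch using Python’s `difflib`, not the mechanism Southwest’s Playbooks use; the configuration lines and the function name are invented for the example.

```python
import difflib


def config_drift(golden: str, running: str) -> list[str]:
    """Return the lines that differ between the golden and running configs."""
    diff = difflib.unified_diff(
        golden.splitlines(), running.splitlines(),
        fromfile="golden", tofile="running", lineterm="", n=0,
    )
    body = list(diff)[2:]   # drop the '---' and '+++' file header lines
    # Keep removed (-) and added (+) lines; hunk markers start with '@@'.
    return [line for line in body if line.startswith(("+", "-"))]


# Made-up configuration text using documentation addresses.
golden = "hostname switch\nntp server 192.0.2.1\nlogging host 192.0.2.2"
running = "hostname switch\nntp server 192.0.2.9\nlogging host 192.0.2.2"

print(config_drift(golden, running))
# ['-ntp server 192.0.2.1', '+ntp server 192.0.2.9']
```

An empty result means the device matches the golden configuration; any other result lists exactly what an engineer changed without updating the baseline.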

Enabled a greater focus on innovation 

Automation now allows Southwest’s engineers to focus on larger, more complex projects. Engineers are now exploring using automation for time-consuming projects, such as cleaning up Southwest’s routing environment and opening up connections with peer partners. 

Ansible Automation Platform also gives network engineers more time to innovate, including looking at tools for an automation pipeline.

“We’re looking at a network analysis tool called Batfish to help us understand if a change is going to cause a problem,” said Tapia. “We’re also exploring Molecule’s potential for testing Playbook roles.”

Expanding success with automation across IT

Southwest is looking to expand on the success it has seen using Ansible Automation Platform for network automation. The network team is looking to use Ansible Automation Platform to automate firewalls and a new software-defined WAN, among other things.

The end goal for Southwest is to implement an Infrastructure as Code (IaC) model, with GitLab serving as the source of truth. Engineers would make changes once in GitLab. Listeners would check where updates are needed, schedule the changes, and then implement and validate them within the maintenance window. 
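Two pieces of that workflow, finding devices that lag behind the source of truth and checking the maintenance window, can be sketched as follows. Everything here is an assumption for illustration: the device names, the version strings, the window times, and both function names.

```python
from datetime import datetime, time


def in_window(now: datetime, start: time, end: time) -> bool:
    """True if `now` falls inside a window that may cross midnight."""
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def pending_devices(desired: dict[str, str], deployed: dict[str, str]) -> list[str]:
    """Devices whose deployed config version differs from the one in Git."""
    return sorted(d for d, v in desired.items() if deployed.get(d) != v)


# Hypothetical state: config versions in Git versus what each device runs.
desired = {"sw1": "abc123", "sw2": "abc123"}
deployed = {"sw1": "abc123", "sw2": "old999"}

print(pending_devices(desired, deployed))                                # ['sw2']
print(in_window(datetime(2024, 1, 1, 1, 30), time(23, 0), time(4, 0)))   # True
print(in_window(datetime(2024, 1, 1, 12, 0), time(23, 0), time(4, 0)))   # False
```

A listener built along these lines would apply changes to the pending devices only while `in_window` returns true, then validate the result before the window closes.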

Tapia would like to see this wholly ‘hands-off’ approach working across all network devices, platforms, and environments. And with network engineers keen to use more automation, Red Hat Ansible Lightspeed may help them get started. This solution is a generative AI service engineered to help automation teams create, adopt, and maintain Ansible content more efficiently. Connected to IBM watsonx Code Assistant, Red Hat Ansible Lightspeed helps Ansible creators turn their automation ideas into content based on natural language prompts.

“We want to get more of our engineers into the automation mindset,” said Tapia. “Ansible Lightspeed can help them understand how to structure a Playbook. I’m also planning to use it to onboard network devices from a new vendor or if we expand into the cloud. It would give me a framework for the new Playbooks.”

The broader IT department has shown great interest in what the network automation team has achieved. Tapia describes how one operations engineer used the Playbook to upgrade 30 switches in just 27 minutes. Without automation, upgrading each switch would typically take 4 to 6 engineers around 30 minutes. This engineer now wants to use automation for all upgrades and is looking at use cases in other areas of infrastructure to expand Southwest’s Ansible Automation Platform footprint.

About Southwest Airlines 

Southwest Airlines Co. is a major American airline that operates on a low-cost carrier model. Headquartered in Dallas, Texas, it has scheduled services to 120 destinations in the United States and 10 additional countries.

About Red Hat Innovators in the Open

Innovation is the core of open source. Red Hat customers use open source technologies to change not only their own organizations, but also entire industries and markets. Red Hat Innovators in the Open proudly showcases how our customers use enterprise open source solutions to solve their toughest business challenges.