Transforming Your Timelines

About the episode

"Modernizing your IT systems is a worthy achievement. You can reap the benefits of digital transformation at last. But don’t be lulled into a false sense of security—digital transformation isn’t about reaching the finish line. It’s about reaching for the horizon.

Randall Núñez of Experian explains how perpetual improvement is a key element of digital transformation, because systems can’t stay modern forever."

About the guests

Randall Núñez

Linux Admin Automation Engineer Expert
Experian

Transcript

00:02 — Jamie Parker
Change is hard, transformative change even more so. Digital transformation is a process many go through out of necessity. You're likely to have difficult technical problems. And because of the power of metaphorical inertia, people are going to keep doing what they've been doing, which means you're likely to have difficult cultural problems too. The term digital transformation is a little misleading. Transformation can imply that, like with most projects, there's a beginning, a middle, and an end. But the reality is far more complicated. In nature, the caterpillar becomes the butterfly. Randall Núñez, automation engineer at Experian, explains why your IT teams may need to cozy up in the chrysalis again and again and again.

00:57 — Jamie Parker
This season of Code Comments, we're covering different aspects of digital transformation.

01:07 — Jamie Parker
There's a tendency to frame it as a journey. You start from somewhere, follow your path, and end up in a new place with the tools to participate in the modern economy. It's not wrong, but the reality may not be what comes to mind first. Randall has some experience with digital transformation projects. To him, the term implies a perpetual process of improvement.

01:33 — Randall Núñez
It's about integrating digital technology into all areas of a business, fundamentally changing how you operate and how you deliver value to customers. I would say also, it's a continuous process of improvement and adaptation to new technologies and methods, rather than just a one-time project. It's never completely finished. It's an ongoing journey because, as the technology evolves, it's a must for our systems and practices to stay competitive, efficient, and secure, of course.

02:07 — Jamie Parker
Those first projects to take you from legacy hardware and development process to more modern ones, think of those more like the first legs of your trip. That's the most tangible outcome of the process, new infrastructure and new ways of working. But you don't stop there, or at least you shouldn't. We frequently make hay about how quickly the tech industry changes. That's the first clue about why you need to keep moving. When Randall joined Experian, they already had some modern tools and workflows, and everything was working just fine. Their automation platform was stable and had been working for years, but they needed to change again.

02:46 — Randall Núñez
It almost never went down. So I think the company almost got used to it, until we figured out that we actually have to move to the new version, even though the current one is completely stable. It was stable, but we were behind on the Ansible Core versions, the Python versions. Obviously, that could be a security constraint, so we definitely had to take the decision and just move to the new technology.

03:15 — Jamie Parker
So they had to upgrade from one version of software to another. Is that such a big deal? They've gone through digital transformation before. Would it be any more difficult to do it again? The answer is: it depends. Sometimes, upgrades are as easy as the push of a button. In this case, as with many others, it was much more complicated than that. The version of Ansible they were moving to had different dependencies it relied on that they would also have to upgrade. They had multiple moving parts to contend with. Throw in a database schema incompatibility, and that turned the project's complexity all the way up to 11.

03:53 — Randall Núñez
So basically, as a summary, the steps in order to upgrade from Tower to AAP are obtaining a backup from the current Tower instance. Then we have to create an internal test environment and install Tower, that same version that we have in production at that time. Then we have to restore the backup from our production environment into the internal test environment.

04:19 — Jamie Parker
Do you remember those stacking tower puzzles where you need to move rings one at a time from one peg to another in a certain order to solve the puzzle? It was kind of like that. They couldn't lift the whole system from one environment to another, which can be difficult enough on its own. What they ended up doing is more complicated than we have time for, but here's the gist. They had to move everything to a staging environment that mirrored the original. They performed the upgrade, which then became a sort of hybrid environment with older dependencies and the upgraded version of Ansible. Then, they had to provision another environment that looked like the final production environment with the updated dependencies and a fresh install of Ansible. Bear with me, we're almost done rearranging the rings. They backed up the hybrid environment in case of any issues, and they migrated everything to the final environment. To make that possible, however, they also had to deal with the actual schema incompatibility.
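As a rough illustration of why the order mattered, the sequence described above can be modeled as a checklist that refuses to run out of order. This is only a sketch: the step labels paraphrase the episode, and `MIGRATION_STEPS` and `next_step` are hypothetical names, not part of Ansible or any Experian tooling.

```python
# Hypothetical model of the migration order described in the episode.
# The labels are illustrative descriptions, not commands from a real tool.
MIGRATION_STEPS = [
    "back up the production Tower instance",
    "install the same Tower version in a staging environment",
    "restore the production backup into staging",
    "upgrade staging in place (the hybrid environment)",
    "provision the final environment with updated dependencies",
    "back up the hybrid environment",
    "migrate from the hybrid environment to the final environment",
]


def next_step(completed):
    """Return the next step to run, or None when the migration is done.

    Raises ValueError if the completed steps are not an in-order prefix
    of MIGRATION_STEPS, which mirrors the "one ring at a time" constraint.
    """
    if completed != MIGRATION_STEPS[:len(completed)]:
        raise ValueError("steps were completed out of order")
    if len(completed) == len(MIGRATION_STEPS):
        return None  # nothing left to do
    return MIGRATION_STEPS[len(completed)]
```

Calling `next_step([])` returns the first backup step, and passing a history that skips a step raises an error instead of letting the migration continue.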

05:18 — Randall Núñez
The problem with that is that we had two different schemas, what then was a public schema. So that gave us a lot of issues. They have to modify them, but they have to investigate how it works first. I think they even have to engage the developers in order to find out how to properly make it compatible. So yeah, that was the whole process. It took some time, but in the end, they finally did it. So we were able to restore the backup and we were able to move all the objects to the newer production environment.

05:55 — Jamie Parker
Victory! Everything arrived stacked up in the right order. What differentiates this example from others we've covered all season? Not much, really, other than the fact that it happened while migrating from a not ancient system to a new one. But that's the point. Even if you finish some digital transformation projects and have, relatively speaking, modern systems, you still need to plan for upgrades. And you might need creative solutions to complex problems. Randall shared how he and his team felt about the ordeal.

06:32 — Randall Núñez
To be completely honest, a little frustrated because that was going to delay the process. I consider myself a geeky person, so I was really, really happy about the new platform at the moment. I really wanted to see it working, and it was just not working. We tried many things and nothing was working.

06:54 — Jamie Parker
Technical issues happen. They're frustrating, but they're a part of the process, sometimes a small part and sometimes a bigger one. This technical issue caused delays, and Randall wasn't the only one who was frustrated.

07:07 — Randall Núñez
So the business units that were more involved, specifically one, it was I think the third organization that had more templates in the environment. They were constantly asking us, "Okay, when is it going to be done? We are working on the code improvements. Can we move it now? I still don't see it up and running."

07:27 — Jamie Parker
A technical issue is rarely just a technical issue in isolation from everything else, and you should be prepared to handle the secondary issues that crop up.

07:37 — Randall Núñez
We have constant communication with all the stakeholders that help us, the engineers, the architects, in order to adjust the timelines and expectations accordingly. And we emphasize documentation, frequent updates, which kept the project transparent and allowed for quicker decision-making.

08:02 — Jamie Parker
Sharing updates, being transparent, and communicating well are all ideal elements of a healthy organization. They'll help keep things measured and smooth in the face of tough issues and delays. So digital transformation can look like a multi-leg, never-ending journey rather than a marathon to the finish line. You'll need to incorporate changes in technology in your modern system so that it stays modern. When we come back, we'll tackle some of the ongoing cultural challenges that you might face in the long run.

08:40 — Jamie Parker
You can convince most of your teams to learn the new tools and processes digital transformation requires. Even so, that level of coordination can be difficult to pull off.

08:57 — Randall Núñez
Coordinating multiple teams across different functions was also challenging. We have several business units that use our environment, but ensuring everyone understood the changes, updating their systems and processes in sync, and managing dependencies was crucial for a smooth transition.

09:22 — Jamie Parker
Not everyone uses technology in the same way, so teams may be affected by change to varying degrees. There are steps you can take to increase the overall chance of success with thorough documentation and more.

09:35 — Randall Núñez
That's why we documented pretty much everything. We even created a frequently asked questions in Confluence. We created information about how to use Ansible Lint for their code updates. We created information about how to use the execution environment in this new environment, how to use the collections, which is a different process than we had before.

09:57 — Jamie Parker
Even after having undergone digital transformation projects, Experian is making changes to their training processes in response to what worked and what may be needed. That process includes making sure your teams have a lot to learn from and providing opportunities for them to practice, ask questions, and even share what they use with other teams.

10:20 — Randall Núñez
As I mentioned, we have several business units that used the platform. Some of them were extremely engaged with it. They were asking many questions. They were actually doing a lot of things. For example, there was a particular business unit that they even created a Docker container image with Ansible Lint so other business units could use it. So it'll be easier for everybody. So they shared that.

10:43 — Jamie Parker
That's the dream, teams being proactive and ready for change, and also helping other teams get to that point as well. But some want to stay right where they are, and may resist moving away from what they know. You can lead a horse to water, but you can't make it drink. There's always a chance you won't reach every team, whether it's because of hubris, misunderstanding, or stubbornness.

11:11 — Randall Núñez
I guess they were too comfortable with the current state of the platform. Probably they didn't investigate that it was going to be a big change, even though we informed everybody. Maybe they just thought that the new platforms, that you have to deploy the code in the new platform and it was going to work just like that, which is not the case.

11:31 — Jamie Parker
There are times when code works across multiple versions of a tool. Assuming that's the case is taking a risk. Thinking that's the case, despite being told the opposite, might have something to do with cotton in your ears. And if the teams aren't ready when the new systems are set to launch, that's a problem.

11:51 — Randall Núñez
It requires some changes. And actually, we had some issues at the beginning because some business units had very old code, which was compatible with Ansible Core 2.9. So it was a challenge for some business units to update their code to be more secure, reliable, and compatible with a new version, in this case, Ansible Core 2.15. Actually, we had to deploy a workaround for that in the new environment.

12:19 — Jamie Parker
A couple of teams hadn't updated their code in time, and that meant their parts of Experian's systems would stop working when the painstakingly orchestrated update rolled out. Nobody wanted to suffer from an outage, especially when it could be prevented. Randall's team set up a workaround to accommodate the teams. They created an additional execution environment running the old version of the system.
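The workaround can be pictured as a routing decision: code validated against the current Ansible Core runs in the default execution environment, and older code is sent to a temporary legacy one along with a warning. The sketch below is an assumption-laden illustration. The environment names, the `ExecutionEnvironment` class, and `pick_environment` are hypothetical, and only the 2.9 and 2.15 version numbers come from the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionEnvironment:
    """Hypothetical description of an execution environment."""
    name: str
    core_version: tuple  # (major, minor) of Ansible Core
    deprecated: bool = False


# Names are made up for illustration; versions are the ones mentioned in the episode.
DEFAULT_EE = ExecutionEnvironment("ee-current", (2, 15))
LEGACY_EE = ExecutionEnvironment("ee-legacy", (2, 9), deprecated=True)


def pick_environment(validated_core_version):
    """Return (environment, warning) for code validated on the given version.

    Simplification: anything older than the current core goes to the
    legacy environment, and the caller receives a decommission warning.
    """
    if validated_core_version >= DEFAULT_EE.core_version:
        return DEFAULT_EE, None
    warning = (
        f"{LEGACY_EE.name} is temporary and will be decommissioned; "
        "update this code for the current Ansible Core"
    )
    return LEGACY_EE, warning
```

Versions are compared as tuples rather than strings because `"2.9" < "2.15"` is false in a plain string comparison, while `(2, 9) < (2, 15)` behaves as intended.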

12:47 — Randall Núñez
In the state of the art of the platform, we don't want business units to be using outdated code, which could be potentially insecure and can impact our business. So we definitely don't want that. But at the same time, we don't want to stop operations because the business continues. So that's the solution we gave them, but with a warning, which includes "we will give you this, but it will be decommissioned in some time." And we are planning that right now.

13:18 — Jamie Parker
So they got some respite, but with the understanding that they had to catch up with the rest of the company before the concession was taken offline. Because keeping them running means keeping the same vulnerabilities open for longer, spending additional resources to keep them running, and complicating the overall system beyond what is necessary. It's important to identify the issues that could lead to such a situation. Randall explains some of the possible reasons teams fell behind, despite the wealth of information that was available.

13:54 — Randall Núñez
Some teams struggled due to outdated practices that have become ingrained over time. So introducing new tools and processes requires significant re-education and support. But that's why we, alongside Red Hat, we planned some workshops. Not everybody joined the workshops, but a lot of people did.

14:17 — Jamie Parker
Make learning opportunities available to the teams who will be affected, emphasize the importance of attendance, and make clear the effects of missing out. Sometimes, the problems aren't about technical mismatches and not being ready in time. Sometimes, they're about baffling misuse of resources.

14:35 — Randall Núñez
So we previously had two environments in the Ansible Tower. We have a production environment and a staging environment. The staging was also available for the business units. But unfortunately, a few business units were using it as a production environment.

14:56 — Jamie Parker
The code worked! Their system was running fine, but it wasn't right. Staging environments are meant for testing systems before they go live to production, not a place from which to run their code. The team should have known that.

15:12 — Randall Núñez
I know it's hard to believe, but I believe that happened because they were creating and testing new code. And probably, once it worked, they decided to leave it there and didn't bother to move it to production since it was working in staging. So we decided for the new platform that we will just make the production environment public to them and having two organizations created for each business unit, which will be identified as production and the other one identified as staging. And we will be keeping our staging environment private just for the admins to test whatever is needed, like upgrades or stuff like that. That way, we will avoid the units to use a non-production environment as a production environment.

15:57 — Jamie Parker
And of course, using a staging environment as production caused problems.

16:04 — Randall Núñez
Actually, a couple of those business units had issues with the networking side of things because we have two CIDR blocks, one for production and one for staging. And when they moved to production to the new environment, they discovered that they no longer had access from Ansible to the target machines that they were configuring tasks or running jobs against.
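The networking problem is easy to reproduce on paper. If a target machine only accepts connections from the staging network block, a job launched from a production address is turned away. The following sketch uses Python's standard `ipaddress` module; the address ranges and the `is_allowed` helper are assumptions for illustration, since the episode does not give the real blocks.

```python
import ipaddress

# Assumed example ranges; the real production and staging blocks are not
# stated in the episode.
PRODUCTION_BLOCK = ipaddress.ip_network("10.10.0.0/16")
STAGING_BLOCK = ipaddress.ip_network("10.20.0.0/16")


def is_allowed(source_ip, allowed_blocks):
    """Return True if source_ip falls inside any of the allowed blocks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in block for block in allowed_blocks)


# A target that was only ever opened up to the staging block.
target_allow_list = [STAGING_BLOCK]

# Hypothetical controller addresses in each environment.
staging_controller = "10.20.0.5"
production_controller = "10.10.0.5"
```

With these assumed values, `is_allowed(staging_controller, target_allow_list)` is true and `is_allowed(production_controller, target_allow_list)` is false, which matches what the business units ran into after moving their jobs to production.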

16:27 — Jamie Parker
Some people aren't ready in time. Some people perhaps didn't pay as close attention to concepts as they should have. In the end, the cultural problems lead to technical issues and workarounds. That's why getting the cultural side of digital transformation right is so important. But it can be hard to do.

16:48 — Randall Núñez
Because changing people's habits and mindsets requires persistent effort and engagement, which is more complex than just solving technical issues that have more straightforward solutions.

17:01 — Jamie Parker
You can see the errors in the code and find solutions, even if they're complex puzzles that need tricky shuffling of resources. But people problems are a different beast entirely, and can be hard to predict or even identify. Even so, the work of digital transformation must go on because even as you get a new system up and running, that doesn't mean you've set it up to make full use of its features.

17:33 — Randall Núñez
I believe we're kind of moving to the next scene. The environment is working fine. So now that it is stabilizing, we are adding work to our backlog. For example, we want to leverage event-driven automation, which is a great feature. We didn't install that at the beginning since we were familiarizing with the platform and trying to get it stabilized. We are also in the process of starting to create CI/CD pipelines to streamline the execution environments. So in general, there's always space for improvement. Yeah, so it will definitely never be finished, which is okay in the end because that's part of the digital transformation.

18:13 — Jamie Parker
The work is never done. Updates, upgrades, and extensions continue until you have to make another big move and the whole process starts again. So when we talk about digital transformation, we're talking about a change from one state to another, old IT systems to modern ones, and a change in working culture to meet the potential of those modern systems. But it also means transforming your organization's approach to change by accepting it as a constant feature. And that means being ready to face and address problems over the long run, whether they're technical or cultural.

18:57 — Jamie Parker
You can learn more at redhat.com/codecommentspodcast, or visit redhat.com to find our guides to digital transformation. Many thanks to Randall Núñez for being our guest. Thank you for joining us. This episode was produced by Johan Philippine, Kim Huang, Caroline Creaghead, and Brent Simoneaux. Our audio engineer is Kristie Chan. The audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Nick Burns, Aaron Williamson, Karen King, Jared Oates, Rachel Ertel, Carrie De Silva, Mira Cyril, Ocean Matthews, Paige Stroud, Alex Traboulsi, Boo Boo Howse, and Victoria Lawton. I'm Jamie Parker, and this has been Code Comments, an original podcast from Red Hat.

“It's a continuous process of improvement and adaptation to new technologies and methods, rather than just a one-time project.

It's never completely finished. It's an ongoing journey, because as the technology evolves, it's a must for our systems and practices to stay competitive, efficient and secure.”

Randall Núñez
