The Data Lab


Reflections on 2025: A Data Scientist Dives into Mainframe (and lives to tell the tale)

Guest Blogs, Partnerships 05/12/2025

In this guest blog, we’re joined by Sarah Burne James, Client Adoption Specialist (Z Software) at IBM, who brings a fresh, fun perspective to the world of mainframes—complete with misconceptions, “aha!” moments, and a lot of cool tech.

*record scratch* *freeze frame* Yep, that’s me. You’re probably wondering how I got here.

You know that film and TV cliché (and, latterly, meme) in which a character begins to explain how they got into an improbable situation? Well, that’s how I’ve felt at times during 2025.

At the start of 2025 I was happily working in IBM’s Client Engineering department, building pilots to get clients started on our coolest new technologies, with a focus on data and AI.

Having moved into a new position over the summer, I end the year in a very different role. Still happy, but now working with AI assistants for mainframe, which help other mainframe newbies get to grips with the technology faster, and automate things to help more experienced folks.

It’s certainly been a learning curve – this time last year, mainframe conjured up the image of a huge, old-fashioned computer filling an entire room, or something out of a spy thriller. So much so that I felt a genuine sense of achievement when I sat through a meeting and understood all the acronyms. I also learnt that mainframes are not quite that big these days, perhaps the size of a large fridge.

Pause: Sorry, what the heck is a mainframe?

Mainframes are used by 43 of the world’s top 50 banks, and eight of the top 10 payment companies use mainframes as their core platform. They can support thousands of applications and input/output devices, which means they can serve thousands of users at once. IDC estimates that mainframes support more than 95% of all non-cash transactions worldwide. A mainframe is essentially a very reliable, fast and secure server: IBM released the latest mainframe model (z17) in June this year.

It is probably more useful to think of the mainframe as a style of computing. To massively simplify: mainframers often refer to the non-mainframe world as “distributed”, whereas mainframe computing is “centralised”.

This means that in mainframe world, businesses can do a lot of what they need to do on a single machine (or small number of machines). In distributed-world they might instead be spreading workloads across many machines, whether in their own data centres or on the cloud. If you want to run high volumes of transactions, really fast, reliably and securely, it makes sense to do this on a mainframe. For a start, you don’t have to deal with the latency and additional attack surfaces introduced by extensive networking between machines in different locations.
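The latency point can be made concrete with a toy budget calculation. All the numbers below are invented for illustration, not measurements:

```python
# Toy latency budget: suppose a transaction must complete within 50 ms.
# Assumed, illustrative per-call costs (not measured figures):
NETWORK_HOP_MS = 5.0   # round trip between machines in different locations
LOCAL_CALL_MS = 0.5    # call between components on the same machine

def budget_left(budget_ms: float, n_calls: int, cost_per_call_ms: float) -> float:
    """Time remaining after n_calls service calls at the given per-call cost."""
    return budget_ms - n_calls * cost_per_call_ms

# Eight service calls per transaction, distributed vs. centralised:
print(budget_left(50, 8, NETWORK_HOP_MS))  # 10.0 -- most of the budget gone
print(budget_left(50, 8, LOCAL_CALL_MS))   # 46.0 -- plenty of headroom
```

The exact figures don’t matter; the point is that every hop between machines eats into a fixed transaction budget, and keeping the calls on one box leaves that budget for real work.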

In reality, businesses today that use mainframes do a bit of both. For example, banks tend to run core transaction processing on mainframes, but when you log into internet banking, you might be seeing a copy of your account data, hosted in a cloud service.

What does all of this have to do with data?

As data scientists, we are often spoilt with abstractions.

We use tools that mask the complexity of the platforms that train our models, or run our analyses. It can feel like the underlying platform doesn’t matter, until it does. (When I was doing my MSc dissertation, I did try to train a small neural network on my laptop, because it seemed easier than working out how to get access to a university GPU. Four days later, when it was still training, I had some regrets.)

There are some things which are simply not feasible to run on distributed systems (and definitely not on my laptop).

Take fraud detection as an example. Traditionally, to check if a transaction is fraudulent, banks had to send transaction data off-mainframe to a fraud detection model. In a world where transaction times are measured in milliseconds, this cannot be done for every transaction; so instead, they would take a sample of transactions to test, and allow the others through unchecked.

As mainframes get more and more AI-related features, you can now bring the model to where the transaction is already happening. It suddenly becomes much more feasible to run every transaction through the fraud detection model, ultimately protecting more customers from fraud.
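Here is a toy sketch of the two approaches. The `score` rule, the sampling rate and the transaction data are all made up for illustration; a real fraud model is of course far more sophisticated:

```python
import random

def score(transaction):
    """Stand-in for a real fraud model: naively flag large amounts."""
    return transaction["amount"] > 9000

def check_sampled(transactions, sample_rate=0.1, seed=42):
    """Off-platform checking: only a sample is scored; the rest pass unchecked."""
    rng = random.Random(seed)
    return [score(t) if rng.random() < sample_rate else False
            for t in transactions]

def check_all(transactions):
    """In-transaction checking: every single transaction is scored."""
    return [score(t) for t in transactions]

transactions = [{"amount": a} for a in (50, 12000, 300, 9500)]
print(check_all(transactions))      # every suspicious transaction is flagged
print(check_sampled(transactions))  # with sampling, fraud can slip through unscored
```

With full scoring, both suspicious transactions are caught; with sampling, anything that happens not to be sampled passes straight through.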

But mainframes aren’t just used by financial organisations: over 70% of the Fortune 500 use them. So, if you are a data scientist working for a large organisation, your employer might have a mainframe without you realising.

Since virtualisation was literally invented for the mainframe, you might be able to get an environment to run some of your chunky workloads. You don’t have to be where the mainframe is. You don’t have to use COBOL, or a green screen; there are lots of modern tools for interacting with mainframes, and you can run programs in modern languages, like Python. I have run Jupyter Notebooks on the mainframe, and they looked like any other Jupyter Notebook, just faster.
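One small, concrete illustration of that portability (a sketch I’ve written for this post, not IBM tooling): Python’s standard `platform` module reports the CPU architecture it is running on, which on Linux on IBM Z is `s390x`; the Python code itself is identical everywhere.

```python
import platform

def describe_host() -> str:
    """Report the CPU architecture the interpreter is running on.

    On Linux on IBM Z, platform.machine() returns "s390x"; on a typical
    laptop it returns something like "x86_64" or "arm64". Either way,
    this function (and your notebook) runs unchanged.
    """
    arch = platform.machine()
    return f"Running on {arch}" + (" (IBM Z)" if arch == "s390x" else "")

print(describe_host())
```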

What’s next?

I hope you can see why – after a bit of research – I jumped at the chance to work in this area, and why mainframes are relevant to data folk. Honestly, I just think mainframes are super cool.

In 2026, I’ll keep doing my best to keep up with whichever technologies come my way, new trends in AI, and other things that are well out of my comfort zone as a data scientist (LPARs, IPLs, anyone?). In the world of mainframe, I am very much still a newbie. I’ve heard of some mainframers with over 50 years of experience!

Will I still be a data scientist this time next year? I don’t know, but I will definitely be surrounded by cool tech and cool data.

Want to chat mainframes and pushing yourself out of your comfort zone as a data scientist? Find me on LinkedIn.

The Data Lab is part of the University of Edinburgh, a charitable body registered in Scotland with registration number SC005336.
