Banks & Big Data: How to Create Value
Zions Bancorp Uses Data for Fraud Detection
At Salt Lake City-based Zions Bancorp., big data is more about anticipating what might be needed in the future than about getting hung up on how much value the data offers today.
As Aaron Caldiero, Zions' senior data scientist, points out, when it comes to fraud detection and enhanced account and transaction monitoring, an institution never knows what value data might hold at some point down the road. That's why all banking institutions should collect as much data as possible, even if they don't exactly know what to do with it just yet, he says.
"That's the beauty of big data: to be able to bring together all the data that you ever could possibly need and never have to worry about structuring that data before you load it," Caldiero says in an interview with BankInfoSecurity's Tracy Kitten (transcript below).
From a fraud-detection and risk-mitigation perspective, big data is critical, or at least it has been for Zions, a $54.6 billion bank.
"For us, it improves security, because we can see different channels of fraud and different channels of access," Caldiero says. "Being able to tie all those different channels together to create a more holistic picture of where fraud can and does happen, and then be able to build predictive models based on that information and being able to centralize all of that, has been really impactful."
Leveraging big data has helped Zions stop fraud and thwart losses, he says.
But what's the one mistake many institutions make? They gather and hold the data but fail to bring in the right people to analyze it, Caldiero says.
"It is essential to be able to bring the right team together, to be able to have those people mesh together and to be able to interact with the business and bring those business needs to big data," he says. "Having the right people to be able to do that has been essential for us."
During this interview, Caldiero also discusses:
- Why big data is critical for enhancing cross-channel fraud detection and risk scoring;
- How Zions is working with other banking institutions to help get better handles on big data; and
- How banking institutions can ensure they're hiring the right people and teams for big data management.
Caldiero performs data mining, statistical modeling and analytics on Zions' Security Data Warehouse, known as the Hadoop cluster. Using data science tools, Caldiero builds, implements and maintains risk models for fraud and malware detection. He has nearly a decade of experience in the analytics and financial-services industries.
Big Data Consulting
TRACY KITTEN: Big data, which only recently has become a household term, has been a focus at Zions for the last four to five years. What can you tell us about the consulting that Zions offers other institutions and industry agencies about how financial services can and should take advantage of big data?
AARON CALDIERO: When we consult with outside financial institutions, we talk to them about what data sources they would like to bring together, and also once they do bring those things together, what kind of things can they do with it. For instance, most of the time, we recommend that they bring as much data as possible together. That's the beauty of big data, to be able to bring together all the data that you ever could possibly need and never have to worry about structuring that data before you load it.
Defining Big Data
KITTEN: How does Zions define big data?
CALDIERO: We define big data in a couple of different ways. The first way is just by the tools that we use - Hadoop and some of the large, unstructured data formats - and so the tools kind of define what big data is for us. Also, big data has been around for a long time, but people just didn't have a name for it; so it's mostly just being able to house large amounts of data that you weren't able to house before.
Addressing Big Data
KITTEN: How does Zions address big data internally versus what it recommends for outside organizations?
CALDIERO: Internally, because we've built a really strong infrastructure to house a lot of big data, we're able to capture anything that we could possibly want, and things that we don't know we want yet. We can capture a whole bunch of text feeds and just store those off into Hadoop, and later on down the line we can extract things that we need, and then if we decide that we also need "A, B, C, D," we can go back to that unstructured data and it's there already. ... We already have it loaded and ready to go.
KITTEN: Then what about recommendations for outside organizations? I guess it just depends on how much data or how accessible that data is?
CALDIERO: Right. If they do have large amounts of data and they do have the infrastructure to handle it, we would definitely suggest for them to load as much as you possibly can. Load everything and then, after the fact, worry about extracting the data that you need and structuring the data the way that you need.
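The "load everything, structure later" approach Caldiero describes is often called schema-on-read. The sketch below is only an illustration of that idea, not Zions' implementation: the in-memory `raw_store` list stands in for a raw landing area on a cluster, and the record fields (`account`, `amount`, `ip`, `channel`) and the `load_raw` and `extract` helpers are hypothetical names chosen for the example.

```python
import json

# Stand-in for a raw landing area (for example, files on a cluster).
# Lines are kept exactly as received; nothing is parsed at load time.
raw_store = []


def load_raw(line):
    """Store a line as-is, without validating or structuring it."""
    raw_store.append(line)


def extract(fields):
    """Apply structure at read time and return only the requested fields."""
    rows = []
    for line in raw_store:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # The line stays in the store; it is only skipped for this query.
            continue
        if not isinstance(record, dict):
            continue
        rows.append({field: record.get(field) for field in fields})
    return rows


# Hypothetical feed lines with differing shapes.
load_raw('{"account": "A1", "amount": 250.0, "ip": "10.0.0.5"}')
load_raw('{"account": "B2", "amount": 75.5, "channel": "mobile"}')
load_raw('free-text log line with no structure yet')

# First need: amounts per account.
first_pass = extract(["account", "amount"])

# Later need ("A, B, C, D"): fields nobody asked for at load time.
second_pass = extract(["account", "ip", "channel"])
```

Because the raw lines are never discarded or reshaped on the way in, the second query can ask for fields that were not anticipated when the data was first loaded.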
KITTEN: When it comes to big data, a lot of organizations struggle with just locating the data. What do you see as being the greatest challenge for financial institutions and organizations generally when it comes to big data?
CALDIERO: I know that we face, internally, being able to interact with the different business units to be able to find out what their data sources are so that we can integrate all of that data into our centralized location. Being able to interact with the business units has been really essential for us, being able to go to the different business units and say, "What are the data pieces that you're missing? What are the data pieces that you have already? Where are you getting your things from? Is it from the mainframe? Is it from an Excel spreadsheet, all sorts of disparate data sources?" I think being able to bring all of those together has been really essential for us.
KITTEN: One of the things that kind of jumps out at me is the fact that organizations not knowing where data is located probably means that they're not adequately protecting that data. But big data can actually help an organization improve its overall security posture. How does that happen?
CALDIERO: Yes, it can definitely improve security, and for us it improves security because we can see different channels of fraud and different channels of access. And being able to tie all those different channels together to create a more holistic picture of where fraud can and does happen, and be able to build predictive models based on that, and being able to centralize all of that has been really largely impactful - being proactive about stopping fraud before it happens.
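As a rough illustration of tying channels together, the sketch below groups events by account and derives one cross-channel feature: the largest number of distinct channels an account touched within a short time window. Everything here is an assumption made for the example, including the event fields, the channel names, the 10-minute window and the `max_channels_in_window` helper; the interview does not describe the features or models Zions actually uses.

```python
from collections import defaultdict

# Hypothetical events already centralized from several channels.
events = [
    {"account": "A1", "channel": "online", "minute": 0},
    {"account": "A1", "channel": "atm", "minute": 4},
    {"account": "A1", "channel": "call_center", "minute": 7},
    {"account": "B2", "channel": "online", "minute": 3},
    {"account": "B2", "channel": "online", "minute": 90},
]


def max_channels_in_window(events, window_minutes):
    """Per account, the most distinct channels seen within any one window."""
    by_account = defaultdict(list)
    for event in events:
        by_account[event["account"]].append(event)

    result = {}
    for account, account_events in by_account.items():
        account_events.sort(key=lambda e: e["minute"])
        best = 0
        # Treat each event as a window start and count the distinct
        # channels among events that fall inside that window.
        for i, start in enumerate(account_events):
            seen = {
                e["channel"]
                for e in account_events[i:]
                if e["minute"] - start["minute"] <= window_minutes
            }
            best = max(best, len(seen))
        result[account] = best
    return result


features = max_channels_in_window(events, window_minutes=10)
```

A value like this could serve as one input among many to a predictive model. It is only computable once the channel data sits in one place, which is the point Caldiero makes about centralization.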
KITTEN: Some of the organizations that Zions works with, when it comes to helping them understand big data, what seems to be the biggest concern or perhaps the biggest challenge that a lot of those institutions face?
CALDIERO: There are two different camps there. A lot of people either focus on what tools [they] need, and other people focus on what kind of people do [they] need and what kind of skills those people have to have. And we try to encourage people to focus more on the people and the skills that those people will need, rather than just the tools because the tools will change and adapt over time, but having the right people that have the skills to be able to adapt to all the new tools that are coming out in big data has been essential for us.
Finding the Right People
KITTEN: I wanted to ask about the fact that so many organizations and institutions do struggle with having the right people and the right knowledge base to interpret and analyze big data. Oftentimes it seems that organizations rely on their IT staffs, but you've noted before that that's not really recommended. It's not the best approach. How can institutions address those concerns by ensuring that they're hiring the right types of people?
CALDIERO: It's very difficult to find the right people. We've tried to look for those hybrid types where it's somebody that has some IT experience, programming experience, math experience, statistical experience and some domain knowledge, and trying to bring all of those things together into one person is very difficult. So we've found a couple of those unicorns, the mythical beasts, and then we've also found a few different people who have those individual skills that we're able to mesh together as a team.
KITTEN: When it comes to having so many different people involved in the process, I'm sure this raises some sort of data access concern. How can institutions and organizations generally ensure they're providing data to the right individuals without exposing themselves to risk by making too much data too accessible?
CALDIERO: With us, the way that we've handled that so far is a lot of the big data tools right now don't have a lot of security baked in, so we've added a lot of features on top of Hadoop and MongoDB and some of those other technologies to kind of limit what people have access to. Also, most of this stuff that we send outside of our group has already been filtered down to just show what we want people to see. We kind of keep everything in house for right now and then it puts a lot more stress on our team, because then we have to be able to adapt to all the different requests for data and we're still kind of working out the logistics behind all that.
KITTEN: I wanted to ask about the platforms that Zions and other institutions build to help them understand big data across these different channels. How can an integrated view of big data improve fraud detection across those channels, and how do you build those types of platforms?
CALDIERO: It helps to detect fraud, because like I said before, you're able to integrate more pieces of data to be more predictive about detecting fraud. And the technologies that we've used are able to handle large amounts of data and also they scale very easily just by adding new hardware nodes, so when you do have a new data source that may overload your systems all you have to do is add a few more servers to your cluster and it scales very easily.
KITTEN: When it comes to relying on vendors or different proprietary solutions, how does Zions address that when it's looking at big data?
CALDIERO: With big data, it's really difficult to find vendors out there that have everything that you need. So we have kind of a hybrid approach of build it in-house and vendor solutions. Mostly, our vendor solutions are just in support for some of the products, but all the stuff that we build on top is all built in-house so that we have adaptability and are able to access the fraud models and be able to adapt those as quickly as possible in real time when we need to.
KITTEN: With some of the other organizations that Zions works with, do you actually outsource any of the platforms that you build to some of those organizations?
CALDIERO: So far we have not. That's something we're definitely looking into. For right now, we're just consulting with a few different organizations through some of the fraud financial groups, but right now we haven't sublet out any of our technology that we've built in-house.
KITTEN: Before we close, I wanted to ask simply, what recommendations you might offer to an institution or an organization that is interested in addressing big data?
CALDIERO: I think the biggest concern would be to find the right people. It's essential to be able to bring the right team together, to be able to have those people mesh together and to be able to interact with the business and bring those business needs to big data to be able to answer the business questions that you wouldn't be able to answer otherwise without big data. Having the right people to be able to do that has been essential for us, and that's definitely what I would recommend to other institutions.
Associate Editor Jeffrey Roman contributed to this piece.