Fraud Detection: Is AI the future of anti-fraud products?

Will AI become the new standard for Fraud Detection solutions? What will happen to rule-based systems? In this talk, Diederik Klopper (Payments Consultant & VIP Podcast Host on PaymentGenes) and João Moura (Fraudio's CEO) discuss how AI is changing the Fraud Detection paradigm, especially for cloud-based Fraud Detection solutions. Enjoy:

Interview Transcript

[00:00:00 - Diederik Klopper (DK)]: Hi everybody, and welcome. My name is Diederik Klopper and I'm the host of the Voices in Payments podcast season two, in which we'll talk all about the ABCD of Payments.

Throughout this season, I will be inviting payment experts to talk about artificial intelligence, blockchain, cloud and data, and how these apply, overlap and drive winning business strategies in present-day fintech.

In season two, I'm joined by directors, VPs and other senior payment experts who oversee these business operations. But even more often, I'm joined by C-Level executives, partners and founders who have leveraged these technologies to generate immense growth.

In this episode, I'm joined by none other than João Moura, the founder, CEO and CTO at Fraudio. With close to 15 years of experience in top academic and industry settings, he has acquired strong theoretical knowledge during his PhD studies in artificial intelligence. Fraudio, which is backed by ING and Viva Wallet, is an Amsterdam-based fintech company fighting complex fraud and money laundering in payments.

It is self-proclaimed as the only SaaS product in the market with top-tier artificial intelligence behind it. We all know that artificial intelligence is rapidly developing, but some are still unaware of the applications of AI, particularly in fintech and more specifically in fighting fraud.

In today's show, João explains how AI can be utilized to combat money laundering, merchant fraud and payment fraud. Again, a very special episode with the founder, talking us through the AI essentials for merchants and fintech. So let's go right into it.

Hi, everybody, and welcome to the PaymentGenes podcast series Voices in Payments. I'm your host, Diederik Klopper, and today I have João Moura. A very warm welcome to the show.

‍

[00:01:52 - João Moura (JM)]: Thank you very much.

‍

[00:01:53 - DK]: João, you're the CEO and co-founder of Fraudio. I always start my podcast the same: nobody woke up one day and said, I'm going to be a payment expert. How did you stumble into payments?

‍

[00:02:05 - JM]: Yeah, it was actually completely by accident, I would say. So I was working as a data scientist for another company here in Amsterdam, and I was basically scouted. So I was scouted to join Payvision, now owned by ING, which I joined as a lead data scientist back then, and I was responsible for products that were AI related.

‍

[00:02:52 - DK]: And from that, from Payvision, you decided to start your own company. What was it in the market that you saw: "okay, I can do this better"?

‍

[00:03:02 - JM]: You are correct. So at Payvision, I was actually responsible for a program that was geared, or aimed, to improve authorization rates. So given failed transactions, we would look at them and we would retry them in a smart way. This led to increased revenue for Payvision's merchants. Everyone was very satisfied and we wanted to roll it out across the whole portfolio. And fraud was a concern. So we wanted to make sure that everything was done in the best possible way for everyone.

And so we decided to develop our own fraud models. And when we did it, we realized that they were actually very good. So when comparing them to vendors that we had access to, we realized that they were very good. And I also had back then, let's say, one of those moments of epiphany. Exactly. And I realized that we could do it in a very different way. And we will get to that. One thing led to the other. Also, with ING acquiring Payvision, Payvision became very focused on bringing on ING's portfolio.

‍

[00:04:40 - DK]: A change in risk appetite as well.

‍

[00:04:42 - JM]: Exactly. And so we decided to spin-off.

‍

[00:04:46 - DK]: You mentioned that you were focused on improving authorization, for example with smart retry. Can you briefly explain how that works? What are some of the parameters that you're working with there?

‍

[00:04:55 - JM]: Yeah, of course. So when a transaction fails, there are what are called "hard declines", of course, insufficient funds or stolen card. So fraud related ones. But there are some "soft declines" where you don't actually know what's going on.

And those are the very dreaded "do not honor" declines, and "do not honor" accounts, we've realized, for about 40% of the declines. You immediately think, is there something that I can do?

And typically the declines come from those transactions triggering rules on the issuer side. And the issuer chooses to reject those transactions for various reasons. Sometimes it's because they don't like 3-D Secure, and a high-amount transaction with 3-D Secure just gets rejected because they don't want to take the liability, or sometimes they don't like the channel in combination with a specific MCC code, things like that.

I cannot go into too many details about how we did it, but we would generate candidate transactions. So we put in minor tweaks and we would score them against a model, an artificial intelligence model that we created, that told us the likelihood of those candidate transactions being authorized if we were to retry them.
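As a rough illustration of the smart-retry idea described here (generate candidate transactions with minor tweaks, score each one, retry the most promising), the sketch below may help. It is not Payvision's or Fraudio's implementation, which the speaker does not disclose. The `Transaction` fields, the `EQUIVALENT_MCCS` table with its placeholder codes, the `min_probability` cut-off and the scoring function are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Transaction:
    amount: float
    mcc: str       # merchant category code
    channel: str   # e.g. "ecommerce" or "pos"


# Hypothetical equivalence list; the codes are placeholders, not real MCCs.
EQUIVALENT_MCCS: Dict[str, List[str]] = {"MCC_A": ["MCC_B"]}


def generate_candidates(tx: Transaction) -> List[Transaction]:
    """Create retry candidates by swapping in an equivalent MCC."""
    return [replace(tx, mcc=alt) for alt in EQUIVALENT_MCCS.get(tx.mcc, [])]


def best_retry(
    tx: Transaction,
    approval_probability: Callable[[Transaction], float],
    min_probability: float = 0.5,
) -> Optional[Transaction]:
    """Return the candidate most likely to be authorized, or None if no
    candidate reaches the (assumed) minimum probability."""
    candidates = generate_candidates(tx)
    if not candidates:
        return None
    best = max(candidates, key=approval_probability)
    return best if approval_probability(best) >= min_probability else None
```

In practice `approval_probability` would be a trained model; here any function that maps a transaction to a number between 0 and 1 can stand in for it.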

‍

[00:06:34 - DK]: With retry, do you mean just doing the same thing again, or do you try something different?

‍

[00:06:41 - JM]: Minor tweaks, so we will change some parameters of the transaction. So sometimes, for instance, something very simple, which actually the card schemes allow you to do, is that sometimes in MCC codes you have equivalent MCC codes. I don't have any example in my mind right now, but for instance, gardening services or agricultural services are similar. This is probably a bad example.

‍

[00:07:18 - DK]: Just tweaking the MCC code makes it in a different category and different parameters apply there.

‍

[00:07:22 - JM]: Exactly. And we realized that the likelihood of some MCC codes being authorized, a priori, was lower than others. And there's an equivalence list, and we took some advantage of that. This is just one example.

‍

[00:07:39 - DK]: And then looking towards Fraudio, because the theme of this podcast series is the ABCD of payments. We already briefly discussed it prior to this podcast, but talking about artificial intelligence, blockchain, cloud and data, how do you leverage these four components in Fraudio?

‍

[00:07:58 - JM]: Yes, AI is our bread and butter, really. So everything that we do is AI driven and we try to really help our customers be AI driven themselves. So that's, I think, our pillar number one. Blockchain, at present, we really don't do much with it. With cloud, we're 100% cloud based. And the D, the Data: it's impossible to do AI without data. We do have a lot of data.

When we touch upon the product itself and the different generations of AI, I will explain a little bit more how we are actually able to put everyone's data together and to treat it as one entity, a centralized entity.

‍

[00:09:06 - DK]: Perhaps it's interesting to get into that right now, because you have talked about the three generations of AI. For those who have heard the buzzword but don't understand what it is, where it comes from and what the potential is, can you just walk us through the development to where we are right now and explore the future potential as well?

‍

[00:09:25 - JM]: Of course. So the first generation, and it transposes or translates very well to fraud detection. So the first generation of services that vendors provided was, and still is, very rules based. So you have a system that allows you to go and set up rules, and those rules, as you set them, tend to be quite static. But sometimes rules can be a little bit more interesting and a little bit more dynamic.

But still rules based. Then the second one is completely machine learning based, AI based, but built in silos. So vendors go to a customer, receive their data, and then they develop models specifically, with this data, for this customer.

‍

[00:10:21 - DK]: Okay.

‍

[00:10:22 - JM]: This is very professional-services based, and ultimately isolates the data and the learnings from one player (can be a merchant, can be an issuer), from one entity, and keeps them in isolation.

Whereas what we consider to be generation three is really a centralized AI. A centralized piece of software that learns from every data point. And the way we do it is by actually translating everyone's data into our own centralized data schema and then putting everyone's data together and training our AI with that data.

‍

[00:11:11 - DK]: Of course, in this age, the age of big tech and big data, when people hear "okay, we're going to centralize and keep all the data", that sounds a bit scary as well. But can you explain what exactly it means? So is it possible, for instance, for you to identify: "this is person X, and this is his or her behavior"? Are you able to use it commercially or not?

‍

[00:11:36 - JM]: No. The simple answer to that is no. We do not identify people. We do identify credit cards. We do not relate those credit cards to an individual. We're very careful with anonymizing data, but we always anonymize it in the same way, which allows us, given a new transaction that's coming through, to use the same anonymization functions.

And so if we see your credit card being used at Amazon Germany, and then it's being used to buy something from Adidas online in the Netherlands, we will be able to know. This pseudo-anonymization is actually not reversible. So that's point number one, that's very important. And thanks for bringing this up. So we do not require personally identifiable information, or if we do receive it, we immediately anonymize it and store it in an anonymous way. We also do not receive PCI information, so, say, credit card numbers. We do not need to receive it.
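One common way to get an anonymization function that is both consistent (the same input always yields the same output) and not practically reversible is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. This technique is an assumption chosen for illustration; the interview does not say which function Fraudio actually uses, and the function name and key handling here are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Map an identifier (for example a card token) to a stable pseudonym.

    The same value and key always produce the same pseudonym, so activity
    seen at different merchants can be linked without storing the original
    identifier. Without the key, the pseudonym cannot practically be
    reversed or recomputed.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the output depends only on the input and the key, a card token seen at one merchant and later at another maps to the same pseudonym, which is the property the consistent anonymization described above relies on.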

‍

[00:13:00 - DK]: You receive a token for that card.

‍

[00:13:02 - JM]: Receive a token, correct.

‍

[00:13:04 - DK]: And of course, that makes it a lot easier for you. You don't need a PCI license. It makes it a bit easier regulatory-wise as well. You mentioned that the second generation of AI is AI, machine learning. Those are two buzzwords that are often used in combination with each other. Is there a significant difference between the two?

‍

[00:13:27 - JM]: No. So machine learning is one of the pillars in AI. So machine learning is a subset of what is AI. So there are other pillars, and one very important one is actually the semantic pillar, the symbolic AI. And this is crucial for us actually, because that's exactly how we do this data translation or centralization. So we treat entities. Let's say, let me give you an example. So the Ajax Stadium, or...

[DK]: The Johan Cruijff Arena.

[JM]: Exactly. And with those two words or sentences, I can refer to the Ajax Stadium as Ajax Stadium or the Johan Cruijff Arena, and they mean exactly the same thing. Right? That's exactly what we do. So we go down to the semantic bare bones of what the data is.

‍

[00:14:34 - DK]: I can imagine two people not speaking the same language are using the same semantics in that sense and being able to translate it back to what it actually means. Then you can summarize the data and extract information from that. But how then does that translate into fraud detection, for instance? What is the extra benefit there?

‍

[00:14:57 - JM]: Yeah. So a payment is a payment, especially a card payment always goes through the same route. So the syntax of a payment, if it goes via one specific processor or acquirer, will look one way. If it goes by another acquirer, the integration is different. It will look different. But when it goes down, when it reaches the issuer, it's always the same, which is one specific issue.

So there's translations that are made along the way. We take advantage of that. So we take advantage of the fact that it is possible to translate payments from one language to a different language.

‍

[00:15:50 - DK]: So for you, it doesn't matter whether it was, for instance, Payvision, which was the acquirer, or it was acquired by ING or ABN AMRO or whatever. If it comes to the issuer, you can still translate the data into the same language, which then you can use.

‍

[00:16:04 - JM]: Exactly, we translate it into our own internal data schema, our language, which we then use to train our models. And this allows us to put everyone's data together. So if we receive data from, say, an acquirer in Brazil, and we then receive data from an acquirer in Italy (just two examples), typically they will be kept in silos. This data will sit somewhere, but the one for Brazil sits in one table, and the other one sits in a different table. We put it all together, which is really very powerful, and that allows us to have billions of transactions, and those transactions can be used to serve smaller players.

So if you have 30,000 transactions a month, typically you cannot do much with AI, so you need to revert to a rules-based approach. We can use the transactions that we have to serve those players.
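To make the idea of translating every source into one internal schema more concrete, here is a minimal sketch. The source names, the raw field names and the three common fields are invented for illustration. The interview does not describe Fraudio's actual schema or its patented translation step, which is certainly more involved than a field rename.

```python
from typing import Any, Dict

# Hypothetical per-source mappings: raw field name -> common field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "acquirer_br": {"valor": "amount", "moeda": "currency", "mcc": "mcc"},
    "acquirer_it": {"importo": "amount", "valuta": "currency", "categoria": "mcc"},
}


def to_common_schema(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the fields of one source's record into the shared schema."""
    mapping = FIELD_MAPS[source]
    return {common: record[raw] for raw, common in mapping.items()}


def pool(records_by_source: Dict[str, list]) -> list:
    """Translate records from every source and put them in one list."""
    pooled = []
    for source, records in records_by_source.items():
        pooled.extend(to_common_schema(source, r) for r in records)
    return pooled
```

Once every record has the same shape, data from Brazil and Italy no longer has to live in separate tables and can be used together for training.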

‍

[00:17:07 - DK]: You can broaden their database. But you mentioned the example of Italy versus Brazil, and I can imagine as well that the usage of credit cards differs between Italy and Brazil. Is there such a thing as too much data, so that if you put all the data together, you create one giant heap and you lose the distinctions between different countries or different merchant category groups or whatever?

‍

[00:17:38 - JM]: Absolutely. And that's exactly what we realized very early on. That's why I don't say that we have a model, we have an AI. And that AI is comprised of actually hundreds of models that are organized hierarchically. For instance, we will have specific sub-models for ecommerce, say 7995, gambling in Italy, and we'll have the same for Brazil. And those models will be different. When we don't have enough data to create a specific model, we go one level higher. For instance, we use the model for ecommerce in Italy, or for 7995 in Italy. And when we still don't have enough information, we go one level higher and we just go with the country or with the region or with the MCC code. So indeed what we do is create models per segment.
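The hierarchical fallback described in this answer can be sketched as a lookup that starts at the most specific segment and climbs one level at a time until a model exists. The segment key of (country, channel, MCC), the exact order of the fallback levels and the final global model are assumptions made for this example; the interview only says that the less specific model is used when a segment has too little data.

```python
from typing import Any, Dict, List, Optional, Tuple

# (country, channel, mcc); None means "any".
Segment = Tuple[Optional[str], Optional[str], Optional[str]]


def fallback_chain(country: str, channel: str, mcc: str) -> List[Segment]:
    """Segments to try, from most specific to least specific (assumed order)."""
    return [
        (country, channel, mcc),   # e.g. ecommerce, 7995, Italy
        (country, channel, None),  # ecommerce in Italy
        (country, None, mcc),      # 7995 in Italy
        (country, None, None),     # Italy only
        (None, None, mcc),         # MCC only
        (None, None, None),        # hypothetical global model
    ]


def select_model(models: Dict[Segment, Any], country: str, channel: str, mcc: str) -> Any:
    """Return the most specific model that was actually trained."""
    for segment in fallback_chain(country, channel, mcc):
        if segment in models:
            return models[segment]
    raise LookupError("no model available for this transaction")
```

A segment only appears in `models` if there was enough data to train it, so a missing key is what triggers the step up to the next level.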

‍

[00:18:44 - DK]: So there's data coming from all over. What are some of the industries that are generating the most data in that sense? Is there a massive difference between how much data is gathered?

‍

[00:19:00 - JM]: In this case, payments are produced by people buying something. So payments have a very interesting characteristic of being very representative of the real economy, not the trades between companies, etc., or states, but rather people's spending behavior. I hate to say it, but the Internet is quite heavy on adult content, of course, and gambling. So ecommerce is very heavy on that. But then, of course, retail is important, different forms of retail, and then point of sale. Point of sale, of course, is everything that you see. So I'd say this is a good breakdown. So you have adult and the different categories in adult, gambling and gaming. Gaming is, of course, big. And then all the different forms and types of retail. And then in point of sale, it's much more spread out and a little bit more even, I would say.

‍

[00:20:22 - DK]: In that sense I can imagine that, for instance, merchants that are operating in the more high-risk digital entertainment areas, they, on the one hand, are extremely good clientele of yours because, you have one solution that really helps them. But on the other hand as well, perhaps those merchants that do not have the data flow that allows them to have their own AI systems in place, those are as well really helped by your solution. Where do you see the most added value?

‍

[00:20:59 - JM]: I would say both. So we're working with a very large digital services provider. So it's actually selling vouchers and tickets online, et cetera. So it's a very good business. They are IPOing as we speak, and we are able to provide them with really very good scores. So everything that's digital that you're able to use a credit card to buy, something that you can then use or sell, is very prone to having higher fraud. And in this case, it works really very well, remarkably well, I would say. And they have a lot of data. It's a large company that processes a lot of transactions. It's actually possible to just produce good machine learning models for them. But the benefit of putting their data together with our data is actually massive. On the other hand, for the small players, I would say it's almost a no-brainer because often they have so little data that a rules-based solution performs ten times worse than ours. And what I mean by that is that to get to the same level of fraud detection, sometimes blocking two thirds of fraud, we produce ten times fewer false positives.

‍

[00:22:48 - DK]: Because of course, this is the holy grail: to minimize your false positives and to maximize the capture rate of fraudulent transactions. I know one of the founders of PaymentGenes always made a comparison saying if you have a nightclub and you have no bouncers in front of your nightclub, then sure, there will be a fight or two in the club. If you have ten bouncers in front of the nightclub and everybody needs to show their ID before coming in, there won't be any fights, but the club will be empty. And there's a balance somewhere in between: every now and then you want a club full and you want energy in the club, but there's a chance that fraud will happen. How do you see it going forward? How would merchants be able to best find the balance in that equation? Because of course, it's dynamic, fraudsters always find new ways to commit fraud, so it's a constant battle. How should the merchants prepare themselves?

‍

[00:23:41 - JM]: Yeah, I'd say there are two different axes, or variables, to this equation. At least two. So one is margin. So what is the customer's margin? The end customer's, the merchant's margin. If a merchant has a high margin, then their tolerance to fraud tends to be quite high. So let's say 50% margin, which is exaggerated in some cases.

‍

[00:24:11 - DK]: You can have 50% and still break even.

‍

[00:24:13 - JM]: Exactly. If it's digital goods, the service that you're rendering doesn't cost you much.

‍

[00:24:22 - DK]: The cost is already made. You can re-sell it and re-sell it and re-sell it.

‍

[00:24:25 - JM]: And so their tolerance to chargebacks is higher. Whereas if you're in an industry, or your business is a type of volume business, where you have very tight margins, so something below 2%, even sometimes lower, your tolerance to fraud, or to chargebacks, goes out the window. So there you will want to catch a lot of fraud. But going to your analogy: I like the analogy to some extent. So the analogy of the bouncers, to some extent, because you have smarter bouncers and less smart bouncers, and you are basically implying that every bouncer is the same, which is not the case.
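The margin argument can be made concrete with a small worked example. Under a deliberately simplified model (an assumption, not something stated in the interview), approving a genuine transaction earns margin × amount, while approving a fraudulent one loses the cost of the goods, (1 − margin) × amount, plus an optional chargeback fee. Accepting is then worthwhile only while the fraud probability stays below the break-even value computed here.

```python
def break_even_fraud_probability(
    margin: float, amount: float, chargeback_fee: float = 0.0
) -> float:
    """Fraud probability at which accepting a transaction has zero expected profit.

    Simplified model (an assumption for illustration):
      gain if genuine = margin * amount
      loss if fraud   = (1 - margin) * amount + chargeback_fee
    Accept when (1 - p) * gain > p * loss, i.e. when p < gain / (gain + loss).
    """
    gain = margin * amount
    loss = (1.0 - margin) * amount + chargeback_fee
    return gain / (gain + loss)
```

With no chargeback fee the break-even probability equals the margin itself, so a merchant at a 50% margin can tolerate far riskier transactions than a volume business at 2%; any fee lowers the threshold further.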

‍

[00:25:12 - DK]: True. They are not the same.

‍

[00:25:14 - JM]: So we are the smart bouncer.

‍

[00:25:19 - DK]: How did you become so smart? If you're able to be so smart, why can't others?

‍

[00:25:25 - JM]: That's a very good question. And that takes me a little bit back to what we really invented and patented. So this concept of putting everyone's data together is not trivial at all. I make it sound simple.

‍

[00:25:42 - DK]: It sounds easy.

‍

[00:25:44 - JM]: It sounds easy and from an external perspective, it's very simple. Internally, it's quite complex. So in order to make the data sets coincide, there's a lot of complexity that needs to be dealt with, and that's a problem. Also, as I said, the functions, the anonymization functions, need to be coherent across the whole spectrum and the whole portfolio. And so for someone who has collected data in the past but not in the right way, that data is not mergeable. So you need to start from scratch doing this, which we did. And again, we patented the key part that allows us to do this. We are quite active, I'd say, in academia as well. So I myself hold a PhD in AI and I've spent a lot of time in academia, and I know that even in academia there isn't work around this, okay? So this is quite advanced from a technical perspective, and that's the first difficulty. And then think about it in these terms. If you are a vendor, a current vendor, and you are charging six figures to run a proof-of-concept for a customer, and then after it's done, after six months, you sell that custom-made product again, in some cases for a million a year, why would you change that business model?

Do you see what I mean? Basically, you need to have a lot of data science teams, a lot of data scientists, which you then allocate to working on these projects, and you make a lot of money. It's almost a consultancy business: you allocate people with expertise, and of course they already have the pipelines in place. They're able to do it quickly, but it's a professional service and they get very well paid. If they were to centralize it and to do it the way we do it, then two thirds, three fourths of their workforce would just be redundant.

‍

[00:28:12 - DK]: Then a question that came to mind, because you talked about different vendors, and at PaymentGenes we've supported both merchants and payment companies in vendor selection. When it comes to working with or choosing an AI-based partner, the proof of the pudding is somewhere hidden in the code. So their true capabilities are more difficult to figure out than in a more traditional company. How should merchants and companies alike handle this in the negotiations or in RFP processes to make sure that they really understand the value that the company can bring? And indeed, is it truly AI based or is it rule based, sold as AI?

‍

[00:29:00 - JM]: I would say the proof is in the pudding, as you said. It does not matter what's behind the curtain. What matters are the results and the value that a vendor is capable of bringing to a customer. What I really advise every company looking for a solution is to create a set, a data set, with some data and then with labels, so each transaction is marked as being fraudulent or not, so having been charged back or not, and then have one month or two months of data that's unlabeled. So it's a blind test and then give it to the fraud vendors. Whoever performs better is the best. It's as simple as that. So what's behind the scenes should not matter too much. Of course, some will say that they have very bad systems where they have rules. But listen, if those rules work very well, who cares?
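A blind test like the one recommended here needs a metric to compare vendors on. One option, in line with the earlier remark about producing fewer false positives at the same level of fraud detection, is to count how many legitimate transactions each vendor would block in order to catch a given share of the fraud. The function below is a minimal sketch of that comparison; the function name and the choice of metric are assumptions, not something prescribed in the interview.

```python
from typing import Sequence


def false_positives_at_recall(
    scores: Sequence[float], is_fraud: Sequence[bool], target_recall: float
) -> int:
    """Count legitimate transactions blocked before the target share of
    fraud is caught, when blocking in order of descending vendor score."""
    total_fraud = sum(1 for label in is_fraud if label)
    if total_fraud == 0:
        raise ValueError("the labelled set contains no fraud")
    ranked = sorted(zip(scores, is_fraud), key=lambda pair: pair[0], reverse=True)
    caught = 0
    false_positives = 0
    for _score, fraud in ranked:
        if fraud:
            caught += 1
        else:
            false_positives += 1
        if caught / total_fraud >= target_recall:
            break
    return false_positives
```

The merchant keeps the true labels for the blind period, collects each vendor's scores for the same transactions, and compares the resulting counts at the recall level that matters to the business; the lower number wins.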

‍

[00:30:13 - DK]: Yes, indeed, in the context that they work in the current environment. But as well, fraud is something that's totally dynamic and it changes all the time. So the question is, will it keep up? And I think AI is better equipped to keep up with that than a rule base, because with rules you're always behind the curve.

‍

[00:30:33 - JM]: I agree. In that sense, the fact that we're centralized, you don't need to continuously pay us to improve your models. We improve the models for everyone.

‍

[00:30:47 - DK]: With regards to those developments in the fraud landscape, of course, I think you are in a key position to identify developments in that area. Is it so quick as people think it is, or is it a bit more stagnant? What are some of the major developments that fraudsters are currently doing?

‍

[00:31:07 - JM]: So it is really malware driven. So fraud, especially in , it's malware driven fraud.

‍

[00:31:18 - DK]: Perhaps for those who heard the word but don't know specifically what it is.

‍

[00:31:23 - JM]: Yeah. So malware, think of viruses, computer viruses, computer programs that get installed in your devices, in your laptop, in your mobile phone, and they will do different kinds of things. So, for instance, they can log your keys. So whenever you write on your keyboard, they are constantly recording what you write on your keyboard. And then every once in a while, they will send that batch of keystrokes back to what's called the Command-and-Control server. And then on that side, it's very easy to identify a sequence of 16 digits, and that will be the credit card number, followed by some CVV code. That's a major way of how credit cards get exposed.

A different one is, of course, data leakage. So there's a database that gets hacked and a bunch of credit cards are all exposed at the same time. That also happens often. But if I have the password, if the passwords are exposed, then I can go to the back-office of that service and get, for instance, your phone number, email, the first six digits of the credit card, the last four, the expiration date. Now I just call you. I pick up the phone and call you and say "so I need to verify your credit card". That's called social engineering. Fraudsters tend to be very opportunistic. They are smart people.

‍

[00:33:14 - DK]: Minimum input, maximum output.

‍

[00:33:16 - JM]: Exactly. Minimum input. They do tend to become lazy. So what most of them do is buy lists of stolen credit cards. In some cases, they will actually open up a store themselves, an online store, and then they will process stolen credit cards in their own store.

‍

[00:33:36 - DK]: Ah, wow.

‍

[00:33:38 - JM]: Then they will just run away. This happens often. We have a tool, geared to acquirers, that looks for this, so for merchant-initiated fraud.

‍

[00:34:04 - DK]: Merchants can be the perpetrator. And of course, that happens if the KYC procedure doesn't fully work or KYB procedure.

‍

[00:34:13 - JM]: And sometimes it works. There's this saying in some circles that low risk is high risk because low risk, you want to onboard rapidly, right?

‍

[00:34:25 - DK]: Yes.

‍

[00:34:26 - JM]: And it's very easy to buy a stolen card, an ID card, off the streets. You pay in some cases €30 to get it, and then with that you change the picture and you pass a KYC test very easily. And so you're able to open up a low-risk store with a low-risk MCC code. And you process stolen cards.

‍

[00:34:54 - DK]: Yeah. We've just discussed how you stop fraud while it's being processed. I can imagine as well, and I know that there are a lot of efforts to prevent fraudsters from even getting started, to prevent credit card details from ending up on the street, etc. Is that something that you also consult on, or not at all?

‍

[00:35:20 - JM]: Not really. So we do not work on the KYC side and also, of course, not on the cybersecurity side itself. Actually, my background is in cybersecurity. I've worked for several years in cybersecurity, and we have good connections in that space, and we're able to augment our data set with cybersecurity related information. I'll give you an example. An IP address known to have been compromised in the past, and then if we see fraud coming from or if we see a payment coming from there that we suspect is fraud, we sort of multiply the score. That's an example. And we also alert our partners that we're seeing fraud coming from this, which is, let's say, a validation that this IP address is compromised. It works both ways. And it's very good to actually look at what people are doing in the cybersecurity space.
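The "multiply the score" example can be sketched in a few lines. The multiplier, the suspicion threshold and the cap at 1.0 are all hypothetical values chosen for illustration; the interview gives no numbers, and the IP addresses used in any test should come from the reserved documentation ranges.

```python
from typing import Set


def adjust_score(
    base_score: float,
    ip_address: str,
    compromised_ips: Set[str],
    multiplier: float = 1.5,
    suspicion_threshold: float = 0.5,
) -> float:
    """Boost an already suspicious fraud score when the payment comes from
    an IP address that a cybersecurity feed lists as compromised."""
    if base_score >= suspicion_threshold and ip_address in compromised_ips:
        return min(1.0, base_score * multiplier)  # keep the score within [0, 1]
    return base_score
```

The boost is only applied when the payment already looks suspicious, matching the description above: a compromised IP on its own does not turn a clean payment into fraud.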

‍

[00:36:30 - DK]: We've talked a lot about AI, but as well, cloud and data are big parts of your company as well, but with regards to cloud, you said you're fully cloud operational. How did you select which vendor to use for that, for instance? Because I'm assuming it's Azure, or AWS.

‍

[00:36:49 - JM]: So we work with AWS and Google Cloud. We have everything in the European Union. We do not let data go out of the European Union. And we centralize everything here. And we're very careful with GDPR. We use Google Cloud as our main serving cloud for one reason: this is a technicality, but their Kubernetes implementation is very good. And it allows us to be very nimble. Also, they have servers in Amsterdam.

‍

[00:37:26 - DK]: The nimbleness that you're seeking is that you're more easily able to extract the data.

‍

[00:37:31 - JM]: No, that we're capable of doing updates more frequently and more easily. And then we use AWS to do the data science and to do all the data crunching.

‍

[00:37:46 - DK]: Why have the two side-by-side and why not all in one platform?

‍

[00:37:51 - JM]: It's an architectural choice from our perspective. The deployment phase is very simple and effectively, we're multi-cloud. So we're ready to deploy in Azure, in AWS, or even on-prem if a customer has an extremely good business case for it. We can also easily deploy on-prem, even though we don't like it.

‍

[00:38:20 - DK]: For instance, if you're going to Russia, then you will need to have data storage within the Russian border or Turkey. I think it's all the same.

‍

[00:38:27 - JM]: Right, correct. So also in South America, we do the data collection through there. And also, let's say an AI serving endpoint needs to be close to our customers. So the network latency is low and they get meaningful scores, but in a timely manner. Right now, we're below 50 milliseconds in terms of latency, which is fairly good, I would say. And we're able to do that in multiple regions.

‍

[00:39:10 - DK]: Yeah, that's nice. And then talking about the last one. Well, it's all around data. How has data usage developed from the time that you started Fraudio, until now?

‍

[00:39:25 - JM]: So we have a lot more data. That's number one.

‍

[00:39:29 - DK]: And is that trend going to continue? Is it just going to be more and more data gathered?

‍

[00:39:33 - JM]: Yeah. So data needs to continue to grow. And going back to the example that I used. So the 7995, ecommerce, in Italy, we want to have that kind of granularity, conceptually for every country, every channel, and every MCC code combination.

‍

[00:40:01 - DK]: Bring it down per region or whatever.

‍

[00:40:03 - JM]: Exactly. There's a point, of course, where having 1 billion transactions for one specific segment or having 2 billion is pretty much the same.

‍

[00:40:18 - DK]: And the different type of data points that you're gathering, so, for instance, not only credit card number or IP address, but as well, which browser somebody uses or how long somebody surfs the Web, is there a saturation point there for how many different types of data points you're gathering or whatever we can use, we will use?

‍

[00:40:39 - JM]: Yeah, pretty much whatever we can use, whatever we receive, we will try to use. Of course, there's only so much data you can capture, right? Yeah, I would say the more data, the better. There's some data that's sort of irrelevant. And there are some misconceptions in this space. For instance, the interaction with the browser, et cetera. What you want to do is to represent the interaction with the browser in a way that's always the same. So again, Amazon in Germany will collect that information in one way, but Bol.com here in the Netherlands will collect it in a completely different way, and then you cannot use it in a centralized way. So what we want really is the data to be uniform. And so we prefer uniform and homogeneous data, with fewer data points, over more data points that are just gibberish or garbage.

‍

[00:41:54 - DK]: Are there specific data points that surprise you that either make a huge difference or you thought would make a significant difference, but don't make any difference at all when it comes to fraud?

‍

[00:42:06 - JM]: So I would say everything that has a lot of variance actually loses meaning. Some data fields that tend to be touted as very important are actually not as important as people think, at least for us, because we gather a lot of data. For instance, the device fingerprint. A lot of people say the device fingerprint is the most important thing. We don't see it that way. It is important, but it's mostly important for determining what is fraud for sure. The problem is that it also causes a lot of false positives. So if you see device fingerprints that are...

‍

[00:42:58 - DK]: It's not a given that it's fraud.

‍

[00:43:01 - JM]: If you use it incorrectly, it's problematic. IP addresses are super problematic. There are, of course, a lot of IP addresses that sit behind a proxy, so what you see is always the same IP. If you see fraud, or even a lot of fraud, coming from that IP...

‍

[00:43:17 - DK]: That could also mean that it's a proxy server. So we're moving towards an age where more and more data is going to be gathered, and I don't feel like it will stop anytime soon. I think that with Moore's Law, et cetera, that we'll be able to store it and make use of it quite easily. When it comes to cloud, do you think it will be the gold standard or is there something on the horizon that will change more specifically for the payment industry?
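
The shared-IP caveat raised above can be made concrete by comparing raw fraud counts with fraud rates per IP. This is a minimal sketch on made-up data; the function name and the addresses (taken from documentation-only ranges) are assumptions.

```python
from collections import Counter

def fraud_rate_by_ip(transactions):
    """Return {ip: fraud_count / total_count} for an iterable of (ip, is_fraud) pairs."""
    totals, frauds = Counter(), Counter()
    for ip, is_fraud in transactions:
        totals[ip] += 1
        if is_fraud:
            frauds[ip] += 1
    return {ip: frauds[ip] / totals[ip] for ip in totals}

# Made-up data: a proxy IP shared by many shoppers, and a low-volume IP.
transactions = (
    [("203.0.113.7", True)] * 20 + [("203.0.113.7", False)] * 1980
    + [("198.51.100.4", True)] * 5 + [("198.51.100.4", False)] * 5
)

rates = fraud_rate_by_ip(transactions)
for ip, rate in rates.items():
    print(f"{ip}: {rate:.1%} of transactions were fraud")
```

In this example the proxy IP has four times as many fraudulent transactions in absolute terms, yet only 1% of its traffic is fraud, while half of the low-volume IP's traffic is. Counting fraud per IP without the denominator would point at the wrong address.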

‍

[00:43:51 - JM]: So if right now it's still not the gold standard, and I don't think it is as of today, it's quickly becoming the only effective way to move forward. A player that's only now starting the journey to the cloud is too late, I would say, from many perspectives. From a technological perspective and from a cost perspective, the cloud brings a lot of benefits. Of course, it's not a silver bullet, because it also has some challenges associated with it, but indeed, it's here, it's the new paradigm. Now, what's also coming to the market is serverless technology. In cloud, people typically think, "I rent a server in the cloud", and that's today's paradigm indeed. But what's coming is that you rent a function in the cloud. For instance, you tell Amazon, "I want to execute X plus Y", and then you don't have anything running. When you receive a transaction that has X and Y, you send it to them, and they automatically allocate resources to process that function. And this is the future, because you are not paying for a server to be always on when you don't need it.
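
The "X plus Y" example can be written as a function-as-a-service handler. This is a minimal sketch, not a deployment: the `handler(event, context)` shape follows the convention used by AWS Lambda's Python runtime, while the `x` and `y` field names and the local call at the bottom are assumptions for illustration.

```python
def handler(event, context=None):
    """Entry point the platform would invoke once per incoming request.

    `event` carries the transaction fields; `context` is platform metadata
    and is unused in this sketch.
    """
    return {"result": event["x"] + event["y"]}

# Local stand-in for the platform invoking the function on demand.
print(handler({"x": 2, "y": 3}))  # {'result': 5}
```

Nothing in this code starts or keeps a server running. The provider allocates resources when a request arrives and releases them afterwards, which is where the cost saving described above comes from.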

‍

[00:45:22 - DK]: To translate that: would it be a possibility, for instance, that I as a merchant have my own AI-based rule or system to indicate, okay, these are for sure safe transactions, so those I will process no matter what, and if the AI tool indicates there's something dodgy about a transaction, then I will impose a fraud check, to put it like that? Yes, that makes sense. That, of course, will drastically reduce cost compared with checking all transactions.

‍

[00:46:02 - JM]: Absolutely.
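
The triage idea in this exchange can be sketched as a simple routing rule on a risk score. This is an illustration under stated assumptions: the score is taken to come from some upstream model, and the 0.2 threshold, the function name and the sample scores are hypothetical.

```python
def route(risk_score: float, review_threshold: float = 0.2) -> str:
    """Decide the path for one transaction from a model's risk score in [0, 1]."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    return "fraud_check" if risk_score >= review_threshold else "process"

scores = [0.01, 0.05, 0.35, 0.02, 0.90]
decisions = [route(s) for s in scores]
checked = decisions.count("fraud_check")
print(f"{checked} of {len(scores)} transactions sent to the fraud check")
```

Only the transactions at or above the threshold incur the cost of the additional check. Where to set the threshold is a business decision that depends on margins and risk appetite.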

‍

[00:46:04 - DK]: Another thing that a friend of mine told me is that the size of the data stack that you're working with is 90% of how effective you will be. Of course, fraudulent transactions are only a percentage of the actual transactions. How can you leverage the knowledge that you have about the transactions that are valid and make sure that you retrieve the most value out of that instead of looking at the fraudulent transactions?

‍

[00:46:32 - JM]: So you really want both. One thing is to characterize what is a normal pattern. And if you are able to characterize what is a normal pattern, if a transaction looks like that pattern...

‍

[00:46:46 - DK]: If it smells like chicken, tastes like chicken, looks like chicken.

‍

[00:46:49 - JM]: It can be rabbit.

‍

[00:46:50 - DK]: But it can be anything.

‍

[00:46:55 - JM]: And the same applies to fraudulent patterns or to bad patterns. Now, mixing both is very important. It's difficult to deal with the previously unseen. So if you only have non-fraudulent transactions, when something else comes, it will always look like something that you've seen. Probably. How far away is it from what you've seen?

‍

[00:47:27 - DK]: Indeed, if it's something you don't know, do you take a gamble and say, okay, let's try this, and let's just keep a close eye on what happens to it later, versus, okay, let's definitely not go there? And then we're back to: what does your business model look like? What are the margins that you're working with? Are you willing to take the risk? What is your risk appetite?

‍

[00:47:47 - JM]: That's what unsupervised learning is used to deal with. And actually, what we look for are zero-day fraud patterns. So if one of the smart fraudsters, not the lazy ones, comes up with a new exploit or comes up with a new batch of credit cards from somewhere, and they start to use some tool to run those credit cards, this will be a pattern, or a set of patterns, that we have not seen yet. The first few transactions will look odd, but we don't yet know if they are odd-bad or odd-good. And so we need to wait a little bit while they accumulate. Then, when we start having better evidence that these new patterns are linked, this is how we sort of plug the gap between the transactions that we see now and the chargebacks that we will see in the future.
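
The "look odd, then wait for evidence to accumulate" idea can be sketched without any labels. This is a deliberately simple stand-in, not the models discussed in the interview: oddness is a z-score on a single feature (the amount), and the pattern keys, thresholds and data are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

def is_odd(amount: float, known_amounts: list[float], z_threshold: float = 3.0) -> bool:
    """Flag an amount that lies far from the previously seen amounts."""
    mu, sigma = mean(known_amounts), stdev(known_amounts)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > z_threshold

def patterns_to_escalate(transactions, known_amounts, min_evidence: int = 3):
    """Count odd transactions per pattern key; escalate once enough accumulate."""
    odd_counts = Counter(
        key for key, amount in transactions if is_odd(amount, known_amounts)
    )
    return {key for key, count in odd_counts.items() if count >= min_evidence}

known = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 20.0]
incoming = [
    ("bin_411111", 499.0), ("bin_411111", 501.0), ("bin_411111", 500.0),
    ("bin_550000", 480.0),  # odd, but only seen once so far
    ("bin_550000", 22.0),   # looks like the normal pattern
]
print(patterns_to_escalate(incoming, known))  # {'bin_411111'}
```

A single odd transaction is not escalated, because it could be odd-good. Three odd transactions sharing the same key are treated as linked evidence, ahead of any chargeback arriving.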

‍

[00:49:08 - DK]: Exactly. All right. Well, I think we've nearly discussed everything. Is there anything that we haven't mentioned yet that we should be looking at if we're talking about fraud, if we're talking about AI, blockchain, cloud and data within payments?

‍

[00:49:26 - JM]: So I would say, and really just to wrap up, this is more of a general consideration. I think it's important to look at these things not in a very operational way, but rather in a very holistic way. By these things I mean fraud, authorization rates, payments in general. If you look at authorization rates atomically, and if you try to control fraud in an atomic way but don't care about authorization rates, they will conflict. Take, for instance, strong customer authentication. It's not a silver bullet. It's a tool. It's the right tool for many applications, but it actually causes a lot of issues. So what you really want to do is to use the tools that you have at hand in a smart way and try to optimize, ultimately, the customer's experience and the merchant's experience, and minimize fraud. Strong customer authentication, and maybe this is something that we can actually discuss to wrap up, is in force right now in many countries, since the 1st of January, and it's quite annoying. I'm sure that you've already experienced it in these last two weeks. Now, for instance, whenever I want to rent a car here in one of those car sharing apps, I get asked to go to my credit card app and authorize it, and then it takes a while, and it's raining, because it's always raining, which is quite painful. So what we really want to do is to allow our acquirers, our customers, to go below the threshold necessary for the exemption.
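
The remark about going below the threshold can be illustrated with a small lookup. It assumes that the exemption meant here is the transaction risk analysis (TRA) exemption under PSD2, which the interview does not name. The reference fraud rates are those listed for remote card payments in the annex of the regulatory technical standards and should be verified against the regulation; the function name is hypothetical.

```python
# Reference fraud rates for remote card payments as listed in the annex of
# the PSD2 RTS on strong customer authentication (Delegated Regulation (EU)
# 2018/389). Verify against the regulation before relying on them.
TRA_BANDS = [  # (max transaction amount in EUR, max fraud rate as a fraction)
    (100, 0.0013),
    (250, 0.0006),
    (500, 0.0001),
]

def highest_exempt_amount(fraud_rate: float) -> int:
    """Largest transaction amount (EUR) eligible for the TRA exemption, or 0."""
    eligible = [amount for amount, max_rate in TRA_BANDS if fraud_rate <= max_rate]
    return max(eligible, default=0)

print(highest_exempt_amount(0.0005))  # a fraud rate of 0.05% -> 250
```

The lower the fraud rate, the larger the transactions that can skip the extra authentication step, which is how reducing fraud and improving conversion are tied together.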

‍

[00:51:20 - DK]: Yeah, exactly. That's what we say to our clients as well: look at your business model and look at the exemptions, and ask how, within your business model, you can best use those exemptions. Because if you can do that, then that's a huge conversion increase for your business.

‍

[00:51:36 - JM]: Absolutely. So in Spain, I read recently that since the beginning of the year when it became mandatory, 80% of the transactions failed during the first week.

‍

[00:51:49 - DK]: Wow.

‍

[00:51:50 - DK]: Yeah. So that's huge.

‍

[00:51:52 - JM]: Of course, it's going to stabilize as people are more used to it.

‍

But still, in the UK, and this is recent, they have had strong customer authentication in place for a lot longer than in other places, and the loss in conversion is 30% with 3-D Secure.

‍

[00:52:18 - DK]: Yeah, exactly. And it goes back to what you said. It's one tool. And you should always be aware that if you turn up the dial on one tool, other tools are also affected. So it's always important to find a balance and be aware that there is a balance. It's not just one variable that you're adjusting.

‍

[00:52:40 - JM]: Exactly.

‍

[00:52:41 - DK]: Yeah. All right. João, thank you a lot for this inspirational talk, it's really been enlightening for me, I must say. And I hope for the audience as well. I hope you had some fun as well. That's always important and I'm confident that we'll speak more soon, but I want to thank you very much for being on this podcast and all the best to you.

‍

[00:53:04 - JM]: Thank you. And thanks for the opportunity to be here, and thanks to our audience for bearing with us. I hope it was as much fun for you as it was for me.

‍

[00:53:16 - DK]: Yes. Thank you very much.

‍

Hope you've enjoyed our podcast conversation between Diederik Klopper and João Moura about how AI is changing the Fraud Detection paradigm. See you on our next talk!

‍

About Fraudio

Fraudio is an Amsterdam based scale-up helping companies in the payment ecosystem fight payment fraud and financial crime with its unique ability to build high performing AI and ML models without costly customisation. It is trusted by some of the fastest-growing companies in the world, protecting them from payment fraud, merchant-initiated fraud and money laundering.

Fraudio's founders are from the payments industry and don't believe in black-box solutions. They ensure that end-users are provided with insightful and timely information to control payment fraud and merchant portfolio risk while ensuring the highest level of security and auditability. It's easy to integrate with products that deliver best-in-class fraud detection from day 1, allowing clients to scale their customer bases safely, reducing operational costs and fraud losses while maximising revenue.

‍
