Discussing Forter with CTO Iftah Gideoni

This conversation covers:

  • The value that Forter provides, and the types of companies that they work with. Iftah also explains what makes Forter so unique.

  • The underlying technology that Forter is using, and how they quickly process hundreds of complex backend workflows. Iftah also talks about some of the tools that they are using, including AWS and Apache Storm.

  • How Forter approaches the cloud, and how it’s helping them concentrate on the business of detecting fraud. In addition, Iftah talks about the types of cloud services that Forter is using.

  • Forter’s ability to scale — including how they responded to increased customer demand during COVID-19.

  • Forter’s biggest technical challenge that they are currently working through.

  • Iftah’s thoughts on the security-speed tradeoff.

Transcript

Emily: Hi everyone. I’m Emily Omier, your host, and my day job is helping companies position themselves in the cloud-native ecosystem so that their product’s value is obvious to end-users. I started this podcast because organizations embark on the cloud-native journey for business reasons, but in general, the industry doesn’t talk about them. Instead, we talk a lot about technical reasons. I’m hoping that with this podcast, we focus more on the business goals and business motivations that lead organizations to adopt cloud-native and Kubernetes. I hope you’ll join me.

Emily: Welcome to The Business of Cloud Native. I'm Emily Omier, your host, and today I'm chatting with Iftah Gideoni. Iftah is the CTO at Forter. Iftah, first of all, thank you so much for joining me.

Iftah: Very glad to be here.

Emily: So, I wanted to have you start by introducing yourself and what you do, and then also what Forter does.

Iftah: Hi, I'm Iftah. I’m a physicist of education, and in the last 20 years, a CTO of several companies, mostly [00:01:11 unintelligible] governmental companies, and companies that I founded. In the last six and a half years, I'm with Forter. And what Forter started to do from 2014 is to provide what was, at the time, very bold vision of fully automated, fully cloud-based decisions about whether to allow or decline e-commerce transactions.

Now, from that time we actually implemented and executed that, we decide very many more than 3 million transactions every day, today, all in real-time without a human in the loop. 
And we expanded into being a fully-fledged trust engine that gives decisions not only about transactions, but about many other points of interaction with the consumer, for example, in their login time, and in other points where trust decision is needed.

Emily: So, just because I think it might be interesting to listeners, give me some examples of, like, when somebody might interact with Forter or have some sort of action approved or declined by Forter.

Iftah: Right. The prime customers of Forter are the big e-commerce enterprises. Think about the [00:02:42 Sephoras], the Nordstroms, the Home Depots, and this kind of companies. And whenever you press the button of requesting to committing to the purchase and you see this small things rounding on the screen, then it is sent to Forter and Forter within, usually, half a second returns a decision.

Now, Forter does not act as an additional data point, or input, or score into some system of the merchant. It actually answer whether to approve or decline the transaction. In very many—and most of the revenue of Forter comes from a covered transaction that, if this transaction was fraud, it’s on Forter. Forter will guarantee it. And we were pioneering this model to putting our mouth where our money is.

Emily: Tell me just a little bit about why this is so difficult. What makes what Forter does unique?

Iftah: What Forter does is unique because it tells the human story, and takes it all the way to the decision itself. For example, it's very easy to approve the fourth transaction of a person that is sitting at home, browsing from home, making the purchase on the same desktop they made at previous times, and sending the shipment to the same home. That's very easy. 
But we want to be able to approve the traveler, the person that is sending a gift to a third party, or a person that is sending a gift to another state while not browsing from home and not from his common device.

We want to be able to approve those transactions that are checking out as guests from a new device and that's the first time this person ever appeared on our radar. And the ability to do that and to take the calculated risks and to look at the behavior, the cyber clues, and still be able to tell that this is indeed a new person and not someone that visited before and is trying now to hide. That's what makes what we do very difficult and complex.

Emily: So, tell me a bit about the technology story. What technology do you use to accomplish this, and how does it work? What does your stack look like?

Iftah: When I came to—from 2014, I looked at the system and what is actually needed in order to cater to such a complex story? And I thought to myself—and we'll talk about maybe a bit later about how all this is excellently suited for the Cloud, but what I found that throughput and big data is not the problem. First, it’s more or less solved, but it is the e-commerce business; it's not Facebook scale throughput. And on the other hand, it's not hardcore real-time, right? We're talking about tens of milliseconds, not the microseconds domain.

What is extreme about what we do is the complexity of the flow. We have hundreds of processes that are needed to be ran within that half a second in order to test, and check, and infer, and decide on many aspects of this transaction and of this person. So, first, we started from Amazon Web Services, and we started with, actually, Apache Storm. 
And why we decided that because we wanted to have something that enables first, a lot of parallelism—doing many things in parallel—with smart joins, that is with processes that takes information from other processes that executed in parallel, and can decide whether what they have so far from these processes is enough. Because we are very high availability, we didn't lose more than 10 seconds straight in the last four years. We are very high availability, but a lot of our sub-processes are not.

So, you need such a machine that will be able to infer about whether the information at hand is good enough and to move forward and still give, after half a second, the answer. We also wanted to have within this high availability system, we wanted to have the domain experts, the analysts, and the fraud researchers, we wanted to give them a very direct access to the code and each insight that they get, in close to real-time, maybe in 10 or 15 minutes from the time that they understood that there is a new wave of attacks or a new fraudster in action in a particular store or across stores. We wanted all these insights to be manifested in the system within 10 or 15 minutes without these people needing any engineering in order to do that. So, we created incubators within these Apache Storm processes that enable them to write, in Python, their wisdom into the system without being technologists or engineers. So, this was the basic.

Then we went on to see how we do the best similarity in the world. That is the understanding of whether we already saw a person, even if this person exhibits a new persona and is trying to hide. That is, they didn't give us the same phone number or email, it's not the same cookie, or the same IP, or the same credit card, and they don't use the same account, and we still want to know that this is the same person. 
This is a big part of what makes us efficient in exterminating fraud rings and enabling us to increase the cost of doing business for the sophisticated fraudsters. These are the prime building blocks and the last very important building block is the way we represent the world.

Usually and traditionally, world was represented by the transaction. The transaction was the building block. But we represent the world as people. We know more than 700 million people and of their interactions and their browsing, in many stores. These 700 million people include most of the people that interact online in the US. And the same is for the IPs, and the addresses, and the devices in the US. The US is where our coverage is best.

And all what we do revolves around the person because we believe that the person is what is actually persisting in the world with persistence reputation. That is a person is a legitimate person, they will stay legitimate. Usually, they won't flip on us. And if they are fraudsters, they will stay fraudsters. Not the same for IPs, for addresses, and for all other entities, you can think of. I hope this, it answered to a degree what you are asking about.

Emily: Yeah. And I'm going to go into some more questions, but it's really interesting that what you're combining is both this sophisticated technology as well as sort of an understanding of—almost like a law enforcement understanding of how fraud works. Or how—like, an anthropological investigation of how fraud rings work.

Iftah: Yes. And we found that there is a lot of—and I think our main asset is the ability to combine what analysts understand about the spoofing of the device, and how you detect that it's not really a mobile phone, it's an emulator on a desktop? And how can you tell that someone is trying to mess with an application that you protect? 
And what are the ways in which you can approve a transaction that looks very fishy to begin with, but it has some hints of legitimacy.

How we combine this with a very robust, high availability and very secure machine because it needs to be secure. We touch a lot of personal, identifiable information in our regular course of business, and we need the system to be ultra-secure while it is on the Cloud. And our booklet, actually, of 101, how to secure your startup [00:12:45 unintelligible] usage on the Cloud was actually trending number one on GitHub for months in 2017 when we issued it. [laughs].

Emily: That's excellent. Well, let me ask some more technology-specific questions. One is, just—you sort of alluded to this, but how is the Cloud important—and in fact, I believe you said critical to your business? Would Forter even be possible without the Cloud?

Iftah: Forter would be possible with an on-prem cloud, right, because when we say Cloud, it could be Amazon, or Azure, or GCP, but it also could be in a cloud that we built somewhere. This would be possible. We didn’t go there, and most of e-commerce companies would not go there, and we'll dive into this in a minute why it's not wise to go there. But Forter is heavily relying on knowledge of the people, regardless of which merchant they visited.

So, if we see a person in the Forter, it could be their first time Nordstrom sees them, but we already know them, and we can project the reputation of the person from previous interactions with other customers of ours. We don't share any customer data, of course, with any of our customers, but we can share parts of reputation, especially for people where this is the first time they visited a particular merchant. Now, this is a prime reason why it cannot be on-prem of the customer. And several customer—and I will not mention name, but huge conglomerates of carmakers, actually, asked us to be on their Cloud. 
And we refused and we let go of the business because that's not how we do, and the best value for them would be to share the data.

And so far, all the customers that we have so far actually agreed to share the usage of reputation with all the rest of the network of customers that we have. This is something that they cannot do in-house, and this is something, per your question, that cannot be done if we are not in the Cloud, but on their premises farm.

Emily: And so are you operating in all public clouds, or do you have your main technology running in one?

Iftah: We have our technology running in a few regions of AWS. And we are now deploying a few regions in Azure, too.

Emily: And so it doesn't matter if your customer, which public cloud. So, if you have a customer that uses GCP, doesn't matter, right?

Iftah: It doesn't matter. And most of them are [00:16:01 naturally] aware where we are. Bear in mind that we are serving companies—I mentioned the names—which are inherently not technology companies. And it doesn't matter where they sit; we are a full SaaS company for them. They send us the request, the transaction, and we give them a decision within this half a second, and that's the core of the business. Doesn't matter for them.

As [00:16:36 unintelligible] to say earlier, the concept of the public cloud and using other people's cloud infrastructure, be it GCP, or Azure, or AWS or others, is very suited for the e-commerce because of these two prime characteristics of the e-commerce: first, you don't need it to be very hard real-time, you're talking about tens of milliseconds, and giving answers in hundreds of milliseconds, ultimately; and second, unlike the Twitters, and the WhatsApps, and the Facebooks, and the Googles, the e-commerce is not big data in the sense that every transaction of e-commerce is, on average, a very high monetization. So, the ultra cost of using public cloud is definitely worth it for the e-commerce entity, comparing to creating your own farm. 
It is good for their flexibility and it's good for the focus and attention on their core business, where if you run your own farm, you are into a lot of domains of expertise which are far away from selling whatever you sell.

Emily: And tell me, also, a bit about, sort of, your own technology. Things like how you manage scalability. How important is it for Forter’s bottom line, the ability to have a scalable system?

Iftah: We are running from 2014—from day one, actually, from 2013, we have to be scaled out. We can’t scale up. We don't have anything that is done by a single computer. All the transactions are on what are called brains that are a scaled out on both redundancy and scalability.

All our data stores are scaled out. All the data stores that are storing the transactions, and the logging, and the entities that we talked about are scaled out and they are replicated, and the transactions that are dealing with our hundreds of thousands of browsing events that we receive and analyze every second, of course, they are scaled out. So, from day one, from 2014, everything that we do is scaled out. In the first two months, it was for redundancy in different availability zones of different regions, but from then on, it's all scaled out. And I will be very happy to dive into the particulars of the technologies, but what is important in the context of this podcast, I believe, is that doing it on the Cloud using the cloud infrastructure is actually enabling us to concentrate on the business of detecting fraud and business of these massive topologies of hundreds of processes that are both in Java and Kotlin and Python, and have very complex acyclic graphs connecting them. And we just do it in parallel on very many servers that we can scale up and out as we wish. 
And this is something that helped us focus our core business: understanding fraud.

Emily: Going back, actually, to this idea of scalability, I know over the past six, eight months because of COVID, e-commerce has gone through the roof. And I'm assuming, in fact, I read that Forter’s business has also been going through the roof. How have you managed scaling?

Iftah: Yes. Forter business went through the roof with their several verticals: with food deliveries, of course; and with the big department stores, which COVID accelerated their digital transformation; it did dive with the travel business, of course, right? Few things happen to our customers, and for Forter, scaling was natural. If we have 20 minutes warning scaling is a natural to us, and here we had about 10 days of warning. Easy, right?

For our customers, it was a bit different. First, a lot of them came to us, actually had to eliminate all their manual processes. And Forter was there for them. Now, what Forter did for them beyond eliminating any manual fraud-related tasks and loads was to reduce substantially the hike in the customer success load. Because Forter is able to be more accurate and to decline less legitimate customers, you don't have that many calls to the customer success centers.

And these are two bottlenecks for our big merchants: the customer success, and fraud and fulfillment. And the fulfillment, that is being able to capture the money and to send the goods is also streamlined by the fact that it's all done in real time. These are the direct effect, but there are additional phenomenon that happened. One of them is that suddenly, in COVID, a lot of customers that didn't usually do things online started to buy online. 
And we saw that the amount—or the percentage of new buyers, in many of our customers, suddenly jumped.

And when you have new buyers, you need a very sophisticated system to be able to allow them in, to approve their transaction, and to allow them to build their reputation; so this happened. The spikes in throughput happened; every day in the last four months is like a Good Friday and Cyber Monday combined for us. And that's good. We didn't lose any availability, and with the current technology, it wasn't that problem from the scalability aspects. And indeed, have we been on private, or our on-prem, this would be much harder.

Emily: And now tell me a little bit more about the technology required. We talked a little bit about it not being exactly a throughput problem, but you do have hundreds of processes that you have to run in, you know, several seconds, what technology do you need to leverage in order to make that happen?

Iftah: Everything that we run is on flash disks; we don't have rotating disks anymore. We do run low CPU and low memory on all—low memory usage and low percentage of CPU on every [00:24:30 unintelligible] that we run, to accommodate and reduce a spike pickups. We do use the Apache Storm for our base; it is the base of our topology of these processes that we talked about, and hundreds of them in each topology. And we have several topologies for both the transaction time and what we call the visit time, the browsing time. And we run—we are a big customer of Elasticsearch, we run Elastic from the very early days, and we use them for sophisticated queries in their own annoying language. [laughs].

And we have one of the largest clusters. We have about 15 clusters of Elasticsearch that serve our entities that we talked about, our mapping of people to these entities, our logging, and our real-time matching between the current transaction and all the hundreds of millions of people we already know that acted online previously. 
These are the core technologies in our stack, and on top of that, we use several other technologies: Spark and our wrappers over Spark for the MapReduce work of our machine learning processes, and we use Kafka for persistent distribution of our data among regions, and among availability zones.

Emily: What would you say is your biggest technical challenge? And by this, I mean, like, something that you're perhaps still working on, you don't feel like you've totally figured it out yet.

Iftah: I think we are very advanced in our matching, the similarity problem. That is something that we think is a pillar of our superiority in this field, but it's a never-ending story. The ability to detect relevant anomalies in the behavior of the crowd is something that we work very hard on, and we expect a lot from these technologies because they have the potential to help us mitigate threats which are new to us; zero-hour threats of modus operandi, of MOs that we did not encounter earlier. These are the main issues.

One issue that is mundane and prosaic is the cost of transaction. We do a lot of processing and we start, in our scale, to feel the heat of the cost of serving all these transactions. Nothing that will take us out of the Cloud, but it's something that we need to work hard on. Last, but definitely not least, is security. We think we turned our emphasis on security to our unfair advantage in this field, but still, hardening your systems and thinking about the possible attack vectors on your systems and on your merchant’s system is something that I lose sleep at night over, and it is something that we can never say that we are done with.

Emily: What do you think about the security-speed trade-off? Do you think it's real? Or do you think you can move just as fast and be secure?

Iftah: We can move with negligible sacrifices for the security. Again, if you are talking about real-time systems where the microseconds count, then it's a different story. 
But for us, having everything encrypted both at rest and at motion is something that does not need to come at the expense of security. What is very interesting in this trade-offs of security and speed is the trade off, not of the real-time speed and the processing speed, but of the engineering development speed. And here, the magic is in the automation, every security aspect, and with your ability to mask all these security aspects from your engineers, and giving them the right APIs so they can develop the application itself. Which is developed to our domain in the same speed, while still being totally secured without them needing to take care of the plumbing. And that's something that we invested a lot in, and it's a never-ending game. I think we're good at it, but never good enough.

Emily: And do you rely primarily on the Cloud service providers? So, on AWS’s native services, or do you tend to find additional out-of-the-box services, or build your own? Do you have, sort of, a philosophy on that?

Iftah: You know, philosophy is one thing, and then what you're doing in practice sometimes need to be traded off with reality. But we are currently running on both AWS and starting to run on Azure, so we are making our processes agnostic to the particular cloud that we run on. It is interesting to do when you come to security configuration because you need to create abstraction layers over the particular security mechanisms in AWS and Azure, which are quite different. And that's where we are now. So, we are moving to be totally agnostic. So far, we did use occasionally, not—we weren't a heavy users of AWS services, but we did use a analytic databases; we did use Kinesis, but we moved now to Kafka, and so on. And we did use very cloud-specific queues, but we're moving out of this now.

Emily: Why do you think it's important to be cloud-agnostic?

Iftah: Because we run on two different clouds. We run on two clouds because of the very high availability requirement that we have. 
First, we need to be totally available to our merchants. Second, we need not only to be totally available to merchants, we also need to be very, very accurate, always.

So, it's not that I can degrade gracefully and say, “Okay, I always answer approve in certain occasions,” because the fraudsters will very quickly understand that. So, we need to be with full brain capacity, always on. And if we are not, we started within tens of minutes or a hour or two, to be very susceptible to great losses. So, that's the reason we need to be with multiple regions, and we need to be with both clouds. It does take a heavy penalty, and we do think about how to reduce the penalty of working with two clouds, but that's what we currently do.

Emily: Can you tell me how much your technology stack has changed since 2014?

Iftah: We did change a lot in the representation of the world, and this was big. We did move into a Elastic from more traditional NoSQL and SQL [00:33:20 unintelligible] BMS. And we move now, again, to new high throughput databases for our browsing events, the ones that do get hundreds of thousands of events per second. And we do move slowly [00:33:40 unintelligible] many more items or small stack items like queues, and data distribution channels that are no longer serving us well as we scale out, and as we move to being cloud-agnostic. For example, we move now our analytics database from AWS’s Redshift to a cloud-agnostic database.

Emily: Excellent. I'm going to wrap up pretty soon; this has been really interesting. But a couple, sort of, last questions I wanted to ask. One is, can you describe what a day looks like for you? What does the day for the CTO of a Forter, of a cloud-based SaaS fraud prevention company—what do you actually do?

Iftah: First, I am looking at what may endanger our business in the next year and in the next three years. The reason why we are [00:34:42 unintelligible], we call this process internally, the ‘what can kill us?’ process. 
Is mainly because we are in a good shape, and when you're in a good shape you need to look at the threats and how to protect the business from them, and what new business you need to do.

Then I'm looking at the health of our precision teams. And our precision teams are both the data science team, the cyber R&D teams, the fraud researchers teams, and the engineering teams that are supporting them. All these are—we need to see that we maintain our superiority. We so far never lost a QC or a bakeoff on any performance issues, and it's a tall order to keep it that way. So, this is the second task that keeps me up.

And the last is to see that, indeed we have all what we need in order to enable the speed of development and for the new products. Companies in their seventh year, as we are, are in an inflection point between the startup and the enterprise, and that's where you need to make sure that we stay agile. We stay agile, it depends on the agility of the organization. How can you scale? Or do you rely on several heroes? And the agility of the development itself that relies, to a great degree, on the tech debt kept low enough.

Emily: Fabulous. And what is a tool or platform that you think is sort of essential to functioning?

Iftah: I think that we built a very robust, extensive monitoring and alerting infrastructure, and this monitoring and alerting infrastructure enables us to understand quickly whether something has happened in the world. And I must say that most of the time that something is happening in the world, it's not something that we need to do something about manually, but sometimes it's something that the merchants need to do. We discovered, for example, that one of our online travel agencies customers started to issue flight tickets for one percent of their price; for three and four bucks instead of four hundred bucks. 
And we detected it not by looking at the prices, but by seeing spikes of purchases from this OTA in Malaysia and Vietnam, and we were able to tell this to our merchant, to the customer, and the whole thing was rectified about 14 minutes from the time it started. So, our alerting and monitoring systems, which is both on the application level, on the business level on them, and on the other end, on the machine levels, this is very, very important, and pays for itself handsomely.

Emily: I think I've read accounts in the newspaper of travel agents, or airlines having that type of mistake.

Iftah: Yes.

Emily: It tends to get some publicity. Last question is, how can listeners connect with you or follow you?

Iftah: Well, iftah@forter.com. Look us up in forter.com, and we will be very happy to talk to you.

Emily: Excellent. Thank you so much, Iftah, this was really fascinating.

Iftah: Thank you very much for having me.

Emily: Thanks for listening. I hope you’ve learned just a little bit more about The Business of Cloud Native. If you’d like to connect with me or learn more about my positioning services, look me up on LinkedIn: I’m Emily Omier—that’s O-M-I-E-R—or visit my website which is emilyomier.com. Thank you, and until next time.

Announcer: This has been a HumblePod production. Stay humble.
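
Sketch: a deadline-bounded join

In the interview, Iftah describes many sub-processes running in parallel, joined by logic that decides whether the information gathered so far is enough to answer within about half a second, even when some sub-processes are unavailable. The following is a minimal Python sketch of that idea only. It uses `asyncio` rather than Apache Storm, and every check name, score, threshold, and the fallback policy is a hypothetical placeholder, not something taken from Forter's system.

```python
import asyncio

# Hypothetical sub-checks: names, delays, and risk scores are invented.
async def device_check(tx):
    await asyncio.sleep(0.01)
    return 0.1

async def address_check(tx):
    await asyncio.sleep(0.02)
    return 0.2

async def slow_enrichment(tx):
    await asyncio.sleep(5)  # stands in for a sub-process that is not responding
    return 0.9

async def decide(tx, checks, budget_s=0.5, min_results=2, threshold=0.5):
    """Run all checks in parallel and decide with whatever finished in time."""
    tasks = {asyncio.create_task(check(tx)): check.__name__ for check in checks}
    done, pending = await asyncio.wait(list(tasks), timeout=budget_s)

    # Stop waiting for stragglers once the time budget is spent.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    # Keep only the checks that completed without raising.
    scores = {tasks[t]: t.result() for t in done if t.exception() is None}

    # Assumed placeholder policy: too little evidence means decline.
    if len(scores) < min_results:
        return "decline", scores

    risk = sum(scores.values()) / len(scores)
    return ("approve" if risk < threshold else "decline"), scores

if __name__ == "__main__":
    checks = [device_check, address_check, slow_enrichment]
    print(asyncio.run(decide({"id": 1}, checks)))
```

A real system would weigh which checks are missing rather than simply counting them; the averaging here only keeps the example short.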