Analyzing Billions of Transactions to Understand Consumer Behavior - Michael Babineau and Kevin Hale

Nov 3, 2024

Mike: Kevin was a group partner when you did YC in the summer 2015 batch. What idea did you apply with?

Kevin: Our basic idea at the time was really to use credit card data to help investors make better investment decisions. That is actually not far from what we do today. The main evolution is that now we work with companies as well, not just investors. But a big part of the idea is not just to look at credit card data, find interesting things, and tell investors about them, but instead to build an analytics platform, put it in front of investors, and let them answer their own questions.

Mike: What led you to coming up with that idea?

Kevin: Oh, that is a good question. I did not come from an investing background; I don't come from that at all. I actually worked in video games, and the same is true for my co-founder Lillian. She and I met at Electronic Arts; we worked together there and then at another gaming startup. Before that, I was in ad tech. We've always been software engineers; we've always been in the tech world. But we have plenty of friends in finance. One of those friends just, out of the blue, called me one day and was like, "Mike, I need your help. I've got two terabytes of data on a hard drive. How do I load this into Excel?"

Kevin: That was one of those moments where, again, as a software engineer, I was like, "Yeah, you know, I hear it. I get this question." Like, "God, why? Why are you asking me this?" So he's in New York; I'm in the Bay Area; it's the middle of the afternoon. I'm like, "Why am I fielding this?" I wasn't feeling particularly helpful. I asked, "What did your engineer do? Ask your dev team!" I just hear silence.

Kevin: Then he says, "Mike, what are you talking about? We've got an IT guy, and that's it!" That blew my mind because he was at a thirty billion dollar hedge fund. I just assumed that all hedge funds looked like Two Sigma or Renaissance, places that have hundreds of quants and hundreds of engineers. But in reality, most hedge funds have a handful of analysts and just some back office support. They don't have any coders in-house.

Kevin: I think that's when we realized there's this huge opportunity because investors—they make money off of having an information edge, off of knowing things that other people don't. A lot of people who work at hedge funds are very clever. They're looking for this edge wherever they can find it. Over recent years, increasingly, they’ve been looking at things like Google Trends to see, "Oh, is there some leading indicator in search terms that would indicate some bigger shift in consumer sentiment about, I don't know, some company?" Very unsophisticated sort of analysis—yeah.

Kevin: But at the same time, it's a clever idea, and oftentimes it works. You've got investors subscribing to things like comScore, looking at how many visits to a website are happening because that roughly correlates with actual sales. It's also a nice leading indicator in the sense that public companies only report metrics once a quarter, and not right at the end of the quarter; it's actually sometime afterward. If you can see how many people visited amazon.com over the past quarter, you can look at the full quarter of information and see how well that correlates with the resulting reported performance.

Mike: So, how'd you go from helping someone with a two terabyte Excel problem and working on video games to being like, "Okay, it's now time for us to quit our jobs and solve this problem"? Because what were you doing there?

Kevin: Yeah, so we are not video game programmers. We were working at a video game company, but my specialty was building high-scale infrastructure. Lillian's specialty is building data pipeline and analytics teams. When you look at the video game space, a company like Zynga epitomizes this. They're very metrics-driven, very data-driven.

Kevin: One of the things they did very well is optimize. When you think about an online game and what you want to optimize for, you want to actually find this balance. If your game is too hard, no one's going to play it; if the game is too easy, no one's going to play it either. So you have to find this balance.

Kevin: If you have an online game, then you have this amazing leg up over games that are pressed to discs and then shipped out because you can update them. The best way to tell if your game is too hard or too easy is to simply look at how far people make it into the game. For instance, you could look at how many players make it from level 1 to level 2. If there's a severe drop-off—if not enough people do it, then it's a signal that maybe we need to tweak this.
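The level-to-level drop-off check described here can be sketched in a few lines of Python. This is only an illustration: the event format (a player ID paired with the level that player passed) and the function names are assumptions, not anything described in the interview.

```python
from collections import defaultdict


def level_reach_counts(events):
    """Count distinct players who passed each level.

    `events` is an iterable of (player_id, level_passed) tuples,
    a hypothetical shape for the tracked game events.
    """
    players_by_level = defaultdict(set)
    for player_id, level in events:
        players_by_level[level].add(player_id)
    return {level: len(players) for level, players in sorted(players_by_level.items())}


def drop_off_rates(reach_counts):
    """Fraction of players lost between each consecutive pair of levels."""
    levels = sorted(reach_counts)
    rates = {}
    for prev, nxt in zip(levels, levels[1:]):
        rates[(prev, nxt)] = 1 - reach_counts[nxt] / reach_counts[prev]
    return rates


# Made-up events: four players pass level 1, only one continues.
events = [("a", 1), ("b", 1), ("c", 1), ("d", 1), ("a", 2), ("a", 3)]
counts = level_reach_counts(events)
print(counts)                  # {1: 4, 2: 1, 3: 1}
print(drop_off_rates(counts))  # {(1, 2): 0.75, (2, 3): 0.0}
```

A 75% loss between level 1 and level 2 in this toy data is the kind of severe drop-off that would signal a level needs tweaking.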

Kevin: Of course, the person who needs to answer that question is a game designer, and usually, game designers aren't writing SQL. You've got all these metrics—you're tracking all these events. Like, “Oh, a player passed level one,” or “a player died,” or whatever. All of these events are being tracked, just like a standard sort of analytics pipeline.

Kevin: You instrument your application, you have all these events streaming out, you store them somewhere, you do some sort of processing on them, and then you dump them into a place where you can query. But then you've got these people who typically aren’t coders, like a game designer or a product manager; they want to answer questions about how people are behaving in the game.

Kevin: You basically have two paths at this point. If a game designer says, "How many people made it to level two?" as somebody on the data team, you can say, “Okay, let me run that report for you.” You go and you query it, put together the results, and send it back. They look and say, "Oh, that's great! How many people made it to level three?" You roll your eyes and think, "I see where this is going."

Kevin: At that point, you're like, "Okay, I have a choice. I can either play this go-between and keep fetching data over and over again, or I can build tools." If I build a tool and hand that tool to the designer, I can say, "Here—answer this yourself." Then I can focus on doing much cooler things and much more interesting things.

Kevin: Also, I’m out of the way. I'm no longer in the way of this person answering their own questions. This is exactly what we did in the video game space, and this is a pattern that we recognized could be really useful in the investment space. You’ve got all these investment analysts, and they know so much about the companies they’re making investment decisions on. They know what questions they want to answer.

Mike: This is fascinating; you guys were building tools for understanding how to improve video games. How does that become the skill set needed to provide financial analytics software and insights to people who run hedge funds and investment firms, or even to do corporate competitive tracking?

Kevin: I think it comes down to what fundamental problem is being solved. The fundamental problem is that you've got somebody who probably isn't a coder, and they want to answer a question about behavioral data.

Mike: How did you decide what the first product was going to be?

Kevin: For us, it was really digging in further to understand what types of data investors were most interested in. What we found is that transactional data—specifically credit card transaction data—is one of the things they were really excited about, but they were banging their heads against it.

Kevin: This is fundamentally transactional data. Credit card transaction data is a messy data set with unstructured data problems baked into it. The skill sets of investors, even the more technical ones, tend to lean more toward time series analysis as opposed to dealing with large, messy data sets.

Mike: What kind of questions were investors interested in from that data set?

Kevin: One of the main things is just, "How is Chipotle doing?" They famously had a food poisoning incident a couple of years ago. Investors wanted to know what the impact was on their actual revenue.

Mike: How come there was no way to answer this before you guys came onto the scene?

Kevin: One of the interesting things is that there actually was a way to answer it—it was just a terrible path. The way to answer this before was with a survey. You go to some market research company and say, "Hey, there's this Chipotle food poisoning thing. Can you help me understand how many people stopped going to Chipotle?" They just have to try to find a bunch of people that match a demographic and then hope these people answer accurately.

Kevin: It takes weeks or months; it costs tens or hundreds of thousands of dollars. You end up with this tiny sample of, you know, "Oh good, we got a hundred respondents, and twenty of them said they considered stopping their Chipotle dining altogether."

Mike: So what do you guys do instead?

Kevin: For us, because we have direct observations of millions of US consumers, we see all their purchases. We can just look it up right away. In fact, we don't even need to look it up—we can just give you a tool, and then you can find the answer yourself.

Mike: You got your first customers during YC. How did you go about getting them?

Kevin: That's a good question. I have to think back; this is 2015. Our very first customer was a VC who also ended up investing in us. I think one of the things that was interesting is that we got to cheat a little bit because we were in YC. VC firms are always excited to talk to YC companies. They're trying to figure out who is in the batch and then trying to invest before demo day.

Kevin: We had a whole bunch of these funny meetings where we were trying to get in front of them to pitch them on a product. They were happy to take the meeting because they wanted to hear about what we did. It ends up being this dual purpose thing where they’re like, "Okay, show me the product. Now tell me your business model." And you're like, "Well, would you like to buy the product?" Fortunately, a lot of times, the answer ended up being yes.

Kevin: Most of the VCs here in the Bay Area are now our customers, but it was really interesting navigating those early conversations.

Mike: What were they excited about? Because with credit card data, there are some things that it's really good at showing and identifying, and some things that are not so good. For example, it tends to be great for predicting consumer trends.

Kevin: Basically, you just have to keep in mind what it is we're actually seeing. What we're seeing is spending for a large proportion of US consumers. So if you want to understand a company that doesn't target consumers, or doesn't specifically target US consumers, or more specifically doesn't sell things directly to them, like General Mills, which goes through grocery stores, then those are things we can't help with.

Kevin: Things like Uber, Lyft, all the meal delivery services—we can track those. You're not going to see B2B enterprise companies, but it tends to be that lots of people are interested in consumer stuff because they're the fastest-growing and most interesting segment.

Mike: Are they using it as a market sizing tool?

Kevin: Probably the primary use case among VCs is actually diligence. When you think about it, put yourself in the shoes of a venture capitalist. Somebody comes in and pitches you, and they show some numbers on some slides. You're like, “Okay, great! But I have lots of follow-on questions. Do I try to get these numbers from you?”

Kevin: Additionally, there are a whole bunch of questions I have about your market, which you may not even know the answers to. A good example of this would be if you're a VC. Somebody comes in and pitches, and they're showing you this perfect hockey stick chart. You're like, "This is amazing! I've never seen growth like this before!"

Kevin: At the same time, though, you've heard of other companies; you know Lime is out there. You want to pick the best one.

Mike: Exactly! Are you talking to number one or number two? And also fundamentally—if Bird is showing good unit economics, is that best in class? Could it be even better?

Kevin: This is one of the key areas where we help VCs. We give them visibility not just into the company they're talking to but into their competitors as well. We can say, "Oh yeah, Bird and Lime; here's where Bird is winning, and here's where Lime is winning. Here are the differences in how well those customers perform—how much they spend, and so on."

Mike: When you say "unit economics," how do you uncover that data?

Kevin: Obviously, we don’t see unit economics. We only see the spending side, so you could say, “An average Bird customer spends forty dollars a week, versus a Lime customer that might spend twenty.”

Mike: What other metrics are you able to show? I was always impressed when looking into the dashboard inside of Second Measure about how I could see how much revenue is being pulled but also things like cohorts, lifetime value, etc. What metrics get investors super excited?

Kevin: Let's take a step back and think about the main problems we're trying to solve. One is generally focused on company performance. This includes things like competitive intelligence and benchmarking.

Kevin: For example, "Show me, I don't know, what is the relative market share of the various meal kit players?"

Kevin: "How long do their customers stick around? How much do they spend over time? What are the lifetime sales after twelve months?"

Kevin: Again, if we split those into different cohorts, are newer cohorts performing better or worse than older cohorts? There are all of these things in and around company performance, and then separately there's stuff around consumer behavior.
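One way to picture the cohort comparison is the following Python sketch. It groups customers by the month of their first purchase and averages each cohort's spend over its first twelve months. The transaction shape, the function names, and the numbers are all illustrative assumptions; the interview does not describe how the product computes this.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month


def cohort_lifetime_sales(transactions, horizon_months=12):
    """Average spend per customer in the first `horizon_months` months,
    grouped by the month of each customer's first purchase.

    `transactions` is a list of (customer_id, date, amount) tuples.
    """
    first_month = {}
    for customer, day, _ in transactions:
        m = month_index(day)
        if customer not in first_month or m < first_month[customer]:
            first_month[customer] = m

    spend = defaultdict(float)  # customer -> spend inside the horizon
    for customer, day, amount in transactions:
        if month_index(day) - first_month[customer] < horizon_months:
            spend[customer] += amount

    by_cohort = defaultdict(list)
    for customer, total in spend.items():
        by_cohort[first_month[customer]].append(total)

    result = {}
    for m, totals in sorted(by_cohort.items()):
        year, month_zero_based = divmod(m - 1, 12)
        result[f"{year}-{month_zero_based + 1:02d}"] = sum(totals) / len(totals)
    return result


# Made-up transactions for three customers in two cohorts.
transactions = [
    ("a", date(2018, 1, 5), 60.0),
    ("a", date(2018, 2, 5), 60.0),
    ("b", date(2018, 1, 20), 40.0),
    ("c", date(2018, 6, 1), 70.0),
    ("a", date(2019, 3, 1), 60.0),  # outside a's 12-month horizon
]
print(cohort_lifetime_sales(transactions))  # {'2018-01': 80.0, '2018-06': 70.0}
```

Comparing the values across cohort keys is what answers whether newer cohorts perform better or worse than older ones.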

Kevin: These are things like, "Where else do my customers shop?"—things intended to help you get a better picture of who your customers are. This really helps you hone in on who your best customer is.

Kevin: I'm saying "you," but really it could be you, a competitor, or a company you're doing diligence on.

Mike: What are some good examples of that? Your blog is basically just this; it's like insights.

Kevin: Yeah, it's interesting because our core product is really about empowerment. It's about saying, "Hey, you as a user can answer whatever questions you want within this space of US consumer spending."

Kevin: But we don't sell research.

Mike: Oh, so you don't answer questions for people directly?

Kevin: We’ll do it on a project-by-project basis, but we're not the ones coming up with the questions. If somebody comes to us and says, "I have this specific question. I tried this in your application, and I can't quite answer it yet. I have this more specific question—can it be answered?" Those are cases where we can do a one-off research project, and those are paid projects.

Kevin: But the thing we don't do is we don’t proactively do research and go out and call up ten of our clients to sell it to them.

Mike: What are some things that you’ve put on the blog recently that are your favorite?

Kevin: So we’ve started this one thing: if we talk about our blog, we also need to talk about our press mentions. We actually work with the press a ton. We keep getting quoted in, like, the Wall Street Journal, Financial Times, etc. This has been great for us. It's great for the reporters too because they're trying to write about like the upcoming potential Lyft IPO or whatever, and they want to support their reporting with more information.

Kevin: We can help provide that information. We're happy to do so. The Uber and Lyft thing is a recurring topic, and so in our blog, we've decided, "You know what, we're just going to keep publishing periodic updates on this."

Mike: So when you choose a question about Uber versus Lyft, do you guys come up with the initial questions and then listen to what the press are asking, or is it all—wait, you guys are coming up with it?

Kevin: I'd say it is us always coming up with it. We actually have a dedicated editorial team.

Kevin: We literally have a team of data scientists and writers who just pay attention to what's going on in the news—what's going on with companies that could potentially be interesting to others. The person who runs it has a journalistic background. This is their core focus, finding interesting things to write about and then writing about them.

Mike: So let’s talk about some examples. Before we started recording, one you mentioned was Stitch Fix and where the customers of Stitch Fix do and do not maybe spend.

Kevin: This is a really interesting thing. One recurring question we heard about Stitch Fix was, "Is Stitch Fix cannibalizing department store sales? Are they competitive with department stores?"

Kevin: We decided to dig in. We had no idea what the answer was. We decided to attack the problem by looking at people spending at department stores before and after they became a Stitch Fix customer.
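The before-and-after comparison described here could look roughly like the Python sketch below. It sums each customer's spend in one merchant category during a window before and a window after their signup date, then averages across customers. The 180-day window, the category label, the data shapes, and the numbers are assumptions made for illustration; the interview does not say how the actual study was set up.

```python
from datetime import date, timedelta


def before_after_spend(signup_dates, transactions, category, window_days=180):
    """Average per-customer spend in `category` before vs. after signup.

    `signup_dates` maps customer_id -> signup date.
    `transactions` is a list of (customer_id, date, category, amount).
    Returns (average_before, average_after) over all signed-up customers.
    """
    window = timedelta(days=window_days)
    before = {c: 0.0 for c in signup_dates}
    after = {c: 0.0 for c in signup_dates}
    for customer, day, cat, amount in transactions:
        if cat != category or customer not in signup_dates:
            continue  # ignore other categories and non-subscribers
        signup = signup_dates[customer]
        if signup - window <= day < signup:
            before[customer] += amount
        elif signup <= day < signup + window:
            after[customer] += amount
    n = len(signup_dates)
    return sum(before.values()) / n, sum(after.values()) / n


# Made-up data: two subscribers, one non-subscriber ("c").
signup_dates = {"a": date(2018, 6, 1), "b": date(2018, 7, 1)}
transactions = [
    ("a", date(2018, 4, 1), "department_store", 100.0),
    ("a", date(2018, 8, 1), "department_store", 110.0),
    ("b", date(2018, 5, 15), "department_store", 50.0),
    ("b", date(2018, 9, 1), "department_store", 40.0),
    ("a", date(2018, 8, 2), "grocery", 30.0),
    ("c", date(2018, 8, 2), "department_store", 99.0),
]
print(before_after_spend(signup_dates, transactions, "department_store"))
# (75.0, 75.0)
```

A real analysis would also need a comparison group and seasonality adjustments, since spending shifts over time for reasons unrelated to signing up.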

Kevin: What we found is that Stitch Fix had no impact on department store spend. People just started spending more on clothes—period.

Kevin: In fact, the people who are Stitch Fix's best customers actually spent even more on clothes after becoming a Stitch Fix customer than before.

Mike: Oh! So Stitch Fix inspired them to go out and find more clothes!

Kevin: Yeah, one way to characterize it is that it piques their interest in fashion. They don't spend any less; it probably jumpstarts an interest in variety. They're like, "Oh, I'm introduced to a variety of stuff that I never would have considered beforehand."

Kevin: Now, when I'm out there looking at stuff, I go, "Oh, there's more that might appeal to me because I’ve been exposed to them."

Mike: The key thing is that it's not displacing their spend and that was a real surprise.

Kevin: That's also a really important question to answer because if you're at a department store, trying to figure out, "Is this Stitch Fix friend or foe?"—this really points more to friend.

Mike: So do you actively track the rise and fall of brands? I'm wondering if there must be instances of certain things being swapped out.

Mike: In a recent post, you had Peloton memberships pulling ahead of SoulCycle, which is really interesting. Are there trades happening that you can follow?

Kevin: Sorry, when you say "trades," do you mean people? Like, are people signing up for Peloton instead of SoulCycle?

Mike: Yeah!

Kevin: Again, this is something we will attack from an editorial perspective, but our core business is about putting a product in front of our clients through which they can answer their own questions.

Kevin: Now on the blog side, the Peloton and SoulCycle story is super interesting. Peloton is a beast. After we came out with this article, SoulCycle basically put out a nice non-denial denial statement.

Kevin: They said, "We don't know what they're talking about. Our numbers are great," but didn't actually dispute the metrics.

Mike: To give some context, what did your blog post say, and then why wasn't SoulCycle nervous about it?

Kevin: The short version is that Peloton has now surpassed SoulCycle in terms of the number of active members, measured on a spending basis. Active Peloton members on a monthly basis have surpassed the number of SoulCycle active riders on a monthly basis.

Mike: Is there an overlap? Like a Venn diagram of people who used to be SoulCycle and have switched to Peloton?

Kevin: There is both a current overlap and a kind of Sankey diagram type thing of people who used to be one and now are another.
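The two views mentioned here, current overlap and flows between brands, reduce to set operations once you have the active customers of each brand in two periods. The Python sketch below uses generic brands A and B with made-up customer IDs; the definition of "switcher" (active only at one brand earlier, active only at the other now) is an assumption chosen for the example.

```python
def overlap_and_switchers(a_then, b_then, a_now, b_now):
    """Compare two brands' active customer sets across two periods.

    Returns (overlap_now, a_to_b, b_to_a):
      overlap_now: customers active at both brands in the current period
      a_to_b: customers who were only at A earlier and are only at B now
      b_to_a: customers who were only at B earlier and are only at A now
    """
    overlap_now = a_now & b_now
    a_to_b = (a_then - b_then) & (b_now - a_now)
    b_to_a = (b_then - a_then) & (a_now - b_now)
    return overlap_now, a_to_b, b_to_a


# Made-up active-customer sets for an earlier and a current period.
a_then, b_then = {"u1", "u2", "u3"}, {"u4"}
a_now, b_now = {"u1", "u3"}, {"u2", "u3", "u4"}

overlap_now, a_to_b, b_to_a = overlap_and_switchers(a_then, b_then, a_now, b_now)
print(overlap_now)  # {'u3'}
print(a_to_b)       # {'u2'}
print(b_to_a)       # set()
```

The sizes of the two switcher sets are the flow widths a Sankey diagram would draw between the brands.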

Mike: Whoa. Have you been following how AmazonBasics has developed their products?

Kevin: I am generally familiar with it. I'd say for us that's not something we have a lot of visibility into because we just see Amazon—general Amazon.

Mike: But you've done some research about Amazon Prime people, yes?

Kevin: Yeah, we did! This is a case where we did a much deeper dive, and we actually gave several talks on this. This was spearheaded, again, by our editorial team.

Kevin: One of our data scientists, Brandon, he dug into Amazon's customer base and specifically wanted to understand really the differences in behavior between Amazon Prime members and non-Prime members.

Kevin: He wanted to understand how that's changed over time and really how important Amazon Prime members are to Amazon.

Mike: What was one of the interesting takeaways?

Kevin: One of the interesting takeaways is that increasingly Amazon is looking more and more like a subscription business. They're increasingly reliant on Amazon Prime customers for their revenue.

Kevin: Then another interesting thing is that even people who become Amazon Prime subscribers and then lapse, who are no longer subscribers, are still spending more on Amazon than they did before.

Mike: How do you get to that conclusion? What was the evidence that showed that Amazon is more focused on subscribers?

Kevin: I would characterize it not as them being more focused on subscribers, but instead that an increasing proportion of their revenue is derived from people who are Amazon Prime subscribers.

Mike: Gotcha, so it’s turning out that Amazon's most valuable revenue seems to come from Amazon Prime subscribers.

Kevin: Yes, and we don’t know why.

Mike: But like obvious things, people can select "Prime" for example.

Kevin: Exactly! It’s just like, "Hey, they already pay for this membership, so they might as well use it."

Kevin: When they’re ordering and buying stuff, it's like, "I’ve already paid for the membership. It’s like it costs something."

Mike: When it comes to product development on your side, are you incorporating this data in any way, or is it just talking to your users and developing products from there?

Kevin: So when we think about improving a product, we have a few different streams for really feeding the backlog. One is internally driven, and this is based on where we know we want to take our application. It also factors in us going out and proactively speaking with our own customers—doing user research and really digging into their use cases and figuring out where the gaps are and then attacking those.

Kevin: That's one. Another is, I mentioned earlier, we do some custom research for customers. This is like, think of it as a professional services approach. This is something that also helps feed our backlog because if we see recurring requests, then this is probably something we should add to our product.

Kevin: Lastly, we have the editorial side, which for us is the best form of dogfooding. We're out there using our app to answer a question. If we find that we hit a wall—we can’t go any further and have to go to the data behind it to answer the question—it's a great signal that this is something we should probably build.

Mike: One thing that's interesting to me is that we just recently spoke to Jake Klamka at Insight Data Science, and I feel like data scientists, hearing about your company, think, "This seems like a dream job! I work on interesting problems and questions." Even if it's with your editorial board figuring that stuff out, it seems fascinating to me. Every problem is going to be kind of different.

Mike: We put that out there, whether solving it for your customers or stuff to promote the company. How do you look at finding—because you guys are hiring right now, right?

Kevin: Yes!

Mike: How do you find good data scientists? What traits are you looking for that you know are going to be a good fit for this kind of nebulous work?

Kevin: It's such a good question. I feel like "data scientists" is such an overloaded term. For us specifically, what we're looking for are people who are scientists—with a capital S—who have very strong quantitative backgrounds and can understand from first principles the problems they’re trying to solve.

Kevin: Very frequently, what you find are people interested in data science; they learn a lot of the tools but maybe skip over the fundamentals.

Mike: When you say "able to think from first principles," this is something I hear as a common theme also for people we’re looking for to be good engineers or product managers, etc. What does that mean exactly?

Kevin: Let’s think about it this way. We have about a third of our company with PhDs. Most of the team is technical, and there's about an even split between engineers and data scientists. On the data side, you’ll find people with backgrounds ranging from statistical genetics to cognitive neuroscience to string theory to Earth science to climate science—really all over the place.

Kevin: The common theme, though, is that all of them are extremely good at statistics. In our opinion, everything is built on that statistical foundation. We believe if you come in with that strong foundation, the tools can be taught.

Kevin: We can help people get on board with using Python. Like, "Okay, cool! You've only used R; that's fine. We can help you switch over to Jupyter notebooks." But the thing we’re not going to teach you is how to do math.

Mike: How does that translate into first principles?

Mike: Because I usually think of it as someone who's willing to challenge. Like, I will give someone a task, and sometimes they will come back and ask, "Actually, can we just dive in? What's the reason behind this task?"

Mike: They might be able to say, "Oh! I think I can improve the question we need to be looking into instead."

Kevin: I think a lot of this ties in with the nature of the types of problems we're trying to solve. There's no playbook of best practices for dealing with the problems associated with transactional data.

Kevin: There's no playbook for building an analytics platform focused on consumer spending behavior. A lot of the things we're doing, we're doing for the first time, and in some cases they're simply being done for the first time. So, we benefit from people who can approach these big, nebulous, and open-ended problems and come in and figure out how to structure and decompose the problem and tackle it piece by piece.

Mike: Do you train for that, or do you just hope that they have it? What is the test?

Kevin: This is less something we train people to do and more something we hire for in the hiring process. We've taken great care in designing and actually iterating on our interview process. There's a significant technical evaluation where we're trying to test for exactly these types of things for a data scientist.

Kevin: One of the things we do is we actually give them a big messy data set and say, "Do some research. It's open-ended. Tell us what you were looking for and tell us what you found."

Mike: What are some common mistakes that people make that end up not working out? What are some things that really great employees and applicants have been able to do?

Kevin: I’d say the number one mistake that people make is assuming the data is perfect. They assume what we give them is easy. They think, "Oh, all I have to do is load it into whatever—into pandas, throw it on a database, run queries, get the answers, and throw it into a slide." It never really works.

Kevin: This isn’t how data in our world works. There are always dragons somewhere. A big part of this exercise is, "How diligent were you in looking for dragons?"

Kevin: It's also about anticipating these types of problems. You don't necessarily need to solve all of them, but you need to be aware of them because they can distort your findings. If you identify them, even if some of your findings are invalid, and you can say, "I found this thing; I made this deliberate simplifying assumption to complete it in a reasonable amount of time," that's fine.

Mike: So the good people, what they're good at is not starting from their own assumptions, but actually trying to query and figure out what were the assumptions they’re working with.

Kevin: Yes, exactly! So once you have that, it helps you understand how strong or how weak your ultimate conclusion is going to be as a result. It’s like building a house. If you were to hire a construction crew to come out and build a house, and they just came out on-site and started erecting walls, and then they hand over the keys, you slam the front door, and the whole thing falls over because it was on a shaky foundation—clearly, they failed.

Kevin: For us, we like to find people who really want to understand the foundation they're working with to make sure it will be sound when they build the house.

Mike: I’ve never done a project involving credit card data, but then I use these tools like Mint, and it consistently classifies things as the wrong thing. Can you explain to me why this stuff is not normalized?

Kevin: I think the easiest place to start is, think about your last credit card statement. Think about a time where you've looked at your credit card statement, and you saw a transaction on there, and it says something like "S Bucks" or “MW space San Carlos,” which would be Men's Wearhouse San Carlos.

Kevin: It doesn't say Men's Wearhouse; it doesn't say Starbucks. It says something which if you squint at it and scratch your head a little bit, you as a human can figure out what it is.

Kevin: The problem is that there are many different companies all putting in some piece, and the fundamental problem here is that a human decided how to represent that store in a credit card statement. They're working within the constraint of limited space and have to type something in that communicates to a human that, yes, you were at Walmart and that you don’t dispute the charge.

Kevin: But it was never designed for a machine to read. The result of this is that you end up with this cardinality problem. You end up with many different variants for a single merchant, and part of our job is to find all the variants and map them back to that singular merchant.
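To make the variant-mapping problem concrete, here is a deliberately simple Python sketch: normalize the statement text, then look it up against a small alias table. The alias table, the normalization rules, and the function names are all hypothetical, and a rule-based lookup like this is a stand-in for illustration, not the approach the company describes using.

```python
import re

# Hypothetical alias table: normalized prefix -> canonical merchant name.
ALIASES = {
    "S BUCKS": "Starbucks",
    "STARBUCKS": "Starbucks",
    "MW": "Men's Wearhouse",
    "MENS WEARHOUSE": "Men's Wearhouse",
}


def normalize(description):
    """Uppercase, drop apostrophes, and strip digits and punctuation."""
    text = description.upper().replace("'", "")
    text = re.sub(r"[^A-Z ]", " ", text)        # digits, '#', '*' become spaces
    return re.sub(r"\s+", " ", text).strip()    # collapse repeated spaces


def resolve_merchant(description):
    """Return the canonical merchant for a statement line, or None."""
    text = normalize(description)
    # Check longer aliases first so specific names win over short codes.
    for alias in sorted(ALIASES, key=len, reverse=True):
        if text == alias or text.startswith(alias + " "):
            return ALIASES[alias]
    return None


print(resolve_merchant("S BUCKS #1234"))                  # Starbucks
print(resolve_merchant("MW San Carlos"))                  # Men's Wearhouse
print(resolve_merchant("Men's Wearhouse 0042 San Jose"))  # Men's Wearhouse
print(resolve_merchant("Unknown Shop 17"))                # None
```

String rules alone break down when two different merchants produce identical statement text, which is why a lookup like this can only be a first pass.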

Mike: So you're saying there are multiple text strings associated with Men's Wearhouse in San Jose?

Kevin: Correct! Within our dataset, we have over 50 billion transactions, yet there are one billion unique transaction descriptions. And I'll tell you what, there are not one billion merchants in the US.

Kevin: For instance, Macy's has over three million different representations.

Mike: What?!

Kevin: Yeah, I’m just kind of baffled that it was never like, "Hey Macy's, you’re store number 1200." There are basically two layers of problems. One is that, you know, there’s a human layer where different humans can represent things differently. They can even typo the name of their own company, which happens.

Kevin: The second problem is that there are various perturbations that can take place in the processing.

Mike: I think part of it was like the corrections had to happen by users of Mint?

Kevin: Yes, and I think humans don’t want to correct that data. I also think that people get frustrated; it's like, "This is the 50th time I had to correct this." As a result, they no longer want to correct this anymore because it's just not good.

Kevin: The problem actually is that all of them are so different, and so humans are giving up on the classification when really it's like, "Actually, I have such limited incentive to classify my own."

Kevin: Really, people don’t care how much.

Mike: The problem gets even worse.

Kevin: Surprise!

Mike: I don't want to know.

Kevin: If Amazon was all classified in one category, that would not be good.

Kevin: If you’re coming into this with a software mindset, right, you’re thinking, “Oh yeah, there should be some unique identifier for Blue Apron.” But if you actually just look at all the Blue Apron transactions, you’ll find out that there’s actually more than one Blue Apron.

Kevin: Did you know that there’s a Blue Apron grocery store?

Mike: Really? That’s very confusing!

Kevin: Yeah, and in some cases they show up exactly the same on your credit card statement.

Mike: How much time are you guys spending cleaning up data? Is it like perpetual and non-stop?

Kevin: So we don’t think of it as fundamentally human, though there are human elements to it. Really, it’s something where we try to use machine-based approaches as a giant lever. I guess we think of it this way: we’ve basically had to build two different products.

Kevin: One is this pipeline that ingests raw transactional data and then outputs something useful. The things we do in that process are things like this entity resolution, which is what we were just talking about with merchants.
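
To make the entity-resolution step concrete, here is a minimal sketch of mapping descriptor variants back to one merchant. It is not Second Measure's pipeline: the alias table, the patterns, and the function names are all hypothetical, and a real system would need far more than regular expressions.

```python
import re
from typing import Optional

# Hypothetical alias table: canonical merchant -> patterns seen in descriptors.
MERCHANT_PATTERNS = {
    "Macy's": [r"\bmacy ?s\b"],
    "Men's Wearhouse": [r"\bmen ?s wearhouse\b"],
}

def normalize(descriptor: str) -> str:
    """Lowercase, drop apostrophes, turn digits/punctuation into spaces."""
    text = descriptor.lower().replace("'", "")
    text = re.sub(r"[^a-z ]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def resolve_merchant(descriptor: str) -> Optional[str]:
    """Return the canonical merchant for a raw descriptor, or None if unknown."""
    text = normalize(descriptor)
    for merchant, patterns in MERCHANT_PATTERNS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return merchant
    return None

print(resolve_merchant("MACY'S #1200 SAN JOSE CA"))
print(resolve_merchant("MACY*S 0042 NEW YORK NY"))
```

String matching alone cannot handle the Blue Apron case mentioned earlier, where two different merchants share an identical descriptor; that needs extra signals beyond the text itself.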

Kevin: But it also includes other things, like figuring out the location for an Uber transaction. It says San Francisco, but not all Uber rides are in the greatest city.

Mike: Right!

Kevin: So we look at other transactions around it, and we can say, “Oh, maybe this originated somewhere else.”

Kevin: We figure out the location of the purchaser based on, you know, where their other purchases are, and that lets us fill in the gaps. So we say, “Oh, you know what, ignore this location for Uber, and instead use this computed location.”
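
The location fill-in described here can be sketched roughly as follows. This is an illustration under assumptions, not the actual method: the record fields, the three-day window, the majority vote, and the set of merchants with unreliable locations are all invented for the example.

```python
from collections import Counter
from datetime import date, timedelta

# Assumption: merchants whose reported city reflects headquarters, not the purchase.
UNRELIABLE_LOCATION_MERCHANTS = {"Uber"}

def infer_city(txn, history, window_days=3):
    """Return the reported city, or a computed one for unreliable merchants."""
    if txn["merchant"] not in UNRELIABLE_LOCATION_MERCHANTS:
        return txn["city"]
    window = timedelta(days=window_days)
    # Cities of the same purchaser's other, trustworthy transactions nearby in time.
    nearby = [
        t["city"]
        for t in history
        if t["merchant"] not in UNRELIABLE_LOCATION_MERCHANTS
        and abs(t["date"] - txn["date"]) <= window
    ]
    if not nearby:
        return txn["city"]  # nothing better to go on
    return Counter(nearby).most_common(1)[0][0]

history = [
    {"merchant": "Coffee Shop", "city": "Chicago", "date": date(2024, 5, 1)},
    {"merchant": "Grocery", "city": "Chicago", "date": date(2024, 5, 2)},
    {"merchant": "Bookstore", "city": "Austin", "date": date(2024, 3, 1)},
]
ride = {"merchant": "Uber", "city": "San Francisco", "date": date(2024, 5, 2)}
print(infer_city(ride, history))
```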

Kevin: There are other things that we needed to solve. Then there is this whole other thing around de-biasing. We basically have this longitudinal study going on. We have this panel of consumers, and obviously, it's not going to be a perfectly representative sample of the US.

Kevin: So we endeavor to figure out all the ways in which it isn’t representative and then apply corrections to ensure that whatever results you get represent the greater population.
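
The interview does not say which corrections are applied. One common technique for this kind of panel bias is post-stratification weighting, sketched below as a stand-in; the strata, the population shares, and the spend figures are made up for illustration.

```python
from collections import Counter

def poststratification_weights(panel_strata, population_share):
    """Weight each stratum by population share divided by panel share."""
    counts = Counter(panel_strata)
    n = len(panel_strata)
    return {s: population_share[s] / (counts[s] / n) for s in counts}

def weighted_mean(values, strata, weights):
    """Average of values where each panelist counts by their stratum weight."""
    total_weight = sum(weights[s] for s in strata)
    return sum(v * weights[s] for v, s in zip(values, strata)) / total_weight

# Hypothetical panel: urban consumers are over-represented (75% vs. 50%).
strata = ["urban", "urban", "urban", "rural"]
spend = [100.0, 100.0, 100.0, 20.0]
weights = poststratification_weights(strata, {"urban": 0.5, "rural": 0.5})

print(sum(spend) / len(spend))               # raw panel average
print(weighted_mean(spend, strata, weights)) # corrected toward the population
```

The raw average overstates spend because the panel skews urban; reweighting pulls the estimate toward what an evenly split population would show.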

Kevin: Anyway, so that’s one thing we’re building. We have 10 to 15 people working on that. But then we also have our analytics platform; think of it as the hyper-specialized Tableau, where we’ve built in lots of different analyses that operate on this clean dataset that the pipeline outputs.

Mike: One growing set of customers for you guys is corporations doing this for competitive analysis. How did that come up?

Mike: Why is that? I mean, I can see why it would be interesting to them, but I'm just wondering, are they looking at questions very differently when they're looking at your platform to answer them?

Kevin: I think this is a really interesting journey for us. We started out building a platform that was focused on helping investors understand company performance.

Kevin: YC hammers in that you need to focus—it's better to have something a small number of people love than something that many people just like. We took that to heart. We didn’t want to work with companies for a long time because we were afraid it would spread out our focus.

Kevin: One thing that changed our thinking was this: there’s a book from Clayton Christensen. He’s a professor at HBS, and he wrote “The Innovator’s Dilemma.” More recently, he published a book called “Competing Against Luck.”

Kevin: In it, he talks about the theory of jobs to be done. The basic premise is that when you’re thinking about substitutes for your product, you shouldn’t be thinking about things that just look similar. Instead, you need to be thinking about fundamentally what is the job that your customer is hiring your product to do?

Kevin: This, I guess, changed the way we thought about focus. We had been thinking, "Oh, investors, investors, investors!" But in truth, there are many different use cases for investors.

Kevin: A fundamental discretionary hedge fund, think of it as a group of analysts who are working in Excel, trying to figure out, "Is Stitch Fix poised for growth in the long term?"

Kevin: They have a very different use case from a quant investor who has a purely systematic strategy and is trying to trade daily, weekly, or even just quarter to quarter based on where they think companies are likely to beat or miss relative to expectations.

Kevin: These are different use cases. Once we thought of one of our core use cases as helping people understand company performance, we began to understand that companies also want to know how their competitors are doing.

Kevin: We had a really convenient way into this because we were working with so many VCs. They were actually bringing our product into the boardroom, showing it to their portfolio companies, and then the CEO would raise their hand and say, "Wait, how do I get that?"

Mike: It's an interesting sales strategy.

Mike: Yeah, I think maybe you could speak to that a little bit more because there are so many YC companies, and oftentimes people think, "YC's just consumer?" Very much not true.

Mike: Or "YC's just software?" Also not true. How do you guys think about your sales process?

Kevin: This is an area of focus for us now. We were fortunate to have a ton of virality, which is a funny thing to talk about in the context of really enterprise sales.

Kevin: We actually haven’t done any outbound sales yet. We have a hundred fifty clients; every single one of them came to us through inbound.

Kevin: Basically, somebody signed up, and then they told their friend about us. Their friend reached out, loved what they saw, signed up too, and told their friends, and on and on.

Kevin: It’s a box of Secrets!

Kevin: To me, it’s just like, “Hey! I have this thing, and it lets me see stuff that I’ve never been able to see!”

Kevin: This is a very remarkable thing that’s easy to spread around.

Kevin: So, exactly, everyone knows that Uber is bigger than Lyft, but by how much? We can actually quantify it, and it’s a lot of fun.

Kevin: For certain people, it sort of unlocks a new way of doing their job, so it’s become like table stakes, and that’s been great for us.

Kevin: But now, you know, we just raised our Series A. That was led by Bessemer and co-led by Goldman Sachs.

Kevin: Then we also had participation from Citi.

Mike: Goldman Sachs is such an interesting partner or investor to be leading a round. Why were they super excited?

Kevin: I think we fall into this general category. When you're talking about the investment world, we fall into the category of companies generally known as alternative data companies.

Kevin: Alternative data basically refers to anything that can help you understand how companies are performing that isn't just their traditional reported fundamentals, stock prices, or things like that.

Kevin: Collectively, this refers to credit card data, satellite imagery, web traffic data, geolocation data from mobile devices, and so on.

Kevin: Goldman Sachs has made a significant push into the alternative data space. They had not made an investment in any company dealing with credit card data, and so we’re their horse in that race, if you will.

Kevin: They’ve been phenomenal.

Kevin: Here in the Bay Area, there’s a lot of focus on working with big traditional VCs, but we’ve actually had tremendous success working with less expected players out here.

Kevin: Our seed round was actually led by Jefferies, another investment bank. One thing we've found to be true for both Jefferies and Goldman is that they are extraordinarily well-connected in New York City and the East Coast—not just with investors but also with companies. They’re investment banks, so they've been tremendous in helping us get in front of more of the types of clients we want.

Kevin: Now for Citi, of course, they have a ton of transactional data, and this is something they feel internally. All the things I described about messy transactional data—they understand.

Mike: It’s odd to me that they wouldn't have a handle on this already themselves.

Kevin: It's a really hard problem. I can understand why they would say, "Why is everyone else’s so bad?"

Kevin: It's not that everyone else is so bad. It’s that people are focused on solving specific problems. I wouldn't say that Mint is terrible at identifying and understanding transactions.

Kevin: They're just good at different things because they're focused on solving a different problem. Mint.com is trying to provide a best guess as to what this transaction is.

Kevin: But they’re trying to do it for all the transactions.

Kevin: We flipped that problem upside down; we say, "We don’t care about most transactions. We only care about the 5,000 or so companies that we track and are growing."

Kevin: We care about that, and we can’t be wrong. If we’re wrong, someone is going to lose millions of dollars.

Kevin: The constraints actually helped make it much easier, as opposed to having to focus on everything.

Mike: It makes the problem tractable.

Kevin: Yes, and because we’re focused on that, what we’re discovering is that, oh, this could help, you know, this type of company find new customers.

Kevin: It’s a company that sells to other businesses, and they want to find fast-growing companies to sell to. This has been one of the interesting parts of our journey: discovering, really by accident, all of these additional use cases that we didn’t anticipate.

Mike: One thing that's tricky, and it's probably one of those great problems to have as a company, is when you're people's secret weapon and then you become table stakes, like, "Hey, if we want to stay ahead of the game, I have to be on Bloomberg."

Mike: Bloomberg is a good example, like, "Oh, as a trader, I have to sign up for Bloomberg to use this." I think Second Measure might easily become that category as well for a lot of investors.

Mike: I feel like the tricky part is, if all of a sudden everyone is using you, how do you develop the product? How do you keep it interesting so that people stay on board instead of jumping ship or trying to find some other solution?

Kevin: This is a really good point. In particular, for the investment audience, right? Because investors are looking—they make money off of information edge. They make money off of knowing things that other people don’t.

Kevin: This actually informed a lot about how we tackled this problem. We could have very easily focused on selling insights or signals to hedge funds, where we say, "Oh, here are the most interesting trading signals, and we send those out."

Kevin: But as we add more and more customers, then the value to each one becomes significantly diluted.

Kevin: So we took the view that, because there’s no single owner of transactional data, there's no way to control how many people have access to it.

Kevin: So why not just assume everybody's going to have access to it one day and then focus on building a tool to help people answer more creative questions?

Kevin: In our view, even if everybody has access to the same data, if they simply focus on asking better questions, they'll still find their own edge.

Kevin: Now that’s for the investment community, though. On the corporate side, you're right, that would be delightful: every major company saying, "We have to use this for competitive analysis."

Mike: I mean, like if the worst-case scenario is you were Bloomberg, you’d be okay, right?

Kevin: Alright! Awesome! Mike, thanks for coming in!

Mike: Definitely, thank you!
