Feb. 7, 2022

John List on How to Make Good Ideas Great & Great Ideas Scale

John List is the Kenneth C. Griffin Distinguished Service Professor in Economics at the University of Chicago. His research focuses on questions in microeconomics, with a particular emphasis on using field experiments to address both positive and normative issues. For decades his field experimental research has focused on issues related to the inner-workings of markets, the effects of various incentives schemes on market equilibria and allocations, how behavioral economics can augment the standard economic model, on early childhood education and interventions, and most recently on the gender earnings gap in the gig economy (using evidence from rideshare drivers).

Transcript

Bob Hahn:

Hello, and welcome to the Technology Policy Institute’s podcast, Two Think Minimum. I’m your host, Bob Hahn. Today is January 31st, 2022, and I’ll be speaking with John List, who is the Kenneth C. Griffin Distinguished Service Professor in Economics at the University of Chicago. We’ll be talking about his new book, The Voltage Effect: How to Make Good Ideas Great and Great Ideas Scale. John, welcome to Two Think Minimum.

John List:

Thanks so much, Bob, and thanks for having me.

Bob Hahn:

So, my mother used to ask me… Actually, my dad used to ask me this, “If you’re so smart, why aren’t you rich?” And it seems to me that you’ve actually written a book that might either get you rich or the people who read it rich. What was your motivation in writing this book?

John List:

<Laugh> Well, I’m certainly not rich, and you don’t write these kinds of books to get rich. Yeah, it’s certainly not a get rich quick scheme, but for me it was more about in academia we have this notion that it’s “pearls before swine.” And what I mean by that is doing the original research and doing the original discovery is akin to pearls. And then after that comes the sludge, or the swine. What do you do with it? And my entire career, you know, starting in the late eighties, early nineties doing field experiments, I would explore different ideas and programs, use the world as my lab and figure out how the world works, and then I would go to the next problem. 

And around 2014, I got slapped in the face by policymakers, and I was slapped in the face as I was telling them about a great program that I had helped to create called the Chicago Heights Early Childhood Center, which is a pre-K school for three-, four- and five-year-olds that we started in a suburb just outside of Chicago called Chicago Heights.

So, this was a program that Roland Fryer, Steve Levitt, and myself built from scratch, and it opened up in 2010, and after four years, we started to get great results. So, I naturally said, I’m going to go and tell policymakers about these great pearls that we’ve innovated, we’ve created, and I want to give them our curriculum for free. I want every kid in the world to have the curriculum that we’ve developed at CHECC, and that’s when the slap in the face happened, Bob. 

<laugh> Now, the slap in the face was, “Professor List, this is a great program, but we don’t think it will scale.” 

And I said, “Why?” 

And they responded, “You know, it just doesn’t seem to have the silver bullet.” 

And I said, “Well, what is the silver bullet? And where do I buy a few dozen of them? Because I need them. I want to keep doing great science and having that great science change the world.”

And they basically said, “Well, we don’t really know, but every time an academic or a researcher brings us a great idea and we try to scale it, it never meets expectations. It’s always much, much less optimistic in the end when we roll it out than what you told us it was going to be.” 

And then I said, “Well, why do you think that is?”

And they said, “You know, we really don’t know, but there’s a new field called implementation science that basically started after President George W. Bush put out a call for more evidence-based policy and to think harder about the scale up problem. But, you know, they kind of tell us it’s because of fidelity, but we’re not really sure.” 

So, that’s when I said, “Well, I’m going to roll up my sleeves, and the back nine of my career is going to be to try to add science to scaling.”

Bob Hahn:

Okay. So, just for our listeners, you were a star in a field that I think you helped create, or were the pioneer in: field experiments. Could you spend a minute or two talking about what they are, or maybe just giving an example, maybe out of the education example you just mentioned or whatever you want, and how they’re relevant for the questions you’re addressing in this book?

John List:

Absolutely. So, the way I want you to think about field experiments is in some way moving from the lab, and you can say, “Well, what’s the lab in social sciences?” Well, the lab is, you have a laboratory on campus, maybe it’s a classroom, and you bring in 30 sophomores and you put 15 of them in treatment and 15 in control, and you examine how they behave and how their behavioral differences are correlated with your treatment. That’s what social scientists, in particular economists and psychologists, call a lab experiment. 

So, we started doing field experiments in the early nineties and, you’re right, I was sort of on an island by myself. A lot of people told me I was dumb for doing it, and it didn’t make sense, but my idea was we need to somehow add naturalness to this laboratory environment, and it could come in many forms. It could come in, “Why not explore how the real actors behave in these experiments? Why not put naturalness around the actual environment in these particular experiments?” 

So, now you can think about slowly moving from the lab to the field. There are a lot of different kinds of field experiments. The ones that we might be most familiar with would be… I used to be the Chief Economist at Uber, and we rolled out tipping in the app in the summer of 2017. My group, which was called the Ubernomics team, my group was responsible for testing, “What are the best ways to show the tipping choices in the app?” 

So, what we did is we took a million people and showed them, here’s what you can tip, $0, $1 or $2. Then the next million people got 0%, 5%, 10%. That’s just a way to explore the ask string. And of course when the world is your lab now, with that in hand, you can see, well, gosh, you could really vary a lot of different things and in doing so, you can learn, not only if you’re in the policy world, what can work in policymaking, what can work in the business world. So, essentially a field experiment is using the world as your lab and randomly putting people into different groups and exploring or analyzing how their behavior changes when you change the treatment.
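The random assignment described here can be sketched in a few lines of Python. This is only an illustration of the general idea, not Uber’s actual system: the arm names, the `assign_arm` and `run_assignment` helpers, the equal split, and the sample size are all assumptions made for the example. The two tip menus are the ones mentioned in the conversation.

```python
import random
from collections import Counter

# Tip menus from the two examples in the conversation.
# The arm names are hypothetical labels for this sketch.
ARMS = {
    "dollar_menu": ["$0", "$1", "$2"],
    "percent_menu": ["0%", "5%", "10%"],
}
ARM_NAMES = sorted(ARMS)


def assign_arm(rng: random.Random) -> str:
    """Assign one rider to an arm, each arm being equally likely."""
    return rng.choice(ARM_NAMES)


def run_assignment(n_riders: int, seed: int = 0) -> Counter:
    """Randomly assign n_riders to arms and count how many land in each."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return Counter(assign_arm(rng) for _ in range(n_riders))


counts = run_assignment(10_000)
for arm, n in sorted(counts.items()):
    print(arm, ARMS[arm], n)
```

Because assignment is random, the two groups should look alike on average in everything except the menu they saw, which is what lets a difference in tipping be attributed to the menu.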

Bob Hahn:

Right. So, is this a little bit like A/B testing? What they call A/B testing in marketing?

John List:

Marketing? I think that’s right, but it’s A/B testing on steroids. And what I mean by that is typically in a field experiment, you will start with doing an A/B test. That’s great. And in fact, a lot of the other methodologies like using mounds and mounds of data and beating it up to try to get it to say something causal, those essentially can give you what an A/B test can give you as well, with certain assumptions. I think of a well-done field experiment as one where you start with A/B, and then you dig a little deeper into asking what are the sub-treatments that can not only tell you the effects of causes, but also the causes of effects. So, an idea is I learn that something matters, and then I understand what are the mediation paths and what are the moderation paths that are happening in the real world. Because those help me to determine whether I can generalize those insights or scale them up. So, a full-blown field experiment, in my mind, is you start with what the marketers have done for 50 years, but you use economic theory or previous literatures, or what have you, to explore and measure “the whys” behind what’s happening in your measure.

Bob Hahn:

Makes total sense. A lot of your book focuses on scaling. Can you provide an example maybe where scaling has worked in the real world and maybe another example where it hasn’t worked so well?

John List:

Yeah, absolutely. Absolutely. So, let’s start where it hasn’t because the world is replete with those kinds of examples. And what I talk about in Chapter One is the “Just say no” campaign. And if you are roughly my age, I graduated from high school in 1987, and there was a wonderful First Lady named Nancy Reagan back then who was married to the President, Ronald Reagan. 

And as we all know, First Ladies take it upon themselves to have a mission, and in many cases, these missions are just wonderful chores that they try to carry out to change the world. The First Lady decided to take on the drug use amongst teens. So, she advocated for a program that was called D.A.R.E., and this is basically an educational program that is meant to tell high schoolers or teens, “Just say no to drugs.”

So, back then, I can still remember one of these officials came into my high school and they gave the just say no pitch. And I looked at my teacher and I said, “You know, I don’t use drugs, but I have a lot of friends who do, and there’s no way that this is going to work. No way.” 

He looked back at me and said, “Well, John, maybe you’re right, but they tell us that there’s real data behind this ‘Just say no’ campaign.”

So, as part of my research, I looked. I looked for the data. There actually was a really good study that was carried out in Honolulu and it had 1,777 kids. And in that experiment, and I can’t find errors in that experiment, but this experiment was simply a false positive. And when I say false positive, what I mean is the data were lying.

So, when we set up our experiments, we know that there’s statistical error, and we set alpha, which is the false positive rate, to 5%. So, even if we do everything else right, you know, forget about publication bias and p-hacking and all that stuff, and do everything right, there’s still a 5% chance the data are lying. Now, when you go back and try to replicate that result in Honolulu, nothing happens. Go to LA, doesn’t work. Go to Denver, doesn’t work. Go to New York, doesn’t work. It was simply a false positive that we spent millions of dollars on, and that’s one of the vital signs that I talk about in the book when I say, “Make sure your idea is not a false positive,” because the policy world is just replete with examples of trying to scale something that never had voltage to begin with. Okay. That’s like one example, and you’ll see a ton more in the book, but let’s talk about some good stuff. 
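The 5% false positive rate can be made concrete with a small simulation. This is a hedged sketch, not a reconstruction of the Honolulu study: it assumes a program with no true effect, normally distributed outcomes, 200 people per group, and a two-sided z-test, and the function names are invented for the example. It runs many such no-effect experiments and counts how often the test still comes out “significant.”

```python
import random
from statistics import NormalDist, mean, variance

ALPHA = 0.05  # the false positive rate mentioned in the conversation


def one_null_experiment(rng: random.Random, n_per_group: int = 200) -> bool:
    """Run one experiment where the treatment truly does nothing.

    Returns True if a two-sided z-test still calls the result significant.
    """
    treat = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    ctrl = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    std_err = (variance(treat) / n_per_group + variance(ctrl) / n_per_group) ** 0.5
    z = (mean(treat) - mean(ctrl)) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < ALPHA


def false_positive_rate(n_experiments: int = 2000, seed: int = 1) -> float:
    """Share of no-effect experiments that are flagged as significant."""
    rng = random.Random(seed)
    hits = sum(one_null_experiment(rng) for _ in range(n_experiments))
    return hits / n_experiments


rate = false_positive_rate()
print(f"share of no-effect experiments flagged significant: {rate:.3f}")
```

The printed share should land close to 0.05. Under the same assumptions, and if the tests are independent, a spurious result would also have to clear the 5% bar at each new site, which is why replication in several cities is such a strong filter.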

So, you wanted an example of something that has scaled, and I want to talk about something that has scaled, and it meets my five vital signs that I talk about in the book. So, let’s turn back the clock even further than Nancy Reagan and talk about Jonas Salk. Now, Salk was a great innovator, and he comes up with the polio vaccination, and like any good scientist, he tries it out on his kids first. So, this is a story that is very similar to a lot of other scientists, not only in the hard sciences, but also in my life. I usually try things on my kids first. I have eight kids. I happen to have a set of twins. So, I have a natural control group as well. <Laughs>

So, Salk tries it on his kids. It works. He tries it on a lot of other kids. It works. He finds out that it’s not a false positive. Okay, so that’s vital sign number one. Vital sign number two is to make sure that it works for all of the kids if you want to scale to everyone. So, what they did next is they did all kinds of large-scale experiments on a bunch of kids, and they found that it works for everyone, black, white, brown, et cetera, et cetera, et cetera, SES, doesn’t matter. It works. That’s vital sign number two is understand the slice of the pie that your idea can help. Vital sign number three is where Salk potentially runs into the biggest problem, and that’s how to get it into people’s arms. Now vital sign number three is properties of the situation, and that’s any situational features that you’re going to be confronted with at scale, make sure your idea works with those constraints in place. 

So, now here’s the beauty behind the polio vaccination, because what they did is they opted to leverage the healthcare system, and what I mean by that is any listeners who have had children, you sort of know what happens. Baby comes out. You are exhilarated and the doctors take it away for tests, and they give a few shots to the baby. You bring the baby back in six months, more shots. Bring the baby back in 12 months, more shots. Bring the baby back in 18 months, more shots. That’s a playbook that we’ve developed using science to make sure our children are healthy, and they’ll be healthy for years to come. What they did is they put the polio vaccination in that suite. So, now it’s not very costly for parents to do it. It’s not costly for you to get them in because look, when you have a kid, it’s going to be Hell or high water to stop you from making sure your child receives the healthcare that they deserve. So, you’re going to make your 6- and 12- and 18-month appointments. So, you’re going to get the vaccinations. Great. 

Now vital sign number four is understanding the spillovers from your idea. Now here, the spillovers are great because once you get the polio vaccination, you can’t give it to other kids. So, once you have a certain fraction of kids that get it, especially all the kids, voilà! You’ve solved something really important. 

The last vital sign is the supply side of scaling. So, as a Chicago economist, a lot of times, at this point, people say, “Well, is he really a Chicago economist? He’s just talked about the benefits side or the demand side in four of his vital signs. What happened to the supply side?” Well, this is where the supply side comes in because when you think about the voltage effect, which is I find something on a small scale, and when I scale it up, does it maintain the same benefit-cost profile?

So, the literature, when I started working in implementation science, they focused exclusively on the benefit side, and they tried to explore like voltage drops, which would be you get a great result in the Petri dish. You scale it up, and you get like 1/10th of a standard deviation effect that you got in the Petri dish. In this fifth vital sign, it’s purely about supply side, and here it’s simply about economies of scale and diseconomies of scale. So, this is a bread and butter of, Bob, what you and I always talk about in economics, and if you ever talk to any entrepreneurs, talk to Elon Musk, the secret behind his sauce is usually the supply side. And it’s usually an idea that revolves around economies of scale. So, with the vaccinations, just like any other drug, the big cost right away is R&D.

Okay. That’s a sunk cost already after we have a vaccination that works. So, now you have economies of scale as you produce it. Of course, it’s a lower cost on average for the average person the more that we produce. Usually in this literature, the next cost that comes along is the cost to get it in people’s bodies. So, you think about COVID right now, we’re giving people real money, whether through a lottery or subsidies or what have you to get them to take the COVID vaccination. So, it’s a real expense to get it in people’s arms, the COVID vaccination. Here again, Salk works perfectly because, well, what’s the cost? It’s zero because you are going to be bringing your child in anyway, and you want your child to be taken care of. So, voilà, the child gets a vaccination. So, to me, the polio vaccination is a great poster child for an idea that checks all of the boxes that I talk about in The Voltage Effect, and then it works perfectly.
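The economies-of-scale point, a large up-front R&D cost spread over more and more doses, can be illustrated with a short calculation. The dollar figures and the `average_cost` helper below are hypothetical, chosen only to make the arithmetic easy to follow; they are not estimates for any real vaccine.

```python
def average_cost(fixed_cost: float, marginal_cost: float, quantity: int) -> float:
    """Average cost per unit: fixed cost spread over all units, plus per-unit cost."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return fixed_cost / quantity + marginal_cost


# Hypothetical numbers for illustration only.
R_AND_D = 100_000_000.0   # one-time research and development cost
COST_PER_DOSE = 2.0       # cost to produce one additional dose

for doses in (1_000_000, 10_000_000, 100_000_000):
    print(f"{doses:>11,} doses -> {average_cost(R_AND_D, COST_PER_DOSE, doses):.2f} per dose")
```

With these assumed numbers the average cost falls from 102.00 to 12.00 to 3.00 per dose as output grows a hundredfold, approaching the per-dose production cost. A delivery cost per person, which the conversation treats as roughly zero for polio and substantial for COVID, would simply add to the marginal cost term.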

Bob Hahn:

All right. So, you mentioned… I love the polio vaccine and I’ve taken it, and I also had the whatever it was, the say no to drugs lecture that you had. I think I was in junior high or high school <laugh> and had the same reaction. I was sort of rolling my eyes for an hour. Why are we being subjected to this? 

But I guess you raised the issue of COVID, and since it’s on everybody’s mind and has affected most people’s lives, like you and me, what went wrong there? We’ve got the vaccines and the boosters and whatever, and I guess two questions. We don’t have to spend an hour on COVID, but just focusing on the vaccine, it would seem to me almost a no brainer. You know, the government rolled it out and said “Here it’s for free. You can take, you know, the first one, the second one, the booster, what have you.” What went wrong and how might we do better?

John List:

Yeah, it’s a great question. And I do talk about this in the book. I think that there’s plenty of blame to throw around to folks on both sides of the aisle. When COVID became politicized, it became something that was beyond a public health problem. It became a political problem. And I think anytime that there are important threshold benefits, and what I mean by that is when after a certain fraction of people get immunized, call it a herd effect, then we’re all safe. We still haven’t reached that herd effect. And I think it’s because it became politicized very early. On the rollout side, there hasn’t been a natural way that we can decrease the barriers to get people to go and do it. This isn’t about a child who is born, and once they get the vaccination, it’s good forever.

This is something that there’s a real cost. You have to spend some time, maybe go wait in line, maybe sit in the grocery store like I did for an hour and a half to get my booster shot. So, you might think, well, these are pretty small costs for something that’s so important. It’s true. But as we know from behavioral economics, transportation barriers, other types of subtle or small barriers, end up having large effects, especially when you put them with the political, let’s say ideologies, and you start to wonder, you know, is that really for me? Is that something I should have? And then I think on top of that, you wonder, “Well, what are the long-term effects of this? This is just started. What’s going to happen in 10 years? Will there be something different, or will there be a COVID vaccination response?” That might enter some people’s minds.

I think when you put that all together, you have a person who has enough question marks to where maybe something simple, like a ride to Walmart to get the vaccination, could help. I’m part of a research team right now. I’m the Chief Economist at Lyft right now, and I’m working with a team to try to lower the transportation barriers to get people a free ride to a Walmart or a CVS or wherever, so they can get the vaccination, and this is going to be rolled out soon. So, we’ve had some success using that approach when we’re trying to get people to consume healthier foods. People who live in a food desert, we’ve done a pretty large-scale experiment to try to give them free rides to shop at places that have healthier foods. So, we’re trying that same trick here, just because there are subtle changes that in some cases can lead to large effects.

So, to answer your question, it’s multidimensional, of course, and there are many things that all of us could have done differently. Where I think it was a big success is how fast that vaccination came to market, and I believe it’s safe. When you think about the efficacy tests so far, and you think about the science behind what the vaccination is doing to your body, it feels safe to me. The famous last words, right? But it feels safe to me in a way that a typical drug trial would have the efficacy and safety profile that we’ve come to expect. So, I think that was a major achievement of our economy and of our politicians. Now, getting people to actually take up the vaccination, I think by and large has been a major failure.

Bob Hahn:

So, you mention food and you have this cool example in your book about a restaurant that I used to go to with my wife in Oxford called Jamie’s, and I did notice that it closed down one day, and you have a story about it. Do you want to tell our listeners about Jamie’s? Because it was quite popular and we had to wait on line to get in there when it first opened.

John List:

No, absolutely. So, Jamie Oliver, a wonderful chef, and you’re right. He like so many others had a great idea, and in the Petri dish, or in his case in one restaurant, that great idea worked very well. And when I looked not only at that example, but at a lot of restaurant examples, you have a lot of restaurants that have tried to scale, and it doesn’t work. You have a lot of restaurants that have tried to scale, and it does work. So, you start to ask yourself, why is that the case? And if you look a level deeper than just the initial run of data, what you tend to see is when the secret sauce behind why the restaurant was so great to begin with is the chef, it won’t scale. And the underlying reason why is because humans don’t scale, right? If you’re a unique person, and your idea works because of you and you alone, it won’t scale.

Now the restaurants that do scale are the ones whose secret sauce is literally the secret sauce that can be replicated. Take Domino’s. Domino’s works because you can replicate the ingredients at scale. In fact, you have scale economies there. As you grow more and more Domino’s, it’s not that hard to keep getting the same cheese, the same pepperoni, the same sausage, et cetera, et cetera. And you can do it at a lower cost. So, that’s something that will have high voltage. Those restaurants tend to work. Now you have some tweeners, so to speak. Like some restaurants that you kind of question whether the secret sauce is scalable, and I want you here to think about a brick oven. So, sometimes when you go to an old pizzeria in an old town that maybe the pizzeria in Italy is 500 years old, and you have this great pizza, and you wonder why it’s so great.

In many cases, it’s that brick stove oven, and it’s very difficult to replicate that in other restaurants because there’s some secret sauce behind the ashes and fumes and the musk in that big furnace and chimney that you just can’t replicate. And there are other cases where it can, where you have the cooking element where you can replicate it at scale. Now this falls into my third bin, which I call representativeness of the situation. The situation under which you drew the original data… Can you replicate that situation at scale? And if you can’t, you need to go back to the drawing board and do something that you can replicate. If you can, then we’re in business. We pass that vital sign.

Bob Hahn:

So, I still didn’t quite get the picture on Jamie’s. Why did it close? 

John List:

Oh, it closed because both he and his manager were the secret sauce, and he scaled quickly, right? They had a lot of restaurants. 

Bob Hahn:

Yeah. They were all over England. 

John List:

They were all over the place. But guess what? Jamie and his general manager could not be all over the place because they’re humans, and their secret sauce could not be spread over the restaurants, and guess what started to happen? Lower quality food, lower quality environment, lower quality supply chain because the manager was doing that. Jamie was doing the cooking. Take both of them out of the equation, they’re done. Humans don’t scale.

Bob Hahn:

So, you mentioned the idea of implementation. And years ago, I read a book which was influential at the time that I think Aaron Wildavsky and Jeffrey Pressman wrote, Implementation: How Great Expectations in Washington Are Dashed in Oakland. I probably have it wrong…

John List:

No, you’re right. That’s a wonderful book. Absolutely.

Bob Hahn:

So, here’s what I want to press you on. I mean, you take a very different, a very scientific approach to this whole endeavor, which I think is great, but you’ve worked in the real world. You sat in a windowless office. I sat in a very nice office in the OEOB before you doing similar things. The world is full of politics. And how does politics enter into your scalability equation, particularly because you’re concerned not only about making businesses more efficient, but about making government programs work better?

John List:

A hundred percent, a hundred percent. So, let me talk about first of all, a broader point, and then I want to dig into political will and politician’s pursuits and how my book can speak to them. So, a first broader point is that when you look across the different hats that I have worn, exactly as you’re saying, I worked in the White House for a year and a half. I was the Chief Economist of Uber for two years, Chief Economist of Lyft for four years. I’ve worked with a lot of local governments, lot of organizations. What has been interesting to me is that the same thread that connects these organizations revolves around scaling. They all face the fundamental same problem of, I have an idea. How much should I invest in that idea to scale it? And when I first started reading this literature, what I learned was this is all art.

This is move fast and break things. Throw spaghetti against the wall, whatever sticks, cook it. Fake it till you make it. Of course, she’s now in jail, but you get the point, is that it was art. And I said, look, I have an opportunity here to add science, and to add science to this area, that to date, has been dominated by art. Okay. 

So, now the one piece that’s different between government and a firm is the political will of those inside the government, not only the policymakers themselves, but also after they decide to do a policy. Is it actually rolled out correctly, and is it implemented the way that it’s supposed to be implemented? In a firm, you have that a little bit, but not nearly as much as in government. If a firm has a great idea, the profit motive will tend to get them every time to do it or to do a piece of it.

Where in government, you have these other hurdles. What I noticed in the political will problem is first of all, of course, you know as well as I do that benefit cost analysis is just one piece of the decision making apparatus that we have in government, right? If it’s a new regulatory rule, a new law, et cetera, and it’s economically significant, it has to undergo a formal benefit cost analysis. I did dozens of those, I’m sure you did too. Okay. That’s just one piece of the decision making process. My argument is that it will be a stronger piece if we are more confident in what the benefit profile will look like, and what the cost profile will look like, when we scale it. Because to date, there’s a lot of uncertainty around how the benefit profile or cost profile will look. In fact, we don’t even pay much attention to it. 

In my days, we never even talked about it, but that’s something I think once you get a better grasp of that, you have a lot better feel for at scale, here’s what will happen. So, that’s point number one, I think it’s going to have a bigger seat at the table when we roll it out or when we make the arguments. Okay. Let’s talk about rollout now. In many cases, if you in the Petri dish gave a blue pill and it worked, a lot of times what’s actually rolled out is a red pill. That’s a fidelity problem, and it ends up being the red pill because well, the implementers or administrators didn’t really believe the blue pill would work because they didn’t really see it. So, they did the red pill instead. Well, it looks a little bit like the blue pill, it’s the same size, but we’re going to roll this one out.

Why? Because we think it will work and it’s less costly for us to do it. The other one’s too difficult to do. It’s hard to do. What I found is, if you use the original scientist on the implementation team, you are much more likely to roll out something that was actually done in the Petri dish. Point number two, if the original researcher can document “the whys” behind the program, for example, why it worked. Here’s why CHECC worked. So, for us CHECC worked, we had a parent academy within our Chicago Heights Early Childhood Program. That worked for intact families because they could spend more time with the child. So, now I can tell you a little bit about why our CHECC program works. When you tell the implementers “the whys” behind something, they are more likely to implement it with fidelity. Okay. So, these are just a few features of it. There’s no doubt this is still just one piece of the decision-making process. We have to first overcome political will and get policymakers to listen to us. I think if you can speak a lot more confidently about what scales and why, and then when you get to the implementation phase, what works and here’s why it works, and here’s an implementation scientist or the original scientist to help you out. These are features that have been missing in the process so far. And I think they can help tremendously.

Bob Hahn:

So, you’re basically saying fill in some of the blanks or the linkages for what works and why it works, and it may motivate better policy in the public sector.

John List:

That’s correct.

Bob Hahn:

Okay. Let me ask you a completely different question that just occurred to me in listening to you. You’ve worked with some, what I think of as very successful startups, in Lyft and Uber. I don’t know if they’re making money now, but you know, they’re pretty big. You know, I’ve heard of them, and I’ve got both apps on my iPhone, and my iPhone is ancient. People laugh at me. 

My question relates to how you think about firms. Do you think about them as very sort of efficient machines, you know, as we sort of do in Econ 101 when we model perfect competition and things like that, or do you think of firms as very inefficient, or a whole range, based on your experiences working in the private sector? Part of my motivation is a piece by Herb Simon, who was a very great economist many, many, many years ago. He introduced one of his articles by saying that from 10,000 or 30,000 feet, and this isn’t a direct quote, when you look at General Motors and you look at the government, they’re not that different in a lot of dimensions. So, what’s your take on how efficient firms are and how they can raise their game?

John List:

Yeah, great question. So, there are pieces of firms that are very efficient. And what I mean by that is when there’s an innovative idea for a new strategy, the firms that I have worked with have been extremely efficient to roll that out in a way that both helps the consumer and helps their own bottom line, and when it doesn’t work, or when it begins to get bad signals, they’re quite efficient at bringing it back in. 

And I think the biggest difference between firms and government, and as you noted, I’ve worked in both for years. In many cases, it’s hard for the government to reel back in. And I think the fundamental problem is that governments tend to roll out new ideas in a way that makes it impossible to measure if the idea is working. So, I’m not calling for a full-blown RCT, a randomized control trial here. If you can, that’s great, but I’m not even calling for that. I’m calling for all governments to roll out policies in a manner in which we at least have a chance to measure whether it’s working and then to have a system set up to continuously measure whether the program that has just been scaled is actually working the way we thought it was going to work, and if it doesn’t, to have a plan to adapt. I think that, in the past, we viewed any adaptation, any quitting, as a moral and massive failure.

It’s not. As you know, in economics, entry and exit is proof of a well-functioning marketplace. The fact that we don’t have entry and exit of new ideas in government and new programs in government just tells you how inefficient the entry and exit rates are in government. And it’s because we don’t measure those programs when we roll them out, and we don’t have a plan to measure them. That’s a big difference for a firm. A firm will constantly measure and adapt and take back. Now, the one aspect that they couldn’t take back that I worked on at Uber was tipping. So back in 2017, when we rolled out tipping that summer, we knew that once we put it on the app, it would be impossible to take back. Because once drivers had the right to receive tips, we knew it would be nearly impossible to reel that thing back in.

So, we did a bunch of beta testing up front in different markets, 5% here, 5% there, to make sure that what we were going to roll out nationwide in the end made sense. Now, even when we rolled it out, we did it in an experimental way. We rolled it out. Some entire cities got treatment. Some entire cities got control because we wanted to test when you treated an entire city, what happened to equilibrium wages? What happened to equilibrium service levels? And the only way you could do that is if you randomly allowed some cities to get tipping and other cities not to. And then of course, after six months, we rolled in those other cities. So in the end, everyone got tipping. So, I think firms can be very efficient. Now when I said both, there are some inefficiencies that I have noticed in firms too.

One is along the lines of working and sharing information across a large organization. So, when you think of an organization that has maybe four, five, 10,000 people, one problem you have to overcome is the silo problem. And what I mean by that is at Uber, say you have a marketplace that has 500 people working on refining the marketplace. You have another group working on driver acquisition, another group working on rider acquisition, another group working on diversity, another group working on insurance, et cetera, et cetera, et cetera. The inefficiency that I’ve noticed in firms is that these silos end up attenuating the information flow between groups in a manner that makes it less efficient. Because in many cases, what we learn in one silo can be tried and applied in other silos. And it tends to take some time, if it happens at all. So, that’s the biggest inefficiency that I’ve seen in larger organizations.

Bob Hahn:

So, you raise the issue of organizations in your book. You talk a little bit about the importance of culture. I’m reminded of the Simon and Garfunkel song years ago, like the man ain’t got no culture or whatever <laugh>

How do researchers measure that and link it to sort of concrete outputs? And can you provide an example?

John List:

Yeah, that’s a good question. So, culture is one of those words, like creativity or critical thinking, right? So, you have to take a stand on what it means upfront, and, you know, with critical thinking, I’ve written a paper on critical thinking and what it means. On creativity, I’ve also worked a bit on that. And in each case you have to say, here’s what I mean by creativity or critical thinking. So, with culture, what I mean is on the one hand, people are happy at work and people are getting the most out of themselves and the teams that they’re working in, and the culture promotes that kind of productivity. Okay. So, when I think of culture in the book, I talk about cultures that I see working and some cultures that I see not working. So, the running example in that chapter is about Brazilian fishing villages.

So, in one case you have the Kabuchu villages, and these are villages where you have groups of workers all jumping into a boat and going out on the ocean with really heavy nets. And they have to fish as a team. And what they bring in to the boat and bring back to shore is split amongst the team. Now juxtaposed to that village is a village that’s just inland, and they have a lake that they fish on. And in this village, you have a bunch of soloists. And these soloists all have their own boat. They go out in the lake, they fish. What they catch, they bring back and they take it to their family. If they have extra, they sell it in the market themselves and they take the profits. What we find in doing research along things like cooperation or community or public goods provision, is that in that Kabuchu village, there’s a lot of community spirit. There’s a lot of public goods. In the soloist village, it doesn’t look so much so. 

So, my point is from the very beginning, an organization should think hard about creating an environment that is like the Kabuchu villages. I call it “build your own Kabuchu.” Okay. So, what do you mean by that? What I mean by that is from the very beginning, when you’re hiring people, there can be subtle changes in your job advertisements that lead to pretty big changes in who applies for the job and what wages they receive. So, for example, in some experiments we ran, we put people into two groups and we put out a job ad. In the job ad one group received, it said that wages are negotiable. The other job ad was identical to the first one, but we left that sentence out of the job ad.

What we found was in the first one, women negotiate like mad. They negotiate even more than men, and they’re really good negotiators, and they start with higher wages. In the job ad that did not have that sentence, women shied away from negotiating. In fact, the low-quality men negotiated the most for their wages. So, this led to a wage inequality from the very beginning that can just grow. We find similar results on corporate social responsibility. We started our own firm, hired a bunch of people, and in some advertisements, we advertised that this firm has corporate social responsibility in that the profits are going to be used to blah, blah, blah, blah, blah. And with the other advertisements, we left that out. You get a very different pool of workers. You get a much more diverse and productive pool of workers under the first type of scheme than the second. So, what I’m saying in this chapter is you should do everything with purpose. You should write your job advertisements, you should think about your wages, you should think about your workplace environment with this question in mind: when we scale this up, can it not only work now, but will it also work when we scale? And as it turns out, having a more diverse, more equal pay culture is a culture that’s set up to be more productive and more scalable than, let’s say, a counterfactual of that type of firm or organization.

Bob Hahn:

That makes sense. I think it’s really neat that you can actually bring experiments to bear on words that I can barely understand, like culture. I wanted to ask you a question, this is sort of a personal question in some ways. I sat on a commission for a year or two called the Evidence-Based Policy Commission, and I thought it was a great idea. At a couple of points in your book, and I think in a couple of articles you’ve written, you say something like we should move our attention from evidence-based policy to policy-based evidence. Can you explain to me and the rest of the world what you mean by that and why it’s a good idea?

John List:

Yeah, that’s right. So, thank you for noticing that. You’re right. Look, using data and using causal inference to make decisions is a good thing. Okay. So, when I talk to policymakers about evidence-based policy, first of all, I don’t think you’ll get an agreement on what evidence means. If you talk to 30 policymakers, the word evidence is a bit like creativity and critical thinking and culture, because they all have a different feeling or belief about “What is evidence?” Now, a lot of times, that’s self-serving, right? But I think about evidence as having enough data and information to say something with strong conviction. And that’s why you talk about having a few replications. So, you make sure that this result is not a false positive, or there’s voltage in your idea. Now, a wonderful first step is an efficacy test.

So, it’s great when researchers bring forward data. But most of the time behind the scenes, this is a researcher setting up the best-case scenario, and then they publish it. Because they’re giving the theory its best chance to work. They’re giving their hypothesis the best chance to work. They publish it, and they forget to tell the rest of the world that that was an efficacy test. And then they move on to something else. In medicine, we have natural safeguards set up for that because you have an efficacy test and you have phase one, phase two, phase three. And during those phases, you figure out what are the boundary conditions of this result? What’s the mediation path? Who does it work for? Who doesn’t it work for? What’s the safety profile for different groups? What is the uptake going to be, et cetera? They don’t get everything right.

But they’re a lot closer than we are in our “here’s an efficacy result, let me move on to my next AER paper,” right? That’s a culture that as scientists we’ve built, so this is just scientists responding to the incentives that they’ve been given. And then they write in their paper, my study has policy relevance, right? And that’s partly because editors like me make them say that, and partly because they want to write it because they think other people want to hear it. It doesn’t have policy-based evidence, let’s say, if it’s a simple efficacy test, let’s be honest. Okay. Now, what do I mean when I say policy-based evidence? What I’m talking about is go to scale or go to the target setting and think about what are the constraints at scale, or what are the situational features that I’m trying to give my program or sell my program to.

And I want to take those constraints and bring them back to my original research and test: does my idea still work when those natural constraints or political will or what have you are in place? So, think about CHECC, the Chicago Heights Early Childhood Program. There, when you do it, I had to hire 30 teachers. And as it turns out, my program works if you have good teachers, that’s a non-negotiable. So, a non-negotiable in CHECC is you better have a good teacher in the classroom. Now, it wasn’t that hard to find 30 good teachers. But what happens when I have to find 30,000 good teachers? Am I still going to be able to find 30,000 great ones? No. I can break the budget, but now I have a voltage drop because of the supply side, right? I go up the supply curve. So, if I want to keep the budget the same, I better bring that back to the Petri dish and hire, stratify, and block on marginal teachers and see if CHECC works with marginal teachers.

So, now I’m talking about a sub-treatment under my A/B to say in the efficacy test, it works, but I know I’m going to have to hire a lot of marginal teachers. So, let’s see if this idea works with marginal teachers. If it does, I look at, you know, how much voltage do I lose with marginal teachers versus good ones, and whether it’s still scalable. Now I’m speaking to policymakers because I’m doing policy-based evidence because this is evidence that the policymaker needs. This is what my program is generalizing into, or this is a situation that I want my program to work within its constraints. That’s what I mean by policy-based evidence. And until we do that, there’s no sense in saying in your paper, this has credibility for policymakers because you haven’t explored the space that we care about or that we need. Does that make sense, Bob?

Bob Hahn:

I think you and I are totally on the same page. I just didn’t quite understand the phrase. So, let me simplify this for our listeners and tell me, John, if you buy into this. You really want to know if this thing works in the real world?

John List:

With all the constraints and imperfections of the real world. 

Bob Hahn:

So, I mean, one of the things that turned me off about economics for a while was, you know, I’d read the same articles that you did, but about 20 years before you. You know, in which at the end of the chapter, the conclusion, or the paper, they’d say here’s the policy relevance. And it was laughable based on a very narrow, narrow description of whatever they were doing. And, you’re saying, you know, you’ve got to broaden it, make it robust, resilient, whatever to meet the needs of the decisionmaker in the real world.

John List:

No, a hundred percent. But now what I’m proposing is something that maybe even is a little bit more radical because I want the experimentalist to flip the manner in which they usually do their research and backwards induct from understanding what those imperfections will be at scale, and then exploring those in their A/B and sub-treatments, to make sure that you still have voltage with those imperfections in place. That’s what we don’t do.

Bob Hahn:

So, that would be nice, but the question is, what are the incentives to do it? Can we get the big shots in the profession to change the incentives? So, instead of putting out this sexy new car each time, you know, we actually pursue it.

John List:

So, let me comment on that for a minute. Because there you’re right. So, right now we do that when we explore heterogeneity of people. That’s all the rage in economics. Does it work for men? Does it work for women? Is it WEIRD? They test for heterogeneity. What I’m proposing here is to do the same thing, but with situations. So yeah, it takes donors. It takes a government. It takes journal editors. It takes a group of people to say, “We have a really good start in exploring heterogeneity along the lines of the dimension of the subject pool. Great. But what we don’t do is explore heterogeneity along the lines of the situation pool or the imperfections that we have in the real world.” So, I don’t see that as a big ask. We do that. Now, what it is, it’s to add two or three treatment cells. Where in heterogeneity, what you’re trying to do is you’re trying to sample a broader group of people, so you have power to explore heterogeneities, but it’s sort of along the same lines.

Bob Hahn:

So, I see it as a big shift in the profession. How do you change the incentives to get into the top five journals and things like that? I just don’t see the journals moving there in my lifetime, but let me end, since you’ve been very generous with your time, with one very small problem, which is climate change, and I’m going to simplify it a little bit because we don’t have a week to talk about this. 

Many countries around the world are now introducing electric vehicles and let’s assume [inaudible] that that’s a good idea. And I have misgivings about it, and you may or may not have misgivings about it. How can we bring the tools or your insights about scaling to this problem if the government’s ambition is to get a large fraction of the population, say to be driving around in an EV in 10 years?

John List:

No, that’s right. And we don’t have to debate, because I do have some questions as well, whether the policy makes sense, but let’s pretend it does. I think a first question is this is like the household adoption of technology, right? How should we think about convincing households to adopt a new technology? So, first of all, we need to build the product so it’s affordable under the current prices. And if we don’t, we need to have subsidies. We know that, but also usable. So, in some work with Rob Metcalfe and Christopher Clapp and Alec Brandon and Mike Price, we look at smart thermostats, and smart thermostats, back in the day, were all the rage, right? They were going to put them in every home in the US, and we’re going to save a bunch of energy. So, we end up using and looking at some old power data where households were randomly assigned, so some get the smart thermostat and some don’t.

And we compare our results to what the engineers said was going to happen. So, when the engineers did their estimate on the value of smart thermostats, they’re huge estimates in terms of savings. Now what the engineers assume is that everyone is like Commander Spock, or, you know, like the perfectly rational, I can solve a second-order partial differential equation at the drop of a hat, and I’m never going to make a mistake, and lo and behold, you get huge energy savings.

But you are selling that technology into people like Homer Simpson or like me. I have one of these in my home. What we find in our data is because you’re selling it into a room of Homer Simpsons, you get zero energy savings. So, we did not set up the product with an understanding that the end user is going to have to use it and they can’t undo the presets.

That’s what humans do, is they undid all the good stuff. Because we didn’t build the technology right. So, what I’m asking here is you not only need something that’s affordable. You need something that’s built right. That people can use and get charged when they need it to be charged. And they’re not afraid when they’re driving across Nebraska, that they’re going to get caught, run out, and they’re going to be done, right? Because you need both pecuniary and non-pecuniary incentives to promote adoption and end use. Now, what we found in another set of studies is that behavioral economics can be super useful in generating first adoptions. You know, if you don’t have a CFL in your home, or if you don’t have a battery-powered car, whatever. Those non-pecuniary, let’s say, non-financial incentives like social norming and nudges work reasonably well. But if you want a deeper investment, you need prices. So, in that way, you have behavioral economics and traditional economics working together and complementing one another in this idea of, we want to take on this big problem that we have called climate change. And one culprit is going to be the household. So, we need the household not only to adopt energy-saving vehicles, but also energy-saving technologies for their home, and I think in that case, it’s using the suite of behavioral and non-behavioral interventions on margins that we know it works on.

Bob Hahn:

On that note, I think we’ll have to end. We’ve been talking with John List about his new book, which I very much enjoyed, The Voltage Effect. John, thanks for joining Two Think Minimum.

John List:

Thanks so much for having me, Bob, and I look forward to coming back.