They're here to talk about their book, The Ethical Algorithm. I think a day does not go by, in the news or in our own work, when the subject of algorithmic fairness or privacy is not front-page news. Today we're going to speak to two of the leading lights in that area, and they're going to help us understand what the state of the art is now and what the state of the art will be going forward. With that, I think we will welcome Michael Kearns first to the stage. Michael Kearns and Aaron Roth, welcome to the stage. [applause]

Good morning, and thanks to everyone for coming. My name is Michael Kearns, and with my close friend and colleague Aaron Roth I have coauthored a general-audience book called The Ethical Algorithm; the subtitle is The Science of Socially Aware Algorithm Design. What we want to do for roughly half an hour or so is take you at a high level through some of the major themes of the book, and then we will open it up, as Jeff said, to Q&A.

I think many, many people, and certainly this audience, are well aware that in the past decade or so machine learning has gone from a relatively obscure corner of AI to mainstream news. I would characterize the first half of this decade as the glory period, when all the news reports were positive and we were hearing about all these amazing advances in areas like deep learning, which has applications in speech recognition, image processing, image categorization, and many other areas, and we all enjoyed the great benefits of this technology and the advances that were made. But the last few years or so have been more of a buzzkill. There have been many articles written, and now even some popular books, on essentially the collateral damage caused by algorithmic decision-making, especially algorithmic decision-making powered by AI and machine learning. Here are a few of those books. Weapons of Math Destruction was a big bestseller a few years ago, and it did a good job of making very real and visceral and personal the ways in which algorithmic decision-making can result in discriminatory predictions: gender discrimination, racial discrimination, or the like. Data and Goliath is a well-known book about the fact that we've essentially become subject to a commercial surveillance state, and about the breaches of privacy and the erosion of trust and security that accompany that. Aaron and I have read these books, and we like them very much, and many others like them. But one of the things we found lacking in them, which was much of the motivation for writing our own, is that when you get to the solutions section of these books, what should we do about these problems, the solutions suggested are what we would consider traditional ones. They basically say we need better laws, better regulations, we need watchdog groups, we need to keep an eye on this stuff, and we agree with all of that. But as computer scientists and machine learning researchers working in the field, we also know there's been a movement in the past five to ten years to design algorithms that are better in the first place. So rather than waiting, after the fact, for some predictive model to, let's say, exhibit racial discrimination in criminal sentencing, you could think about making the algorithm better in the first place, and there is now a fairly large scientific community in the machine learning research area and adjacent areas who do exactly that. So our book is, first and foremost, a science book.
We're trying to explain to the reader how you would go about encoding and embedding the social norms that we care about into algorithms themselves.

Now, a couple of preparatory remarks. We got a review on an early draft of the book that basically said, I think your title is a conundrum or possibly even an oxymoron. What do you mean, an ethical algorithm? How can an algorithm be any more ethical than a hammer? This reviewer pointed out that an algorithm, like a hammer, is a tool, a human-designed artifact built for particular purposes, and while it's possible to make unethical use of a hammer, for instance I might decide to hit you on the hand with it, nobody would make the mistake of ascribing any unethical behavior or immoral agency to the hammer itself. If I hit you on the hand with a hammer, you would blame me for it, and you and I would both know that real harm had come to you because of my hitting you on the hand with the hammer. So this reviewer basically said, I don't see why these same arguments don't apply to algorithms. We thought about this for a while, and we decided we disagree. We think that algorithms are different, even though they are indeed tools, human artifacts designed for particular purposes. We think they're different for a couple of reasons. One of them is that it's difficult to predict outcomes and also difficult to ascribe blame, and part of the reason for this is that algorithmic decision-making, when powered by AI and machine learning, is a pipeline. Let me review what that pipeline is.

You usually start off with some perhaps complicated data, complicated in the sense that it's high-dimensional, it has many variables, and it might have many rows. Think of a medical database, for instance, of individual citizens' medical records. We may not understand this data in any detail, and we may not even understand where it came from in the first place; it may have been gathered from many disparate sources. Then the usual pipeline or methodology of machine learning is to take that data and turn it into some sort of optimization problem. We have some objective landscape over the space of models, and we want to find a model that does well on the data in front of us, and usually that objective is primarily, or often exclusively, concerned with predictive accuracy or some notion of utility or profit. There's nothing more natural in the world, if you're a machine learning practitioner, than to take a data set and say, let's find the neural network that on this data makes the fewest mistakes in deciding who to give a loan. So you do that, and what results is some perhaps complicated, high-dimensional model. This is classic clip art from the internet of deep learning: a neural network with many layers between the input and the output and lots of transformations of the data variables. The point is a couple of things about this pipeline. It's very diffuse. If something goes wrong in this pipeline, it might not be easy to pin down the blame. Was it the data, was it the optimization procedure that produced the neural network, or was it the neural network itself? And worse than that, if the algorithm or the predictive model that we use at the end causes real harm to somebody, if you are falsely denied a loan, for instance, because the neural network decided that you should be denied a loan, we may not even be aware that it's happening, in part because we give algorithms so much autonomy these days. To hit you on the hand with a hammer, I have to pick the thing up and hit you.
These days algorithms are running autonomously, without human intervention, so we may not even realize the harms being caused unless we know to explicitly look for them. So our book is about how to make things better, not through regulation and laws but by actually revisiting this pipeline and modifying it in ways that give us the various social norms we care about, like privacy, fairness, accountability, et cetera. One of the interesting and important things about this endeavor is that even though many, many scholarly communities have thought about these social norms before us, philosophers, for instance, have been thinking about them since time immemorial, and lots of people have thought about privacy and the like, they never had to think about these things in such a precise way that you could actually write them into a computer program or into an algorithm. Sometimes just the act of forcing yourself to be that precise can reveal flaws in your intuitions about these concepts that you weren't going to discover any other way, and we will get to concrete examples of that during our presentation. So the whirlwind high-level tour of the book is a series of chapters about different social norms, some of which I've written down here, and what the science looks like of actually going in and giving these things a precise, mathematical definition, then encoding that mathematical definition in an algorithm, and importantly, what the consequences of doing that are, in particular the trade-offs. In general, if I want an algorithm that's more fair or more private, it may come at the cost of less accuracy, for example, and we will talk about this as we go. You will notice I've written these different social norms in increasing shades of gray, and what that roughly represents is our subjective view of how mature the science in each one of these areas is. In particular, we think that privacy is, in relative terms, the most mature: there is what we think is the right definition of data privacy, and quite a bit is known about how to embed that definition in powerful algorithms, including machine learning. Fairness is a little bit lighter; it's a more recent, more nascent field, but it's off to a very good start. Things like accountability and interpretability, or even morality, are in fainter shades because in these cases we feel like there aren't good technical definitions yet, so it's hard even to get started on encoding these things in algorithms. And I promise you that there's a bottom tier which says "the singularity," but it's entirely in white, so you can't even see it. What we're going to do with the rest of our time is talk about privacy and fairness, which cover roughly the first half of the book, and then we will say a few words about the game-theoretic twist that the book takes about midway through. So I'm going to turn it over to Aaron for a bit now.

So as Michael mentioned, privacy is by far the most well-developed field that we talk about, so I want to spend a few minutes just giving you a brief history of the study of data privacy, which is about 20 years old now, and in the process try to go through a case study of how we might think precisely about definitions. It used to be, maybe 20 or 25 years ago, that when people talked about releasing data sets in a way that was privacy-preserving, what they had in mind was anonymization.
I would have some data set of individuals, people's records, and the data set might have people's names in it, and if I wanted to release it, I would try to anonymize the records by removing the names and maybe, if I was careful, other unique identifiers like Social Security numbers, but keep things like age or zip code, features about people that aren't, on their own, enough to uniquely identify anyone.

So in 1997, the state of Massachusetts decided to release a data set that would be useful for medical researchers. Medical data sets are hard for researchers to get their hands on because of privacy concerns, and the state of Massachusetts had an enormous data set of medical records, the records of every state employee in Massachusetts, and they released this data set in a way that was anonymized. There were no names or Social Security numbers, but there were ages, there were zip codes, there were genders. It turns out that although age is not enough to uniquely identify you, zip code is not enough to uniquely identify you, and gender is not enough, in combination they can be, and there was a PhD student named Latanya Sweeney, who was at MIT at the time, who figured this out. In particular, she figured out that you could cross-reference the supposedly anonymized data set with voter registration records, which also had demographic information like zip code and age and gender, together with names. So she cross-referenced the anonymized medical data set with the voter registration records of Cambridge, Massachusetts, and was able, with that triple of identifiers, to identify the medical record of William Weld, who was the governor of Massachusetts at the time. She put those records on the net to make a point.

So this was a big deal in the study of data privacy, and for a long time people tried to attack the problem basically just by applying little band-aids, trying to most directly fix whatever the most recent attack was. So, for example, people noted, all right, it turns out that combinations of zip code and gender and age can uniquely identify someone in a record, so why don't we try coarsening that information: instead of reporting age exactly, maybe we report it only up to an interval of ten years; maybe we only report zip code up to three digits. And we will do this so that we can make sure that any combination of attributes in the table we release doesn't correspond to just one person. So, for example, if I know that my 56-year-old neighbor is a woman who attended some hospital, maybe the Hospital of the University of Pennsylvania, and they release an anonymized data set in this way, they guarantee that I cannot use the attributes I know about my neighbor to pin down a single record; maybe I can only narrow it down to three records, which seems less incriminating, and for a little while people tried doing this. But if you think about it, if you look at the data set, you might already begin to realize that this isn't getting us quite what we need from privacy, because although, if I know that my 56-year-old female neighbor attended the Hospital of the University of Pennsylvania, I can't figure out exactly what her diagnosis is, since her attributes match several records, I can still narrow it down; figuring out that it was, say, either HIV or one other condition might already be something she didn't want me to know. But the problem actually goes much deeper. Suppose that I know she's been a patient not just at one hospital but at two hospitals, and the other hospital has also released records anonymized in the same way, maybe even a little more coarsely, so that my 56-year-old female neighbor matches not just three of its records but even more.
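To make the cross-referencing attack being described concrete, here is a small illustrative sketch with entirely made-up records; the hospitals, the attribute values, and the diagnoses are all hypothetical, and only the mechanics of the linkage matter.

```python
# Illustrative only: two "anonymized" hospital releases with coarsened
# attributes, plus what a nosy neighbor already knows. All records fabricated.

hospital_a = [  # (age band, 3-digit zip prefix, gender, diagnosis)
    ("50-59", "191", "F", "HIV"),
    ("50-59", "191", "F", "flu"),
    ("50-59", "191", "F", "asthma"),
    ("30-39", "191", "M", "diabetes"),
]

hospital_b = [
    ("50-59", "191", "F", "HIV"),
    ("50-59", "191", "F", "fracture"),
    ("20-29", "190", "F", "flu"),
]

# The attacker knows the neighbor is a 56-year-old woman in zip 191xx who was
# a patient at both hospitals.
known = ("50-59", "191", "F")

candidates_a = {r[3] for r in hospital_a if r[:3] == known}  # 3 possibilities
candidates_b = {r[3] for r in hospital_b if r[:3] == known}  # 2 possibilities

# Each release alone leaves some ambiguity; their intersection does not.
print(candidates_a & candidates_b)  # -> {'HIV'}: the diagnosis is exposed
```

Each release on its own keeps the promise that the neighbor matches more than one record, yet the combination pins her down, which is exactly the failure described next.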
If both of these data sets have been released, I can cross-reference them and find the unique record, the only record that can possibly correspond to my neighbor, and all of a sudden I've got her diagnosis. The overall problem here is the same as it was when we just tried removing names: maybe attempts at privacy like this would work if the data set I was releasing were the only thing out there, but that's never the case, and the problem is that small amounts of idiosyncratic information are enough to identify you in ways I can uncover if I can cross-reference the data set you release with all the other stuff that's out there. People tried patching this up as well, but for a long time the history of data privacy was a cat-and-mouse game, where data privacy researchers would try heuristic things, patching up whatever vulnerability led to the most recent attack, and attackers would try new, clever things, and this was a losing game for privacy researchers. Part of the problem is that we were trying to do things we hoped were private without ever really defining what we meant by privacy. This was an approach that was too weak.

Let me, in an attempt to think about what privacy might mean, talk about an approach that is too strong, and then we will find the right answer. So you might say, okay, let's think about what privacy should mean. Maybe, if I'm going to use data sets to conduct, for example, medical studies, what I want is that nobody should be able to learn anything about you as a particular individual that they couldn't have learned about you had the study not been conducted. That would be a very strong notion of privacy if we could promise it. Let me make this more concrete with what's come to be known as the British Doctors Study, a study carried out by Doll and Hill in the 1950s, and it was the first piece of evidence that smoking and lung cancer have a strong association. It's called the British Doctors Study because every doctor in the UK was invited to participate in the study, and two thirds of them did: two thirds of the doctors in the UK agreed to have their medical records included as part of the study. Very quickly, it became apparent that there was a strong association between smoking and lung cancer.

Now imagine you are one of the doctors who participated in the study. Say you're a smoker. This is the 1950s, so you have definitely made no attempt to hide the fact that you're a smoker; you would probably be smoking during this presentation, and everyone knows that you're a smoker. But when the study is published, all of a sudden everyone knows something about you that they didn't know before: in particular, they know that you are at an increased risk for lung cancer, because all of a sudden everyone has learned this new fact about the world, that smoking and lung cancer are correlated. If you're in the US, this might have caused you concrete harm at the time, in the sense that your health insurance rates might have gone up, so this could have caused you concrete, quantifiable harm. So if we were going to say that what privacy means is that nothing new should be learned about you as a result of conducting a study, we would have to call the British Doctors Study a violation of your privacy. But there are a couple of things wrong with that. First of all, observe that the story would have played out in exactly the same way even if you were one of the doctors who decided not to have your data included in the study.
The supposed violation of your privacy in this case, the fact that I learned that you are at higher risk of lung cancer, wasn't something that I learned from your data in particular. I knew you were a smoker before the study was carried out. The violation, if there was one, came about because a fact about the world was learned, that smoking and lung cancer are correlated, and that wasn't your secret to keep. The way we know it wasn't your secret to keep is that I could have discovered it without your data; I could have discovered it from any sufficiently large sample of the population. And if we were going to call things like that violations of privacy, we couldn't do any data analysis at all, because there are always going to be correlations between things that are observable about you and things you didn't want people to know, and I couldn't uncover any correlation in the data at all without committing a privacy violation of this sort. So this was an attempt at thinking about what privacy should mean, at getting a semantics, but it was one that was too strong.

The real breakthrough came in 2006, when a team of mathematical computer scientists had the idea for what is now called differential privacy. The goal of differential privacy is to promise something very similar to what we wanted to promise in the British Doctors Study, but with a slight twist. Again, think about two possible worlds, but now don't compare the world in which the study is carried out to the world in which the study is not carried out; instead, compare the world in which the study is carried out to an alternative world in which the study is carried out but without your data, where your data was removed from the data set. The idea is that we want to assert that in this ideal world, where your data wasn't used at all, there was no privacy violation for you, because we didn't even look at your data. Of course, in the real world your data was used, but if there is no way for me to tell, substantially better than random guessing, whether we're in the real world where your data actually was used or in the idealized world where there was no privacy violation, then we should think of your privacy as having been only minimally violated. That is what differential privacy says: there should be no way to tell the difference, substantially better than random guessing, between the world in which we use your data and the world in which we don't. And "substantially" is something we can quantify; it's a knob we can tune to trade off accuracy with privacy.

So when you think about it, this sounds like a pretty satisfying definition, but you might worry that it, like the definition we attempted with the British Doctors Study, is too strong to allow anything useful to be done. It turns out that's not the case. I won't go through the simple example here unless we have questions about it in the Q&A, but suffice it to say that 15 years of research have shown that essentially any statistical task, any statistical analysis you want to carry out, which includes essentially all of machine learning, can be done with the protections of differential privacy, albeit at a cost that typically manifests itself in the need for more data or in diminished accuracy. And after 15 years of academic work on this topic, in the last few years this has moved from, let's say, the whiteboard to become a real technology. It's become something that's widely deployed.
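For reference, the standard formal statement of the definition being described here (the notation is standard in the differential privacy literature, not something spelled out on the slides): a randomized algorithm M is epsilon-differentially private if, for every pair of data sets D and D' that differ only in one person's record, and for every set S of possible outputs,

\[
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,].
\]

The parameter epsilon is the knob: a small epsilon means the world with your data and the world without it are nearly indistinguishable, while a larger epsilon trades away some of that indistinguishability for accuracy.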
If you have an iPhone, it might, as we speak, be actively reporting statistics back to the mothership in Cupertino, subject to the protections of differential privacy, and Google has tools that report statistics in similar ways. The real moonshot for this technology is going to come in just about a year: the US 2020 Census is going to release all of its statistical products subject to the protections of differential privacy. This is the sense in which we say that, of the topics we talk about in the book, this is the most well-developed. It's not that we understand everything there is to know about differential privacy, but we've got a strong definition that has real meaning, we understand the algorithms you need to satisfy this definition while still doing useful things with data, we understand a lot about what the trade-offs are, and it has become a real technology that is used in practice.

I'm going to give a similar vignette for algorithmic fairness. As I said at the beginning, the study of fairness in algorithmic decision-making is considerably less mature than privacy, and differential privacy in particular, and we already know, even though it's less mature, that it's going to be messier. In our book we argue that anybody who thinks long and hard enough about data privacy will arrive at a definition similar to differential privacy, so in that sense differential privacy really is the definition of data privacy. We already know there's not going to be a single, monolithic right definition of algorithmic fairness. Just in the past few years there have been a couple of publications with the following broad form. They basically say: look, can we all agree that any good definition of fairness should meet the following three mathematical properties? And a sensible reader looks at the three properties and says, yes, of course we would want these; these are weak properties, and I would want these and even stronger ones too. And then the punchline is: guess what, you can prove that there is no definition of fairness that can simultaneously achieve all three of these properties. To make this a little more concrete, in real applications this might mean, for instance, that if you're trying to reduce, let's say, the discriminatory behavior of your algorithm by gender, that might come at the cost of increased discrimination by race. You might face these very difficult moral and conceptual trade-offs. But this is the reality of the way things are, so we still propose proceeding as scientists, carefully studying alternative definitions and what their consequences are.

What I want to do with most of my time is similar to what Aaron did: show you how things can go wrong, this time not with anonymization but with how machine learning can result in things like racial or gender discrimination, and then how that leads to a particular proposal for how one might try to address these sorts of collateral damages. And machine learning really is out there causing these problems: many of you, just in the past few weeks, will have heard of a couple of notable instances, one in which a health assessment model, a predictive model that is widely used in large American hospitals and healthcare systems, was shown to exhibit systematic racial discrimination. And then, perhaps less egregiously, there was a Twitter storm recently over the recently introduced Apple credit card, underwritten by Goldman Sachs. There were a number of reports from married couples in which the husband says, hey, my wife and I file taxes jointly.
She has a higher credit rating than I do, yet I got ten times the credit limit on the Apple Card that she did. Aaron and I spent an hour, about a week ago Friday, in the office of the New York State regulators who are investigating this particular issue, and unlike with the health assessment model, we don't know whether these are just a couple of anecdotal tweets or whether there is systematic underlying gender discrimination in the credit limits that are given. But these are the kinds of concerns we are talking about when we talk about algorithmic fairness.

So, somewhat like Aaron did with medical databases, I want to take you through a toy example of how things can go wrong in building predictive models from data. Let's imagine that Aaron and I were asked by the Penn admissions office to help them develop a predictive model for collegiate success based on two variables: your high school GPA and your SAT score. What I'm showing you is a sample of data points. For each of these green pluses or minuses, the x value represents the high school GPA of a former applicant to Penn and the y value represents the SAT score, and let's say these are individuals who were admitted to Penn, so we know whether they succeeded at Penn or not. By "succeed," pick any quantifiable definition that in hindsight we can objectively measure. One example would be that success means you graduated within five years of matriculating with at least a 3.0 GPA. A different definition would be that you donate at least 10 million dollars to Penn within 20 years of leaving. As long as we can verify it in hindsight, that's fine. So that's what the pluses and minuses mean: for each point we have the GPA and SAT score of the applicant, and the pluses indicate students who succeeded at Penn while the minuses indicate students who didn't.

A couple of things about this cloud of green points. First of all, if you count carefully, you'd see that slightly less than half of these historical admits succeeded; there are slightly more minuses than pluses, by a handful or so. That's observation number one. Observation number two is that if I showed you this cloud of points and asked, could you build a good predictive model from this data that we could use on a forward-going basis to predict whether applicants to Penn will succeed or not, there's a line you could draw through this cloud of points, this blue line, and if we predict that everybody above the blue line will be successful and everybody below it will not, you can see we do a good job. It's not perfect; there are a couple of false accepts here and false rejects down here, but for the most part we are doing a good job. This is, of course, in simplified form, exactly what the entire enterprise of machine learning is about, including things like neural networks: you're trying to find a model, perhaps more complicated than this one, that does a good job of separating positives from negatives.

Now let's suppose that in this same historical applicant pool there was another subpopulation besides the green; let's call them the orange population, and here's their data. I want you to notice a few things about the orange population. First, they are a minority in the literal, mathematical sense: there are many fewer orange points in this historical data set than there are green points. Observation number two is that their data also looks different. It looks like the SAT scores of the orange population are systematically lower, but also note that they are no less qualified for college.
There are exactly as many orange pluses as there are orange minuses, so it's not that the orange population is less successful in college, even though they have systematically lower SAT scores. One reason you might imagine this is the case is that perhaps in this minority orange population there's less wealth. The green population, which is wealthier, can afford SAT preparation courses; they can afford multiple retakes of the exam, taking the max of their scores. The orange population, which is less wealthy and has fewer resources, just does self-study, takes the exam once, and takes what they can get. If we had to build a predictive model for just the orange population, there's again a very good one, in fact a perfect model on the historical data: this line perfectly separates positives from negatives.

So what's the problem? The problem arises, of course, if we look at the combined data set and ask, what's the predictive model that does best on the combined data set? It's again essentially the single model we found for the green population. You can see that visually: if I tried to move this line down in order to catch the orange pluses, I'm going to pick up so many green minuses that the error will increase. So this is the optimal model on the underlying aggregated data, but you can see that it's intuitively very unfair: we have rejected all of the qualified orange applicants. We might call this the false rejection rate: the false rejection rate on the orange population is close to 100 percent, while the false rejection rate on the green population is close to zero percent.

Now, of course, you might say that what we should do is just notice that the orange population has systematically lower SAT scores even though they are not less qualified for college, and we should build a two-part model. It would basically say: if you are green, we're going to apply this line, and if you're orange, we're going to apply this other line. By doing this, compared to the single model on the combined data, we would not only make the model more fair, we would also make it more accurate as well. Now, the problem with this, of course, is that if we think about green and orange as race, for instance, there are many areas of law and regulation that forbid the use of race as an input to the model, and this two-part model has race as an input, because the model says: look at race, and then decide which of these sub-models to apply. And of course these laws and regulations that prevent the use of things like race or gender or other apparently irrelevant variables in algorithmic decision-making were usually enacted to protect minority populations. So here's a concrete example in which regulations that were meant to protect the minority population guarantee that we will harm that minority population if we just do the most sensible machine learning exercise. In the same way that Aaron said definitions of privacy based on anonymization don't make sense, we argue in the book that any attempt to get fairness in algorithmic decision-making by forbidding inputs is fundamentally misguided. What you should do instead is not restrict the inputs to an algorithm but constrain its output behavior in the ways that you want. In particular, one thing you can imagine doing here, even if we were forced to use a single model, is to change my objective function. I would say: look, there are two criteria I care about.
On the one hand, I care about making accurate predictions, minimizing the predictive error of my model. On the other hand, I also care about this other objective, which is fairness, and in this particular application I might define fairness as approximate equality of the false rejection rates. I might say I'm worried about the orange population being mistreated, and the particular type of mistreatment I'm talking about is false rejections: students who would have succeeded but whom our model rejected. I can define a numerical measure, which is: what is the difference between the false rejection rate on the green population and on the orange population? So instead of saying minimize the error on the data set, I could say minimize the error on the data set subject to the constraint that the difference in false rejection rates between the two populations is at most, let's say, zero percent, or I could relax that and say at most five percent or ten percent. And of course, if I let this go all the way to 100 percent disparity, it's as if I'm not asking for fairness at all anymore and I'm back to just minimizing predictive error. In the same way that differential privacy gives you a knob between how strong your privacy demands are versus your accuracy demands, this definition of fairness lets us interpolate between asking for the strongest type of fairness, zero disparity in false rejection rates, and no fairness whatsoever.

Once you're armed with a quantitative definition like that, you can actually plot the quantitative trade-offs you might face in any real application. Here I'm showing you actual numerical plots on three different real data sets in which fairness is a consideration. The x value for each one of these red points is the error of some particular predictive model, and the y value is the unfairness of that model, in the sense of this disparity in false rejection rates between two populations. And of course, smaller is better for both of these; where I'd like to be is in this corner here, where my error is zero and my unfairness is also zero. You can see that on none of these data sets, in real machine learning, even ignoring fairness, are you going to get zero error. But what you see is that we face a numerical trade-off. We can choose to essentially ignore fairness and pick this point up here, which gives us the smallest error; at the other extreme, we can ask for zero unfairness and get a much larger error; and in between, you can get things in between. And we argue in the book that it's important, as a society, that we become quantitative enough that people, even non-technical people, can look at these trade-offs and understand their implications, because we do not propose that we should now apply some algorithm to decide which one of these models we should pick; it really should depend on what's at stake. In particular, there's a big difference in what's at stake in, for instance, medical decision-making, which might have life-or-death consequences, versus the ads that you're shown on Facebook or Google, which many of you may never look at and for the most part ignore. And furthermore, you can see that the shapes of these curves are quite different: for a curve like this one, it's actually possible near the left end of the curve to get big reductions in unfairness for only very small increases in the error, and so that might seem like it's worth it, whereas this one here sort of faces hard trade-offs right from the beginning.
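A minimal sketch of the constrained exercise being described, using synthetic data and simple single-threshold models as stand-ins for the real data sets and models behind the plots; the data-generating process, the threshold grid, and the choice of fairness budgets are all illustrative assumptions, not the authors' actual experiments.

```python
# Illustrative sketch: minimize error subject to a bound (gamma) on the
# difference in false rejection rates between two groups, and sweep gamma
# to trace out an error-vs-unfairness trade-off like the plots described.
import numpy as np

rng = np.random.default_rng(0)

def make_group(n, score_shift):
    # Success depends on an underlying aptitude; the minority group's observed
    # score is shifted down even though they are equally qualified.
    aptitude = rng.normal(0.0, 1.0, n)
    score = aptitude + score_shift + rng.normal(0.0, 0.3, n)
    label = (aptitude > 0).astype(int)          # 1 = would succeed
    return score, label

score_g, y_g = make_group(1000, score_shift=0.0)   # "green" majority
score_o, y_o = make_group(200, score_shift=-1.0)   # "orange" minority
scores = np.concatenate([score_g, score_o])
labels = np.concatenate([y_g, y_o])
group = np.concatenate([np.zeros(1000), np.ones(200)])

def evaluate(threshold):
    accept = scores >= threshold
    error = np.mean(accept != labels)
    def frr(g):  # false rejection rate: qualified applicants who are rejected
        qualified = (group == g) & (labels == 1)
        return np.mean(~accept[qualified])
    return error, abs(frr(0) - frr(1))

candidates = [evaluate(t) for t in np.linspace(-3, 3, 301)]

for gamma in [0.0, 0.05, 0.10, 0.25, 1.0]:       # the fairness "knob"
    feasible = [(err, disp) for err, disp in candidates if disp <= gamma]
    err, disp = min(feasible)                    # lowest error within budget
    print(f"gamma={gamma:.2f}  error={err:.3f}  disparity={disp:.3f}")
```

Sweeping gamma from zero (strict parity in false rejection rates) up to one (no fairness constraint at all) traces out one of those error-versus-unfairness curves, and where on the curve to sit is exactly the judgment call that should depend on what's at stake.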
So this is an example of the kind of thing we discuss in the book: you start by thinking conceptually about what fairness should mean and what you're trying to accomplish, you might go through some bad definitions based on things like anonymity or not using certain inputs or variables in a computation, and you eventually arrive at a more satisfying definition and at algorithms that can implement that particular social norm on real data sets. So let me turn it over to Aaron to talk about all the warm and fuzzy stuff later in the book.

So we've talked in depth about privacy and fairness, which is the first half of the book. I'm not going to talk in much depth about any particular thing now, but I want to give you a quick survey of what the second half of the book is about. At a high level, you can think about the first half of the book as studying algorithms in isolation: we have some machine learning algorithm, and we ask whether this algorithm is private or fair, without necessarily thinking about the larger context in which the algorithm is embedded. But that context is important, because what the algorithm is doing affects the behavior of people, and it's important to think about how those things interact. So in the third chapter we start thinking about this using the tools of game theory: if you deploy an algorithm, how will it change the particular decisions that people make, in ways that might reverberate and have larger societal consequences? We start by talking about an example which is maybe not the most consequential socially but is, I think, a clear way to get an idea of what we're talking about.

Many of you will have had the experience of using apps like Google Maps and Waze to plan your daily commute. I can type in, in the morning, where I want to go, and it will not just find directions on the map but look up traffic reports and give me a route which will minimize my commute time given the current traffic. This aspect of Google Maps, this integration with traffic reports, turns the interaction that I'm having with the app into what an economist would call a game, in the sense that the actions that I take, which route I choose to drive along, impose negative externalities on other people in the form of traffic. Selfishly, I would prefer that everyone else stay home and I be the only one on the road; I would just take a straight shot to work and get there real fast. But other people wouldn't agree to that solution. So different people have competing interests, and their choices affect the well-being of other people. Each choice I make has only a small effect on any particular other person, I don't contribute too much to traffic, but collectively the choices we make have large effects on everybody. One way to view these apps is that they are helping us play this game better, at least in a myopic sense. Before they were around, I would have had, at best, very minimal traffic information, so I would probably have taken the same route every day, but now I can precisely respond to what other people are doing. What a game theorist would describe these apps as doing is telling me my best response: given what everyone else is doing, what can I do that will selfishly and myopically optimize for me? And everyone else is doing the same thing. The result is that these apps are driving global behavior toward what would be called a competitive equilibrium, a Nash equilibrium, which is stable in the sense that everybody is myopically and selfishly optimizing for themselves.
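To make the equilibrium idea concrete, here is a classic two-road congestion example, a textbook Pigou-style network rather than the specific case study in the book: one road always takes an hour, and the other takes x hours when a fraction x of the drivers use it. As discussed next, the stable, selfish outcome is not necessarily the best one for the group.

```python
# Illustrative textbook example (not from the book): two roads from home to work.
# Road A always takes 1.0 hour. Road B takes x hours when a fraction x of all
# drivers choose it, i.e. it gets slower as it gets more crowded.

def avg_commute(frac_on_b):
    """Average commute time when a fraction frac_on_b of drivers take road B."""
    return (1 - frac_on_b) * 1.0 + frac_on_b * frac_on_b

# Selfish best responses: as long as road B takes less than an hour, switching
# to it helps you, so in equilibrium everyone is on B and it takes a full hour.
equilibrium = avg_commute(1.0)    # 1.00 hours on average

# A coordinated split does better for everyone on average.
coordinated = avg_commute(0.5)    # 0.5*1.0 + 0.5*0.5 = 0.75 hours

print(f"equilibrium average commute: {equilibrium:.2f} h")
print(f"coordinated average commute: {coordinated:.2f} h")
```

Nobody at the equilibrium can shorten their own commute by switching roads, yet everyone's average commute is worse than under the coordinated split; that gap is the point of the discussion that follows.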
If you've taken a class on game theory, or even just read the right books, you will know that just because something is a competitive equilibrium does not mean it's necessarily a good social outcome; the prisoner's dilemma is perhaps the most famous example. So it's not at all obvious, and in fact you can come up with clear case studies where these apps, even though they're optimizing for each person, are making things worse globally for the population at large, in the sense of larger average commute times. That might not be an enormous deal when we are talking about traffic, but this is just one example of a phenomenon that is much more pervasive when algorithms mediate social interactions. For example, you might think of the curation algorithms that drive feeds like the Facebook News Feed in a similar way. Myopically, Facebook's interests are not so misaligned with my own, in the sense that their algorithms are optimized to drive engagement: what Facebook wants me to do is stay on Facebook as long as possible so I can view lots of ads, and the way they do that is by showing me content I would like to engage with, that I would like to click on and read and comment on, and myopically that seems aligned with my interests. I have a choice of what website to go to, and if I'm engaging with the content Facebook is showing me, I might be enjoying it. But when Facebook simultaneously does this for everybody, even though it's myopically optimizing for each person, it might have global consequences we don't like. It might lead to the filter bubble phenomenon, to lots of pandering, and it might drive, globally, a society that is less deliberative. So in this chapter we go through examples, trying to point out the ways in which algorithmic decisions can have widespread consequences for social behavior, and how game theory is a useful tool for thinking about these things.

And in the last chapter, we start talking about another important problem, which is the reproducibility crisis in science. We illustrate it with a well-known cartoon in which a scientist is asked to test whether jelly beans cause acne and finds no link. But then he's told that maybe it is only a certain color of jelly bean that causes acne. So he tests them: he tests brown jelly beans and purple ones and so on, and for each of them he computes a p-value. He finds that one color, green, appears to be statistically significant: there seems to be a correlation between green jelly beans and acne at a 95 percent confidence level, which means that if you tested 20 hypotheses, you would expect about one of them to incorrectly appear significant just by chance. Of course, he did test 20. And here's the headline: green jelly beans linked to acne, only a 5 percent chance of coincidence.

This is called the multiple hypothesis testing problem, and it's relatively well understood how to deal with it when it's just, you know, a single scientist conducting these studies. What's going on here is really just statistical malfeasance: someone has tested a bunch of hypotheses but is only publishing the most interesting one, without even mentioning the others. But of course this is just as much of a problem if, rather than one scientist testing 20 hypotheses, we have 20 scientists each testing one hypothesis, each following proper statistical hygiene, if the only hypothesis that appears to be significant is the one that gets published. And of course, that is exactly what the incentives underlying the game of scientific publishing are designed to do, because if you find that blue jelly beans do not cause acne, that is not going to be published, right? You probably won't even try to publish it. That's not a result that any prestigious journal is going to want to print.
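A small simulation of the arithmetic behind the cartoon (illustrative, not from the book): twenty jelly-bean hypotheses, none of which has any real effect, each tested at the 5 percent significance level.

```python
# Illustrative simulation: 20 hypotheses that are all false (no color of jelly
# bean has any effect), each tested at the 5% level. Most of the time, at
# least one of them will look "significant" purely by chance.
import random

random.seed(1)

def runs_with_a_false_positive(n_hypotheses=20, alpha=0.05, runs=10_000):
    hits = 0
    for _ in range(runs):
        # Under a true null hypothesis, a p-value is uniform on [0, 1].
        p_values = [random.random() for _ in range(n_hypotheses)]
        if any(p < alpha for p in p_values):
            hits += 1
    return hits / runs

print(runs_with_a_false_positive())   # ~0.64 in simulation
print(1 - 0.95 ** 20)                 # ~0.64 exactly: the "green jelly bean"
                                      # headline appears most of the time
```

So even when each individual test is done correctly, publishing only the one significant result out of twenty is misleading, whether the twenty tests were run by one scientist or by twenty.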
But if you find something surprising, that green jelly beans cause acne, well, that's a big finding. So the problem is that if you view scientific publishing as a game, then even if each individual player is following proper statistical hygiene, you get the same effect that's described in the cartoon. In the chapter we talk about how these phenomena are actually exacerbated by the tools of machine learning, which make it easy to check many different hypotheses very, very quickly and which promote data sharing, and how tools from this literature, in particular, somewhat surprisingly, tools from differential privacy, which we talk about in the very first chapter, can be used to mitigate this problem. And that's it. So thank you. [applause]

Thank you very much. That was great. We're going to do questions. I know we have a lot of folks in the room who regularly work in this space and beyond, so we would love examples of problems that you have faced or questions that were spurred by this. Maybe I will start with this: you talked a little, Michael, about the limitations of computer science when it comes to answering some of these questions of fairness. Having now talked about the book for a couple of months, have you found that the public kind of wants the computer scientists to solve this?

No. [laughter] I mean, I think in our experience, they appreciate the fact that people like us, the community we come from, can identify the point at which there's judgment involved, right, and moral decisions to be made, and that the stakes matter, and so I think they are generally appreciative of the fact that both sides need to come towards each other a little bit. Just like those plots I was showing of error versus unfairness: it takes a little bit of explanation to understand what such a plot is saying, but in general I think people from non-quantitative areas who are stakeholders in problems like these, for instance people in policy think tanks and the like, they like that. But I don't think they want computer scientists per se to take a leading role in picking out a point and saying, well, here's your best trade-off between error and unfairness, because it depends on the data and the problem in question. I don't think even we think that computer scientists should be exclusively, or even in large part, the ones making many of these judgments, and we're careful to say in the book that there's the scientific problem, the algorithmic problem, and there's the part of the problem that requires moral judgment of various sorts, and those are different. For example, we do not propose that it should be algorithms, or necessarily computer scientists, who define what it is we mean by fairness. Once you pick a definition, we certainly don't propose that it's computer scientists who should be picking out, in various circumstances, how we want to trade off things like privacy and fairness and accuracy. But what is important, and what I think computer scientists have to be involved in, is figuring out, first of all, what those trade-offs are and how to make them as manageable as possible.
So for example, at the U.S. Census right now, there is literally a room full of people, a committee, whose job it is to look at these curves and figure out how we should trade off two very different things, one of which is privacy, which the Census is legally obligated to promise to American citizens, and the other of which is the statistical validity of the data. It is useful data; it is used to allocate resources, school lunch programs, important things, right? So there are different stakeholders who disagree about how these things should be traded off, and they are in a room hashing it out, you know, as we speak. But their work is made very much easier because we can precisely quantify what those trade-offs are, and we can manage them, and that's where I think computer scientists have to play an important role.

That actually leads to another question I had while listening, which is: in an ideal universe where The Ethical Algorithm is on the desk of every computer scientist and the frameworks that you describe are actually used in action, and I think much of this is happening in industry and some of it is obviously happening in government as well, what does it look like to have a community of people kind of living these principles? Is there a public API that we can all see? To use kind of a rudimentary example, when we go to the grocery store we can look at the side of the box and we know how much fat and sugar there is. In a world where some people might comply and some people won't, some people might have read the book and some people may not have, what does success look like?

Yeah, so, while we don't talk a lot about this in the book, we continue to procrastinate on writing a so-called policy brief for the Brookings Institution, where we're going to talk a little bit more about the regulatory implications of these kinds of things. The reason I mention that in response to your question is that once you have a precise definition of fairness or privacy, you can do what we mainly discuss in the book, which is embed it in algorithms to make them better in the first place, but you can also use it for auditing purposes. In particular, if we're specifically worried about gender discrimination in the advertising of STEM jobs on Google, which was something that was demonstrably shown to exist a few years ago, you can run controlled studies. You can have, like, an API where you say, okay, we need unfettered access to make automated Google queries over time so that we can systematically monitor whether there are gender differences in the distributions of ads that people see, for example. And so we do think that an implication of a lot, maybe not all, of the work that's going on in these areas is the ability to do that kind of technological auditing, and we believe that some of that should happen. And you can anticipate what the objections of the technology companies might be. They might include things like: well, that's our intellectual property, this is our secret sauce, we can't have automated queries, which currently, of course, violate our terms of service. And our response to that is: this is your regulator, right? They wouldn't take this access and use it to, for instance, start a competing search engine, in the same way that the SEC has all kinds of very, very sensitive counterparty trading data but is not allowed to use it to go start its own hedge fund, for example.
So I think, you know, in a world where the kinds of ideas we discuss in the book become widespread and embedded, a big part of what's on the side of the cereal box, so to speak, on the side of the Google cereal box, might be things like: here are the rates of discrimination in advertising by race, by gender, by age, by income, et cetera. You could really imagine having some sort of quantitative notion, or scorecard if you like, of different technology services and how well or poorly they are doing on different social norms.

Yeah. I think also what we're going to have to see is that the regulations for things like privacy and fairness are going to have to become a little bit more quantitative. I think at the moment there's a disconnect where people in industry are not sure exactly what is expected of them. For example, the issue with the Apple Card, where there was seeming gender discrimination, would have been easy to find had only people thought to look for it. When we were chatting with the New York regulators a few weeks back, one thing we heard that I thought was a little interesting is that sometimes companies will explicitly avoid running checks like this, because if they don't check, then there's plausible deniability, and if they do check, they're subject to a lawsuit. This is the kind of thing that flourishes when there's ambiguity, but if you're precise about what is going to constitute discrimination in the state of New York, then companies will look for it.

I think our view is that even apparently strong regulatory documents like the GDPR are really kind of ill-formed documents. They look really strong, but they sort of push words like privacy, interpretability, and fairness around on the page, and nowhere on those pages do they say what those words mean, so it's a bit of a catch-22 or chicken-and-egg problem. It looks like strong regulation because they are demanding interpretability everywhere, for instance, but nobody has committed to what it means yet. I do think, as is often the case, that even the nascent science we discuss in the book is running ahead of things like laws and regulation, and before the kinds of changes we're discussing can take place on the regulatory side, much of regulatory law has to be rewritten, and there needs to be cultural change at the regulators.

That makes sense. Shifting gears just a bit, I was struck by the fact that differential privacy is ahead. Do you have a view of whether it's ahead because, as you said, there's an objective answer, an answer that can be defended almost like a theorem? Is it ahead because privacy is just more important to folks, or because there's a perception that privacy is more important than fairness? Is there some choice going on, almost subterranean, in that it got more attention earlier and could get solved faster, or is there no choice at all?

Kind of two short comments, and then Aaron can chime in. You know, there are differences in how long these things have been studied, but as I said when I was talking about fairness, I really think there's a technical difference, right? It just so happens that privacy is lucky in the sense that there's a well-grounded, very general mathematical definition of privacy that's very satisfying, and subsequent research has shown you can do a lot with it, right?
You know, you can meet that definition and still do lots of the things that we want to do in terms of data analysis and the like. And fairness isn't like that, and it's not a matter of time. These theorems that I mentioned, which say here are three properties that you would like from fairness and that you can't simultaneously achieve, it's not as if further work is going to undo them. A theorem is a theorem, right? We don't talk about this much in the book, but I do think that privacy is lucky in the same sense that public-key cryptography was lucky. There's a nice parallel between the development of that field and the development of differential privacy. There was a period when cryptography was a cat-and-mouse game: people would invent schemes that looked random until they didn't look random to somebody. Then came the advent of public-key cryptography, which put the whole field on a much firmer definitional footing, and then it was off to the races, which, again, doesn't mean those methods give you everything you want from security or are perfectly implemented every time. But I don't think we're ever going to get there with fairness. That's just life.

Yeah. I mean, I think it's hard to project into the future, you know. Privacy is about 15 years ahead of fairness in terms of its academic study, and that's for a good reason. We've had data sets for a long time, and so privacy violations have been going on for a long time. When it comes to algorithmic fairness, it really only becomes relevant when you start using machine learning algorithms to make important decisions about people, and it is only in the last decade or so that we've both had enough data about individual people's daily interactions with the internet to be able to make those decisions and had algorithms that have become sufficiently good that we can start to automate some of them. As Michael says, it is clear already that there's not going to be one definition of fairness. But I do think that if you try to look 15 years down the road, which is how far you would have to look before fairness is at least chronologically as mature as the study of privacy, you might still hope for a mature science. It's not going to have one definition, but perhaps we will have isolated a small number of precise definitions that correspond to different kinds of fairness, and we will more precisely understand how those necessarily trade off against one another in different circumstances. So it is going to look different, but I'm optimistic that there will be a lot you will be able to say, given as much time as privacy has had.

One other comment I would make, which I didn't appreciate until we started working in algorithmic fairness a lot, is that there's another difference which will persist between, say, privacy and fairness, one that doesn't have to do with maturity or technical aspects: discussions about fairness always become politicized very quickly. In principle, everybody agrees that privacy is a good thing and that everybody should have it. But when you start talking about fairness, you immediately find yourself debating with people who want to talk about affirmative action or addressing past wrongs, because all of these definitions require that you identify who you are worried about being harmed, what constitutes harm to that group, and often why you think that constitutes harm.
And so some of the things we've talked about, like forbidding the use of race in, say, lending, or, very much in the news the past couple of years, in college admissions and the like: these definitions are also requiring you to pick groups to protect, and this always becomes politicized, I think, regardless of what definition you are talking about, and I don't think that will change in 15 years. So somehow privacy and fairness are different, just in this social or cultural sense as well.

In some ways people are expecting the algorithms to do something that society has to sort out for itself.

Yeah, or conversely they don't think algorithms should play any role whatsoever, not only in deciding those things but even in mediating them or enforcing them or the like. And we take pains in the book to point out that, look, racism was not invented with the advent of algorithms and computers; it was around before. You can just talk about it more precisely now, and you can have problems of fairness at a bigger scale, but you can also have solutions at a bigger scale now that things are automated.

Questions in the room? Show of hands? Anyone? There's a question in the back there.

First of all, thank you very much for the talk. I really enjoyed reading the book. I have a question about the trade-offs between differential privacy and aggressive data acquisition. The book talked about Google and Apple collecting user statistics subject to differential privacy, but the type of data collected is actually not the type of data they used to collect; it is like a new area of data acquisition. What's your comment about that trade-off, using differential privacy as a kind of shield for collecting user data? Especially since I don't know how secure differential privacy is against adversarial attacks, do you see a possibility that users, under the impression of differential privacy, are willing to give out more data, only to find their data compromised in the end?

Yes, that's a good question, and it's a question that relates to why you have to think about algorithms not just in isolation but in context. You are right that in both the Apple and the Google deployments of differential privacy, they didn't use it to add further protections to data they already had available, which turns out to be a hard sell to engineers: if they already have a data set available, then adding privacy protections corresponds to taking away some of their access, giving them access only to a noisier version of the data. What's a much easier sell, and this is how it worked in the first two deployments, is to say, look, here's some data set you previously had no access to at all because of privacy concerns; here's now a technology that can mitigate those privacy concerns and will give you access to it. So you're right that one thing that happens when you introduce technologies that allow you to make use of data while mitigating the harms is that you make more use of data, which makes sense, and so you're right that one of the effects of differential privacy is that Apple and Google are now collecting a little bit more data. Now, on the other hand, they are using an extremely strong model of privacy, which we talk about in the book, the local model, and what it really means is that they are not actually collecting your data in the clear at all.
So they are collecting some random signal from your data, and the randomization is added on the device. Apple, for example, is never collecting your data. It is collecting only the results of coin flips from your data. So although more data is being collected, differential privacy is offering an extremely strong guarantee of plausible deniability. For that reason it is not subject to, you know, data breaches, for example. You might worry that differential privacy causes companies to collect more data, and sure, maybe that's okay while they are using it subject to the protections of differential privacy, but as soon as some hacker gets into the system and the data set is released, all of a sudden things are worse off. That's not how Google and Apple are using differential privacy. They are doing it in a way that doesn't collect the data at all. At the census, it is different. They are collecting the data; they have always collected the data, but they are now adding these protections in a way that they didn't before, so they are giving researchers in 2020 access to data that is actually more privacy preserving than it was in 2010. So there are lots of tradeoffs. These are interesting things, but I think those are the two different use cases that show different ways in which it can play out.

Just a follow-up on that. For kind of a lay audience, can you explain that coin flip?

Oh, yeah, sure, let's see, I have a picture of that. Yeah, so suppose, just to use the toy example we used in the book, suppose I wanted to conduct a survey of the residents of Philadelphia about something embarrassing, like I want to figure out how many people in Philadelphia have cheated on their spouse. Okay? So one thing I could try to do is call up some random subsample of people and ask them, have you cheated on your spouse? I would write down the answers, and at the end I would tabulate the results, calculate statistics, the average, and call it a day. But I might not get the responses that I want, because people might legitimately be worried about telling me this over the phone, and in particular they may not trust me. They might worry that someone is going to break into my house and steal this list. They might worry that in divorce proceedings it will be subpoenaed. Here's a different way to carry out the same survey. I call people up and I say, okay, have you cheated on your spouse? But wait, don't tell me just yet. I want you to first flip a coin. Okay? If the coin comes up heads, don't tell me that it came up heads, but tell me the truth: tell me whether you have cheated on your spouse. But if it comes up tails, just give me a random answer: flip the coin again and tell me the result of that second coin flip. Okay? So people do this. Now they have a very strong form of plausible deniability, which is that, since they didn't tell me how the coin flip came out, for any particular answer they gave me they can legitimately and convincingly say, well, okay, that wasn't my real answer; that was just the random answer you instructed me to give. And so I can't form very strong beliefs about any particular person. Everyone has this very strong statistical guarantee of plausible deniability, and this is something you can formalize in the language of differential privacy. But that's okay, because the question I cared about wasn't pertaining to any particular person.
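As a minimal sketch of the coin-flip protocol Aaron describes here (the population size and the 30% true rate below are illustrative assumptions, not numbers from the talk), each simulated respondent tells the truth only when a first coin flip comes up heads, and the surveyor recovers the population rate by inverting the known noise process.

```python
import random

def randomized_response(true_answer: bool) -> bool:
    """One respondent: if the first flip is heads, tell the truth;
    if it is tails, report the result of a second, independent flip."""
    if random.random() < 0.5:           # first flip: heads
        return true_answer
    return random.random() < 0.5        # tails: answer is just a fresh coin flip

# Illustrative population: 100,000 people, 30% of whom have the sensitive attribute.
population = [random.random() < 0.30 for _ in range(100_000)]
reports = [randomized_response(x) for x in population]

# P(report "yes") = 0.5 * true_rate + 0.25, so invert that to estimate the rate.
fraction_yes = sum(reports) / len(reports)
estimated_rate = 2 * (fraction_yes - 0.25)
print(f"true rate: {sum(population) / len(population):.3f}, "
      f"estimated rate: {estimated_rate:.3f}")
```

Any individual "yes" can always be blamed on the second coin flip, which is the plausible deniability being described, while the aggregate estimate concentrates around the true rate as the sample grows.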
The question I cared about was the population-level average, right, the statistical property of the population, and it turns out, basically as a consequence of the law of large numbers, that even though I have only this very noisy signal from each person, in aggregate I can figure out the population-level average very precisely, because I know the process by which the noise was added, and so in aggregate I can kind of subtract it off. We talk about this in the book. This is not so different from what is happening on your iPhone right now. Your iPhone is reporting much more complicated statistics than yes-or-no questions, but in the end this is for things like text completion, right? Apple would like to know, for example, what's the most likely next word given what you have typed so far. And they collect data that helps them do that by basically hashing this text data down into a bunch of yes-or-no questions, a bunch of binary data, and running a coin-flipping procedure that looks not so different from this.

To put it in the context of something embarrassing: maybe you're embarrassed that you still play seven hours of Bejeweled a week on your phone, you know, a decade on, and so you'd be reluctant to report that directly. But if all of our phones add a large random positive or negative number to our weekly usage of Bejeweled, then if I look at any individual person's random noisy report and it says Jeff played 17 hours of Bejeweled last week, I really won't know whether he played no Bejeweled and 17 is a random number, or he played 30 hours and 13 was subtracted. Any particular person who is reported to play a lot of Bejeweled has the same plausible deniability. But if I add up these very, very noisy reports, the noise averages out and I get a very good estimate of average or aggregate Bejeweled usage. (A sketch of this noisy-counts idea appears after this exchange.) I don't play Bejeweled, by the way. [laughter] So I'm not embarrassed.

Other questions? We are waiting for the mic. Jennifer?

Thank you very much for the talk today. You spoke about how introducing differential privacy or fairness constraints can increase the error in an algorithm. I'm interested to know, does that affect the commercial potential of the algorithm? Does Google earn less from advertising when they use differential privacy? And if so, do you think that means regulators are needed?

The short answer to the first question is definitely. Right? So, I mean, for instance, Google has been using Machine Learning at massive scale for decades now to do things like click-through rate prediction, right? The more accurate their click-through rate predictions are, the better targeted ads they can show, and that directly translates into revenue and profit. Going in and insisting on things like not discriminating against this or that group in your advertising, or more privacy in the way that the Machine Learning is deployed, is going to reduce those accuracy rates and reduce profits. I mean, I don't know how to put numbers on it yet, but I think we can be sure that this is going to happen. And just relating this to things we have been discussing, I also think this is why a lot of the commercial deployments we've seen so far are in kind of experimental areas that aren't part of the core business of these companies, right? So they are experimenting around; like, okay, they would like to know emoji usage statistics. It is not a core part of their business. But they're sticking a toe in the water, and I think it is to their credit that they are doing that.
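Here is the promised sketch of the noisy-counts idea from the Bejeweled example above. The noise scale and the usage numbers are illustrative assumptions, and plain Gaussian noise is used only to convey the averaging intuition; real deployments calibrate their randomization carefully to get formal differential privacy guarantees.

```python
import random

def noisy_report(true_hours: float, noise_scale: float = 20.0) -> float:
    """Each phone adds a large random positive or negative number before reporting."""
    return true_hours + random.gauss(0, noise_scale)

# Simulated weekly Bejeweled hours for 50,000 users; most play little, a few play a lot.
true_usage = [random.choice([0, 0, 0, 1, 3, 7, 30]) for _ in range(50_000)]
reports = [noisy_report(h) for h in true_usage]

# Any single report (say, "17 hours") reveals almost nothing about that person,
# but the zero-mean noise cancels out when we average over many users.
print(f"true average:      {sum(true_usage) / len(true_usage):.2f} hours")
print(f"estimated average: {sum(reports) / len(reports):.2f} hours")
```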
I'm kind of waiting for the first Big Tech Company that says, oh, we're not just going to adopt these technologies around the edges; we're going to put them in our Core Services. By the way, at all of the Big Tech Companies we have many, many excellent colleagues who do research in this exact area. So it's not like any of the Big Tech Companies don't know a lot about differential privacy, about algorithmic fairness and these topics, but of course there's a disconnect between the researchers who study these things and the people with a Business Unit that they oversee thinking about adopting them in the middle of their pipeline. So I don't have strong predictions on how this will play out. I mean, I hope that maybe there's some organic adoption by Tech Companies, and not just Tech Companies but other Large Companies that are essentially consumer facing in some way, of voluntarily really biting the bullet and saying we're going to take a lead on this in our Core Services or products. I think the more likely answer is that there is going to need to be regulatory pressure, and that will take time. Yeah.

Thank you for your talk. [inaudible] I'm interested in data sets. In reference to the last slide, and p-values and the reproducibility crisis that's going on in science, am I right to conclude that we shouldn't be using Machine Learning algorithms on big health data sets, and that they need to be theory-based kinds of algorithms in order to substantiate the findings?

No, we don't want to say we shouldn't use Machine Learning, but you need to be careful. If you have some result, you train some classifier and it seems to be pretty good at predicting tumors of some sort, then you can legitimately put a confidence interval around that, estimate it with statistical validity, attach a p-value to it. The problem comes when you start sharing data sets and, in particular, reusing holdout sets. When you take Machine Learning 101, the way you often get statistical validity in Machine Learning, where, unlike in statistics, you are not explicitly assuming that the data fits a linear model, for example, but are training something complicated, is that you hold out a piece of the data set you have never seen before; it is entirely independent of everything you are training on. That looks fine on paper. But if I read a paper of yours and I send you an email and say, that was a great paper, could you send me your data set so that I could do something else with it? Well, even if I myself follow all of the rules of statistical hygiene, I read your paper, and implicitly everything I'm doing is a response to the findings that you wrote about, which were a function of the data. And so as soon as anything like that happens, all of the guarantees that come with a holdout set go entirely out the window. (A small simulation of this effect appears below.) Now, the easy way to solve this problem, and I say easy meaning theoretically easy but practically very hard, is what people advocate for when they talk about preregistration: I should make sure I cannot look at the data at all before I conduct the experiments I'm going to conduct. But if you take that seriously, it rules out data sharing, for exactly those reasons. And so although that works, it's sort of draconian and would rule out a lot of interesting studies.
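To see concretely why reusing a holdout set erodes its guarantees, here is a small simulation with made-up data (the sample size and number of candidate models are arbitrary choices for illustration): the labels are pure coin flips, so no model can truly beat 50% accuracy, yet selecting the best of many candidates on the same holdout makes the winner look noticeably better than chance on that holdout.

```python
import random

random.seed(0)

# Illustrative setup: the labels are pure coin flips, so NO model can genuinely
# beat 50% accuracy. We still "select" the best of many useless candidate models
# by repeatedly scoring them on the same holdout set.
n_holdout, n_candidates = 200, 500
holdout_labels = [random.random() < 0.5 for _ in range(n_holdout)]

def random_predictions(n):
    """A 'model' that ignores the data entirely and guesses at random."""
    return [random.random() < 0.5 for _ in range(n)]

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

best_holdout_acc = max(
    accuracy(random_predictions(n_holdout), holdout_labels)
    for _ in range(n_candidates)
)
print(f"best holdout accuracy among {n_candidates} useless models: {best_holdout_acc:.2f}")
# Typically prints around 0.60 even though every model's true accuracy is 0.50:
# because the holdout was reused to do the selecting, the winner's holdout score
# is optimistically biased, which is exactly the broken-guarantee problem above.
```

This selection bias is the effect that the algorithmic approaches discussed next, including ones that add a bit of noise in the spirit of differential privacy, are designed to keep under control.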
What we spend a lot of time talking about in that chapter is, again, an algorithmic science that actually allows you to share data and reuse data in a way that doesn't give up on rigorous statistics. But just to follow up, I'm betting that when you said theory-based, you were referring more to causality. There's this split between the Machine Learning community and many other communities, including medicine and economics, about causality, and some machine learners are militantly anti-causal; they are just like, let's get the data, and if we get a good fit to the data and we practice sound statistical techniques, that's enough. So I think, you know, certainly having strong priors, and having a causal model in your head is an example of what I would consider having strong priors, can help reduce the number of things you try on the data. But I still don't think it is a substitute for the kinds of things we discuss in that chapter, because it is again a matter of discipline. I can think I have some causal model in my head, but of course usually there will be parameters to that causal model, and I will play around with the strength of that causality. As soon as I'm doing that, I'm going down the same rabbit hole of testing many, many hypotheses sequentially or in parallel on the same data set, and I'm prone to false discovery if I'm not very, very careful. So again, it's very, very early days for this kind of stuff, even earlier days than fairness, but we think that kind of disciplined algorithmic approach, including ones that involve differential privacy and other sorts of statistical methods, is better than human beings themselves sort of saying, well, I'm not engaged in this reproducibility crisis because I have strong priors in the form of a causal model.

The book is really insightful. I think it points us to the challenges that we will be facing over the next 50 years or maybe 100 years. We would be happy to still be selling the book then. [laughter] My question is more immediate in nature, actually. The responsibility that falls on computer scientists, and even physical scientists, is to explain their observations to the public in a way that actually supports good policy making, right? We see examples all the time of that not happening in a very good way, like Global Warming and all these things. What's your perspective on taking this next step with Machine Learning, and how easily it could be explained to the public and to policy makers?

Yeah, I mean, I think that's very important. Computer scientists have sort of been unwittingly thrust into policy making, you know, just informally, right? Like, if you are a Software Engineer at Facebook, and you tweak a parameter in some algorithm and go to lunch and don't even think about it, you are affecting all sorts of things for millions of people. Facebook in many ways is informally making policy in ways that aren't even precisely thought out, and I think that given that we're already in this situation, it's important that we work to make this more explicit, make it clearer how software decisions affect policy, and therefore it's important to try to explain to as broad an audience as possible, okay, at a high level, what are algorithms? What are they? How do they work? What are they trying to accomplish?
That's in large part what we're trying to do with this book. Yeah, I mean, it's funny, because I've been around a lot longer than Aaron, right? And, you know, the term Machine Learning is actually in my doctoral dissertation, and at the time this was a very obscure area to be studying. In fact, even majoring in Computer Science when I was an undergraduate was viewed as an odd thing. It is interesting; sometimes I joke that, through no foresight or merit of my own, the world was sort of delivered to the doorstep of Computer Science sometime in the last 20 or 30 years. And that was thrilling for a long time because it had no downside in many ways, right? I mean, there were all kinds of interesting new jobs. There was all kinds of interesting new science. And I think in many ways now the bill is coming due, and our book is about that bill and how we might pay it. But I think the other part of it is, you know, I don't like to use this term, but I think more computer scientists need to think about, if not becoming public intellectuals, getting much more involved in the uses of their technology, the misuses of their technology, and trying to help society solve the problems created by those technologies. And there's still not a lot of that yet. I mean, a lot of it is still very superficial, and I'm not criticizing it, but you are starting to see computer scientists do things like write op-ed pieces, for example. But I think we really are going to need technically trained people who are willing to spend their entire careers, or much of their careers, mediating between the technical part of Computer Science and the policy and social implications of it, and I think the community of people that work on these types of problems is starting to breed a Younger Generation of people who are willing to make that career choice. And, you know, maybe People Like Us are at the point in our careers where we can sort of say, oh, okay, I can do this and not worry about whether I will be able to have a Research Career also. But I think it's very important, and it's starting to happen organically, but it's very early on.

I think we have time for one last question. I thought I might ask something completely personal, which is, you know, you work on research, as I think we all know, for quite some time, and then you work on a book for quite some time, and there is a lot of yourselves in there. I know it is important work, and we've talked a lot about that. But besides that, what about the spark of it? What is it about this subject matter, and you guys could have chosen any subject matter, what is it about the 16-year-old versions of yourselves that really makes this what you want to spend your days doing?

Yeah, so, a good question. You know, the 16- and 18-year-old version of myself wanted to be very mathematical. I started college as a math major and sort of drifted towards Computer Science as I realized you could think mathematically about computation. And as I moved in that direction, I realized there was something called not just Machine Learning but learning theory. I took a class where I used Michael's textbook. I thought that was really cool. Here you are, you can sit down and prove mathematical theorems about how machines might learn.
I got to grad school and I realized you can apply this mathematical, computational lens to all sorts of things. When I started grad school, differential privacy was just being defined, and it was an exciting time, and I enjoyed proving theorems about privacy; you know, wow, you can think about privacy using math. And more recently you can do the same thing with fairness. But there's something very different about writing for an expert academic audience. First of all, there's a lot of math. You try to very precisely define ideas in sort of a dry way, to be concise. Writing this book was quite different. It's sort of fun, actually. It is liberating to try to write in an engaging way. It is difficult but rewarding to try to describe these ideas, which are at their root mathematical, without equations. We tried very hard to remove all of the equations from the book. And I hope in the end we succeeded in conveying not just the natural interest in these topics (we're sort of lucky that these topics are not just mathematical curiosities but real, meaningful, important questions of the day) but also the excitement of doing research in these fields, because we can take people in the book right up to the frontiers of knowledge, because so little is known so far.

Yeah, I guess my origin story is a little different. If you had told my 16-year-old self that at some point later in life you're going to write a general audience nonfiction book, I think that would have made a lot more sense to me than if you had told me, oh, you're going to be a professor of Computer Science and Machine Learning, because in high school I was a very indifferent math student. I didn't like it very much. I didn't try very hard. I wasn't especially good at it then, and I started college as an English major. I was a declared English major. And I pretty quickly realized that I had chosen English because I wanted to learn how to write, and that majoring in English was going to teach me how to read. But at the same time, I had managed to kind of just hang on by my fingernails in math classes long enough that when I got to Berkeley I started taking more of them, and, you know, if any of you studied math and Computer Science through high school and all the way through college, you know that at some point there's this Phase Transition where things become much more interesting, and you start to become aware of the creative aspects of it. I think I first discovered those more in Computer Science, just because of the buzz of being able to program a computer to do something that you couldn't possibly yourself achieve in your entire lifetime, and it takes ten seconds, right? It is something stupid like sorting a list of numbers, for instance. But I enjoyed that, and then I kind of just hung around math long enough that the purely mathematical aspects became interesting too. So in some ways writing this book, in some vague way, does fulfill the type of thing I wanted to do when I was very, very young.

One other comment I will make about all this work we have been doing in fairness. I remember, maybe six years ago or so, the specific moment when Aaron and I were sitting in a cafe and we first started talking about some problem in algorithmic fairness.
That led to our first publication, which was interesting but flawed. And I'd like to say, you know, when you work in something like fairness, you would like to say, oh, six years ago I realized this was really important for society, and we as Machine Learning researchers have a responsibility to fix the problems. And I'd like to say that even if the research had turned out to be boring, you know, maybe all the problems are easy, or they are all too hard, or there's nothing you can do, or the solutions are clear and technically straightforward and it's just a matter of going out into the world and convincing people to adopt them, I would like to claim that come hell or high water I would have said, oh no, this is what we have to do as responsible citizens. Luckily, I don't have to know what choice I would have made in that situation, because it did turn out to be mathematically and algorithmically a very rich field. It is great to be able to work on a topic that, a, society is interested in, in both positive and negative ways, where there's really interesting Technical Work to do that's creative and satisfying, and also to be able to do it with somebody that you're so similar to. It's been a great deal of fun.

To hear your ideas and have you share them directly with us means a lot. Congratulations on the book. Thanks for coming. Thanks for hosting. Yeah, thanks for having us. [applause]