
CSPAN2: Brian Christian, The Alignment Problem (July 11, 2024)

[inaudible] human learning. He's best known for his bestselling books The Most Human Human and Algorithms to Live By. Christian is tackling questions such as: as we continue to rely on these systems to solve our problems, what happens when AI itself becomes a problem? With a background in computer science and poetry, Christian has spent his career examining the ethical and technological implications of a society that has become more reliant on technology. The work has caught the attention of technology leaders like Elon Musk, who said Christian's book was his night table reading. How can we get a handle on this technology before we lose control? He argues that although we train machines to make decisions for us, eventually humans will still need humans. We will be discussing a lot in the next hour, and I want to have your questions too. If you're watching live with us, please put your questions in on YouTube so I can work them into the conversation today. Thank you, Brian, welcome. Thank you for joining us. It's my pleasure, thank you for having me. Great. This is not your first book. I want to ask the obvious question, and it does set up the conversation here: why did you decide to tackle this topic now, with this book?

Great question. So the initial seed for this book came after my first book had come out and, as you mentioned, Vanity Fair reported that it was Elon Musk's bedside reading. I found myself in 2014 attending a Silicon Valley book group that was a bunch of investors and entrepreneurs. They had seen the thing about Elon Musk reading the book and they invited him to join, and to my surprise, he came. There was this really fascinating moment at the end of the dinner when the organizers thanked me and everyone was getting up to go home for the night, and Elon Musk forced everyone to sit back down and said, no, no, no, but seriously, what are we going to do about AI? I'm not letting anyone leave the room until you either give me a convincing counterargument for why I shouldn't be worried about this, or you give me an idea for something that we can do about it. It was quite a memorable vignette, and I found myself, you know, drawing a blank. I didn't have a convincing argument for why we shouldn't be worried. I was aware of the conversation around the risks of AI; for some people it's a human extinction level risk, and other people are focused on the present-day ethical problems. I didn't have a reason why we shouldn't be worried about it, but I also didn't have a concrete suggestion for what to do. That was the general consensus in the room at that time. This question of "so seriously, what's the plan?" kind of haunted me while I was finishing my previous book, and starting, I would say, around 2016, I really began to see a dramatic movement within the field to actually put some kind of plan together, both on the ethical questions and on the further-into-the-future safety questions. Both of those movements have grown, I would say, explosively between 2016 and now, and these questions of ethics and safety, what in the book I describe as the alignment problem, how do we make sure that the objective the system is carrying out is, in fact, what we are intending for it to do, have gone from marginal and somewhat philosophical questions at the margins of AI to, today, central questions of the field. So I wanted to tell the story of that movement and, in a way, figure out the answer to Elon's question: What's the plan? What are we doing?
One of the interesting things, as I was getting into it: obviously there's a lot of very complex technology and programming that goes into this, but I don't think a lot of people are aware that it has already been applied in situations in our society today. As we started the program, you present a number of examples of failures of AI tools to accomplish what they were hoped to perform, but one of your examples could not be more timely. That's the algorithms that are being used by judges, not just here in California, but specifically, for what I'm getting into, in California, where instead of cash bail a judge will use an algorithm to determine whether or not a suspect is released or remains in jail while waiting for trial. Proposition 25 would reaffirm a law and replace cash bail with that system. It's a complicated issue, but what really surprised me when I was looking at it is that the NAACP and human rights groups oppose this proposition because of what they say are inequities in the algorithm. Why don't we get into that example and build from there? What happens with the algorithm, and how did we get there?

Absolutely. There's a nearly hundred-year-long history of statistical techniques, starting in the 1920s, called actuarial methods: an attempt to create a science of probation and parole. That, as I say, started in the 20s and 30s, but really took off with the rise of personal computers in the 80s and 90s, and today it's implemented in almost every jurisdiction in the U.S.: municipal, county, state, federal. There has been increasing scrutiny that's come with that, and it's interesting watching the public discourse pivot on some of these tools. So, for example, the New York Times editorial board was writing, up to 2015, editorials and open letters saying it's time for New York State to join the 21st century. You know, we need something that's objective, that is evidence-based; we can't just be relying on the whims of folks in robes behind the bench. We need to bring in some actual science. Sharply, that position changes, and just months later, at the end of 2015 and beginning of 2016, the New York Times is running a series of articles saying algorithms are putting people in jail, there's racial bias, we need to really throw the brakes, and calling out by name this particular tool called COMPAS, which is one of the most widely used tools throughout the United States. And the question has really ignited an entire subfield within statistics around what it means to say that a tool is fair. You know, this system is designed to make predictions about whether someone will reoffend if they are released on probation or pending trial, pretrial. What does it mean to take these concepts that exist in the law, things like equal treatment, equal opportunity, 14th Amendment protections, et cetera, and actually turn them into the language of code? And how do we look at a tool like this and weigh whether we feel comfortable actually deploying it?

And you give some examples of how a black suspect and a white suspect with similar backgrounds fared, and how much more likely the white suspect was to go free, including, surprisingly, at one point, one of the white suspects who [inaudible]. So what goes into, if you will, the baking of that cake that built these biases? Because the biases were not intentionally baked into it, but they're still hard-baked in, in some ways.

Yeah. So this is a very big conversation. These tools typically predict three things. The first is your likelihood to not make your court appointment.
The second is to commit a nonviolent crime while you're pending trial, and the third is to commit a violent crime pending trial. The question is where the data comes from on which these models are trained. If you look at something like failure to appear in court, well, if you fail to appear in court, the court knows about it by definition, right? So that's going to be fairly unbiased regardless of who you are. If you don't show up, the court knows about it. If you look at something like nonviolent crime, it's the case that, for example, if you poll young white men and young black men in Manhattan about their rate of self-reported marijuana usage, they self-report that they use marijuana at the same rate. And if you look at the arrest data, the black person is 15 times more likely to be arrested for using marijuana than the white person in Manhattan. In other jurisdictions it might be 8 times as likely, I think in Iowa. You know, it varies from place to place. So that's a case where it's really important to remember that the model claims to be able to predict crime, but what it's actually predicting is rearrest. And rearrest is this imperfect, and systematically so, proxy for what we really care about, which is crime.

It's ironic to me because, as part of this project, researching these systems, I went back into the historical literature from when they first started getting used, which was in Illinois in the 1930s. At the time, a lot of the objections were coming from the conservatives, from the political right, and, ironically, they were making almost the same argument that progressives are making now, but from the other side. So conservatives in the late 30s were saying, now wait a minute, if a bad guy is able to evade arrest, the system doesn't know that he committed a crime, and it will recommend the release of other people like him. Now we hear the argument being made from the left, which is to say, if someone is wrongfully arrested and convicted, they go into the data as a criminal, and the system will recommend the detention of other people like them. This is really the same argument, just framed in different ways. But that's a very real problem. And we're starting to see groups like, for example, the Partnership on AI, which is kind of a nonprofit industry coalition including Facebook, Google, and a number of groups, in fact a hundred different stakeholders, recommending that we not take these predictions of nonviolent rearrest as seriously as we take, for example, the prediction of [inaudible].

The second component that I want to highlight here, I mean, it's a very vast question, but the second thing worth highlighting is the question of what you do with the prediction once you have it. So let's say you've got a higher than average chance that you're going to fail to make your scheduled court appointment. Well, that's a prediction. There's a separate question, which is what we do with that information. Now, one thing you could do with that information is put the person in jail while they wait for trial. That's one answer. It turns out there's an emerging body of research that shows things like, if you send them a text message reminder, they're much more likely to show up for their court appointment. And there are people proposing solutions like providing daycare services for their kids or providing them with subsidized transportation to the court.
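To make the rearrest-as-proxy point concrete, here is a minimal simulation sketch in Python. It is my own illustration, not code from COMPAS or any deployed tool, and every number in it is invented: two groups engage in the underlying behavior at exactly the same rate, but one group is arrested for it far more often, so any model trained on arrest labels will score that group as higher risk even though the behavior is identical.

```python
# Illustrative sketch (not any real tool's code): how "rearrest" labels can
# diverge from underlying behavior when arrest rates differ across groups.
# All numbers below are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two groups with the SAME true rate of the underlying behavior.
group = rng.integers(0, 2, size=n)           # group 0 or group 1
true_offense = rng.random(n) < 0.30          # identical 30% rate for both groups

# But the probability of being arrested *given* the behavior differs
# (the transcript cites a 15x disparity in marijuana arrests in Manhattan).
p_arrest_given_offense = np.where(group == 1, 0.30, 0.02)
arrested = true_offense & (rng.random(n) < p_arrest_given_offense)

# A model trained on arrest labels sees very different "risk" per group,
# even though the underlying behavior is identical.
for g in (0, 1):
    mask = group == g
    print(f"group {g}: true offense rate = {true_offense[mask].mean():.2f}, "
          f"observed arrest rate = {arrested[mask].mean():.3f}")
```

Running this prints roughly equal true offense rates for both groups but an observed arrest rate about fifteen times higher for group 1, which is exactly the gap that then gets laundered into a "risk score."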
So there's this whole separate question, which is: as much scrutiny as is rightfully being directed at the actual algorithmic prediction, there's a much more systemic question, which is what we do with those predictions. If you're a judge and the prediction says that this person is going to fail to reappear, well, ideally you'd want to recommend some kind of text message alert for this person as opposed to jail, but that option may not be available to you in that jurisdiction. So, you know, you have to kind of work with what you have, and that's a systemic problem. That's not necessarily an algorithm problem per se, but the algorithm is sort of caught in the middle, if you will.

Let's take it out of the crime and punishment arena and into the business arena. You talk later in the book about [inaudible] Amazon coming up with this AI system that would help cull job applicants, and what they were finding was that a lot of men [inaudible] were baked into the way the system was being trained and the way the system was being used. And I think also, when you get to this, you still have the question at the end: well, why are we trying to find people just like the people we had? Tell us about that, and how did they get in, how did they get in?

Yeah. So this is a story that involves Amazon around the year 2017, but by no means are they a unique example; it just happens to be the example of Amazon. They, like many companies, were trying to design something to take a little bit of the workload off of human recruiters. If you have an open position, you start getting some number of résumés coming in, and ideally you'd like some kind of algorithmic system to do some triage and tell you, okay, these are the résumés that are worth forwarding on or looking at more closely, and these are lower priority. In a somewhat humorous or ironic twist, Amazon decided they wanted to rate applicants on a scale of one to five stars, so rating the prospective employee the same way customers rate their products. But to do that, they were using a type of computational model called word vectors. Without getting too technical, for people who are familiar with the rise of neural networks: the neural network models that were very successful at computer vision around 2012 also started to move into computational linguistics around 2013. In particular, there was this very remarkable family of models that were able to sort of imagine words as points in space. And so if you had a document, you could predict a missing word based on the other words that were nearby in this kind of abstract three-hundred-dimensional space, if you can imagine that. But they had a lot of other cool properties, such that you can actually do arithmetic with words. You could do king minus man plus woman and search for the point in space that was nearest to that, and you would get queen. You could do Tokyo minus Japan plus England and get London. These sorts of things. So these numerical representations of words that fell out of this neural network ended up being useful for a surprisingly vast array of tasks, and one of these was trying to figure out the, quote-unquote, relevance of a given CV to a given job. One way you could do it is to say: here are all the résumés of the people we've hired over the years, find all those points in space, and then for any new résumé, let's just see which of the words have the kind of positive attributes and which have the negative attributes. Okay, well, it sounds good enough, but when the team at Amazon started looking at this, they found all sorts of bias.
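Here is a brief sketch of the word-vector arithmetic described above, using the gensim library with a small pretrained GloVe model. The specific model name and the toy résumé-scoring at the end are my assumptions for illustration; this is not Amazon's system.

```python
# Sketch of word-vector arithmetic using a small pretrained GloVe model via
# gensim. Assumes gensim and its downloader are installed; not Amazon's code.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # downloads ~130 MB on first use

# "king - man + woman" should land near "queen";
# "tokyo - japan + england" should land near "london".
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
print(vectors.most_similar(positive=["tokyo", "england"], negative=["japan"], topn=3))

# A crude document "relevance" score of the kind described: average the word
# vectors in a résumé and compare it to the average of past hires' résumés.
def doc_vector(words):
    return np.mean([vectors[w] for w in words if w in vectors], axis=0)

past_hires = doc_vector("software engineer executed captured strategy".split())
candidate = doc_vector("software engineer women volunteer mentor".split())

cosine = np.dot(past_hires, candidate) / (
    np.linalg.norm(past_hires) * np.linalg.norm(candidate))
print(f"cosine similarity to past hires: {cosine:.3f}")
```

The point of the last few lines is only to show the mechanism: any word that sits far from the vocabulary of past hires drags the similarity score down, whether or not it has anything to do with job performance.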
So, for example, the word "women's" was assigned a penalty. So if you played on, you know, a women's team, or you went to a women's college, or you were in the women's society of something, the word "women's" on that résumé was getting a negative deduction.

So, yes, because it was even farther away from the more successful words that it had been trained to watch for, is that right?

That's right. It doesn't appear on the typical résumé that did get selected in the past, and it's similar to other words that also didn't. So, of course, the red flag goes off for the team, and they say, okay, we can delete this attribute from our model. Then they start noticing that it's also applying deductions to women's sports, like field hockey, for example. So they get rid of that. And then they start noticing that it's picking up on all of these very subtle syntactic choices that were more typical of male résumés than female ones, words like "executed" and "captured," as in "I executed a strategy to capture market value," certain phrases that were just more typical of men. And at that point they basically gave up; they scrapped the project entirely.

In the book I compare it to something that happened with the Boston Symphony Orchestra in the 1950s, where they were trying to make the orchestra, which had been very male-dominated, a little bit more equitable. So they decided to hold the auditions behind a wooden screen. But what someone found out only later was that as the auditioner walked out onto the wooden parquet floor, of course you could identify whether it was a flat-soled shoe or a high-heeled shoe. So it was not until the 70s, when they instructed people to remove their shoes before they entered the room, that they finally started to see the gender balance in the orchestra start to balance out. And the problem with these language models is that they basically always hear the shoes, right? They're detecting the word "executed," they're detecting the word "captured," and the team in this case just gave up and said, we don't feel comfortable using this technology. Whatever it's going to do, it's going to identify some very subtle pattern in the engineering résumés we'd had before; the gender balance is off in those, so it's just going to sort of replicate that into the future, which is not what we want. So in this particular case, they just walked away. But this is a very active area of research: how do you debias a language model? You've got all these points in space, and you try to identify subspaces within this giant three-hundred-dimensional thing which represent gender stereotypes. Can you delete those dimensions while preserving the rest? This is an active area, and it's kind of ongoing to this day.

How much did Amazon spend developing that?

It's a great question. You know, they're pretty tight-lipped about it. Most of what we know comes from a Reuters article where people weren't giving their names, so I wasn't able to do a lot of follow-up. But as I understand it, they not only disbanded the project, they disbanded the team and redistributed them to engineer other things. So they really washed their hands of it.

Yeah. However many, I assume millions, were put into that; you know, they could have hired an extra [inaudible] to cull their stuff. Yeah, that's right.

Well, another example I wanted to get into, from a different angle, is definitely the self-driving car. You talk in the book about the fatality that happened because of the way the car was recognizing a person. That's right. Again, explain that and what happened. Yeah.
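The "can you delete those dimensions" idea has a concrete version in the word-embedding debiasing literature (for example, the hard-debiasing approach of Bolukbasi et al., 2016). Here is a minimal sketch, assuming plain numpy arrays for the vectors: estimate a gender direction from a few word pairs, then remove each word's component along that direction.

```python
# Minimal sketch of one debiasing idea: estimate a "gender direction" from a
# few gendered word pairs, then project that component out of other vectors.
# Simplified from the hard-debiasing approach of Bolukbasi et al. (2016);
# real methods use many pairs, PCA, and a curated list of gendered words.
import numpy as np

def gender_direction(vectors, pairs=(("he", "she"), ("man", "woman"))):
    """Average the difference vectors of a few gendered pairs and normalize."""
    diffs = [np.asarray(vectors[a]) - np.asarray(vectors[b]) for a, b in pairs]
    g = np.mean(diffs, axis=0)
    return g / np.linalg.norm(g)

def remove_component(v, direction):
    """Return v with its component along `direction` removed."""
    v = np.asarray(v, dtype=float)
    return v - np.dot(v, direction) * direction

# Usage with the GloVe vectors loaded in the previous sketch:
# g = gender_direction(vectors)
# debiased_engineer = remove_component(vectors["engineer"], g)
```

Whether this fully removes the bias, or merely hides it while the model still "hears the shoes" through correlated words, is exactly the open question Christian describes.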
So this was the death of Elaine Herzberg in Tempe, Arizona, in 2018, the first pedestrian killed by a self-driving car; it was the sort of R&D Uber vehicle. The full National Transportation Safety Board review came out at the end of last year, so fortunately I was able to get some of that into the book before it went to press. It was very illuminating to read the official breakdown of everything that went wrong, because it was one of these things where probably six or seven separate things went wrong, and you think that, but for that entire suite of things going wrong, it might have ended differently. One of the things that was happening was that the car was using a sort of deep neural network to do object detection, but it had never been given an example of a jaywalker. So in all of the training data that this model had been trained on, people walking across the street were perfectly correlated with, you know, zebra stripes; they were perfectly correlated with intersections and so forth. So the model just didn't really know what it was seeing when it saw this woman crossing the street in the middle of the street. And most object recognition systems are taught to classify things into exactly one of a discrete number of categories, so they don't know how to classify stuff that belongs to more than one category.
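As a toy illustration of that last point (my own sketch with invented numbers, not the Uber perception stack): a standard softmax classifier spreads its belief over a fixed label set and is usually read out with argmax, so an input that straddles categories still gets forced into exactly one bin, and small changes can flip which one, which is reportedly part of what the official review described.

```python
# Toy illustration of single-label classification: softmax over a fixed label
# set, read out with argmax. The labels and logits below are invented.
import numpy as np

LABELS = ["vehicle", "bicycle", "pedestrian", "other"]

def classify(logits):
    probs = np.exp(logits - np.max(logits))   # numerically stable softmax
    probs /= probs.sum()
    return LABELS[int(np.argmax(probs))], np.round(probs, 2)

# Clear-cut case: one label dominates.
print(classify(np.array([4.0, 0.1, 0.2, 0.0])))

# Ambiguous case, e.g. a person walking a bicycle mid-block: the classifier
# still returns exactly one label, and a small change in the logits flips it.
print(classify(np.array([1.1, 1.2, 1.0, 0.9])))
print(classify(np.array([1.2, 1.1, 1.0, 0.9])))
```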
