In his new book, Brian Christian asks: if we continue to rely on artificial intelligence to solve our problems, what happens when AI itself becomes a problem? His background spans computer science and poetry, tackling the ethical alongside the technological. His work has caught the attention of people like Elon Musk, who said Christian's earlier book, The Most Human Human, was his night-table reading. The new book asks whether AI is simply replicating our own biases, and how we can get a handle on the technology before we lose control. He argues that although we have trained machines to make these decisions for us, eventually humans will still need humans. We'll be discussing a lot in the next hour, and I want to get to your questions, so if you're watching live with us, please put your questions in the text chat on YouTube and I will bring them into the conversation toward the end of today's program. Brian, welcome, and thank you for joining us.

It's my pleasure. Thank you for having me.

This is not your first book, of course, but I want to ask an opening question, and it's not the type of question you usually ask an author: why did you decide to tackle this topic now?

A great question. The initial seed for this book came after my first book had come out. As you mentioned, Vanity Fair reported that Elon Musk had it as his bedside reading, and in 2014 I found myself attending a Silicon Valley dinner group of investors and entrepreneurs. They had seen the piece about Elon Musk reading the book, they invited him to join, and to my surprise he came. There was this really fascinating moment at the end of the dinner when the organizers thanked me, everyone was getting up to go home, and Elon Musk made everyone sit back down and said, no, no, but seriously, what are we going to do about AI? I'm not letting anyone leave this room until you either give me a convincing counterargument for why we shouldn't be worried about this or give me an idea of what to do. It was quite a memorable vignette. I found myself drawing a blank. I couldn't convince him of why we shouldn't be worried, and I was aware of the conversation around AI: for some people it's a human-extinction-level risk, for others it's a present-day ethical problem. I didn't have a reason why we shouldn't be worried about it, but I also didn't have a concrete answer to his question: so seriously, what is the plan? As I was finishing my previous book, starting around 2015, I really began to see a dramatic movement within the field to actually put some kind of plan together, both on the ethical questions and on the further-into-the-future safety questions. Both of those movements have grown, I would say explosively, between 2016 and now. The questions about ethics and safety, what in the book I describe as the alignment problem, how do we make sure the objective the system is carrying out is in fact what we intend for it to do, have gone from marginal and somewhat philosophical questions at the edges of AI to, I would say, the central question. So I wanted to tell that story and, in a way, figure out an answer to Elon's question: what are we doing?

When I was getting into this, there's obviously a lot of very complex technology here, and I don't think a lot of people are aware of how it's applied in life-and-death situations in our society today. Before the start of the program you gave a number of examples of AI tools, and one of your examples was the algorithms being used by judges, not just in California, where instead of cash bail a judge will use an algorithm to determine whether or not a suspect is going to be released before trial.
Proposition 25 would affirm a law replacing cash bail with an algorithm-based system. It's a very complicated issue, and what surprised me when I was looking at it was that the NAACP and Human Rights Watch oppose this proposition because of the inequities of the algorithm. If you could give us an example and build from there: what is the algorithm, and how did we get here?

Absolutely. There's a nearly hundred-year-long history of statistical, what were called actuarial, methods, this attempt to predict outcomes in probation and parole. It started in the '20s and '30s but really took off with the rise of personal computers in the '80s and '90s, and today it's implemented in almost every jurisdiction in the U.S.: municipal, county, state, and federal. Increasing scrutiny has come along with that, and it's been very interesting watching the public discourse around these tools. For example, the New York Times editorial board was writing open letters in 2015 saying it's time for New York State to join the 21st century: we need something that is objective, that is evidence-based, and we can't just be relying on the whims of folks behind the bench; we need science. Then that position changes sharply, and by the end of 2016 the New York Times is running a series of articles saying algorithms are putting people in jail, algorithms show apparent racial bias, calling out by name one particular tool, which is among the most widely used throughout the United States. That question has really ignited an entire subfield around what it means to say that a tool is fair. The system is designed to make predictions about whether someone will reoffend if they are released on probation or pending trial. What does it mean to take these concepts that exist in the law, equal opportunity, equal protection, and so on, and turn them into math? How do we look at a tool like this and decide whether we feel comfortable with it?

You actually give some examples of a Black suspect and a white suspect with similar crimes and different backgrounds, and how much more likely the white suspect was to go free, including one case [inaudible]. So what goes into the baking of that cake? Because the biases were not intentionally baked into it, yet they are still hard-baked in there sometimes.

It's a very good question. One place to start is to look at the data that goes into this. Let's just think about the pretrial case for now. Typically the tool is predicting three different things: one is your likelihood of not making your court appointment; the second is your likelihood of committing a nonviolent crime while you are pending trial; and the third is your likelihood of committing a violent crime. The question is what data these models are trained on. If you look at something like failure to appear in court, the court knows about it by definition. That measurement is going to be fairly unbiased, in the sense that regardless of who you are, the record shows whether or not you showed up. If you look at something like nonviolent crime, it's the case that, for example, if you poll young white men and Black men in Manhattan about their self-reported marijuana usage, the rates are similar, and yet if you look at the arrest data, the Black person is 15 times more likely to be arrested for using marijuana than the white person. In other jurisdictions that multiple is different; it varies from place to place.
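To make the point about arrest data concrete, the following is a minimal, purely illustrative Python sketch; the group labels, the 30 percent offense rate, and the 15-to-1 arrest disparity are hypothetical numbers chosen only to mirror the kind of gap described above, not figures from the book or from any real risk tool.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical setup: both groups offend at the same underlying rate,
# but group 1 is arrested far more often per offense.
group = rng.integers(0, 2, n)                 # 0 or 1
offense = rng.random(n) < 0.30                # identical true behavior
p_arrest = np.where(group == 0, 0.02, 0.30)   # 15x arrest disparity
arrest = offense & (rng.random(n) < p_arrest)

# A "risk tool" trained on arrest records can only estimate P(arrest),
# not P(offense), so group membership now looks highly predictive.
for g in (0, 1):
    mask = group == g
    print(f"group {g}: offense rate {offense[mask].mean():.2f}, "
          f"arrest rate {arrest[mask].mean():.3f}")

The printout shows identical offense rates but a roughly 15-fold gap in arrest rates, so any model fit to the arrest label inherits that gap as "risk", which is exactly the proxy problem being described next.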
That's a case where it's really important to remember that the model claims to be able to predict crime, but what it is actually predicting is rearrest, and arrest is an imperfect, and systematically so, proxy for what we really care about, which is crime. It's ironic to me: as part of researching these systems I went back into the historical literature from when they first started being used, which was in Illinois in the 1930s. At the time a lot of the objections were coming from the political right, and they made essentially the same argument that progressives are making today, just from their side. Conservatives in the late '30s were saying, now wait a minute, if the bad guy is able to evade arrest, the system doesn't know that he committed a crime, treats him as if he were innocent, and will recommend his release and the release of other people like him. Now we hear the argument being made from the left, which is to say that if someone is wrongfully arrested or wrongfully convicted, they go into the data as a bad person, a criminal, and the system will recommend detention. It is the same argument, just framed in different ways. That's a very real problem, and we are starting to see groups like, for example, the Partnership on AI, which is a nonprofit industry coalition of Facebook, Google, and a number of other groups with about 100 different stakeholders, recommending that we not use predictions of nonviolent rearrest at all.

The second component, and this is a very vast question, but the second thing worth highlighting is the question of what you do with the predictions. Let's say the tool predicts a certain chance that you're going to fail to make your scheduled court appointment. Fine, that's the prediction. There's a separate question, which is what we do with that information. One thing you can do with that information is put the person in jail while they wait for trial. That's one answer. It turns out there is an emerging body of research showing that if you send people a text message reminder, they are much more likely to show up for their court appointment, and there are people proposing solutions like providing daycare services for their kids or providing them with transportation to the court. So there's this whole separate question: as much scrutiny as is rightfully being directed at the actual algorithm and its predictions, there is a much more systemic question, which is what we do with the prediction. If you are a judge and this person is at risk of failing to appear, you might want to recommend some kind of text message alert for them as opposed to jail, but that may or may not be available in that jurisdiction, so you work with what you have. That's a systemic problem, not an algorithmic one per se, but the algorithm is caught in the middle, if you will.

Let's get out of the crime and punishment area. You talk later in the book about hiring, and Amazon comes up in it. What they were finding was that the system favored men, and the reasons for this were the way the system was being trained and used, which gets to the question of what they were trying to find in people in the first place. Tell us about that and what they found.

This involves Amazon around the year 2017. They, like many companies, were trying to design a system to take a workload off of humans. If you have an open position, you get a huge number of resumes coming in, and ideally you'd like some kind of system to triage them.
In a somewhat cute or ironic twist, Amazon decided they wanted to rate applicants on a scale of one to five stars, so rating prospective employees the same way Amazon customers rate products. To do that they were using a type of computational language model called word embeddings. Without getting too technical, for people who are familiar with the rise of neural networks, the neural network models that were so successful around 2012 started to move into computational linguistics. In particular there was this very remarkable family of models that represent words as points in space, so if you have a document with a word missing, the model can guess the missing word based on the other words nearby. Each word is represented abstractly as a point in, say, a 300-dimensional space, and these models have a lot of other cool properties. You can do arithmetic with words: take king minus man plus woman, search for the point in space nearest to that, and you get queen. You could do Tokyo minus Japan plus England and get London, these sorts of things. These numerical representations of words ended up being useful for a surprisingly vast array of applications, and one of them was trying to figure out the quote-unquote relevance of a resume to a given job. What you could do is say: here are all the resumes of the people we have hired over the years; throw those into this word model, and for any new resume, let's just see which of its words carry the kind of positive attributes and which carry negative attributes.

Okay, so it sounds good enough, but when they started looking at this they found all sorts of bias. The word "women's" was assigned a penalty. If you went to a women's college, the word "women's" was getting a negative deduction, a negative rating, because it sits farther away from the words on the resumes of the people who had been hired in the past: it doesn't appear on the typical resume that did get selected, and it isn't similar to the words that often do. Of course the red flag goes off, and they think, maybe we can just delete this attribute from our model. Then they notice it is penalizing women's sports, field hockey for example, so they get rid of that; it is penalizing the names of certain women's colleges, so they get rid of those. Then they notice it's picking up on all of these various word choices that were more typical of male engineers than female engineers, so for example words like "executed" and "captured," as in "captured market value," phrases that were more typical of men's resumes. At that point they basically gave up and scrapped the project entirely.

In the book I compare it to something that happened with the Boston Symphony Orchestra in the 1950s, when they were trying to make the orchestra, which had been male-dominated, a little bit more equitable, so they decided to hold auditions behind a wooden screen. But, as was figured out later, when the auditioner walked out onto the wooden parquet floor, the committee could of course still identify them by the sound of their shoes, and it wasn't until they additionally instructed people to remove their shoes before entering the room that the gender balance of the orchestra finally started to even out. The problem with these language models is that they basically hear the shoes: they are detecting the word "executed," and the team in this case just gave up and said, we don't feel comfortable using this technology. Whatever it's going to do, it's going to identify some very subtle pattern in the engineering resumes we have had before, where the gender balance wasn't there, and it's just going to sort of replicate that into the future, which is not what we want. In this particular case they just walked away, but this is a very active area of research.
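The word arithmetic can be sketched with toy vectors. The two-dimensional numbers below are made up so the classic analogy works; real word2vec or GloVe embeddings are learned from text and have hundreds of dimensions.

import numpy as np

# Hand-made toy "embeddings"; purely illustrative, not learned vectors.
E = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
}

def nearest(vec, exclude=()):
    """Vocabulary word with the highest cosine similarity to vec."""
    best, best_sim = None, -2.0
    for word, v in E.items():
        if word in exclude:
            continue
        sim = float(v @ vec) / (np.linalg.norm(v) * np.linalg.norm(vec) + 1e-9)
        if sim > best_sim:
            best, best_sim = word, sim
    return best

# "Arithmetic with words": king - man + woman lands nearest to queen.
print(nearest(E["king"] - E["man"] + E["woman"],
              exclude={"king", "man", "woman"}))

The resume scorer described above works on the same geometry: score a resume by how close its words sit to the words of past hires, which is why a term like "women's," largely absent from a historically male pool of resumes, picks up a penalty.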
How do you de-bias a language model? You try to identify subspaces within this 300-dimensional space that encode the stereotypes; can you delete those dimensions? This is an active, ongoing area of research.

How much did Amazon spend developing that?

It's a great question. They are pretty tight-lipped about it. Most of what we know comes from a Reuters article. I wasn't able to do a lot of follow-up, but as I understand it, they not only shelved the product, they disbanded the team that made it, so they really washed their hands of it.

I'm asking because I assume millions were put into that. They could have hired an extra H.R. person or two. Another example I wanted to get into, from a different angle, is the self-driving car. You talk in the book about the fatality that happened because of the way the car was recognizing this person. Again, explain that.

This was the death of Elaine Herzberg in Tempe, Arizona, the first pedestrian killed by a self-driving car. It was an R&D Uber vehicle, and the full National Transportation Safety Board review came out at the end of last year, so fortunately I was able to get some of it into the book. It was very illuminating to read the official breakdown of everything that went wrong. It was one of those situations where probably six or seven separate things went wrong, and if any one of them had gone differently, the outcome might have been different. One of the things that was happening was that the car was using a neural network to do object detection, but it had never been given an example of a jaywalker. In all of the training data the model had been trained on, people walking across the street were perfectly correlated with zebra-striped crosswalks and with intersections, and so the model didn't really know what it was seeing when it saw this woman crossing the street mid-block. Most object recognition systems are taught to classify things into exactly one of a discrete number of categories; they don't know how to say that something doesn't fit into any category. This is again an area of active research that the field has only recently started making headway on. In this particular case the woman was walking a bicycle, and so the object recognition system entered this fluttering state: first it classified her as a cyclist, but she wasn't moving like a cyclist; then as a pedestrian; then it saw the bicycle and thought maybe it is just an object rolling in the road; no, I think it's a person; no, I think it's a cyclist. Due to a quirk in the way the system was built, every time it changed its mind about what type of entity it was seeing, it would reset the motion prediction. It's constantly predicting how a typical pedestrian or cyclist and so on would move, and as a result where they will be a couple of seconds from now. Every time it changed its mind, it started recomputing that prediction from scratch, so it never stabilized on a prediction. There were additional issues with overrides the Uber team had made: by 2018, cars already had a rudimentary form of self-driving that would automatically brake or swerve, and Uber had overridden that and added their own system, and the two interacted in problematic ways. But the object recognition issue itself is, for me, very emblematic, and there's a question of certainty and confidence: when the system says I'm 99 percent sure that it's a person, or whatever it might be, how do we know whether those probabilities are well calibrated, and how does the system know what to do with them?
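On calibration: a model is well calibrated if, among all the predictions it makes with roughly 99 percent confidence, about 99 percent turn out to be correct. The sketch below is one standard way to check that, an expected calibration error over ten confidence bins; the example arrays are hypothetical stand-ins for a detector's logged confidences and outcomes.

import numpy as np

def expected_calibration_error(confidence, correct, n_bins=10):
    """Average gap between stated confidence and empirical accuracy, bin by bin."""
    confidence = np.asarray(confidence, dtype=float)   # top-class probabilities
    correct = np.asarray(correct, dtype=bool)          # was the top class right?
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap                 # weight by bin size
    return ece

# A detector that says "99% person" but is right only 80% of the time at
# that confidence level shows up here as a large calibration error.
print(expected_calibration_error([0.99, 0.99, 0.99, 0.99, 0.99],
                                 [True, True, True, True, False]))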
I think many people within the uncertainty community would now argue that the mere fact that the system keeps changing its mind should, on its own, be a huge red flag to slow the car down, and that wasn't done. So it's very heartbreaking to think about how all of these engineering decisions added up to an event that would have been so much better to avoid. The silver lining is that there are lessons being taken to heart, not just in industry but also in academia: we really need to get to the bottom of this question of certainty and uncertainty, because I think that's a very human thing. You see it in the medical literature, where you don't want to take an irreversible action under uncertainty. You see it in the law with things like a preemptive judgment, and I'm forgetting the term, but a judge may issue an order in advance of the real ruling because they are trying to prevent irreparable harm. There is a question for the machine learning community, which is how do we make sure these systems don't take irreversible actions.
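The idea that changing your mind should itself slow the car down can be written as a very small guard. The sketch below is hypothetical, not the Uber system or anything from the NTSB report: it keeps the last couple of seconds of class labels for a tracked object and, instead of resetting the motion prediction, caps the vehicle's speed when the label keeps flipping.

from collections import deque

class TrackedObject:
    """Hypothetical track that remembers recent class labels for one object."""
    def __init__(self, history_len=20):          # roughly 2 s at 10 Hz
        self.labels = deque(maxlen=history_len)

    def observe(self, label):
        self.labels.append(label)

    def label_flips(self):
        """How many times the classifier changed its mind recently."""
        seq = list(self.labels)
        return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

def speed_cap_mps(track, nominal=17.0):
    """Treat classification instability as uncertainty and slow down for it."""
    flips = track.label_flips()
    if flips >= 4:
        return min(nominal, 5.0)    # crawl until the scene is understood
    if flips >= 2:
        return min(nominal, 10.0)
    return nominal

track = TrackedObject()
for label in ["cyclist", "pedestrian", "unknown", "cyclist", "pedestrian"]:
    track.observe(label)
print(speed_cap_mps(track))   # 5.0: too much disagreement, so slow down

The thresholds are arbitrary; the point is only that the flip count, the very signal the original pipeline discarded by resetting its prediction, becomes a reason to act more cautiously.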