Present and our future. Large-scale data collection and analysis has changed, and will continue to profoundly change, politics in the developing and, most especially, the developed world. Increasingly, private firms are assembling profiles of individuals, for example of every eligible voter in the United States, with a depth of information that would have made J. Edgar Hoover weep with envy. These profiles have a singular purpose: to craft and deliver messages that will shape the future behavior of an individual, from buying a particular brand of whitening toothpaste to discouraging a citizen from voting, to choosing one ride-sharing service over another, to voting one way or another on profound decisions like Brexit in the U.K. and the presidential election. These profiles are assembled from our digital footprints, the traces of our daily lives, captured and often sold and analyzed. Many of us do not realize we're leaving these footprints behind through our use of credit cards, web searches, online purchases, smartphone use, the services we listen to, and what we watch on cable TV. With us is Professor Michal Kosinski from Stanford University, who will talk about Facebook likes and pictures. He will help us begin to grasp how assessments like these are being used, and could be used, to shape our political reality. Please join me in welcoming him to the stage. Thank you for joining us on the stage. Tell us about what you can help us learn from our likes and Facebook pictures.

Michal: I am a computational psychologist, which means that I work mostly with data. So instead of spending time with research subjects in my lab or running surveys, I look at the digital footprints that you so nicely introduced before, the ones we are leaving behind while we use digital products and services. It is a great time to be a computational psychologist, and it is a great time to be me, because you guys, we all, leave an enormous amount of digital footprints behind.
We are, IBM estimated, leaving 50 megabytes of digital footprints a day. This is an enormous amount of data. If you wanted to back it up on paper as zeros and ones, on double-sided paper, font size 12, and you wanted to stack it up, and it's just one day's worth of data, the stack of paper would reach from here to the sun four times over. So, well, hopefully you guys [laughter] will store them in the museum over here. So we're all generating an enormous amount of information. And this information, of course, contains a trail of our behaviors, thoughts, feelings, social interactions, communications, purchases, even the things that we never intended to say. I'm not sure you realize that if you type a message on Facebook and then you decide, OK, it's 2:00 a.m. and maybe I drank too much wine, I probably shouldn't be sending it, and you abandon the message and close the window, guess what. The message is still being saved and analyzed. And this is not just this one platform. In most cases data is preserved even if you think that you have deleted it. In my research, my main goal is to take this data and learn something new about human psychology or human behavior. One of the byproducts of doing that is that I produce models that take your digital footprints and try to predict your future behavior, or try to predict your psychological traits such as personality, political views, religiosity, sexual orientation, and so on. What was really shocking to me when I started working in this field is how accurate those models are. So this is one shocking thing. The other shocking thing, in fact, is that those models are also very difficult to understand. I know that a computer can predict your future behavior. A computer can reveal or determine your psychological traits from your digital footprints.
But it's very difficult for a human scientist to understand how exactly a computer is doing it, which brings me to this black box problem, which basically means that it might be that human psychologists or human scientists will be replaced one day by AI running science. But in the meantime, you basically have those models where we don't really understand very well how they do it, but they are amazing at predicting your future behavior, your psychological traits, and so on. I have worked with Facebook likes quite a lot, not because Facebook likes are the best type of digital footprint we are leaving behind, not at all. In fact, Facebook likes are not so revealing. Why? Because liking happens in a public space. When you like something on Facebook, you probably realize that your friends will see what you have liked. So you wouldn't like anything really embarrassing, or maybe something really boring, or something that you want to hide from your friends. But when you use your web browser, or you search for something on Google, or you go and buy something, you basically have much less choice. You will search for things that you would never like on Facebook. You would visit websites that you would never like on Facebook. And you would buy stuff that you would never like on Facebook. You would buy medicine that is very revealing about your health, and most of us don't really like the medicine we are taking on Facebook. Which basically means that if someone can get access to your credit card data, your web browsing data, your search data, or records from your mobile phone, these digital footprints would be way more revealing than whatever I can do using Facebook likes. So whatever findings I'm coming up with, they are just conservative estimates of what can be done with more revealing data. And you can actually see this across entire industries, not just one industry.
They are moving towards basically building their business models on top of the data we are producing. My favorite example is credit cards. How many of you guys have actually paid for a credit card recently? OK. We have a few people that maybe didn't do their research online properly. But most of us, including me, don't pay for credit cards. Now, guess what. If you are not paying for something, think about it for a second. A credit card is just an amazing, magical thing that allows you to pay for stuff without carrying cash around. There's a complicated network behind it, computers crunching data, and so on. Now, we're not paying for it. Why? Because we're the product of a credit card company. And it's not a secret. You can go to the website of Visa or Mastercard or any other credit card operator and you will see that they do not see themselves as a financial company anymore. They started as financial companies, helping to channel payments. Now they see themselves as customer insights companies. By observing the things you're buying, when you are buying them, and how much you're spending, they can learn a lot about you on the individual level, but they can also extract interesting information on the broader level. When they see that, you know, recently people in San Francisco started buying certain things, or going to certain restaurants or whatnot, this is very valuable information that can be sold. So basically, if you're not paying for something, you're most likely a product. Now think about your web browser that you probably didn't pay for, your Facebook account, your web search engine, and any one of the gazillions of apps that you have on your phone. And now think about how much data you're sharing with others.

David: About your use of Facebook likes: I guess at the time, initially, you were a graduate student at the University of Cambridge. Correct? And at the time, I believe, Facebook likes were public.
Anyone could see your Facebook likes. So did that make that kind of dataset available to you, since it was just public on Facebook? Is that what led you to use that data?

Michal: Yes. You're pointing out here another reason why I'm using Facebook likes, which is that I was very lucky to get a huge dataset from volunteers who donated their Facebook likes to me, as well as their political views, their personality and other psychological scores, and basically other parts of their Facebook profiles. Back in 2006 or 2007, my friend David Stillwell started this online personality questionnaire where you could take a standard personality test and then receive feedback on your scores. It went viral. When you finished your test, we would ask whether, in return for us offering you this interesting thing, you would be willing to give us access to your basic Facebook profile, which we would later use for our scientific research. More than six million people took the test, and half of them generously gave us access, so we got around three million Facebook profiles. At the beginning, in fact, you know, people like to say, oh, when I graduated from high school, I already planned, you know, to run this research 20 years later. No, it wasn't the case. In my case, I kind of stumbled into this research by accident. What happened is I was developing traditional personality questionnaires. Traditional personality questionnaires are composed of statements such as "I'm always on time at work" or "I like poetry" or "I don't care about abstract ideas." And I had this dataset of Facebook likes where basically people liked poetry, or they liked "I don't like abstract ideas" or "I don't like to read."
And what struck me is: why would we even ask people these questions if we can just go to their Facebook profile, look at their Facebook likes, and just, you know, fill in the questionnaire for them? [laughter] So I started running those simple machine-learning models that would take your Facebook likes and try to predict what your personality score would be. And this worked pretty well, which actually was pretty disappointing for me, because I had spent so much time developing those bloody questionnaires, and now here a computer can do the same thing in a fraction of a second for millions of people. But then, we had other data in our dataset. So we were like, OK, so it can predict personality. I wonder if it can predict political views, religiosity, sexual orientation, whether your parents were divorced or not. And each time we asked this question, the computer would think for a few seconds and then say, of course we can predict it. It's curious. It's amazing. In fact, we were pretty suspicious. At the beginning I would rerun the models with independent pieces of data, or rewrite my entire code, thinking that I must be doing something wrong, given that a computer can look at your Facebook likes and predict with very high accuracy, close to perfect, whether you're gay or not. And people don't really like anything obviously gay on Facebook. Well, some do, but it's actually a very small fraction of people. For most of the users, the prediction was really based on the movies they watched or the books they read. And it looked very counterintuitive to me at the time that you could do it. Now I'm a bit older and have spent more time running those models, and it's actually pretty obvious. Let me illustrate for you why; let me try to offer you a short introduction to how those models work. It's actually pretty intuitive.
Look, if I told you that there is this anonymous person and they like Hello Kitty (it's a brand, I'm told) [laughter], you would probably be able to figure out, if you know what Hello Kitty is, that this person is most likely female, young, an iPhone user, and you could probably go from here and make some other inferences about her. And you would actually be very correct. 99 percent of people who like Hello Kitty are women. So you don't need a computer algorithm or a rocket scientist or even a computer scientist to make inferences of this kind. Most of your Facebook likes, or most of your purchases on Amazon, or most of the locations that you visited with your phone recording it, or most of the search queries that you put into Google, are not so strongly revealing about your intimate traits. But it doesn't mean they are not revealing at all. Some of them are revealing only to a very tiny degree, but they are still revealing. So the fact that, let's say, you listened to Lady Gaga 30 times yesterday is not only a bit weird, it also shows us something about your musical tastes. But it also reveals, to some tiny extent, your religion, tastes, traits, behavior. The amount of information that is there is really tiny, so for a human being, this is useless. But what a computer can do is aggregate information over thousands of digital footprints you are leaving behind to arrive at a very accurate prediction of what your profile is. This is basically the paper I published in 2013, and I was very excited about the promises of the technology, and I am still excited about its promises: it can be used to improve our lives in so many ways. If you do not believe me, think about the Facebook News Feed, which is so engaging that people spend two hours a day, if I remember correctly, looking at it. They don't look at it because it is so boring and unpleasant.
They look at it because the AI behind it made a very accurate prediction about what your character is and adjusted the message to make it engaging. There are also downsides, which I'm sure we will be talking more about today. Basically, the paper I published led to quite some press coverage. But most of the press coverage was, "So cool, you can predict whether someone is, I don't know, a Republican from their Facebook likes." I was like, no, wait, there are tremendous consequences for the future. And they were like, no, no, no, but that is as far as we go. Interestingly, policymakers and companies took notice. Two weeks after those results were published, Facebook changed its privacy policies in such a way that likes were no longer public. Before 2013, before we published the paper, likes were public for everyone to see. So I would not even have to be your friend on Facebook to see what you like. But our work shows that by seeing what you like, I can also determine your sexual orientation, political views, and other intimate traits that people are not happy to share. I think it is a great thing that Facebook took notice and switched that off to preserve your privacy. Governments took notice too, and they started working on changing legislation to protect their citizens from some of the shortcomings of this phenomenon.

Let's talk about some of those things. We are about to hear you talk about how private firms are using this data analytics to shift voting results one way or another, with microtargeted messaging defined by its intended persuasion rather than by its accuracy. One of these firms is Cambridge Analytica, part of an interconnected set of private firms. Because much of what we know about this political story is due to investigative journalism, especially by The Guardian newspaper, I want to provide our audience a quick review. Cambridge Analytica until August 13 had Steve Bannon as its secretary.
One of the most successful quantitative hedge fund managers was also [inaudible] and a manager of the Trump campaign. Steve Bannon left his positions when he became manager of the Trump campaign, and of course is now chief strategist to President Trump. Cambridge Analytica employed social media data mining to develop a dossier on every U.S. voter, first used by the Ted Cruz campaign and later by the Trump campaign to microtarget their messaging and influence voters. Relatedly, a Canadian firm, AggregateIQ, has been a central consultant for this kind of thing with a few organizations that pushed for the U.K. exit vote. It appears to be the owner of [inaudible] property. Time magazine reported yesterday that U.S. congressional investigators were currently looking at Cambridge Analytica in the course of their investigation of Russian activity, because they may have used techniques like those used by Cambridge Analytica in their investigation of American data. How should we think about Cambridge Analytica's claims in terms of their Trump campaign research?

Michal: Now, there is a lot there. First of all, we do not know how effective Cambridge Analytica was. When you look at what they say themselves, they started by saying how amazing and efficient they were. But then, when they realized that governments were getting interested, and maybe when they realized that some of the things they did might not be entirely legal, they suddenly changed their spiel, and now they say it did not work. There was outrage on a massive scale, especially on the left side of the political spectrum. But Hillary Clinton not only spent three times more money than Donald Trump on personalized targeting on social media, but also hired way smarter people, in my opinion. Yes, she lost, but she did not lose because Trump was using some kind of magical methods. The difference in the outcome was mostly caused by something else. If you ask me whether data analytics and personalized marketing can win an election, the answer is yes and no.
It is a fact of life when you are running a political campaign, like television spots and articles and putting ads in the paper. Everyone is using it. It is not giving anyone any unfair advantage, and the only unfair advantage I can think of is that of Barack Obama, who used it on a massive scale. I think people, because we as humans kind of like to talk about the negative. It is great we focus on the negative because it is clearly a great psychological tr