C-SPAN: Big Data and Politics, August 23, 2017

In 1979, C-SPAN was created as a public service by America's television companies and is brought to you today by your cable or satellite provider. Now, a discussion on the future of big data analytics and analyzing people's personality traits based on their social media activities. This is from the Computer History Museum in Mountain View, California.
We will explore one of the most important facts about our present and future: large-scale data collection and analysis has changed, and will continue to profoundly change, politics in the developing and, most especially, the developed world. Firms, largely private, are assembling profiles of individuals, for example every eligible voter in the United States, with a depth of information that would have made J. Edgar Hoover weak with envy. The use of this so-called big data and the inferences made from it has a singular purpose: to craft and deliver messaging that will shape the future behavior of an individual, from buying a particular brand of toothpaste to discouraging a citizen from voting, from choosing one ridesharing service over another to voting one way or another on profound decisions like the U.K. Brexit vote and the U.S. presidential election.
These profiles are assembled from what our guest this evening calls our digital footprints: the traces of our daily lives that are captured, often sold, and analyzed. Many of us don't realize we are leaving these footprints behind through our use of credit cards, web searches, online purchases, and smartphones, through what we listen to on streaming services, and through what we watch on cable TV. Tonight, our guest from Stanford University will introduce us to his work making assessments of individuals from their digital footprints, particularly their Facebook likes and profile pictures. He will also help us begin to grasp how assessments like these are being, and could be, used to shape our political and social reality. Please join me in welcoming our guest to the stage. [applause]
Michal: Hi, David.
David: Good evening, everyone. Thanks for helping us pull back the curtain on the use of big data and our digital footprints to assess and influence each other. Maybe we can begin by having you describe your work for us and what you believe can be learned about us from our Facebook likes and profile pictures.
Michal: Instead of learning about people using surveys, I look at the digital footprints, which you heard introduced before, that we are all leaving behind while using digital products and services. It is a great time to be a computational psychologist. It's a great time to be me, because we leave an enormous amount of digital footprints behind. In 2012, IBM estimated that we are leaving 50 megabytes of digital footprints per day per person, which is an enormous amount of data. If you wanted to back it up on paper by printing it out on letter-size paper, double-sided, font size 12, and you stacked up one day's worth of data, the stack of paper would reach from here to the sun four times over. Exactly. We are all generating enormous amounts of information. Now, this information of course contains a trail of our behavior: our thoughts, feelings, social interactions, communications, purchases. It even contains the things we never intended to say. I'm not sure if you realize this, but if you type a message on Facebook and then decide, it is 2:00 a.m., maybe I drank too much wine and I should not send it, and you abandon the message, guess what: it is still being saved and analyzed. It is not just this one platform. In most cases, data is preserved even if you think you have deleted it. My main goal is to take this data and learn something new about human psychology or human behavior. One of the byproducts of doing this is that I produce models that take your digital footprints and try to predict your future behavior, or your psychological traits, such as personality, political views, and so on. What was shocking to me when I started working in this field is how accurate those models are.
That is one shocking thing. They are also very difficult to interpret. I know a computer can predict your future behavior. A computer can reveal or determine your psychological traits from your digital footprint. But it is very difficult for a human scientist to understand how exactly the computer is doing it, which brings me to this black box problem. It basically means that human psychologists and scientists might one day be replaced by AI. In the meantime, we have those models, and we don't really understand very well how they do it. They are amazing at predicting your future behavior, your psychological traits, and so on. I have worked with Facebook likes quite a lot, but not because Facebook likes are the best type of digital footprint we are leaving behind. Not at all. Facebook likes are not so revealing. Why? Because liking happens in a public space. When you like something on Facebook, you probably realize your friends will see what you have liked. You wouldn't like anything really embarrassing, or maybe really boring, something you want to hide from your friends. But when you use your web browser, search for something on Google, or go and buy something, you have much less choice. You will search for things you would never like on Facebook. You will visit websites you would never like on Facebook. You will buy stuff you would never like on Facebook, such as medicine that is very revealing of your health. Most of us do not like the medicine we are taking on Facebook. This means that if someone can get access to your credit card data, your search data, or records from your mobile phone, these digital footprints will be far more revealing than whatever I can do using Facebook likes. Whatever findings I come up with are just conservative estimates of what can be done with more revealing data.
You can see entire industries, not just one industry, moving towards building their business models on top of the data we are producing. My favorite example is the credit card. How many of you have actually paid for a credit card recently? We have a few people who maybe didn't do their research online properly, or are so fancy that they pay for their credit card. Most of us, including me, do not pay for credit cards. And think about it: a credit card is an amazing, magical thing that allows you to pay for stuff without carrying cash. There is a complicated network behind it, computers crunching data, and so on. Yet we are not paying for it. Why? Because we are the product of the credit card company. You can go to the website of Visa or Mastercard, or any other credit card operator, and you will see that they do not see themselves as financial companies. They started as financial companies, helping to channel payments. Now they see themselves as consumer insights companies. By observing the things you are buying, when you are buying them, and how much you are spending, they can learn a lot about you on the individual level. They can also extract information on a broader level, for example when they see that people in San Francisco recently started buying certain things or going to a certain restaurant. This is very valuable information that can be sold. Basically, if you are not paying for something, you are most likely the product. Think about your web browser, which you probably didn't pay for. Your Facebook account. Your web search engine. Any one of the gazillion apps you have on your phone. Think about how much data you are sharing with others.
David: Your use of Facebook was originally as a graduate student at Cambridge, I believe. At the time, I believe Facebook likes were public. Anybody could see them. Did that make that kind of data set available to you, since it was just public on Facebook? Is that what led you to use the data?
Michal: Yes, you are pointing out another reason why I'm using Facebook likes. I was lucky to have a huge data set from volunteers who donated their Facebook likes to me, as well as their political views, personality and other psychological scores, and other parts of their Facebook profiles. In 2006 or 2007, my friend started this online personality questionnaire, where you could take a standard personality test and then receive feedback on your scores. It went viral. More than 6 million people took the test, and half of them generously gave us access to their Facebook profiles. When you finished the test, it would ask whether, in return for us offering you this interesting thing, you would be willing to give us access to your Facebook profile, which we would later use for scientific research. So we got around 3 million Facebook profiles.
People like to say, when I graduated from high school, I already planned to run this research 20 years later. That was not the case for me. I stumbled into this research by accident. I was developing traditional personality questionnaires. They are composed of statements such as "I am always on time at work," "I like poetry," or "I don't care about abstract ideas." I also had this data set of Facebook likes, where people were basically saying "I like poetry," "I don't like abstract ideas," or "I don't like to read." It struck me: why would we even ask people these questions if we can just go to their Facebook profile, look at their likes, and fill in the questionnaire for them? So I started running simple machine learning models that would take your Facebook likes and try to predict your personality score. This worked really well, which was actually pretty disappointing for me. I had spent so much time developing those questionnaires.
Now a computer can do the same thing in a fraction of a second. We had other data in our data set. OK, we can predict personality; I wonder if we can predict political views, religion, sexual orientation, or whether your parents were divorced. Each time we asked the question, the computer would think and say, of course we can predict this. It was amazing. We were pretty suspicious in the beginning. We would rerun the models with independent data, thinking, I must be doing something wrong, given that a computer can look at your Facebook likes and predict with very high accuracy, close to perfect, whether you are gay or not. People don't like anything obviously gay on Facebook. Some do, but it is a small fraction. For most users, those predictions were based on the movies they watched and the books they read. It looked very counterintuitive to me at the time that you could do it. Now that I am older and have spent more time running the models, it is pretty obvious.
Let me try to offer you a short introduction to how those models work. It is actually pretty intuitive. Suppose I told you there is an anonymous person and they like Hello Kitty. It is a brand, I am told. [laughter] You would probably be able to figure out, if you know what Hello Kitty is, that this person is most likely female, young, and an iPhone user. You could probably go from there and make other inferences about her. And you would be very correct: 99 percent of people who like Hello Kitty are women. You don't need computer scientists to make inferences of this kind. But most of your likes, your purchases on Amazon, the locations you visited with your phone recording them, and the search queries you put into Google are not so strongly revealing about your intimate traits. That does not mean they are not revealing at all. They are revealing. Take the fact that you listened to Lady Gaga 30 times yesterday. It is not only weird; it shows something about your musical taste.
It also reveals, for sure, to some tiny extent, your sexual orientation, political views, intelligence, and virtually any other psychological trait we would like to predict. It is just that the amount of information there is really tiny. For a human being, this is useless. But a computer algorithm can take this tiny bit of information and aggregate it over the thousands of digital footprints you are leaving behind to arrive at an accurate prediction of what your profile is. This is the paper I published in 2013. I was very excited about the promises of this technology. I'm still excited about the promises. It is used to improve our lives in many ways, and we don't realize how. If you don't believe me, think about Netflix or Spotify, or the Facebook news feed, which is so engaging that people spend two hours a day on average looking at it. They don't look at it because it is boring and they don't want to do it. They look at it because the AI made an accurate prediction about what their character is and adjusted the message to make it engaging. There are also downsides, as I am sure we will be talking more about today.
The paper I published in 2013 got quite some press coverage. Most of the coverage was like, this is so cool, we can predict whether somebody is a Republican from their Facebook likes. Nice, shiny gadget. I said, no, you have to realize there are tremendous consequences for the future of our society. The answer was, no, it is just so cool, and this is as far as we go. That is how the general public treated the results. Interestingly, policymakers and companies took notice. Two weeks after the results were published, Facebook changed its privacy policies so that Facebook likes were no longer public. Before March 2013, when we published the paper, likes were public for everyone to see. I didn't even have to be your friend on Facebook to see everything you liked.
Now our work showed that by seeing what you like, I can also determine your sexual orientation, political views, and all those other intimate traits. I think it was a great thing that Facebook took notice and, to preserve your privacy, switched that off. The U.S. government and EU governments also took notice. They started working on changing legislation to protect their citizens from some of the shortcomings of this phenomenon.
David: Let's talk about some of the political uses of this work. I want to hear about how private firms are using big data analytics akin to your work to shift voting results one way or another by microtargeting messaging that is defined by its intended persuasion rather than its accuracy. One of these firms is Cambridge Analytica, part of a network of interconnected, privately held firms that were involved in both the U.K. Brexit vote and the Trump campaign. Because much of what we know about this story is due to recent investigative journalism, especially by The Guardian newspaper, I thought I could provide the audience with a quick review of the story. Cambridge Analytica is a U.S. firm, mostly owned by Robert Mercer, and until August 2016 it had Steve Bannon as its vice president. Mercer is one of the most successful quantitative hedge fund managers, a major owner of Breitbart News, and a major financial supporter of the Trump campaign. Steve Bannon left his executive positions when he became manager of the Trump campaign. Of course, he is now the chief strategist to President Trump. Cambridge Analytica employs data mining, as well as government records and data sold by corporations, to develop a dossier on every U.S. voter, which was first used by the Ted Cruz campaign and later by the Trump campaign to microtarget their messaging and direct their ad buys to influence voters. A Canadian firm has been a central consultant for this kind of thing with the various U.K. organizations that pushed for the Brexit vote.
Cambridge Analytica appears to be the owner. I should note that Time magazine reported yesterday that congressional investigators are looking at Cambridge Analytica in the context of their exploration of Russian activities in the U.S. presidential election as well, which may have included Russian elements using techniques like those used by Cambridge Analytica. Can you tell us how your work relates to this whole thing, and how we should think about the claims about the effectiveness of their work for the Trump campaign?
Michal: Those are very good questions. There was a lot there. First of all, we don't really know how effective they were. Interestingly, when you listen to them, they started by saying how amazingly efficient they were. When they realized governments were getting interested, and that some of the things they had done, which had become public, were not entirely legal, they suddenly changed their spiel. Now they say it did not work at all, we were just making stuff up. Which obviously means they are lying now or they were lying then. What I can tell you for sure is, first of all, that we have a lot of evidence produced in academia showing that such approaches work really well. We also see that it is not only the Trump campaign or the Brexit campaign; all serious politicians employ methods like this in their campaigns. In fact, Barack Obama was the first politician to do it on a massive scale. I don't remember any outrage then, especially on the left side of the political spectrum. Hillary Clinton did it as well. She not only spent three times more money than Donald Trump on personalized targeting on social media, she also hired way smarter people, in my opinion. Yes, she lost, but she didn't lose because Trump was using some kind of magical methods. The difference was caused by something else.
When people ask me, can data analytics and personalized marketing win an election, the answer is yes and no. It is a fact of life when you are running a political campaign, like TV spots, writing articles, and putting ads in the papers. But because everyone is using it, it is not giving anyone an unfair advantage. The only advantage here is that Barack Obama was the first one to use it on a massive scale, so it must have given him an unfair advantage. Also, we as humans like to focus on the negative. It is great that we focus on the negative. This is clearly a great psychological trait, because it has allowed us to be successful as a species, probably even too successful to some degree. But let's put aside focusing on the negative and the risky. Let's think about the advantages of politicians being able to personalize their message. There are a few interesting outcomes people seem not to notice. If I can talk to you one on one, that is what I can do. I can use algorithms to help me talk to you one-on-one about the things most relevant to you. They ran those al
