CSPAN2 The Communicators Brewster Kahle Internet Archive July 13, 2024

Members of Congress, government officials and technology leaders.

Host: Brewster Kahle, what do you do for a living?

Guest: I run the Internet Archive. It's a library on the internet that catalogs books, trying to build the internet into the Library of Alexandria for the digital age.

Host: That sounds like the internet, doesn't it?

Guest: The internet is getting there, but for the published works it is not fast enough. The average life of a web page is only 100 days before it is changed or deleted. One hundred days. We built our culture on this ever-shifting sand. So the Internet Archive takes snapshots of the web pages on websites every two months. It has been doing this since 1996 and offers it as a free service on archive.org, used by hundreds of thousands of people a day to find all the things that have disappeared, either maliciously or sometimes because they just dropped off the net.

Host: How many websites are there today?

Guest: Hundreds of millions, and they are coming and going all of the time. We collect about 800 million pages every day. The total collection is about 800 billion URLs. It's kind of huge, and that turns out to be only part of what we do. We also archive television: ABC, NBC, Fox, but also international television. If you go to tv.archive.org you can search to find clips of what people said and put those in blog posts. The idea is to make it so that people can quote, compare and contrast, and think readily about what has happened on television. The old Daily Show with Jon Stewart did something like that: you said this once and now you say this. Can we do that now? It's valued by journalists and users all the time. It's a free library, a library on the internet.

Host: Why can I not just go to Google and type in Jon Stewart?

Guest: You will find the Jon Stewart show; it may have put up certain clips from its past, or on YouTube you might see a smattering, but you don't know what show it came from and it doesn't have the context of television.
Ours is a run of television. You can only pick bits and pieces of the television before we shut it down, and we try to make it so that the publishers are not unhappy with us. But if you want the whole thing, then we put it on a DVD, or now a thumb drive, and lend it to you. Then you have to send it back. If people want it for a documentary, then they go to the publishers to say, can I use this clip for my documentary? It's just like a library in the sense that you are borrowing things from the library. We also do this with books. We digitize several thousand books a day, about 1 million books a year now, and weave them into the net, so that more and more in Wikipedia, if you go to a footnote and it has a page number, you click on it and it takes you to the right page. Then you can see a page back and a page forward, but if you want more of it you have to borrow it, and if somebody has already checked it out then you have to wait. At least you get a couple of pages. You can fact-check and go in deeper than Wikipedia. Wikipedia is the encyclopedia of the internet, but we want to be the library of the internet, where you go deeper. How do you get to the published works of humankind, whether old web pages, television, books, journal literature, or even music recordings?

Host: What law department do you have to have at the Internet Archive to handle all the rights?

Guest: We do not have any law department. We are a library. We just operate like a library, and lots of libraries don't have large law departments either. The idea is to not offend people or make them feel like they have been taken advantage of. So we don't make... it's a nonprofit library. We cut things short: with television, it is just clips. It's the same with the music collection: when we digitize CDs we try to link to Spotify, so we have the album art, but it's only 30 seconds.
Unless it's old material like 78 rpm records, which is completely great. That was before my time, but it is wacky and fun, so those we make downloadable and you can listen to them. They sound like 78s: you crank it up and there's a horn and a dog. That era, the first half of the 20th century, is largely forgotten because it wasn't moved onto long-playing records and CDs and then onto Spotify. Some was, but most of it was not.

Host: How are you funded?

Guest: The same way Wikipedia or NPR is. We make end-of-the-year pleas for donations and get grants, and about a third of our income comes from libraries, for collecting web pages. We collect, for instance, the web collections of the National Archives of the United States or the Library of Congress; that all comes from the Internet Archive. We have a room inside the Adams Building and you should go visit it. It's a room in the Library of Congress where they bring book carts down and we are digitizing all day long. We have 20 locations around the country and now the world, digitizing books. You think, should this not be done by a robot, or is it not done by now? But it turns out it hasn't been done. If you look at the number of books that are online at the Internet Archive, it goes up and up until 1923, and then there are copyrights. Basically everything beyond that is somewhat [inaudible], so it goes up, up and then crashes, and there are decades of almost nothing online, and then it comes back up again at the end of the 20th century or the 21st century. But we are missing [inaudible]. Take Amazon itself: you say, okay, it's not online but I can buy it. But people have studied which books, by decade, are available new on Amazon, and it goes up, up, but at 1923 it crashes, and then the 20th century basically is not online. It's amazing: we think there's so much information online, and there is. A lot of it is crap and a lot of it is good, but the published material of the 20th century is almost nonexistent, almost not there.
We are raising a generation, and ourselves really, on not the best we have to offer. We basically have this collective amnesia about the 20th century. Well, we will be doomed to repeat it if we just forget the lessons from other times. We are trying to go through the 20th century. Better World Books is now donating all the books we don't already have to the Internet Archive, and they get those from libraries that are [inaudible], and we're trying to basically fill in the 20th century and make it so all those Wikipedia footnotes turn live. We even went and fixed the broken links in Wikipedia, for Wikipedia. Katherine, the executive director, was worried that the truth might fracture: that if we do not really work on making Wikipedia stronger and cited by better sources, people would start citing sources that were available but not good. Those citation wars that happen behind the scenes on Wikipedia are based on how good those citations are and whether you can click and see them. We committed to going and fixing all the broken links and filling in all the books and the journal literature linked from Wikipedia. We fixed 11 million broken links in Wikipedia in the last couple of years, and now we are going to all the books, finding them and replacing that black text with a blue link so you can click on it and go to it. If the books are missing, we try to find those books, digitize them, and put them on.

Host: How did you come up with this idea?

Guest: It was the vision of the internet that a bunch of us had, but certainly I had, of what I wanted the internet to be. It was 1980, and it was like, why don't we go and make the Library of Alexandria for the digital age? But we had to build the computers, the internet and the World Wide Web, and I helped participate in this.
I don't know, Internet Hall of Fame; I've been at this stuff for a long time, trying to build the early things. Then the web came along and got the publishers on the web, and by 1996 we had enough momentum that I thought I could turn to building the library. The idea is to make all the published works of humankind one click away. If you are in the middle of a rural place, or in Africa, or some other place, and you want access, you should be able to have access. That was the dream of the internet that I signed on to. We are now in 2020 and we are still not there yet. But a mounting number of us are saying, let's get there. It was a good idea, Ted Nelson's, to make a hyperconnected set of information, so let's do that. For some of us, what is motivating is disinformation. Misinformation. Fake news. People are making stuff up. They're not being called on it because you can't get to the cited material. You can't actually go and say no, here is better information. People are making stuff up and we can't live that way. We have convinced a whole generation to turn to the net to answer questions. We don't go to libraries any more in the same way. We may go there for events and things, but it's probably not to go pull books. Audiobooks are great and on the rise, but for reference materials it is the net, and the net isn't good enough yet. We are working on it. We are the 300th most popular website, with 1 million users every day that come to it. They look for information. Some people want to live in their bubbles, but an awful lot want... The Internet Archive is part of that.

Host: You had a little invention called Alexa at one point. Whatever happened to that?

Guest: Alexa Internet, the company that Amazon.com bought. It is not actually the little talking widget. Alexa Internet is a web monitor named for the Library of Alexandria, and I worked for Jeff Bezos directly for three years. Terrific time, smart guy, and hopefully...

Host: Hopefully he paid you in stock.

Guest: He did.
The smartest thing I did was not sell all of it. It has helped the Internet Archive grow and grow. Thank you to Jeff Bezos. And to Steve Case, who bought my company before that; he ran America Online. I built a company that America Online and Steve Case bought. I've been fortunate, but it was all toward building the library. It was always toward this. I've only had one idea, so I'm just trying to stay at it. I really wanted, by 2020, October 2020, and I set this goal a few years ago, to say: welcome to the library of the internet. The Internet Archive would be a piece, but the internet is a library and has all the features that, frankly, we grew up with, whether it's old periodicals, or that it has reliable access and a card catalog so that you can find things. Can we make a live library of the digital age come to be, one that has enough to raise educated citizens? If we don't, we will end up with a generation that will learn from whatever they have in front of them, and if it's paid-for stuff from political points of view or foreign points of view, or just trolling people that are making stuff up, we will end up with a mess, and I'd say we are sort of seeing that play out. Why don't we go and stand up and help out the Facebooks and the Twitters? They are trying to make referenceable material, maybe not as much as they should be, but how do we make it possible? People need to know what they are looking at. It may be made up, but at least you can know that it is made up, based on the analyses of the authors of the material. How can we go and build an internet that is a global brain that we can learn to trust? Because right now we are in this position where it's starting to be scary out there.
People are starting to worry that maybe the internet is full of junk, but we don't have an alternative of where to go otherwise. So how do we go and reinforce it, and make it so that a website that wants to be better is able to be better and referenceable? How do we help authors and Wikipedia contributors? How do we give them access to the library and the books in the library so they can reference right into it? And what do we give the readers? My favorite thing recently with this weaving books into the web with Wikipedia was my next-door neighbor, 15 years old. I was telling her we will [inaudible], and she lit up and said, I want that. I never get a rise out of my 15-year-old next-door neighbor, and I said, why do you want that? She said, well, my school won't let me quote Wikipedia in my research papers. Wikipedia, that's not good enough; you have to follow through. If I could click on it and open the book, I could do my homework in the middle of the night. That's good, right? That is what we want. We want people to go deeper, and to make it so that publishers sell books up a storm and may even sell more books, but that readers get the information out of music, video, journal literature and old periodicals, that they know where it came from and that they can track it.

Host: You have nine months for your 40-year-old goal. Will you make it?

Guest: We are trying to get, as they say in Silicon Valley, the minimum viable product. Can we have enough to do this? Phillips Academy Andover went and lent us their whole library so we could digitize it, and the full library of one of the best prep schools in the country is now a high school library for anyone that wants to have access to it. Marygrove College, a college in Detroit that unfortunately just went out of business, was a Catholic girls' school that became coed, but just last year was its last, and what they did with their library is donate it to the Internet Archive.
Now we are in the process of digitizing it. Over the next nine months we will have a college library and a complete prep school library, plus about 1.2 million other books, and if we could get up to a total of 4 million books, which is an 80-million-dollar project, a lot of money but doable, we would have a Yale, Princeton or Boston Public Library class library available to anybody that wanted it on the internet. That is the dream we are going for. We start with these first steps and weave them into Wikipedia so people find them. That's on the book side. The web side is going well, and we are using it to help journalists know when things are being disappeared by people and to be able to see some of the web references even though they may have been taken away.

Host: What are the mechanics of digitization? Does someone have to stand there page by page by page?

Guest: We build our own machines. One holds the book like this so it does not break the bookbinding, and there is glass that raises and lowers with a foot pedal, so it's like a workout, pedaling. You lower the glass, it flattens the page, and it goes click, click. Then a person turns the page. Now click, click again. You say, should that not be done with a robot? But we tried. We invented a robot company to try to get this to work, and it ripped books, it was inefficient and it broke a lot. We just said, let's have people do it, and people are doing this now at a couple of thousand books a day. Google has already digitized an enormous number of books, and some of them are available, but some got caught up in copyright issues. Our approach to digitizing is that where we have a physical copy, we digitize it, and only one reader at a time can read it. You can get a couple of pages as a preview, like Amazon's look inside the book, but if you want the whole book then you check it out for two weeks, and then it comes back and goes to the next person that wants it.
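
The one-reader-per-physical-copy lending rule described above can be sketched in a few lines of Python. This is only an illustration and not the Internet Archive's software: the `Title` class, its field names and the first-come waitlist are hypothetical. Only the two-week loan period and the limit of one reader per owned copy are taken from the interview.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import date, timedelta

# Loan length mentioned in the interview ("check it out for two weeks").
LOAN_PERIOD = timedelta(days=14)


@dataclass
class Title:
    """A digitized title backed by some number of owned physical copies.

    Hypothetical model: the names and the waitlist policy are assumptions.
    """
    owned_copies: int
    loans: dict = field(default_factory=dict)       # reader -> due date
    waitlist: deque = field(default_factory=deque)  # readers, first come first served

    def borrow(self, reader: str, today: date) -> bool:
        """Lend a copy if one is free; otherwise put the reader on the waitlist."""
        if reader in self.loans:
            return True  # reader already holds a copy
        if len(self.loans) < self.owned_copies:
            self.loans[reader] = today + LOAN_PERIOD
            return True
        if reader not in self.waitlist:
            self.waitlist.append(reader)
        return False

    def give_back(self, reader: str, today: date) -> None:
        """Return a copy and hand it to the next waiting reader, if any."""
        if self.loans.pop(reader, None) is None:
            return  # reader had no loan; nothing to do
        if self.waitlist:
            next_reader = self.waitlist.popleft()
            self.loans[next_reader] = today + LOAN_PERIOD
```

With `owned_copies=1`, a second reader who calls `borrow` while the first loan is active is queued and receives the book when the first reader calls `give_back`. The number of simultaneous loans never exceeds the number of physical copies owned.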
If at any time there is more than one book, say two copies, or another library has copies, then they can lend them out as well. So it's restricted, and not even all that great because it's restricted, but it balances the copyright interests and makes sure there are no more copies floating around than were originally bought and purchased from the publishers.

Host: Brewster Kahle, in 1980 when you came up with this idea, was it a lightning strike or a gradual thought process? What were you doing at the time?

Guest: I was walking over the Charles River, and a friend of mine posed this question, which I thought [inaudible], which was: Brewster, you are a technologist? Yes. You are also a utopian idealist? Yes. Paint a portrait of something positive because of your technology. That turned out to be a very hard question. We are really good at complaining about things, whether nuclear war or Nicaragua problems, but coming up with a positive vision is much, much harder. I could only come up with two ideas. One was to save people's privacy, even though people will throw it away, and the other was to build a library out of everything. I thought a library of everything was too obvious, so I tried to work on the privacy one, and found it was too difficult to try to make cost-effective privacy devices by making chips in 1980, so I went to plan B and never turned back. There are a number of us that have this vision of what the internet and the World Wide Web should be, and we've made progress. It's easy to say the internet is just a pile of bile or whatever, but it also has terrific things and participation by lots of people. We need better tools to make our way through it.
It feels like a deluge, it feels confusing, and it sometimes even feels threatening to people, with people actively spreading disinformation and misinformation. We need better tools so we are not going to let this go the wrong way. There is a large number of us: 150 people with the Internet Archive, but thousands and thousands of others that are all participating in Wikipedia, the Internet Archive, public libraries and open source, and they all have the same general dream of building something that is more than just ourselves. It's an information interconnection system that connects people with the information they need and gives people an idea of what they can leave behind by writing things that will endure. That is the dream of the internet that I am still after, and many, many others are as well.

Host: What was your role in the development of the internet and the World Wide Web? Did you have one?

Guest: With the actual internet, from the opening of the internet, I was on the side more or less. I was part of the engineering group of the internet, working on how you go and build it, but I was not the leader of that. It was a system that was the first publishing system on the internet, and I think [inaudible] came before Gopher, before the web, and that is probably why I'm in the Internet Hall of Fame. But when Tim got the web going, all of these technologies folded into the web. Mosaic, which was the first web browser before Netscape, was a mosaic of several systems including WAIS, Gopher and the web, so it was part of that, and the web was better. I tried to get publishers online and got the Washington, Wall Street Journal, New Yo
