00:00:00
among other projects and you were doing lots of stuff you got involved in some very heady questions about the origins of truth on the Internet and this is where we're getting folks because the
00:00:12
the work that Danny is describing now in theory ultimately became a venture right Metaweb so that's right so what I really thought is that what we need to
00:00:25
do is have a way of representing the knowledge of the world in a way that machines can get at it and take advantage of it and that that should be shared everybody should be able to get at it that's in some sense that if the
00:00:38
human knowledge isn't a shared resource then what is I think what has civilization been doing all these years so I created a company that built
00:00:51
this database called Freebase which was a free database and the company basically
00:01:08
took any kind of public knowledge that we can get information about anything and put it in a machine readable format we were kind of creating it with the idea that this is gonna be useful to the
00:01:21
world we didn't really have a business model and we started building it up and then it became useful to lots of different people including in particular all the search engines so eventually
00:01:34
Google bought it of course and then I got Google to agree to keep it open for three years but they only kept the part that was already open open and they
00:01:47
started building it up and so now Google has something called the knowledge graph which is the evolution of this and it probably has about a hundred billion different entities so everybody in this
00:02:02
room is in that graph this building is in that graph yeah I took a screenshot or when you're just Google know your house and all of these different yeah this
00:02:17
event is and yeah so anything like a person a place an event anything like that is in this huge knowledge base and all the relationships between them are so when you for instance print
00:02:32
out a Google Map that is rendered from the knowledge graph so the knowledge graph knows the bus schedules and it knows you know the address of the restaurant and the traffic pulling all this information together around the
00:02:46
thing that the searcher cares about that's right so the map is just in some sense a custom rendering of a piece of the knowledge graph for your particular purpose yeah and also by the way I don't know this doesn't have any ads on it
00:02:59
but the other thing is that the ads are also a lot of knowledge graph about what the products are about and you know whether you know it probably
00:03:10
has knowledge about you specifically and so on so it's going to be way beyond the kind of public knowledge it's also beginning to probably have very particular private knowledge about people too now from Google's perspective
00:03:24
it's safe to say that this is a quantum leap in terms of the original basis of its sort of citation based search you know model all of a sudden it is now
00:03:38
providing this multi-dimensional search that is drawing in way more richness yeah so that it still does the old kind of search so right now when you let's say I put in museums of New York yeah
00:03:52
museums in New York well it still does the old keyword search of searching for pages that have the word museum and the phrase New York but it doesn't if you say an exhibition in Manhattan or something
00:04:07
but you might have something that is a museum in New York that actually didn't use the word museum in New York on the page but the knowledge graph knows that Manhattan
00:04:17
is in New York and it knows that you know exhibitions are in museums or it may know something is a museum even if it doesn't use the word museum in its title and so it's actually able to pick that up even though it's not it doesn't
00:04:31
have the keyword so that will play into the search results to come up it does a search that's based on the semantics and of course that's very important because that kind of knowledge is completely language independent too so the same
00:04:45
knowledge that informs your search in English also informs somebody's search in Mandarin or Hindi or something like that so but the bad news is so the good
00:04:57
news is you know it's turned out to be really useful there are these big representations of knowledge but the bad news is the whole idea of it being this free open thing that everybody was going
00:05:08
to use has actually become really just something that is a competitive advantage of Google and now you know other search engines and other companies will make their own I'm sure Apple's working on it
00:05:22
Amazon and you know each of the big companies IBM Microsoft you know they'll each work on their own database I think so the world could go in one of two directions we could either have
00:05:36
this sort of oligarchy of big companies that have you know knowledge bases that they use for proprietary advantage or it could flip over and say it becomes
00:05:51
a public resource that we could say we want knowledge to be a public resource and we want in particular knowledge that's tied to who said what is this dog doesn't real reason truth remember
00:06:04
since who said stop and that becomes then a resource for doing things like sorting out what's fake news or deciding what medical treatments yeah you know what effects are in the scientific
00:06:18
literature you know things like that that really don't align very well with commercial right and this is where Underlay comes in Underlay in many respects is your attempt to kind of reclaim this
00:06:31
technology as the public good that you initially envisioned it as yeah it's my penance for having well so I've actually stuck on the screen here I thought there was a very
00:06:44
nice paragraph on the very simple Underlay website which basically in written terms explains kind of what it's attempting to do and it says the Underlay aggregates statements and
00:06:57
reported observations along with citations of who made them and who published them for example it would not contain the bare assertion that Sudan's population was 39 million in 2008 but rather that Sudan's population
00:07:10
was provisionally 39 million in 2008 according to the UN Statistics Division in 2011 referencing Sudan's national census as reported by its Central Bureau of Statistics and as contested by the southern People's Liberation Movement
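As a rough illustration of what a machine readable version of that paragraph might look like, here is a minimal Python sketch. The class name `Assertion` and every field name are hypothetical and are not taken from the Underlay project itself; only the values come from the paragraph quoted above, with the organization names spelled as they appear in this transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A claim together with who made it (hypothetical schema, for illustration only)."""
    subject: str
    predicate: str
    value: int
    as_of: int                  # year the value refers to
    status: str                 # e.g. "provisional"
    asserted_by: str            # who made the statement
    asserted_in: int            # year the statement was made
    references: tuple = ()      # sources the statement relies on
    contested_by: tuple = ()    # parties disputing the statement


sudan_population = Assertion(
    subject="Sudan",
    predicate="population",
    value=39_000_000,
    as_of=2008,
    status="provisional",
    asserted_by="UN Statistics Division",
    asserted_in=2011,
    references=("Sudan's national census", "Central Bureau of Statistics"),
    contested_by=("Southern People's Liberation Movement",),
)


def describe(a: Assertion) -> str:
    """Render the assertion as text without dropping its provenance."""
    text = (f"{a.subject} {a.predicate}: {a.value:,} ({a.status}, {a.as_of}), "
            f"per {a.asserted_by} in {a.asserted_in}")
    if a.references:
        text += "; referencing " + ", ".join(a.references)
    if a.contested_by:
        text += "; contested by " + ", ".join(a.contested_by)
    return text


print(describe(sudan_population))
```

The point of the sketch is only that the number never travels alone: the source, the date of the statement, and the dispute are stored in the same record as the value.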
00:07:22
yeah and it would do that not in those words but in a kind of machine readable right so that those could be and ultimately this is this version of
00:07:35
of what you were going at becomes almost a kind of record of all of these observations over time and it can be tracked you know so if we wanted to get to the heart of let's say you know
00:07:49
whether you know in one of these hearings we've just watched somebody said what are the other we could trace it potentially back to the first recorded instance yeah and and if you
00:08:03
take a problem like that I would regard that as an application of the Underlay just like Google Maps drawing a map is but if you take sorting through fake news and
00:08:14
recognizing when rumors are getting out of control in order to do that you really need a very complex representation of who's saying what so you can kind of trace well this person said that or this person said that
00:08:27
this person said that or you know the New York Times said that you know the Drudge Report said that you know and so
00:08:39
there is something that needs to be built on top of the Underlay that is essentially a network of trust for that purpose so you know somebody has to say
00:08:51
well okay I trust New York Times where that trust Fox News or vice versa or you know I and these would be organizations or individuals with some sort of framework of analysis that could that
00:09:03
would leverage the Underlay for interpretive purposes and that's gonna be for different purposes I mean you know an awful lot of the things that people argue about I mean you know is Taiwan a province of China
00:09:17
well you know if you're doing something with the Chinese government you've got to count it as one if you're doing something with Taiwan you're probably not gonna count it you know so for some purposes it is for some purposes it
00:09:32
isn't and so what's the truth of that well there isn't exactly a truth it's you know what's the purpose what's the trust in it and so on
00:09:43
and in many of these so I sort of feel like the Underlay in some sense is a piece of the plumbing that we need to deal with the fact that the amount of information has become overwhelming that
00:09:57
no human can hold it all in their heads nobody can be sort of familiar with all the news sources or things like that and then that lets us build these things on top of it where computers help us be
00:10:11
smarter and sort of navigate these networks of trust and so you're conceiving of this challenge this is in the mid-early aughts right and what
00:10:25
were the you know what were the first inklings of an approach that technology could provide to addressing this and to kind of capturing the chain if you
00:10:37
will of custody of information so the idea was to build something that basically agreed on what the things you were talking about the
00:10:51
entities that you were talking about let people make statements about the relationships between them but then have some provenance of who made those statements so that instead of recording
00:11:04
that you know the glass is sitting on the table you record Danny said the glass is sitting on the table on such-and-such a day and then once you have all that information
00:11:16
recorded then that lets you first of all it lets you record the information without worrying too much about whether it's true it's true that I said that right which is much easier to determine
00:11:28
than whether it's true that the glass is actually on the table but then it also lets you apply basically your idea
00:11:40
of trust afterwards after you get more information about who I am or later you find out I'm a liar or later you find out the glass was someplace else you can weigh those previous recordings against
00:11:52
yeah so the idea is that what we really need to do is we need to separate out two things we need to separate the record of what different people said and who said it the
00:12:04
provenance of what was said and then separately have in some sense a network of trust which is going to be different for different purposes ultimately there's lots of
00:12:16
kinds of knowledge that I think really are fundamentally part of the public commons the public good and I hope that those will end up in it and I think it's not as complicated as copyright law
00:12:28
where you know you're taking the expression of individual artists and things like that a fact is a fact it's not copyrightable truth you know somebody figures out that you know the
00:12:39
you know the geographical location of this building you know that's just a truth nobody owns that and really it's to everybody's advantage to share that
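The separation described earlier in the conversation, between a record of who said what and a network of trust that is applied afterwards and differs by purpose, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `Statement` type, the `support` function, the second speaker "Alex", and the numeric trust weights are all hypothetical and do not describe how the Underlay actually works.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One recorded statement: who said it, what was claimed, and when."""
    speaker: str
    claim: str
    holds: bool   # True if the speaker says the claim holds, False if they deny it
    date: str


# The record only captures who said what; it takes no position on what is true.
RECORD = [
    Statement("Danny", "the glass is on the table", True, "such-and-such a day"),
    Statement("Alex", "the glass is on the table", False, "such-and-such a day"),
]


def support(record, claim, trust):
    """Score a claim by summing trust in speakers who affirm it minus trust in deniers."""
    score = 0.0
    for s in record:
        if s.claim == claim:
            weight = trust.get(s.speaker, 0.0)  # unknown speakers carry no weight
            score += weight if s.holds else -weight
    return score


# Two different trust networks applied to the same unchanged record.
trusting_danny = {"Danny": 1.0, "Alex": 0.25}
after_learning_danny_lies = {"Danny": 0.0, "Alex": 0.5}

print(support(RECORD, "the glass is on the table", trusting_danny))
print(support(RECORD, "the glass is on the table", after_learning_danny_lies))
```

Under the first set of weights the claim comes out supported, and under the second it does not, even though nothing in the record was edited. Keeping the two layers apart is what allows trust to be revised later without rewriting what was recorded.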
00:12:52
[Music]