00:00:03
G'day everyone. I'm going to give you a whirlwind tour of Peergos and some of the cool stuff we've built on top of IPFS. So what is Peergos? It's a global
00:00:15
peer-to-peer private file system and application protocol designed for the average person to use safely. It's a file system, so everything has a unique human-readable path, which
00:00:28
begins with your username. You can share individual files or folders, read-only or writable, with constant-time sharing with a group or revocation. So the usernames are unique; how do we do
00:00:44
that? You need a PKI. This is basically just a mapping from username to a list of signed claims, and each claim is essentially two public keys: your identity public key and your home server public key. That's
00:00:58
an IPFS node ID. This is slightly more secure than DIDs in the general case, because with DIDs you can have DNS leaking into it via the service
00:01:11
endpoints, which are URLs, so we avoid that via the home server node ID. This is all stored in a CHAMP, which is a compressed hash-array mapped prefix trie, a super cool data structure.
00:01:23
It plays well with CRDTs, it's insertion-order independent, and a bunch of other stuff. For the PKI, the only consensus we actually need is time ordering, so that two people can't claim the same
00:01:36
username. You get efficient lookup, comparison and merge, and all this CHAMP data is mirrored on every instance, so you get private local search
00:01:49
if you're trying to connect with a new friend, which is super important for something that's social. Once you've got the identity of the person whose data you're trying to get, or log in as, or whatever, from the PKI, what
00:02:01
do you do next? You need to get a mutable pointer. In our case the mutable pointer is a mapping from a public key to a signed pair of CIDs.
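A minimal sketch of how such a pointer store might behave, assuming the signed pair is (previous CID, new CID) so that updates act as a compare-and-swap. The talk does not spell out the pair's contents; `PointerUpdate`, `MutablePointers`, and the injected `verify` callback are hypothetical names for illustration, and real signature checking is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class PointerUpdate:
    # Assumption: the pair is (CID the writer believes is current, CID to move to).
    previous: Optional[str]
    updated: str


class MutablePointers:
    """Hypothetical in-memory store mapping an owner's public key to a CID."""

    def __init__(self, verify: Callable[[bytes, PointerUpdate, bytes], bool]):
        # verify(owner_key, update, signature) stands in for a real
        # public-key signature check, which is out of scope here.
        self._verify = verify
        self._current: Dict[bytes, str] = {}

    def get(self, owner_key: bytes) -> Optional[str]:
        return self._current.get(owner_key)

    def set(self, owner_key: bytes, update: PointerUpdate, signature: bytes) -> bool:
        # Reject updates that were not signed by the pointer's owner.
        if not self._verify(owner_key, update, signature):
            return False
        # Compare-and-swap: reject writers working from a stale value.
        if self._current.get(owner_key) != update.previous:
            return False
        self._current[owner_key] = update.updated
        return True
```

With this shape, two concurrent writers cannot silently overwrite each other: the second update names a stale `previous` value and is rejected.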
00:02:13
And how do you get this? With a peer-to-peer RPC call. These are just standard HTTP calls over peer-to-peer streams, and I think this is an undersold feature of IPFS,
00:02:27
which is amazing, because you can totally avoid any dependency on DNS or the TLS certificate authorities. You just say, "I want to dial this node by public key," and send whatever you want.
00:02:40
So yeah, it's awesome. This gives us fast retrieval and fast remote updates, but you could still fall back to actual IPNS for a slower
00:02:52
read backup if your server is offline, or for whatever reason. So this is the basic architecture. Peergos installs and runs an IPFS instance itself. I've
00:03:05
just mentioned it's DNS-free and trustless. If Alice logs in on one instance and tries to modify something (Alice has a home server, as we mentioned), all those writes get proxied over a
00:03:19
peer-to-peer stream, and so the data ends up initially on the home server. With a file system, especially a social one, you need access control,
00:03:30
so we do that with a thing called Cryptree+. You've heard Cryptree several times today already, so what does the plus mean? Well, Cryptree itself was
00:03:44
invented in 2008, so we've added a bunch of things on top of that initial version, including metadata privacy and ciphertext privacy, and made it post-quantum.
00:03:56
It's pure capabilities, so you don't need to rely on a server to enforce access control. It's fine-grained. It's also stored in a CHAMP; we like CHAMPs. The ciphertext access control is
00:04:09
a relatively new thing, as of January this year. We do that with things called block access tokens, or BATs, which I'll talk more about later. Another super cool thing that we get is
00:04:23
zero-I/O seeking. If you have a huge file, gigabytes, maybe even terabytes, it's encrypted, but you want to be able to
00:04:34
seek to somewhere down there really quickly. You've got the start of the file, say; how do you do that? Obviously, if you encrypt the entire file at once you would have to download the entire file and decrypt it, which is not going to
00:04:46
work. So the first part of that is you chunk the file, obviously, and each chunk is independently encrypted, so you can get whichever bit you want and decrypt it. But the other
00:04:59
key thing is how you get from the location of the first chunk to the location of some later chunk, and that's the zero-I/O thing, which, if you want to hear more about it, just talk to me later.
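The talk leaves the seeking mechanism unexplained, so the following is only one plausible sketch, not the Peergos scheme: it assumes each chunk is stored under a 32-byte label and that the next label is a hash of a per-file secret and the current label. The names `stream_secret`, `next_label`, and `label_for_offset` are hypothetical, and the 5 MiB chunk size is taken from the figure quoted later in the talk.

```python
import hashlib

# Assumed chunk size: the "5 meg" mentioned later in the talk, read as 5 MiB.
CHUNK_SIZE = 5 * 1024 * 1024


def chunk_index(offset: int) -> int:
    """Index of the chunk containing the given byte offset."""
    return offset // CHUNK_SIZE


def next_label(stream_secret: bytes, label: bytes) -> bytes:
    """Derive the following chunk's label from the current one (assumed scheme)."""
    return hashlib.sha256(stream_secret + label).digest()


def label_for_offset(stream_secret: bytes, first_label: bytes, offset: int) -> bytes:
    """Walk the hash chain locally; no network or disk reads are needed."""
    label = first_label
    for _ in range(chunk_index(offset)):
        label = next_label(stream_secret, label)
    return label
```

Under these assumptions a reader holding the secret computes the label of any chunk with local hashing only, while a server without the secret cannot tell that two labels belong to the same file.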
00:05:12
As you'd expect with IPFS, you get efficient modification, so if I modify a byte of a terabyte file I don't have to re-encrypt and upload the whole thing. This is how it looks: you've got your internal CHAMP nodes,
00:05:27
then you have a cryptree node for each chunk of your file or directory, and that can have links to the encrypted file fragments. The keys in this CHAMP are
00:05:42
basically random. Subsequent keys in a file are not random, but they're still not deducible by the server, so the storage, your home server, can't figure out or can't link
00:05:54
the different chunks of the same file. We use that to hide the size of the file, among other ways. The read cryptree is pretty simple. It's been discussed earlier, but yeah,
00:06:07
it's a tree of symmetric keys. If you have one key you can follow the arrows, follow the links. It also gives everything a well-defined path, so if I just give you access to this file you can follow the parent links to get the names,
00:06:21
so you have a path, but you still can't see if there are any other files in that directory, any siblings or anything like that. The write tree is even simpler: there's just one key for each file or
00:06:34
directory. These are all symmetric keys, by the way, in the previous slide. Also, the top ones are symmetric keys; these are obviously key pairs at the bottom. And the metadata that we protect:
00:06:49
file names, file name sizes if you care about that, and the file sizes I've mentioned. So the chunking part gets you down to
00:07:00
modulo 5 MiB. We also add padding pre-encryption to a multiple of 4 KiB, so you end up with 5 MiB over 4 KiB, or 1280, possible chunk sizes in the entire world, so that's cool. The IPLD format for
00:07:15
cryptree that we use makes files and directories indistinguishable, so the server can't tell what's a file and what's a directory, or who has access, or even the directory topology. This is what the cryptree format
00:07:32
looks like. This is the cryptree node itself. It's a dag-cbor node, and there are basically three independently encrypted bits. The first two are quite small and
00:07:44
are more to do with the structure of the cryptree, and this is the actual data, like the children if it's a directory, or the data of the file itself. And there are these BATs, these things that I keep mentioning.
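The size-hiding arithmetic quoted earlier (5 meg chunks padded to a multiple of 4K, giving 1280 possible sizes) can be checked with a short sketch. It assumes binary units (5 MiB and 4 KiB), which is what makes the quotient exactly 1280, and it assumes an empty chunk still occupies one padding block; `padded_size` is a hypothetical helper name.

```python
CHUNK_SIZE = 5 * 1024 * 1024  # assumed: 5 MiB maximum plaintext per chunk
PAD_BLOCK = 4 * 1024          # assumed: pad up to a multiple of 4 KiB


def padded_size(plaintext_len: int) -> int:
    """Round a chunk's plaintext length up to the next 4 KiB boundary."""
    if not 0 <= plaintext_len <= CHUNK_SIZE:
        raise ValueError("chunk length out of range")
    # Ceiling division; an empty chunk is assumed to occupy one block.
    blocks = max(1, -(-plaintext_len // PAD_BLOCK))
    return blocks * PAD_BLOCK


# Number of distinct padded sizes an observer can ever see.
POSSIBLE_SIZES = CHUNK_SIZE // PAD_BLOCK
```

So an observer who sees a padded chunk learns only which of `POSSIBLE_SIZES` buckets it falls in, not the exact length.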
00:07:57
And there are minor optimizations. Everything here is padded as well, as mentioned, but if a file or directory, which most
00:08:08
directories are, is under 4 KiB, we just inline it so you don't have to do any other network requests. So back to BATs: what is a BAT? The important point here
00:08:23
is you shouldn't be relying just on encryption for privacy. If you make your ciphertext public, that matters in a whole bunch of threat models. So with the BATs we've got
00:08:36
post-quantum access control at the block level in IPFS. It's again pure capability-based, and the cool thing is it manages to maintain the auto-scaling properties
00:08:48
of IPFS. In IPFS, if one node retrieves a block, it can then help to serve it up, and the way we've done it is the same: they can help to serve it up and continue to apply the same auth
00:09:02
to it. And what actually is a BAT? A BAT is just 32 random bytes. The auth we use over libp2p is S3 V4 signatures, which are time-limited and
00:09:18
tied to the source IPFS node making the request. That means we don't have to worry about these auth tokens; we could just broadcast them to the DHT. There's no such thing as a replay attack.
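To illustrate why a time-limited, requester-bound signature rules out replay, here is a simplified sketch. It is a stand-in, not the real S3 V4 signing procedure: it uses a single HMAC-SHA256 keyed by the BAT over the CID, the requesting node ID, and an expiry time. The names `sign_request` and `should_release`, and the message layout, are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_request(bat: bytes, cid: str, requester_id: str, expires_at: int) -> str:
    """Requester side: derive an auth string from the secret BAT (simplified)."""
    message = f"{cid}|{requester_id}|{expires_at}".encode()
    return hmac.new(bat, message, hashlib.sha256).hexdigest()


def should_release(bat: bytes, cid: str, requester_id: str, expires_at: int,
                   signature: str, now: Optional[float] = None) -> bool:
    """Serving side: release the block only for a fresh, correctly bound signature.

    requester_id must be the node ID of the peer that actually made the
    request, not a value taken from the auth string itself.
    """
    now = time.time() if now is None else now
    if now > expires_at:
        return False  # time-limited: expired signatures are useless
    expected = sign_request(bat, cid, requester_id, expires_at)
    return hmac.compare_digest(expected, signature)
```

Because the signature covers the requester's node ID, a captured auth string presented by a different node fails verification, and because it covers an expiry time, it stops working shortly afterwards even for the original node.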
00:09:31
This whole auth token, with the signature and its wrapping, is 89 bytes, so about two and a half CIDs, and we have one of those for, not every block, some blocks are still public, but the ones that actually have ciphertext
00:09:45
in them, so it's quite a low overhead. But of course you need a modified Bitswap to be able to handle this, so we've added, yeah, a Bitswap which sends
00:09:56
this auth string. You can see it there; there's a URL. One super important thing, which I kind of just mentioned, is that with any scheme you use, you need to
00:10:10
check it against the actual source node ID coming to Bitswap which made the request. We use this in a thing called ipfs-nucleus, which is a stripped-down
00:10:25
IPFS implementation that has all the stuff we need. It basically just has the block API, so we call it an IPLD daemon, and you can see those are all the
00:10:39
API calls we have, as well as, obviously, the peer-to-peer HTTP proxy. This ipfs-nucleus thing has a customizable block allow API, so this is the thing that Bitswap hooks
00:10:55
into, and this is the function signature, basically. Allow is passed the CID, the actual data of the block, the source node ID, and then the auth string it received over the network, and it just returns whether or
00:11:08
not Bitswap should release this block. So again, you can check out ipfs-nucleus if you want. And there were two things I haven't had time to talk about. One is our
00:11:21
GC implementation. We have a fully concurrent GC. You might have noticed that there's no pin API here; we don't actually have a pin API. Pins are implicit for us from,
00:11:36
basically, the mutable pointers, so the GC just grabs the mutable pointers and you've got an implicit pin set, and you can proceed from there. The other thing, which I'll talk
00:11:48
about tomorrow in another talk, is that we've just released an application sandbox, which lets you run private applications over private data in an untrusted way, so that the
00:12:00
application, if it was malicious, couldn't steal your data or exfiltrate it. So yeah, thank you. If you have any questions, come find me. [Applause]
End of transcript