Video: Gerry Sussman: We Really Don't Know How to Compute! (with Slides) (DocDrop)

00:00:00

hi well I'm gonna tell you about today is now I actually think we're in real trouble that is I think we actually have the foggiest idea how to compute very well mmm and I think that there are some

00:00:14

small you know glimmer of hope on the horizon in a number of different directions which I may point out okay but I think most of the things we've been talking about even here are

00:00:24

obsolete as far when I said to somebody after the high school session session I went to the workshop yesterday is G this is the most advanced of the obsolete

00:00:37

languages and well I mean is this okay see that you now see the triangle that isn't there right everybody see the triangle that isn't there good okay what

00:00:51

did you do it took only 100 milliseconds or maybe 200 milliseconds okay that means between your eyes and whatever it is that knew it 100 milliseconds earlier on breaks if you saw something like that right and it was crossing the street

00:01:04

okay you you know do that's only if he would've built that's only a few tens of neuron times I don't care how parallel the machine is I've the foggiest idea how to write a program that organizes a

00:01:16

complex process like that in a latency of about 30 or 40 steps now that's that's an example of something we don't understand how to do it's more serious

00:01:28

than that a human genome is about a gigabyte that's the instructions for making you from a cell okay this yeah you're pretty complicated machine

00:01:41

gigabytes about the same size as the source code appropriate for say making Microsoft Windows all right it's not much bigger or smaller it's something like that okay so yeah we have all the

00:01:53

possible things that it could possibly do and all the drivers that might be relevant and everything it's probably that size it's not can't be it's not much bigger let's put it that way they're not seven three orders of magnitude bigger and a good buddy to say

00:02:05

a gigabyte is not that big but what's interesting is that with a small change to it you make a cow instead of a person so it's flexible too so we've got this

00:02:18

enormous leaf flexible language there that we haven't the foggiest idea what it's like I'm not talking about DNA a DNA so you know this that's that's something like gates or something okay what I'm interested in is the high-level

00:02:30

language that's necessary to do things like that and we certainly don't know how to do that so now what can we say about that well I'll tell you one little thing that

00:02:41

maybe there's some hints you know these biologists do nasty experiments on animals and suppose you took a salamander you know a salamander can regrow its limbs if you cut them off

00:02:54

so all these nasty biologists do things like the cuddle cuddle a limb of a salamander and hand the arm off over here at the show just between the the shoulder and the and the elbow and between and at the same time cut the

00:03:08

same one between the elbow and the hand flip around that segment and resell it back on nasty thing to do the salamander but what do you think you get day buddy

00:03:22

no they've been here no no you get a three elbows it grows two new elbows to make up for the fact that the that the wrong thing is connecting to the wrong

00:03:32

thing okay it's pretty weird okay but but there's a clue that's a clue to something that we should be thinking about about how do you talk about how to

00:03:46

make a bazillion as it would in fun the humans like 10 to 12 cells okay a salamander is probably 10 to the 11th or 10th to 10th I don't know okay but the fact is that how do you make a how do you how do you make that

00:03:58

something big like that grow itself the little neighbors talk to each other and they say oh boy you know listener yeah I'm supposed to connect to a thing like this but ain't there so we'll make some of that okay and that's the sort of

00:04:10

amazing kind of process that's going to be necessary if we want to kind of approach the future in our computing because by golly it's almost impossible to set up a situation where we

00:04:23

understand everything it's going to happen in the future let me give you a get let go back and past a bit okay when I started computing in 1961 this was a this was the computer

00:04:38

which wasn't this particular one but it was an IBM 650 at Columbia University's I was computing at my machine cost a half a million bucks it was it had two

00:04:50

thousand decimal two thousand words of ten decimal digits each of memory on a drum that rotated so that the access time at twelve point five milliseconds

00:05:03

so think big at ten milliseconds there cycle time and that the total amount of memory is about probably about say 10 kilobytes now for the price of that

00:05:16

machine I can now buy 50,000 pcs okay literally okay each with a gigabyte of RAM that's what's it that's ten to the fourth times as much and with which were

00:05:29

about a million times faster so one of the things that's been happening over the little years I'm not trying to worry about mr. Moore's law anything like that what I'm worried about is that all of

00:05:41

our all of our sort of intuitions from being programmers have come from the time of assuming a kind of scarcity a world in which computation is expensive

00:05:56

where a memory is somehow expensive where we have to optimize it okay well there are some applications like that but the answer is most things are not like that at all right now

00:06:07

memory is free computing is free it's our problem now to figure out how to use computing like that and you know if I had 50,000 pcs hooked up with an appropriate network but he probably cuts a million dollars at that point

00:06:20

modern dollars okay well they actually bind in bulk by golly you know you could make pretty serious computer out of that now by those machines took 30 kilowatts but if you're interested I have the

00:06:32

power plug from the 1794 which was somewhat more advanced but that's a different problem okay so it is no longer necessary to minimize operations it is no longer necessary to worry about to worry about

00:06:47

how much memory we take up it is very important to minimize latency that's critical you have to understand that but the real cost of computation is not that at all it's the cost of programmers okay here's

00:07:01

a quote from mr. Evans from Glasgow made a while ago it turns out that almost all the cost of computing is you and me okay that's a different world that's the real

00:07:16

problem okay so what are we going to do well there's lots of things people worry about over the years we hear lots of pundits worrying about things people worry about correctness gee that's a

00:07:30

problem is it the real problem maybe probably not most things don't have to work look at Google if it makes a mistake it doesn't matter okay so long as the next time you get the right answer right or maybe something nice to the

00:07:44

close to the right answer something called reasonable you needs a reasonable answer okay what other kinds of things are there well people worry about about

00:07:56

security well they've got no idea how to handle security we'll find out for example that humans math should make it for about 70 years would be continuously attacked by parasites gee there must be

00:08:10

some clever things we're doing it's all in that gigabyte of code yes we got to think about that we don't know we certainly haven't got any any good ideas about how to make how to make things that last for a very long time we're

00:08:23

continuously attacked by mutating parasites okay the other thing that seems really important to me is that we

00:08:35

spend all our time modifying existing code the problem is our code is not adequately evolvable it's not adequately modifiable to fit the future in fact

00:08:49

what we want is to make systems that have the property that they are good for things that the designer didn't even think of our intent that's a big problem right that's the

00:09:01

real problem I want to go after today a little bit thank so what's up what is this problem the problem is that when we build systems whatever they are we

00:09:13

program ourselves into holes I've done it a number of times I've been programming as I said since 1961 I found more ways to screw up than anybody I know okay and I've learned a lot of hope

00:09:26

but not an infinite amount okay and let's say so here's the here's this problem okay well how do we keep ourselves from from programming ourselves into holes what is the what does that mean it means that we make

00:09:38

decisions early in some process that spread all over our system those the consequences of those decisions such that the things that we want to change later are very difficult to change

00:09:50

because we have to change a very large amount of stuff that's the problem thing and I want to figure out ways that we can organize systems so that the

00:10:02

consequences of the decisions we make are not expensive to change we can't avoid making decisions but we can try to figure out ways to minimize the cost of

00:10:14

the of changing decisions we've made okay so let me start with a simple a simple example one that is so familiar that we all do it these days certainly in in good in good dynamically typed

00:10:33

languages like Lisp or Python or whatever or in in good statically typed languages like Haskell or whatever okay we always have something like generic operators that call the operator overloading and things like that okay

00:10:46

let's talk about that for a second because I'm gonna be very extreme about this here's a here's this to the traditional arithmetic you would find in scheme okay I can add integers together

00:10:58

great I can add interest to rationals I can make new rationals I can add integers to quote reals okay there of course floating point thingies that are very dangerous and

00:11:10

nothing brings fear to might and part more than a floating point number numerical house the blackest art I know and I you know they just all worrying about this it's terrible but then again but we can deal with things like complex

00:11:26

numbers and and the exact any complex numbers which are integer parts and things like that and that's okay okay but you know those things are built in okay

00:11:37

how about dynamically extensible generic operations now we can do that of course if you do it separate compile time and interpretive time or run time something weird is happening here and I want to be

00:11:50

able to dynamically send things while my program is running I want a program to compute up what it's going to do okay remember in the old days when you used to program in assembly language or something at least I used to do that yeah it's a great language because you

00:12:02

could write the next instruction you're gonna execute now you could put it into the place where it was you were going to go to it just was and you know that many get to see programs we written that way it was very flexible there's too many fearful people you know

00:12:14

I can see a lot of people cringing but the fact is that you know you do that and this ball so you can eval something you come it can be concept that's boy is that useful but in any case okay so here

00:12:27

I have I can do things like expand this to have arrested a gun function you know now here's an example I could add together sign in square right over there okay and I got a new function because

00:12:39

that's a classical thing people could do in the algebra or in in mathematics you say if you've got two functions with the same kind of input then they can and they can the some of those is in fact assumed to be the sum of the outputs

00:12:51

let's say as it's a generic extension okay that is for things that are a type procedure one of the right team insane number arguments I could talk about the sum of the square roots of sine and

00:13:05

cosine applied to two point five and I got one that's right that's what I should expect that good that's an example of a simple extension they can be sent you and by golly you know you have all these nice things I'm looking

00:13:17

at watching somebody talking about Zita day and of course you can do things like that okay but you then you get to things with let's do things better there's symbolic algebra okay now this

00:13:28

is this involved this is a a again a generic extension I'm going to explain how it works except that I could tell addition and subtraction multiplication division all those things to deal with symbolic entities as well literal

00:13:41

numbers okay and what I get is a symbolic expression as a result of such a computation okay and I could of course the nice thing is that's a fragment of a program that came out so I could write a program that writes a program and I can

00:13:54

use it I'll tell you a good example of the use of that okay it's a numerical analysis example do you gear integrator everybody know what a gear integrator is anybody know the gear integrator is for something called stiff equations

00:14:07

differential equations okay turns out this and polynomials you have to compute it up and if you computer I thought about the head of time you only can do you know having their step or doubling the step size but if you want to be able to make the step size any size you want

00:14:19

you want to be able to compute up the appropriate polynomial at the moment so you write any a thing that symbolically does that then you pass that through the compiler as you need it and then you you run it and use it in the same program

00:14:31

they're sitting rolling integrating along turns out it's great this great thing to do and great fun okay but the point is that you could get essentially machine level or a machine quality code

00:14:42

you know you're ready fully fully fully native code efficiency except at the moment we obtain you step size to a funny to have step size okay that's a useful thing that's what I like this

00:14:54

because it has uniform representation of programs is data so that symbolic manipulations produce stuff that I can execute okay now but you know you can do something even more while you can did you know that automatic differentiation

00:15:07

does that everybody know what differentiation functions is and what a Mac differentiation can be done as a as a generic extension what you do is you create hyper real numbers numbers that

00:15:20

look like X plus Delta X the same way the complex number is real part plus J times the imaginary part of I if you're a mathematician I J if you're electrical engineer that's because her French invented youth

00:15:31

j4u strife or intensity which is current but in any case in any case there is a you could do this sort of thing gain you could say the derivative of the cosine and the sum of the cosine the square

00:15:43

function applied to X is now I'm using of course my symbolic parts as well so these things all mixed together ok this is the sum of 2 X n minus 2 sine X 2 X minus sine X ok and I can you know I can

00:15:57

look at something more complicated a derivative of a function of two variables which is in fact a structure of the two partial derivatives which is what it should be because if I multiply that by a an increment in the inker in the arguments I should get an increment

00:16:10

in the answer and that's a matrix multiplication this is the ocean a row vector coming out okay that's the sort of thing that you can do very easily of course I'll tell you the trick of how to do this okay just for fun okay for those

00:16:24

of you who are interested in things like this final trick this is the kind thing you're interested in so if you have if you take every possible that was it that was pulled out of Abraham Lincoln

00:16:35

actually use that comment if you have a if you if you extend things X plus DX and you have all your primitive functions including things like addition subtraction multiplication division okay and things like that

00:16:49

now arctan with two arguments and things like that produce an output which is f of X plus d of f x times DX where DX is the thing that corresponds in the that's up to the imaginary unit you know imaginary numbers okay

00:17:02

then you can then where you can do is you can just push things through that and gets tracked out the DF of the DF which is derivative of the function okay now that gets you the chain rule instantly in automatically

00:17:15

alright that's it so you do a generic extension of all your a primitive operations and you get the automatic chain rule which is right and that's hard to do symbolically without any other way but this works also numerically at full speed whatever you

00:17:29

want okay it's producing it's producing a composition in the same way you have combinators so that's a useful kind of thing and I actually use this all the time because I teach a class in fact I'm

00:17:42

teaching one this term in advanced advanced classical mechanics start with Lagrange's equations it goes through canonical perturbation theory and I used computer programming in my class and then two for these of

00:17:54

graduate students and physics and things like that use computer programming in this class to actually clarify the things we vote right on you know would we write down these equations because mathematical notation is both awful and

00:18:06

it turns out ambiguous and be perfectly honest it's in - it's impressionistic right it's very simple it's look what I'm talking to you in English it's the same phenomenon mathematicians talk to each other because they're almost all

00:18:20

like that's way reason why you can understand me as well most all identical and so there's really a few bits have to be transferred in order to get an idea across your maths dishes painting an idea in the Platonic world of another

00:18:32

mathematicians mind okay by talking to them and they write down basically impressionistic scribbles the same way my speech is not grammatical beautiful English okay but in fact when we when we

00:18:44

write programs we have to be precise because computers have dumb at this point and as a consequence that's a great way to show someone the real story okay and so I can write things like this which is the definition of of the method

00:18:57

of computing LeBrons equations right here from the lagrangian given a configuration space path now for those who don't know this sort of thing doesn't matter I'm not going to dwell on it but what this basically is is you make a state path out of configuration

00:19:10

space path which is a tuple of three things the time the the path the path value at that time and the derivative the path values and philosophies okay and you have to take the derivative with respect to the time which is the capital

00:19:22

D here because that's through one argument of the results of getting the partial derivative inspected they took the velocity argument to the Lagrangian and compose that with a state path etc okay and you get the right things here

00:19:35

and by golly if you take a harmonic oscillator or whatever you like you stuff them Lagrange's equations and outcomes and the outcomes the actual residual which is equal to 0 which is Lagrange's equations ok nice ok now if

00:19:47

you want it doesn't matter for what did I say but you know that's yeah that's that's that's easy stuff what if I had to do something with them oh no I mean it what if I deal with put-upon dealing with something with it with it

00:20:04

with a couple of rigid bodies that are that are wobbly and have all since then I can't write the equations down on this on this board okay in fact it's something if you do celestial mechanics like I've done sometimes the equations of twenty pages long

00:20:16

okay and by golly it better be done by a computer you the old days people did it by hem they got it right most of the time it was amazing it's amazing I can't do with what people did in 1911 or whatever but it's quite amazing that

00:20:28

they could do that sort of thing and now we can do this easily and it turns out that this is very simple because of the simple generic extension okay now I'm not trying to claim that solves any important problem okay

00:20:41

very careful about that okay that this is this is a little tiny feet feeling of what is it needed to make things more flexible consider the core they the value here the value is that I'm

00:20:54

dynamically allowed to change the meaning of all my operators to add new things that they can do dynamically not a compile-time and run-time okay I could programs to reduce things which are the

00:21:06

new way to add take that sort of stuff and they can attach that to addition and they can take it off and all that and it can look like it can look like binding I can bind that idea okay that's wonderful

00:21:17

okay what does it do what is doing it's giving me tremendous flexibility flexibility because the programmer I wrote to run on on a bunch of numbers with only a little bit of tweaking

00:21:31

suddenly runs on suddenly runs on matrices so long as I didn't make the mistake of commuting something wrong right if I did I'm in trouble right so this makes proving the theorems bad things very hard in fact maybe

00:21:45

impossible and that's the sort of cost the cost is and the benefit of our are very extreme okay I can pay I could pay correctness or proofs of correctness or

00:21:59

belief in practice I can pay that for from and this flexibility right there now again we worry about ideas like is correctness the essential thing I want

00:22:13

look I'm not opposed to proofs I love Brooks I'm an old mathematician okay the problem is that proofs putting us into the situation that mr. Dijkstra got us into about 30 years ago 40 years ago

00:22:25

where you know you're supposed to prove everything to be right before you write the program okay getting yourself into that situation puts us into a world where we have to make our specifications for the parts as tight as possible

00:22:38

because it's hard to prove general things except occasionally sometimes easier but usually it's harder to prove general theorem of data about something so you make a very specialized thing that works for a particular case you

00:22:50

build this very big Tower of stuff and boy is that brittle change the problem a little and they're all falls over we didn't learn something from the digital from the fact the electrical engineering I did the digital abstraction whereby

00:23:02

the inputs to something much bigger range than the outputs are allowed to produce and therefore you get rid of noise at every stage that's that's the kind of flexibility that I are expecting

00:23:15

yet and I hope to get now the number the old stuff you by the way when you do this contra Derek thing extensions it doesn't mean that the old stuff breaks it means the new stuff is not is not proved and I'd be careful about that

00:23:28

mmm-hmm okay so with old many of the techniques I'm advocating make proof much more difficult and perhaps impossible now I'm going to now be a

00:23:41

little more extreme I haven't been extreme yet this is the beginning of stuff okay I think one of the problems

00:23:53

of the is the kind of architecture we're assuming everything in the world this looks like at least looks like my 650 it's you know it's very occasionally

00:24:04

people have made wonderful things GPU is different right but or maybe I suppose the connection machine was an experiment which was different but here in the future it's going to be

00:24:17

the case that that computers are so cheap and so easy to make that you can put in the size of a grain of sand complete with a megabyte of RAM you're gonna buy them by the button by the

00:24:29

bushel I suppose you can pour them into your concrete and you buy your concrete by a mega flop and then you have a wall that's smart okay so long as you get the power to them and they could do something you know that's gonna happen

00:24:42

right it's sort of the way I said remember your cells are about are pretty smart they're about a gigabyte of RAM ROM but few a few a few bytes of RAM probably and or maybe a few hundred we

00:24:56

don't know how many but and they but they seem to talk to each other and do useful things so we have to begin to worry about that kind of a world okay well I'm going to tell you a story that

00:25:07

started around 1975 okay and but recently had some advances it's called propagators it started when I was teaching my first classes in electrical

00:25:21

engineering at MIT and in circuit theory actually actually the first place I thought was the field theory but then I saw a circuit Theory class with a real wizard of circuitry for theory paul Penfield and I observed that what we

00:25:36

taught the students wasn't at all what the students actually respected to learn that is what an expert person did when presented with a circuit that looked like that was quite different from what

00:25:48

what we tell them to write down the node equations and there are certain equations for the resistor and the capacitors and the and the inductors which are none of them here and the transistor which has a complicated nonlinear equation and that you're

00:26:01

supposed to grind these equations together somehow and solve them to find out what's going on but you know that's not what a part but it really a really good engineer does what a good engineer does he looks at this thing says meaning

00:26:12

well let's consider the bias analysis here that means these capacitors are open circuits okay and that means that therefore and then I'm going to consider this transistor to be beta infinite and

00:26:25

be a follower right now meaning its base its base emitter in voltage is going to be 0.7 that's assumption based on the assumption of how we're using it it might actually be

00:26:38

wrong which place I might have to back out of this okay given that there's no there's no current into the base here there's no current going into this capacitor so there's a voltage divider since I had no voltage here and I know the voltage here I know the moles here

00:26:50

therefore I know the voltage here whose point something's possible on that therefore I know the voltage across this resistor so I know the current through it that card must be coming through the from the compact a collector here therefore it must be going through this resistor therefore it lowers this

00:27:02

voltage by an amount which is RC times the current which is the voltage from here to here divided by the resistance which is points etc okay and so therefore I can tell you therefore I

00:27:14

can tell you the the the the all the bias conditions of this transistor I now use that to check whether or not my assumptions were right they are okay now given that as I incremental II I could treat this earthing that means this

00:27:27

capacitor is now a short circuit that means I tweak this in voltage which means this changes by the right amount which means this current changes by an amount which is the amount this voltage change divided by that resistance which means this voltage goes down by an amount which this which is the current

00:27:40

increment divided by don't multiplied by that resistance so the gain of this stage as an amplifier is RC over RC ra to within 5% that's right okay and every real engineer does that okay and that's

00:27:53

not the sort of thing that we were teaching the students so I want them to make the red band figure out how we teach students this sort of thing right and of course the right bond here at vision of things if you write a program and they give it to the students to read

00:28:04

okay that's what I did is I actually hired Richard Stallman trained in 1975 to work for me and he and I wrote this program together and it was quite a success and I'm just sure I'll show you a modern version of it because the old

00:28:17

code doesn't seem to exist anymore I went looking for it and it was either that was on the old I TS operating system and it's gone and probably probably on some backup tape somewhere which somebody has it could be found but it's not around now so here's a new

00:28:30

version of the same program here's an example that here's the representation of that amplifier I'm not going to tell you what I'm not gonna read it to you okay what it is it's a wiring diagram

00:28:42

guess what the same lambda calculus combination structure that you use for functions works for other kinds of things - all it is is a consistent naming scheme it says for example let R

00:28:57

be 1 be a resistor let let the rail node be a particular node combining a bunch of terminals from the various parts that were declared previously in the outer scope and so I have a bunch of nodes

00:29:12

then I can say I'm going to produce a few more things and and and that's why I'm my the output of that of this rest of the stuff ok wiring diagram the description description of how these things are put together that's a really

00:29:26

nice thing they remember underneath this is lambda calculus nothing more there's no magic in there ok it's just it's just naming as a very smart mystic once said if you have the name of a spirit you

00:29:39

have power over it so then once you do that then you can then you can for example make a test test jig for this which looks sort of like that ok and I could tell it what kinds of

00:29:54

what kinds of parameters I want notice by the way I'm allowed to use I get all my values are also allowed to have have units that's another generic extension it doesn't matter it's just simple once

00:30:06

you have generic operators you can make units and it works just great ok I can assume that I can make various other assumptions about here's the transistor assumptions that are pretty simple I can say what but my assumptions

00:30:19

are forgetting the thing on the phone on the chalkboard here this is to make the arguments short enough to put on one on one slide ok and I guess what's the value of of the potential law of the better the base of the amplifier in the

00:30:31

bias region and it does some deduction and it comes out with an answer in false ok and then you say well it's the value of the potential on the emitter and you get the same kind of thing not very interesting but the really interesting

00:30:43

thing is here I can ask why why do you believe mr. computer why do you believe that the potential on the emitter and the amplifier in the bias region is what it is and out comes a outcomes the thing with

00:30:58

a remarkable QED that has nothing to do with proving okay it's just it to be arrogant okay good what what is telling you is the because remember almost all these

00:31:09

things are lies anyway the models I told you for the transistor roll I get all of all of physics and and and and electrical engineering things like that is based on like useful lies which are appropriate approximations for

00:31:22

particular jobs and the reason why when Galileo invented the modern physics for all practical purposes what he discovered was the value of a lie Aristotle wrote understood that when you roll the ball along the floor it always

00:31:36

stopped Galileo said forget about the friction let's figure it out and then put the friction back in that was a required lie to understand what was really going on that changed the world

00:31:47

that was the big breakthrough of Galileo okay so this QED doesn't mean anything but it really matters is that I'm chasing down the provenance of every of every value as the values move around in

00:32:00

this thing and what's really going on here what's really going on is that I'm doing what the what the expert did I'm seeing something very obvious what I'm seeing is that if I know the value here

00:32:12

that I know the value here because of a local phenomenon okay propagating information around and if I know the value here and I know the value here and I know this value that I know that value okay and that that propagates through

00:32:26

here into here and so on okay so this is information is propagating around we call this propagator as' and that was a that was a bigger idea in 1975 okay

00:32:39

however it does have some like at some some power okay the power is this let's go back here that we have imagine of the propagator or independent little

00:32:52

stateful stateful machines that are connected to a number of little things I'll call cells not biological okay so that a propagator holds a bunch of figures on a bunch of cells

00:33:04

given that it it knows information from this one and this one this one may be able to deduce the value there and go clobber okay I've given that that value suddenly got changed then this guy said oh I see something

00:33:18

interesting and it may go change that guy okay so there's a that the information moves around and fill out to go in any direction but that's so there are these stateful cells that carry half state

00:33:30

okay and there are these fin to say these propagator machines and guess what this is very nice for a very parallel machine but even better Aleksey rad dual my graduate student two

00:33:42

years ago I suppose he got a doctor's degree made a great breakthrough here the great breakthrough was that we don't actually put values in these cells we

00:33:53

put information about a value in a cell and I'll get to that and later a little later what it means is that what a cell contains is a monotonically increasing value of information about a value it

00:34:06

gets better and better information and as a consequence things like the synchronization problems go away for parallel machine this becomes a very powerful mechanism we're building a very

00:34:19

very very large complex machine in a simple way but it's electrical engineers point of view it's not a computer science point of view it's not data flow okay it's closely related but it's not

00:34:34

it's a very different kind of thing because we're worried about information that's being an accumulating these cells but let me just go a little further one of the other problems I hate computer languages now almost all of them

00:34:46

including than once I invent okay and part of the reason is because they're full of expressions that could be in their universe they could be statements okay but they're expressions and what I

00:34:58

mean by that is an expression expression is a tree okay and what's good you got a bunch of values and they print they only come go up the tree and they collected about at one output that's what you want to think about it and you don't have any

00:35:11

names for the intermediate parts there's no name for this place or this place as in the circuit diagrams you have an electrical circuit boy those are useful places to put names on now sometimes they're a pain in the butt you

00:35:23

gotta be careful not to not to go overwhelmed with things but indeed yeah one big difference in this case is I'm going to change it so propagators of all all connections are explicit they have

00:35:36

mains CC this one this is a this is a step for the computation of the square root right and you've clipped by herranz method was so sometimes called Newton's method that was invented 2,000 years earlier by hearing of Alexandria Newton

00:35:50

did a generalization which was tremendously important which is to any function but for square root it was understood by by here on what you do is you take the average of the thing you're trying to take the square root of and the guess you have to the square root

00:36:01

and you add that to the you take the ratio of those add that to the and they guess you have and divided by 2 ceiling the average of the guess into the into the query plus the plus the

00:36:15

guess okay and that converges to the square root rot eclis that sets your ons method and so that's what this is and so here here's a propagator description which is a wiring diagram

00:36:27

and indeed I have to I'm looking now it's something that looks very much like again assembly language I've got good languages for this yet but I'm using lambda glue to put it together and then the glue and says so an excellent for

00:36:40

getting getting going before we understand what the real language is so here we have here we have the cells that we're making up which are all the intermediate ones this takes three cells

00:36:51

as inputs and says if I want to divide X by G I put the result in the x over G cell and divide and G 2 x over G and put the result into the G Plus X over G cell and so on and that's what that is okay

00:37:04

so you're seeing a little wiring diagram for little machines imagine each of these things is autonomous machine running continuously if I can burn processor I can do that supposing I have

00:37:18

as much processor as I had memory for closing a head for every for every little memory module I had I had a big processor associated with it don't worry about that'll happen it's going to happen because there's no other way to

00:37:30

make the future happen okay so of course this works ok as you expect and you got a better guess but now you can also make it iterative wiring diagram view of the world okay what are

00:37:43

we seeing here what we're seeing is of course a machine a wiring time for a machine the machine is potentially infinite because it's got a copy of itself right there inside of itself right it's got the here on step in a

00:37:57

good machine inside of it of course okay but this is a this is a way to think about things this is quite different okay turns out things may have multiple outputs okay this none of the ones here do but things may have more outputs for

00:38:09

example they when you divide integers you get the quotient and the remainder okay that's it why why I have to do one or the other and why not even think about that as being the normal way to think most processes have that property

00:38:22

or what if you have things which have many things that go in and many go out but in any case okay we have wiring diagrams like this and of course the way in which this guy squared entered knows

00:38:33

whether or not to expand himself that's a implementation question is a question like the difference between applicative water and normal water in calculus lambda calculus are you choosing to

00:38:45

something when yeah all the inputs are there only one input or you want the output or anything like that that's a different question okay I'm not gonna worry about that right now okay we can make machines do it any way

00:38:57

we like and we have connect policies that can be can be it can be allow you to do both remember a real engineer doesn't want just a religion about how

00:39:09

to solve a problem like object-oriented or functional or imperative or logic programming this piece of the problem wants to be a functional program this piece of the program wants to be

00:39:21

imperative this piece wants to be object-oriented and guess what the speaks wants to be logic feed based and they all want to work together usefully all right because of the way the débora the problem is structured whatever it is

00:39:33

and I wanna I don't want to think that there's any correct answer to any of those questions it would be awful bad writing right at writing a device driver in a functional language it would be

00:39:45

awful bad writing anything it like I said bolli manipulator in the thing with complicated syntax it's awful bad to write a to write any uh what's good a

00:39:58

numerical problem of numerical program in in anything but for trap okay that's a joke I actually fortune for Trent is a

00:40:12

wonderful language in some ways it's the ancestor of the wall besides lists Fortran listed into oldest languages that still can have anybody using them I think that they are the keys to everything else the current version of

00:40:25

languages okay but let me go further hey what are we what can you do with these propagators i'm only pushing this idea not because i think it's the right answer i'm trying to twist us so we say this is

00:40:39

a different way to think okay some other we have to think fifty two different ways to fix this problem we have i don't know the answer to how to get out of this i don't know how to make a machine that builds a person out of a cell okay

00:40:52

well but i but i think the problem is that we've been stuck for too long thinking about the fact that we've have been diddling with our details we're sitting here worrying about our type system but we ought to be worrying about how to get flexible machines and

00:41:03

flexible programming so here for example yes okay one thing i could do with propagators I can make I can make constraint propagators out and out of directional ones if I say that look I

00:41:17

can make a thing that will done as an adder that knows that this is the sum of these two which means if knows these it'll compute that if it knows these it computes this if it knows these it

00:41:29

computes this okay it doesn't matter which the other which way the information flows and in fact I can do that by this building to sub propagators which the directional three of them

00:41:43

multiplication here is a lie it's actually a little more complicated because multiplication by zero has special properties okay but you sure it doesn't depend upon the other argument so this is just for

00:41:56

show and you're seeing here and so it fits on the slide and there are other things like that okay but once I can do that of course I in fact make my electricity stuff because indeed a resistor a resistor really is a two terminal device with a

00:42:09

constraint that is I times R equals V and indeed some other constraints which is for example the voltage is the difference of potentials from two nodes it's connected to the currents and some of the currents into those terminals is

00:42:22

zero that the product of the current and the voltage is the power okay and that sort of thing okay I can build that sort of machine that's so I can build up from what I just showed you okay so that's

00:42:35

nice and it all works very nicely but the crucial thing here is what I said before what mr. Raul invented two or three

00:42:45

years ago for a doctoral thesis the idea that the idea that that the cells merge information monotonically so I'm going to tell you the story about you know the

00:42:58

story about if you've got a barometer a stopwatch and a ruler how to measure light of a building there are lots of ways to do that so here's one thing I

00:43:09

can have similar triangles okay that is I could have measured I could practice that Sun is coming in from there from from some direction except around here at this time in this day and so there's a there's a shadow cast by the building

00:43:23

and a shadow cast by the by the barometer if I standing on end so what I could do is if I know that if I'm the ruler I can measure the barometer and I can measure its measure its shadow I can measure the shadow of the seat of the of

00:43:36

the building if I mark it fast enough so the Sun doesn't move in time and I can commiserate out with the ruler and then I can compute similar triangles and getting a get a value which is in this case if I started out with a building

00:43:47

that was I didn't put units in here 50 for the building shadow was 50 between 54 feet 0.9 and 55.1 feet long I'm assuming that that's an interval which is error okay and the height of the

00:44:02

barometer is between point three and point three two feet and the shadow of the barometer three point three twenty three six and point three seven feet then the building height is an interval between forty four point five four would

00:44:13

one for at forty eight point nine seventy eight feet okay that's done by building a similar triangles propagator which is simple enough okay of course I could have done the other thing I could go to the top of

00:44:25

building drop the Pramod off and see how long it on the stopwatch it takes to go here the quick right that works okay but if I do that then I can have a flirt s equal one half ay T Square whoops okay

00:44:38

that's equal one half ay T squared fact for a propagator and I know if I measure the fall time I could say I can add that to the system and I get a building height which is forty one point but slightly different take those of course

00:44:50

measurements differ you're not the same but I could do something better okay I get started out with I can combine these measurements

00:45:02

that is supposing I have I know that the building shadow is this that's what I measured with my ruler and I measured the barometers height and shadow then I dropped the barometer off the building

00:45:14

and figured out the times okay and so I got a value there this is this is a better value if I go down here if I look what guess what this value let me go back here second the value here is

00:45:26

between forty one point one six three and forty seven point two four three here it is forty four so it's a tighter bounce but the best thing is that that information propagates back to the measurements hey you know the guy made

00:45:39

the measurement made a slight mistake and we can find it because each of these measurements reinforces the others okay and that's always true in real science by the way how much time is there taken

00:45:52

up so far huh forty five so I've got 15 more good okay thank you but that's I really don't want to talk more than I'm designed to talk for fifty minutes okay I can talk for sixty minutes on a topic

00:46:06

whether or not I know anything about it so each of these each of these values is improved from the values we had before from the ones we measure okay but you know now we can for example bribe the superintendent by giving him the spot

00:46:22

watch and we expect the if you have the the honest answer out of this okay and so we get a value which is not an interval which is 45 he claims the building's height was 45 okay okay so we get the content of the brahmins height

00:46:35

now is better still and all of these are improved measurements ah but you know I could do better than that because in the same way that you invented these Harry

00:46:46

meta monads to big club lambda calculus plumbing which is what they are they're wonderful stuff by the way but they are very hairy for people who haven't who haven't understood that very carefully

00:46:58

and thought about it for a long time so so in the same way you know you want to be able to carry extra information with something and what a monad allows you to do is to pipe around the value you wish

00:47:11

to carry also state or something like that that's the extra information that gets fed into the right place so it's hidden plumbing it can be done with the lambda calculus trick that most list type programmers knew about long ago it

00:47:24

didn't have a name for it and it's very useful to to realize that it's connected to a category theory and things like that okay so what do we do here okay well tracking a provenance okay well I'm going to attach other

00:47:36

information besides things like like units and dimensions and stuff like that I want to carry with it where information came from so I can make arguments as to what with where does it come from you saw that in the in the electrical circuit case that's very

00:47:50

helpful in your debugging a complicated problem so let's see what that is all right well we can start again okay I'm now going to add two I'm going to make my values before I call contingent

00:48:01

values so this value which I'm putting in the building shadow is the contingent value which is based on a premise called shadow it could be several premises which why there's a list there and I'm

00:48:14

assuming it's a symbol right now but the premises are there have no no interesting structure other than it can tell whether or not two of them are the same so these are all shadow shadow premises and I get answers which are contingent

00:48:27

on and say on the shadow oops okay I can do the same thing with the fall time okay but this one I'm going to make dependent upon the upon a premise I call

00:48:40

time the building height now is continued on both the time and the shadow I the authority the super gives

00:48:53

me an authoritative result okay then I can look at the building height okay it's depend upon the city and he gave me that information but the barometer site is now dependent on three things the superintendent's say

00:49:06

estimate the time estimate and the shadow estimate but the building shadow has to been strolling on the shadow estimate and the full time however has been approved by the superintendent's estimate but not the other one because the other was very interesting it was

00:49:19

superseded the information is monotonically increasing in each of these in each of these cells and monotonic increase is crucial the one way to do that kind of thing of course

00:49:33

is with things like intervals but imagine you had unification algorithm for example that's a way of getting combining information and there are lots of things like that suppose I have a

00:49:44

free a transformative signal approximate and I have the time domain for an approximate version of the signal both with noise if combined let's get veterans to estimates and what that is

00:49:56

mmm so now here's a this gives us even more power because what I'm gonna do is assume that of course information is you to be inconsistent almost all the information in my head is inconsistent I

00:50:10

believe lots of things that are false right we all do but the important thing is that I can I don't fall apart every time I think is the famous robot and some in some science fiction story okay

00:50:22

why it's because if I went where these inconsistencies I giggle that's the answer it's a it's because I can keep apart locally consistent subsystems of my

00:50:34

thinking and each of those I can make deductions that I know where they come from I hope and they dependent only on the things that I thought I specified so they look at this for a second for

00:50:46

example again I'm now going to make these things called TM s's truth maintenance systems they were invented again originally partly by Stallman and me but they were given a name by John

00:50:59

Doyle who was one of my graduate students who wrote a master's thesis about them this PFDs about something else and David McAllister did some work on them and made another version of it and then there's all book by two of my former

00:51:11

students yonder clear and and yeah Ken forbus which is called building problem solvers which is pretty much about truth maintenance systems so as a whole the whole nice thing but it is this as a

00:51:23

collection of facts ok with appropriate dependencies such as possible to say what is the best estimate you can give me of the consequences of these facts the logical or anything else and so I'm

00:51:36

in every cell I'm going to put a truth aiming system instead which is a thing that's going to maintain the best estimates of what's going on with the ability to back out ok so here's what

00:51:47

you're seeing here is that sort of thing and indeed I can I can do all the standard things I saw here ok and I'm not going to go through this but if you look at what's inside the building

00:52:00

height at the end though this is the slow sequence there's a three possible contingencies depending upon different dependencies ok and this can grow quite large but doesn't grow very large

00:52:11

because anything which is subsumed by some other deduction goes away and eventually we can do things like now change our point of view I can I can

00:52:26

change my worldview I can bring in Atlas you go back here where's a kick I can decide that I could I want to kick out the time yes suppose I want to kick out the so that

00:52:38

the time measurement and when I decided that wasn't so good well I can get the value for the building height which depended only on the shadownet whoops shadow measurement I could bring back the time measurement and kick out the shot of measurement and I got the suddenly depends on the time

00:52:51

measurement okay if I if I if I add the superintendent stuff in I bring the shadow measurement I get something else I can decide I don't really like the time measurement and I get the dependent I get the result that depends only on

00:53:06

those two okay and I can walk this thing around a lot okay that's a lot of fun but better than that I can get contradictions maybe somebody was made a mistake maybe there's a lie

00:53:19

maybe the superintendent's a lawyer or maybe they may be the the the ruler was it was not held cracked right quite right when I measured the height of them whatever okay look in real scientific

00:53:31

stuff supposing I have by a corpus of all the medical material in the world all the journals you know dying well that most of the most of the articles are wrong and there's a good reason for that the reason is that that people do

00:53:45

for experiments with P less than 0.05 Johnny that's the they that's their measure of whether or not something is good it is a good experiment is whether or not the chance of being wrong is one

00:53:57

in 20 now since they do far more than 20 experiments and they only publish the ones that come out positive okay you can do to make your deduction from that there's great papers about the survey even a this is a great paper about about

00:54:12

why almost all medical research is wrong and that's but you know we still live better than we did 50 years ago so can't be completely true but in any case mmm yeah if the physicist walks in with P

00:54:24

less than 0.05 thrown out of the room it's gotta be P less than 10 to minus seventh or something for anybody interested okay anyway so here we go here we hear whatever they look at a contradiction the contradiction depended

00:54:37

upon the superintendent of the pressure measurement I was measuring this one off I used the barometer a second time a a vi2 barometers okay so I'm getting back here and I measured the measured that by a pressure measurement which I suppose I didn't put on the on the screen which of

00:54:50

course you know the difference in to pressure from the height as you move the belfry never up and down the building is going to be the weight of air between those two okay and so so I got a contradiction between those two and I can decide which of those I

00:55:03

like or don't like as you can see here okay or here and I get two different different values so I can maintain you can consistent worldview major incident worldview by having locally consistent

00:55:16

worldviews ah but better okay yes even better I can also have the machine automatically fine for me

00:55:30

the consistent sub sub sub worldviews that are consistent but worldview is consistent that's called dependency directed backtracking and unlike chronological backtracking which you would get say by

00:55:44

using the sequenced monad and a failure mode right for those of you who are Pascal lights and that would give you a chronological backtrack unlike that I could do much much more efficient backtracking because I know

00:55:57

the opponents are very fat so I could say if this has ever got a contradiction here okay and I've got another contradiction and they intersection is this guy it's how I probably probably he's the bad bad guy but even so if I

00:56:10

just find something to rule out okay I can rule out a whole subtree without having to undo anything and redo it redo computations when that were already correct because they but they were

00:56:23

chronologically later remember this because the parallel machine in principle is right now simulated and serially of course but a little time sharing system but parallel machine with with with where all of these things are

00:56:35

independently looking at information carrying them the whole place to the next so here's an example of the famous famous story there's a five-story building Baker Cooper Fletcher Miller

00:56:47

means to live on one over five floors I require that Baker Cooper Fletcher Miller in Swift live on distinct floors that Baker is not on the fifth floor Cooper's not on the first buttress not on the fifth Fletcher is on the

00:57:01

first notice on a higher floor than Cooper and the and the the Smith and Fletcher and Fletcher Cooper orange are not on adjacent floors okay and I want to know what the answer

00:57:13

that's pretty easy to get number of course it's very very efficient if I did this by by a chronological process this would be three thousand or something backtracks to investigate this space not

00:57:27

that search is a good thing search is always exponential and will always eat you if you let it so this is not to say that we should be doing search is just showing that I can automatically find consistent subsystems

00:57:39

of a complex system because this is this is working by backtracking on consistency inconsistencies okay now let me tell you go back to the original thing I was talking about at the beginning okay when I was an

00:57:54

undergraduate at MIT I used to walk by Jerry lectins lab everybody who Jerry laughing was who does everybody remember here the lepton Leary debates the debate he was a famous neurophysiologist

00:58:08

who was the first person to measure potentials from neurons and he didn't he made great amplifiers he was professor of engineering and biology and

00:58:18

humanities at MIT and I used to quite go by his lab which said laboratory of experimental of histology I mean I wasn't I was employed as Minsky what a

00:58:32

miss keys computer hackers okay so I would go over there and I pass by the left was slamming I dropped in and listening to talk about wonderful things about neural physiology and things like that and one thing that Levin he died by

00:58:43

the way last year one thing that he would explain where it split quite often is that almost all the cells in the central nervous system are very tiny much tinier than the ones that you hear that have long axons and things like

00:58:58

that and he can't it's really hard to see them even with a good microscope and it's pretty hard to it's pretty hard to measure potential struggling action potentials and things like that so he had this idea that they actually

00:59:12

are unreliable undercounted that they may be hypothesis many of these neurons do not explicitly are not explicitly directional that they are actually things that are like these constraints that feel a bunch

00:59:25

of other things and adjust the adjust the sensitivity of other neurons they touch okay so if they say gee I feel these three things be doing something then they say let me make this guy do something to do okay but if

00:59:38

there's only these two then I better make this guy do something to leave this one alone or inhibit this one okay that was the kind of idea he had which might be false I'm not trying to suggest as true or false I don't know I don't know

00:59:49

anything about neurophysiology okay but except if everyone I've learned by walking into London's lab but the oh yes he could talk about any soul and not going back for an hour - sure but

01:00:02

getting back to the Kinesis triangle illusion is it possible is it possible that almost all the computation that we see that's really fast is filling in

01:00:14

details what is going on in that electrical circuit example I've shown here even with a big circuit it's filling in details that the diagram is a memory and I'm writing in specific little node to find places the answers

01:00:28

that I could deduce from local information sometimes I have to invent a variable and propagated around given equation but I hope not going often okay but most of the time I'm doing by

01:00:41

picking my models correctly I'm choosing to make the subproblems simple enough so I can do this kind of propagation which fills in details what's happening here what's happening here is this guy is giving me evidence that there's

01:00:53

something that there's something including this black circle by golly there's good evidence here that this black triangle is being included by something - okay maybe there's good evidence that's a good enough evidence

01:01:06

to think that maybe there's a white thing that connects these and things like that and maybe that's done by some magic process that looks something like that for all we know I don't know this

01:01:18

is true I don't claim it's true okay it's different way to think maybe it's worth investigating it might be completely nonsense on the other hand we have to throw away our current ways of

01:01:32

thinking if we ever expect to solve these problems so in summary okay I'm planing the problem facing us as computer engineers is not correctness

01:01:47

it's evolvability a trivial example of making things more evolvable is the extensible generic operations may help right because they

01:01:58

allow us to extend functions without modification but they may proves very difficult a more radical proposal is maybe there are freedoms that we can unlock unlock by throwing away our idea of architecture how machines have to

01:02:11

work here's one particular example which is concurrent and peril in the essential way it's redundant and degenerate meaning it's possible to compute the same thing many times and put them into the same place that doesn't cause any

01:02:25

trouble except if one of the things dies you get the same the correct answer another way it could be degenerate means I can commute a different way the same way I can compute in my physics class by

01:02:36

Newton's Newtons vector vector methods I contribute the motion of something like the planets okay I can compute it by the Lagrangian method I can compute it by the Hamiltonian method they get the same

01:02:50

answer but they get they reveal different phenomena I'm not going to find out about the how the symmetries and sips the system corresponds to the correspond to the conserved quantities if I'm looking at Newton's method

01:03:03

branches if I want to do a canonical transformation it better be Hamiltonian ok so there are all these that's where the generous D means many ways of doing something if I can't eat sugar because

01:03:15

I'm screwed up in some way ok then I can get my calories out of out of proteins every biological system has many ways of doing things that's called degeneracy ok I can

01:03:27

maintain provenance as I can know where a data comes from and follow it around and combine the data combines I could combine the problem this is appropriately sometimes you have to subtract as in a conditional proof if you say assume a get conclude B

01:03:40

therefore you can conclude a implies B without the assumption of de so there are certain kinds of rules that have to delete the provenance when they do it but you back to know that so their means of combining problem

01:03:52

they have to be taken care of okay and I can tell you about dependency directed back tracking which is a useful way of controlling controlling non-deterministic search sometimes that's an example of a of things that

01:04:05

I'm just pushing forward as being a possible way to break out but I want to hear every way they can and that's all I have to say today