i didn't see the notification yet there it goes hi folks welcome to the april northwest c++ users group tonight we've got walter bright talking about
adding modules to c in 10 lines of code um one quick announcement before we start the august 17th meeting will be lightning rounds
so you don't need to sign up or anything um quick quick show of hands how many people are thinking about presenting a lightning round this year we've done it last year okay at least one i think chris you were talking about at one
point but i don't see you at the moment there uh so we'll have a couple and hopefully by then we'll have some more but for those of you who haven't done this yet the lightning rounds are you know five to ten minute presentation
little mini topic on whatever you would like so that will be the august 17th meeting i am still looking for a speaker for the july meeting i might have
one um and if i don't get a speaker we'll have a lively debate or a topic on something okay with that i'm going to turn it over to walter um i will start sharing my screen
and take it away okay all right well let's see now can i get this thing to go away here i guess not
oh there it goes all right so this is my presentation they're coming to take me away because nobody can add modules to c in 10 lines of code well i'm going to prove it can be done
and i'm going to give this presentation and at the end you can decide if i'm crazy or not and if you like what you see you can follow me on twitter or you can come to dconf that's the link at the
bottom which will be live and in person in london next slide all right um how many of you are here to watch me
crash and burn who doesn't believe me you can type it in the chat window next slide but before you know
i provide entertainment to the crowd i'm gonna make you suffer with some uh necessary uh background material to explain uh how all this came about
next okay what happened here that's not my next slide oh yeah that's what i've got is the next slide let's see oops let me double check
that's where i'm at i've been here okay go to the next slide after that that one okay yeah okay i seem to have held up my presentation
when i made a last-minute revision to it okay uh oh this is even weirder can you go back one slide there we go d is designed to be easy to
interface to c which means interfacing at zero cost compatible types familiar syntax compatible semantics and the big benefit is to leverage existing c
code next slide but you need to make the c code accessible from d and the idea is to simply translate the c .h file to d
and it's easy it doesn't take long shouldn't be a problem and you know famous last words so it turns out to be a very large
problem in practice so what would be the simplest most obvious most perfect way for d code to simply get all the declarations from say stdio.h
what would you like to see if you were a d programmer and you wanted to access your c code next please you would just want to type in import stdio right i mean what could be easier i have stdio.h i want to get
its declarations i just want to import the stdio file next slide and behold importc was conceived and i did this a few months ago
um in order for that to work the d compiler has to incorporate a c compiler yeah absolutely a complete c compiler
and isn't that kind of a big project building a c compiler not really because uh d and c are similar enough that we can share most of the compiler it's
not like going around the horn to get to a c compiler next slide so here's sort of a view of what the internals of the d
compiler look like it has a lexer and then a parser and a semantic processor and then the backend code generation all we need to do to compile c is to have a little
diversion and go through a c parser instead of the d parser and then go back to the semantic and the back end and it turns out that we can share most of the lexer
you know probably about eighty percent of the lexer is shared can you go back please and probably about 98 percent of the semantic
processing is also shared and the back end of course is 100 percent shared so you see i didn't really have a whole lot of work to do to make that happen
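The "little diversion" through a different parser can be pictured as a small dispatch step in front of the shared semantic and back-end stages. The sketch below is only an illustration: the `front_end` enum and `choose_parser` are hypothetical names, and picking the parser from the file extension is an assumption, not something the talk spells out.

```c
#include <string.h>

/* Hypothetical names for the two parsers that feed the shared
 * semantic and back-end stages. */
enum front_end { FRONT_END_D, FRONT_END_C, FRONT_END_UNKNOWN };

/* Assumption: the parser is chosen from the file extension. */
enum front_end choose_parser(const char *file_name)
{
    const char *dot = strrchr(file_name, '.');
    if (dot == NULL)
        return FRONT_END_UNKNOWN;
    if (strcmp(dot, ".c") == 0)
        return FRONT_END_C;      /* take the C parser diversion */
    if (strcmp(dot, ".d") == 0)
        return FRONT_END_D;      /* the usual D parser */
    return FRONT_END_UNKNOWN;
}
```

Everything after the parser would then be common to both paths, which is the sharing the speaker describes.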
so next slide and so as a d programmer you could mix and match d and c files directly no need for translation you could import a d file or import a c file
and you could continue on and if you imported a c file the d code could then access all of the declarations
in your c file but that led to next slide please on to walter's folly so presumably i have all this working which i did so i'm
going on and we come to next slide please we come to the foundational idea which is what is the root data
structure of a compiled piece of c code anybody have any ideas you want to type in the chat window translation unit well that's not really a
root data structure symbol table that is actually what it is next slide please it's just a symbol table
and which consists of at your top level your functions variable genome structs unions and type def which can be represented as a simple array so if you want to
import c code all you really need to import is an array so let's let's say we have a file uh hello.c which is a c program that includes studio.h
and includes std.stdlib and you know it says hello world and then exits you know you're playing simple program what does that look like from the symbol table point of view
exciting uh it looks like it's got three symbols the symbol for printf symbol for exit and the symbol for main and next slide
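As a rough picture of "the symbol table is just an array", here is a minimal sketch in C. The `symbol` struct, the `sym_kind` enum and `find_symbol` are hypothetical illustrations, not the compiler's real data structures, and the table follows the talk's simplified view that hello.c has exactly three top-level symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol kinds; a real compiler records far more detail. */
enum sym_kind { SYM_FUNCTION, SYM_VARIABLE, SYM_TYPEDEF };

struct symbol {
    const char *name;
    enum sym_kind kind;
};

/* The talk's simplified view of hello.c: printf, exit and main. */
const struct symbol hello_symbols[] = {
    { "printf", SYM_FUNCTION },
    { "exit",   SYM_FUNCTION },
    { "main",   SYM_FUNCTION },
};

const size_t hello_symbol_count =
    sizeof hello_symbols / sizeof hello_symbols[0];

/* Linear search is enough to show the idea; returns NULL if absent. */
const struct symbol *find_symbol(const struct symbol *table, size_t count,
                                 const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}
```

With this picture, importing a file amounts to getting hold of another such array.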
now if we want to look at what it would look like with imports we see you have hello.c and it wants to import the printf and it wants to import the exit
and it wants to compile the main so it has essentially three symbol tables to deal with when it's doing a symbol lookup whereas in the normal c program you only have one symbol table you're
looking at for the global symbols here with imports you'd have three symbol tables so the trick is to make all this work next slide
please so in order to make this work we have three problems we've got to solve we've got to instruct the c compiler to compile another c file
independently by running a separate instance of the c compiler on that imported c file to generate the symbol table array for us and then we have to
adjust our symbol lookup mechanism so if it's not found in the global symbol table it goes and looks in the imported c symbol tables
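The adjusted lookup described above can be sketched as "search the file's own table first, then each imported table". All names here (`symbol_table`, `search_local`, `lookup`) are hypothetical, and searching only the direct imports is an assumption made to keep the sketch small.

```c
#include <stddef.h>
#include <string.h>

struct symbol { const char *name; };

struct symbol_table {
    const struct symbol *symbols;
    size_t count;
    const struct symbol_table *const *imports;  /* tables of imported files */
    size_t import_count;
};

/* Search a single table only. */
const struct symbol *search_local(const struct symbol_table *t,
                                  const char *name)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->symbols[i].name, name) == 0)
            return &t->symbols[i];
    }
    return NULL;
}

/* The file's own globals first, then its direct imports (assumption). */
const struct symbol *lookup(const struct symbol_table *t, const char *name)
{
    const struct symbol *s = search_local(t, name);
    if (s != NULL)
        return s;
    for (size_t i = 0; i < t->import_count; i++) {
        s = search_local(t->imports[i], name);
        if (s != NULL)
            return s;
    }
    return NULL;
}

/* Sample data mirroring the hello.c picture: three separate tables. */
const struct symbol stdio_symbols[]  = { { "printf" } };
const struct symbol stdlib_symbols[] = { { "exit" } };
const struct symbol hello_symbols[]  = { { "main" } };

const struct symbol_table stdio_table  = { stdio_symbols, 1, NULL, 0 };
const struct symbol_table stdlib_table = { stdlib_symbols, 1, NULL, 0 };
const struct symbol_table *const hello_imports[] = {
    &stdio_table, &stdlib_table
};
const struct symbol_table hello_table = {
    hello_symbols, 1, hello_imports, 2
};
```

Looking up `printf` through `hello_table` misses in the local table and is then found in the imported stdio table.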
i want you to recall again how the compiler is organized here and you'll see that really all we need to do is first off instruct the compiler oh
next slide please we need to add some syntax so remember i said earlier we wanted to go import stdio semicolon
well if we had a name like import we're likely to collide with somebody's variable name which inevitably happens so we use a reserved name by putting two underscores in front of it and call it __import
and it turns out that since the d parser is still sitting in there we can hijack the d parser for
the d import statement and you'll see a link here the https thing which points to the specification for the import declaration parser
next slide please okay once the import declaration is added to the c symbol table when we're running the semantic pass it sees the import declaration
and it fires up another instance of the c compiler which runs the parse on it and the semantic on it separately and then returns a global
symbol table and then that imported symbol table is inserted as a reference in the
importer's global symbol table if i'm not being too confusing saying that so maybe to put it simpler it just runs a separate instance of the c compiler which produces a symbol table
and we squirrel it away so we can look at it later next slide please since the d and c semantics are shared code it turns out that we can just use the d
semantics that already look up unresolved names in imported symbols so next slide please and here we have the magic 10 lines of code it's actually slightly less but it
sounded better if i said 10 lines of code and all you have to do all i had to do was type these lines of code in
compile it and it worked and there's a link to the pull request so you can see that that's actually what it is i wasn't counting the comments and the documentation just lines of code
and all it's doing is looking at this token if it's an import token i call the d parse import which returns an array
and if the array is of non-zero length i append it to the existing list of symbol tables and that's it and the interesting thing was i tried this
and i literally couldn't believe it worked it was so simple i just typed this in it worked suddenly i could import files from c so what about the good stuff
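The shape of the change described above (check for the import token, call the import parser, append a non-empty result) can be sketched as follows. This is not the code from the pull request, which is written in D inside the D compiler; every name here is hypothetical and `parse_import` is a stub that always yields one table.

```c
#include <stddef.h>

enum token { TOK_IMPORT, TOK_OTHER };

struct symbol_table { const char *module_name; };

struct import_list {
    const struct symbol_table *const *items;
    size_t length;
};

#define MAX_IMPORTS 16

struct parser {
    enum token current;
    const struct symbol_table *imports[MAX_IMPORTS];
    size_t import_count;
};

/* Stub standing in for the existing import parser; it pretends the
 * declaration named a single module called "stdio". */
struct import_list parse_import(struct parser *p)
{
    static const struct symbol_table stdio_table = { "stdio" };
    static const struct symbol_table *const parsed[] = { &stdio_table };
    struct import_list result = { parsed, 1 };
    p->current = TOK_OTHER;           /* the declaration is consumed */
    return result;
}

/* The step the talk describes: returns 1 if an import was handled. */
int parse_declaration(struct parser *p)
{
    if (p->current != TOK_IMPORT)
        return 0;
    struct import_list found = parse_import(p);
    if (found.length > 0) {
        for (size_t i = 0;
             i < found.length && p->import_count < MAX_IMPORTS; i++)
            p->imports[p->import_count++] = found.items[i];
    }
    return 1;
}

/* Small driver: how many tables are attached after one declaration. */
size_t imports_after(enum token first)
{
    struct parser p = { first, { NULL }, 0 };
    parse_declaration(&p);
    return p.import_count;
}
```

The interesting part is how little the importing parser has to do: the heavy lifting lives in the import parser and the semantic pass that already existed.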
i presume you're all c programmers and know how that works and when you're #including header files you do things like use include guards or
#pragma once to not have to repeatedly parse the same header file over and over again
with this scheme with imports you can even have circular imports and it's not going to go into an infinite loop it's just going to go oh i've already imported that i don't need to import it again i just need to share the symbol table
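One common way to get the "i've already imported that" behavior is to record a module as known before processing its own imports, so a cycle finds the existing entry instead of recursing forever. The sketch below assumes that approach; the `module` struct, the two-module registry and `import_module` are hypothetical and do not reflect how the D compiler tracks modules internally.

```c
#include <stddef.h>
#include <string.h>

struct module {
    const char *name;
    const char *imports[3];   /* NULL-terminated list of imported names */
    int registered;           /* set before the imports are processed */
    int compile_count;        /* how many times it was compiled */
};

/* Two modules that import each other: a imports b, b imports a. */
struct module modules[] = {
    { "a", { "b", NULL }, 0, 0 },
    { "b", { "a", NULL }, 0, 0 },
};

struct module *find_module(const char *name)
{
    for (size_t i = 0; i < sizeof modules / sizeof modules[0]; i++) {
        if (strcmp(modules[i].name, name) == 0)
            return &modules[i];
    }
    return NULL;
}

struct module *import_module(const char *name)
{
    struct module *m = find_module(name);
    if (m == NULL)
        return NULL;
    if (m->registered)
        return m;             /* already imported: share the same entry */
    m->registered = 1;        /* mark first so a cycle stops here */
    m->compile_count++;
    for (size_t i = 0; m->imports[i] != NULL; i++)
        import_module(m->imports[i]);
    return m;
}
```

Importing `a` compiles `a` and `b` once each, and the import of `a` from inside `b` simply returns the entry that is already registered.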
one neat thing about it is the semantics of a module become independent of the module it imports you know it's hygienic the macros in
the imported file are not imported into the importer so you can have whatever macros you want in the imported file and it won't affect anything in the importer
static symbols in the imported modules are also not visible to the importer as you would like this is not true for #include
this is not done with a separate file holding a module symbol table so there's no need for disk files to store the symbol tables keep track of the symbol tables keep track of their dependencies
and the nice thing is you don't have to import .h you don't have to write the .h files anymore you can just import the c source file and it will pull out your global declarations in
that and make them available to you the whole mess with dot h files you know you can just toss aside next slide please of course not everything is perfect because you
know it's c and there's nothing about c that's perfect some things are not so good if you're relying on preprocessor macros to do magical things for you and meta programming this isn't going to work
because the preprocessor macros are not imported which means that if you're doing macro meta programming in c using the preprocessor
well you're out of luck and too bad so sad you shouldn't be doing meta programming with preprocessor macros anyway in my not so humble opinion
yeah you know my opinion is that if you're doing meta programming with the c preprocessor you're ready to move to a more powerful language but in fact most c header files are
declaring macros only for use by the well it turns out people do use macros for what are called manifest constants you
know things like string literals and stuff like that and those won't work with this for string macros you can
use things like const declarations to substitute for those instead also if you start using imports a lot in c you'll wind up pressuring the c
language community to properly provide substitutes for some of the preprocessor features next slide okay what if you've got name collisions
like in a.h you declared a variable named f and then in b.h you declared a function named f
and you imported both and now what happens in defense i would say well that wouldn't work if you #included them either so you know we could be better
but we're not going to be worse at least than #include but with the d import syntax if we add a colon
and then a symbol name after the import it will import only that symbol and it will override the symbol in any other import so
we can resolve the collision using this syntax and here b.f overrides the other f
what's this okay so next slide please well that's well and good for importc but how about implementing it or retrofitting it into
another c compiler will it be as easy well it won't quite be as easy but here's how you can make it work as long as the compiler doesn't use a bunch of global variables to store the compiler
state you can use it to run another instance of itself my c compiler that i wrote in the
1980s the digital mars compiler is all global variables it's still doable but the first thing that would have to be done with it is it would have to be changed to eliminate all the global
state in it which would be a fair chunk of work so if your compiler is old-fashioned code like mine is it's not going to work very easily if it's new style code where
the parser is just an instance of a class it's very easy to make it work as you saw with the few lines of code i wrote so next slide please
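The point about global state can be illustrated with a tiny sketch: when everything a compile needs lives in a struct, starting a second compile in the middle of the first is just a second struct. The `compiler` struct and its functions are hypothetical stand-ins, and "compiling" here only bumps a counter.

```c
#include <stddef.h>
#include <string.h>

/* All per-compile state lives in the struct, not in globals. */
struct compiler {
    const char *file_name;     /* file this instance is working on */
    int declarations_seen;     /* stand-in for real per-instance state */
};

void compiler_init(struct compiler *c, const char *file_name)
{
    c->file_name = file_name;
    c->declarations_seen = 0;
}

/* Pretend to compile. If imported_file is non-NULL, a second instance
 * is run for it part way through; returns what that instance saw. */
int compiler_run(struct compiler *c, const char *imported_file)
{
    int imported = 0;
    c->declarations_seen++;              /* one declaration before the import */
    if (imported_file != NULL) {
        struct compiler nested;          /* separate state, nothing shared */
        compiler_init(&nested, imported_file);
        compiler_run(&nested, NULL);
        imported = nested.declarations_seen;
    }
    c->declarations_seen++;              /* carry on with our own state */
    return imported;
}

/* Returns 1 if the outer instance is unaffected by the nested one. */
int outer_state_survives(void)
{
    struct compiler outer;
    compiler_init(&outer, "hello.c");
    int imported = compiler_run(&outer, "stdio.c");
    return imported == 2
        && outer.declarations_seen == 2
        && strcmp(outer.file_name, "hello.c") == 0;
}
```

If `file_name` and `declarations_seen` were globals instead, the nested run would overwrite them, which is the retrofit problem the speaker describes for older compilers.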
now this is kind of the big question which sort of leads you to believe that of course
this can't possibly work because if it worked somebody in the last 40 years would have done it doesn't that seem like a reasonable question including me
why didn't i do it for the last 40 years this thing was staring me in the face and it simply never occurred to me i just can't explain it
c++ has been trying to invent modules for 20 years they finally succeeded i don't know why it took them 20 years
i don't know what was going on with that i don't understand it but that is i think a very good question to ask of the c++ folks that if i can do it in 10 lines of code
why does it take them 20 years to do it so next slide please so we can compare it with c++ modules and
i take this a bit with a grain of salt because i'm not really an expert on c++ modules i know very little about it but it does look pretty similar from what i've read about it
one significant difference is in my implementation the file name is derived from the import name which makes lookup straightforward
and i believe this is not the case with c++ modules i don't understand why not it makes it significantly more complicated to have them differ
next slide please but i think the idea can work although i'd avoid having things like c++'s bmi which is their binary module interface what their design does is compile the
c++ file into a symbol table and then write the symbol table to a bmi file so there's all these complications around maintaining you know defining a
file format and maintaining the file and stuff like that and i mean i'm not sure what benefit c++ gets from that
clearly d has modules and d doesn't do anything with writing symbol tables out to disk and it's very fast and effective so i don't understand why they would need that
next slide please so let's compare with well d and c aren't the only two languages
that form a hybrid famously c++ and c also form a two language hybrid c can #include c files c++ can #include c files and c++
can #include c++ files and well something seems to be missing of course it's pretty obvious what it is if anybody doesn't know what that is just type it into the chat and we'll see
everybody does that's good okay so next slide please so let's look at the equivalent thing with imports d can import d files d can import c files c can import c
files and again we have that obvious hole again what are we going to do about that next slide please and it turns out
and this is one of those things that i kind of thought this can't work and i tried it and it worked you can take an importc file and import a d
file and it actually works it compiles the code as d source code and presents a symbol table
to the c file and what can you do with that next slide please well you can add function overloading to c
so in your math.d file you can have two different versions of the square function one for ints and one for doubles and in your c program you import math
and you can call each of the overloads depending on your argument types no extra coding required
no need for c11's famous _Generic which will do a primitive form of
function overloading so you don't need that if you just import it from d so next slide please well it turns out you can even do templates
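For comparison, this is roughly what the C11 `_Generic` route mentioned above looks like in plain C, using the talk's square example. The helper names `square_int` and `square_double` are made up for this sketch; the point is that each overload needs its own name plus a dispatching macro.

```c
/* One separately named function per argument type. */
int square_int(int x) { return x * x; }
double square_double(double x) { return x * x; }

/* C11 _Generic picks the function from the static type of x. */
#define square(x) _Generic((x), \
        int: square_int,        \
        double: square_double)(x)
```

Calling `square(3)` goes to `square_int` and `square(1.5)` goes to `square_double`. Any type not listed in the macro is a compile error, which is part of why the speaker calls it a primitive form of overloading.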
isn't that amazing imagine that templates for c so you really don't need the c preprocessor for meta programming anymore you can write a little d file with your meta
programming stuff and then just import it so although we can't specify template arguments
because the c language doesn't have the syntax for it what we can do is take advantage of d's ability to infer the
template parameters and we do that here just by calling square with an int type or square with a double type and the template code will automatically figure it out and produce two instantiations and you
can call them and it works isn't that amazing apparently people in the news group have also figured out how to
you know bring in d member functions and call them from c and even constructors and things like that so since it shares the semantic routines i
didn't go through and break all of this stuff it's all sitting there if you can figure out how to use it it'll work as long as it's still c syntax so
next slide please with d and c of course there's some rough edges there's no scheme for public private members for example you can't specify a
typedef in a c file as being private or static it's just going to be there so all your typedefs are always going to be visible one of the most significant problems
i've had is d is designed to be parsed first as a separate pass from the symbol table or the semantic processing
and so there's an order dependency in there particularly with typedefs and parsing
typedefs do cause parsing difficulties one of the difficulties is the difference between a cast expression and a unary expression cannot be
determined just by doing a parse you need to know if a symbol is a typedef or not a typedef so you end up with some parsing difficulties if you're
relying on typedefs being imported from a c file i have some ideas for extensions to c to make that work but it's going to wind up being extensions
to c so it's not perfect another thing we've been thinking of doing is taking a look at the c preprocessor and detecting
the macros that are there for manifest constants and making those manifest constants available to the importer a lot of these problems stem from c being a
very old language and it's sort of like you know putting a v8 in a volkswagen bug it'll work people have done it but if you step on the gas too much
you'll tear the vw bug apart so it's best if you try to re-engineer the vw to be able to handle it so next slide please so you get to
decide did i deliver the goods am i lying am i exaggerating is it not true that you can have imports in c in 10 lines of code you get to decide are you going to pressure your
favorite c compiler vendor you know if walter can do it in 10 lines why can't we have imports in c and of course is #include really
most sincerely dead in my opinion #include belongs on the ash heap of history even for old-fashioned c and even if you
stick with c #include is really a tired old game and it really shows its age and if we can have imports in c with a few simple changes to the
compiler why not move into the future it doesn't take away from what c is or how great c is next slide please
references you know the two airplanes they're d and c flying together and how fun that is you have your buddies mount fuji in the background by the way
and we have a reference here to the spec on importc and a reference to dconf coming up in july
and i'd love to see you all there at dconf so questions so folks go ahead and you know open your mic and speak up for questions since you don't process the compiled modules to
a bmi or ifc does that mean every time you rebuild the project you have to recompile the code that's right however there i can type this in the chat i guess is
the chat being recorded too um not directly okay well i'll type in the chat and then read it the chat is not being recorded
okay here's what you can do with the the compiler you can list all the c files at once on the command line for it
and they will compile all at once whereas your normal c compiler compiles one file writes one object file then it compiles
the next file writes one object file the compiler i wrote if you have a c project that consists of multiple files you can just type all the files on the command line even if it's 100 files you
can use a response file or a script to do that and it will compile them all at once which means that each file that is imported is only compiled once no matter how many times it's imported by how many
files so this has turned out to be very effective in d and i expect it will be equally effective with c if the compiler is capable of running
multiple instances of itself it's capable of also doing this where you list all the files at once on the command line and just compile them all together at once there's no reason
to uh you can still do separate compilation if you want but you know separate compilation is an old-fashioned way of doing it and it's not really necessary anymore so yes that is a that was a great
question so is it smart enough not to recompile based on the dates or something like if i like if i had 100 files on my dmd command line and then i changed one of
them and then i wanted to recompile and it was sort of the root node is it still going to recompile 100 files if it's all in one command line yes it does not
look at the dates that's the trade-off but it turns out it's actually so fast that compiling it it doesn't really matter yeah yeah but yeah you can still do separate compilation or
you can even break it up in the batches like 10 10 files for this compilation and have it and it will generate a merged object file for all 10 of those files
so you can still you can do a hybrid approach where if you've got 100 files in your program you can compile them in batches of 10 and wind up with 10 object files and then link those 10 object files together
so if your uh project breaks up nicely into separate sub projects you can do each sub project as a separate uh compilation step so i actually kind of use the same
technique in c++ by just including all of my code into one file that i can yeah that would have the same effect you're right
unity build i missed that last question oh i was just responding to mike yeah it's called a unity build okay unity build okay that's a good word for it i never heard of that but it's
something i've been doing for the last 20 years or something so oh why didn't you think of modules for c [Laughter] now the whole thing is funny about this
how you get so used to the way things are you don't think of the obvious next step and you know that can be so frustrating
go ahead so it looks like now you have one compiler that can compile both c files and d files in any sort of mix could you write a whole program of c files with the dmd compiler
yep you can build with the dmd compiler you can just type in dmd hello.c and it will compile it and produce an executable that'll run
once you're compiling the .h files you might as well finish the job which is very little extra work and you have a c compiler that works which is
you know not something i ever really intended to do but i wound up with it and how come nobody's asked me why i don't do that for c++ i was going to ask that actually
but it seems a lot harder i guess well that's because you know writing a c parser takes a couple of weeks but writing a c++ compiler is a years long project
there's no fundamental reason why it won't work and it would be cool to have that work too and several times i keep toying with the idea of adding a c with classes
feature which is you know the early subset of c++ basically you know with classes and member functions but what kind of stops me from doing
that is that's not modern c++ and people don't write c++ code that way anymore so it seems kind of pointless to do that even the simplest c++ code makes
heavy use of templates from what i've seen so i see a hand up from peter i already asked my question so i'll put my hand back down okay hey do we have anybody else
any other questions come on you guys rip me to shreds i have a question for lloyd can we get back to having these things in person again
at some point i actually robin do you know what microsoft uh what their reactor rooms are doing these days are they coming back or are they staying closed or i haven't heard anything official from that yet
i didn't hear or do you hear me yes yes i didn't hear anything official about the reactor but microsoft starts gradually going out
to work in person so i personally go out and work two days a week some people might want to work more than half of the days a week so
soon it is expected but so far i didn't hear anything additional yeah so we'll keep tracking it and kind of see what happens i think it's going to be kind of one of these on again off again things for a little bit
hmm one other question yeah what's a reasonable substitute for the assert macro because it's like one of the last lingering things that you really still need includes for
um that's a great question well d has an assert function and i don't have a good answer for that other than i know it's doable i'll just
have to think about it well it's great to have such a proof of concept maybe you should prompt the c community to adopt modules too yeah it's too bad that #include
is used and that assert is a macro in it i have written my own assert function and i use that in my own code but
the assert function doesn't automatically insert the file and line in it so it's kind of inadequate but yeah on x86 you could make the
assert function just be an asm int 3 which throws a processor exception which will then throw you into a debugger yeah but people like the message with you know the line and the file on it wait i
do but yeah it's like my solution to the cast problem with typedefs i alluded to earlier it's possible i may just simply
create an underscore underscore assert which then will completely bypass the assert mechanism with its own assert mechanism and
then that'll work i don't feel too bad about adding a couple of extensions to make imports more usable with importc because if you're already using
underscores for import why not use a couple of extensions to make it work easier like there might be an underscore underscore assert added and you just use that instead and it'll work
so yeah i think the problem is solvable but it's a great question and you're right it does need to be addressed
in order to make this more usable so good idea ali i see your hand up as well go for it
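To make the assert limitation discussed above concrete, here is a minimal sketch of an assert written as an ordinary function. `my_assert` and `last_failure` are hypothetical names, not anything from importc or the speaker's own code. Because a function cannot see where it was called from, the caller has to pass the location by hand, which is the job the standard assert macro normally does automatically.

```c
#include <stdio.h>

/* Holds the most recent failure message; empty until a check fails. */
char last_failure[256];

/* Returns 1 if the condition holds, otherwise records a message and
 * returns 0. The caller has to supply file and line itself. */
int my_assert(int condition, const char *file, int line)
{
    if (condition)
        return 1;
    snprintf(last_failure, sizeof last_failure,
             "%s:%d: assertion failed", file, line);
    return 0;
}
```

A call site would look like `my_assert(n > 0, __FILE__, __LINE__)`. `__FILE__` and `__LINE__` are predefined by the preprocessor and need no header, but writing them at every call is the inconvenience the assert macro exists to hide.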
um so i used this importc topic and all this adding modules to c in my local bay area d group for that i prepared some code if anybody
is interested i may be one of the lightning speakers for the next session to show hands-on examples i can do that please let me know but a question to walter i had to
struggle with the restrict type qualifier and also i had to undef the __GNUC__ macro
before including stdio.h on my system do you know why i had to do that and also did you touch upon the preprocessor here
that it needs to be handled externally or not okay um okay i tried the you know hello.c with #include stdio.h and it works on my linux system and i
don't need to undefine anything so i'm not sure how that differs from your system the next thing is yeah i did touch on the preprocessor stuff
i'm in the process of adding preprocessor support into the importc compiler because currently you have to do that as a separate step and
i know that's a pain so i'm currently working i've got a couple of prs out pull requests outstanding that bring that into the d compiler and i forgot what your other
question was the restrict type um i wasn't going to support the restrict thing but after a while i just simply
put it in it's implemented it just ignores it you wanted it to do something other than ignore it oh i don't know i actually don't know really what restrict is i didn't know before importc and i know very little
about it now all right so i would suggest that you just ignore it which is what the latest compiler does now it just ignores restrict
you know the d compiler or importc also ignores volatile because there's no semantic routines in the compiler to support volatile so
it just ignores it volatile kind of is a not terribly useful thing anyway so importc isn't perfect okay anybody else
and by the way ali it'd be great if you show up for the lightning talks yeah that'd be fun if anybody else knows of a speaker i'm always looking for speakers too so
how's that for a shameless plug it's pretty shameless all right i don't want to be the only one asking questions but i was wondering
if you can access objects and classes from c that are defined in d can you actually allocate a d object on the stack and then what happens when you exit that
function does it just well wait this is complicated because d uses garbage collection so like are you asking if raii works in imported c code
basically yeah um yep cool thanks i haven't actually tested it i'm pretty sure it works i should test that that should be part
of my next presentation let me know when you'd like to do that yeah well it's kind of a process of discovery there's all this stuff in there and
as long as it conforms to c syntax it should work yeah that was actually a really interesting approach of just invoking the c compiler inside the d compiler and
all this magic just comes right up together and works yeah it was really fun to just try that out and it worked and it was like i did nothing and this worked
this is so cool when that happens most of the bug reports i get on importc are that a piece of code will be following d semantics instead of c semantics so i
have to put a little logic fork in the semantics and well if it's compiling c you do it this way instead of that way yeah that's about 95 percent of the bugs
i'm having in it so i'm sort of relearning all the looseness and dark corners of c by doing this for example in c
you can do things like this let me type it in here int *p = 3 and that's valid c code so i have to
you know stop the error message from happening if you're compiling with importc you're welcome anton ah well okay
thanks everyone for uh showing up and uh thanks not today and you didn't throw any virtual rotten tomatoes at me well lloyd that's a piece that's a feature the team should have
yeah hey robin if you're still on go ahead and stop the recording at this point i think we're in good shape thanks yeah thanks for presenting walter this was excellent
actually good food for thought even i'll send you i'll send you updated slides with that uh error in it okay