hi there everybody it's skeptic mystic from the obsidian forum and from the discord in this video i'm going to be going over mine and emile's plugin called graph analysis
so first thing we'll need to do is just install the plugin you can do that from the community plugin page and you'll see that i've already got it installed but using the default settings
and you can just enable it from there so before we even look at what graph analysis can do let's just consider what it's meant to do so it's a plugin that allows you to see the results of various different graph
analysis measures being applied to your obsidian vault and that can sound like a lot basically we're using some pretty fancy math in some cases to
analyze like which notes are most similar to one another or try and find groups of notes in some cases but we'll see how that works and the two main types of algorithms
that we use are structural these algorithms only consider the links in your obsidian graph and they don't consider the contents of the notes and the other kind
look at what's inside the note they compare the words inside the note to the contents of other notes just before i go any further i just want to note that i'm using a publicly
available vault which you can find over here at this github repository i'll post a link to this in the description yeah this vault has quite a few files in
it and lots of useful links between them about a lot of different subjects so you can use this to test your plugins for example or graph analysis
so let's take a look at the results of graph analysis so the main thing we're going to be using from
the plugin is the graph analysis view and to open that we need to run this command open graph analysis view and you'll see on the right side a new leaf appears
and it should look like this depending on what page you're on in your vault so let's just pick a random note and we'll pick this algorithm to run i'll explain these
in a moment [Music] not all notes are going to have results but the language note is a good one to use so the note about language
gives us all of these results to see and this is a lot of information to take in and so as i've noted here in a lot of cases there are going to be a lot of numbers sometimes that's not the result
of the algorithm but what i want to stress is that the exact values aren't really what you should be considering most of the time the relation between
those values and the values for other notes in your vault is what gives you the most information and we'll see how that plays out using [Music] some examples good so
i'm going to go over each of the algorithms available in graph analysis except for co-citations co-citations is quite a big topic on its own and so emile
the resident expert on co-citations is going to make a separate video just for that so with co-citations aside there are a few different types of algorithms that we've grouped into these categories
the first kind i'll go over are link prediction algorithms and a link prediction algorithm is taking the structure of your graph and saying look at the current node and for
every other node give me the probability or predict how likely it is that these nodes should be connected to each other whether or not they actually are okay
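A minimal sketch of what a structural link prediction score can look like. The transcript does not name the measure being run here, so this uses the Adamic-Adar index as an assumption, computed over an undirected neighbour map. The note names, the edge list, and the helper names (`build_neighbours`, `rank_predictions`) are all hypothetical.

```python
import math


def build_neighbours(edges):
    """Build undirected neighbour sets: a link in either direction counts."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return neighbours


def adamic_adar(neighbours, a, b):
    """Sum 1 / log(degree) over the neighbours that a and b share."""
    common = neighbours[a] & neighbours[b]
    return sum(
        1.0 / math.log(len(neighbours[n]))
        for n in common
        if len(neighbours[n]) > 1  # guard against log(1) == 0
    )


def rank_predictions(neighbours, current):
    """Score every other note against `current`, drop zeros, sort descending."""
    scores = [
        (other, adamic_adar(neighbours, current, other))
        for other in neighbours
        if other != current
    ]
    nonzero = [pair for pair in scores if pair[1] > 0]
    return sorted(nonzero, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vault, not the vault used in the video.
edges = [
    ("human", "developmental psychology"),
    ("brain", "developmental psychology"),
    ("human", "feeling"),
    ("brain", "feeling"),
    ("brain", "language"),
    ("language", "psychology"),
]
neighbours = build_neighbours(edges)
ranking = rank_predictions(neighbours, "human")
print(ranking)
```

In this toy graph only "brain" shares neighbours with "human", so it is the only non-zero prediction, whether or not the two notes are actually linked.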
so this is the result that i showed on the language note so let's go back to the language note [Music] and we can see all of these results here and what this is saying is that
based on the structure of your graph this is showing which notes should be linked to the current note so it's predicting quite highly that the note about humans
should be connected to the note about language and that sort of makes sense intuitively and a helpful way to kind of gauge the result is to see if those notes are actually linked so if we see this little
icon next to a result that means that the note on language is actually linked to the note on brain and we can see that over here for example this icon will appear
bi-directionally so if the node brain links to the node language it will also appear next to the result there yeah and we can
gauge these results language and psychology are quite linked for example you can scroll through them you can also sort the results in ascending or descending order
so of the results that weren't zero logical connective is the least likely to be linked to language we can also well yeah we actually filter out zeros
so this toggle isn't actually doing anything in this case but yeah it's nice to be able to sort the results in ascending or descending order and just jump around between the notes if we go to the note on humans we can
see it's most likely connected to the note on brain if you hover over a result we can see
which neighbors these two notes have in common so this is saying that the notes human and brain have the common neighbors developmental psychology
feeling human brain for example and again these will appear bi-directionally so even though the human note has no links leaving it the note on developmental psychology has a link going to human
and yeah and we can see lots of results here [Music] and they all kind of make sense so that's a link prediction algorithm
it's telling you how likely it is that the current note is connected to every other note so let's go back to our starting page we're done with link prediction
and let's take a look at similarity algorithms next um there are a few different ways you could kind of define this my understanding is that again based on
the structure of your graph the links between your notes and not the content in those notes the similarity algorithm is telling you how similar the current note is to every other note
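One concrete way to score structural similarity from links alone is the Jaccard index over neighbour sets, which is the measure this section goes on to demonstrate. This is a minimal sketch under stated assumptions: the neighbour sets below are invented for illustration and are not taken from the vault in the video.

```python
def jaccard(neighbours, a, b):
    """Shared neighbours divided by all neighbours of either note (0.0 to 1.0)."""
    union = neighbours[a] | neighbours[b]
    if not union:
        return 0.0  # two isolated notes: define similarity as zero
    return len(neighbours[a] & neighbours[b]) / len(union)


# Hypothetical neighbour sets (which notes each note is linked with).
neighbours = {
    "anki": {"memory", "spaced repetition", "obsidian"},
    "repetition suppression": {"memory", "spaced repetition"},
    "trauma": {"stress"},
}

for other in ("repetition suppression", "trauma"):
    print(other, round(jaccard(neighbours, "anki", other), 3))
```

Two notes that share most of their neighbours score close to one even if they do not link to each other, which is why a high score can hint at a missing link.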
so let's change over to the jaccard similarity algorithm and go back to the anki note for example and we can see that [Music] you can hover over the information icon
to see a little bit more again based on the structure it tells you how similar they are and a bit more in depth this algorithm is showing you the ratio of the number of neighbors these two nodes have in
common to the total number of neighbors they have together so the maximum value it could be is one and the lowest is going to be zero and we can see all those results here sorted in
descending order so the note on anki is very similar to the note on obsidian and they are actually linked it's very similar to the note on repetition suppression and those aren't
actually linked so perhaps that's giving you a hint that those notes are similar enough that there should be a link between them yeah and again we can jump around the notes control
hover to [Music] preview the note if it actually has content yeah so we can do that so those are similarity algorithms you'll see that there's another
similarity algorithm listed but it's not shown here even if you scroll and that's because in the settings for graph analysis not all of the algorithms are shown by default there are quite a
few and it can be quite overwhelming so we as the developers have chosen not to show all of them but you can come here and turn some of them on co-citations is shown by default but
not for this video so if i turn on the overlap algorithm the other similarity algorithm and refresh the index to make it appear here you can then show the overlap algorithm and
yeah get an idea of the overlap similarity between notes for which i haven't given a description so if you do want to see a bit more in-depth information about each of
the algorithms you can read the readme on graph analysis you can find it here on the repo and we give quite a bit of information about each of the algorithms and
the actual formula like what is the math doing for this algorithm yeah and that can be quite a lot to take in so mostly what i want to get across is an intuition
for what each of these algorithms is doing so again this is a similarity algorithm telling you that the cognitive strategy note has a high overlap similarity with the time note
for example good so those are similarity algorithms let's have a look at centrality a centrality algorithm
is giving us an idea of which notes are most central if you sort of bring the most highly linked notes to the center of the graph these are going to tell you which notes those
are so the centrality algorithm in graph analysis is hits and if we activate that and jump to it here
we can see these results so one thing to note because it hasn't come up yet is this little icon here it's an icon of a globe of the earth and this is telling us that the hits
algorithm is a global algorithm what that means is that the results don't depend on the currently focused note so if i change notes you'll see that
these results are just staying the same it's a global algorithm the results don't change based on which note you're currently focused on and so the hits algorithm
is telling us the hub score and the authority score of every note in the vault right and so an authority is a note with lots
of links coming into it lots of different notes cite this note and so it has a lot of authority a hub is a note which has lots of links leaving it right and the hits algorithm is showing us
both of the scores in one by default it's sorted by authority in descending order so we can change that to ascending order we see that authority is all zero
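The hub and authority scores described above can be sketched with the textbook HITS power iteration. This is an illustration, not the plugin's code: the `hits` function, the iteration count, and the toy link map are assumptions.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a directed link map.

    `links` maps each note to the notes it links out to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the notes linking in.
        authority = {n: 0.0 for n in nodes}
        for source, targets in links.items():
            for target in targets:
                authority[target] += hub[source]
        # Hub: sum of the authority scores of the notes linked to.
        hub = {n: sum(authority[t] for t in links.get(n, ())) for n in nodes}
        # Normalise so the scores stay bounded between iterations.
        a_norm = math.sqrt(sum(v * v for v in authority.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        authority = {n: v / a_norm for n, v in authority.items()}
        hub = {n: v / h_norm for n, v in hub.items()}
    return hub, authority


# Hypothetical outgoing links; "brain" is cited a lot but links to nothing.
links = {
    "language": ["brain"],
    "human brain": ["brain"],
    "feeling": ["brain", "human"],
    "brain": [],
}
hub, authority = hits(links)
print(max(authority, key=authority.get), max(hub, key=hub.get))
```

Because the scores are computed for the whole graph at once, they do not depend on which note is focused, which matches the "global algorithm" behaviour described above.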
and we can also change it to sort by the hub score in descending or ascending order right so just using the default
sorting we can see that the note on brain has the highest authority most notes link to the brain note yeah a bit of synchronicity there maybe
and it has quite a low hub score yeah so this gives us a way to assess that and if you change notes you can see that the currently focused note is
bolded in the results good so these are some interesting ones not that the others aren't we also have
community detection algorithms um the first of which i'll show is label propagation but first a community detection algorithm is taking the structure of the graph and
using different methods based on the algorithm it's saying well like this clump of notes here forms a community these notes are more highly connected to one another than they are
to the other notes in the graph and so they should form a cluster a community of sorts right and we can sort of gauge that just by looking at the graph like there's a cluster over here maybe one here
a smaller one over here but these community detection algorithms are giving us explicit uh communities so let's look at label propagation see it's also a global algorithm
so the results don't depend on the currently focused note and this is a lot to take in this is the result of running a label propagation algorithm on your graph
i've given a description here basically you give each node in the graph a label which happens to be its own name and then over multiple steps multiple iterations
each note takes on the most popular label among its neighbors so at first that's going to be quite a random process but over time certain
labels are going to become more and more popular in the graph and sort of take over their given community and after the specified number of iterations which you can change
we see the community that each node ended up in and so for this note on polyamory for example these
ended up in that cluster and let's take a look at the more popular ones yeah so the note on happiness forms quite a big cluster and we can see based on this icon that the
personality note is linked to the happiness note and so that gives you a good idea of the validity of that cluster of that community
but even if the notes aren't actually linked to the community note the results are still pretty good right like jealousy is relevant to polyamory for example
and yeah the results are going to be different for your vault but it always kind of surprises me how accurate it is and there's quite a bit that you can change
you can lower the number of iterations to one and based on the algorithm if you only pass labels once then by definition all of the notes in that community are going to be linked to
the community note yeah and so if the number of iterations is quite low then you can have quite a few communities because the more popular
labels haven't had a chance to take over in a sense and if we increase the number of iterations then the happiness note has 517 notes in its community
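The label passing described above can be sketched as follows. This is a simplified, deterministic variant written for illustration: ties are broken alphabetically rather than randomly, all notes update at the same time, and the toy graph (two separate triangles) and the function name are assumptions, not the plugin's implementation.

```python
from collections import Counter


def label_propagation(neighbours, iterations):
    """Each note starts with its own name as label, then repeatedly adopts
    the most common label among its neighbours."""
    labels = {note: note for note in neighbours}
    for _ in range(iterations):
        new_labels = {}
        for note, linked in neighbours.items():
            if not linked:
                new_labels[note] = labels[note]  # isolated notes keep their label
                continue
            counts = Counter(labels[n] for n in linked)
            # Most frequent label wins; alphabetical order breaks ties (assumption).
            new_labels[note] = min(counts, key=lambda lab: (-counts[lab], lab))
        labels = new_labels
    return labels


# Hypothetical toy vault: two groups of three mutually linked notes.
neighbours = {
    "happiness": {"personality", "pleasure"},
    "personality": {"happiness", "pleasure"},
    "pleasure": {"happiness", "personality"},
    "polyamory": {"relationships", "trust"},
    "relationships": {"polyamory", "trust"},
    "trust": {"polyamory", "relationships"},
}

print(label_propagation(neighbours, iterations=1))
print(label_propagation(neighbours, iterations=5))
```

After a single iteration every note carries the name of one of its direct neighbours, which is the "by definition linked to the community note" case. With more iterations one label takes over each group.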
good and you can change the sorting order for example yeah it's quite a fun one i quite like the label propagation algorithm we also have the louvain community
detection algorithm and this is doing something quite similar it's also running a community detection algorithm one that i don't quite know how it works the
coding library i use just implements it for us and so we've added it to graph analysis this is not a global algorithm so it does depend on the currently focused note
and the louvain community detection algorithm only shows you the community that the current node is in it doesn't show the current node because it will always be in that community
yeah and there's a bit of randomness to it so if you refresh the index sometimes the results will be different yeah and you'll see that they're pretty accurate yeah
again you can change the resolution fewer iterations makes for larger communities and more iterations makes for more refined communities that
are generally more accurate um and lastly the clustering coefficient this is a global algorithm and this is kind of interesting um it
tells you the likelihood that a node's neighbors are connected to one another so this note on endorphins has [Music]
the following neighbors there's a bit of a bug there but yeah so this is showing the triangles that this note is a part of but we can
just think of that as its neighbors and so it's telling us that there's quite a high probability that the neighbors of the note on endorphins are connected to one another so
happiness and pleasure are likely to be connected for example yeah and that likelihood sort of tells us something about the note that they're connected to
about the note on endorphins yeah so it's quite a different algorithm there something to play around with good and before i show the last type i
haven't mentioned this freeze icon it's a snowflake so if we go to a non-global algorithm because it's non-global it depends on
the currently focused note if we click around the results are going to change but sometimes you don't want those results to change and so we can freeze it on the currently focused note
and that means that if you click around it will change notes but the results won't change because it's still frozen on the note we froze it on right and then if you do want to unfreeze it
you can just click that again and the results will refresh based on the current note that's a nice way to kind of keep the results but still change notes and then when you want you can unfreeze it
and the results will refresh all right so the last few algorithms that i'm going to show are [Music] natural language processing algorithms
and so all the previous algorithms that i've shown you are structural they only consider the graph the links between notes they don't consider the actual content in the notes right but these natural language
processing algorithms do they look at the actual words in the notes and tell you things about them a little caveat the co-citation algorithm does consider content it
actually considers both but yeah that'll be a different video and so to use these natural language processing algorithms
i'm just gonna disable these other ones and just turn those on in order to use these you'll see that we need the nlp plugin this is a separate plugin that i created
and we need its functionality in order to use the natural language processing algorithms in graph analysis okay so the nlp plugin does a whole bunch of other things but it also gives us a way
to run these algorithms on our vault and i'll uninstall it just to show how to get it using brat okay so
the nlp plugin is not on the community plugin list so you can either install it manually or you can use the brat plugin which is on the community plugin page if we go to brat we can say add a beta
plugin and if you type this the repository for the nlp plugin brat will install it for us as if it was on the
community plugin page it's quite a big plugin so you'll see that it's been installed it's going to refresh then enable the plugin
and you'll see that we have it here these don't matter right now actually they do so in the nlp plugin settings if you want to use it with graph analysis we
need to turn the setting on and yeah so the reason i've kept it as a separate plugin is because this can take quite a while you only need to do it once every time you start obsidian but waiting like
five seconds every time can be quite a lot and so it's not on by default you only need this on if you want to use the nlp plugin with graph analysis just a warning that it can take a
little while depending on the size of your vault so i'm going to reload just to let the nlp plugin do its thing
[Music] not sure why that got disabled um sure okay so yeah this now works so we installed the nlp plugin we
toggled that setting on so that we can use it with graph analysis and now we see each of these three algorithms will work so let's go over each one of them
the bag of words analysis [Music] takes the content of a note it splits it into its individual words
counts how many times each of those words appears and then uses those frequencies to kind of calculate a similarity between notes so if two notes share a lot of words in
common then they can be considered more similar to one another right so for example the graph analysis note is quite similar to the monte carlo method which i think is pretty accurate
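The bag of words idea (split into words, count them, compare the counts) can be sketched like this. The transcript does not say how the frequencies are turned into a similarity, so cosine similarity is used here as an assumption. The tokeniser, the function names, and the three note texts are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(bag_a[w] * bag_b[w] for w in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(v * v for v in bag_a.values()))
    norm_b = math.sqrt(sum(v * v for v in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty note is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical note contents, not the actual notes from the video.
graph_note = bag_of_words("graph analysis runs computational algorithms on a graph")
monte_carlo = bag_of_words("the monte carlo method uses computational algorithms and random sampling")
trauma = bag_of_words("trauma and stress affect memory")

print(cosine_similarity(graph_note, monte_carlo))
print(cosine_similarity(graph_note, trauma))
```

Only the words matter here: no links are consulted, so two notes can score as similar without being connected in the graph at all.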
just based on the words in this note and the words in this note computational algorithms yeah lots of stuff related to the graph analysis
note good and you can tell it's an nlp algorithm because of the little speech bubble yeah then we can jump around looking at
different notes good so again i'll just point out they're fundamentally different in that they're considering the content instead
of just the structure of the graph they're looking at the actual words in the notes all right the next one is um i won't say it i'd butcher it but
this is also a similarity algorithm just like with the louvain community detection algorithm i'm not sure what it's doing under the hood the nlp library that i'm using
just offers it as a feature and so i've implemented it here but again it's taking the content of the currently active note and every other note and comparing them seeing how similar they are to one another
and again we see pretty similar results monte carlo method we've got a different note at the top the note on latex for example is
very similar to this note and that does seem pretty accurate and lastly [Music] we've got a sentiment analysis algorithm this one can take a little moment to
load and it's a global algorithm so it just gives us a value for each note not dependent on the current note and this is running a sentiment analysis
like whether the sentiment is positive or negative and yeah giving that value to us so a higher value indicates more positive sentiment a low value indicates negative sentiment
and so we can look in ascending order the note with the lowest sentiment is on trauma and that makes quite a lot of sense we've got these notes about
like anatomy for example they're pretty bland not a lot of happy words in there yeah so these kind of make a lot of sense as well good something to
play around with all right so those are each of the different algorithms different video for co-citations but i've gone over every other algorithm so i hope that really helps give
you an intuition for what it's doing i know that it can be a lot to take in but perhaps something helpful to note is that there isn't a lot of setup
for graph analysis you can just install it open this view and very quickly get going start seeing link prediction and similarity between your notes so there's not a lot that you need to do
to start using graph analysis but perhaps just overcoming the hurdle of so much information coming at you right so i'll just reiterate again the exact values aren't that important like
it doesn't matter that this is two point nine zero four seven and not four six right it's more just the relation between the notes so this note has a higher sentiment than this one for example
okay and just before we finish i'm going to go over the settings of graph analysis and so we've shown that you can choose different algorithms um
and you can also choose a different one to show by default it will only allow you to choose from the currently selected ones so only now can we choose from all of them
there's an option to exclude infinity by default i don't know if that actually happens anymore so these might be outdated oh this is really cool so we can also
include other files not just markdown files in the vault so this means like images videos powerpoint presentations whatever you can put in your obsidian vault graph analysis can
analyze something that goes quite nicely with this is to show thumbnails for images so if you do have all file extensions included then showing the image can be
quite cool as well [Music] so for example if we go to the note on road cycling we can see now that images are being included
so this note on road cycling is deemed to be relatively similar to the notes on [Music] the different power levels not sure
about cycling don't do it myself but yeah so it can show the images it can show any file extension that you have in your vault you can hover over it to make it bigger
you can jump to the image you can see results from an image's perspective so this image is quite similar to these notes
yeah so even though these notes aren't connected to this image these notes don't link to this image graph analysis says that they're quite similar this applies to
other algorithms too so this image is in this community on road cycling which again is pretty accurate and this is to do with co-citation so i'll skip that you can include unresolved links links
which haven't been created you may or may not want to do this i think it's on by default there are also two means of excluding notes from graph analysis so if you don't want um
particular notes to be shown in the results you can either exclude them using a tag so you can say don't show me notes which have the tag like cycling for example
this vault doesn't have any tags in it so i can't really demonstrate it but yeah you can use like cycling on the road you can use nested
tags you can use multiple tags so like sucker for example and any notes with these tags won't be included in the graph analysis results okay
and lastly you can also exclude notes using a regular expression any file name which matches the regular expression will not be included
so we can see that the road cycling note appears quite highly here and so i will exclude any notes which
have cycling in the name importantly the regular expression is being tested against the full file path of the note not just the base name so you need to include folders and the
extension which gives you a lot of control over the regular expression but might be a catch if you don't realize it sure so let's do a regular expression live
on air [Music] so any folder if the note includes [Music] the word cycling
yeah then it disappears okay so yeah we can exclude notes using a regular expression and with tags and that will hide them from here i
personally use that to exclude my daily notes which i prepend or append dn to yeah so this will exclude that good
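The full-path catch mentioned above is easy to see in a small sketch. This is an illustration only: the plugin runs in Obsidian and presumably uses JavaScript regular expressions, whereas this uses Python's `re` module, and the file paths, the daily-note naming, and the `visible_results` helper are all hypothetical.

```python
import re


def visible_results(paths, exclusion_pattern):
    """Keep only the files whose FULL path does not match the pattern."""
    if not exclusion_pattern:
        return list(paths)  # an empty setting excludes nothing
    regex = re.compile(exclusion_pattern)
    return [path for path in paths if not regex.search(path)]


# Hypothetical vault paths, including folders and extensions.
paths = [
    "Sports/Road cycling.md",
    "Sports/cycling-power-zones.png",
    "Psychology/Language.md",
    "Daily/dn 2021-01-01.md",
]

# An unanchored pattern matches anywhere in the path.
print(visible_results(paths, r"cycling"))

# Anchoring to the bare file name misses, because the folder is part of the path.
print(visible_results(paths, r"^Language\.md$"))

# Allowing for a leading folder makes the intended match work.
print(visible_results(paths, r"(^|/)Language\.md$"))
```

The second call excludes nothing, which is the surprise to watch for: a pattern written against the base name alone will not match once the folder and extension are part of the tested string.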
i'm going to finish there i've covered quite a lot but it's actually basically everything it's just co-citations left which emile is going to make a separate video for as i said but yeah there's so much that you can
do with this plugin i know that it's a lot but it can be really quite powerful so yeah have a play around with it test it in your vault and let us know what you think it's always cool to hear
about interesting results and how you find it useful also please just let us know if you find bugs we're always looking to squish bugs okay cool yeah here's the repo
thanks a lot for listening and i'll post any relevant links in the description cool