00:00:02
Hi there, everybody. It's SkepticMystic from the Obsidian forum and the Discord. In this video I'm going to be going over my and Emile's plugin, called Graph Analysis.
00:00:16
The first thing we need to do is install the plugin. You can do that from the community plugins page; you'll see that I've already got it installed, but using the default settings,
00:00:29
and you can just enable it from there. Before we even look at what Graph Analysis can do, let's consider what it's meant to do. It's a plugin that allows you to see the results of various graph
00:00:45
analysis measures applied to your Obsidian vault. That can sound like a lot. Basically, we're using some pretty fancy math in some cases to
00:00:57
analyze which notes are most similar to one another, or to try to find groups of notes. We'll see how that works. The two main types of algorithms
00:01:10
that we use are, first, structural: these algorithms only consider the links in your Obsidian graph, not the contents of the notes. The other kind
00:01:22
looks at what's inside the note: they compare the words inside the note to the contents of other notes. Before I go any further, I just want to note that I'm using a publicly
00:01:36
available vault, which you can find at this GitHub repository. I'll post a link to it in the description. This vault has quite a few files in
00:01:52
it and lots of useful links between them, on a lot of different subjects, so you can use it to test your plugins, for example, or Graph Analysis.
00:02:03
So let's take a look at the results of Graph Analysis. The main thing we're going to be using from
00:02:17
the plugin is the Graph Analysis view. To open it, we need to run this command, "Open Graph Analysis View", and you'll see a new leaf appear on the right side.
00:02:31
It should look something like this, depending on which page you're on in your vault. Let's just pick a random note, and we'll pick this algorithm to run. I'll explain these
00:02:44
in a moment. Not all notes are going to have results, but Language is a good one to use. The note about language
00:02:57
gives us all of these results, and this is a lot of information to take in. As I've noted here, in a lot of cases there are going to be a lot of numbers (sometimes that's not the result
00:03:11
of the algorithm), but what I want to stress is that the exact values aren't really what you should be considering most of the time. The relation between
00:03:23
those values and the values for other notes in your vault is what gives you the most information. We'll see how that plays out using some examples. Good, so
00:03:36
I'm going to go over each of the algorithms available in Graph Analysis except for co-citations. Co-citations is quite a big topic on its own, so Emile,
00:03:47
the resident expert on co-citations, is going to make a separate video just for that. With co-citations aside, there are a few different types of algorithms, which I've grouped into these categories.
00:04:03
The first kind I'll go over is link prediction algorithms. A link prediction algorithm takes the structure of your graph and says: look at the current note, and for
00:04:16
every other note, predict how likely it is that these two notes should be connected to each other, whether or not they actually are. Okay.
00:04:28
So this is the result that I showed on the Language note. Let's go back to the Language note, and we can see all of these results here. What this is saying is that,
00:04:44
based on the structure of your graph, these are the notes that should be linked to the current note. It's predicting quite highly that the note about humans
00:04:57
should be connected to the note about language, and that makes sense intuitively. A helpful way to gauge the result is to see whether those notes are actually linked. If we see this little
00:05:08
icon next to a result, that means the note on language is actually linked to the note on brain, and we can see that over here, for example. This icon will appear
00:05:20
bidirectionally, so if the Brain note links to the Language note, it will also appear next to the result there. And we can
00:05:35
gauge these results: language and psychology are quite linked, for example. You can scroll through them, and you can also sort the results in ascending or descending order.
00:05:48
So, of the results that weren't zero, Logical Connective is the least likely to be linked to Language. We actually filter out zeros,
00:06:01
so this toggle isn't doing anything in this case, but it's nice to be able to sort the results in ascending or descending order and jump around between the notes. If we go to the note on humans, we can
00:06:14
see it's most likely connected to the note on brain. If you hover over a result, we can see
00:06:25
which notes these two have in common. So this is saying that the Human and Brain notes have common neighbors: Developmental Psychology,
00:06:38
Feeling, Human Brain, for example. Again, these count bidirectionally, so even though the Human note has no links leaving it, the note on developmental psychology has a link going to Human.
00:06:51
We can see lots of results here, and they all kind of make sense. So that's a link prediction algorithm:
00:07:03
it's telling you how likely it is that the current note is connected to every other note. Let's go back to our starting page; we're done with link prediction.
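The video does not name the link prediction measure being shown, so the following is only a minimal sketch of the general idea, assuming an Adamic-Adar style score: two notes score higher the more neighbors they share, and a shared neighbor with few links counts for more than one with many. Links are treated as undirected, matching the bidirectional behaviour described above. The function names and the tiny example vault are made up for illustration.

```python
import math
from collections import defaultdict


def undirected_neighbors(links):
    """Build a neighbor map, ignoring link direction."""
    nbrs = defaultdict(set)
    for src, dst in links:
        nbrs[src].add(dst)
        nbrs[dst].add(src)
    return nbrs


def common_neighbors(nbrs, a, b):
    """Notes that are linked (in either direction) to both a and b."""
    return nbrs[a] & nbrs[b]


def adamic_adar(nbrs, a, b):
    """Sum of 1 / log(degree) over the neighbors that a and b share."""
    score = 0.0
    for shared in common_neighbors(nbrs, a, b):
        degree = len(nbrs[shared])
        if degree > 1:  # a shared neighbor always has degree >= 2 when a != b
            score += 1.0 / math.log(degree)
    return score


# Hypothetical vault: (source note, target note) pairs.
links = [
    ("Developmental Psychology", "Human"),
    ("Developmental Psychology", "Brain"),
    ("Feeling", "Human"),
    ("Feeling", "Brain"),
    ("Language", "Brain"),
]
nbrs = undirected_neighbors(links)
shared = common_neighbors(nbrs, "Human", "Brain")
score = adamic_adar(nbrs, "Human", "Brain")
```

In this toy vault, Human has no outgoing links at all, yet it still shares two neighbors with Brain because incoming links count too.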
00:07:19
Let's take a look at similarity algorithms next. There are a few different ways you could define this. My understanding is that, again based on
00:07:32
the structure of your graph (the links between your notes, not the content in those notes), a similarity algorithm tells you how similar the current note is to every other note.
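As a rough illustration of a structural similarity measure, here is a minimal sketch of the Jaccard index over two notes' neighbor sets: the number of shared neighbors divided by the number of distinct neighbors the two notes have between them. The note names are hypothetical, and this is not taken from the plugin's source.

```python
def jaccard(neighbors_a, neighbors_b):
    """Shared neighbors divided by all distinct neighbors; ranges from 0 to 1."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0  # two isolated notes: treat as not similar
    return len(neighbors_a & neighbors_b) / len(union)


# Hypothetical neighbor sets for two notes.
anki = {"Spaced Repetition", "Flashcards", "Obsidian"}
other = {"Spaced Repetition", "Flashcards", "Memory"}
similarity = jaccard(anki, other)  # 2 shared out of 4 distinct neighbors
```

Only the links matter here: the two notes could have completely different text and still score highly.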
00:07:46
So let's change over to the Jaccard similarity algorithm and go back to the Anki note, for example. You can hover over the information icon
00:07:59
to see a little bit more. Again, based on the structure, it tells you how similar they are. In a bit more depth, this algorithm shows you the ratio of the number of neighbors these two notes have in
00:08:11
common to the total number of neighbors they have together. So the maximum value is one and the lowest is zero. We can see all those results here, sorted in
00:08:23
descending order. So the note on Anki is very similar to the note on Obsidian, and they are actually linked. It's also very similar to the note on repetition suppression, and those aren't
00:08:36
actually linked, so perhaps that's a hint that those notes are similar enough that there should be a link between them. And again, we can jump around the notes, or Ctrl-
00:08:49
hover to preview a note, if it actually has content. So we can do that. Those are similarity algorithms. You'll see that there's another
00:09:03
similarity algorithm listed, but it's not shown here even if you scroll. That's because, in the settings for Graph Analysis, not all of the algorithms are shown by default. There are quite a
00:09:14
few, and it can be quite overwhelming, so we as the developers have chosen not to show all of them. But you can come here and turn some of them on. Co-citations is shown by default, but
00:09:27
not for this video. So if I turn on the Overlap algorithm, the other similarity algorithm, and you refresh the index to make it appear here, you can then show the Overlap algorithm and
00:09:40
get an idea of the overlap similarity between notes, for which I haven't given a description. If you do want to see more in-depth information about each of
00:09:53
the algorithms, you can read the README for Graph Analysis. You can find it here on the repo, and we give quite a bit of information about each of the algorithms and
00:10:06
the actual formula, that is, what the math is doing for each algorithm. That can be quite a lot to take in, so mostly what I want to get across is an intuition
00:10:18
for what each of these algorithms is doing. So again, this is a similarity algorithm, telling you that the Cognitive Strategy note has a high overlap similarity with the Time note,
00:10:30
for example. Good, so those are similarity algorithms. Let's have a look at centrality. A centrality algorithm
00:10:48
gives us an idea of which notes are most central. If you bring the most highly linked notes to the center of the graph, these algorithms are going to tell you which notes those
00:11:00
are. The centrality algorithm in Graph Analysis is HITS, and if we activate that and jump to it here,
00:11:12
we can see these results. One thing to note, because it hasn't come up yet, is this little icon here. It's an icon of a globe, of the Earth, and it's telling us that the HITS
00:11:25
algorithm is a global algorithm. What that means is that the results don't depend on the currently focused note. So if I change notes, you'll see that
00:11:37
these results just stay the same. It's a global algorithm: the results don't change based on which note you're currently focused on. The HITS algorithm
00:11:54
tells us the hub score and the authority score of every note in the vault. An authority is a note with lots
00:12:06
of links coming into it: lots of different notes cite this note, and so it has a lot of authority. A hub is a note which has lots of links leaving it. The HITS algorithm shows us
00:12:19
both of the scores in one view. By default it's sorted by authority in descending order. If we change that to ascending order, we see that the lowest authority scores are all zero.
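To make hubs and authorities concrete, here is a minimal sketch of the standard HITS iteration, written from the textbook definition rather than from the plugin's code. A note's authority score is the sum of the hub scores of the notes linking to it, its hub score is the sum of the authority scores of the notes it links to, and both are normalised each round. The example links and the iteration count are assumptions for illustration.

```python
def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a list of directed links."""
    nodes = {name for edge in links for name in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the notes that link to this note.
        auth = {n: 0.0 for n in nodes}
        for src, dst in links:
            auth[dst] += hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub: sum of authority scores of the notes this note links to.
        hub = {n: 0.0 for n in nodes}
        for src, dst in links:
            hub[src] += auth[dst]
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Hypothetical directed links: (source note, target note).
links = [
    ("Language", "Brain"),
    ("Memory", "Brain"),
    ("Emotion", "Brain"),
    ("Language", "Memory"),
]
hub, auth = hits(links)
top_authority = max(auth, key=auth.get)
top_hub = max(hub, key=hub.get)
```

A note with no incoming links ends up with an authority of exactly zero, which is consistent with the zeros that appear when the results are sorted in ascending order.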
00:12:31
We can also change it to sort by the hub score, in descending or ascending order. So, just using the default
00:12:43
sorting, we can see that the note on brain has the highest authority: most notes link to the Brain note. A bit of synchronicity there, maybe.
00:12:57
And it has quite a low hub score. So this gives us a way to assess that. And if you change notes, you can see that the currently focused note is
00:13:11
bolded in the results. Good, so these are some interesting ones (not that the others aren't). We also have
00:13:27
community detection algorithms, the first of which I'll show is Label Propagation. But first: a community detection algorithm takes the structure of the graph and,
00:13:41
using different methods depending on the algorithm, says that this clump of notes here forms a community: these notes are more highly connected to one another than they are
00:13:53
to the other notes in the graph, and so they should form a cluster, a community of sorts. We can sort of gauge that just by looking at the graph: there's a cluster over here, maybe one here,
00:14:06
a smaller one over here. But these community detection algorithms give us explicit communities. So let's look at Label Propagation. You can see it's also a global algorithm,
00:14:20
so the results don't depend on the currently focused note. And this is a lot to take in. This is the result of running a label propagation algorithm on your graph.
00:14:33
I've given a description here. Basically, you give each note in the graph a label, which happens to be its own name, and then over multiple steps, multiple iterations,
00:14:45
each note takes on the most popular label among its neighbors. At first that's going to be quite a random process, but over time certain
00:14:57
labels are going to become more and more popular in the graph and take over their given community. After the specified number of iterations, which you can change,
00:15:10
we see the community that each note landed up in. So on this note on polyamory, for example, these notes
00:15:21
ended up in that cluster. Let's take a look at the more popular ones. So the note on happiness forms quite a big cluster, and we can see from this icon that the
00:15:35
Personality note is linked to the Happiness note, and that gives you a good idea of the validity of that cluster, of that community.
00:15:47
But even if the notes aren't actually linked to the community note, the results are still pretty good. Jealousy is relevant to polyamory, for example.
00:15:59
The results are going to be different for your vault, but it always surprises me how accurate it is. And there's quite a bit that you can change.
00:16:15
You can lower the number of iterations to one, and, based on the algorithm, if you only pass labels once, then by definition all of the notes in that community are going to be linked to
00:16:28
the community note. So if the number of iterations is quite low, you can have quite a few communities, because the more popular
00:16:41
labels haven't had a chance to take over, in a sense. And if we increase the number of iterations, then the Happiness note has 517 notes in its community.
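The label propagation steps described above can be sketched as follows. This is a simplified illustration, not the plugin's implementation: it assumes links are treated as undirected, updates notes one at a time in alphabetical order, and breaks ties by picking the alphabetically smallest label so that the result is reproducible (real implementations often break ties randomly). The example graph is made up.

```python
from collections import Counter, defaultdict


def label_propagation(links, iterations=10):
    """Group notes into communities keyed by the label that won."""
    nbrs = defaultdict(set)
    for a, b in links:
        nbrs[a].add(b)
        nbrs[b].add(a)

    # Every note starts with its own name as its label.
    labels = {note: note for note in nbrs}

    for _ in range(iterations):
        for note in sorted(nbrs):
            counts = Counter(labels[n] for n in nbrs[note])
            most = max(counts.values())
            # Deterministic tie-break: smallest label among the most popular.
            labels[note] = min(l for l, c in counts.items() if c == most)

    communities = defaultdict(set)
    for note, label in labels.items():
        communities[label].add(note)
    return dict(communities)


# Two hypothetical, unconnected triangles of notes.
links = [("A", "B"), ("B", "C"), ("A", "C"),
         ("X", "Y"), ("Y", "Z"), ("X", "Z")]
communities = label_propagation(links, iterations=5)
untouched = label_propagation(links, iterations=0)
```

With zero iterations every note is still its own community, which mirrors the point above that fewer iterations leave more, smaller communities.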
00:16:53
Good, and you can change the sorting order, for example. It's quite a fun one; I quite like the label propagation algorithm. We also have the Louvain community
00:17:08
detection algorithm, and this does something quite similar. It's also a community detection algorithm, but one where I don't quite know how it works. The
00:17:20
coding library I use just implements it for us, and so we've added it to Graph Analysis. This is not a global algorithm, so it does depend on the currently focused note.
00:17:32
The Louvain community detection algorithm only shows you the community that the current note is in. It doesn't show the current note itself, because it will always be in that community.
00:17:45
There's a bit of randomness to it, so if you refresh the index, sometimes the results will be different. And you'll see that they're pretty accurate.
00:18:03
Again, you can change the resolution: fewer iterations make for larger communities, and more iterations make for more refined communities that
00:18:14
are generally more accurate. And lastly, the clustering coefficient. This is a global algorithm, and this is kind of interesting. It
00:18:36
tells you the likelihood that a note's neighbors are connected to one another. So this note on endorphins has
00:18:48
the following neighbors. There's a bit of a bug there, but this is showing the triangles that this note is a part of. We can
00:18:59
just think of those as its neighbors. So it's telling us that there's quite a high probability that the Endorphins note's neighbors are connected to one another, so
00:19:12
Happiness and Pleasure are likely to be connected, for example. And that likelihood tells us something about the note that they're connected to,
00:19:24
about the note on endorphins. So it's quite a different algorithm there, something to play around with. Good. Before I show the last type, I
00:19:40
haven't mentioned this freeze icon; it's a snowflake. If we go to a non-global algorithm, then because it's non-global, it depends on
00:19:53
the currently focused note: if we click around, the results are going to change. But sometimes you don't want those results to change, and so we can freeze it on the currently focused note.
00:20:05
That means that if you click around, it will change notes, but the results won't change, because it's still frozen on the note we froze it on. Then, if you do want to unfreeze it,
00:20:17
you can just click that again, and the results will refresh based on the current note. That's a nice way to keep the results while still changing notes, and then, when you want, you can unfreeze it
00:20:30
and the results will refresh. All right, so the last few algorithms that I'm going to show are natural language processing algorithms.
00:20:50
All the previous algorithms that I've shown you are structural: they only consider the graph, the links between notes; they don't consider the actual content of the notes. But these natural language
00:21:03
processing algorithms do: they look at the actual words in the notes and tell you things about them. A little caveat: the co-citation algorithm does consider content; it
00:21:15
actually considers both. But that'll be a different video. So, to use these natural language processing algorithms,
00:21:27
I'm just going to disable these other ones and turn on those ones. In order to use these, you'll see that we need the NLP plugin. This is a separate plugin that I created,
00:21:43
and we need its functionality in order to use the natural language processing algorithms in Graph Analysis. The NLP plugin does a whole bunch of other things, but it also gives us a way
00:21:56
to run these algorithms on our vault. I'll uninstall it just to show how to get it using BRAT. Okay, so
00:22:13
the NLP plugin is not on the community plugin list, so you can either install it manually or use the BRAT plugin, which is on the community plugins page. If we go to BRAT, we can say "add a beta
00:22:26
plugin", and if you type this, the repository for the NLP plugin, BRAT will install it for us as if it were on the
00:22:38
community plugins page. It's quite a big plugin. You'll see that it's been installed; it's going to refresh that. Enable the plugin,
00:22:56
and you'll see that we have it here. These don't matter right now. Actually, they do: in the NLP plugin settings, if you want to use it with Graph Analysis, we
00:23:13
need to turn this setting on. The reason I've kept it as a separate plugin is that this can take quite a while. You only need to do it once every time you start Obsidian, but waiting, like,
00:23:27
five seconds every time can be quite a lot, and so it's not on by default. You only need this on if you want to use the NLP plugin with Graph Analysis. Just a warning that it can take a
00:23:40
little while, depending on the size of your vault. So I'm going to reload, just to let the NLP plugin do its thing.
00:23:57
Not sure why that got disabled. Okay, so this now works. We installed the NLP plugin, we
00:24:16
toggled that setting on so that we can use it with Graph Analysis, and now we see that each of these three algorithms will work. So let's go over each one of them.
00:24:27
The bag-of-words analysis takes the content of a note, splits it into its individual words,
00:24:42
counts how many times each of those words appears, and then uses those frequencies to calculate a similarity between notes. So if two notes share a lot of words in
00:24:53
common, then they can be considered more similar to one another. So, for example, the Graph Analysis note is quite similar to the Monte Carlo Method note, which I think is pretty accurate.
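The bag-of-words steps above (split into words, count them, compare the counts) can be sketched as follows. The video does not say how the frequencies are turned into a similarity, so cosine similarity is an assumption here, as are the simple tokeniser and the sample texts.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two word-count vectors (0 to 1)."""
    dot = sum(bag_a[w] * bag_b[w] for w in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty note is not similar to anything
    return dot / (norm_a * norm_b)


# Hypothetical note contents.
graph_note = bag_of_words("Graph analysis uses computational algorithms on a graph")
monte_carlo = bag_of_words("Monte Carlo methods are computational algorithms")
cooking = bag_of_words("Simmer the onions slowly in butter")

related = cosine_similarity(graph_note, monte_carlo)
unrelated = cosine_similarity(graph_note, cooking)
```

Unlike the structural measures, this ignores links entirely: two notes that never reference each other can still score highly if they use the same vocabulary.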
00:25:06
That's just based on the words in this note and the words in that note: computational, algorithms, lots of stuff related to the Graph Analysis
00:25:18
note. Good. And you can tell it's an NLP algorithm because of the little speech bubble. Then we can jump around, looking at
00:25:30
different notes. Again, I'll just point out that these are fundamentally different, in that they're considering the content instead
00:25:44
of just the structure of the graph; they're looking at the actual words in the notes. All right, the next one is... I won't say that, I'd butcher it, but
00:25:56
this is also a similarity algorithm. Just like with the Louvain community detection algorithm, I'm not sure what it's doing under the hood. The NLP library that I've used
00:26:09
just offers it as a feature, and so I've implemented it here. But again, it takes the content of the currently active note and of every other note and compares them, seeing how similar they are to one another.
00:26:22
And again we see pretty similar results: Monte Carlo Method, and we've got a different note at the top. The note on LaTeX, for example, is
00:26:34
very similar to this note, and that does seem pretty accurate. And lastly, we've got a sentiment analysis algorithm. This one can take a little moment to
00:26:54
load, and it's a global algorithm, so it just gives us a value for each note, not dependent on the current note. This is running a sentiment analysis,
00:27:06
that is, whether the sentiment is positive or negative, and giving that value to us. A higher value indicates more positive sentiment; a low value indicates negative sentiment.
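As a rough illustration of how a per-note sentiment value can arise, here is a minimal lexicon-based sketch: each known word carries a positive or negative weight, and a note's score is the sum over its words. The video does not describe how the NLP library actually scores sentiment, so the approach, the tiny lexicon, and its weights are all assumptions for illustration only.

```python
import re

# Hypothetical word weights; a real sentiment lexicon is far larger.
LEXICON = {
    "happy": 2, "joy": 2, "good": 1,
    "bad": -1, "pain": -2, "trauma": -3,
}


def sentiment_score(text):
    """Sum the weights of known words; unknown words contribute nothing."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


# Hypothetical note contents.
notes = {
    "Trauma": "Trauma causes lasting pain",
    "Femur": "The femur is the longest bone",
    "Holiday": "A happy and good day",
}
scores = {name: sentiment_score(text) for name, text in notes.items()}
ascending = sorted(scores, key=scores.get)
```

Under this kind of scoring a purely descriptive note sits near zero, since it contains few emotionally loaded words in either direction.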
00:27:19
So we can look, in ascending order, at the lowest note: the note with the lowest sentiment is on trauma, and that makes quite a lot of sense. We've got these notes about
00:27:32
anatomy, for example, which are pretty bland, without a lot of happy words in there. So these make a lot of sense as well. Good, something to
00:27:46
play around with. All right, so those are each of the different algorithms. There's a different video for co-citations, but I've gone over every other algorithm, so I hope that really helps give
00:27:59
you an intuition for what it's doing. I know that it can be a lot to take in, but perhaps something helpful to note is that there isn't a lot of setup
00:28:11
for Graph Analysis. You can just install it, open this view, and very quickly get going and start seeing link prediction and similarity between your notes. So there's not a lot that you need to do
00:28:23
to start using Graph Analysis, apart from perhaps overcoming the hurdle of so much information coming at you. So I'll just reiterate: the exact values aren't that important.
00:28:36
It doesn't matter that this is 2.9047 and not 2.9046; it's more the relation between the notes. So this note has a higher sentiment than this one, for example.
00:28:50
Okay, and just before we finish, I'm going to go over the settings of Graph Analysis. We've shown that you can choose different algorithms,
00:29:02
and you can also choose a different one to show by default. It will only allow you to choose from the currently selected ones, so only now can we choose from all of them.
00:29:15
There's an option to exclude infinity by default. I don't know if that actually happens anymore, so these might be outdated. Oh, this is really cool: we can also
00:29:28
include other files, not just Markdown files, in the vault. So this means images, videos, PowerPoint presentations: whatever you can put in your Obsidian vault, Graph Analysis can
00:29:42
analyze. Something that goes quite nicely with this is showing thumbnails for images. So if you do have all file extensions included, then showing the image can be
00:29:56
quite cool as well. So, for example, if we go to the note on road cycling, we can see now that images are being included.
00:30:11
So this note on road cycling is deemed to be relatively similar to the notes on the different power levels. I'm not sure
00:30:23
about cycling, I don't do it myself. But it can show the images; it can show any file extension that you have in your vault. You can hover over an image to make it bigger,
00:30:34
you can jump to the image, and you can see results from an image's perspective. So this image is quite similar to these notes.
00:30:46
So even though these notes aren't connected to this image (they don't link to it), Graph Analysis says that they're quite similar. This applies to
00:31:01
other algorithms too, so this image is in this community on road cycling, which again is pretty accurate. And this is to do with co-citations, so I'll skip that. You can include unresolved links, links
00:31:18
to notes which haven't been created. You may or may not want to do this; I think it's on by default. There are also two means of excluding notes from Graph Analysis. So if you don't want
00:31:33
particular notes to be shown in the results, you can either exclude them using a tag, so you can say: don't show me notes which have a tag like cycling, for example.
00:31:46
This vault doesn't have any tags in it, so I can't really demonstrate it, but you can use something like cycling/road: you can use nested
00:31:57
tags, and you can use multiple tags, like soccer, for example. Any notes with these tags won't be included in the Graph Analysis results. Okay.
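The tag-based exclusion just described might look something like the sketch below. How the plugin actually matches tags is not shown in the video; in particular, the rule that excluding a parent tag also excludes its nested children is an assumption, and the function name and sample tags are made up.

```python
def is_excluded_by_tag(note_tags, excluded_tags):
    """True if any of the note's tags equals, or is nested under, an excluded tag."""
    for tag in note_tags:
        tag = tag.lstrip("#").lower()
        for excluded in excluded_tags:
            excluded = excluded.lstrip("#").lower()
            # Assumption: "#cycling" also covers nested tags like "#cycling/road".
            if tag == excluded or tag.startswith(excluded + "/"):
                return True
    return False


# Hypothetical exclusion list with two tags.
excluded = ["#cycling", "#soccer"]
hidden_nested = is_excluded_by_tag(["#cycling/road"], excluded)
hidden_exact = is_excluded_by_tag(["#soccer", "#notes"], excluded)
shown = is_excluded_by_tag(["#cyclingnews"], excluded)
```

Matching on the tag plus a trailing slash keeps an unrelated tag that merely starts with the same letters from being hidden by accident.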
00:32:11
And lastly, you can also exclude notes using a regular expression. Any file name which matches the regular expression will not be included.
00:32:23
We can see that the Road Cycling note appears quite highly here, and so I will exclude any notes which
00:32:35
have cycling in the name. Importantly, the regular expression is tested against the full file path of the note, not just the base name, so you need to include folders and the
00:32:48
extension. That gives you a lot of control over the regular expression, but it might be a catch if you don't realize it. So let's do a regular expression live
00:33:01
on air. So: any folder, and if the note name includes the word cycling,
00:33:23
then it disappears. Okay, so we can exclude notes using a regular expression and with tags, and that will hide them from here. I
00:33:35
personally use that to exclude my daily notes, which I prepend or append "DN" to. So this will exclude that. Good.
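The catch with full-path matching can be illustrated with a small sketch. It uses Python's `re` module as a stand-in for the plugin's regular expression handling, and the file paths and patterns are hypothetical; the point is only that a pattern anchored to the start of the file name will not match once the folder is part of the tested string.

```python
import re


def visible_paths(paths, exclusion_pattern):
    """Keep only the paths that the exclusion pattern does not match."""
    regex = re.compile(exclusion_pattern)
    return [p for p in paths if not regex.search(p)]


# Hypothetical vault paths, including folder and extension.
paths = [
    "Sports/Road cycling.md",
    "Sports/Running.md",
    "Daily/2021-10-01 DN.md",
]

# Anchored at the start of the whole path, so the folder prefix defeats it.
anchored = visible_paths(paths, r"^Road cycling")
# Unanchored: matches "cycling" anywhere in the full path.
no_cycling = visible_paths(paths, r"cycling")
# Daily notes with "DN" appended, matched together with the extension.
no_daily = visible_paths(paths, r" DN\.md$")
```

Because the extension is part of the tested string, a pattern meant to match the end of a note name has to account for `.md` as well.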
00:33:48
I'm going to finish there. We've covered quite a lot, but it's actually basically everything. It's just co-citations left, which Emile is going to make a separate video for, as I said. But there's so much that you can
00:34:00
do with this plugin. I know that it's a lot, but it can be really quite powerful. So have a play around with it, test it in your vault, and let us know what you think. It's always cool to hear
00:34:13
about interesting results and how you find it useful. Also, please just let us know if you find bugs; we're always looking to squish bugs. Okay, cool, here's the repo.
00:34:25
Thanks a lot for listening, and I'll post any relevant links in the description. Cool.