Obsidian Performance Test - Take 1

Imagine a database containing the full text of all the books and publications you have ever read, plus all your reading notes with links to the source paragraphs and your text highlights in the source. It also has everything you have ever written: your journal, all the articles, books, documents, and reports. Such a database would be a powerful source of knowledge.

With persistence, it is possible to build this in Obsidian. But can Obsidian handle this volume of information? This post is about my attempt to understand Obsidian's performance limits. I will present my approach and findings.

Obsidian's architecture at a glance

Obsidian is a web application that runs inside Electron, a desktop runtime built on the Chromium browser engine. It operates on a folder of files stored locally on your computer; Obsidian calls this folder the Vault. Information is stored in documents and attachments such as image files. Documents are formatted using Markdown, a simple, platform-independent markup language. Obsidian provides features that facilitate navigating and editing these documents: search; linking to documents, images, chapters, and paragraphs (a.k.a. blocks); autocomplete for links; management of backlinks; automatic link updates when you move or rename files; publishing; synchronization; and so on.

As I was preparing for this test, my expectation was that Obsidian should be very scalable because it stores documents and attachments in the computer's file system. All Obsidian needs to maintain is an index database to aid full-text search and the maintenance/navigation of links.
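To make the idea of a link index concrete, here is a minimal sketch of how backlinks could be derived from the files alone. It is not Obsidian's actual implementation; the function name buildBacklinkIndex and the input shape (an object mapping note names to their text) are assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT Obsidian's real index: derive backlinks by scanning
// every note for [[wikilinks]]. `notes` maps a note name to its Markdown text.
function buildBacklinkIndex(notes) {
  const backlinks = new Map(); // target note name -> Set of source note names
  // Captures the note name and skips an optional #heading / #^block part and |alias.
  const wikilink = /\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]/g;
  for (const [source, text] of Object.entries(notes)) {
    for (const match of text.matchAll(wikilink)) {
      const target = match[1].trim();
      if (!backlinks.has(target)) backlinks.set(target, new Set());
      backlinks.get(target).add(source);
    }
  }
  return backlinks;
}

const index = buildBacklinkIndex({
  "Reading notes": "See [[Moby Dick#Chapter 1]] and [[Walden|Thoreau]].",
  "Journal": "Re-read [[Walden]] today.",
});
console.log([...index.get("Walden")]); // logs the two notes that link to Walden
```

The point of the sketch is that such an index is small compared to the documents themselves: it stores names, not text, which is why I expected the file-based design to scale.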

My test database

I loaded 2459 full-text books into my test Vault and generated 565 full-book literature notes containing 161 820 block references. The total size of my test Vault is 727 MB, comprising 8 074 688 non-empty paragraphs (a.k.a. blocks). I did not load any images. The largest document I loaded is 14.1 MB (2 662 616 words).

I used plain-text books from Project Gutenberg. For these, I auto-generated 18 645 chapter headings (e.g. # Chapter 1). Besides the Project Gutenberg files, I also loaded Joschua's Bible Study in Obsidian Kit.
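The heading-generation script is not included in this post. As a rough illustration only, the sketch below shows one way such headings could be produced. It assumes that chapters in the plain-text file start on their own line with the word "Chapter" followed by an Arabic or Roman numeral; the function name addChapterHeadings is hypothetical, and the pattern is deliberately naive.

```javascript
// Hypothetical sketch: turn plain-text chapter markers into Markdown headings.
// Assumes a chapter starts on its own line, e.g. "CHAPTER IV" or "Chapter 12. Title".
function addChapterHeadings(text) {
  const marker = /^\s*chapter\s+([0-9]+|[ivxlcdm]+)\b.*$/i;
  let count = 0;
  const lines = text.split(/\r?\n/).map((line) => {
    if (!marker.test(line)) return line;
    count++;
    return "# " + line.trim();
  });
  // Note: the result uses LF line endings regardless of the input.
  return { text: lines.join("\n"), count };
}

const sample = "CHAPTER I\nCall me Ishmael.\n\nChapter 2. The Carpet-Bag\nI stuffed a shirt or two.";
console.log(addChapterHeadings(sample).count);
```

A sentence that happens to begin with "Chapter VI ..." would also be turned into a heading, so real books would need a stricter pattern or a manual review pass.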

Some may argue that they read more than 2500 books and create detailed notes for more than 550 of them. Personally, I'd be thrilled to have detailed personal notes on 50 books! I feel this volume of written information is a realistic target for a Personal Knowledge Management system.

I did not include images and multimedia files in my test, because even though these take up significant space on the local filesystem, I believe that Obsidian only indexes their filenames. I don't expect images and other attachments to have a major contribution to overall Obsidian performance.

Findings

Obsidian performed well. The editing experience, such as highlighting paragraphs, worked smoothly even with the largest document. Search was slow at first, but Obsidian may still have been indexing in the background. By the time I tried to capture the slow search on screen, it was already performing well.

Referencing chapters worked well

Even with this volume of chapters, searching for chapters felt very smooth. You look up a chapter or sub-chapter by typing [[## followed by part of the chapter's heading text.

Referencing blocks had some limitations

The same discovery feature, however, stopped working for block references. In my main Vault, I can reference paragraphs by typing [[^^ followed by some text from the paragraph. In my performance test Vault, this did not work. If, however, I know the title of the document that contains the paragraph, then typing [[Document-Title#^ followed by some text from the paragraph works smoothly.

If I am unsure which document contains the paragraph I want, I can use search to find it first. I open Obsidian search with a hotkey and type the beginning of the paragraph I am looking for. Once it is found, I drag the link to the containing document into my notes, then use the [[Document-Title#^ format to search for the specific paragraph (again) and create a block reference to it. Admittedly, this is an extra step; Roam's solution of CTRL+drag to create a block reference is much nicer. But the workaround works, and the use case is not frequent enough to make this a deal-breaker.
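The two-step lookup can also be sketched in code. The helper below is a hypothetical illustration, not a feature of Obsidian: findBlockCandidates and its input shape (an object mapping document titles to text) are my own assumptions. It searches all documents for a piece of text and, where the matching paragraph already ends with a block ID, builds the corresponding block reference.

```javascript
// Hypothetical helper: find paragraphs containing `query` across documents and
// propose a block reference for each hit. `docs` maps a document title to its text.
function findBlockCandidates(docs, query) {
  const hits = [];
  const needle = query.toLowerCase();
  for (const [title, text] of Object.entries(docs)) {
    text.split(/\r?\n/).forEach((line, lineNumber) => {
      if (!line.toLowerCase().includes(needle)) return;
      // Reuse a block ID if the paragraph already ends with one (" ^abc123").
      const existing = line.match(/ \^([A-Za-z0-9-]+)$/);
      hits.push({
        title,
        lineNumber, // zero-based index of the line within the document
        blockId: existing ? existing[1] : null,
        link: existing ? `[[${title}#^${existing[1]}]]` : null,
      });
    });
  }
  return hits;
}

const docs = {
  "Moby Dick": "Call me Ishmael. ^abc12\nSome years ago, never mind how long.",
  "Walden": "I went to the woods because I wished to live deliberately.",
};
console.log(findBlockCandidates(docs, "call me"));
```

Inside Obsidian, the docs object could be filled using app.vault.getMarkdownFiles() and app.vault.read(), the same calls the scripts at the end of this post rely on.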

Editing a 14MB large document felt fluid

The document is titled "gn06v10" and contains The Entire PG Works of George Meredith. Here is what I did in the demo.

First, I open search to look for this document. Note that search would probably have been faster if, instead of running a full-text search for "gn06v10", I had searched for file:gn06v10. Search locates the document in a few seconds.
Next, I open the document and add some random text highlights. Highlights take 1-2 seconds to take effect.
Finally, I scroll to the middle of the document (scrolling is quick) and add a line of text. Here, Obsidian freezes for several seconds, then adds the sentence I typed from the keyboard buffer. Remember, this is a 14 MB text file. I am not aware of any other text editor that would perform significantly better with a document of this size.

Hitting the wrong button occasionally resulted in long waits... sometimes followed by a black screen

Truth be told, these incidents happened right after I loaded the large volume of documents into Obsidian. When I tried to reproduce the issue for a screen capture, I could not. I attribute it to Obsidian still indexing in the background. That said, it is not elegant that starting a search made Obsidian freeze, eventually halt, and leave a blank black screen behind. After I terminated and restarted Obsidian, the same search worked with no issues.

Conclusion

Obsidian passed this first performance test. Apart from the issue with block discovery and the occasional black screen, it handled the large volume of text well.

Loading and reading books in Obsidian, however, is not yet well solved. In parallel with this performance test, I started to read my first ebook fully within Obsidian. Unfortunately, there don't seem to be good tools for converting ePub files into Markdown (separate files per chapter, a proper table of contents with chapter links, navigation links between chapters, and images copied to a separate folder with working links in the document). Even once you clear the hurdle of format conversion, reading and highlighting are not very comfortable, especially on mobile phones. Obsidian not remembering where I left off is just one of the hassles that make reading books in Obsidian challenging.
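To illustrate the kind of output I have in mind, here is a minimal sketch that splits a single Markdown book into per-chapter notes with a table of contents and previous/next links. It is not an ePub converter: it assumes the book is already one Markdown text with "# " chapter headings, it ignores images, and the function name splitIntoChapters and the file naming scheme are assumptions for illustration.

```javascript
// Hypothetical sketch: split one Markdown book into per-chapter notes with a
// table of contents and previous/next links. The file naming is an assumption.
function splitIntoChapters(bookTitle, markdown) {
  const chapters = [];
  for (const line of markdown.split(/\r?\n/)) {
    const heading = line.match(/^#\s+(.*)$/);
    if (heading) {
      chapters.push({ title: heading[1].trim(), body: [] });
    } else if (chapters.length > 0) {
      chapters[chapters.length - 1].body.push(line);
    }
    // Text before the first heading (front matter) is skipped in this sketch.
  }
  const nameOf = (i) =>
    `${bookTitle} - ${String(i + 1).padStart(2, "0")} ${chapters[i].title}`;
  const files = chapters.map((chapter, i) => {
    const nav = [];
    if (i > 0) nav.push(`[[${nameOf(i - 1)}|Previous]]`);
    nav.push(`[[${bookTitle}|Contents]]`);
    if (i < chapters.length - 1) nav.push(`[[${nameOf(i + 1)}|Next]]`);
    const body = chapter.body.join("\n").trim();
    return {
      path: nameOf(i) + ".md",
      content: `# ${chapter.title}\n\n${body}\n\n${nav.join(" | ")}\n`,
    };
  });
  // The contents note carries the book title and links to every chapter note.
  const toc = chapters.map((c, i) => `- [[${nameOf(i)}|${c.title}]]`).join("\n");
  files.unshift({ path: bookTitle + ".md", content: `# ${bookTitle}\n\n${toc}\n` });
  return files;
}

const parts = splitIntoChapters("Walden", "# Economy\nWhen I wrote...\n# Sounds\nBut while we...");
console.log(parts.map((p) => p.path));
```

Chapter titles containing characters that are not allowed in file names or links (such as : / | # ^ [ ]) would need sanitizing before the notes are written to the Vault.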

Now that I know Obsidian can handle this volume of information, so that in theory I could integrate my library with my notes, I will focus on creating the scripts, and maybe a plugin, to allow for a more friction-free reading and note-taking experience.

Comparing to Roam

While the following articles won't offer a direct comparison of a similar scenario, I spent many weeks trying to load books into Roam. Here are my posts dealing with the topic:

Scripts used

For reference, I am sharing the two scripts I used in the performance testing process.

Generating literature notes with block references

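// Run in Obsidian's developer console. For each book, tag random paragraphs
// with a block ID and create a literature note that embeds those blocks.
(async () => {
  const files = app.vault.getMarkdownFiles();
  const stepsize = 15; // tagged lines are 1..stepsize lines apart
  let refs = 0;
  let blocks = 0;
  for (const f of files) {
    if (f.path === "index.md") continue;
    let notes = "";
    const text = await app.vault.read(f);
    // Keep the file's own line endings (the original script assumed CRLF).
    const eol = text.includes("\r\n") ? "\r\n" : "\n";
    const lines = text.split(eol);
    let i = Math.floor(Math.random() * stepsize);
    while (i < lines.length) {
      if (lines[i].length > 10) {
        refs++;
        const blockId = "^" + Math.floor(Math.random() * Date.now()).toString(36);
        lines[i] = lines[i] + " " + blockId;
        notes += "> ![[" + f.basename + "#" + blockId + "]]\n" + i + ". Morbi lobortis augue egestas arcu porttitor, in cursus felis posuere. Nulla finibus vestibulum arcu, id molestie urna fringilla at. Fusce sit amet velit a est tincidunt iaculis sit amet vehicula dui. Aliquam elementum ex eget accumsan pulvinar. In in sollicitudin ex. Nam ut est condimentum, efficitur augue in, cursus augue. Sed faucibus mi non tempor egestas. Proin et nibh dignissim sapien feugiat porta a quis enim. Donec id leo ultrices, molestie dui ut, elementum ligula. Maecenas id suscipit tellus, et luctus libero.\n\n";
        blocks += 2; // the embed and the note paragraph in the literature note
      }
      // Advance by at least one line so no paragraph receives two block IDs.
      i += 1 + Math.floor(Math.random() * stepsize);
    }
    // Strip only the trailing ".md" when naming the literature note.
    await app.vault.create(f.path.replace(/\.md$/, "") + " - litnote.md", notes);
    await app.vault.modify(f, lines.join(eol));
    blocks += lines.filter((l) => l !== "").length;
  }
  // index.md is skipped, so it has no literature note (assumes index.md exists).
  console.log("Number of files", files.length * 2 - 1);
  console.log("Number of blocks", blocks);
  console.log("Number of block references", refs);
})();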

Basic Vault statistics

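// Run in Obsidian's developer console to count paragraphs, headings and block IDs.
(async () => {
  const files = app.vault.getMarkdownFiles();
  let paras = 0, headings = 0, blockRefs = 0;
  for (const f of files) {
    const text = await app.vault.read(f);
    const lines = text.split(/\r?\n/); // handles both CRLF and LF files
    paras += lines.filter((l) => l !== "").length;
    headings += lines.filter((l) => /^#+\s/.test(l)).length;
    // Counts lines that end in a block ID (" ^id"), i.e. the referenced blocks.
    blockRefs += lines.filter((l) => / \^\S+$/.test(l)).length;
  }
  console.log("number of non-empty paragraphs", paras);
  console.log("number of block-refs", blockRefs);
  console.log("number of headings", headings);
})();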

Comments

aleksandr, November 14, 2021 at 6:47 PM
sadly i cannot confirm the same when i tried loading in the bible kit into my obsidian 0.12.19 on a windows 10 with 16gb on a amd processor, the visual mapping is re animates from zero and takes 10+ minutes to finalize loading. some books dont even show up as index. but the .mf file is definitely there. further more this issues continues on a m1 16gb apple laptop. no idea what is causing such terrible slow downs. ideas?

MultiScience approach, November 25, 2021 at 1:20 PM
my vault is 936 MB (images and multimedia files included) and obsidian preforming great besides the " [[^^]] "it crashed my app last time .

i hope it is scalable like it should be because i will need it .
