Obsidian Performance Test - Take 1

Graph View of Performance Test Database

Imagine a database containing the full text of all the books and publications you have ever read, plus all your reading notes with links to the source paragraphs and your text highlights in the source. It also has everything you have ever written: your journal, all the articles, books, documents, and reports. Such a database would be a powerful source of knowledge.

With persistence, it is possible to build this in Obsidian. But can Obsidian handle this volume of information? This post is about my attempt to understand Obsidian's performance limits. I will present my approach and findings.

Obsidian's architecture at a glance

Obsidian is a web application hosted in a local web browser environment called Electron. It operates on a folder of files stored locally on your computer. Obsidian calls this folder the Vault. Information is stored in documents and attachments such as image files. Documents are formatted using markdown, a simple, platform-independent markup language. Obsidian provides features that facilitate the navigation and editing of these documents: search; linking to documents, to images, to chapters and to paragraphs (a.k.a. blocks); autocomplete for links; management of backlinks; automatic updates when you move or rename files; publishing; synchronization, etc.

As I was preparing for this test, my expectation was that Obsidian should be very scalable because it stores documents and attachments in the computer's file system. All Obsidian needs to maintain is an index database to aid full-text search and the maintenance/navigation of links.
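To make the idea of an index concrete, here is a minimal sketch of the two lookups such a tool could keep in memory: a word-to-files map for full-text search and a target-to-sources map for backlinks. This is an illustration under my own assumptions, not Obsidian's actual implementation; the function name `buildIndex` and the tokenization rule are hypothetical.

```javascript
// Hypothetical sketch: NOT Obsidian's real index, just the general idea.
// `docs` maps a file name (without .md) to its markdown text.
function buildIndex(docs) {
  const words = new Map();     // word -> Set of file names containing it
  const backlinks = new Map(); // link target -> Set of file names linking to it
  for (const [name, text] of Object.entries(docs)) {
    // Full-text part: lower-case alphanumeric tokens.
    for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) {
      if (!words.has(word)) words.set(word, new Set());
      words.get(word).add(name);
    }
    // Link part: [[Target]], [[Target#Heading]], [[Target#^block]], [[Target|alias]].
    for (const m of text.matchAll(/\[\[([^\]#|]+)[^\]]*\]\]/g)) {
      const target = m[1].trim();
      if (!backlinks.has(target)) backlinks.set(target, new Set());
      backlinks.get(target).add(name);
    }
  }
  return { words, backlinks };
}

const index = buildIndex({
  "Moby Dick": "Call me Ishmael.",
  "Moby Dick - litnote": "> ![[Moby Dick#^abc123]]\nMy note on Ishmael.",
});
console.log([...index.words.get("ishmael")]);        // files mentioning "ishmael"
console.log([...index.backlinks.get("Moby Dick")]);  // files linking to "Moby Dick"
```

The point of the sketch is that both maps grow with the amount of text and the number of links, not with the size of attachments, which is why I expected the file-based design to scale.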

My test database

I loaded 2459 full-text books into my test Vault and generated 565 full-book literature notes containing 161 820 block references. The total size of my test Vault is 727 MB or 8 074 688 non-empty paragraphs (a.k.a. blocks). I did not load any images. The largest document I loaded is 14.1 MB (2 662 616 words).
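A quick back-of-the-envelope calculation puts these numbers in proportion. The snippet only uses the figures quoted above; the one assumption is that MB means 1024 × 1024 bytes.

```javascript
// Figures quoted in the post.
const litNotes = 565;
const blockRefs = 161820;
const paragraphs = 8074688;
const vaultMB = 727;

// Average number of block references per literature note.
const refsPerNote = blockRefs / litNotes;

// Average paragraph size. Assumption: 1 MB = 1024 * 1024 bytes.
const bytesPerParagraph = (vaultMB * 1024 * 1024) / paragraphs;

console.log("block references per literature note:", refsPerNote.toFixed(1));
console.log("bytes per non-empty paragraph:", bytesPerParagraph.toFixed(1));
```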

I used plaintext books from the Gutenberg Library. For these, I auto-generated 18 645 chapter headings (e.g. # Chapter 1). Besides the files from the Gutenberg Library, I also loaded Joschua's Bible Study in Obsidian Kit.
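Chapter headings of this kind can be generated with a simple line-by-line pass. The sketch below is one possible heuristic, not the exact script I ran: the helper name `addChapterHeadings` and the rule "a line starting with CHAPTER or Chapter followed by a number or Roman numeral" are assumptions, and books that mark chapters differently would need another pattern.

```javascript
// Hypothetical helper: promote plain-text chapter lines to markdown headings.
// Assumption: chapter lines start with "CHAPTER" or "Chapter" followed by
// digits or an upper-case Roman numeral. Other layouts are left untouched.
function addChapterHeadings(text) {
  const chapterLine = /^\s*(CHAPTER|Chapter)\s+(\d+|[IVXLCDM]+)\b/;
  let count = 0;
  const lines = text.split(/\r?\n/).map((line) => {
    if (!chapterLine.test(line)) return line;
    count++;
    return "# " + line.trim();
  });
  // Note: lines are re-joined with "\n", so CRLF files become LF files.
  return { text: lines.join("\n"), count };
}

const sample = "CHAPTER I.\r\nIt was a dark night.\r\nChapter 2\r\nMorning came.";
const result = addChapterHeadings(sample);
console.log(result.count);
console.log(result.text);
```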

Some may argue that they read more than 2500 books and create detailed notes for more than 550 of those. Personally, I'd be thrilled to have detailed personal notes of 50 books! I feel this volume of written information is a realistic target for a Personal Knowledge Management system.

I did not include images and multimedia files in my test, because even though these take up significant space on the local filesystem, I believe that Obsidian only indexes their filenames. I don't expect images and other attachments to have a major contribution to overall Obsidian performance.

Findings

Obsidian performed well. The editing experience, such as highlighting paragraphs, worked smoothly even with the largest document. The search was slow at first, but maybe Obsidian was still indexing in the background. By the time I tried to capture the slow search on screen, it was already performing well.

Referencing chapters worked well

Even with this volume of chapters, search for chapters felt very smooth. Here's an example of looking up a chapter/sub-chapter by typing [[## followed by part of the chapter's heading text:

Referencing blocks had some limitations

The same discovery feature, however, stopped working for block references. In my main Vault, I can reference paragraphs by typing [[^^ followed by some text from the paragraph. In my performance test Vault, this did not work. If, however, I know the title of the document from which I want to reference a specific paragraph, then Document-Title#^ followed by some text from the paragraph I want to reference worked smoothly.

If I am unsure which document contains the paragraph I want, I can use search to find it first. In the example below, I open Obsidian search with a hotkey and type the beginning of the paragraph I am searching for. Once found, I drag the link to the document containing the paragraph into my notes, then use the [[Document-Title#^ format to search for the specific paragraph (again) and create a block reference to it. Admittedly, this is an extra step, and Roam's solution of CTRL+drag to create a block reference is much nicer. But the solution works, and the use case is not frequent enough to make this a deal-breaker.
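For readers who have not used block references, the sketch below emulates what the [[Document-Title#^ lookup amounts to: find a paragraph by a text fragment, make sure it ends with a " ^id" marker, and build the link. It is only an illustration of the idea, not Obsidian's code; the function name `blockLink` and the six-character ID format are my assumptions.

```javascript
// Sketch only: emulates the idea behind [[Document-Title#^query.
// Returns a block-reference link for the first line containing `query`,
// reusing an existing " ^id" suffix or appending a new random one.
function blockLink(title, text, query) {
  const lines = text.split(/\r?\n/);
  const needle = query.toLowerCase();
  const i = lines.findIndex((l) => l.toLowerCase().includes(needle));
  if (i === -1) return null; // no paragraph matches the query

  const existing = lines[i].match(/ \^([A-Za-z0-9-]+)$/);
  let id;
  if (existing) {
    id = existing[1];
  } else {
    // Assumed ID format: six random characters from [a-z0-9].
    const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
    id = Array.from({ length: 6 }, () => alphabet[Math.floor(Math.random() * 36)]).join("");
    lines[i] += " ^" + id;
  }
  return { link: `[[${title}#^${id}]]`, text: lines.join("\n") };
}

const doc = "Call me Ishmael. ^abc123\nSome years ago, never mind how long.";
console.log(blockLink("Moby Dick", doc, "ishmael").link);
console.log(blockLink("Moby Dick", doc, "years ago").link);
```

The first call reuses the existing marker, while the second has to modify the source document, which is the same side effect the literature-note script further down relies on.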

Editing a 14MB large document felt fluid

The title of the document is "gn06v10", which is The Entire PG Works of George Meredith. Here's what I do in the demo below. 

  1. I open the search to look for this document. Note that the search would probably have been faster had I searched for file:gn06v10 instead of doing a full-text search for "gn06v10". Search locates the document in a few seconds.
  2. I open the document and add some random text highlights. Highlights take 1-2 seconds to take effect.
  3. Finally, I scroll to the middle of the document (scrolling is quick) and add a line of text. Here, Obsidian freezes for more than a few seconds, then adds the sentence I typed from the keyboard buffer. Remember, this is a 14 MB text file. I am not aware of any other text editor that would perform significantly better managing a document of this size.

Hitting the wrong button occasionally resulted in long waits... sometimes followed by a black screen

Truth be told, these incidents happened right after I loaded the large volume of documents into Obsidian. When I tried to reproduce the issue to create a screen capture, I wasn't able to. I attribute the issue to Obsidian still indexing in the background. That said, it is not elegant that starting a search resulted in Obsidian freezing, eventually halting, and leaving a blank black screen behind. After terminating and restarting Obsidian, the same search worked with no issues.

Conclusion

Obsidian passed the first performance test. Apart from the issue with block discovery and the occasional black screen, it handled the large volume of text well.

Loading and reading books in Obsidian, however, is not yet solved very well. Parallel to this performance test, I have started to read my first ebook fully within Obsidian. Unfortunately, there don't seem to be good tools available to convert ePub files into Markdown (separate files per chapter, proper table of contents with chapter links, navigation links between chapters, and images copied to a separate folder with working links in the document). Even once you are over the hurdle of format conversion, reading and highlighting are not very comfortable either, especially on mobile phones. Obsidian not remembering where I left off reading is just one hassle making reading books in Obsidian challenging.
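To illustrate the output I have in mind for such a converter, here is a rough sketch of the file layout only. It does not parse ePub files: it assumes the chapters have already been extracted into an array, and the function name `buildBookFiles`, the file-naming scheme, and the navigation labels are all hypothetical. Image handling is left out.

```javascript
// Hypothetical sketch of the target layout, not an ePub parser.
// `chapters` is assumed to be already extracted: [{ title, body }, ...].
// Assumption: titles contain no characters that are invalid in file names.
function buildBookFiles(bookTitle, chapters) {
  // File name (without .md) of chapter i, e.g. "Emma - 01 One".
  const name = (i) =>
    `${bookTitle} - ${String(i + 1).padStart(2, "0")} ${chapters[i].title}`;

  const files = {};
  chapters.forEach((ch, i) => {
    // Navigation links between chapters, plus a link back to the contents page.
    const nav = [];
    if (i > 0) nav.push(`[[${name(i - 1)}|< Previous]]`);
    nav.push(`[[${bookTitle}|Contents]]`);
    if (i < chapters.length - 1) nav.push(`[[${name(i + 1)}|Next >]]`);
    files[`${name(i)}.md`] = `# ${ch.title}\n\n${ch.body}\n\n${nav.join(" | ")}\n`;
  });

  // Table of contents with one link per chapter.
  const toc = chapters.map((_, i) => `- [[${name(i)}]]`).join("\n");
  files[`${bookTitle}.md`] = `# ${bookTitle}\n\n${toc}\n`;
  return files;
}

const demo = buildBookFiles("Emma", [
  { title: "One", body: "First chapter text." },
  { title: "Two", body: "Second chapter text." },
]);
console.log(Object.keys(demo));
```

The returned object maps file paths to file contents, so inside Obsidian each entry could be written with app.vault.create, the same call the literature-note script below uses.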

Now that I know Obsidian can handle this volume of information, so that in theory I could integrate my library with my notes, I will focus on creating the scripts, and maybe a plugin, to allow for a more friction-free reading and note-taking experience.

Comparing to Roam

While the following articles won't offer a direct comparison of a similar scenario, I spent many weeks trying to load books into Roam. Here are my posts dealing with the topic:

Scripts used

For reference, I am sharing the two scripts I used in the performance testing process.

Generating literature notes with block references

// Run in Obsidian's developer console; relies on the global `app` object.
(async () => {
  const files = app.vault.getMarkdownFiles();
  const stepsize = 15; // a reference is created roughly every 1..stepsize lines
  let refs = 0;
  let blocks = 0;
  for (const f of files) {
    if (f.path === "index.md") continue;
    let notes = "";
    const text = await app.vault.read(f);
    const lines = text.split("\r\n"); // assumes the source files use CRLF line endings
    let i = Math.floor(Math.random() * stepsize);
    while (i < lines.length) {
      if (lines[i].length > 10) {
        refs++;
        // Append a random block ID to the source line and embed it in the literature note.
        const blockId = "^" + Math.floor(Math.random() * Date.now()).toString(36);
        lines[i] = lines[i] + " " + blockId;
        notes += "> ![["+f.basename+"#"+blockId+"]]\n"+i+". Morbi lobortis augue egestas arcu porttitor, in cursus felis posuere. Nulla finibus vestibulum arcu, id molestie urna fringilla at. Fusce sit amet velit a est tincidunt iaculis sit amet vehicula dui. Aliquam elementum ex eget accumsan pulvinar. In in sollicitudin ex. Nam ut est condimentum, efficitur augue in, cursus augue. Sed faucibus mi non tempor egestas. Proin et nibh dignissim sapien feugiat porta a quis enim. Donec id leo ultrices, molestie dui ut, elementum ligula. Maecenas id suscipit tellus, et luctus libero.\n\n";
        blocks += 2; // the embedded quote and the note paragraph
      }
      // Always advance by at least one line, so no line receives two block IDs.
      i += 1 + Math.floor(Math.random() * stepsize);
    }
    await app.vault.create(f.path.split(".md")[0] + " - litnote.md", notes);
    await app.vault.modify(f, lines.join("\r\n"));
    blocks += lines.filter((l) => l !== "").length;
  }
  // Every file except index.md gets one literature note.
  console.log("Number of files", files.length * 2 - 1);
  console.log("Number of blocks", blocks);
  console.log("Number of block references", refs);
})();

Basic Vault statistics

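// Run in Obsidian's developer console; relies on the global `app` object.
(async () => {
  const files = app.vault.getMarkdownFiles();
  let paras = 0, headings = 0, block_refs = 0;
  for (const f of files) {
    const text = await app.vault.read(f);
    // Split on CRLF or LF, so the LF-only generated literature notes are counted line by line too.
    const lines = text.split(/\r?\n/);
    paras += lines.filter((l) => l !== "").length;
    headings += lines.filter((l) => /^#+\s/.test(l)).length;        // markdown headings
    block_refs += lines.filter((l) => / \^[^\s]+$/.test(l)).length; // lines ending in a block ID
  }
  console.log("number of non-empty paragraphs", paras);
  console.log("number of block-refs", block_refs);
  console.log("number of headings", headings);
})();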

Comments

  1. Sadly, I cannot confirm the same. When I tried loading the Bible kit into my Obsidian 0.12.19 on Windows 10 with 16 GB on an AMD processor, the visual mapping re-animates from zero and takes 10+ minutes to finalize loading. Some books don't even show up as index, but the .mf file is definitely there. Furthermore, this issue continues on an M1 16 GB Apple laptop. No idea what is causing such terrible slowdowns. Ideas?

    1. This sounds more like a plugin running out of control. Have you tried turning off all your plugins and loading Obsidian? Is it still slow? If that solves the issue, then you can turn on plugins one by one to find out which one is causing the problem.

      I currently have 5500 documents and 1200 folders. Obsidian runs smoothly.

  2. My vault is 936 MB (images and multimedia files included) and Obsidian is performing great, besides the "[[^^]]", which crashed my app last time.

    I hope it is scalable like it should be, because I will need it.


contact: info@zsolt.blog