Imagine a database containing the full text of all the books and publications you have ever read, plus all your reading notes with links to the source paragraphs and your text highlights in the source. It also has everything you have ever written: your journal, all the articles, books, documents, and reports. Such a database would be a powerful source of knowledge.
With persistence, it is possible to build this in Obsidian. But can Obsidian handle this volume of information? This post is about my attempt to understand Obsidian's performance limits. I will present my approach and findings.
Obsidian's architecture at a glance
Obsidian is a web application hosted in a local web browser environment called Electron. It operates on a folder of files stored locally on your computer. Obsidian calls this folder the Vault. Information is stored in documents and attachments such as image files. Documents are formatted using markdown, a simple, platform-independent markup language. Obsidian provides features that facilitate the navigation and editing of these documents: search; linking to documents, to images, to chapters and to paragraphs (a.k.a. blocks); autocomplete for links; management of backlinks; automatic updates when you move or rename files; publishing; synchronization, etc.
As I was preparing for this test, my expectation was that Obsidian should be very scalable because it stores documents and attachments in the computer's file system. All Obsidian needs to maintain is an index database to aid full-text search and the maintenance/navigation of links.
My test database
I loaded 2459 full-text books into my test Vault and generated 565 full-book literature notes containing 161 820 block references. The total size of my test Vault is 727 MB, or 8 074 688 non-empty paragraphs (a.k.a. blocks). I did not load any images. The largest document I loaded is 14.1 MB (2 662 616 words) long.
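A little arithmetic puts these figures in perspective. The sketch below only divides the numbers quoted above; it assumes 1 MB means 1 000 000 bytes, which is my assumption and not something the measurements specify.

```javascript
// Figures quoted in the paragraph above.
const literatureNotes = 565;
const blockReferences = 161820;
const paragraphs = 8074688;
// Assumption: 1 MB = 10^6 bytes (the unit is not specified in the post).
const vaultBytes = 727 * 1000 * 1000;

const refsPerNote = blockReferences / literatureNotes;
const bytesPerParagraph = vaultBytes / paragraphs;

console.log(Math.round(refsPerNote));       // 286
console.log(Math.round(bytesPerParagraph)); // 90
```

In other words, each generated literature note holds roughly 286 block references, and an average paragraph is about 90 bytes long.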
I used plaintext books from the Gutenberg Library. For these, I auto-generated 18 645 chapter headings (e.g. # Chapter 1). Besides the files from the Gutenberg Library, I also loaded Joschua's Bible Study in Obsidian Kit.
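The heading-generation step is not among the scripts listed at the end of this post, so the following is only a minimal sketch of how such headings could be produced. It assumes that a chapter starts on a line beginning with the word "Chapter" (in any letter case), which is a guess about the plaintext layout, and addChapterHeadings is a hypothetical helper name.

```javascript
// Hypothetical helper: prefix chapter title lines with "# " so that
// Obsidian treats them as headings. The pattern is an assumption about
// how chapters are marked in the plaintext source.
function addChapterHeadings(text) {
  const chapterLine = /^chapter\s+\S+/i;
  return text
    .split("\r\n") // same line-ending convention as the scripts in this post
    .map((line) => (chapterLine.test(line) ? "# " + line : line))
    .join("\r\n");
}

const sample =
  "CHAPTER I\r\nIt was a dark night.\r\n\r\nChapter 2\r\nMorning came.";
console.log(addChapterHeadings(sample));
```

Lines that merely begin with a longer word such as "chapters" are left alone, because the pattern requires whitespace right after "chapter".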
Some may argue that they read more than ~2500 books and create detailed notes for more than ~550 of those. Personally, I'd be thrilled to have detailed personal notes of 50 books! I feel this volume of written information is a realistic target for a Personal Knowledge Management system.
I did not include images and multimedia files in my test, because even though these take up significant space on the local filesystem, I believe that Obsidian only indexes their filenames. I don't expect images and other attachments to have a major contribution to overall Obsidian performance.
Findings
Obsidian performed well. The editing experience, such as highlighting paragraphs, worked smoothly even with the largest document. Search was slow at first, possibly because Obsidian was still indexing in the background. By the time I tried to capture the slow search on screen, it was already performing well.
Referencing chapters worked well
Even with this volume of chapters, searching for chapters felt very smooth. Here's an example of looking up a chapter/sub-chapter by typing [[## followed by part of the chapter's heading text:
Referencing blocks had some limitations
The same discovery feature, however, stopped working for block references. In my main Vault, I can reference paragraphs by typing [[^^ followed by some text from the paragraph. In my performance test Vault, this did not work. If, however, I know the title of the document that contains the paragraph I want to reference, then typing [[Document-Title#^ followed by some text from that paragraph worked smoothly.
If I am unsure which document contains the paragraph I want, I can use search to find it first. In the example below, I open Obsidian search with a hotkey and type the beginning of the paragraph I am searching for. Once found, I drag the link to the document containing the paragraph into my notes, and use the [[Document-Title#^ format to search for the specific paragraph (again) and create a block reference to it. Admittedly, there is an extra step; Roam's solution of CTRL+drag to create a block reference is much nicer. But the solution works, and the use case is not frequent enough to make this a deal-breaker.
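The link format used in this workaround can be sketched in code. This is an illustration, not Obsidian's own implementation: findBlockRef is a hypothetical helper that looks for a paragraph starting with the given text in a known document and returns a link in the [[Document-Title#^block-id]] format. It assumes the paragraph already ends with a block ID such as " ^k3f9a", which is the form written by the script under "Scripts used".

```javascript
// Hypothetical helper, not part of Obsidian's API.
// Finds the first paragraph that starts with `snippet` and, if it ends with
// a block ID (" ^id"), returns a link in the [[Document-Title#^id]] format.
function findBlockRef(docTitle, text, snippet) {
  for (const line of text.split(/\r?\n/)) {
    if (!line.startsWith(snippet)) continue;
    const match = line.match(/ \^(\S+)$/);
    return match ? `[[${docTitle}#^${match[1]}]]` : null;
  }
  return null; // no paragraph starts with the snippet
}

const doc = "First paragraph. ^k3f9a\nSecond paragraph without an ID.";
console.log(findBlockRef("gn06v10", doc, "First")); // [[gn06v10#^k3f9a]]
```

The helper returns null when the paragraph is not found or has no block ID; inside Obsidian, the editor creates the ID for you when you pick a paragraph from the suggestion list.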
Editing a 14MB large document felt fluid
The title of the document is "gn06v10", which is The Entire PG Works of George Meredith. Here's what I do in the demo below.
- I open the search to look for this document. Note that the search would probably have been faster if, instead of doing a full-text search for "gn06v10", I had searched for file:gn06v10. Search locates the document in a few seconds.
- I open the document and add some random text highlights. Highlights take 1-2 seconds to take effect.
- Finally, I scroll to the middle of the document (scrolling is quick) and add a line of text. Here, Obsidian freezes for more than a few seconds, then adds the sentence I typed from the keyboard buffer. Remember, this is a 14 MB text file. I am not aware of any other text editor that would perform significantly better managing a document of this size.
Hitting the wrong button occasionally resulted in long waits... sometimes followed by a black screen
Truth be told, these incidents happened right after I loaded the large volume of documents into Obsidian. When I tried to reproduce the issue to create a screen capture, I wasn't able to. I attribute the problem to Obsidian still indexing in the background. That said, it is not elegant that starting a search resulted in Obsidian freezing, eventually halting, and leaving a blank black screen behind. After terminating and restarting Obsidian, the same search worked with no issues.
Conclusion
Comparing to Roam
While the following articles won't offer a direct comparison of a similar scenario, I spent many weeks trying to load books into Roam. Here are my posts dealing with the topic:
- Importing the Bible to Roam - Final Solution
- Study Bible or ePub Books in Roam? My Roller Coaster Ride with Roam JSON
- My Adventures with Roam.JSON
- Read Books in Roam - A Detailed How To Guide for Importing and Using ePub in Roam
Scripts used
Generating literature notes with block references
// For every book, tag random paragraphs with a block ID and create a
// matching "<book> - litnote.md" file that embeds each tagged block.
files = app.vault.getMarkdownFiles();
stepsize = 15; // each step advances by a random 0-14 lines
refs = 0;
blocks = 0;
for (f of files) {
  notes = "";
  if (f.path != "index.md") {
    text = await app.vault.read(f);
    lines = text.split("\r\n");
    i = Math.floor(Math.random() * stepsize);
    while (i < lines.length) {
      // Only tag lines longer than 10 characters
      if (lines[i].length > 10) {
        refs++;
        blockId = "^" + Math.floor(Math.random() * Date.now()).toString(36);
        lines[i] = lines[i] + " " + blockId;
        // Embed the block in the literature note, followed by filler text
        notes += "> ![[" + f.basename + "#" + blockId + "]]\n" + i + ". Morbi lobortis augue egestas arcu porttitor, in cursus felis posuere. Nulla finibus vestibulum arcu, id molestie urna fringilla at. Fusce sit amet velit a est tincidunt iaculis sit amet vehicula dui. Aliquam elementum ex eget accumsan pulvinar. In in sollicitudin ex. Nam ut est condimentum, efficitur augue in, cursus augue. Sed faucibus mi non tempor egestas. Proin et nibh dignissim sapien feugiat porta a quis enim. Donec id leo ultrices, molestie dui ut, elementum ligula. Maecenas id suscipit tellus, et luctus libero.\n\n";
        blocks += 2;
      }
      i += Math.floor(Math.random() * stepsize);
    }
    // Write the literature note and save the book with its new block IDs
    await app.vault.create(f.path.split(".md")[0] + " - litnote.md", notes);
    await app.vault.modify(f, lines.join("\r\n"));
    blocks += lines.filter((l) => l != "").length;
  }
}
console.log("Number of files", files.length * 2 - 1);
console.log("Number of blocks", blocks);
console.log("Number of block references", refs);
Basic Vault statistics
// Count non-empty paragraphs, headings, and block IDs across the Vault.
files = app.vault.getMarkdownFiles();
paras = 0;
headings = 0;
block_refs = 0;
for (f of files) {
  text = await app.vault.read(f);
  lines = text.split("\r\n");
  paras += lines.filter((l) => l != "").length;                  // non-empty lines
  headings += lines.filter((l) => l.match(/^#+\s/)).length;      // lines starting with "#"s and a space
  block_refs += lines.filter((l) => l.match(/ \^[^\s]+$/)).length; // lines ending in " ^id"
}
console.log("number of non-empty paragraphs", paras);
console.log("number of block-refs", block_refs);
console.log("number of headings", headings);
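To sanity-check the counting rules without a Vault, the same three filters can be applied to a plain string. This is a sketch for illustration only: countStats is a hypothetical name, and the sample text is made up.

```javascript
// Hypothetical stand-alone version of the counting rules used above:
// non-empty lines, lines starting with "#"s plus whitespace, and lines
// ending in " ^<id>".
function countStats(text) {
  const lines = text.split("\r\n");
  return {
    paras: lines.filter((l) => l != "").length,
    headings: lines.filter((l) => l.match(/^#+\s/)).length,
    blockRefs: lines.filter((l) => l.match(/ \^[^\s]+$/)).length,
  };
}

const sample = "# Chapter 1\r\n\r\nSome text. ^abc123\r\nMore text.";
console.log(countStats(sample)); // paras: 3, headings: 1, blockRefs: 1
```

Note that a heading line also counts as a non-empty paragraph, so the paragraph total includes the headings.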
Sadly, I cannot confirm the same. When I tried loading the Bible kit into my Obsidian 0.12.19 on Windows 10 with 16 GB on an AMD processor, the visual mapping re-animates from zero and takes 10+ minutes to finish loading. Some books don't even show up as index, but the .mf file is definitely there. Furthermore, this issue continues on an M1 16 GB Apple laptop. No idea what is causing such terrible slowdowns. Ideas?
This sounds more like a plugin running out of control. Have you tried turning off all your plugins and loading Obsidian? Is it still slow? If that solves the issue, then you can turn on plugins one by one to find out which one is causing the problem.
I currently have 5500 documents and 1200 folders. Obsidian runs smoothly.
My vault is 936 MB (images and multimedia files included) and Obsidian is performing great, besides the "[[^^]]", which crashed my app last time.
I hope it is scalable like it should be, because I will need it.