Why don’t we write a database file system? Isn’t a file system practically a database already? Isn’t layering an OS between the data and the database application slowing things down?



Anonymous 0 Comments

Some operating systems have tried to do this, most notably BeOS with its Be file system. [Haiku](https://www.haiku-os.org/), an OS inspired by BeOS, inherited this, so you can still play around with it if you like. The query language isn’t SQL, but there *is* a query language that you can play around with.

That said, I am not aware of any database that has tried to work directly on top of BFS. I’m not sure that even BFS provides all the infrastructure necessary to implement an ACID-compliant database from the filesystem alone.

Anonymous 0 Comments

The short answer is that they have different use cases. A journaling file system (JFS) doesn’t care about most of the ACID properties of a database. It doesn’t need to optimize for queries across the data or enforce locking to the degree that a database does. JFSs also do not guarantee the integrity of file contents because they do not need to; they only care about file metadata.

Anonymous 0 Comments

There are storage systems that are based on files or “objects”. These are very popular in cloud computing because they scale really well across a network. AWS S3 is an example of this, as is Azure Blob Storage. There are other technologies called “key-value” stores that take a similar approach (AWS DynamoDB), where a key is an identifier and the values are the things you want to store. The key can be a unique identifier or can even be something that resembles a file path. Anyway, things get messy when files are not on a single computer and someone wants to update a file. Different data stores have different strategies for accomplishing this, with different degrees of success. Other considerations are finding the data you are looking for, storing the data efficiently, and cost/performance: whether the data is stored on a running computer or is sitting there waiting for a computer to read it (shared vs. shared-nothing).
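To make the “key that resembles a file path” idea concrete, here is a toy in-memory sketch. The `ObjectStore` class and its method names are made up for illustration; this is not the S3 or DynamoDB API, and it ignores the distributed-update problem entirely.

```python
class ObjectStore:
    """Hypothetical flat key-value store; keys only *look* like paths."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # There are no real directories: the whole string is the key.
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list_prefix(self, prefix):
        # "Listing a folder" is just filtering keys by a string prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = ObjectStore()
store.put("photos/2023/cat.jpg", b"...")
store.put("photos/2024/dog.jpg", b"...")
store.put("docs/readme.txt", b"hello")
print(store.list_prefix("photos/"))
```

The slashes carry no meaning to the store itself, which is part of why this model spreads across many machines more easily than a hierarchical file system.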

Anonymous 0 Comments

The master boot record is kinda what you are talking about. As for why there is an OS in between: it’s easier to navigate and takes less technical understanding. People still paste pictures into Word documents to send them.

Anonymous 0 Comments

It has been tried and found to be way more trouble than it is worth.

If you think about the use cases for file systems and databases, they are very different. For 99.99999% of all the file “opens” performed by a computer, the location of the file is already known; a search is done only when a user needs to find a file.

Given this simple fact, using a bolt-on DB for user files, like Microsoft has done, seems to be a good balance.

One former directory structure that is now a simple DB is LDAP. A lot of people do not remember where they were in the directory, so they end up searching anyway. Single-key DBs like NoSQL stores are awesome for this case.

Anonymous 0 Comments

This was/is a thing with Oracle actually. You can use raw disk, without a filesystem for storing tablespaces, but most DBAs opt not to do it because it makes backups at a filesystem level a huge PITA.

And the filesystem doesn’t add that much overhead in terms of time once the file is opened. In the UNIX world you are either calling the open() syscall against the file and using read/write/seek to navigate and update its contents, or, if you use raw disk, you are opening a /dev/diskn file using the same open() syscall and using the same read/write/seek syscalls for queries and updates.
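A minimal sketch of that open/seek/read/write pattern, using Python’s thin wrappers around the same syscalls. The fixed `RECORD_SIZE` layout and the helper names are assumptions for illustration, not how Oracle lays out a tablespace, and a temporary file stands in for the data file.

```python
import os
import tempfile

RECORD_SIZE = 16  # hypothetical fixed record size in bytes


def write_record(fd, index, text):
    """Seek to record `index` and overwrite it with space-padded text."""
    os.lseek(fd, index * RECORD_SIZE, os.SEEK_SET)
    os.write(fd, text.encode().ljust(RECORD_SIZE))


def read_record(fd, index):
    """Seek to record `index` and read it back without the padding."""
    os.lseek(fd, index * RECORD_SIZE, os.SEEK_SET)
    return os.read(fd, RECORD_SIZE).decode().rstrip()


def demo():
    handle, path = tempfile.mkstemp()  # scratch file standing in for a data file
    os.close(handle)
    fd = os.open(path, os.O_RDWR)      # the open() call described above
    try:
        for i in range(3):
            write_record(fd, i, f"record-{i}")
        write_record(fd, 1, "updated")  # in-place update of the middle record
        return [read_record(fd, i) for i in range(3)]
    finally:
        os.close(fd)
        os.remove(path)


print(demo())
```

The calls would look the same if `path` named a raw device node, though that normally requires elevated privileges and block-aligned I/O, so it is not attempted here.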

Anonymous 0 Comments

Computers that are specifically intended to be databases can use raw disk for the database storage and bypass OS filesystems and overhead. For multipurpose computers, this makes much less sense since you want to do things other than database “stuff”.

Most computers you will use are pretty multipurpose, so you’ll bear with the small amount of filesystem overhead. File systems are general purpose, so they suit the general purpose nature of most computing needs.

Anonymous 0 Comments

We do! You’re describing a NoSQL-type database.

But why aren’t *all* databases like that? Because they have different use-cases.

If I need to store a database that keeps track of some financial transactions, and each transaction is connected to an account, and an asset I purchased, and an employee who purchased it, and a department who requested it, and on and on, I have “related” data that all needs to be connected to each other. A file system doesn’t handle this well.

But if I have the text of 10,000 books and I want to search them all for the word “database,” I don’t need all of those connectors. Instead I want an index where I can just go to the entry I need and get a list.
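The “index where I can just go to the entry I need” can be sketched as an inverted index: a map from each word to the books that contain it. The book titles and texts below are invented sample data, and the tokenizer is deliberately crude.

```python
import re
from collections import defaultdict


def build_index(books):
    """Map each lowercase word to the set of titles containing it."""
    index = defaultdict(set)
    for title, text in books.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(title)
    return index


def search(index, word):
    # One dictionary lookup instead of rereading every book.
    return sorted(index.get(word.lower(), set()))


books = {
    "Book A": "A database stores structured data.",
    "Book B": "File systems store files.",
    "Book C": "Every database needs an index.",
}
index = build_index(books)
print(search(index, "database"))
```

Building the index costs one pass over all the text, but every later search is a single lookup, which is the trade-off this kind of store makes.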

Anonymous 0 Comments

A database does a lot of things beyond just storing data/files! And it’s interesting to think about!

Firstly, it has an *index*, or maybe even many *indices* for each table. That means you take some property that you care about, sort it, and store that sorted version of the data separately, with a reference back to the rest of the stored data. This is super important because when data is sorted, you can really quickly find any specific value.

Here’s an example. Find the 3rd smallest number:

```
853385090
379591565
306438809
663764153
482127012
137329860
203550656
114623436
972592136
394841008
709407303
546066368
214591581
717485067
560176143
```

Now let me sort those numbers:
```
114623436
137329860
203550656
214591581
306438809
379591565
394841008
482127012
546066368
560176143
663764153
709407303
717485067
853385090
972592136
```

Much easier now, right? Computers agree with you. Generally, if you have twice as many numbers, a computer needs only one more step to find a value in sorted data (it can halve the remaining range with each comparison), but it takes twice as long to find a value in unsorted data.
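That doubling claim can be checked with a small sketch that counts comparisons. The two helper functions are written here for illustration; a real database index typically uses a tree structure rather than a plain sorted list, but the halving idea is the same.

```python
def linear_steps(values, target):
    """Count comparisons when scanning unsorted data front to back."""
    steps = 0
    for value in values:
        steps += 1
        if value == target:
            break
    return steps


def binary_steps(sorted_values, target):
    """Count comparisons when halving the search range in sorted data."""
    lo, hi = 0, len(sorted_values) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            break
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for n in (1000, 2000):
    data = list(range(n))
    worst = n - 1  # the last element is the worst case for the linear scan
    print(n, linear_steps(data, worst), binary_steps(data, worst))
```

Doubling the list from 1000 to 2000 entries doubles the linear scan from 1000 to 2000 comparisons, while the binary search goes from 10 to 11.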

Database engines make use of tricks like this to make searching, filtering, and finding data easier. They have many tables with relationships between them, which allows complex queries (“find me all employees who are sales people who have made more than $8m of sales in the last 12 months, excluding February”). They have logic to do transactions: either everything in my transaction updates, or nothing does. And that’s just scratching the surface; there’s so much engineering effort that goes into building production-quality database systems.
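The “either everything updates, or nothing does” behaviour can be seen with Python’s built-in `sqlite3` module. The `accounts` table, the `make_db` and `transfer` helpers, and the sample balances are made up for this sketch.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_db()
print(transfer(conn, "alice", "bob", 500), balances(conn))
print(transfer(conn, "alice", "bob", 30), balances(conn))
```

In the failed transfer, the credit to `bob` runs first and succeeds, yet it is rolled back when the debit violates the constraint. A plain file system gives you no equivalent guarantee across two separate file writes.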

Anonymous 0 Comments

**Why don’t we write a database file system?**

This is kind of a hard one to pin down, because as you’ll see in the answer to your second question, we kind of do. But I suspect that when you say “database” you mean in the traditional RDBMS sense, which we do not. At least not today.

Microsoft actually developed a filesystem based on a relational database system. It was called WinFS, and it got a fair amount of hype (amongst developers, anyway) because it was planned to be part of Windows Vista. It died rather unceremoniously.

The reasons why are pretty nuanced. As someone who has been in the software development field for more than 20 years, I recognize a few patterns in WinFS’ failure to launch.

In a broad sense, software is a bit like an evolutionary ecosystem. Solutions that are “good enough” often thrive because they are lower cost or have the inertia of widespread adoption, and therefore widespread understanding. The low cost part is easy enough to understand — individuals and companies prefer to spend less — but the fact that WinFS was a brand new paradigm in filesystems had upsides and downsides. The most significant downside is that developers were completely unfamiliar with it, and end-users barely knew it existed.

So you had this situation where end-users just wanted features like good performance, search, and access controls. Developers wanted an API that allowed them to deliver these end-user requirements, but this was all table stakes. You couldn’t sell a new piece of software based solely on file-system features, because most of what WinFS promised was already being delivered in some capacity; just using different technologies.

Undoubtedly, WinFS was a cool technology, but it was so different from what had come before that it failed to pick up enough inertia before it ever truly entered the world.

**Isn’t a file system practically a database already?**

Yes, definitely. All modern file systems are databases containing file metadata and a pointer to the binary file data location in storage. They’re just not like the relational databases that we’re used to using.
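You can see the “metadata record” side of this by asking the OS for a file’s entry by path. This is only a sketch: the `metadata_record` helper and its dictionary keys are invented here, and which fields are meaningful varies by platform and file system.

```python
import os
import tempfile


def metadata_record(path):
    """Look up the file system's metadata for `path`, keyed by its name."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,         # the file system's internal identifier
        "size_bytes": info.st_size,
        "modified": info.st_mtime,    # seconds since the epoch
        "mode": oct(info.st_mode),    # file type and permission bits
    }


with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    path = handle.name

print(metadata_record(path))
os.remove(path)
```

The path acts like a primary key, and the lookup returns a fixed set of columns. What is missing, compared with a relational database, is any way to query across those columns or relate one record to another.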

**Isn’t layering an OS between the data and the database application slowing things down?**

This is really tough to answer. File systems are complicated enough that you can’t make blanket statements like “a relational database filesystem would be faster than current filesystem technologies”. Relational databases are very well understood, and they are *very* performant… But so are modern file systems.

There is one thing that we can say with 100% confidence though: if you add additional work, you add additional time. Every piece of software that lives in a performance-critical role adopts a strategy of work-avoidance. You cannot avoid the fact that adding the “relational” aspect of RDBMS to a file system would introduce additional work, at a minimum for tasks that rely on relationships. That means there is a lot of opportunity for things to actually be *slower*.

**Questions you didn’t ask.**

Looking at the file system and operating system landscape today, we see that modern operating systems deliver on many of the promises that RDBMS-based file systems made. Mac OS X Tiger had fast file system search starting in 2005. Not coincidentally, this is around the time that WinFS was cancelled; I don’t recall the exact year. Modern versions of macOS continue to deliver fast file system search without a relational database file system.

What we see applied here is the same thing we see in software development in general. Modern development eschews the idea of gigantic monolithic units of software, and instead prefers to break things up into smaller, more manageable components that we can interact with through APIs. That is how Apple’s Spotlight search works. It is its own service that relies on file system features, but it keeps its own index. Much of that process uses techniques that are common in database systems.

Basically, rather than trying to develop an elaborate file system that stores relationships and complex information about file system objects, developers let the file system concern itself with storing files and file metadata, then build services that query the file system and build indexes of their own, containing only the data relevant to delivering the feature the user wants.
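As a rough sketch of that pattern, here is a tiny “indexing service” that walks a directory tree and keeps its own lookup table, holding only what one feature (find files by extension) needs. It is not how Spotlight is implemented; the function name and the choice of indexing by extension are assumptions for illustration.

```python
import os
import tempfile
from collections import defaultdict


def build_extension_index(root):
    """Walk `root` once and map each file extension to matching paths."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            index[ext].append(os.path.join(dirpath, name))
    return index


# Demo on a throwaway directory tree with a few empty files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "photos"))
    for rel in ("notes.txt", "todo.TXT", os.path.join("photos", "cat.jpg")):
        open(os.path.join(root, rel), "w").close()

    index = build_extension_index(root)
    print(sorted(os.path.basename(p) for p in index[".txt"]))
```

The file system stays simple, and the index can be thrown away and rebuilt at any time. A real service would also need to watch for changes so the index does not go stale.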

This approach helps developers manage complexity in ways that allow them to iterate on their designs in smaller steps, avoiding long development cycles for products that are so complicated, you need months of QA just to ensure you don’t break the user experience.