How does a filesystem on a FLASH storage device work?

306 views

I was thinking about how to design a filesystem from scratch.

Let’s say every cell in our FLASH chip has 100’000 write cycles before it dies and let’s imagine a simple flat filesystem.

I imagine the filesystem splits the data of the file across multiple chunks and creates a linked list of these chunks where every chunk holds a pointer to the next chunk. It would be a copy-on-write filesystem. This would theoretically slow the wear down on the cells? Now I’m not really sure how to achieve efficient wear.

But I really don’t know how the filesystem table should work because changing, creating or deleting files would be very easy to do and wouldn’t do too much damage to the FLASH cells. But I imagine you’d need to update the filesystem table with each of this operation.

My question is how do you get around this? Do you just rely on the fact that when the cells in the filesystem table fail the whole filesystem fails? I guess you could store somehow multiple copies of the filesystem table and compare them against each other at every write operation? And if one of the filesystem table copies failed you just create a new one?

And so you would also need to store the addresses of bad sectors somewhere to not attempt to write to them. But you would also need multiple copies of that?

In: 4

4 Answers

Anonymous 0 Comments

Modern operating systems are pretty smart, you don’t need to necessarily get rid of the whole drive once sectors start to go bad because your OS will isolate them from read/write usage. But it is prudent because they are also good at spreading reads and writes pretty evenly, so it’s a good indication your drive overall is failing for whatever reason.

The last thing you mentioned is actually a thing, it’s called [RAID storage (redundant array of inexpensive disks)](https://en.m.wikipedia.org/wiki/RAID), specifically raid 1 where you are storing multiple copies of files, but that’s for the contents not the iNodes (the records that point to the actual contents and have meta data)

You are correct though about the linked list like implementation. A disk partition has a segment size and when a file exceeds that, it replaces the tail end of its data with a pointer to the next chunk. The iNode points to the first one, then each chunk points to the next one, or null.

A corrupt filesystem table is a big problem though, that’s usually the first few sectors of your disk allocation. There are ways to recover, but usually that involves parsing the entire disk and piecing it back together based on the other information that remains.

As for implementing it, you should look at how Linux does it or just Google the standards like NTFS or ExFAT or ext4, they are open standard and Linux has the implementation fully open source as well.

ELI5 isn’t a great place for asking how to implement it in specifics cause well… you can’t really do that in a short, concise manner. Filesystems are stupidly complex as you seem to know from the premise lol

You are viewing 1 out of 4 answers, click here to view all answers.