How does a filesystem on a FLASH storage device work?

388 views

I was thinking about how to design a filesystem from scratch.

Let’s say every cell in our FLASH chip has 100’000 write cycles before it dies and let’s imagine a simple flat filesystem.

I imagine the filesystem splits the data of the file across multiple chunks and creates a linked list of these chunks where every chunk holds a pointer to the next chunk. It would be a copy-on-write filesystem. This would theoretically slow the wear down on the cells? Now I’m not really sure how to achieve efficient wear.

But I really don’t know how the filesystem table should work because changing, creating or deleting files would be very easy to do and wouldn’t do too much damage to the FLASH cells. But I imagine you’d need to update the filesystem table with each of this operation.

My question is how do you get around this? Do you just rely on the fact that when the cells in the filesystem table fail the whole filesystem fails? I guess you could store somehow multiple copies of the filesystem table and compare them against each other at every write operation? And if one of the filesystem table copies failed you just create a new one?

And so you would also need to store the addresses of bad sectors somewhere to not attempt to write to them. But you would also need multiple copies of that?

In: 4

4 Answers

Anonymous 0 Comments

Modern filesystems (not just Flash-specific, but also HDD-oriented: NTFS, ext4, etc) do not use a linked list to connect chunks of a file. They instead use a tree structure: there is a “superchunk”, that contains a list of all chunks. If a file is very big, it can have multiple superchunks, with a “super-super-chunk” storing a list of those.

The files are recorded in a file table, which stores an information about a file, as well as its topmost super…chunks. The file table is itself organized as a file: it has its own chunks and superchunks. The superchunks of a file table are recorded in the “root chunk”, which records the data about the entire filesystem.

On Flash FS, the chunks do not get overwritten when modified. Instead a new chunk is created further down, then a new superchunk, then a new file table chunk, and finally a new root chunk. Only then it reaches the end of the memory, it goes back and starts overwriting the blocks in the beginning.

That means that the position of the root block is not fixed. While the computer is running, it can remember the root position in RAM, but on startup it has to search for the root. That might sound bad, but with binary search a 1 terabyte disk can be searched with just 40 or so reads. Of course, the blocks have to be marked with some version counter, and some “milestone” blocks have to be made along the way.

Many types of Flash also allow “partial writing” – you can write to the same place for “free”, as long as you only change bits in a specific direction (usually – from 1 to a 0). This allows to create “log chunks”, that record small changes without committing to overwrite chunks and superchunks. When the “log block” is full – that’s when the changes are actually performed.

*Edit:* as I was corrected, modern consumer SSDs have a wear leveling feature. It allows to use a normal HDD filesystem, without caring about special Flash needs. It works like this: internally, the SSD has a different filesystem. That filesystem contains just a single file. A mini-computer inside SSD manages that FS, presenting that single file as a content of the SSD.

You are viewing 1 out of 4 answers, click here to view all answers.