Basically, there is not necessarily a maximum frequency. We also do not see in “frames” – that is, we don’t see a series of still images in rapid succession that looks like a continuous “video”.
Each of our retinal ganglion cells (our “seeing” cells) is connected to our brain. When one of these cells is exposed to a change in light (very oversimplified for ELI5), it sends a signal to our brain. This happens all the time in all of our seeing cells, even when our eyes are closed, and there are millions and millions of these cells, each firing (mostly) on its own.
Our brain receives all of these signals continuously, extracts details like edges and shapes, and assembles them into a “mental image” that we experience as vision. It’s actually one of the best-understood systems in the brain and has been studied for over a century. We experience this as a continual “stream of vision”, whereas a video camera quickly takes pictures over and over again and then plays them back to create the illusion of continuity.
So you can see why we can’t quite say, “humans see in XX Hz”. The way our eyes work to create vision is fundamentally different from how cameras work to create videos.
EDIT: I found a [scientific article](https://www.nature.com/articles/srep07861) that claims human subjects could detect a single altered frame in 500 FPS video. This doesn’t mean we “see” in 500 FPS; instead it means our sight cells and brain are sensitive enough to catch a frame that was visible for only 0.002 seconds (1/500 of a second). It’s like how you can run your finger along a piece of wood and find a tiny rough spot through touch alone; our brain looks for patterns in the visual information coming from our eyes, and if a pattern is broken, our brain is very good at detecting it and making us aware of it.
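For anyone who wants to check the timing, a single frame lasts one divided by the frame rate. Here is a tiny sketch of that arithmetic; the function name `frame_duration_seconds` and the 24 and 60 FPS comparison values are just my own picks for illustration, not something from the article.

```python
def frame_duration_seconds(fps: float) -> float:
    """Return how long one frame stays on screen at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps


# 24 and 60 FPS are illustrative comparison points; 500 FPS is the rate
# mentioned above.
for fps in (24, 60, 500):
    milliseconds = frame_duration_seconds(fps) * 1000
    print(f"{fps} FPS -> {milliseconds:.1f} ms per frame")
```

At 500 FPS this comes out to 2 milliseconds per frame, which is the sliver of time the viewers would have had to notice the odd frame.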