The lens in your phone is different from the lens in your eye. The eye is also far more sophisticated than most cameras in general, and phone cameras in particular. Contrast and color captured through a phone camera are never going to look as good as what you see with your own eyes.
Your organic perception is more or less analog. You can see and perceive a continuous and smooth spectrum of light and color.
Computer vision is digital, and a color is usually stored as three 8-bit numbers, either as RGB (red, green, blue) or as HSL (hue, saturation, lightness). This means there are limits to the colors a computer can display. HSL works more intuitively for us humans because it maps the RGB values of displays onto a color wheel, which we can then adjust in lightness (this is how we get black in HSL) and in saturation, from gray to full color. In contrast, changing one value in RGB just a little can change the color in significant and often counterintuitive ways.
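To make the RGB vs. HSL point concrete, here is a small sketch using Python's standard colorsys module. The helper names rgb8_to_hsl and hsl_to_rgb8 are made up for this example; note that colorsys orders the values as hue, lightness, saturation (HLS), so the helpers reorder them.

```python
import colorsys

def rgb8_to_hsl(r, g, b):
    """8-bit RGB -> (hue in degrees, saturation %, lightness %)."""
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(l * 100)

def hsl_to_rgb8(h_deg, s_pct, l_pct):
    """(hue in degrees, saturation %, lightness %) -> 8-bit RGB."""
    r, g, b = colorsys.hls_to_rgb(h_deg / 360, l_pct / 100, s_pct / 100)
    return round(r * 255), round(g * 255), round(b * 255)

# Three 8-bit channels give a hard limit on the number of colors.
total_colors = 256 ** 3
print(total_colors)                 # 16777216

# Orange in RGB, then the same color expressed on the HSL wheel.
print(rgb8_to_hsl(255, 128, 0))     # (30, 100, 50)

# In HSL, "darker orange" is just a lower lightness; lightness 0 is black.
print(hsl_to_rgb8(30, 100, 25))
print(hsl_to_rgb8(30, 100, 0))      # (0, 0, 0)
```

Making the orange darker in RGB means changing two channels by different amounts, while in HSL only the lightness moves.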
The other thing at play here is the way these values are stored in memory. Image data is typically gamma-encoded: what gets stored is roughly the square root of the light value read by the camera. This costs some precision, and it causes significant errors when manipulating the image if you aren't using a program that accounts for it.
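Here is a rough sketch of why that matters when editing. It assumes the simple square-root encoding described above; real formats such as sRGB use a slightly different curve (close to a power of 2.2), so the exact numbers are only illustrative. The function names are made up for this example.

```python
import math

def encode(linear):
    """Linear light in 0..1 -> stored 8-bit value (square-root approximation)."""
    return round(math.sqrt(linear) * 255)

def decode(stored):
    """Stored 8-bit value -> linear light in 0..1."""
    return (stored / 255) ** 2

def blend_naive(a, b):
    """Average two stored values directly, ignoring the encoding."""
    return round((a + b) / 2)

def blend_linear(a, b):
    """Decode to linear light, average there, then encode again."""
    return encode((decode(a) + decode(b)) / 2)

# Mixing a black pixel (0) and a white pixel (255) half and half:
print(blend_naive(0, 255))   # 128
print(blend_linear(0, 255))  # 180
```

The naive blend comes out noticeably darker than a true 50/50 mix of the light, which is the kind of error you get from software that treats the stored numbers as if they were linear.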
There are two things to note here: field of view and color reproduction.
Field of view is the maximum angle that the camera (or your eyes) can see. If you use a wider lens, objects get stretched more and more as you get closer to the edge of the picture. Phones have software compensation to minimize this effect, but it is a natural consequence of the lens design. Phones have wide fields of view by default, and some have ultra-wide lenses as well, which capture more of the scene but with more distortion. You naturally see some of this distortion with your eyes and your brain considers it normal. That's why a zoomed-in photo feels squished: the natural distortion is missing.
The second, color reproduction, comes from multiple things: the lens could have a subtle discoloration, the sensor might be more sensitive to certain color tones, and the compression that creates the JPEG file also distorts color slightly. The main reason smartphones have unnatural colors, though, is that people in general prefer more vivid and brighter scenes, even if they are not as realistic. Because of this preference, the easy-to-use automatic camera mode usually brightens and oversaturates the input from the camera. MKBHD’s blind camera test shows this perfectly and is worth watching. You can get more realistic results either by taking the time to set the camera up in manual mode, or by making it save the uncompressed raw image along with the JPEG. These raw files look weird as well, but they can be edited to bring out detail that the JPEG would lose during compression, and also to get realistic colors.
The tl;dr is that people usually prefer dynamic and bright photos, so phones are adjusted to fit that. The minority that prefers realism can achieve it, but it takes longer.
TL;DR: most cameras aren’t designed to capture all colors. RGB colors are an elaborate illusion and it doesn’t always work well.
On colors specifically: digital cameras and screens define every color as a combination of some amount of red, green and blue. On most screens, the exact shade of “red” or “blue” dates back to CRT TVs: the reddest red the screen can show is the red that the phosphors available in the day could produce. This choice of “what is 100% red, and what is 100% blue” is called a color space, and screens and cameras can only represent the colors inside their color space.
Some real-life “reds” are redder than CRT phosphor reds, and we can’t photograph them and then show them on the screen. Other pure colors are slightly outside of the red-green-blue triangle, and we can’t represent them with any combination of RGB light.
HDR cameras and screens use more modern color spaces (in addition to being able to distinguish more levels), so a sunset looks better in HDR.
Another subtle thing is people have individual differences in color perception, so a mix of the same amount of red and green may look like one shade of yellow to me, and a slightly different yellow to you. If I make a camera that produces realistic colors to me, you might not agree at all.
Most answers here seem to neglect the main reason why we can’t capture sunsets quite like what our eyes see: Dynamic Range.
Dynamic Range is basically the ability to see both dark and bright spots at the same time, and it’s measured in stops. Each ‘stop’ is a doubling of the light value of a spot. So with a dynamic range of 10 stops, you can capture a dark spot and a bright spot that is a little more than 1000 times brighter in the same picture.
The best cameras on the market have a Dynamic Range of about 16 stops (which is a ratio of 1 : 65,536), while our eyes have a dynamic range of about 21 stops. That’s a contrast ratio of roughly 1 : 2,000,000.
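The arithmetic behind those ratios is just powers of two. A quick sketch (the function names are made up for this example):

```python
import math

def stops_to_ratio(stops):
    """Each stop doubles the light, so N stops is a contrast ratio of 2**N."""
    return 2 ** stops

def ratio_to_stops(ratio):
    """Inverse: how many doublings fit into a given contrast ratio."""
    return math.log2(ratio)

print(stops_to_ratio(10))  # 1024     (a little more than 1000)
print(stops_to_ratio(16))  # 65536    (the camera figure above)
print(stops_to_ratio(21))  # 2097152  (the eye figure above, about 2,000,000)

# The 5-stop gap between 16 and 21 stops is a factor of 32 in contrast.
print(stops_to_ratio(21) // stops_to_ratio(16))  # 32
```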
So when we capture something, even with the best DSLR out there, there’s a lot of detail that gets lost that our eyes can perceive. The best example would be the moon. You can quite clearly see the moon and the details on its surface in the night with your eyes while also being aware of your surroundings. But as soon as you try to take a picture of the moon and the landscape with your phone, the moon just morphs into a bright blob on your screen. That’s because the moon is really bright, much brighter than anything else that’s usually lit in the dark, even in a city full of street lights.
And the reason for that has nothing to do with our eye lens or the camera lens, but the image sensors themselves. Camera sensors just aren’t as sensitive as the cells that capture light in our eyes.
Then there’s the problem of screens themselves. Most LCD screens only go up to about 10 stops, which is roughly 1 : 1000. TVs go up to about 1 : 4000, or roughly 12 stops. So the image has to be compressed even further to show it on a screen, and detail that lived in the extra stops gets lost.
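As a toy illustration of that compression, here is a sketch that squeezes a 16-stop capture into a 10-stop display by scaling everything evenly in stops. This is a deliberately simplified assumption for the example; real cameras and screens use more elaborate tone mapping, and compress_stops is a made-up name.

```python
import math

def compress_stops(luminance, scene_stops, display_stops):
    """Map a scene luminance (1 = darkest, 2**scene_stops = brightest)
    into the display's range by scaling its position in stops."""
    stops_above_black = math.log2(luminance)
    return 2 ** (stops_above_black * display_stops / scene_stops)

# The brightest point of a 16-stop scene lands at the top of a 10-stop screen.
print(compress_stops(65536, 16, 10))  # 1024.0

# Two spots that were a full stop apart (a factor of 2) in the scene...
a = compress_stops(1024, 16, 10)
b = compress_stops(2048, 16, 10)
# ...end up only about 1.54x apart on the screen, so the contrast flattens.
print(b / a)
```

Every one-stop difference in the scene shrinks to 10/16 of a stop on the display, which is one way of seeing why the result looks flatter than the real thing.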
A lot of good replies adding bits of the answer. But also:
When you look at a sunset you are experiencing reality including the heat from the sun on your skin, the tightening of your iris from the intensity of the sunlight, the smell of the air, the breeze in your hair. The sounds of the birds, traffic, your friends… You are not looking at a picture, it’s the actual thing!
Also it moves. A picture doesn’t.
Something not mentioned so far: your eyes constantly move even if you try hard to stay still. Therefore, even if you look at something for just 1 second, your eyes experience a multitude of slightly different viewing angles, and these all come with a variety of focal points, contrasts etc. Your brain blends that into a single beautiful image which we call “reality”.
Years ago I had two cameras: a zoom camera and a pocket camera with a slightly wide angle (35mm lens on 35mm film). The zoom could never get a landscape looking anything like I saw it, but the smaller camera was pretty close to what I saw. I always used it for landscapes. So lens angle makes a big difference. Zoom your camera so it is displaying what you see.
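To put rough numbers on lens angle: for a simple rectilinear lens focused far away, the horizontal angle of view follows from the focal length and the width of the film or sensor. The sketch below assumes a 36 mm wide frame (standard 35mm film); the 200 mm value is just an example of a long zoom setting, not a figure from the post.

```python
import math

def horizontal_fov_deg(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal angle of view in degrees for a rectilinear lens at infinity focus."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

# 35mm lens on 35mm film: a moderately wide view, roughly 54 degrees.
print(round(horizontal_fov_deg(35), 1))

# A long zoom setting (200 mm, assumed for illustration): roughly 10 degrees.
print(round(horizontal_fov_deg(200), 1))
```

The wide pocket camera takes in several times the angle of the zoomed-in one, which fits the experience that it looked much closer to the scene as seen in person.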