Do single event upsets ever affect normal computing?

I just read about [single event upsets](https://en.wikipedia.org/wiki/Single-event_upset) and it’s pretty fascinating. One thing that got me was that a speedrunner of Super Mario 64 experienced a single event upset.

So that leads me to believe that commercial electronics and regular CPUs and GPUs must have a chance to experience these single event upsets. When I research it, there’s only discussion on how it affects space electronics and FPGAs. But there’s gotta be a chance it affects my normal laptop, right? Why would FPGAs be more susceptible to SEUs than CPUs?

If I’m writing a Python script and I set a boolean to False, what’s the probability it gets set to True instead? If I’m logging into a website, what’s the possibility that the server side misinterprets my input? If it can affect an N64 in someone’s living room, there’s gotta be a non-zero chance, right?

7 Answers

Anonymous

SEUs (or “soft errors”) are a non-zero hazard in modern computing. There are a lot of variables, but in general the risk grows with smaller-geometry (higher-density) circuits. If your computer has ECC memory, it can detect and correct single-bit upsets in RAM. Your microprocessor and GPU generally can’t. (Well, they could, but it would make everything more expensive so they don’t.) They typically don’t even use ECC on their cache memory arrays.
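To make "ECC can deal with single upsets" concrete, here is a minimal sketch of a Hamming(7,4) code, the textbook single-error-correcting scheme. It is an illustration only: real ECC memory uses wider codes implemented in hardware, and the function names below are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7 = p1 p2 d1 p3 d2 d3 d4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data_bits, error_position); error_position is 0 if no flip was detected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome = 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1                        # simulate a single event upset in one stored bit
recovered, where = hamming74_correct(stored)
print(recovered, "flipped position:", where)
```

This code corrects any single flipped bit in the 7-bit word. Two flips in the same word would be miscorrected, which is why practical schemes add an extra parity bit to at least detect double errors.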

The biggest hazards come from cosmic rays and from alpha radiation emitted by the materials used to package the integrated circuits. The atmosphere helps quite a bit with cosmic radiation, but little else practical does, unless you like surrounding your computer with a lot of lead or rock.

Fortunately, your error rate at sea level with low-alpha packaging materials is fairly low. Using your computer on an airplane flight makes your chances go way up, roughly 10-30x the last time I did the calculations. Even going to high terrestrial altitudes makes a significant difference.

FPGAs are not intrinsically more susceptible. In fact, space applications often use FPGAs built on older process technologies to help mitigate the risk. You can also instantiate voting systems for computations, but that increases costs.
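A voting system of the kind mentioned above can be sketched in software as triple redundancy with a bitwise majority vote. This is only an illustration of the idea: in an FPGA the three copies would be separate hardware, and `majority_vote` is a name invented for this example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit matches at least two of the inputs."""
    return (a & b) | (a & c) | (b & c)


value = 0b10110010
copy_a, copy_b, copy_c = value, value, value
copy_b ^= 0b00001000                  # simulate an upset flipping one bit in one copy

voted = majority_vote(copy_a, copy_b, copy_c)
print(voted == value)                 # the single corrupted copy is outvoted
```

The cost shows up directly: three copies of the state plus the voter, which is the "increases costs" trade-off.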

Run a standard PC long enough, and it will eventually have an SEU. It can cause a hang, a crash, or data corruption. (Or nothing at all, but those don’t really count.) I can’t cite a number, since there are a lot of variables. But I’d say a typical PC running 24/7 has a good chance of seeing at least one event within a year.
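For a feel of how a rate turns into a "chance of an event", one common simplification is to treat upsets as a Poisson process with a constant rate, so that P(at least one event) = 1 - exp(-rate × time). The sketch below assumes that model, and the rate in it is a made-up placeholder, not a measured figure. The 10x and 30x multipliers are the in-flight range quoted above.

```python
import math


def prob_at_least_one(rate_per_hour: float, hours: float) -> float:
    """P(at least one event) for a Poisson process with a constant rate."""
    # Equals 1 - exp(-rate * hours); expm1 keeps precision when the product is tiny.
    return -math.expm1(-rate_per_hour * hours)


# HYPOTHETICAL placeholder rate, chosen only to make the arithmetic visible.
ASSUMED_SEA_LEVEL_RATE = 1 / 10_000   # events per hour of operation
HOURS_PER_YEAR = 24 * 365

p_year = prob_at_least_one(ASSUMED_SEA_LEVEL_RATE, HOURS_PER_YEAR)
print(f"one year 24/7 at sea level: {p_year:.2f}")

for multiplier in (10, 30):           # in-flight range from the answer above
    p_flight = prob_at_least_one(ASSUMED_SEA_LEVEL_RATE * multiplier, 10)
    print(f"10-hour flight at {multiplier}x the rate: {p_flight:.4f}")
```

Swap in a rate for your own hardware and environment if you have one. The point is only the shape of the relationship, not the specific numbers.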

Source: *I’m a former semiconductor reliability engineer who did a fair bit of SER work.*
