ELI5: When applications crash, why do they provide an obscure error code instead of a small description of what the issue actually is?

Anonymous

The problem is that the thing doing the reporting doesn’t know exactly what caused the chain of events.

Let’s say you have a video game that allocates some memory to store each enemy’s hit points. When an enemy is shot, the game loads the hit points from the memory location belonging to that particular enemy, subtracts 1 HP of damage, and stores the new hit point number back in the same memory location. If the damage was lethal, it instead de-allocates the memory for that particular enemy’s hit points.

It’s a simple and common design for this type of thing.
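Here's a rough sketch of that design, just to make it concrete. Everything in it is made up for illustration: `ToyHeap` is a pretend version of memory (a dictionary from address to value), and `apply_shot` is a hypothetical name for the damage routine. A real game would do this with actual memory allocation, not a Python dictionary.

```python
class ToyHeap:
    """A toy stand-in for process memory: address -> stored value."""

    def __init__(self):
        self.cells = {}
        # Arbitrary starting address, borrowed from the example error message.
        self.next_address = 93441720

    def allocate(self, value):
        address = self.next_address
        self.next_address += 4
        self.cells[address] = value
        return address

    def free(self, address):
        del self.cells[address]

    def load(self, address):
        return self.cells[address]

    def store(self, address, value):
        self.cells[address] = value


def apply_shot(heap, hp_address):
    """Subtract 1 HP; de-allocate the enemy's memory if the shot was lethal."""
    hp = heap.load(hp_address) - 1
    if hp <= 0:
        heap.free(hp_address)
        return "dead"
    heap.store(hp_address, hp)
    return "alive"


heap = ToyHeap()
enemy_hp = heap.allocate(2)          # an enemy with 2 hit points
first = apply_shot(heap, enemy_hp)   # 2 -> 1, enemy survives
second = apply_shot(heap, enemy_hp)  # 1 -> 0, memory is de-allocated
```

As long as nobody shoots a dead enemy, this works fine.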

Now whenever a program accesses memory, the OS checks for access to memory that’s been de-allocated (or was never properly allocated in the first place). All the OS “knows” is that an instruction attempted to access memory at address 93441720. So the error message “Process coolgame.exe invalid memory access at address 93441720” is literally all the OS knows about the error.

So what actually happened? Well, in this particular level, there’s a helper character that fires shots at enemies in addition to the player. The helper character’s shot hit the enemy, killing it. Then the game processed the player’s shot, which hit the same enemy on the same frame, and attempted to decrease the enemy’s hit points by 1, but the memory storing the enemy’s hit points was already de-allocated.
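A minimal sketch of that double-shot bug, under some simplifying assumptions: memory is a plain dictionary, `load`, `apply_shot` and `process_frame` are hypothetical names, and the pretend "OS" checks every single address exactly. Notice how little information ends up in the report.

```python
class InvalidMemoryAccess(Exception):
    """What the toy 'OS' reports: a process name and an address, nothing else."""


memory = {93441720: 1}  # address -> HP; this enemy has exactly 1 HP left


def load(address):
    # The toy OS only knows whether the address is currently allocated.
    if address not in memory:
        raise InvalidMemoryAccess(
            f"Process coolgame.exe invalid memory access at address {address}"
        )
    return memory[address]


def apply_shot(address):
    hp = load(address) - 1
    if hp <= 0:
        del memory[address]  # lethal shot: de-allocate the hit point memory
    else:
        memory[address] = hp


def process_frame(shots):
    # Each entry is the hit point address of the enemy that a shot landed on.
    for address in shots:
        apply_shot(address)


try:
    # The helper's shot, then the player's shot, both on the same enemy.
    process_frame([93441720, 93441720])
    report = None
except InvalidMemoryAccess as error:
    report = str(error)
```

The report says nothing about helpers, shots, or enemies, because the code that produces it has never heard of any of those things.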

You should know that the OS’s “de-allocation detector” is very coarse-grained. Because of the way the CPU design works, it can only flag regions of about 4000 bytes (4096-byte “pages” on most systems) as “allocated” or “unallocated.” There’s also some time delay involved: the OS has to run some bookkeeping to determine when regions become allocated or unallocated, so a region that becomes unallocated is only flagged once the OS bookkeeping code runs.

So an incredibly specific sequence of events has to occur to cause this crash:

– You have to be playing in a particular part of a particular level for the helper character to show up
– You and the helper character have to shoot at the same time
– Both shots have to hit the same enemy on the same frame
– The enemy has to have exactly one hit point remaining so the first shot is lethal
– No other enemy data is located in the same 4000-byte region as the dying enemy, making it possible for the OS to enable the “unallocated” flag for the whole 4000-byte region
– The OS’s unallocated-region bookkeeping code happens to run in between processing the first and second shot, causing it to actually flag the region as unallocated so the game crashes
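The last two conditions in that list can be sketched like this. It's a toy model, not how any real OS is written: `ToyOS`, `run_bookkeeping` and `stale_access_crashes` are invented names, and the page size is assumed to be 4096 bytes (the common real-world value behind the "about 4000 bytes" figure). The point is only that the stale access goes unnoticed unless the whole region is empty and the bookkeeping has already run.

```python
PAGE_SIZE = 4096  # assumed region size for this sketch


class InvalidMemoryAccess(Exception):
    pass


class ToyOS:
    def __init__(self):
        self.live = set()              # addresses the program has allocated
        self.accessible_pages = set()  # regions currently flagged "allocated"

    def allocate(self, address):
        self.live.add(address)
        self.accessible_pages.add(address // PAGE_SIZE)

    def free(self, address):
        # The region flag is NOT updated here; that waits for bookkeeping.
        self.live.discard(address)

    def run_bookkeeping(self):
        # A region stays flagged as long as anything in it is still allocated.
        self.accessible_pages = {a // PAGE_SIZE for a in self.live}

    def access(self, address):
        if address // PAGE_SIZE not in self.accessible_pages:
            raise InvalidMemoryAccess(f"invalid memory access at address {address}")


def stale_access_crashes(neighbor_on_same_page, bookkeeping_ran):
    os_model = ToyOS()
    enemy = 93441720
    os_model.allocate(enemy)
    if neighbor_on_same_page:
        os_model.allocate(enemy - 8)  # another enemy's data in the same region
    os_model.free(enemy)              # first shot is lethal
    if bookkeeping_ran:
        os_model.run_bookkeeping()
    try:
        os_model.access(enemy)        # second shot touches the freed memory
    except InvalidMemoryAccess:
        return True
    return False


results = {
    (neighbor, ran): stale_access_crashes(neighbor, ran)
    for neighbor in (False, True)
    for ran in (False, True)
}
```

Of the four combinations, only "no neighbor in the region, and bookkeeping ran in between" ends in a crash. In the other three the bug is still there, it just doesn't get caught.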

Again, the OS, which is the thing that’s giving you the error message, has no idea about any of this. It doesn’t know that the memory is used to store enemy hit point data, or anything about the sequence of events that caused the program to de-allocate that memory. Heck, the OS doesn’t even know your program is a game. All the OS knows is that the program attempted to access memory address 93441720, which was not properly allocated at the time of the access attempt.

Instead of having the OS terminate the program, the developer could write their own code to check for invalid memory access. But the developer can’t make much better error messages.

You see, in order to write the error message and the code to trigger it, the developer would have had to realize it’s possible for damage to be applied to an enemy whose hit point memory has been de-allocated.

But of course, if the developer had this knowledge, they wouldn’t write the detailed error message! Instead they’d simply fix the bug, and you’d never see the crash, because the developer fixed it before you got the game.
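To illustrate that last point, here's a hedged sketch comparing the two options. All the names (`FreedEnemyError`, `damage_checked`, `damage_fixed`) are hypothetical, and hit points live in a plain dictionary keyed by a made-up enemy ID. A generic check can only say what went wrong, not why; a developer who knows why would just write the second version.

```python
class FreedEnemyError(Exception):
    """Generic error: the check knows *what* went wrong, not *why*."""


def damage_checked(hit_points, enemy_id):
    # A generic guard the developer could add without knowing about the bug.
    if enemy_id not in hit_points:
        raise FreedEnemyError(f"damage applied to freed enemy slot {enemy_id}")
    hit_points[enemy_id] -= 1
    if hit_points[enemy_id] <= 0:
        del hit_points[enemy_id]


def damage_fixed(hit_points, enemy_id):
    # What a developer who understands the double-shot case would write.
    if enemy_id not in hit_points:
        return  # enemy already died this frame; nothing to do
    hit_points[enemy_id] -= 1
    if hit_points[enemy_id] <= 0:
        del hit_points[enemy_id]


checked = {7: 1}
damage_checked(checked, 7)      # helper's shot kills enemy 7
try:
    damage_checked(checked, 7)  # player's shot hits the freed slot
    message = None
except FreedEnemyError as error:
    message = str(error)

fixed = {7: 1}
damage_fixed(fixed, 7)
damage_fixed(fixed, 7)          # second shot is silently ignored, no crash
```

The generic message is barely more helpful to a player than the OS's address, and the fixed version never shows a message at all.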