eli5: In a video game, why is it so hard to create hit boxes? And why aren’t they exact copies of the character model?


Hopefully this hasn’t been asked before on this sub, but I was wondering why so many games struggle with good hit boxes? Why are they even “boxes” in the first place? I would assume they would just be carbon copies of the character model instead of having individual boxes that stretch out over the character’s body.

In: Technology

I’ll try to explain, but I’m sure someone else will have a better explanation. So you shoot someone. That projectile crosses the map and hits the player. OK, but how does the game know it hit? You put a box around the player character so that when the bullet passes through it, the game sends the command to injure the player. As you move around, this box has to be constantly recalculated. A big box is much easier to compute than a closely fitted shape with curves that match and move with your body. Basically, it’s easier on the machine.

The point of a hit box is to calculate whether something is hitting or colliding with an object. Complex shapes require much more complicated calculations to work out whether an object is inside them. So, for collisions at least, it is much less computationally difficult to use a cuboid.

Checking for collision against a box around the player is quick and easy. Precise? Naw, not really – but gives you a result, either hit or miss, really quickly.

Checking for collision against *every individual part of a model* is slow. Slower “decisions” on things like hits or misses are extremely undesirable.

It’s very easy to check the collision of a box with a box. You can check each vertex on each box and see if it’s on the inside of the other box, and then vice versa.
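For boxes that are lined up with the world axes, a common shortcut is to compare the min/max extents on each axis rather than testing corners one by one. Here is a minimal sketch of that idea; the `Box` type and `boxes_overlap` name are made up for illustration, and it assumes the boxes are not rotated. It also catches the case where two boxes cross each other without any corner of one sitting inside the other.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box:
    """Axis-aligned box described by its minimum and maximum corners (hypothetical type)."""
    min_corner: Vec3
    max_corner: Vec3


def boxes_overlap(a: Box, b: Box) -> bool:
    # Two axis-aligned boxes overlap only if their extents overlap on every axis.
    return all(
        a.min_corner[i] <= b.max_corner[i] and b.min_corner[i] <= a.max_corner[i]
        for i in range(3)
    )
```

That is at most six comparisons per pair of boxes, no matter how detailed the models inside them are.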

The problem is when you have a lot of boxes.

If you look back at something like Super Mario 64, Mario had ~750 triangles that defined the shape of his body. If you had two Mario models and wanted to check collision, you would take the first triangle of one model and check it against all 750 triangles of the other, then do another 750 comparisons for the second triangle, and so on. You would get around 750^2 (about 560,000) comparisons if you did it this way. Now imagine you took the Super Mario Odyssey model, which I believe has 15,000+ polygons making up his mesh. When you do that raw comparison, you get far more things to compare against.
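To put rough numbers on that, here is the arithmetic as a tiny sketch. The triangle counts are the ones quoted above, and `naive_pair_checks` is just an illustrative name for "test every triangle against every triangle".

```python
def naive_pair_checks(triangles_a: int, triangles_b: int) -> int:
    # Every triangle of one model is tested against every triangle of the other.
    return triangles_a * triangles_b


mario_64 = naive_pair_checks(750, 750)              # 562,500 checks
mario_odyssey = naive_pair_checks(15_000, 15_000)   # 225,000,000 checks
growth = mario_odyssey // mario_64                  # 400 times more work
```

So 20 times more triangles per model means roughly 400 times more comparisons, and that is for a single pair of characters.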

So we need to significantly reduce the amount of stuff we’re comparing. Instead of looking at several hundred triangles that represent Mario’s arm, we can just put one cylinder there. Instead of several thousand triangles that make up his body, we can just use a few spheroids.
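As a sketch of why simple stand-in shapes are so cheap, here is the usual sphere-versus-sphere test: two spheres touch when the distance between their centers is no more than the sum of their radii. The function name and the tuple-based points are assumptions for illustration only.

```python
def spheres_overlap(center_a, radius_a, center_b, radius_b) -> bool:
    # Compare squared distances so no square root is needed.
    dist_sq = sum((pa - pb) ** 2 for pa, pb in zip(center_a, center_b))
    return dist_sq <= (radius_a + radius_b) ** 2
```

One check like this can replace hundreds of triangle tests for something like a head or a torso.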

Basically, perfect hitboxes are computationally expensive to keep checking. They can be done in something like a rendered movie, where you prerender the footage and can take more than 1 second to produce 1 second of footage, but in real time gameplay, they just aren’t worth the computing power.

Because it makes for shit game feel. Well-designed hitboxes slightly “cheat” in the player’s favor to compensate for subconscious biases. If it’s really close, the player is going to assume they just barely dodged out of the way or just barely landed the hit, whether they did or not. So the player’s hurtbox (the region that determines whether you take damage) is slightly smaller than the visible model, and the hitboxes of your attacks are slightly larger. This is an intentional game design decision, not simply a technological limitation.
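A minimal sketch of that kind of "cheating", assuming boxes are stored as min/max corners: shrink or grow a box around its own center by a tuning factor. The `scaled_box` helper and the factors mentioned afterwards are hypothetical, not taken from any particular game.

```python
def scaled_box(box_min, box_max, factor):
    """Return a copy of the box scaled about its center by `factor`."""
    centers = [(lo + hi) / 2 for lo, hi in zip(box_min, box_max)]
    halves = [(hi - lo) / 2 * factor for lo, hi in zip(box_min, box_max)]
    new_min = tuple(c - h for c, h in zip(centers, halves))
    new_max = tuple(c + h for c, h in zip(centers, halves))
    return new_min, new_max
```

A designer might use a factor a little below 1 for the box that decides whether the player takes damage and a little above 1 for the box of the player’s own attacks, then tune both by feel.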

To see what it’s like in 2D, check out any flash game where the creator just used the sprites for collision detection instead of bothering to create hitboxes. This is what “perfect” collision detection feels like (spoiler: absolute garbage).

Frankly speaking, the math is hard and it can’t be done quickly. And as the geometry of the model increases in complexity, from a cube into something with more vertices, the math complexity scales disproportionately.

It’s been a long time since university, but I know there are simple algorithms for calculating whether a straight line (say a bullet path) intersects a cube, or how much of it intersects a cube. Easy. And compensating for the movement/rotation of the cube, or of the line for that matter, is simple matrix transform math, which modern CPUs eat up. But once you start trying to compute the boundaries of two cubes it gets harder. Change the cubes to cylinders and now we’re getting kinda difficult. Now the cylinders are complex constructs of conjoined cylinders, cuboids and triangular prisms: harder still. Now we’re at a fully deformable mesh construct thing. Super hard to calculate collisions.
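One of those simple algorithms is usually called the slab test. Here is a rough sketch of it for a bullet path treated as a straight segment and a box aligned with the world axes; the function name and the tuple arguments are assumptions for illustration, and a rotated box would first need the matrix transform mentioned above.

```python
def segment_hits_box(start, end, box_min, box_max) -> bool:
    """Slab test: does the segment from start to end pass through the axis-aligned box?"""
    t_enter, t_exit = 0.0, 1.0  # portion of the segment still inside every slab so far
    for s, e, lo, hi in zip(start, end, box_min, box_max):
        d = e - s
        if d == 0:
            # Segment runs parallel to this pair of faces: it must already lie between them.
            if s < lo or s > hi:
                return False
            continue
        t0, t1 = (lo - s) / d, (hi - s) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return False  # the per-axis intervals no longer overlap, so it is a miss
    return True
```

The whole thing is a handful of subtractions, divisions and comparisons per axis, which is why a box is such an attractive target.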

And by hard, I don’t mean the CPU couldn’t do it. We could have a super accurate, model-faithful hitbox for our players, but the calculations involved every time collision detection ran would cause the CPU to grind to a halt.

So reducing the geometry of a hitbox to a reasonable compromise between collision/model faithfulness and computational ease is something that game engine designers do all the time. Which is why it’s easier in some regards to develop for a console than for PC, because at least with the console you can right-size that compromise to deliver the best performance under most circumstances for most players: everyone has the same CPU.

Former game developer here,

What you’re asking to do is actually very complicated. Let’s presume you’re shooting at a cube. A cube is 6 planes, and they each have a normal vector indicating which side is top and bottom, so all you have to do is determine if your bullet is on the bottom of all the planes. The only way that can happen is if the bullet is inside the cube.
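Here is a small sketch of that plane test, assuming each plane is stored as an outward-pointing normal plus an offset, so that "on the bottom" means the dot product of the normal with the point, minus the offset, is zero or negative. The names `point_inside_convex` and `UNIT_CUBE` are made up for the example.

```python
def point_inside_convex(point, planes) -> bool:
    """planes: list of (outward_normal, offset); inside means dot(normal, point) - offset <= 0."""
    return all(
        sum(n * p for n, p in zip(normal, point)) - offset <= 0
        for normal, offset in planes
    )


# The six faces of a cube spanning (0, 0, 0) to (1, 1, 1).
UNIT_CUBE = [
    ((1, 0, 0), 1), ((-1, 0, 0), 0),
    ((0, 1, 0), 1), ((0, -1, 0), 0),
    ((0, 0, 1), 1), ((0, 0, -1), 0),
]
```

For a cube this is six dot products. The same loop works for any convex shape, but the cost grows with the number of faces.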


Now do that for a 400,000 polygon enemy model. Go ahead. I’ll wait, because it’ll take a while. Also, this test doesn’t work if the model is concave, because you’ll be on the outside of some polygons while still being inside the model.

But also, the enemy isn’t always 400k polygons. The further away it is, the fewer polygons it is. The lower your detail settings, the fewer polygons. Hell, at its furthest distance, it may be a 2D billboard.

So if you’re playing with friends and shoot at the same thing from different distances and configurations, are you shooting at the same thing? Is it fair?

But also you want the game to run. A hit box simplifies the problem for performance. It’s easier to check a box than 400k polygons. And these are actually stored in hierarchies of better and better fits. I can surround a whole tank in a box to determine if I have to bother to see whether you clipped it in the cannon barrel or not.
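A toy version of that hierarchy, with just two levels: one big box around the whole tank, and smaller boxes for its parts that are only looked at if the big box was hit. All names and dimensions here are invented for illustration; real engines typically use deeper trees of bounding volumes.

```python
def point_in_box(point, box_min, box_max) -> bool:
    return all(lo <= p <= hi for p, lo, hi in zip(point, box_min, box_max))


def hit_part(point, outer_box, part_boxes):
    """Return the name of the part that was hit, or None. outer_box encloses every part box."""
    if not point_in_box(point, *outer_box):
        return None  # cheap rejection: the finer boxes are never examined
    for name, (box_min, box_max) in part_boxes.items():
        if point_in_box(point, box_min, box_max):
            return name
    return None  # inside the big box but between the parts


TANK_OUTER = ((0, 0, 0), (7, 2, 2))
TANK_PARTS = {
    "hull": ((0, 0, 0), (4, 2, 2)),
    "barrel": ((4, 0.8, 1.2), (7, 1.2, 1.6)),
}
```

Most shots miss the outer box entirely, so most of the time the expensive per-part work never runs.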

Collision is expensive to calculate, and one box containing the whole character is often good enough. Adding more shapes makes the number of checks grow quickly, since every new shape may need testing against many others, and physics is harder to run in parallel than graphics.

Number 1: They aren’t always “boxes”. Often they are spheroids.

Number 2: Hit boxes simplify the calculation of collision detection so the computer can calculate it in real time without killing your frame rate. Detecting collision with multiple moving objects in different states of animation is already a pain in the arse with this simplification. You *could* have hit boxes that perfectly match the character model (essentially, eliminating the need for separate hit boxes at all), but those models are often thousands of polygons. It’s not feasible to do this even on the most powerful machines.

Number 3: Ignoring the technical limitations, there are often gameplay design reasons for hit boxes that don’t match one-to-one with the model. Sometimes you want to make it easier for the player to shoot an enemy even if they’re a few pixels off (first person shooters), or avoid taking damage even if a bullet technically touched them (top-down shooters/bullet hell). Or in a 2D fighter for balance reasons you want certain punches to land even if the opponent is crouching and it visibly misses them. Platform games often use larger hit boxes on platform edges to implement “coyote time” – an additional window where the player can still jump or land despite having technically fallen off the visible ledge.