I don’t work for nvidia (I wish I were paid as well) but I do work in chip design, and I can tell you I cost my company more in computers and software licenses than they pay me.
Making something in silicon is a ton of work, and it also takes a long time from when you send your design to the fab to when you get the first batches back. And if they don’t work like you expect, you’re out of luck: you can’t open that shit up and hook an oscilloscope to it to get some traces of what’s happening.
So what a lot of their engineers do is run a lot of simulations, using software from companies nobody outside this field has heard of, like Cadence and Synopsys. You give the tool your design and a program you want it to run, it simulates everything, and you check that you’re getting what you want. It’s a pretty long process, but at least you get feedback quickly enough (mere minutes for small subsystems, typically days for whole-system tests) to fix your design before sending it to TSMC or another foundry.
For a GPU, you’d typically start with simulations of a single CUDA core with fake connections to the outside world, so you can check the core is doing what you want. Then you start putting a few together, adding the interconnects with memory, making sure cores don’t stall because they can’t get data fast enough, stuff like that. Then you move into the fun stuff: simulating how it heats up and all the dynamic frequency scaling that tweaks the performance.
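To give a feel for the "fake connections plus checking" idea, here is a toy sketch in plain Python. Real chip verification uses hardware description languages and the EDA tools mentioned above, not Python; the adder unit and test vectors here are entirely made up for illustration. The pattern is the same though: wrap the unit under test, drive fake inputs, compare against what you expect.

```python
# Toy sketch of what a hardware testbench does conceptually.
# (Hypothetical example; real flows use Verilog/VHDL and simulators.)

def adder_unit(a: int, b: int) -> int:
    """Stand-in for the unit under test: a 4-bit adder that wraps on overflow."""
    return (a + b) & 0xF  # 4-bit result, carry-out discarded

def run_testbench() -> bool:
    """Drive fake inputs (stimulus) and compare against expected outputs."""
    stimulus = [(1, 2, 3), (7, 8, 15), (15, 1, 0)]  # (a, b, expected)
    for a, b, expected in stimulus:
        got = adder_unit(a, b)
        if got != expected:
            print(f"FAIL: {a}+{b} -> {got}, expected {expected}")
            return False
    print("All vectors passed")
    return True

run_testbench()
```

The point is that the "outside world" (whatever would normally feed the unit its data) is replaced by a list of canned inputs, so you can exercise the design long before any silicon exists.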
Simulations are done at different levels: you start with something further from reality but pretty fast, then move to simulations that are more precise but really slow, until you believe you’ve got it right (or the higher-ups tell you to get it done because there’s a deadline) and you pray the silicon does what you expect.
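The fast-but-abstract versus slow-but-precise trade-off can be sketched with the same toy adder idea (again a hypothetical Python illustration, not how real tools work): a behavioral model just computes the answer, while a more detailed model mimics the circuit structure, here a ripple-carry adder propagating the carry bit by bit.

```python
# Hypothetical sketch of the accuracy/speed trade-off between
# abstraction levels, in plain Python.

def behavioral_add(a: int, b: int) -> int:
    """Fast, abstract model: just compute the 8-bit result in one step."""
    return (a + b) & 0xFF

def bit_level_add(a: int, b: int) -> int:
    """Slower, more detailed model: propagate the carry bit by bit,
    closer to what the actual adder circuit does."""
    result, carry = 0, 0
    for i in range(8):  # 8-bit ripple-carry adder
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        s = bit_a ^ bit_b ^ carry                      # sum bit
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))  # carry out
        result |= s << i
    return result

# The two models must agree; you trust the fast one because the slow,
# detailed one confirms it on the cases you care about.
assert all(behavioral_add(a, b) == bit_level_add(a, b)
           for a in range(16) for b in range(16))
```

In real flows the gap is much bigger: the detailed levels model individual gates or even transistor timing, which is why whole-system runs take days.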