What makes programming languages like Go and Rust memory-safe and C++ not?


Short answer: pointers. Long answer, in case you don’t know: a pointer is basically an object that points to a location in memory. C++ is pretty lax in how you create and destroy pointers and the objects they may or may not point to. It allows you to declare a pointer without initializing it, so it holds whatever garbage address happened to be there (an uninitialized or “wild” pointer), and it lets you dereference a null pointer, which points at nothing at all. It also doesn’t take care of dangling pointers, i.e. pointers to an object that has been deleted or de-allocated.
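To make the null case concrete, here is a minimal sketch (assuming C++17 for std::optional; `read_if_set` is a made-up helper name, not anything standard). It only reads through the pointer after checking it against nullptr.

```cpp
#include <optional>

// Hypothetical helper: read through a pointer only when it is non-null.
std::optional<int> read_if_set(const int* p) {
    if (p == nullptr) {
        return std::nullopt;  // nothing to read, report "no value"
    }
    return *p;                // safe only if p points at a live int
}
```

Note what this cannot do: a null check catches null pointers only. An uninitialized or dangling pointer is usually non-null and looks perfectly valid, which is exactly why those are so nasty in C++.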

C++ is basically a language that assumes the programmer knows what they are doing and is more than willing to give you anything you need, even if that leads to memory leaks, buffer overflows, and null pointer dereferences. As my old C++ professor loved to say, “C++ is a lovely language eager to give you all the space and rope you need… to hang yourself.”

In old languages like C and C++ it’s up to the programmer to not make mistakes. For example if you type:

int a[10];
a[16] = 0;

The first line gives you an array (like a group or collection) of 10 integers/numbers. The second line sets the value at position 16 to 0. This doesn’t make sense: you asked for 10 numbers, so the only valid positions are 0 through 9. In languages like C and C++ this is called undefined behaviour. What will happen, nobody knows. Your program might crash, it might act strangely, or it might work perfectly fine for 10 years and then start crashing.

Newer languages came along, and the designers said: people make mistakes, so the language shouldn’t just allow them, but help people see their errors.

So in a newer language, if you type a[16] = 0; before it sets the value, it checks if position 16 is even valid. If not, it throws an error. Otherwise it sets the value. This check takes time, so people who like high performance have stuck with C and C++.
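You can actually opt into the same kind of check in C++ itself. A small sketch, assuming you use std::array and its at() member instead of a raw array (`checked_store` is a made-up helper for illustration):

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: try to store `value` at `index`, reporting failure
// instead of silently writing outside the array.
bool checked_store(std::array<int, 10>& a, std::size_t index, int value) {
    try {
        a.at(index) = value;  // at() validates the index before the write
        return true;
    } catch (const std::out_of_range&) {
        return false;         // index was outside 0..9, nothing was written
    }
}
```

With this, storing at position 3 succeeds and storing at position 16 is rejected. Plain `a[16] = 0;` skips the check, which is the faster and riskier default that C++ gives you.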

The other issue is that sometimes the programmer doesn’t know how much memory is needed. For example, if you are creating an image editor, you don’t know the size of the image the user will open. So you need to wait until they open it, look at the width and the height, and then request enough memory to hold the image. Something like:

auto image = new Pixel[width * height];

and then when the programmer is done with the memory:

delete[] image;

Again, it’s up to the programmer to not ask for a pixel at a position outside of the image. Or to ask for a pixel after they’ve already deleted the image.
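For comparison, here is one way to reduce both risks without leaving C++. This is only a sketch under a few assumptions: C++17, a made-up `Image` class, and an RGBA layout for `Pixel` that the original snippet doesn’t specify. The buffer lives in a std::vector, so there is no delete[] to forget, and every access checks the coordinates first.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

struct Pixel { unsigned char r, g, b, a; };  // assumed RGBA layout

class Image {
public:
    Image(std::size_t width, std::size_t height)
        : width_(width), height_(height), pixels_(width * height) {}

    // Returns the pixel, or nothing if (x, y) lies outside the image.
    std::optional<Pixel> get(std::size_t x, std::size_t y) const {
        if (x >= width_ || y >= height_) return std::nullopt;
        return pixels_[y * width_ + x];
    }

    // Writes the pixel and returns true, or returns false if out of range.
    bool set(std::size_t x, std::size_t y, Pixel p) {
        if (x >= width_ || y >= height_) return false;
        pixels_[y * width_ + x] = p;
        return true;
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<Pixel> pixels_;  // released automatically with the Image
};
```

The catch is the same one as before: nothing in the language forces you to write it this way, and the raw new/delete[] version still compiles without complaint.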

Go takes a different approach. The programmer still has to ask for the data for the image, but they don’t have to worry about deleting it later. Instead, the language’s runtime periodically looks at all the memory being used, works out whether the image data can still be reached by the program, and deletes it for the programmer if it can’t. The downside here is that this checking takes time, and the program will be using memory it no longer needs until Go notices it can free it.

Most programmers use safer languages now. They are a little slower, but that’s not an issue for most things, other than maybe operating systems and video games.

So Rust came along and took a different approach. Its goal is to be fast and safe, so it does most of its checking when the program is compiled rather than while it runs. Also, if you are going to write an operating system, you need to be able to poke around in memory in weird and unsafe ways. So it has unsafe blocks: you can write all your weird and crazy code in there and do anything. The rest of the program will be safe. The advantage here is that if you do have a memory-related crash, you can start by looking in the few unsafe spots and checking them carefully, compared to C++ where the cause could be anywhere.

The “compiler” is a program that converts a language that’s relatively easy for people to read like Go/Rust/C++ into a language that the computer reads, which is called “assembly” or “machine code”. (There can be some other steps in the process but we’ll stick with the simplest answer.)

The C++ compiler doesn’t enforce a lot of rules. You can write code that is very obviously flawed and it won’t complain. This is a blessing and a curse. It means very clever C++ developers can do ridiculous things and get performance benefits from it. It also means developers who aren’t as clever as they think can create serious problems.

Rust and Go have more “discipline”. They have rules about how you are supposed to use memory. If your code breaks those rules, the compilers refuse to convert your code to machine language. (Rust does most of this at compile time; Go leans more on checks while the program runs and on its garbage collector.) Sometimes the bad stuff happens in C++ because the code makes the status of some memory ambiguous. A stricter compiler’s response to ambiguous memory use is to treat it like an error until you remove the ambiguity and prove you’re following the rules.

Imagine you’re playing the Shell Game with someone. This is the game where a ball or some small object is placed below one cup, and several identical cups are placed next to it. Then, the person running the game slides the cups around to try to disguise where the ball is. If you pick the cup with the ball, you win.

C++ is like a version of that game where anything goes. The person might spin the table around and, while you’re not looking, take the ball away so there’s no way to win.

Rust and Go are like a version where the person running the game can’t leave, can only make 5 swaps, and you also win if he’s sneaky and takes the ball away. Under these rules, you’re much more likely to win because the game’s operator can’t do cheaty things.

So why isn’t C++ more like Rust and Go? Well, history.

First, if people are very careful they can write code that’s just as “safe” in C++. It just takes an awful lot of care.

Second, when C++ was being developed, we didn’t know as much about what isn’t “safe”. Computer Science is a pretty young field, roughly 60-80 years old by certain definitions. We had to make a lot of mistakes to learn what mistakes look like.

Third, compilers are very complex and can take up a lot of computing power. The computers we have today are hundreds of thousands of times more powerful than they were when C++ compilers were developed. It may not have been possible to implement the features Rust and Go enforce *and* maintain feasible compilation times.

We could update C++ compilers today, but it’s usually hard and dangerous to dramatically change how such an old compiler works. There is a LOT of C++ code written that might stop working if the compiler suddenly required everything to be safe. That could be really bad, since that code probably does a lot of important things and the cost of reworking it to be “safe” could be immense. There are tools and compilers that do perform more rigorous checks, and I’m sure people use them. But if a C++ compiler suddenly *required* it, that would be a very unpopular change and people would refuse to use that compiler.

Languages like C++ and C allow you to handle memory allocation yourself.

This means you have the power of managing when an object gets created in memory and when it gets destroyed. It’s a wonderful thing, since it gives you complete freedom. It’s also a curse, because you must make sure that you know what you’re doing.

If you allocate memory for an object and you never free it, you create a **memory leak**. If you do this repeatedly (even worse: in a loop) you can easily hog memory until the program crashes, possibly taking the rest of the system down with it.
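A small sketch of what “never free it” means in practice, assuming C++17. `Tracked` is a made-up type that counts how many of its objects are alive, so you can see whether cleanup happened.

```cpp
#include <memory>

struct Tracked {
    static inline int live = 0;  // number of Tracked objects currently alive
    Tracked() { ++live; }
    ~Tracked() { --live; }
};

// Manual management: every `new` needs a matching `delete`.
int live_after_manual_cleanup() {
    Tracked* t = new Tracked;
    delete t;              // leave this line out and the object is leaked
    return Tracked::live;
}

// Scope-bound ownership: unique_ptr's destructor calls delete for us.
int live_after_scoped_cleanup() {
    {
        auto t = std::make_unique<Tracked>();
    }                      // t goes out of scope here, object destroyed
    return Tracked::live;
}
```

Both functions leave the counter where they found it. The difference is that the first one only does so because somebody remembered the `delete`.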

If you free memory for an object, but you keep pointing to its old location in memory, you create a **dangling pointer**. Trying to access that means crashing the program, in the best of cases. Trying to deallocate the object a second time results in a **double free** – again, with dire consequences.
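A common defensive habit for both problems is to clear the pointer right after freeing it. A minimal sketch (`destroy` is a made-up helper name):

```cpp
// Hypothetical helper: free the object and clear the caller's pointer
// so that this particular pointer can no longer dangle.
template <typename T>
void destroy(T*& ptr) {
    delete ptr;     // deleting a null pointer is a no-op, so a repeat call is harmless
    ptr = nullptr;  // the stale address is gone; later code can test for null
}
```

Calling `destroy(p)` twice in a row is fine, because the second call sees a null pointer. It only protects the one pointer you pass in, though. Any other copy of the old address is still dangling, and the language won’t tell you.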

If you allocate a certain block of memory for an object, but you keep reading or writing past its end, you end up accessing memory that you shouldn’t. This can result in more than crashing your program: you can pave the way for a vulnerability to be exploited by a malicious user (a **buffer overflow** attack). In C++, and even more in C, this is dangerously easy to do simply by operating on strings and forgetting to put a terminator character at the end.
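Here is roughly what “remembering the terminator” looks like when you do it by hand. This is a sketch, with `copy_terminated` being a made-up helper rather than a standard function: it never writes more than the destination can hold and always ends the result with '\0'.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `src` into `dst` (capacity `dst_size` bytes),
// truncating if needed and always writing the terminating '\0'.
// Returns false if the text had to be truncated or dst_size is 0.
bool copy_terminated(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;          // no room even for the terminator
    std::size_t len = std::strlen(src);
    bool fits = len < dst_size;               // need one extra byte for '\0'
    std::size_t n = fits ? len : dst_size - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';                            // the step that is easy to forget
    return fits;
}
```

In everyday C++ you would normally reach for std::string and skip this entirely. The point is that with raw char buffers, every one of these steps is your job.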

Higher-level languages don’t allow you to do this kind of damage. They employ boundary checks to avoid buffer overflows.

They might use a **garbage collector** – you can see it like a little program inside your program, constantly overseeing memory and automatically freeing memory when it’s no longer used.
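C++ has no built-in garbage collector, so I can’t show one directly. The closest thing in the standard library is reference counting with std::shared_ptr, which is a different technique (it frees the object the moment the last owner lets go, instead of scanning memory every so often), but it gives a feel for “freed automatically when no longer used”. A sketch, with the `Pixel` layout and image size being assumptions:

```cpp
#include <memory>
#include <vector>

struct Pixel { unsigned char r, g, b, a; };  // assumed RGBA layout
using ImageData = std::vector<Pixel>;

// Returns true if the data stayed alive while it had an owner
// and was released once the last owner let go.
bool released_after_last_owner_drops() {
    auto image = std::make_shared<ImageData>(640 * 480);
    std::weak_ptr<ImageData> observer = image;  // watches without owning

    auto second_owner = image;   // two owners now share the data
    image.reset();               // first owner lets go; data still alive
    bool alive_with_one_owner = !observer.expired();

    second_owner.reset();        // last owner lets go; data is freed here
    return alive_with_one_owner && observer.expired();
}
```

Nobody calls delete anywhere in that function. A real garbage collector goes further, since it can also clean up objects that point at each other in a cycle, which plain reference counting cannot.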

In short, they take away the power to manage memory yourself – depending on the case, this can come at a cost in terms of performance. But in exchange they give you the safety of not shooting yourself in the foot.

Thank you for all the explanations. While I’m not a complete noob in programming, I have never worked seriously with C/C++, and it was new to me that not having GC and built-in tools for memory safety checks might be considered a feature for performance’s sake. It’s just that there’s so much noise around Rust being such a great memory-safe language and bashing C++ for its unsafety and complexity; now I understand that it doesn’t come for free. I guess if you’re building some low-level infrastructure tool where performance is essential, like Envoy, you might choose C++ knowing it’s still worth it despite all the troubles you face during development/maintenance.