Shallow Copy and Deep Copy in C++

What are the differences between the two and when should you worry about creating a deep copy?

4 Answers

Anonymous 0 Comments

Typed languages like C++ typically have two varieties of type:

* Value types, which actually store information directly. An int is a value type in most languages.

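* Reference types, which store a *reference* to information. In languages such as Java and C#, instances of a class are almost always reference types; in C++ you get the same behavior whenever you hold an object through a pointer.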

If you imagine memory as a library, a value type is a thing actually written down on a piece of paper, while a reference type tells you *where the information is*. For example, if Book is a reference type and you create a Book variable called *myBook*, then *myBook* does not directly store the contents of a Book: it stores *where the book is located*. The contents of *myBook* are not “Call me Ishmael…”, they’re “shelf 14, 15th book from the left”.
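To make the library picture concrete in C++ terms, here is a minimal sketch. It assumes a made-up `Book` struct and uses a raw pointer to play the role of the "shelf location"; the function names are hypothetical and exist only for illustration.

```cpp
#include <string>

// Hypothetical type used only for this illustration.
struct Book {
    std::string contents;
};

// Copying a value type duplicates the data itself.
inline int copy_value_then_change() {
    int a = 1;
    int b = a;   // b gets its own copy of the number
    b = 2;       // changing b leaves a untouched
    return a;    // still 1
}

// Copying a pointer duplicates only the "shelf location".
inline std::string copy_pointer_then_change() {
    Book moby{"Call me Ishmael..."};
    Book* myBook = &moby;     // stores where the book is, not its text
    Book* sameBook = myBook;  // a second note pointing at the same shelf spot
    sameBook->contents = "Changed text";
    return myBook->contents;  // both pointers lead to the one changed Book
}
```

The first function returns 1 because the two ints are independent; the second returns the changed text because there was only ever one `Book`.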

——

Now, suppose I have an array of Books. This is just a bunch of Book references, meaning that in memory the array looks like [shelf 14 15th book from the left, shelf 19 3rd book from the left, shelf 98 721st book from the left…] and so on.

What does it mean to *copy* this array? Well:

In a *shallow copy*, we create a new array, and fill it with *the same references*. Those references refer to the same Books as in the original array, and so any change to those Books – even if they’re accessed via the first array – affects both. For example, in pseudocode:

Book myBook = new Book(contents = "This is the text of the first book.")
Book myOtherBook = new Book(contents = "This is the text of the second book.")
Book[] myArray = [myBook, myOtherBook]
Book[] newArray = myArray.ShallowCopy()

The variables myArray and newArray contain references, the same references that are stored in the variables myBook and myOtherBook. If we then continue with:

myArray[0].contents = "These are new book contents"

We’ve accessed an underlying object that is referred to by all of myBook, myArray[0], and newArray[0]. This change affects the underlying object, not the references, and so myBook, myArray[0], and newArray[0] all point to the changed book. If I run, for example:

print(newArray[0].contents)
// outputs "These are new book contents"

…we see that newArray[0] (or more properly, the object that newArray[0] points to) has been changed even though we didn’t ever directly refer to this object using newArray. We changed the book on the shelf, and newArray[0] still points to the same spot on the shelf. This is why we call it a *shallow* copy: we copy the reference, but not what’s “beneath” it.
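The pseudocode above could look roughly like this in C++. This is only a sketch: it assumes the books are held through `std::shared_ptr`, and the `Book` struct, the `Shelf` alias, and the function names are invented for the example.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical type used only for this illustration.
struct Book {
    std::string contents;
};

using Shelf = std::vector<std::shared_ptr<Book>>;

// Copies the pointers only; both vectors end up sharing the same Book objects.
inline Shelf shallow_copy(const Shelf& original) {
    return original;  // the vector copy duplicates each shared_ptr, not each Book
}

inline std::string shallow_copy_demo() {
    auto myBook = std::make_shared<Book>(Book{"This is the text of the first book."});
    auto myOtherBook = std::make_shared<Book>(Book{"This is the text of the second book."});
    Shelf myArray{myBook, myOtherBook};
    Shelf newArray = shallow_copy(myArray);

    // Change the book through the FIRST array...
    myArray[0]->contents = "These are new book contents";
    // ...and read it back through the SECOND one.
    return newArray[0]->contents;
}
```

`shallow_copy_demo()` returns the new contents, because `myBook`, `myArray[0]`, and `newArray[0]` all point at the same object.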

In a *deep copy*, on the other hand, we clone each of the underlying books, create new references to these new books, and create a new array of *those* references. For example:

Book myBook = new Book(contents = "This is the text of the first book")
Book myOtherBook = new Book(contents = "Text of the second book")
Book[] myArray = [myBook, myOtherBook]
Book[] newArray = myArray.DeepCopy()

In this case, the references stored in myBook and myArray[0] are still the same, but newArray[0] has a reference to a different location. Behind the scenes, there are now **four** book objects: the first two are referred to by myBook and myArray[0] and by myOtherBook and myArray[1] respectively, and the latter two are referred to by newArray[0] and newArray[1]. Since myArray[0] and newArray[0] don’t point to the same place anymore, changes made to myArray[0] don’t affect newArray[0] (because they’re pointing to two different Books and we only changed one of them).
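A C++ sketch of the deep-copy case, under the same assumptions as before (books held through `std::shared_ptr`; `Book`, `Shelf`, and the function names are made up for illustration):

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical type used only for this illustration.
struct Book {
    std::string contents;
};

using Shelf = std::vector<std::shared_ptr<Book>>;

// Clones every Book and stores pointers to the clones in a new vector.
inline Shelf deep_copy(const Shelf& original) {
    Shelf result;
    result.reserve(original.size());
    for (const auto& book : original) {
        if (book) {
            result.push_back(std::make_shared<Book>(*book));  // new Book, new pointer
        } else {
            result.push_back(nullptr);  // keep empty slots empty
        }
    }
    return result;
}

inline std::string deep_copy_demo() {
    auto myBook = std::make_shared<Book>(Book{"This is the text of the first book"});
    auto myOtherBook = std::make_shared<Book>(Book{"Text of the second book"});
    Shelf myArray{myBook, myOtherBook};
    Shelf newArray = deep_copy(myArray);

    // Only the original book changes; the clone keeps the old text.
    myArray[0]->contents = "These are new book contents";
    return newArray[0]->contents;
}
```

Here `deep_copy_demo()` returns the original text of the first book, since `newArray[0]` points at a separate clone.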

Anonymous 0 Comments

The difference shows up when the object you are copying contains references to other objects. A shallow copy copies only the references, so the new object still refers to the same objects as the old one. A deep copy also creates copies of the referenced objects. For example, if object A references object B, a shallow copy creates object A1, which references the same B, while a deep copy creates object A1, which references a new copy, B1.

A deep copy is naturally riskier: if the objects contain circular references, a naive deep copy can recurse forever, and even without cycles you can end up copying more objects than you intended.
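One common way to avoid the infinite recursion is to remember which objects have already been cloned. The sketch below is one possible approach, not the only one: `Node`, `Graph`, `clone_node`, and `deep_copy_graph` are hypothetical names, and it assumes the links between nodes are non-owning raw pointers while a separate `Graph` object owns the clones.

```cpp
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical node type; links are non-owning and may form cycles.
struct Node {
    int value = 0;
    std::vector<Node*> links;
};

// Owns every node produced by a deep copy.
struct Graph {
    std::vector<std::unique_ptr<Node>> nodes;
};

// Returns the clone of `src`, creating it on the first visit only.
inline Node* clone_node(const Node* src, Graph& out,
                        std::unordered_map<const Node*, Node*>& seen) {
    auto it = seen.find(src);
    if (it != seen.end()) {
        return it->second;  // already cloned: reuse it instead of recursing again
    }
    out.nodes.push_back(std::make_unique<Node>());
    Node* copy = out.nodes.back().get();
    copy->value = src->value;
    seen[src] = copy;  // record BEFORE following links so cycles terminate
    for (const Node* next : src->links) {
        copy->links.push_back(clone_node(next, out, seen));
    }
    return copy;
}

// Deep-copies everything reachable from `root` into `out`.
inline Node* deep_copy_graph(const Node* root, Graph& out) {
    std::unordered_map<const Node*, Node*> seen;
    return root ? clone_node(root, out, seen) : nullptr;
}
```

With two nodes that point at each other, this produces exactly two clones that point at each other. The recursion depth still grows with the length of the longest chain of links, so a very deep structure would probably want an explicit stack instead.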

All in all, whether you need a deep or a shallow copy depends on the context. The question is whether object A “owns” object B or merely references it. Logically, if A owns B then it should deep copy B, but if A just references B then it should copy only the reference.
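In C++ the ownership question often decides how you write the copy constructor. The following is a rough sketch with invented names (`B`, `OwningA`, `ReferencingA`), assuming the owner holds its `B` through `std::unique_ptr` and the non-owner holds a plain pointer.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical types used only for this illustration.
struct B {
    std::string data;
};

// OwningA owns its B, so copying an OwningA also clones the B (deep copy).
class OwningA {
public:
    explicit OwningA(std::string data)
        : b_(std::make_unique<B>(B{std::move(data)})) {}

    OwningA(const OwningA& other)
        : b_(std::make_unique<B>(*other.b_)) {}

    OwningA& operator=(const OwningA& other) {
        if (this != &other) {
            b_ = std::make_unique<B>(*other.b_);
        }
        return *this;
    }

    B& b() { return *b_; }

private:
    std::unique_ptr<B> b_;
};

// ReferencingA only points at a B owned elsewhere, so the compiler-generated
// copy, which just copies the pointer (shallow copy), is what we want.
struct ReferencingA {
    B* b;
};
```

Copying an `OwningA` gives A1 its own B1, so later changes to the original's B do not show up in the copy. Copying a `ReferencingA` gives A1 the same B, so a change made through either one is visible through both.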

Anonymous 0 Comments

Let’s say you have a bookshelf, and a list of all the titles/locations of the books on the shelf.

When you create a shallow copy, you’re copying the list of titles/locations of the books.

When you create a deep copy, you’re copying the list but ALSO the bookshelf and the books on it.

The former is quick and doesn’t require much space at all to make the copy. The latter is much more labor-intensive and needs lots more space for everything.

Which one you pick really depends on how much factors like speed, time, and available memory matter to you.

Anonymous 0 Comments

Think of memory like URLs and websites.

If you want to access some memory, the computer checks the URL, the URL brings you to a website, and the website contains the data the computer wants.

Shallow copy would be like copying the URLs. If you access the URL, you’ll get whatever data the URL brings you to. The website may or may not have changed.

Deep copy would be like going to the URL and copying the HTML of the website, and then saving that under a different URL. This way, even if the website at the original URL changes, the one you deep copied will remain the same as when you originally copied it.