Shallow Copy and Deep Copy in C++

What are the differences between the two and when should you worry about creating a deep copy?

4 Answers

Anonymous

Typed languages like C++ typically have two varieties of type:

* Value types, which actually store information directly. An int is a value type in most languages.

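* Reference types, which store a *reference* to information. In languages like Java and C#, instances of a class are reference types. In C++, class instances are values by default, and you get reference behaviour when you hold them through pointers (or references).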

If you imagine memory as a library, a value type is a thing actually written down on a piece of paper, while a reference type tells you *where the information is*. For example, if Book is a reference type and you create a Book variable called *myBook*, then *myBook* does not directly store the contents of a Book: it stores *where the book is located*. The contents of *myBook* are not “Call me Ishmael…”, they’re “shelf 14, 15th book from the left”.

——

Now, suppose I have an array of Books. This is just a bunch of Book references, meaning that in memory the array looks like [shelf 14, 15th book from the left; shelf 19, 3rd book from the left; shelf 98, 721st book from the left; …] and so on.

What does it mean to *copy* this array? Well:

In a *shallow copy*, we create a new array, and fill it with *the same references*. Those references refer to the same Books as in the original array, and so any change to those Books – even if they’re accessed via the first array – is visible through both arrays. For example, in pseudocode:

Book myBook = new Book(contents = “This is the text of the first book.”)
Book myOtherBook = new Book(contents = “This is the text of the second book.”)
Book[] myArray = [myBook, myOtherBook]
Book[] newArray = myArray.ShallowCopy()

The variables myArray and newArray both contain the same references that are stored in the variables myBook and myOtherBook. If we then continue with:

myArray[0].contents = “These are new book contents”

We’ve accessed an underlying object that is referred to by all of myBook, myArray[0], and newArray[0]. This change affects the underlying object, not the references, and so myBook, myArray[0], and newArray[0] all point to the changed book. If I run, for example:

print(newArray[0].contents)
/// outputs “These are new book contents”

…we see that newArray[0] (or more properly, the object that newArray[0] points to) has been changed even though we didn’t ever directly refer to this object using newArray. We changed the book on the shelf, and newArray[0] still points to the same spot on the shelf. This is why we call it a *shallow* copy: we copy the reference, but not what’s “beneath” it.

In a *deep copy*, on the other hand, we clone each of the underlying books, create new references to these new books, and create a new array of *those* references. For example:

Book myBook = new Book(contents = “This is the text of the first book”)
Book myOtherBook = new Book(contents = “Text of the second book”)
Book[] myArray = [myBook, myOtherBook]
Book[] newArray = myArray.DeepCopy()

In this case, myBook and myArray[0] still hold the same reference, but newArray[0] holds a reference to a different location. Behind the scenes, there are now **four** Book objects: the two originals (one referred to by myBook and myArray[0], the other by myOtherBook and myArray[1]) and two clones, referred to by newArray[0] and newArray[1]. Since myArray[0] and newArray[0] no longer point to the same place, changes made through myArray[0] don’t affect newArray[0]: they point to two different Books, and we only changed one of them.
