Why is it so difficult to copy source code that is not “open source”?


I’ve been wondering: if we use a tech company’s software or hardware, we can play around with it, install and uninstall it, and more. So why is it so difficult for someone to “unhide” the source code that the device uses? Technically the code is hidden somewhere in the device, so it’s there, yet it’s almost impossible to obtain the source code. How do companies achieve this so that no one copies their code?


42 Answers

Anonymous 0 Comments

I always liked the cake analogy.

With open source software, you’re downloading all the ingredients and the recipe to put it all together into a cake. So if you want to know exactly how the cake is made, you can dig around and look at the ingredients list and how they all mix together.

With closed source software, all you’re really downloading is the cake. There’s no way to deconstruct it into the basic elements – you might be able to guess some of it but you won’t know exactly how it all blends and bakes together into that final tasty delicious dessert.

Anonymous 0 Comments

Well, even at the source code step we can use a technique called “obfuscation”. Basically, it makes the code harder to understand by removing whitespace and changing variable and function names to meaningless ones. Then, if the code is compiled, decompiling hardly gives you more than machine code, which is kinda hard to navigate; some programs will try to recreate the source code from there, but the result is far from human-friendly. So it’s never really impossible, it’s mostly very, very inconvenient.

For example, for the obfuscation thing:

`showMessage(getText('askName'));`
`var name = getUserInput();`
`showMessage(getText('helloName', name));`

becomes:

`f(g('askName'));var a=h();f(g('helloName', a));`
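To make the point a bit more concrete, here is a rough runnable sketch of the same idea. The helpers (`getText`, the `texts` table, the `run` function and the fake user input "Ada") are all made up for illustration, since the snippet above doesn't define them. It only shows that the readable and the renamed version behave identically, even though one is much harder to read.

```javascript
// Hypothetical stand-ins for the helpers the snippet above calls.
const texts = { askName: "What is your name?", helloName: "Hello, {0}!" };
const getText = (key, arg) => texts[key].replace("{0}", arg);

// Readable version: the names tell you what is going on.
function readable(showMessage, getUserInput) {
  showMessage(getText("askName"));
  const name = getUserInput();
  showMessage(getText("helloName", name));
}

// Same logic after renaming: f = showMessage, h = getUserInput, g = getText.
const g = getText;
function obfuscated(f, h) { f(g("askName")); const a = h(); f(g("helloName", a)); }

// Run a program with a fake display and fake input, and collect what it shows.
function run(program) {
  const shown = [];
  program((msg) => shown.push(msg), () => "Ada");
  return shown;
}

const readableOutput = run(readable);
const obfuscatedOutput = run(obfuscated);
console.log(readableOutput);
console.log(obfuscatedOutput);
```

Both calls log the same two messages. Real obfuscators go a lot further than renaming, but the principle is the same: the behaviour is untouched, only the hints for humans are gone.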

Anonymous 0 Comments

>Technically the code is in the device somewhere hidden in it, so it’s there, but still, it’s almost impossible to obtain the source code.

The source code will not usually exist in the final product. Most software is compiled, meaning that new instructions are created using the source code as a guide. The compiled program “does what the source code looks like it does,” more or less, but it will generally be structured in a way that is much less friendly to humans (but friendlier to computers, which is the point of doing it in the first place).
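As a very simplified sketch of what “new instructions are created using the source code as a guide” can look like, here is a toy compiler and a toy machine. Everything in it (the tiny syntax tree, the `LOAD`/`ADD`/`MUL` instructions, the numbered slots) is invented for illustration and is far simpler than any real compiler. The thing to notice is that the names `price`, `quantity` and `shipping` never make it into the compiled program.

```javascript
// Toy "source code": price * quantity + shipping, as a small syntax tree.
const source = {
  op: "+",
  left: { op: "*", left: { name: "price" }, right: { name: "quantity" } },
  right: { name: "shipping" },
};

// Toy compiler: emits stack-machine instructions and replaces each
// variable name with a numbered slot, roughly like real compilers
// replace names with addresses.
function compile(node, slots = new Map(), out = []) {
  if (node.name !== undefined) {
    if (!slots.has(node.name)) slots.set(node.name, slots.size);
    out.push(`LOAD ${slots.get(node.name)}`);
  } else {
    compile(node.left, slots, out);
    compile(node.right, slots, out);
    out.push(node.op === "+" ? "ADD" : "MUL");
  }
  return out;
}

// Toy machine: runs the instructions against numbered memory slots.
function execute(program, memory) {
  const stack = [];
  for (const instruction of program) {
    const [opcode, arg] = instruction.split(" ");
    if (opcode === "LOAD") {
      stack.push(memory[Number(arg)]);
    } else {
      const b = stack.pop();
      const a = stack.pop();
      stack.push(opcode === "ADD" ? a + b : a * b);
    }
  }
  return stack.pop();
}

const program = compile(source);
console.log(program.join("; "));
const result = execute(program, [4, 3, 5]); // slot 0 = price, 1 = quantity, 2 = shipping
console.log(result);
```

The machine happily computes the right answer from `LOAD 0; LOAD 1; MUL; LOAD 2; ADD`, but someone who only has those instructions has to guess what slots 0, 1 and 2 were supposed to mean.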

Anonymous 0 Comments

Because the owner of the IP is trying to make money from the use of their software. If you hack it or reverse engineer the code, you are stealing proprietary knowledge. Depending on where you live, that could mean a jail sentence or, if you live in China, a promotion and a new apartment.

Anonymous 0 Comments

Source code is the code at the origin of the program. This source code is “compiled” (turned into an executable file in the language that the computer uses to function) and then released to the public. It is difficult to reverse the process of compiling (this is called decompiling), and it takes a lot of time to get something useful out of the result.

When the source is “open”, it means that the source code is available for you to use, modify, then compile to fit your needs or re-release under another name (licensing sometimes prevents this, but you get the idea).

This is the world of computers, so decompiling and then copying or examining the code is possible, but for closed-source software you usually only see it done on very old projects, because it takes so much time to work with.

Let me know if you have any more questions.

Anonymous 0 Comments

My understanding

Programming languages give us words and formatting so we can read and understand the code, with files, comments, and organisation tools.

That code is then compiled down to assembly, the low-level instructions that the language is built on top of. It reduces the code to its most basic operations while still being somewhat readable to humans, but all the language features and variable names are converted to symbols and addresses without any explanations.

This is then converted into machine code, which is what the machine actually reads. This is basically nonsense to humans.

Companies ship machine code, as only your computer needs to read it.

You can actually convert it back into assembly pretty easily, since assembly is just a human-readable notation for machine code.

After that, if you had 1000s of lines with 1000s of variables, it could take you and a team AGES to read, interpret, rewrite, and recreate the code through this method. And that’s assuming the engineers involved put in no effort to prevent this from happening.

They can, for example, encrypt the machine code, add layers of dummy variables and logic, and build in verification checks.

Anonymous 0 Comments

You’re correct that the ‘code is in the device’, since the device needs to know how to execute the software.

However, it’s incorrect that the original ‘source code’ is in the device. Programmers use source code to create abstractions and build complex software out of those abstractions. These abstractions are flattened away when the code is converted to something the machine can execute.

So it’s difficult because it takes a considerable amount of time, skill, and effort to examine the executable and reconstruct a natural understanding of what the code is doing. It’s not impossible, but it’s very time-consuming. The process is known as ‘reverse engineering’ and it has been done in the past. If your competitor is desperate to know what your code is doing, they can invest effort into finding out this way. Or they can just try their own ideas on how to solve the problem, which is usually a more productive use of time.

Anonymous 0 Comments

Source code and executable code look very different. Source code is readable to a human while executable code is readable by the computer. We “translate” source code into machine code (compiling). There are ways to translate back, but you lose some meaning.

Try running a sentence through Google Translate through, like, 10 languages and then back to English. You may get the general idea, but you’ll lose some nuance or might even get gibberish back.

Anonymous 0 Comments

The program you have is not the source code. It’s an inscrutable collection of data that’s unlabelled and compact. Source code is the actual code of a program that the developers are working with.

If we use a car analogy, the program is the car. The source code is the car factory.

Anonymous 0 Comments

An analogy can be made with, say, a movie. The difference between final product and source code is like the difference between having:

(A)
– A warehouse room full of props and costumes used, catalogued as to what was used in which scene
– The raw footage of the many takes of scenes from different camera angles, only a fraction of which made the final edit
– A box full of copies of the script, some with stage directions written in them, some with camera directions written in them, some with lighting directions written in them, etc.

versus

(B) A 90 minute video file.

If you just want to watch a movie, you only need (B). But if you want to *create* a movie that is a slightly altered version of (B), then having all that stuff in (A) is needed or else you’re totally starting from scratch.

It’s a one-way process going from (A) to (B) because there are multiple different hypothetical (A)’s that could have given the same resulting (B). So if all you have is (B), you can’t use it to generate the exact (A) it came from. (You can *guess*, and generate one *possible* (A) it might have come from, and there are programs called “decompilers” that do that, but the result is horrible to read and very hard to work with.)
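Here is a small sketch of the “multiple (A)’s give the same (B)” point. The `toyCompile` function below is a made-up stand-in, not a real compiler: it just throws away comments and whitespace and replaces every name with a numbered placeholder, which is roughly the kind of information a real compiler discards. The two source strings are invented examples as well.

```javascript
// Toy stand-in for a compiler: drops comments, replaces every identifier
// with a numbered placeholder (in order of first appearance), and
// removes whitespace.
function toyCompile(source) {
  const names = new Map();
  return source
    .replace(/\/\/.*$/gm, "") // comments never reach the output
    .replace(/[A-Za-z_]\w*/g, (name) => {
      if (!names.has(name)) names.set(name, `v${names.size}`);
      return names.get(name);
    })
    .replace(/\s+/g, ""); // layout never reaches the output either
}

// Two different "(A)"s: one carefully written, one quick and terse.
const sourceA = "// Add sales tax to an order\ntotal = price * quantity + tax";
const sourceB = "t = p * q + x  // quick version";

const compiledA = toyCompile(sourceA);
const compiledB = toyCompile(sourceB);
console.log(compiledA);
console.log(compiledB);
console.log(compiledA === compiledB);
```

Both sources end up as the exact same output, so given only that output there is no way to tell which one it came from, or whether the variable was ever called `price` at all.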