Why is it so difficult to copy source code that is not “open source”?

It’s been on my mind: when we use the software, programs, or even hardware of a tech company, we can play around with them, install and uninstall them, and more. So why is it so difficult for someone to “unhide” the source code that the device uses? Technically the code is hidden somewhere in the device, so it’s there, but it’s still almost impossible to obtain the source code. How do they achieve this so that no one copies their code?

42 Answers

Anonymous 0 Comments

it’s not that

> no one copies their code

it’s just difficult. a computer operates in binary, so the code it actually runs (called machine code) is just a loooong stream of 1s and 0s. the first step in converting it back to a readable form is disassembly, which turns the raw bytes into assembly instructions. the most difficult step is converting that assembly dump into a higher-level language, with functions, procedures etc. the compiler normally throws away names, comments and most of the original structure, so you have to wrap your head around it all and (in a way) rediscover how to put all this assembly mess back together into a coherent high-level form. this process is called [reverse engineering](https://en.wikipedia.org/wiki/Reverse_engineering), meaning working out the recipe/algorithm/way something works from the finished piece of code (or device, because hardware reverse engineering is also a thing). take a look at this:

```
cli
push cs
pop ds
mov ax, word ptr [0x4c]
mov word ptr [0x7cd3], ax
mov ax, word ptr [0x4e]
mov word ptr [0x7cd5], ax
mov al, byte ptr [0x46e]
mov byte ptr [0x7dbd], al
mov ax, word ptr [0x413]
dec ax
mov word ptr [0x413], ax
mov cl, 6
shl ax, cl
sub ax, 0x7c0
mov word ptr [0x4e], ax
mov word ptr [0x26], ax
mov word ptr [0x4c], 0x7c82
mov word ptr [0x24], 0x7d62
mov si, 0x7c00
mov di, si
mov es, ax
mov cx, 0x100
cld
rep movsw word ptr es:[di], word ptr [si]
int 0x19
cmp ah, 0xaa
jne 0x4a
iret
cmp ah, 2
jne 0x94
cmp cx, 1
jne 0x94
cmp dh, 0
jne 0x94
push ax
push bx
push si
push di
pushf
lcall cs:[0x7cd3]
jae 0x67
jmp 0x9d
cmp word ptr es:[bx + 0x1fe], 0xaa55
je 0x72
jmp 0x99
cmp byte ptr es:[bx + 0x1bc], 0xc9
je 0xf4
call 0x108
call 0xb1
mov si, bx
cmp dl, 0x79
ja 0xa5
add si, 2
mov di, 0x7c02
mov cx, 0x1e
xor dh, dh
jmp 0xc4
ljmp 0xf000:0xb648
mov ax, 1
clc
pop di
pop si
pop bx
inc sp
inc sp
retf 2
add si, 0x1be
mov di, 0x7dbe
mov cx, 0x20
jmp 0xc4
mov ax, 0x301
pushf
lcall cs:[0x7cd3]
jae 0xc3
pop bx
mov cl, 1
xor dh, dh
jmp 0x99
ret
push ds
push es
pop ds
push cs
pop es
cld
rep movsw word ptr es:[di], word ptr [si]
mov cx, 1
mov bx, 0x7c00
mov ax, 0x301
pushf
lcall cs:[0x7cd3]
jb 0xcc
push ds
pop es
inc byte ptr cs:[0x7dbd]
pop ds
jmp 0x99
add ax, 0x714
sbb al, 1
or al, 0x75
push ss
sbb ax, 0x1610
push ds
```

this is the result of disassembling some 400-something bytes of binary machine code (actually, a boot sector virus). you’d have to understand how it all works, and precisely what each line does, in order to reconstruct it in a higher-level language. now imagine doing the same with code that is several megabytes in size, or maybe more. of course, there are tools that help, but software reverse engineering is still one of the most hardcore things you can do in IT: the most difficult and the most mundane at the same time. the example above is only four hundred-something bytes!
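
to make the very first step (raw bytes to assembly) a bit more concrete, here is a minimal python sketch of a table-driven disassembler. it only knows a handful of 16-bit x86 opcodes, and the `SAMPLE` bytes are my own hand-assembled guess at how the first thirteen lines of the listing would be encoded, not bytes taken from the actual binary. the names `TABLE`, `fmt`, `disassemble` and `SAMPLE` are made up for illustration.

```python
# Toy disassembler: maps a few 16-bit x86 opcodes to text.
# Each entry is opcode -> (template, number of operand bytes after the opcode).
TABLE = {
    0xFA: ("cli", 0),
    0x0E: ("push cs", 0),
    0x1F: ("pop ds", 0),
    0x48: ("dec ax", 0),
    0xA0: ("mov al, byte ptr [{}]", 2),   # operand: 16-bit memory offset
    0xA1: ("mov ax, word ptr [{}]", 2),
    0xA2: ("mov byte ptr [{}], al", 2),
    0xA3: ("mov word ptr [{}], ax", 2),
    0xB1: ("mov cl, {}", 1),              # operand: 8-bit immediate
}


def fmt(value):
    """Print small numbers in decimal and the rest in hex, like the listing."""
    return str(value) if value < 10 else hex(value)


def disassemble(code):
    """Return one text line per decoded instruction (or per unknown byte)."""
    lines = []
    i = 0
    while i < len(code):
        template, size = TABLE.get(code[i], (None, 0))
        operand = code[i + 1 : i + 1 + size]
        if template is None or len(operand) < size:
            # Unknown opcode or truncated operand: emit the raw byte and move on.
            lines.append(f"db {code[i]:#04x}")
            i += 1
            continue
        if size:
            # x86 stores multi-byte operands in little-endian order.
            lines.append(template.format(fmt(int.from_bytes(operand, "little"))))
        else:
            lines.append(template)
        i += 1 + size
    return lines


# ASSUMPTION: hand-assembled encoding of the first 13 lines of the listing.
SAMPLE = bytes.fromhex(
    "fa 0e 1f a14c00 a3d37c a14e00 a3d57c a06e04 a2bd7d a11304 48 a31304 b106"
)

for line in disassemble(SAMPLE):
    print(line)
```

note that even this step is only a mechanical table lookup. a real disassembler has to deal with far more encodings, and it still only hands you assembly; working out what the code means is left to the person reading it.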