How do you “reverse engineer” something?


22 Answers

Anonymous 0 Comments

You carefully move through an environment and take stupidly over-cataloged and detailed notes about every aspect of it that can possibly be aligned to a name, taxonomy, or even concept.

Then, you go sit in a corner with all your information and try to figure out every pattern you recognize and can attribute to a task. At that point, you have parts from which to build a puzzle.

Now, your problem is that you do not know what the puzzle is or how it should look. This is generally the most entertaining/frustrating part because you have to try every combination you can calculate to figure out which ones even WORK, constantly reducing the number of pieces until the only available function that remains is X (the sum defined by its parts).
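
To make that elimination step concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the observations stand in for notes you took while watching the system, and the candidate functions are guesses about what it might be computing. The idea is just to throw out every guess that contradicts even one observation and see what survives.

```python
# Input/output pairs recorded from the system under study (hypothetical data).
observations = [(0, 1), (1, 3), (2, 5), (5, 11)]

# Candidate explanations for what the system computes (all hypothetical guesses).
candidates = {
    "double": lambda x: 2 * x,
    "square_plus_one": lambda x: x * x + 1,
    "double_plus_one": lambda x: 2 * x + 1,
    "increment": lambda x: x + 1,
}


def surviving(candidates, observations):
    """Keep only the candidates that agree with every observation."""
    return {
        name: fn
        for name, fn in candidates.items()
        if all(fn(x) == y for x, y in observations)
    }


remaining = surviving(candidates, observations)
print(sorted(remaining))
```

With these particular observations only one guess is left standing. In practice the candidate list is far bigger and much messier, but the loop has the same shape.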

There are as many methods, practices, and religions on this stuff as there are grains of sand on a given beach. And entire industries have spun up to support the heavy lifting of things like rapid pattern identification and/or even just points of data handoff from which logically valid functions can be derived.

Ultimately, you can only reverse engineer something to the extent that you could engineer it in the first place. Any mimicry is obvious and all imitation, subordinate. Generally.

The domain of forensic analysis has presence in technology, but not usually at this level. No one much cares beyond being able to identify “their” source for purposes of litigation, which is why you don’t see 1,000 flavors of ‘reverse engineer guru 1.0’ out there. (There are some enterprise offerings that get closer than you might think… and I can see MS finally had the light bulb go off in this area, too… so I suspect interesting times ahead.)

Anonymous 0 Comments

There are a bunch of great answers here focused on physical engineering, so I’ll take a second to give you one focused on software…

First off, reverse engineering generally relies on having some baseline, reasonably comprehensive idea of what a piece of software is *doing*. If you don’t know what it does, figuring out how it does it is going to be like throwing darts blindfolded while so drunk that you aren’t even sure you’re in the same city as the board. If you don’t already know what a piece of software does, your first step is to just poke at it a lot as a user and see if you can find all the ways it behaves.
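
As a rough illustration of "poke at it a lot", here is a small Python sketch. The `mystery` function is a hypothetical stand-in for the software you are studying (pretend you cannot read its body), and `probe` is a helper invented for this example. It feeds the target a mix of ordinary and awkward inputs and writes down what comes back, including the ways it fails.

```python
def mystery(text):
    # Stand-in for the software under study; pretend this body is hidden.
    return text.strip().title()


def probe(target, inputs):
    """Call target on each input and record the result or the exception type."""
    log = {}
    for value in inputs:
        try:
            log[repr(value)] = repr(target(value))
        except Exception as exc:
            log[repr(value)] = f"raised {type(exc).__name__}"
    return log


behaviour = probe(mystery, ["hello world", "  padded  ", "", "ALL CAPS", 42, None])
for given, seen in behaviour.items():
    print(f"{given:>16} -> {seen}")
```

The failures are often as informative as the successes: an input that blows up tells you something about what the code assumes it will be handed.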

Second, once you have this knowledge, you can usually draw on your own domain experience to make some very good guesses about how it works. Not the deep dark details, but the broad strokes are usually pretty familiar. At this stage, most of the surprises tend to be “oh, they did something surprisingly dumb and I expected them to be smarter”, but you also aren’t really getting to the juicy bits yet. If you’re really experienced in the domain, you can probably get a solid 70-80% of how a thing works just by guessing.

The remaining 20-30% is where you will get stuck. Here’s where you break out tools like a disassembler or decompiler (shows you the code roughly as the CPU sees it, or a best-effort reconstruction of the source), a packet sniffer (shows you the packets as they cross the network), DTrace (tells you what the operating system thinks the program is doing), and whatever else is appropriate to the task. If your guesswork was good, you can be pretty narrowly focused on this stuff. It’s kind of like walking into a large field with a microscope so you can find a particular bacterium, but by knowing a lot about the bacterium’s habitat, you can set your microscope down in roughly the right spot and find what you’re looking for.
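
For a small taste of what the disassembler step looks like, Python ships a `dis` module that lists the bytecode the interpreter runs for a function. This is only an analogy: bytecode is not native machine code, and the `check` function below is a hypothetical target made up for the example. The workflow is similar, though. You look at the low-level instructions and the constants baked into the compiled code, and work backwards to the logic.

```python
import dis


def check(pin):
    # Hypothetical target: imagine only the compiled form were available.
    return pin * 3 + 7 == 4108


# List the bytecode instructions the interpreter executes for check().
instructions = list(dis.get_instructions(check))
for ins in instructions:
    print(ins.opname, ins.argrepr)

# Literal constants stored with the compiled code can give the logic away.
print(check.__code__.co_consts)
```

The exact instruction names differ between Python versions, so treat the listing as something to read rather than something to match against. Seeing a multiply, an add, and a comparison against 4108 is enough to work out which input passes the check without ever reading the source.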

Altogether, this is very tedious, very detailed, but often pretty fun work. The final outcome is rarely a sufficient understanding to reconstruct the software entirely from scratch (though this can indeed happen), but more likely a sufficient understanding to explain in detail how it works.

As a note, none of this is that different from normal debugging, which is a part of all software development. In a sense, all software developers practice reverse engineering their own code all the time, which is why particularly experienced developers don’t have much trouble reverse engineering code written by others. Years and years of practice pay off.