Moving To a New Site

I have just decided that it's time to move on and have my own domain. All the posts in this blog will be moved to:

www.pinoygeek.org

Personal posts will be transferred to: raldz.pinoygeek.org

Sunday, October 16, 2005

Reverse Engineering Software

By g0df4th3r
--------------------------------------------------

Here is a short tutorial to explain some reverse-engineering techniques. In this tutorial, reverse engineering means studying a compiled program and modifying its data or code to make it do what you want. You MUST realise that reverse engineering can be used illegally, and doing so can land you in trouble and hurt both the economy and developers. It also has legal uses, though, for example:


Some wargames present challenges that require the modification of applications.

It helps you understand some of the very basics of assembly.

You may develop techniques useful in finding exploits, or exploiting them.

Greater debugging skills at a lower level, and learning about patching and similar.

This is why reverse-engineering should be learned: not to learn how to crack programs illegally, as this takes away profit from the software industry, and we want better programs and games. Let's now continue on with the tutorial.

The Program and the Objective

For a start, you have to know what we are going to reverse-engineer, don't you?

Here is the short program written in C++:



code:--------------------------------------------------------------------------------
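#include <iostream>

using std::cout;

int main() {
    int i;
    // Count up from 100 to 200, printing each number followed by a tab.
    for (i = 100; i <= 200; i++) {
        cout << i << "\t";
    }
    // i is 201 when the loop ends, so this test can never succeed as compiled.
    if (i == 0) {
        cout << "You won the challenge, congragulations.";
    }
    else {
        cout << "You failed to complete the challenge. Please try again.";
    }
}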
--------------------------------------------------------------------------------

This program, written in C++ for MS-DOS, is very simple. If I write any more tutorials on this subject, the programs will be more complicated and most likely Win32. Anyway, the program counts up from 100 to 200. Your objective is to make it count down to 0. The program must be compiled in its original form. So how can we change the compiled application to display the string "You won the challenge, congragulations." instead of the string "You failed to complete the challenge. Please try again."? With our tools this is in fact easy, and once you understand how to do it, it does not take long to figure out similar challenges.

You will, however, need some basic assembly to do these operations. You do not have to know how to write full-blown assembly applications (I can't really), but you do need to be able to read assembly code, in a pseudocode fashion. I am listing the important instructions below. Do not worry if you do not understand them yet, as I will explain what things do throughout the tutorial.

Assembly Rundown:

Here are some basic assembly instructions. They will help throughout the tutorial.
These notes are not designed to teach assembly, and are not explained in depth.

Jumps and Calls

je destination - Jump if equal
jne destination - Jump if not equal
jmp destination - Unconditional jump, always jumps to destination.
jl destination - Jump if less
jle destination - Jump if less or equal
ja destination - Jump if greater (above)
jae destination - Jump if greater (above) or equal
jnl destination - Jump if not less
jng destination - Jump if not greater
call destination - Calls a subroutine

Jumps are easy to remember. If you haven't noticed, the mnemonics are derived from abbreviated English.


Examples:

je - Jump if Equal
jae - Jump if Above or Equal
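
To make the jump list above a bit more concrete, here is a minimal C++ sketch of the condition each jump checks. The Flags struct and the jump_taken function are names I made up for illustration; the flag conditions themselves are the standard x86 ones, where the flags have been set by an earlier cmp or test.

```cpp
#include <string>

// Status flags that the conditional jumps inspect.
// (Hypothetical helper type, not part of any tool used in this tutorial.)
struct Flags {
    bool zf;  // zero flag: the last result was zero
    bool sf;  // sign flag: the last result was negative
    bool of;  // overflow flag: signed overflow occurred
    bool cf;  // carry flag: unsigned borrow or carry occurred
};

// Returns true when the named jump would be taken for the given flags.
// Mnemonics this sketch does not know about simply return false.
bool jump_taken(const std::string& mnemonic, const Flags& f) {
    if (mnemonic == "jmp") return true;                    // always
    if (mnemonic == "je")  return f.zf;                    // equal
    if (mnemonic == "jne") return !f.zf;                   // not equal
    if (mnemonic == "jl")  return f.sf != f.of;            // less (signed)
    if (mnemonic == "jle") return f.zf || (f.sf != f.of);  // less or equal (signed)
    if (mnemonic == "jnl") return f.sf == f.of;            // not less (signed)
    if (mnemonic == "jng") return f.zf || (f.sf != f.of);  // not greater (signed)
    if (mnemonic == "ja")  return !f.cf && !f.zf;          // above (unsigned)
    if (mnemonic == "jae") return !f.cf;                   // above or equal (unsigned)
    return false;
}
```

Note that jl, jle, jnl and jng treat the numbers as signed, while ja and jae treat them as unsigned.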



Stack operations

push src - Pushes data onto the stack (memory). Used for calls, passing data to subroutines.
pop dest - Takes data off the stack and puts it into a register.

The stack is last in, first out (LIFO), meaning the last piece of data you push in is the first piece of data you pop out.
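
If LIFO is new to you, this small C++ sketch shows the same behaviour with a std::vector standing in for the stack. The function name push_then_pop is just something I picked for the example; the real CPU stack works on memory and registers, not on vectors.

```cpp
#include <vector>

// Pushes the given values in order, then pops them all again,
// returning the order in which they come back off the stack.
std::vector<int> push_then_pop(const std::vector<int>& values) {
    std::vector<int> stack;  // back() plays the role of the top of the stack
    for (int v : values) {
        stack.push_back(v);  // push
    }
    std::vector<int> popped;
    while (!stack.empty()) {
        popped.push_back(stack.back());  // read the top item
        stack.pop_back();                // pop it
    }
    return popped;
}
```

Pushing 1, 2, 3 and then popping three times gives back 3, 2, 1.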


Math Operators

inc src - Increment by one (src++)
dec src - Decrement by one (src--)
add src, number - Adds the number to src, e.g. if src=1 and number is 3, src becomes 4.
sub src, number - Subtracts the number from src, e.g. if src=4 and number is 3, src becomes 1.


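Comparisons

cmp src, data - Compares src with data by subtracting data from src and setting the flags; the result itself is thrown away. For example, if src=1 and data=1, the two are equal, so a following je would jump.
test src, data - Another comparison operation. It ANDs src with data and sets the flags; useful for testing if a register is zero (test ebx, ebx).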
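
Here is a rough C++ sketch of what cmp and test leave behind for the jumps to look at, assuming 32-bit operands. The names Flags, cmp32 and test32 are mine and only exist for this example; the flag rules are the usual x86 ones (cmp subtracts, test ANDs, and neither stores the result).

```cpp
#include <cstdint>

// The four flags the jumps in this tutorial care about (hypothetical helper type).
struct Flags {
    bool zf;  // result was zero
    bool sf;  // top bit of the result was set (negative when read as signed)
    bool cf;  // unsigned borrow
    bool of;  // signed overflow
};

// cmp a, b: works out a - b, keeps only the flags.
Flags cmp32(std::uint32_t a, std::uint32_t b) {
    std::uint32_t r = a - b;  // wraps around modulo 2^32, like the CPU does
    Flags f;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    f.cf = (a < b);
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}

// test a, b: works out a AND b, keeps only the flags; CF and OF are cleared.
Flags test32(std::uint32_t a, std::uint32_t b) {
    std::uint32_t r = a & b;
    Flags f;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    f.cf = false;
    f.of = false;
    return f;
}
```

So test32(x, x) sets zf exactly when x is zero, which is why test ebx, ebx is a cheap way to ask "is ebx zero?".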


Others

nop - No Operation.

Studying The Program


Programs Needed

Although you can use many tools to get the task done, I will be writing this tutorial to work with these two tools under Windows.


W32Dasm

Hiew


Both can be found quite easily, and I believe both are freeware, or offer freeware packages.

If you are working under *nix, then you will have to learn how to use the appropriate programs on your system (like gdb).

Finding out what to change

To change this program, we need to study it. You could pull out a hex editor, or a cheap disassembler, and go reading through the assembly, but that would mean more work for you. So we will use W32Dasm, as it is great for studying the program. To change the program, we will use Hiew (Hacker's View), which is quite easy to use.

So, what do we need to change? In this tutorial we will be reverse-engineering this program in two different ways, whose output will be similar, but a little different.


Studying the program is the most important part of the job. First run the program and examine its output. It counts up from 100 to 200, and then prints a string: "You failed to complete the challenge. Please try again.". This is enough to start tracking down the operations and the desired result.

Make 2 copies of the application which you have compiled. Open up W32Dasm, and select Disassembler -> Open File to Disassemble. Locate the original application, and open it. This should start the disassembly process, which is quite quick for this application. When disassembly is done, make sure the font is readable. If not, change it to something more suitable by going to Disassembler -> Font -> Select Font. Now, what do you think we have to do to find out what to change? I will tell you: we have to find our string "You failed to complete the challenge. Please try again." (without quotation marks).

There are two ways we can find this:

One way is to go to the string references by clicking on the toolbar option Str Ref (string references) or by going to the menu option Refs and String Data References. This lists all the strings used by our application, the majority of them being junk (to us) added by the compiler. You can then search down through this list, which is in alphabetical order, for our string. When you find the string, double click on it, and you should be taken to a new location in the disassembly listing.

The other way is by going to Search -> Find Text, and searching for our string. Type in something like "You failed" without the quotation marks.
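
If you are curious where such a list of strings can come from, here is a small C++ sketch that scans raw bytes for runs of printable characters, much like the Unix strings tool does. This is not how W32Dasm is implemented (it also links each string to the code that references it); FoundString and extract_strings are names I invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One run of printable characters found in the data (hypothetical helper type).
struct FoundString {
    std::size_t offset;  // byte offset where the run starts
    std::string text;    // the printable characters themselves
};

// Collects every run of at least min_len printable characters in data.
std::vector<FoundString> extract_strings(const std::vector<unsigned char>& data,
                                         std::size_t min_len) {
    std::vector<FoundString> found;
    std::string run;
    std::size_t start = 0;
    // Loop one step past the end so a run touching the end is flushed too.
    for (std::size_t i = 0; i <= data.size(); ++i) {
        bool printable = i < data.size() && std::isprint(data[i]) != 0;
        if (printable) {
            if (run.empty()) start = i;
            run.push_back(static_cast<char>(data[i]));
        } else {
            if (run.size() >= min_len) found.push_back({start, run});
            run.clear();
        }
    }
    return found;
}
```

Feeding it the bytes of the compiled program with a min_len of, say, 8 should turn up both challenge messages among the compiler's own strings.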

I am now assuming you are near a push instruction. Above this push instruction is our string, and some text. This looks like this:



code:--------------------------------------------------------------------------------
* Referenced by a (U)nconditional or (C)onditional Jump at Address:
|:00401180(C)
|

* Possible StringData Ref from Data Obj ->"You failed to complete the challenge. "
->"Please try again."
|
:00401196 68A2A14100 push 0041A1A2
:0040119B 685C044200 push 0042045C
:004011A0 E88F810000 call 00409334
:004011A5 83C408 add esp, 00000008
---------------------------------------------------------------------------------

Now, we are most interested in where the program decided to choose this option instead of another option. So, look at the second line. It has this written within it: 00401180(C). The (C) means that it was reached from a conditional jump, meaning it was not a jmp statement; it had some criteria to judge what to do. We now want to see the instruction that jumped here, so go to Goto -> Goto Code Location. Type the location into the text box, which is 00401180 (may differ on your computer). We can now see this:



code:--------------------------------------------------------------------------------
:0040117E 85DB test ebx, ebx
:00401180 7514 jne 00401196

* Possible StringData Ref from Data Obj ->"You won the challenge, congragulations."
|
:00401182 687AA14100 push 0041A17A
:00401187 685C044200 push 0042045C
:0040118C E8A3810000 call 00409334
:00401191 83C408 add esp, 00000008
:00401194 EB12 jmp 004011A8
--------------------------------------------------------------------------------

The failed-string area is reached by a jne (jump if not equal) statement. Look at the statement above it, which reads test ebx, ebx. This statement tests whether ebx is equal to zero; ebx holds the number used for the loop. Now, we know that ebx cannot hold zero without changing the code, but in this example we don't care. We just want the jump to never occur, because if it doesn't, the winning string is displayed, and although we didn't actually meet the objective, we got the string. So how do we stop this jne statement from jumping? By replacing it with nop instructions (No Operation). This way execution will continue on and print our string.

So, how do we change this? Take down the offset shown in the status bar when you are over the jne statement (when it is highlighted green/blue, green on jumps, you are over the statement). For example, on my computer, the status bar reads this:



code:--------------------------------------------------------------------------------
Line:309 Pg 4 of 607 Code Data @: 00401180 @Offset 00000780h in File:crackme.exe
--------------------------------------------------------------------------------

So I know the offset I want to change on my computer is: 00000780. The h tells me it is a hex number. The offset may differ on your computer, so remember the one which corresponds to you, not me.
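
In case you wonder how the code address 00401180 and the file offset 780h relate, here is a minimal C++ sketch of the arithmetic. SectionLayout, to_file_offset and kAssumedLayout are names I made up, and the three layout numbers are an assumption: I picked them only because they reproduce my status bar. The real values live in the section table of your own executable and may well differ.

```cpp
#include <cstdint>

// Where one section of the executable sits in memory and in the file.
// (Hypothetical helper type for this example.)
struct SectionLayout {
    std::uint32_t image_base;    // address the executable is loaded at
    std::uint32_t virtual_addr;  // section start, relative to image_base
    std::uint32_t raw_offset;    // section start inside the file on disk
};

// Converts a code address from the disassembly into an offset in the file.
constexpr std::uint32_t to_file_offset(std::uint32_t code_address,
                                       const SectionLayout& s) {
    return code_address - s.image_base - s.virtual_addr + s.raw_offset;
}

// ASSUMED layout, chosen only so that 00401180 maps to 780h as on my computer.
constexpr SectionLayout kAssumedLayout{0x00400000, 0x1000, 0x600};
```

With that assumed layout, to_file_offset(0x00401180, kAssumedLayout) gives 0x780, matching the status bar above. W32Dasm does this conversion for you, so you only need the status bar in practice.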

Reverse Engineering the Program

We will now use Hiew. We want to open a copy of our application (I told you to make 2 before), because W32Dasm is using the original, and we want to keep the original application. Now, you can open Hiew and work your way through the directories to find the application, or you can do as I prefer and drag the copied exe icon onto the hiew.exe icon, and it will open our application (same as issuing the command hiew file.exe in MS-DOS). You will now be presented with a whole bunch of characters on your screen. Press F4 for Mode, and select Decode (the shortcut is pressing enter twice). Now, press F5 for Goto. Type in the offset; for me it is 00000780, but I can type in 780. Zeros to the left can be taken out, for example 00102101 would be 102101, but you can leave the zeros in too, if you prefer. You should be at the jne statement. Before I tell you how to change the program, I have to give you this important note.

Every instruction in assembly is represented by a numerical code, called an opcode. For example, jne is 75, je is 74 and nop is 90 (all in hex). Statements like jne have parameters (like the destination) and therefore take more bytes: our jne statement is the two bytes 75 14, where 75 is the opcode and 14 is the distance to jump. When changing a program, you must remember that when you change an instruction, you must not just leave the parameters there, as these will be read as instructions of their own, usually causing an error. So every byte (every pair of hex digits) belonging to the instruction must be replaced. For example, if we wanted to change our jne statement (7514) into a nop, we must use 90 twice, so the bytes would be 9090. You do not have to remember opcodes, since you can type in assembly instructions, but knowing how to replace them is important. Do not stress if you do not fully understand what I just said, as it will be demonstrated again soon.
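
As a small worked example of the parameter byte: in a two-byte jump like our 75 14, the second byte is a signed distance counted from the end of the instruction. A minimal C++ sketch (short_jump_target is just a name I chose) reproduces the destinations shown in the listings:

```cpp
#include <cstdint>

// Destination of a two-byte short jump: one opcode byte plus one signed
// distance byte, measured from the address of the NEXT instruction.
constexpr std::uint32_t short_jump_target(std::uint32_t address, std::uint8_t rel8) {
    // Bytes 0x80..0xFF are negative distances, so sign-extend by hand.
    return address + 2 + static_cast<std::uint32_t>(rel8 < 0x80 ? rel8 : rel8 - 0x100);
}
```

For the jne at 00401180 with distance byte 14 this gives 00401196, which is exactly where the failed message starts.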

Okay, now, let's change our jne statement. Press F3 for edit, and then press F2 for Asm (short for assembly). A box will open with the asm instruction used. Delete this instruction, and type in "nop" without the quotation marks. What happens now is directly related to the important note above. Our jne was 7514 and we typed in a nop, whose opcode is 90, so the leftover 14 now makes an invalid instruction. We must replace this byte with another nop instruction, so type in nop, press enter, then escape to close the asm dialog. Press F9 for update, and then F10 for Quit. Run the application from the command line, and you will notice the end statement change from "You failed to complete the challenge. Please try again." to "You won the challenge, congragulations.", although it still counts up. Now, we are going to reverse-engineer this application once more, to make it count down, completely fulfilling the objective and furthering your understanding of the situation.
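
What Hiew does for us here boils down to overwriting two bytes in the file. Purely as an illustration, here is a C++ sketch that does the same thing by hand. patch_bytes is a helper I made up, and it refuses to write unless the bytes currently at the offset are the ones you expect, which guards against patching the wrong place when your offset differs from mine.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Overwrites the bytes at offset in the file at path with replacement, but
// only if the bytes currently there equal expected. Returns true on success.
bool patch_bytes(const std::string& path, std::streamoff offset,
                 const std::vector<unsigned char>& expected,
                 const std::vector<unsigned char>& replacement) {
    // The byte count must stay the same, as explained in the note above.
    if (expected.size() != replacement.size()) return false;

    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    // Read what is there now and compare it with what we expect.
    std::vector<unsigned char> current(expected.size());
    file.seekg(offset);
    file.read(reinterpret_cast<char*>(current.data()),
              static_cast<std::streamsize>(current.size()));
    if (!file || current != expected) return false;

    // Write the new bytes over the old ones.
    file.seekp(offset);
    file.write(reinterpret_cast<const char*>(replacement.data()),
               static_cast<std::streamsize>(replacement.size()));
    file.flush();
    return static_cast<bool>(file);
}
```

On my copy the call would look like patch_bytes("crackme_copy.exe", 0x780, {0x75, 0x14}, {0x90, 0x90}), where the file name is only a placeholder for whatever you called your copy.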

Reverse Engineering the Program - Part 2

What we do now is more complicated, and requires a deeper understanding of assembly (not too much more though). What we are going to do is make the application count down, and then display the winning string. We have already done the basic study, so we can skip that. Now we want to look at the original application again (not the one we just engineered) and work out what to do. Note that I have added comments to this readout, which are not displayed in W32Dasm (for obvious reasons); ; is the comment symbol. Read these comments, as they tell you what the code is doing. Here is the section we are looking at, which is the one we were looking at before, plus a bit more above it:



code:--------------------------------------------------------------------------------
* Referenced by a (U)nconditional or (C)onditional Jump at Address:
|:0040117C(C) ; The below jle statement jumps back to this, forming a loop.
|
:00401159 6878A14100 push 0041A178
:0040115E 53 push ebx
:0040115F 685C044200 push 0042045C
:00401164 E8EF7F0000 call 00409158
:00401169 83C408 add esp, 00000008
:0040116C 50 push eax
:0040116D E8C2810000 call 00409334
:00401172 83C408 add esp, 00000008
:00401175 43 inc ebx ; Increase ebx by one
:00401176 81FBC8000000 cmp ebx, 000000C8 ; Compare ebx with 200, used by jle
:0040117C 7EDB jle 00401159 ; Jump if less than or equal to 200.
:0040117E 85DB test ebx, ebx ; test if ebx is 0, same as before.
:00401180 7514 jne 00401196 ; Jumps to failed message, like before.

* Possible StringData Ref from Data Obj ->"You won the challenge, congragulations."
|
:00401182 687AA14100 push 0041A17A
:00401187 685C044200 push 0042045C
:0040118C E8A3810000 call 00409334
:00401191 83C408 add esp, 00000008
:00401194 EB12 jmp 004011A8
--------------------------------------------------------------------------------

From my comments, you should see the low-level structure of a for loop. The jle jumps back through the whole process as long as ebx is at most 200 (hex C8; the disassembly shows numbers in hex). We can see that the inc increases ebx by 1 on each pass.

So the pseudocode of this is:

BEGIN PSEUDO CODE
x=100
loop:
print x
x=x+1
if x is less or equal to 200, goto loop
end loop
if x does not equal 0, goto the failed message (which afterwards continues at the cleanup code below)
otherwise (x equals 0) print the winning message
the code below initialises the cleanup and exit processes
END PSEUDO CODE

This is very easily translated to any high-level language (HLL), using its for loops or another type of loop structure.
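
For instance, here is how I would write the disassembled loop back out in C++, keeping the same order of steps as the listing (print, inc, cmp, jle). RunResult and run_original are names invented for this sketch, and instead of really printing, it just records what would be printed so the result is easy to check.

```cpp
// What the loop did, recorded instead of printed (hypothetical helper type).
struct RunResult {
    int first_printed;  // first number that would be printed
    int last_printed;   // last number that would be printed
    int final_ebx;      // value of ebx when the loop is left
    bool won;           // true if the winning message would be shown
};

// Mirrors the unmodified disassembly, one comment per instruction group.
RunResult run_original() {
    int ebx = 100;  // the loop counter, i in the source
    RunResult r{ebx, ebx, 0, false};
    do {
        r.last_printed = ebx;  // the push/call lines that print ebx and a tab
        ++ebx;                 // inc ebx
    } while (ebx <= 200);      // cmp ebx, 000000C8 / jle 00401159
    r.final_ebx = ebx;
    r.won = (ebx == 0);        // test ebx, ebx / jne 00401196
    return r;
}
```

The loop is left with ebx at 201, so the test for zero fails and the failed message is chosen, just as when you run the program.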

Now we have to change this code to make it count down to zero and display our message. First we have to change the inc, because we want it to decrease the number. Then we must change the cmp instruction, as we want it to compare with zero, not 200. Then we have to change the jle (jump if less or equal), because we want it to jump only while the number is greater than (above) 0, which is the instruction ja. We do not want jae (jump if above or equal), because a jump that is also taken at 0 would carry the loop on past zero instead of stopping there, and we would never get the winning message. So, let's get the offset of the first instruction we want to change (which is the following line):



code:--------------------------------------------------------------------------------
:00401175 43 inc ebx
--------------------------------------------------------------------------------

The offset in my compiled program is 00000775h; as said before, yours may differ.

Now, let's get to Hiew. Open your second copy, the one which is still unmodified. Get to decode mode (F4 -> Decode), go to our offset (F5, type in the offset), and now let's change these instructions. Please note that when I say the line looks like the example, I mean in the Asm (F2) dialog. You should be at a line which reads:



code:--------------------------------------------------------------------------------
inc ebx
--------------------------------------------------------------------------------


Using F2 (Asm) you should change this to:

code:--------------------------------------------------------------------------------
dec ebx
--------------------------------------------------------------------------------

Now, there should be no need to add any nops here or anything, as both inc ebx and dec ebx are a single byte (2 hex digits) long: 43 and 4B respectively.

Now we have to change this line (do not change until I fully explain)



code:--------------------------------------------------------------------------------
cmp ebx, 000000C8
--------------------------------------------------------------------------------

to

code:--------------------------------------------------------------------------------
test ebx, ebx
--------------------------------------------------------------------------------


Now, to do this we must refer to the special note I gave before. The opcode of cmp ebx, 000000C8 is 81FBC8000000.

We split this into groups of two digits (bytes):
cmp ebx, 200
81 FB C8 00 00 00

Now, the opcode of test ebx, ebx (in groups of two) is:
test ebx, ebx
85 DB

The new instruction is 2 bytes long and the old one was 6, so we must replace the remaining 4 bytes (C8 and the three 00's) with nops.

So go to the asm dialog where the cmp instruction is and change it to test ebx, ebx. We must then change the remaining 8 hex digits, which are grouped into twos. That means 4 nops, so write nop, press enter, and repeat 3 more times. (You can also change the opcode directly, just don't press F2 while in edit mode; however, it is harder to remember opcodes than the asm instructions.)
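
The byte counting above is simple enough to sketch in C++. pad_with_nops is a helper name I invented; it takes the bytes of the new instruction and fills the rest of the old instruction's space with nop (90).

```cpp
#include <cstddef>
#include <vector>

// Pads replacement with nop bytes (0x90) until it is original_size bytes long.
// Returns an empty vector if the replacement does not fit in the old space.
std::vector<unsigned char> pad_with_nops(std::vector<unsigned char> replacement,
                                         std::size_t original_size) {
    if (replacement.size() > original_size) return {};
    replacement.resize(original_size, 0x90);  // new slots are filled with nop
    return replacement;
}
```

For our case, pad_with_nops({0x85, 0xDB}, 6) gives 85 DB 90 90 90 90: test ebx, ebx followed by four nops, exactly filling the six bytes of the old cmp.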

Did you notice how, after changing these instructions, the jle statement, which had disappeared, is back? This is because once the bytes stop lining up with real instructions, the disassembly of nearly every statement that follows changes, if not all of them. That is why it is very important to count your bytes.

Now we must change the statement which reads:



code:--------------------------------------------------------------------------------
jle 00000759
--------------------------------------------------------------------------------

to

code:--------------------------------------------------------------------------------
ja 00000759
--------------------------------------------------------------------------------


(please note that you do not have to use tabs; spaces are sufficient)

After doing this, you should now press F9 to update, and F10 to exit.
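
Assuming all three changes went in as described (dec ebx, test ebx, ebx plus four nops, and ja), the patched loop should behave like this C++ sketch. RunResult and run_patched are names made up for the example, and the printing is again only recorded. After a test, ja is taken exactly when the value is not zero, which is what the loop condition below expresses.

```cpp
// What the loop did, recorded instead of printed (hypothetical helper type).
struct RunResult {
    unsigned first_printed;  // first number that would be printed
    unsigned last_printed;   // last number that would be printed
    unsigned final_ebx;      // value of ebx when the loop is left
    bool won;                // true if the winning message would be shown
};

// Mirrors the patched code, one comment per instruction group.
RunResult run_patched() {
    unsigned ebx = 100;  // the loop counter still starts at 100
    RunResult r{ebx, ebx, 0, false};
    do {
        r.last_printed = ebx;  // the push/call lines that print ebx and a tab
        --ebx;                 // dec ebx (was inc ebx)
    } while (ebx != 0);        // test ebx, ebx / ja (was cmp / jle)
    r.final_ebx = ebx;
    r.won = (ebx == 0);        // the unchanged test ebx, ebx / jne
    return r;
}
```

So the numbers printed run from 100 down to 1, the loop is left with ebx at 0, and the unchanged test at the end now picks the winning message.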

Now you can run the program in DOS and see whether you completed it correctly or not. If you did, congratulations; if not, bad luck, do try again, or if you are having severe problems, reply to this thread or PM me.


Conclusion

That concludes this tutorial. I hope you enjoy. Good Luck. God Speed. uhh I forget if there are any more expressions for "Good Luck" Well Just Enjoy.

Good Luck
