Reverse Engineering for Beginners: Basic Programming Concepts

April 5, 2019

In this article, we will look under the hood of software. Newcomers to reverse engineering will get a general idea of the software analysis process itself, the general principles of how code is built, and how to read assembly code.

Note: The program code for this article was compiled with Microsoft Visual Studio 2015, so some details may differ in newer versions. IDA Pro is used as the disassembler.

Initializing variables

Variables are one of the main building blocks of programming. They come in several types; here are some of them:

  • string;
  • integer;
  • boolean;
  • character;
  • double-precision floating-point number;
  • single-precision floating-point number;
  • array of characters.

Standard variables:

string stringvar = "Hello World";
int intvar = 100;
bool boolvar = false;
char charvar = 'B';
double doublevar = 3.1415;
float floatvar = 3.14159265f;  // a float keeps only about 7 significant digits
char carray[] = {'a', 'b', 'c', 'd', 'e'};

Note: In C++, a string is not a primitive type, but it is important to understand how it looks in native code.

Let’s look at the assembly code:

Variable initialization

Here you can see how IDA shows the allocation of space for variables. First, space is allocated for each variable, and then it is initialized.

Variable initialization

Once the space is allocated, the value we want to assign to the variable is placed in it. The initialization of most variables is shown in the picture above, but how the string is initialized is shown below.
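As a rough illustration of how much space each of these variables needs, the following sketch collects their sizes with sizeof. The names LocalSizes and local_sizes are made up for this example, and the exact numbers depend on the compiler and target; a 32-bit MSVC build typically reports 4 bytes for int and float and 8 bytes for double.

```cpp
#include <cstddef>

// Hypothetical helper type: sizes, in bytes, of the primitive variables from the listing above.
struct LocalSizes {
    std::size_t int_size;
    std::size_t bool_size;
    std::size_t char_size;
    std::size_t double_size;
    std::size_t float_size;
    std::size_t array_size;
};

LocalSizes local_sizes() {
    int intvar = 100;
    bool boolvar = false;
    char charvar = 'B';
    double doublevar = 3.1415;
    float floatvar = 3.14159265f;
    char carray[] = {'a', 'b', 'c', 'd', 'e'};

    // sizeof reports how many bytes the compiler has to reserve for each object.
    return LocalSizes{sizeof(intvar), sizeof(boolvar), sizeof(charvar),
                      sizeof(doublevar), sizeof(floatvar), sizeof(carray)};
}
```

Calling local_sizes() from main() and printing the fields gives a quick way to compare the sizes with the stack offsets IDA shows for each variable.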

Initializing a string variable in C ++

Initializing the string requires a call to a built-in function.

Standard output function

Note: Here we will talk about how variables are pushed onto the stack and then used as parameters for the output function. The concept of a function with parameters will be discussed later.

For output, printf() is used here rather than cout.

Standard output:

printf("Hello String Literal");
printf("%s", stringvar.c_str());  // printf expects a C string, not a std::string object
printf("%i", intvar);
printf("%c", charvar);
printf("%f", doublevar);
printf("%f", floatvar);
printf("%c", carray[3]);

Now let’s look at the machine code. First, the string literal:

String literal output

As you can see, the string literal is first pushed onto the stack as a parameter for the call to printf().

Now let’s look at the output of one of the variables:

Variable output

As you can see, the variable intvar is first placed in the EAX register, which in turn is pushed onto the stack along with the string literal %i that denotes integer output. These values are then read from the stack and used as parameters when printf() is called.
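The push order can be modelled with a small sketch. It assumes the cdecl calling convention, in which arguments are pushed from right to left, so for printf("%i", intvar) the value of intvar is pushed first and the format string ends up on top of the stack. The function name and the use of strings to stand in for stack slots are purely illustrative.

```cpp
#include <string>
#include <vector>

// Toy model of a call stack: the last element of the vector is the top of the stack.
// Assumption: arguments are pushed right to left, as in the cdecl convention.
std::vector<std::string> push_args_right_to_left(const std::vector<std::string>& args) {
    std::vector<std::string> stack;
    for (auto it = args.rbegin(); it != args.rend(); ++it) {
        stack.push_back(*it);  // models one push instruction
    }
    return stack;
}
```

Passing {"format", "intvar"} returns a stack with "intvar" at the bottom and "format" on top, which is where the callee expects to find its first parameter.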

Mathematical operations

Now we will talk about the following mathematical operations:

  1. Addition.
  2. Subtraction.
  3. Multiplication.
  4. Division.
  5. Bitwise conjunction (AND).
  6. Bitwise disjunction (OR).
  7. Bitwise exclusive OR.
  8. Bitwise negation.
  9. Bit shift to the right.
  10. Bit shift to the left.

void mathfunctions() {  // math operations
    int A = 10;
    int B = 15;
    int add = A + B;
    int sub = A - B;
    int mult = A * B;
    int div = A / B;
    // and, or, xor and not are reserved alternative tokens in standard C++,
    // so the results use different names here.
    int band = A & B;
    int bor = A | B;
    int bxor = A ^ B;
    int bnot = ~A;
    int rshift = A >> B;
    int lshift = A << B;
}
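To make it easier to recognise the results in a debugger, here is a small sketch that evaluates the ten operations for A = 10 and B = 15, in the order of the list above. The function name math_results is made up for this example; the values in the comments follow from ordinary 32-bit two's complement arithmetic.

```cpp
#include <array>

// Results of the ten operations for A = 10 and B = 15, in the order of the list above.
std::array<int, 10> math_results() {
    const int A = 10;  // 0Ah
    const int B = 15;  // 0Fh
    return {
        A + B,   // 25
        A - B,   // -5
        A * B,   // 150
        A / B,   // 0 (integer division truncates)
        A & B,   // 10 (1010b & 1111b = 1010b)
        A | B,   // 15 (1010b | 1111b = 1111b)
        A ^ B,   // 5  (1010b ^ 1111b = 0101b)
        ~A,      // -11 (all bits of 10 inverted)
        A >> B,  // 0
        A << B   // 327680 (10 * 2^15)
    };
}
```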

Now let’s translate each operation into assembly code.

First, the variable A is assigned the value 0Ah in hexadecimal, which is 10 in decimal. The variable B is assigned 0Fh, which is 15 in decimal.

Variable initialization

For addition, the add instruction is used:

Addition

For subtraction, the sub instruction is used:

Subtraction

For multiplication, imul:

Multiplication

For division, the idiv instruction is used. The cdq instruction is executed first: it sign-extends EAX into the EDX:EAX register pair, which idiv takes as its dividend. The quotient is left in EAX.

Division
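The cdq and idiv pair can be emulated in a few lines. This is only a sketch of what the two instructions compute, with made-up names (emulate_cdq, emulate_idiv, DivResult); it assumes a non-zero divisor and a quotient that fits in 32 bits, since the CPU raises a fault in the other cases.

```cpp
#include <cstdint>

// Emulates cdq: returns the value EDX receives, i.e. the sign bit of EAX copied into all 32 bits.
std::int32_t emulate_cdq(std::int32_t eax) {
    return eax < 0 ? -1 : 0;
}

struct DivResult {
    std::int32_t quotient;   // left in EAX by idiv
    std::int32_t remainder;  // left in EDX by idiv
};

// Emulates idiv applied to the 64-bit dividend EDX:EAX.
// Assumption: divisor is non-zero and the quotient fits in 32 bits.
DivResult emulate_idiv(std::int32_t edx, std::int32_t eax, std::int32_t divisor) {
    // Glue the two 32-bit halves together as unsigned bit patterns, then reinterpret as signed.
    const std::uint64_t bits =
        (static_cast<std::uint64_t>(static_cast<std::uint32_t>(edx)) << 32) |
        static_cast<std::uint32_t>(eax);
    const std::int64_t dividend = static_cast<std::int64_t>(bits);
    return DivResult{static_cast<std::int32_t>(dividend / divisor),
                     static_cast<std::int32_t>(dividend % divisor)};
}
```

For the values in this article, emulate_idiv(emulate_cdq(10), 10, 15) yields a quotient of 0 and a remainder of 10, which matches int div = A / B.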

For bitwise conjunction, the and instruction is used:

Bitwise conjunction

For bitwise disjunction, or:

Bitwise disjunction

For bitwise exclusive OR, xor:

Bitwise exclusive OR

For bitwise negation, not:

Bitwise negation

For a bit shift to the right, sar:

Bit shift to the right

For a bit shift to the left, shl:

Bit shift left
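The compiler chose sar rather than shr because A is a signed int: sar is an arithmetic shift that preserves the sign, while shr is a logical shift that fills with zeros. The sketch below emulates both so the difference can be seen on a negative value. The function names simply mirror the instructions, and shift counts are assumed to be in the range 0 to 31.

```cpp
#include <cstdint>

// Logical shift right (x86 shr): vacated high bits are filled with zeros.
std::uint32_t shr(std::uint32_t value, unsigned count) {
    return value >> count;
}

// Arithmetic shift right (x86 sar): vacated high bits are filled with the sign bit.
// Written without shifting a negative signed value, so the result does not depend on the compiler.
std::int32_t sar(std::int32_t value, unsigned count) {
    const std::uint32_t bits = static_cast<std::uint32_t>(value);
    if (value >= 0) {
        return static_cast<std::int32_t>(bits >> count);
    }
    // For negative values: invert, shift zeros in, invert again, which shifts ones in.
    return static_cast<std::int32_t>(~(~bits >> count));
}
```

With these helpers, sar(-16, 2) gives -4, whereas shr applied to the same bit pattern (0xFFFFFFF0) gives 0x3FFFFFFC.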

Calling functions

We will look at three kinds of functions:

  1. A function that does not return a value (void).
  2. A function that returns an integer.
  3. A function with parameters.

Function call:

newfunc();
newfuncret();
funcparams(intvar, stringvar, charvar);

First, let’s see how newfunc() and newfuncret(), which take no parameters, are called.

Calling functions without parameters

The function newfunc() simply prints the message “Hello! I’m a new function!”:

void newfunc() {  // new function without parameters
    printf("Hello! I'm a new function!");
}

newfunc() function

This function uses the retn instruction, but only to return to the previous location (so that the program can continue after the function ends). Now let’s look at newfuncret(), which generates a random integer using the standard library function rand() and then returns it.

int newfuncret() {  // new function that returns something
    int A = rand();

    return A;
}

newfuncret() function

First, space is allocated for the variable A. Then rand() is called, and its result is placed in the EAX register. The value of EAX is then stored in the space allocated for A, which effectively assigns the result of rand() to A. Finally, A is loaded back into EAX so that the function can use it as the return value.

Now that we have seen how functions without parameters are called and what happens when a function returns a value, let’s talk about calling a function with parameters.

Such a function is called as follows:

funcparams(intvar, stringvar, charvar);

Function call with parameters

Strings in C++ require a call to a basic_string function, but the concept of calling a function with parameters does not depend on the data type. First, the variable is placed in a register, then pushed from there onto the stack, and then the function is called.

Let’s look at the function code:

void funcparams(int iparam, string sparam, char cparam) {  // function with parameters
    printf("%i\n", iparam);
    printf("%s\n", sparam.c_str());  // printf expects a C string
    printf("%c\n", cparam);
}

funcparams() function

This function takes an integer, a string and a character, and prints them with printf(). As you can see, the variables are first set up at the beginning of the function, and then they are pushed onto the stack as parameters for the calls to printf(). Very simple.

Loops

Now that we have studied function calls, output, variables, and mathematical operations, we turn to controlling the order of code execution (flow control). First, we examine the for loop:

void forloop(int max) {  // normal for loop
    for (int i = 0; i < max; ++i) {
        printf("%i\n", i);
    }
}

Graphic overview of the for loop

Before breaking the assembly code into smaller parts, let’s look at the overall picture. As you can see, when the for loop starts, it has two options:

  • it can go to the block on the right (green arrow) and return to the main program;
  • it can go to the block on the left (red arrow) and go back to the beginning of the for loop.

The for loop in detail

First, the variables i and max are compared to check whether i has reached its maximum value. If i is less than max, the subroutine follows the red arrow (down and to the left), prints i, increments i by 1 and returns to the beginning of the loop. If i is greater than or equal to max, the subroutine follows the green arrow, that is, it exits the for loop and returns to the main program.
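One way to connect the graph back to the source is to rewrite the loop with labels and gotos, so that each block and arrow has a visible counterpart. The sketch below does that; the function name is made up, the instruction names in the comments are only an approximation of what the compiler emits, and the values are collected in a vector instead of being printed so the result can be checked.

```cpp
#include <vector>

// The for loop from forloop(), rewritten with labels and gotos to mirror the blocks in the graph.
// Instead of printing, the values of i are collected so the result can be inspected.
std::vector<int> forloop_as_jumps(int max) {
    std::vector<int> printed;
    int i = 0;               // initialization block
condition:
    if (i >= max) {          // roughly: cmp i, max / jge (green arrow)
        goto done;
    }
    printed.push_back(i);    // loop body (red arrow)
    ++i;                     // increment block
    goto condition;          // jump back to the comparison
done:
    return printed;          // return to the caller
}
```

Calling forloop_as_jumps(3) returns the values 0, 1 and 2, the same sequence forloop(3) prints.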

Now let’s take a look at the while loop:

void whileloop() {  // while loop
    int A = 0;
    while (A < 10) {
        A = 0 + (rand() % (int)(20 - 0 + 1));
    }
    printf("I'm out!");
}

While loop

In this loop, a random number from 0 to 20 is generated. If the number is greater than or equal to 10, the loop exits and “I’m out!” is printed; otherwise, the loop continues.

In machine code, the variable A is first initialized to zero, and then the loop begins and A is compared with the hexadecimal number 0A, which is 10 in decimal. If A is less than 10, a new random number is generated and written to A, and the comparison is performed again. If A is greater than or equal to 10, execution leaves the loop, prints the message and returns to the main program.
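The same compare-and-jump structure can be sketched for the while loop. Because rand() makes each run different, this illustrative version takes the "random" numbers as a parameter (an assumption made only so the behaviour is repeatable) and returns how many of them were consumed before the loop exited. The instruction names in the comments are approximate.

```cpp
#include <cstddef>
#include <vector>

// whileloop() with its control flow spelled out as compare-and-jump steps.
// Assumption: values stands in for successive results of rand() % 21.
// Returns how many values were consumed before the loop exited.
std::size_t whileloop_as_jumps(const std::vector<int>& values) {
    std::size_t used = 0;
    int A = 0;                            // roughly: mov [A], 0
check:
    if (A >= 0x0A) goto out;              // roughly: cmp [A], 0Ah / jge out
    if (used == values.size()) goto out;  // guard for this sketch only: no test values left
    A = values[used++];                   // A = rand() % 21 in the original
    goto check;
out:
    return used;                          // the original prints "I'm out!" here
}
```

For the sequence 3, 9, 10, 4 the function returns 3: the values 3 and 9 keep the loop running, and 10 is the first value that fails the A < 10 test.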

Conditional statement

Now let’s talk about conditional statements. First, let’s look at the code:

void ifstatement() {  // conditional statements
	int A = 0 + (rand() % (int)(20 - 0 + 1));

	if (A < 15) {
		if (A < 10) {
			if (A < 5) {
				printf("less than 5");
			}
			else {
				printf("less than 10, greater than 5");
			}
		}
		else {
			printf("less than 15, greater than 10");
		}
	}
	else {
		printf("greater than 15");
	}
}

This function generates a random number from 0 to 20 and stores the result in the variable A. If A is greater than or equal to 15, the program prints “greater than 15”. If A is less than 15 but at least 10, it prints “less than 15, greater than 10”. If A is less than 10 but at least 5, it prints “less than 10, greater than 5”. If A is less than 5, it prints “less than 5”.

Let’s look at the assembly graph:

Assembly graph for the conditional statement

The graph is structured much like the source code, because a conditional statement is simple: “If X, then Y, otherwise Z”. Looking at the first pair of arrows at the top, the statement is preceded by a comparison of A with 0F, which is 15 in decimal. If A is greater than or equal to 15, the subroutine prints “greater than 15” and returns to the main program. Otherwise, A is compared with 0A (10 in decimal). This continues until the program prints something and returns.
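To check which branch a particular value takes, the same chain of comparisons can be written as a function that returns the message instead of printing it. The name classify is made up for this sketch. Note that the boundary values 5, 10 and 15 fall into the “greater than” branches, because every test in the original uses a strict less-than.

```cpp
#include <string>

// Same comparison order as ifstatement(), but returning the message instead of printing it.
std::string classify(int A) {
    if (A < 15) {            // comparison with 0Fh
        if (A < 10) {        // comparison with 0Ah
            if (A < 5) {     // comparison with 5
                return "less than 5";
            }
            return "less than 10, greater than 5";
        }
        return "less than 15, greater than 10";
    }
    return "greater than 15";
}
```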

Switch statement

The switch statement is very similar to the conditional statement, except that in a switch statement one variable or expression is compared with several “cases” (possible matches). Let’s look at the code:

void switchcase() {  // switch statement
	int A = 0 + (rand() % (int)(10 - 0 + 1));

	switch (A) {
		case 0:
			printf ("0");
			break;
		case 1:
			printf ("1");
			break;
		case 2:
			printf ("2");
			break;
		case 3:
			printf ("3");
			break;
		case 4:
			printf ("4");
			break;
		case 5:
			printf ("5");
			break;
		case 6:
			printf ("6");
			break;
		case 7:
			printf ("7");
			break;
		case 8:
			printf ("8");
			break;
		case 9:
			printf ("9");
			break;
		case 10:
			printf ("10");
			break;
	}
}

In this function, the variable A gets a random value from 0 to 10. A is then compared with several cases using switch. If the value of A matches one of the cases, the corresponding number is printed, after which execution leaves the switch statement and returns to the main program.

Unlike the conditional statement, the switch statement does not follow the rule “If X, then Y, otherwise Z”. Instead, the program compares the input value with the existing cases and executes only the case that matches it. Consider the first two blocks in more detail.

The first two blocks of the switch statement

First, a random number is generated and written to A. The program then sets up the switch statement by copying A into the temporary variable var_D0 and checks that it matches at least one of the cases. If var_D0 matches none of them and has to take the default path, the program follows the green arrow to the final return section of the subroutine. Otherwise, the program jumps to the matching case.

If var_D0 (A) is equal to 5, the code goes to the section shown above, prints “5” and then goes to the return section.
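A common way for compilers to implement a switch over consecutive values is a jump table: one range check, then an indexed jump. The sketch below models that idea, assuming (it is not confirmed by the listing) that the compiler used such a table for cases 0 to 10. The function name is made up, and the table holds strings rather than code addresses.

```cpp
#include <cstddef>

// Sketch of a jump-table dispatch for switchcase(). Assumption: the compiler turned the
// consecutive cases 0..10 into a table lookup; here the table holds strings, not addresses.
const char* switch_via_table(int A) {
    static const char* const table[] = {
        "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10"
    };
    const std::size_t count = sizeof(table) / sizeof(table[0]);

    // One unsigned comparison rejects both negative values and values above 10.
    const unsigned index = static_cast<unsigned>(A);
    if (index >= count) {
        return nullptr;  // no matching case: go straight to the return block
    }
    return table[index];  // models the indexed jump to the matching case
}
```

Here switch_via_table(5) returns "5", while an out-of-range value such as 11 returns a null pointer, which corresponds to following the green arrow to the return section.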

User input

In this section, we will look at user input using the C++ cin stream. First, look at the code:

void userinput() {  // keyboard input
    string sentence;
    cin >> sentence;

    printf("%s", sentence.c_str());  // printf expects a C string
}

In this function, we simply read a string into the variable sentence using the C++ cin stream and then print it with printf().

Let’s go through it in machine code. First, cin:

cin (C++)

First, the string variable sentence is initialized, then cin is called and the entered data is written to sentence.

The C++ cin call in more detail

First, the program loads the sentence variable into EAX and pushes EAX onto the stack, from where it is used as a parameter for the cin stream; then the stream operator >> is called. Its output is placed in ECX, which is then pushed onto the stack for the call to printf().
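One detail worth keeping in mind when testing this function: operator >> on a string reads a single whitespace-delimited token, not a whole line. The sketch below shows this with a string stream in place of the keyboard; read_token and first_token_of are illustrative names, not part of the original program.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one token the way cin >> sentence does: leading whitespace is skipped
// and extraction stops at the next whitespace character.
std::string read_token(std::istream& in) {
    std::string sentence;
    in >> sentence;
    return sentence;
}

// Convenience wrapper for trying the behaviour on a fixed string instead of the keyboard.
std::string first_token_of(const std::string& text) {
    std::istringstream in(text);
    return read_token(in);
}
```

For the input "Hello World", first_token_of returns only "Hello", so typing a multi-word sentence into userinput() prints just its first word.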

We have covered only the basic principles of how software works at a low level. Without these fundamentals, it is impossible to understand how software works and, accordingly, to analyze it.
