Hi, welcome to Cybrary. My name is Sean Pierce, and I'm the subject matter expert for Introduction to Malware Analysis. Today we're going to be covering calling conventions in x86 assembly.
So what exactly are calling conventions? Well, in simple terms, when a function call is made with parameters, the assembly code will do a series of pushes to put those parameters onto the stack in reverse order.
Then it will execute the call instruction, and the call instruction has as its operand, the second part of the instruction, the location of the function being called.
In the C declaration (cdecl) convention, the caller is responsible for cleaning up the stack.
The instruction right after the call instruction is where the stack is cleaned up, because the caller had just pushed a bunch of values onto the stack. If it pushed three values onto the stack and then popped three values, that would work just as well.
And, I'm sorry, it's not sub esp that does the cleanup. It's an add esp.
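To make the caller-cleanup idea concrete, here is a minimal sketch in C of a toy stack. It is not from the video: the names ToyStack, toy_push, toy_sum3, and cdecl_call are made up for illustration, and the model leaves out the return address that a real call instruction would also push.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack: it grows toward lower offsets, 4 bytes per slot.
   All names here are hypothetical and exist only for this illustration. */
typedef struct {
    uint32_t mem[16];   /* 16 slots of "stack memory" */
    uint32_t esp;       /* byte offset of the current top of the stack */
} ToyStack;

static void toy_init(ToyStack *s) {
    s->esp = (uint32_t)sizeof s->mem;   /* empty stack: esp just past the end */
}

static void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;                        /* push first moves esp down ... */
    s->mem[s->esp / 4] = value;         /* ... then stores the value there */
}

/* The "callee": reads three arguments upward from esp and returns their sum.
   Under cdecl it does not touch esp. */
static uint32_t toy_sum3(const ToyStack *s) {
    uint32_t top = s->esp / 4;
    return s->mem[top] + s->mem[top + 1] + s->mem[top + 2];
}

/* The "caller": push c, b, a (reverse order), call, then "add esp, 0Ch". */
uint32_t cdecl_call(ToyStack *s, uint32_t a, uint32_t b, uint32_t c) {
    uint32_t esp_before = s->esp;
    toy_push(s, c);
    toy_push(s, b);
    toy_push(s, a);                     /* the first argument ends up on top */
    uint32_t result = toy_sum3(s);
    s->esp += 12;                       /* caller cleanup: 3 pushes * 4 bytes */
    assert(s->esp == esp_before);       /* the stack is back where it started */
    return result;
}
```

Calling cdecl_call on a freshly initialized ToyStack leaves esp exactly where it began, which is the whole job of the add esp that follows the call.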
The other major calling convention is standard call (stdcall). It's actually something made up by Microsoft, and it's how its API works. It works pretty much the same way, where the parameters are pushed in reverse order. The only difference is that the callee is responsible for cleaning up the stack.
You'll see in a standard call there's push, push, push, and then the call.
And right under the call there are no other stack instructions, like add esp, or add something to esp, or pop, pop, pop, or whatever else.
Inside that function call, the function cleans up the stack. It does pop, pop, pop at the end, or it'll do ret 4 or whatever, where it is explicitly saying,
"I am going to fix the stack."
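As a rough sketch of what "ret N" does to the stack pointer, the toy model below (not from the video) pushes three arguments and a return address, then lets the callee return. The names ToyStack, toy_ret, and stdcall_leftover are hypothetical, and the return address value is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 32-bit stack; all names are hypothetical and only for illustration. */
typedef struct {
    uint32_t mem[16];
    uint32_t esp;       /* byte offset of the top of the stack */
} ToyStack;

static void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

static uint32_t toy_pop(ToyStack *s) {
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}

/* Models "ret N": pop the return address, then drop N more bytes of arguments.
   A plain "ret" is the same thing with N = 0. */
static uint32_t toy_ret(ToyStack *s, uint32_t arg_bytes) {
    uint32_t return_address = toy_pop(s);
    s->esp += arg_bytes;
    return return_address;
}

/* Returns how many bytes are still on the stack after the callee returns. */
int32_t stdcall_leftover(uint32_t callee_cleans_bytes) {
    ToyStack s = { {0}, (uint32_t)sizeof s.mem };
    uint32_t start = s.esp;
    toy_push(&s, 3);                 /* arguments, pushed in reverse order */
    toy_push(&s, 2);
    toy_push(&s, 1);
    toy_push(&s, 0x401000);          /* call pushes a return address (made-up value) */
    (void)toy_ret(&s, callee_cleans_bytes);
    return (int32_t)(start - s.esp);
}
```

With a callee that ends in ret 12, stdcall_leftover(12) reports nothing left over. With a plain ret, stdcall_leftover(0) reports 12 bytes that the caller would still have to remove, which is the cdecl situation.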
And you might think, OK, that doesn't matter very much. And you're right, it doesn't. The compiler just needs to choose one and be consistent. Or not: for instance, it can do some pretty cool optimizations, like with recursive function calls.
We're not going to look at some of the more advanced cases of how compilers can optimize code really well,
but you should know it is rather expensive, time-wise, to push something onto the stack or to pop it off. So we're going to look at an instance where a compiler might choose to do it a different way.
First, we're going to look at the difference between a standard call and a C declaration call.
We can actually do both in the same program, with the instructions right next to each other.
Then I'm going to show you the different ways that data can be put onto the stack, as just kind of a preference thing by different compilers. We saw some differences between GCC and Visual Studio last time, and I will show those really quickly again this time. So here we have our hello world function again. But this time
we're going to have two variables that are going to be passed to the printf function, along with another parameter, the format string.
So it's going to print "hello world",
and right under that is an API function call, MessageBoxA, which just pops up a message box.
We're going to just debug this, and I'm going to go and generate the disassembly.
We can look at it side by side if we want to.
Up here we see push ebp and then mov ebp, esp.
These are the beginnings of most functions, or what we would call procedures, or subroutines, which is the old terminology for it.
It takes the current base pointer and pushes it onto the stack, so it's saving that value.
Then it takes the current esp, at the bottom of the stack, or excuse me, the top of the stack, and puts it into ebp. So what you're essentially saying is:
where my current ebp is now, wherever my esp is, that is the base of the frame. So that's where I'm going to start counting my values from now.
So as soon as this mov happens, ebp and esp are the same value. They're pointing to the same location in memory. mov doesn't mean
what you might think it does. It isn't a cut and paste; mov is more like a copy.
It doesn't zero out anything.
Once you have these two values that equal the same thing, you have a stack frame.
And then we're subtracting, I'm sorry, 0D8 hex bytes from esp. So we're extending our stack, and now we have
D8 hex bytes available on the stack that we can use for storage of variables, mostly local variables.
That's pretty much it: only local variables should be stored on the stack.
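The three prologue instructions can be sketched in C as a toy register model. This is an illustration only, not the code from the video: ToyCpu, toy_prologue, and toy_epilogue are made-up names, and the epilogue is the usual counterpart (mov esp, ebp followed by pop ebp) rather than something shown at this point in the demo.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU state: two registers plus a small block of stack memory.
   All names are hypothetical and only for illustration. */
typedef struct {
    uint32_t esp, ebp;
    uint8_t  mem[0x400];   /* addressed by byte offset */
} ToyCpu;

static void store32(ToyCpu *c, uint32_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++)
        c->mem[addr + i] = (uint8_t)(value >> (8 * i));   /* little-endian */
}

static uint32_t load32(const ToyCpu *c, uint32_t addr) {
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)c->mem[addr + i] << (8 * i);
    return value;
}

/* push ebp / mov ebp, esp / sub esp, local_bytes */
void toy_prologue(ToyCpu *c, uint32_t local_bytes) {
    assert(c->esp >= 4 + local_bytes);
    c->esp -= 4;
    store32(c, c->esp, c->ebp);   /* push ebp: save the caller's base pointer */
    c->ebp = c->esp;              /* mov ebp, esp: a copy, esp is unchanged */
    c->esp -= local_bytes;        /* sub esp, 0D8h in the debug build */
}

/* mov esp, ebp / pop ebp */
void toy_epilogue(ToyCpu *c) {
    c->esp = c->ebp;              /* throw away the locals */
    c->ebp = load32(c, c->esp);   /* restore the caller's base pointer */
    c->esp += 4;
}
```

Starting with esp at 0x400 and calling toy_prologue with 0xD8 leaves ebp at 0x3FC and esp at 0x324; toy_epilogue then puts both registers back to their starting values.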
So what does it do after that?
Well, there are three pushes.
So it's saving more registers onto the stack. And this is another type of convention. It's not the caller or callee cleanup convention. It's
the compiler deciding that the callee should save the registers.
The code cannot rely on the fact that, if it makes a call instruction, all its registers will still hold the same values.
These are general-purpose registers, like we were talking about last time, and that means they can be used by the code for anything.
And you really shouldn't think that
if you call something or jump somewhere or do something, the
registers will still be the same value afterwards. So a lot of compilers
will save the registers the function is going to mess with
on the stack, and then right before the function exits,
it'll pop them back off. So if you go down to the bottom, we can see that it's popping those
original three values back into the
registers. It doesn't know what's in those registers. That may not matter. It may not care. It's just kind of a safety thing, just in case the code calling this main was relying on the fact that it could still use those registers.
So it's saving them. It's just being good code.
Compilers may optimize this out. Under the common x86 conventions, you should not expect eax, ecx, and edx to survive a call, while ebx, esi, and edi are supposed to be preserved by the function being called, which is why compilers save them.
So what's it doing next?
This is lea, which is load effective address.
This is really a shortcut, used a lot of times to do some quick math,
because you can combine multiple addition, subtraction, and multiplication operations together, and it's pretty fast. But in this context, it's asking:
what is ebp minus my stack size?
And then it takes that location, the address itself rather than the value stored there, and puts it into edi.
What's it using that for? I don't really know just yet.
So the next instruction puts the value of 36 hex into ecx, and then it's moving
this CC value into eax.
I know what this is now, because I built this program
using debug, right up here, so
it's adding some extra safety-net type code.
If by accident, like if we have a buffer overrun,
or something goes wrong and eip ends up pointing in here,
it'll break here. CC is
int 3, which is a software interrupt, which means it will signal a debugger
that something has gone wrong.
This is a repeat string instruction
that has to do with edi
and esi,
the destination register and the source register.
It stores to the value at this address.
Whatever esi is pointing to is generally
what the string operation will read, or generally the values in esi and edi are what the string
instructions use, and these repeat instructions can act as a loop.
Not a whole lot of compilers will produce these instructions,
because they like to have more control,
or they would like to manipulate the logic in their
loops a bit more. But the string operations were made by Intel because they noticed a lot of the processing their chips were doing was string operations.
It turned out they weren't that much faster than the other ways compilers figured out how to do them.
So they're not that common, but you still see them every once in a while.
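Here is a small C sketch of how the repeat-store loop behaves when it fills the locals with CC bytes. It is only a model, not the actual instruction: toy_rep_stosd is a made-up name, edi is an array index rather than a real address, and the direction flag is assumed to be clear so the index moves forward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models "rep stosd": store eax at [edi], advance edi by one dword,
   decrement ecx, and repeat until ecx reaches zero.
   Returns the final value of edi (as an index). */
size_t toy_rep_stosd(uint32_t *memory, size_t edi, uint32_t ecx, uint32_t eax) {
    assert(memory != NULL);
    while (ecx != 0) {
        memory[edi] = eax;   /* the store */
        edi += 1;            /* move forward one dword */
        ecx -= 1;            /* the rep prefix counts down in ecx */
    }
    return edi;
}
```

The count of 36 hex dwords lines up with the earlier allocation: 0x36 dwords times 4 bytes each is 0xD8 bytes, so the loop covers exactly the space that sub esp reserved.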
Next we have hello being stored into var1.
That's what this instruction down here does.
There's a reference here. This is a memory reference to wherever this string is stored in memory,
and it's storing that address
into the location of variable one.
And we're doing the same for var2.
And here is really what I want to show you.
printf is taking three parameters.
The first one pushed is going to be var2.
So it's going to do this maneuver
where it moves
the location into eax and then pushes eax.
It's pretty much a character array, at least if we're keeping it simple.
It's a character array
of ASCII, one byte, one byte ASCII values.
And we're pushing them in the opposite order, so
we're going to go from var1,
I'm sorry, var2 to var1.
We could have used eax again,
but the compiler decided to
store var1 in ecx,
and then it pushes ecx,
and then it pushes this last value. This memory value must contain a reference to this format string.
And then this call is performed,
where printf is executed. Then right after it, add esp, 0Ch repairs
the damage done to the stack by this push, this push, and this push:
12 bytes. There's a push of four here, eight here, 12 here, and 12 in hex is C.
0x0C is
the standard way of writing it, but if you put an h at the end of it, we also know it's hex.
So this is pretty common. These are the standard C libraries that it's using, and that's the calling convention they use, cdecl,
as opposed to standard call, which was made by Microsoft and is pretty much only used by Microsoft. So not really standard at all.
So for this line of code,
we're going to do something similar.
I don't know why that first instruction was produced.
So it's going to do a push.
That was this null byte, or zero. It's going to do another push for the reference to this string,
another push for the reference to this string,
and another push
for this zero.
Then it's going to do a call.
Now this call looks a little weird,
and that's because it has stored
the location of this function in memory.
This ds is the data segment,
dword means 32 bits,
and ptr means pointer.
So it's a 32-bit pointer in the data segment.
This is part of our IAT, or
import address table.
If you don't know what that is, that's okay. It's part of the PE file structure.
It is important, so you should get to know it down the line.
But for now, just know that the location
of this function is stored here,
and we can see it a bit more elegantly in here in a minute.
Notice that after the call there is no add esp.
This function, this MessageBoxA function,
cleaned up the stack itself. It popped 16
bytes, or it added 16 bytes to esp.
So you might be curious to see what
all this other code is here.
Compilers will insert code for you,
and a lot of that has to do with performance, or security checks to make sure the stack isn't corrupted or something,
or to set exception handlers,
especially in debugging code, because I built it in debug mode.
There are extra security checks and wrapper functions and stuff to just double-check things. And then this last instruction is very, very common. You should really get to know xor eax, eax. It will make eax zero, and
the result is kept in eax. So return 0 means
set eax to zero and then run the rest of the function:
the pop, pop, pop that we saw earlier, and
a check to make sure that esp is correct.
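The reason xor eax, eax always produces zero can be checked with one line of C. This is just an illustration of the bit trick; toy_xor_eax_eax is a made-up name, and a C variable is standing in for the register.

```c
#include <assert.h>
#include <stdint.h>

/* Any value XORed with itself is zero, because every bit is compared
   with an identical bit. The starting value does not matter. */
uint32_t toy_xor_eax_eax(uint32_t eax) {
    eax ^= eax;
    assert(eax == 0);
    return eax;
}
```

Whatever was in the register beforehand, the result is the same, which is why compilers use it as a short way to load the return value 0.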
And so, while we're in debugging mode,
if we're trying to figure out why something isn't working, like a particular
function in a third-party library isn't working, you can work out
maybe why. And as I was saying when I mentioned the CC,
these int 3 instructions are CC bytes disassembled. So if you looked at this in a hex editor, it would just be CC CC CC CC and so on. And this is also in case your code kept executing past this return, or maybe a jump was off and jumped down here, or something like that. It happens.
So suppose we want to look at this in IDA.
If we didn't have the source code, which
we often don't for malware,
we would find this very useful. So I'm going to open hello,
the project we made last time. I'm going to go to
debug, because that's what we were just looking at.
It's saying that it noticed that there is some debugging information in there.
Does it want to try to retrieve the file? I'll say no, because we usually don't have that file.
I say no, I don't really like the proximity view.
And this might seem pretty innocent, and you might click through it and try to find
where your few lines of code were,
but remember when I was saying there's
a lot of added code, a lot of
protections built in for the stack,
for performance, to help debuggers?
Well, this is it.
Last time we went
and found the strings that we knew were referenced. But I'm going to show you another technique that's very popular with reverse engineers, and probably the most common technique, which is that we go view the imports. These are all the functions that our program
has requested to call. You might look at this and be like, hey, I didn't call that function, QueryPerformanceCounter or GetCurrentThreadId or any of that. That's more of the debugging code and
other boilerplate code that Microsoft will insert in there when it's inserting its C libraries.
So I'll find where we call the function.
I'm going to see this at that location,
and I can see where else this memory address is referenced in memory.
And over on the right,
you'll be able to see that it's referenced in two places. But if I want to get a list of all of them, I can hit X.
So these two values are the same address.
And if you look over at type, one is a pointer and the other one is being read; that's what the r stands for and what the p stands for. Below it there's a jump, so something is jumping to it.
So we're just going to go with where we know this thing is being called.
If you remember, this function was a thing built into
this program by the compiler, and it was like, you know, check esp or
whatever. It was checking something about the stack.
But we pretty much get the same instructions that we saw, with the same CC values, the same
push, push, push, push, push, push.
IDA doesn't actually know
what the stack was used for.
These names weren't in the assembly instructions. It figured that out based on
how the assembly was referencing
the various values in the stack. It said, OK, there's probably three local variables in here. It got that right. Sometimes it gets it wrong,
especially with arrays.
But it also figured out that this function was MessageBoxA.
And so it said, OK, MessageBoxA, according to Microsoft, has these parameters, and it knows the code is going to push them onto the stack from right to left, in reverse order.
So it said, OK, what were the last four pushes?
And it's like, OK, so the parameter to this is uType, the parameter to this is caption, the parameter to this is text, and the parameter to this is window handle.
So it's very kindly provided
what the parameters to this message box are. But be warned that sometimes the analysis gets a little messed up, and it might
be a little off, so these are helpful, but you should not completely rely on them.
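The way a disassembler can label the pushes is simple enough to sketch in C: take the documented parameter list and reverse it. This is an illustration of the idea only, not how IDA is implemented; right_to_left_push_order is a made-up name, while hWnd, lpText, lpCaption, and uType are the documented MessageBoxA parameter names.

```c
#include <assert.h>
#include <stddef.h>

/* Given parameter names in declaration order (left to right), fill push_order
   with the order in which a right-to-left convention pushes them. */
void right_to_left_push_order(const char *const params[], size_t count,
                              const char *push_order[]) {
    assert(params != NULL && push_order != NULL);
    for (size_t i = 0; i < count; i++)
        push_order[i] = params[count - 1 - i];   /* last parameter is pushed first */
}
```

For MessageBoxA this gives uType, lpCaption, lpText, hWnd, matching the push 0, push caption, push text, push 0 sequence seen before the call.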
It's also very helpful that it put a little comment here and said, hey, this is referencing a pointer,
and that pointer is pointing to an ASCII string. So it's going to guess that this is a string,
and it names it with an a for ASCII and then whatever the string is,
so that's very helpful.
We can also rename things in IDA.
So if I click this and hit N,
I can rename this variable to var1,
and rename this variable to var2.
And then I can name this call.
I know this is printf.
That's what it said in
my debugger, and that's what it said in my source code.
But this is showing up as sub, short for subroutine, and then
a number, which is the memory address that it is going to.
So I can click on it and then scroll down and get a peek at what's there. And it looks like there's just a jump there,
so I can follow this jump by double-clicking.
I can find all this other stuff. I like to highlight the calls and jumps, and I can see that it's calling into several other things. And I'm just thinking,
is this printf? Is this something else? Well, as it turns out, Microsoft
doesn't actually include
libraries that are standard. Microsoft implemented its own C runtime libraries.
Usually they're bundled as the Visual C runtime libraries, or VC
whatever. There are a lot of libraries out there for programmers to use,
but they're pretty standard.
This is not it. This is more
debugging code, and it's also more code to check certain things,
like whether the parameters that you gave
printf were safe, because there have been vulnerabilities. But they can't change the functionality of
standard library calls, because
that might break old software if you tried to recompile it.
So we know somewhere down this line there's going to be a printf or equivalent thereof.
I want to find out what exactly this printf function call is.
So I'm going to go over and look in the imports tab.
It's going to be underscore underscore something;
that's generally what the compiler starts a
name with. I can search by hitting Ctrl+F and typing in print.
I can see it right here towards the bottom, but I'm going to type in print anyway.
So there are two function calls:
__stdio_common_vfprintf,
and then above it is __stdio_common_vsprintf_s.
So the compiler is doing its best to protect us from being dumb programmers,
and that's very nice.
If we hit Escape, we can go back in our analysis and get to our original stuff,
and you'll notice that this function wasn't named main.
With X, you can see
what has a reference to this location.
You can see there's a jump.
What has a reference to this location?
A jump further back has a reference to this location,
and you can keep going like that.
Click that, hit X on that, and see what calls that function.
We can see there's more checking code for argv and argc,
and preparation code for us.
We can jump and see what calls this.
We can see that
another function calls this.
If I was curious to see what function calls this,
I could go under View
and then Graphs, or I can right-click.
I can see that this function calls
this function,
and then this function,
so that can be useful if you're trying to get an idea of what a function does, because
you can use this graphing recursively, and you can say, OK, I know
this function has a bunch of functions that it calls.
What are they? How many of them are libraries? Because
labeled function calls, or function names, are those libraries or functions that IDA has recognized.
So if we want to see this
actually executing
and we didn't have the original source code, we can always crack open the OllyDbg debugger
and start hello.exe.
Here Olly is looking in libraries and analyzing them.
So we might not always know
where our code is, similarly to
us not knowing where this code was in IDA.
We don't know where the code we want is in Olly,
so we can do the same sort of thing,
where we can right-click,
search for,
all intermodular calls.
That basically looks for call instructions.
We can find the call instructions, like the one to MessageBoxA.
So we can go there
and see the same sort of
push, push, push, call.
And I can set a breakpoint. We can do that with F2,
and this will set a soft breakpoint. So behind the scenes it's actually putting an
int 3 instruction there.
When that code gets executed, it will stop there,
and then the debugger will take over, replace that byte with the original byte that was there, and then begin to execute
as if nothing had ever happened.
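The byte swap behind a soft breakpoint can be sketched in C on an ordinary buffer. This is a simplified illustration, not how any particular debugger is written: ToyBreakpoint, toy_bp_set, and toy_bp_restore are made-up names, and a real debugger would patch the memory of another process rather than a local array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC   /* the one-byte encoding of int 3 */

/* Remembers what a patched location held so it can be put back. */
typedef struct {
    size_t  address;    /* offset into the code buffer */
    uint8_t original;   /* byte that was there before patching */
    int     active;
} ToyBreakpoint;

/* Save the original byte and overwrite it with int 3. */
void toy_bp_set(uint8_t *code, size_t address, ToyBreakpoint *bp) {
    assert(code != NULL && bp != NULL);
    bp->address  = address;
    bp->original = code[address];
    code[address] = INT3_OPCODE;
    bp->active = 1;
}

/* Put the original byte back so execution can continue as if nothing happened. */
void toy_bp_restore(uint8_t *code, ToyBreakpoint *bp) {
    assert(code != NULL && bp != NULL);
    if (bp->active) {
        code[bp->address] = bp->original;
        bp->active = 0;
    }
}
```

Setting a breakpoint on a buffer that starts with 55 8B EC (push ebp, then mov ebp, esp) turns the first byte into CC, and restoring it brings back the original 55.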
I'm going to put a breakpoint here,
where the entrance to this
main function is, which Olly was kind enough to recognize, and remove this other breakpoint.
It ran, and it has now stopped at my breakpoint.
And now I can do the same sort of step over
that I could with Visual Studio's debugger.
Now this is important.
Here, let me start over.
My breakpoint is still there, so when I hit play it's going to go to the breakpoint.
The first instruction was push ebp.
The next one was mov ebp, esp. And as I was saying earlier, this is important because
here the stack is going to be manipulated.
The thing that's at the top of the stack, a.k.a. the lowest memory address, is the return address from the previous function.
So the previous function that called this one
pushed eip onto the stack.
eip was going to execute this instruction next, but the call
forced it to come to this address. So now I'm going to make my own stack frame here.
So how do I do that safely? Well, the first step is
push ebp, so I save the current base pointer.
Then I'm going to mov ebp, esp. We can see ebp was just overwritten here.
It's now the same value as esp.
Now I'm going to make room on the stack.
The stack has now allocated all of this for us
to use for local variables, and see how they're all filled with CCs?
That was the debugging code that was added in,
because we didn't say this was release-ready, so this thing probably has bugs, and we're going to want to fix them. So just in case any of this gets executed, the debugger will kick in and say, oh, something wrong happened.
So now we have a local stack frame.
We're about to execute another function, so we're saving the registers.
It doesn't know what's in them and doesn't care. It's just trying to be safe.
Those values are now saved.
So now we're going to store the strings.
This is important: the pointer
is being stored in a local variable on the stack.
So I can do step over,
and now we can see a reference to the hello string is right here.
Step over again, and now world is
being stored the same way. Now, if
we were debugging
the release version,
these CC bytes would not be here.
Step over: it's going to push eax, which is
world, then hello, and then it's going to push the format string.
Olly was nice enough to say, oh, printf, OK, you at least have to have one parameter, and that's a format string, because it doesn't know how many parameters are going to get pushed into the
call.
So now I'm going to make this call. I'm just going to step over. I could step in and watch it actually do exactly what it says it's going to do.
I can even flip over to the tab and see it print out.
The values are still on the stack here.
And now, as I step over
the next instruction, this add esp, 0C,
the stack frame has decreased.
The data is still there on the stack,
but the stack pointer,
the extended stack pointer, esp, is now pointed here.
It's not pointing lower. So as far as all these instructions are concerned, this is now the top of the stack. These other values don't matter. They don't care about them. They're going to overwrite them.
As far as the code is concerned,
they no longer exist.
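The point that cleanup only moves the pointer, and does not erase anything, can be sketched with a toy stack in C. This is an illustration under made-up names (ToyStack, toy_push, toy_add_esp, toy_peek) and made-up values, not the program from the video.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 32-bit stack; all names and values are hypothetical. */
typedef struct {
    uint32_t mem[8];
    uint32_t esp;       /* byte offset of the top of the stack */
} ToyStack;

void toy_init(ToyStack *s) {
    for (int i = 0; i < 8; i++)
        s->mem[i] = 0;
    s->esp = (uint32_t)sizeof s->mem;
}

void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Models "add esp, N": only the pointer changes, memory is untouched. */
void toy_add_esp(ToyStack *s, uint32_t bytes) {
    s->esp += bytes;
}

/* Reads a slot by byte offset, even one that is now below the top of the stack. */
uint32_t toy_peek(const ToyStack *s, uint32_t byte_offset) {
    return s->mem[byte_offset / 4];
}
```

After three pushes and an add of 12, the old arguments can still be read from memory; they only disappear when the next push lands on top of them.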
Next it's going to push zero,
push Hello World, push alert, push zero, and call
MessageBoxA. So watch carefully, because
while esp is up here, as soon as we call it, the callee function is going to clean up the stack.
New modules are being loaded, and Olly is analyzing them and disassembling all their code.
Those are some pretty big DLLs.
So now there's a pop-up that's happened, and that's what this function does.
Hit OK. Now that function has finished, and
you can notice the stack has been cleaned up.
The caller didn't need to do that.
That's pretty much the major difference between those calling conventions, and you should be aware of them.
And I don't really care what happens to the rest of this function, so I'm just going to terminate the process
and take a quick look
at the release version.
So it's stopped in what looks to be the same place, with slightly different boilerplate code.
So we're going to
search for intermodular calls
and go to MessageBoxA.
It looks like it's calling printf from there.
So here's another function call, or here's some more code up here, and separating the code is int 3,
just in case something
overran where it's supposed to go.
I'll step over, and we can see it doing this stack manipulation.
It didn't subtract nearly as much
there. It's doing some checking
there. Here's where it's pushing
parameters to printf, the
printf function that Microsoft made that's a little safer.
There's more parameter pushing for another little function call it's going to make,
which is the vfprintf,
and that finished, which means it printed to the screen.
And now it's fixing up the stack.
It says add esp, 0C here. It
is cleaning up the stack and is now about to start pushing
for the next function call, which is MessageBoxA.
I'm going to switch over to wherever I need to click OK. There it is.
And then it's ready to end this function, which is return 0, which is
easy and quick to do. It's xor eax, eax.
We're pretty much done with this, so I'll exit out.
So quickly, on to Cygwin.
We noticed that the compiler in Visual Studio changed
what functions it was calling, and
what wrapper functions it had, based on the release setting.
Cygwin uses the GCC compiler, the GNU C compiler is what it stands for, and
it will produce different code as well.
Every compiler has its own little quirks from the
programmers that made it, so it's kind of important. So here I've compiled this program just like I did in Visual Studio.
I'm going to go find that program,
crack it open in IDA, and see what kind of code it produced.
GCC also inserts a lot of extra code.
This looks to be some parameter checking,
but I'm going to go and find the code that I'm most interested in.
And I see MessageBoxA right there.
OK, it's only referenced in two places. One is a call instruction and the other is a mov instruction. The call instruction is what I'm interested in,
and this is something that I really want to show you. Like I said, push instructions are very expensive, time-wise, and so some compilers have moved away from them,
the GCC compiler in this case.
Instead of pushing onto the stack, it simply does a mov. So it moves the value
onto the stack, because you can do that. You could move the value,
or the pointer to the string.
The pointer is stored in var_C.
You could just move that onto the stack.
This way it doesn't have to push,
but it does have to make sure that the stack is the right size by the time it calls printf.
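To see that the two styles end up with the same stack contents, here is a small C sketch comparing them on a toy stack. It is a simplification, not the compiler's actual output: the names ToyStack, args_by_push, args_by_mov, and same_layout are made up, the argument values are placeholders, and in real GCC output the space is typically reserved once in the function prologue rather than right before the stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy 32-bit stack; all names and values are hypothetical. */
typedef struct {
    uint32_t mem[8];
    uint32_t esp;       /* byte offset of the top of the stack */
} ToyStack;

/* Push style: push var2 / push var1 / push fmt (reverse order). */
static void args_by_push(ToyStack *s, uint32_t fmt, uint32_t v1, uint32_t v2) {
    assert(s->esp >= 12);
    s->esp -= 4; s->mem[s->esp / 4] = v2;
    s->esp -= 4; s->mem[s->esp / 4] = v1;
    s->esp -= 4; s->mem[s->esp / 4] = fmt;
}

/* Mov style: make room once, then store each argument relative to esp:
   mov [esp], fmt / mov [esp+4], v1 / mov [esp+8], v2 */
static void args_by_mov(ToyStack *s, uint32_t fmt, uint32_t v1, uint32_t v2) {
    assert(s->esp >= 12);
    s->esp -= 12;
    s->mem[s->esp / 4 + 0] = fmt;
    s->mem[s->esp / 4 + 1] = v1;
    s->mem[s->esp / 4 + 2] = v2;
}

/* Returns 1 if both styles leave the same esp and the same memory contents. */
int same_layout(void) {
    ToyStack a = { {0}, (uint32_t)sizeof a.mem };
    ToyStack b = { {0}, (uint32_t)sizeof b.mem };
    args_by_push(&a, 0x1000, 0x2000, 0x3000);
    args_by_mov(&b, 0x1000, 0x2000, 0x3000);
    return a.esp == b.esp && memcmp(a.mem, b.mem, sizeof a.mem) == 0;
}
```

From the callee's point of view the two are indistinguishable: the format string is at the top of the stack either way, with the other arguments above it.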
And now it's trying to do something similar
when it calls MessageBoxA.
It does another little trick,
which malware will do frequently:
it will have a function pointer,
store that function pointer in a register, and then call that register.
This makes it very hard for disassemblers to figure out
what functions are being called from where,
unless it's something more intelligent like IDA Pro. But even then, IDA Pro will frequently not catch it.
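In C source, this trick is just a call through a function pointer. The sketch below is an illustration with made-up names (binary_fn, add_numbers, mul_numbers, call_indirect); the exact instructions a compiler emits for it will vary, but when the target is only known at run time the call is typically indirect, through a register or memory location, rather than to a fixed address.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*binary_fn)(int, int);

static int add_numbers(int a, int b) { return a + b; }
static int mul_numbers(int a, int b) { return a * b; }

/* The target is chosen at run time, so a static disassembly of this function
   cannot tell from the call instruction alone which function will run. */
int call_indirect(int which, int a, int b) {
    binary_fn target = which ? mul_numbers : add_numbers;
    assert(target != NULL);
    return target(a, b);    /* typically compiled to an indirect call */
}
```

An analyst reading only the call site would have to trace where the pointer came from to know which function is reached, which is the extra work this pattern creates.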
Afterwards you'll see sub esp.
It's not cleaning up
this function call, because MessageBoxA will clean up after itself.
It's adjusting the stack for
all the instructions before it. So for these two function calls,
GCC is trying to be a bit faster than Microsoft Visual Studio's compiler.
That's it for this demo. Thank you for watching. We covered standard call versus cdecl. We looked at different ways code is generated, we stepped through a lot of assembly there, and we talked about
how compilers will choose to do certain things:
how in one instance a lot of push instructions were used to get data onto the stack, and then in another instance, with GCC, it'll just move
the data onto the stack instead of pushing, and then
it'll clean up the stack
when it's done with it.
In the next video, we're going to do some stack analysis on some actual malware.
Hope to see you there.