I Presentation.

The goal here is not to cover every code armoring technique; plenty of resources already describe them. Instead, this post presents four basic anti-debugging and anti-disassembly tricks and then combines them to make code more difficult to analyze.

I used Visual Studio 2010 and wrote the ASM code using inline assembly (the __asm keyword). This makes it easy to add these tricks anywhere in C or C++ code. The drawback is that it only works for applications compiled as 32-bit, since Visual Studio does not support 64-bit inline assembly. You can find more information about Microsoft inline assembly on MSDN.

Some of the tricks described here rely on emitting opcodes directly. A complete Intel x86 assembly guide including opcodes can be found at http://www.mathemainzel.info/files/x86asmref.html

II Some basic tricks

II.1 PEB structure check
Windows provides a native function to check whether a process is being debugged: IsDebuggerPresent, which returns TRUE if a debugger is detected.
Since a call to this function is easy to detect, an alternative is to replace it with the check the function performs internally. On current Windows implementations, IsDebuggerPresent simply reads the BeingDebugged byte of the PEB structure.
The PEB (Process Environment Block) structure contains various process information. Here is the PEB structure as defined by MSDN:

Code:
typedef struct _PEB {
  BYTE                          Reserved1[2];
  BYTE                          BeingDebugged;
  BYTE                          Reserved2[1];
  PVOID                         Reserved3[2];
  PPEB_LDR_DATA                 Ldr;
  PRTL_USER_PROCESS_PARAMETERS  ProcessParameters;
  BYTE                          Reserved4[104];
  PVOID                         Reserved5[52];
  PPS_POST_PROCESS_INIT_ROUTINE PostProcessInitRoutine;
  BYTE                          Reserved6[128];
  PVOID                         Reserved7[1];
  ULONG                         SessionId;
} PEB, *PPEB;
More information including undocumented parts of the structure can be found here. Here is an example of code checking PEB structure:

Code:
__asm
{
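	// Get the TEB address (the TEB stores a pointer to itself at fs:[0x18])
	mov eax, dword ptr fs:[0x18]
	// Get the PEB structure address (stored at offset 0x30 of the TEB)
	mov eax, dword ptr [eax+0x30]
	// Check whether the BeingDebugged byte (offset 2 of the PEB) is set
	cmp byte ptr [eax+2], 0
	je blocEnd
	// Debugger detected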
blocEnd:
	// etc ...
}
For someone analyzing the code, these instructions are easy to spot. Several countermeasures are then possible, such as NOPing out the check or setting the PEB BeingDebugged byte to 0 at runtime so the trick never fires.
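To make the offset arithmetic concrete, here is a minimal portable sketch of the same test written in C. It does not touch the real PEB: MockPeb is a hypothetical stand-in that only mirrors the first documented fields, and peb_reports_debugger is a name I made up for illustration. It simply shows that BeingDebugged sits at offset 2, which is what cmp byte ptr [eax+2],0 relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the first documented PEB fields.
   This is NOT the real winternl.h type, only an illustration. */
typedef struct MockPeb {
    uint8_t Reserved1[2];
    uint8_t BeingDebugged;
    uint8_t Reserved2[1];
} MockPeb;

/* Same test as `cmp byte ptr [eax+2], 0`, applied to a caller-supplied block.
   Returns 1 if the BeingDebugged byte is non-zero, 0 otherwise. */
int peb_reports_debugger(const void *peb)
{
    const uint8_t *bytes = (const uint8_t *)peb;
    assert(peb != NULL);
    return bytes[offsetof(MockPeb, BeingDebugged)] != 0;
}
```

Setting the byte back to 0 in the block passed to this function makes the check return 0 again, which is the runtime countermeasure mentioned above.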

II.2 Inserting garbage opcode
Garbage opcode insertion is a classic anti-disassembly trick. It is nevertheless powerful and can badly confuse disassembly tools, as well as the person analyzing the code.

This method consists of inserting, in the middle of the code, bytes that will never be executed. The inserted byte can be the first byte of a multi-byte instruction, so the disassembler consumes the following bytes as its operands and the real code after it is scrambled. The junk byte is placed behind a conditional jump whose condition is always true, so it is never reached in the normal flow of the process.
In the next example, the xor eax, eax instruction always sets the zero flag, so the code always jumps to the valid label at runtime. When the code is disassembled, however, the 0xEA garbage byte is taken for the start of a valid far jump instruction and the code behind it is not readable.

Code:
__asm
{
	// Will always set zero flag
	xor eax, eax
	jz valid
	// Insert far jump opcode as garbage
	__asm __emit(0xea)
valid:
	... // This will be obfuscated when disassembled
}
So how can this trick be bypassed? The code may be scrambled and unreadable in a disassembler, but it still executes correctly, which means a debugger gets around the problem. Single-stepping through the code reveals the real instructions as they are executed.
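The difference between reading the bytes in order and following the execution flow can be sketched with a toy decoder. Everything below is an assumption made for illustration: the decoder only knows six opcodes, sample_code is a hand-assembled byte sequence (xor eax,eax / jz +1 / 0xEA / real code), and the flow tracer simply assumes the jz is taken, as it always is after xor eax,eax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-assembled sample: the real code starts at offset 5. */
const uint8_t sample_code[] = {
    0x33, 0xC0,                         /* 0: xor eax, eax          */
    0x74, 0x01,                         /* 2: jz +1 (to offset 5)   */
    0xEA,                               /* 4: garbage far jump byte */
    0x90, 0x40, 0x90, 0x90, 0x90, 0x90, /* 5: real code             */
    0x40,                               /* 11: inc eax              */
    0xC3                                /* 12: ret                  */
};

/* Length of the few opcodes this toy decoder knows, 0 if unknown. */
static size_t insn_len(uint8_t op)
{
    switch (op) {
    case 0x33: return 2; /* xor r32, r/m32 (register form only) */
    case 0x74: return 2; /* jz rel8 */
    case 0xEA: return 7; /* jmp far ptr16:32 */
    case 0x40: return 1; /* inc eax */
    case 0x90: return 1; /* nop */
    case 0xC3: return 1; /* ret */
    default:   return 0;
    }
}

/* Linear sweep: decode one instruction after another.
   Returns 1 if `target` is reached as an instruction start. */
int linear_sweep_sees(const uint8_t *code, size_t size, size_t target)
{
    size_t pos = 0;
    assert(code != NULL);
    while (pos < size) {
        size_t n = insn_len(code[pos]);
        if (pos == target) return 1;
        if (n == 0) break;
        pos += n;
    }
    return 0;
}

/* Flow tracing: like single-stepping, follows jz (assumed taken). */
int flow_trace_sees(const uint8_t *code, size_t size, size_t target)
{
    size_t pos = 0;
    assert(code != NULL);
    for (size_t steps = 0; pos < size && steps < size; steps++) {
        uint8_t op = code[pos];
        size_t n = insn_len(op);
        if (pos == target) return 1;
        if (n == 0 || op == 0xC3) break;
        if (op == 0x74 && pos + 1 < size)
            pos += 2 + (size_t)(int8_t)code[pos + 1];
        else
            pos += n;
    }
    return 0;
}
```

With this sample, the linear sweep swallows offsets 5 to 10 as the operand of the fake far jump, while the flow tracer lands on offset 5 directly.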

II.3 Detect software breakpoints
This is an anti-debugging trick. Debuggers can be detected through the software breakpoints they place in the debugged code. When a software breakpoint is set on an instruction, the first byte of that instruction is replaced by an INT 3 instruction (opcode 0xCC). At runtime, the code can scan itself for this opcode. If it is found, the code is most likely being debugged.
In the next example the code will search for 0xCC opcode between blocStart and blocEnd labels.

Code:
__asm
{
        /* We are going to look for breakpoints */
        xor ebx, ebx
        mov bl, 0xCC
blocStart:
        /* Get address and size of block we want to check against breakpoint*/
        mov eax, blocStart   
        mov ecx, blocEnd    
        sub ecx, blocStart
	/* Loop through the code looking for breakpoint */
antiBpLoop:
	// Check if the opcode is 0xCC 
        cmp byte ptr [eax],bl         
        jne continueLoop           
         ... 				 // Debugger detected!!
continueLoop:   
        inc eax
        dec ecx
        jnz antiBpLoop    
...
blocEnd:
}
This trick can be spotted by looking for code that scans for the 0xCC opcode. To disable it, you can NOP out some of the code or simply replace the jne instruction with a jmp. Another possibility is to use hardware breakpoints, which do not modify the code.
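The scan itself is just a byte search. Here is a minimal sketch in plain C that works on an ordinary buffer instead of the running code; find_int3 is a name chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* Returns the offset of the first 0xCC byte in [code, code + size),
   or -1 if the range contains none. */
long find_int3(const uint8_t *code, size_t size)
{
    assert(code != NULL);
    for (size_t i = 0; i < size; i++) {
        if (code[i] == INT3_OPCODE)
            return (long)i;
    }
    return -1;
}
```

A byte-wise search like this cannot tell a breakpoint from a 0xCC that is part of an operand or immediate value. That is why the listing above places mov bl, 0xCC before the blocStart label: the instruction contains the byte 0xCC and would otherwise trigger the check by itself.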

II.4 Self modifying code
This is an anti-disassembly trick. The disassembled code you are looking at is not the code that gets executed, because the code modifies itself at runtime. Usually, self-modification is done using encryption (for more information on code encryption and self-decryption routines, have a look at the Code Segment Encryption post). There are other possibilities as well: the next example shows how to change a single opcode byte, replacing its value with 0x90 (the NOP instruction).

Code:
__asm
{   
        /* Get address of changeMe label in eax*/
        mov eax, changeMe    
        /* Replace first byte in changeMe by a NOP (explicit byte-sized write) */
        mov byte ptr [eax], 0x90
changeMe:   
        ...     
}
Note that this code only works if you have permission to overwrite the code. Code sections are normally not writable. You can change the rights of a given section with a linker pragma.

Code:
// Declare .text as an Executable, Readable, Writable section; this is necessary so the application can rewrite its own executable code
#pragma comment(linker,"/SECTION:.text,ERW")
As with garbage code insertion, single-stepping or setting breakpoints at the right place will reveal the real code.
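The gap between what a static reader sees and what actually runs can be shown without touching real machine code. The sketch below is a toy interpreter, not x86: the opcodes (OP_PATCH in particular) and the function names are hypothetical and exist only for this illustration. The program patches one of its own bytes from "return" to "no-op" before reaching it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes, for illustration only (not x86 semantics). */
enum { OP_PATCH = 0x01, OP_INC = 0x40, OP_NOP = 0x90, OP_RET = 0xC3 };

/* Executes the program in place, applying its self-modifications.
   Returns the accumulator at OP_RET, or -1 on a malformed program. */
int run_toy(uint8_t *code, size_t size)
{
    int acc = 0;
    size_t pc = 0;
    assert(code != NULL);
    while (pc < size) {
        switch (code[pc]) {
        case OP_PATCH: /* OP_PATCH offset value: overwrite one code byte */
            if (pc + 2 >= size || code[pc + 1] >= size) return -1;
            code[code[pc + 1]] = code[pc + 2];
            pc += 3;
            break;
        case OP_INC: acc++; pc++; break;
        case OP_NOP: pc++; break;
        case OP_RET: return acc;
        default:     return -1;
        }
    }
    return -1;
}

/* Reads the program the way a static listing would: same decoding,
   but the patch is never applied because nothing is executed. */
int static_toy(const uint8_t *code, size_t size)
{
    int acc = 0;
    size_t pc = 0;
    assert(code != NULL);
    while (pc < size) {
        switch (code[pc]) {
        case OP_PATCH: pc += 3; break;
        case OP_INC:   acc++; pc++; break;
        case OP_NOP:   pc++; break;
        case OP_RET:   return acc;
        default:       return -1;
        }
    }
    return -1;
}
```

For the program { 0x01, 0x03, 0x90, 0xC3, 0x40, 0x40, 0xC3 }, the static reading stops at the return stored at offset 3, while the executed version has already overwritten that byte and goes on to run the two increments behind it.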

III Combining everything

What we can learn from the four examples above is that anti-debugging tricks can be detected by looking at the disassembled code, while anti-disassembly tricks can be unveiled by debugging the code. So how about combining these methods so that they protect each other?
The example below makes combined use of the four tricks to make the code much more difficult to defeat. Here is how it works:

Some parts of the code are not readable because of garbage code insertion.
There is an obvious indication that the code is looking for software breakpoints and replacing them with the short jump opcode (0xEB).
An INT 3 instruction is hidden inside the code and needs to be replaced, or else the code will not work.
A check of the PEB BeingDebugged byte is also hidden in the code.
We play on the fact that what seems to be useless code has to be executed.
The 0xCC, 0x02 bytes are transformed into "jmp 2" (0xEB, 0x02) by the breakpoint scan, which means the scan cannot simply be removed. The block saves and restores the registers it uses, meaning it can be inserted anywhere inside C or C++ source code.

Code:
__asm
 {
         /* Save used registers so we can call this code anywhere */
        push ebx
        push eax
        push ecx
        /* We are going to look for breakpoints; this will obviously be detected */
        xor ebx, ebx
        mov bl, 0xCC
    blocStart:
        /* Get addr and size of block we want to check against breakpoints */
        mov eax, blocStart    
        mov ecx, blocEnd    
        sub ecx, blocStart
    antiBpLoop:
	// check if the opcode is 0xCC
        cmp byte ptr [eax],bl         
        jne continueLoop          
        // If a breakpoint opcode is detected, it is replaced by the short jump opcode
        mov byte ptr [eax], 0xEB
    continueLoop:   
        inc eax
        dec ecx
	/* Loop */
        jnz antiBpLoop
        /* At this point all breakpoints have been removed */
        mov ecx, 2
	/* Going to insert garbage code */
        xor eax,eax  
        jz valid
 	// Garbage code confusing disassembler using add opcode (0x02)
	// This is done to hide the 0xCC opcode coming next
        __asm __emit(0x02)
    valid:
	// Software breakpoint to be replaced by jump. If not replaced, code fails
        __asm __emit(0xcc)
	// offset to jump (jump to anti debug code). This is executed if previous line breakpoint is not replaced.
        __asm __emit(0x02) 
        ret
	/* Garbage code confusing disassembler using long Add rmw, the code will fail here if CC replaced by something else  */
        __asm __emit(0x81)
        /* Anti-debug code */
        sub ebx, 0xB4
	// Now ebx = 0x18
        mov eax,dword ptr fs:[ebx]
        add ebx, ebx
 	// 0x30 = 0x18 + 0x18
        mov eax,dword ptr [eax+ebx]
	// This is why we moved 2 in ecx register
        cmp byte ptr [eax+ecx],ch
        pop ecx 
        pop eax
        pop ebx
        je blocEnd
	/* Garbage code confusing disassembler using the far jump opcode. This jump is also executed if the process is debugged, leading to a crash */
        __asm __emit(0xea)
    blocEnd:
	... // Behind that  code will be unreadable (probably display as constant). 
}
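Two details of the listing above are easy to get wrong, so here is a small portable check of both. The byte sequence in the usage note is what I would expect the assembler to emit for the two emitted bytes, ret, the 0x81 garbage byte and sub ebx, 0xB4; that encoding is an assumption and was not verified against compiler output. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replaces every 0xCC in the buffer with 0xEB, as the scan loop does.
   Returns the number of bytes replaced. */
size_t patch_int3_to_jmp(uint8_t *code, size_t size)
{
    size_t replaced = 0;
    assert(code != NULL);
    for (size_t i = 0; i < size; i++) {
        if (code[i] == 0xCC) {
            code[i] = 0xEB;
            replaced++;
        }
    }
    return replaced;
}

/* Target offset of a short jump (EB rel8) located at offset `pos`:
   the displacement is relative to the end of the 2-byte instruction. */
size_t short_jmp_target(const uint8_t *code, size_t pos)
{
    assert(code != NULL);
    return pos + 2 + (size_t)(int8_t)code[pos + 1];
}

/* Mirrors the register arithmetic that hides the 0x18 and 0x30 constants. */
void hidden_offsets(uint32_t *teb_self_off, uint32_t *peb_off)
{
    uint32_t ebx = 0xCC;   /* value left in bl by the breakpoint scan */
    assert(teb_self_off != NULL && peb_off != NULL);
    ebx -= 0xB4;           /* sub ebx, 0xB4 */
    *teb_self_off = ebx;   /* used as fs:[ebx] */
    ebx += ebx;            /* add ebx, ebx */
    *peb_off = ebx;        /* used as [eax+ebx] */
}
```

Applied to the bytes CC 02 C3 81 81 EB B4 00 00 00, the patch turns the first two bytes into a short jump whose target is offset 4. That skips both the ret and the 0x81 garbage byte and lands on the first byte of sub ebx, 0xB4. The arithmetic then yields 0x18 and 0x30, the same offsets used openly in section II.1.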
The method described here is far from the strongest possible. You could imagine more sophisticated implementations using other tricks. You could also make the operational code depend on values set during the execution of the protection measures. Imagine, for example, a decryptor routine tightly coupled to self-modifying opcodes in a more complex version of the listing above.
There is one thing to remember: eventually all these tricks will fail, because a good analyst with the right tools and enough time will always be able to dissect software he has his hands on. The question is, how much time is required for that?