CIS-77 Home http://www.c-jump.com/CIS77/CIS77syllabus.htm
Introduction to x86 Assembly Language
1. Advantages of High-Level Languages
In this guide, we describe the basics of 32-bit x86 assembly language programming, covering a small but useful subset of the available instructions and assembler directives. However, real x86 programming is a large and extremely complex universe, much of which is beyond the useful scope of this class. Assembly Language for Intel-Based Computers is one of the best books on the subject. Alternatively, you may try Randall Hyde's free online book, Art of Assembly Language. Download the MASM32 assembler, which you will use to assemble your code into executables. If you like IDEs, get WinAsm as well; it will simplify code editing.
High-level language programs are portable.
(Although some programs could still have a few machine-dependent details, they can be used with little or no modifications on other types of machines.)
High-level instructions:
Program development is faster
Fewer lines of code
Program maintenance is easier
Compiler translates to the target machine language.
2. Why Program in Assembly?
There are some disadvantages...
Assembly language programs are not portable!
Learning assembly is more difficult than learning Java!
Programming in assembly language is a tedious and error-prone process.
High-level languages should be the natural preference for common applications.
3. Here is why...
I just don't consider a utility program that's 4 megabytes big, and contains all sorts of files that the author didn't create, to be really great software. Do you?
Steve Gibson, Gibson Research Corporation.
Assembly language programs contain only the code that is necessary to perform the given task.
Assembly gives direct and complete control over system hardware:
Writing device drivers.
Operating system design.
Embedded systems programming, e.g. aviation industry.
Writing in-line assembly (mixed-mode) in high-level languages such as C/C++, or hybrid programming in assembly and C/C++.
4. Speed, Efficiency, Debugging, Optimization...
There are areas where speed is everything, for example, internet data encryption, aircraft navigational systems, medical hardware control...
There are also areas where space-efficiency is everything: spacecraft control software...
Understanding disassembly view of an executable program is also useful:
for investigating the cause of serious bugs or crashes that require understanding of memory dumps and disassembled code.
for optimizing your code.
for practical and educational purposes.
5. Why MASM?
The 'granddaddy' of all assemblers for the Intel platform, product of Microsoft.
Available since the beginning of the IBM-compatible PCs.
Works in MS-DOS and Windows environments.
It's free: Microsoft no longer sells MASM as a standalone product.
Bundled with the Microsoft Visual Studio product.
Numerous tutorials, books, and samples floating around, many are free or low-cost.
Steve Hutchesson's MASM32 development environment incorporates the MASM assembler and Win32 API tools.
6. Introduction to 80x86 Assembly Language
Logic gates are used at the hardware level.
What is machine language?
How are high-level language concepts, such as if-else statements, realized at the machine level?
What about interactions with the operating system functions?
How is assembly language translated into machine language?
These fundamental questions apply to most computer architectures.
By using assembly, we gain understanding of how the particular model of computer works.
7. Materials on the Web
Such secrets have been revealed to me that all I have written now appears of little value.
St. Thomas Aquinas, December 6, 1273.
Useful links: Assembly-Language Development System v6.1, also at
MASM Reference Guide can be downloaded there, too.
More here: in PDF and MS Word format
Intel and Microsoft MASM 6.1
A web page with a variety of
Intel 80x86 Conditional and Unconditional Branching
Intel 80x86 Boolean and Arithmetic Instruction
You can get Microsoft's Macro Assembler free: download (DDK), which contains both assembler and linker. Also, download Microsoft's for Windows 32-bit Version.
Take a look at Sivarama P. Dandamudi textbook info, , From 8086 to Pentium. Homepage includes free downloadable Microsoft assembler, , and student
Last, but not least, MSDN resource.
8. Useful books, in no particular order
9. Fundamental Concepts
CPU registers
Memory addressing
Representation of data:
numeric formats
character strings
Instructions to operate on 2's complement integers
Instructions to operate on individual bits
Instructions to handle strings of characters
Instructions for branching and looping
Coding of procedures:
transfer of control
parameter passing
local variables
10. Software Environment
The tools we will use include:
Visual Studio development environment...
...edit, assemble, link, manage projects, debug and disassemble programs.
Command-line MASM, Microsoft Macro Assembler...
...produces code for 32-bit flat memory model appropriate to modern Windows.
Test-drive fullscreen 32-bit debuggers: OllyDbg, Visual Studio, WinDbg.
DUMPBIN: command-line utility that examines binary files and disassembles programs.
11. Runtime Environment
Program runs on the processor.
Program uses operating system functions and services.
Program uses one of the memory models:
Real mode flat model, 65,536 bytes of addressable memory (ancient MS-DOS .COM files)
Real mode segmented model, 1 megabyte (prime-time MS-DOS)
Protected mode flat model, modern Windows and Linux:
Addressable Memory: 80486 and Pentium - 4 Gigabytes
As far as 32-bit Vista is concerned, the world ends at 4,096 megabytes.
A 32-bit program can address up to 4 gigabytes of memory.
12. Assembly and C Code Compared
Some simple high-level language instructions can be expressed by a single assembly instruction:
13. More Assembly and C Code
Most high-level language instructions need more than one assembly instruction:
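As a rough illustration of both cases, the C function below annotates each statement with the kind of 32-bit x86 instructions it might correspond to. The function name and the register choices in the comments are assumptions made for this sketch; real compiler output varies.

```c
#include <stdint.h>

/* Hypothetical example: the registers named in the comments are
   illustrative, not the output of any particular compiler. */
uint32_t scale_and_offset(uint32_t a, uint32_t b)
{
    uint32_t x = a;     /* one instruction:   mov eax, a   */
    x++;                /* one instruction:   inc eax      */
    x = x + b * 2;      /* several:           mov ebx, b   */
                        /*                    shl ebx, 1   */
                        /*                    add eax, ebx */
    return x;
}
```

For example, scale_and_offset(1, 3) computes (1 + 1) + 3 * 2.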
14. Assembly vs. Machine Language
Assembly language uses mnemonics, numeric literals, comments, etc.
Machine language instructions are just sequences of 1s and 0s.
Readability of assembly language instructions is much better than the machine language instructions:
15. Controlling Program Flow
Just as in high-level language, you want to control program flow.
The JMP instruction transfers control unconditionally to another instruction.
JMP corresponds to goto statements in high-level languages:
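The correspondence can be sketched in C itself. The function below (a hypothetical name) sums 1..n using only goto, so each goto plays the role of a jump to a label.

```c
#include <stdint.h>

/* Sums 1..n with explicit labels and goto.
   Assumes n is smaller than UINT32_MAX so that i cannot wrap around. */
uint32_t sum_to_n(uint32_t n)
{
    uint32_t total = 0;
    uint32_t i = 1;
top:
    if (i > n) goto done;   /* exit test: a conditional jump in assembly */
    total += i;
    i++;
    goto top;               /* like: jmp top */
done:
    return total;
}
```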
16. Conditional Jumps
Conditional jump is taken only if the condition is met.
Condition testing is separated from branching.
Flag register is used to convey the condition test result.
For example:
17. General-Purpose Registers
Similarly, AL, DL, CL, and BL represent the low-order 8 bits of the registers.
18. Typical Uses of General-Purpose Registers
| Register | Size | Typical Uses |
|---|---|---|
| EAX | 32-bit | Accumulator for operands and results |
| EBX | 32-bit | Base pointer to data in the data segment |
| ECX | 32-bit | Counter for loop operations |
| EDX | 32-bit | Data pointer and I/O pointer |
| EBP | 32-bit | Frame Pointer - useful for stack frames |
| ESP | 32-bit | Stack Pointer - hardcoded into PUSH and POP operations |
| ESI | 32-bit | Source Index - required for some array operations |
| EDI | 32-bit | Destination Index - required for some array operations |
| EIP | 32-bit | Instruction Pointer |
| EFLAGS | 32-bit | Result Flags - hardcoded into conditional operations |
19. x86 Registers
Four 32-bit registers can be used as
Four 32-bit registers EAX, EBX, ECX, EDX.
Four 16-bit registers AX, BX, CX, DX.
Eight 8-bit registers AH, AL, BH, BL, CH, CL, DH, DL.
Some registers have special use...
...ECX for count in LOOP and REPeatable instructions
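The overlap between EAX, AX, AH, and AL can be modeled in C with masks and shifts. This is only a sketch of the aliasing; the helper names are made up for illustration.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX. */
uint16_t get_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* AH is bits 8..15 of EAX. */
uint8_t get_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* AL is bits 0..7 of EAX. */
uint8_t get_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* Writing AL changes only bits 0..7; the upper 24 bits of EAX are kept. */
uint32_t set_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}
```

With EAX = 12345678h, AX is 5678h, AH is 56h, and AL is 78h.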
20. x86 Registers, Cont
21. x86 Control Registers
EIP Program counter (Instruction Pointer)
EFLAGS is a set of bit flags:
Status flags record status information about the result of the last arithmetic/logical instruction.
Direction flag stores forward/backward direction for data copying.
System flags store
IF interrupt-enable mode
TF Trap flag used in single-step debugging.
22. MOV, Data Transfer Instructions
The MOV instruction copies the source operand to the destination operand without affecting the source.
Five types of operand combinations are allowed with MOV:
Note: the above operand combinations are valid for all instructions that require two operands.
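On x86 the five combinations are register to register, immediate to register, memory to register, register to memory, and immediate to memory. There is no memory-to-memory form, so such a copy goes through a register. The C sketch below models that with a local variable standing in for EAX; the function name is hypothetical.

```c
#include <stdint.h>

/* Copies *src to *dst in two steps, as assembly must. */
void copy_dword(uint32_t *dst, const uint32_t *src)
{
    uint32_t eax;       /* stands in for the EAX register        */
    eax  = *src;        /* mov eax, [src]   (memory to register) */
    *dst = eax;         /* mov [dst], eax   (register to memory) */
}
```

After the call the source still holds its original value, as MOV copies rather than moves.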
23. Ambiguous MOVes: PTR and OFFSET
For the following data definitions
The above MOV instructions are ambiguous.
Not clear whether the assembler should use byte or word equivalent of 100.
Better:
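Why the operand size matters can be modeled in C, assuming the ambiguous instruction stores the immediate 100 through a pointer. A byte store and a word store of the same value touch different amounts of memory. The function names are made up for this sketch.

```c
#include <stdint.h>

/* Models: mov BYTE PTR [p], value   ; writes one byte */
void store_byte(uint8_t *p, uint8_t value)
{
    p[0] = value;
}

/* Models: mov WORD PTR [p], value   ; writes two bytes, low byte first (x86 is little-endian) */
void store_word(uint8_t *p, uint16_t value)
{
    p[0] = (uint8_t)(value & 0xFFu);
    p[1] = (uint8_t)(value >> 8);
}
```

Storing 100 as a byte leaves the following byte untouched, while storing it as a word overwrites that byte with 0.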
24. INC and DEC Arithmetic Instructions
Format:
Semantics:
The destination can be 8-bit, 16-bit, or 32-bit operand, in memory or in register.
No immediate operand is allowed.
Examples:
25. ADD Arithmetic Instruction
Format:
Semantics:
Examples:
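A minimal C model of a 32-bit ADD follows, assuming only the carry and zero flags are of interest (the real instruction also sets other status flags). The type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t result;  /* value left in the destination      */
    bool cf;          /* carry flag: unsigned overflow       */
    bool zf;          /* zero flag: result is zero           */
} AddResult;

/* Models: add destination, source   (destination = destination + source) */
AddResult add32(uint32_t destination, uint32_t source)
{
    AddResult r;
    r.result = destination + source;     /* wraps modulo 2^32          */
    r.cf = (r.result < destination);     /* true on carry out of bit 31 */
    r.zf = (r.result == 0);
    return r;
}
```

Adding 1 to FFFFFFFFh wraps to 0 and sets both CF and ZF.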
26. ADD vs. INC

Note that INC destination is better than ADD destination, 1.
INC takes less space.
Both INC and ADD execute at about the same speed.
27. SUB Arithmetic Instruction
Format:
Semantics:
Examples:
28. SUB vs. DEC
Note that DEC destination is better than SUB destination, 1.
DEC takes less space.
Both execute at about the same speed.
29. CMP instruction
Format:
Semantics:
The destination and source are not altered.
Useful for testing relationships such as <, >, or = between the two operands.
Used in conjunction with conditional jump instructions for decision making purposes.
Examples:
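CMP behaves like a subtraction whose result is thrown away, leaving only the flags. The C sketch below models four status flags for 32-bit operands; the struct and function names are assumptions for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool zf;  /* zero flag: operands are equal                 */
    bool cf;  /* carry flag: unsigned borrow (dest < source)    */
    bool sf;  /* sign flag: bit 31 of the difference            */
    bool of;  /* overflow flag: signed overflow in dest - src   */
} Flags;

/* Models: cmp destination, source
   Neither operand is modified; only the flags are produced. */
Flags cmp32(uint32_t destination, uint32_t source)
{
    uint32_t diff = destination - source;   /* computed, then discarded */
    Flags f;
    f.zf = (diff == 0);
    f.cf = (destination < source);
    f.sf = ((diff >> 31) != 0);
    f.of = ((((destination ^ source) & (destination ^ diff)) >> 31) != 0);
    return f;
}
```

A following JE would be taken when zf is true, and a following JB when cf is true.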
30. Unconditional Jumps
Format:
Semantics:
Execution is transferred to the instruction identified by the label.
Infinite loop example:
31. Conditional Jumps
Format:
Semantics:
Execution is transferred to the instruction identified by label only if condition is met.
Testing for carriage return example:
32. Conditional Jumps, Cont
Some conditional jump instructions treat operands of the CMP instruction as signed numbers:
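The difference between the signed and unsigned views can be shown in C by comparing the same bit patterns both ways. JL and JG belong to the signed family, JB and JA to the unsigned family. The helper names are made up, and the cast to int32_t assumes a two's complement machine, which holds for x86.

```c
#include <stdint.h>
#include <stdbool.h>

/* Signed view, as used by JL/JG after a CMP. */
bool less_signed(uint32_t a, uint32_t b)
{
    return (int32_t)a < (int32_t)b;
}

/* Unsigned view, as used by JB/JA after a CMP. */
bool below_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

The pattern FFFFFFFFh is -1 when signed, so it is less than 1, but as an unsigned value it is the largest 32-bit number and is not below 1.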
33. Conditional Jumps, Cont
Some conditional jump instructions can also test values of the individual CPU flags:
34. LOOP Instruction

Format:
Semantics:
Decrements ECX and jumps to target if ECX is not zero after the decrement.
ECX should be loaded with a loop count value before loop begins.
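The LOOP semantics map onto a C do-while loop. The sketch below uses local variables in place of EAX and ECX; the function name and the loop body are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Models:          mov  ecx, count
                    mov  eax, 0
           top:     add  eax, ecx
                    loop top
   The count must be at least 1: LOOP decrements before testing, so a
   starting value of 0 would wrap around and iterate 2^32 times. */
uint32_t sum_with_loop(uint32_t count)
{
    uint32_t eax = 0;
    uint32_t ecx = count;
    do {
        eax += ecx;          /* loop body               */
    } while (--ecx != 0);    /* loop top: dec, then test */
    return eax;
}
```

With a count of 5 the body runs with ECX = 5, 4, 3, 2, 1.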
35. Logical Instructions
Format:
Semantics:
Perform the standard bitwise logical operations.
Result goes to the destination.
TEST is a non-destructive AND instruction:
TEST performs logical AND but the result is not stored in destination (similar to CMP instruction.)
36. Logical Instructions, Cont.
Example of testing the value in AL for odd/even number:
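A C model of that check, assuming the usual idiom of testing bit 0 and branching on the zero flag. The function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Models:  test al, 1       ; AND, result discarded, flags set
            jz   even_label  ; ZF = 1 means bit 0 was clear */
bool is_even(uint8_t al)
{
    bool zf = ((al & 1u) == 0);   /* TEST sets ZF from the AND result */
    return zf;                    /* AL itself is left unchanged       */
}
```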
37. Shift Instructions
Shift left format:
Shift right format:
where count is an immediate value.
Semantics:
Performs left/right bit-shift of destination by the value in count or CL register.
CL register contents are not altered.
38. SHL and SHR Shift Instructions
Bit shifted out goes into the carry flag CF.
Zero bit is shifted in at the other end:
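A one-bit shift can be modeled in C as follows, with the bit that falls off captured as the carry flag. The type and function names are assumptions for this sketch.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t value;  /* destination after the shift */
    bool cf;         /* bit that was shifted out     */
} ShiftResult;

/* Models: shl destination, 1 */
ShiftResult shl1(uint32_t v)
{
    ShiftResult r;
    r.cf = ((v >> 31) != 0);   /* bit 31 goes to CF    */
    r.value = v << 1;          /* zero enters at bit 0 */
    return r;
}

/* Models: shr destination, 1 */
ShiftResult shr1(uint32_t v)
{
    ShiftResult r;
    r.cf = ((v & 1u) != 0);    /* bit 0 goes to CF      */
    r.value = v >> 1;          /* zero enters at bit 31 */
    return r;
}
```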
39. Shift Instructions Examples
Count is an immediate value:
A count greater than 31 is not meaningful for a 32-bit operand.
If a greater value is given, only the least significant 5 bits are actually used.
CL version of shift is useful if shift count is known at run time,
e.g. when the shift count is a parameter in a procedure call.
Only CL register can be used.
Shift count value should be loaded into CL:
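The count masking described above can be sketched in C. The explicit mask also matters on the C side, because shifting a 32-bit value by 32 or more is undefined behavior in C. The function name is hypothetical.

```c
#include <stdint.h>

/* Models: shl destination, cl   on a 32-bit operand */
uint32_t shl_cl(uint32_t destination, uint8_t cl)
{
    uint8_t count = (uint8_t)(cl & 31u);   /* only the low 5 bits of CL are used */
    return destination << count;           /* count is 0..31 here                */
}
```

A count of 33 therefore behaves like a count of 1.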
40. Rotate Instructions
Two types of rotate instructions:
Rotate without carry:
ROL (ROtate Left)
ROR (ROtate Right)
Rotate with carry:
RCL (Rotate through Carry Left)
RCR (Rotate through Carry Right)
Rotate instruction operand is similar to shift instructions and supports two versions:
Immediate count value
Count value is in CL register
41. ROL and ROR, Rotate Without Carry
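In a rotate without carry, bits that leave one end of the operand re-enter at the other end. A C sketch for 32-bit operands follows; the function names are assumptions, and the carry flag update is left out to keep the model short.

```c
#include <stdint.h>

/* Models: rol destination, count */
uint32_t rol32(uint32_t v, uint8_t count)
{
    count &= 31u;
    if (count == 0) return v;                    /* avoids a shift by 32 */
    return (v << count) | (v >> (32 - count));
}

/* Models: ror destination, count */
uint32_t ror32(uint32_t v, uint8_t count)
{
    count &= 31u;
    if (count == 0) return v;                    /* avoids a shift by 32 */
    return (v >> count) | (v << (32 - count));
}
```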
42. RCL and RCR, Rotate With Carry
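In a rotate through carry, the carry flag acts as an extra bit in the rotation: the bit leaving the operand goes into CF, and the old CF enters at the other end. A C sketch of the one-bit case follows, with hypothetical names.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t value;  /* destination after the rotate */
    bool cf;         /* carry flag after the rotate   */
} RotateResult;

/* Models: rcl destination, 1 */
RotateResult rcl1(uint32_t v, bool cf_in)
{
    RotateResult r;
    r.cf = ((v >> 31) != 0);                         /* bit 31 moves into CF    */
    r.value = (v << 1) | (cf_in ? 1u : 0u);          /* old CF moves into bit 0 */
    return r;
}

/* Models: rcr destination, 1 */
RotateResult rcr1(uint32_t v, bool cf_in)
{
    RotateResult r;
    r.cf = ((v & 1u) != 0);                          /* bit 0 moves into CF      */
    r.value = (v >> 1) | (cf_in ? 0x80000000u : 0u); /* old CF moves into bit 31 */
    return r;
}
```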
43. EQU directive
EQU directive eliminates hardcoding:
No reassignment is allowed.
Only numeric constants are allowed.
Defining constants has two main advantages:
Improves program readability
Helps in software maintenance.
Multiple occurrences can be changed from a single place
The convention is to use all UPPER-CASE LETTERS for names of constants.
44. EQU Directive Syntax
Assigns the result of expression to name.
The expression is evaluated at assembly time.
More examples:
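For readers coming from C, the closest analogy to EQU is a #define whose expression is evaluated before the program runs. The constant names below are made up for illustration, and the assembly in the comments shows the EQU form they would correspond to.

```c
/* NUM_OF_ROWS  EQU  50 */
#define NUM_OF_ROWS 50
/* NUM_OF_COLS  EQU  10 */
#define NUM_OF_COLS 10
/* ARRAY_SIZE   EQU  NUM_OF_ROWS * NUM_OF_COLS */
#define ARRAY_SIZE (NUM_OF_ROWS * NUM_OF_COLS)

/* The expression is folded at compile time, just as EQU
   expressions are evaluated at assembly time. */
int array_size(void)
{
    return ARRAY_SIZE;
}
```

Changing NUM_OF_ROWS in one place updates every use of ARRAY_SIZE.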