The Data Segment

As the name suggests, the data segment is intended for data. Memory areas used for variables, work buffers and so on should be grouped into one logical segment. If necessary, a program can contain more than one data segment. The i80x86 microprocessor has two segment registers which are commonly used for accessing data segments: DS (Data Segment register) and ES (Extra Segment register). Most instructions that access data in memory take the segment address from the DS register by default. So, if your program contains the instruction

mov ax, Field1

it is treated by the microprocessor as

mov ax, DS:Field1

unless you specifically tell the instruction to use another register, for example

mov ax, ES:Field2

Some instructions use both the DS and ES segment registers. Take, for example, the MOVSB instruction, which copies one byte of a string from one memory location to another. Before performing this instruction you have to load four registers: two segment registers, DS and ES, and two index registers, SI (Source Index) and DI (Destination Index). When MOVSB is repeated with the REP prefix, the CX register must also hold the number of bytes to copy. The following example shows a complete program which processes data located in different segments:

.model large
ExtDat	segment 'DATA'		; this defines extra data segment
;...............................
StringD	db 'This must have enough space to fit a source string'
;.......
ExtDat	ends
;...............................

.data
; this is ordinary DATA segment
;...............................
StringS	db 'This is the source string'
LSource	equ $ - StringS
;...............................

.code
.startup
	mov ax, seg StringS	; load segment address of StringS
	mov ds, ax		; DS points to segment of StringS
	mov ax, seg StringD	; load segment address of StringD
	mov es, ax		; ES points to segment of StringD
	mov si, offset StringS	; DS:SI point to StringS
	mov di, offset StringD	; ES:DI point to StringD
	mov cx, LSource		; load length of StringS into CX
	cld			; clear direction flag so SI and DI count upwards
rep	movsb			; copy CX bytes from DS:SI to ES:DI
;...............................
.exit
	end

The MOVSB instruction copies the string whose initial address is held in the DS:SI register pair into the string addressed by the ES:DI register pair. Note that this instruction is preceded by the REP prefix. Unlike labels, prefixes are not followed by the ":" character; they cause an instruction to be repeated depending on certain conditions. The simplest form of the repetition prefix, used in this example, causes the MOVSB instruction to be repeated n times, where n is the value in the CX register. Each repetition transfers one byte from the memory location at DS:SI to the target byte at ES:DI, then advances SI and DI by one (forwards if the direction flag is clear, backwards if it is set) and decrements CX. This is why the length of the string being processed must be loaded into CX before REP MOVSB is executed.
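
As a rough illustration, assuming the direction flag is clear, the REP MOVSB line in the program above behaves much like the explicit loop sketched below. The labels CopyLoop and CopyDone are hypothetical names introduced here, and the sketch is not an exact equivalent: unlike MOVSB, it overwrites AL and changes the arithmetic flags.

	jcxz CopyDone		; REP repeats nothing when CX = 0
CopyLoop:
	mov al, ds:[si]		; read one byte from the source at DS:SI
	mov es:[di], al		; write it to the destination at ES:DI
	inc si			; advance the source offset
	inc di			; advance the destination offset
	loop CopyLoop		; CX = CX - 1, repeat while CX is not 0
CopyDone:

The single-instruction form is shorter and needs no scratch register, which is why string instructions are usually preferred for block copies.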

The DS and ES registers are not the only segment registers you can use in data processing instructions. For example, if you want to calculate the size of a particular fragment of a program, you could use the following sequence of instructions:

;.......
	mov ax, offset cs:FirstInstr
	mov cx, offset cs:LastInstr
	sub cx, ax
;.......
FirstInstr:
;.......
LastInstr:

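This piece of code calculates the distance in bytes between the instructions labeled FirstInstr and LastInstr and places this value in the CX register. Note that the CS: prefix here only tells the assembler which segment the offsets are relative to; both offsets are constants fixed at assembly time, so the CS register is not read when the MOV instructions are executed. This kind of technique is widely used for calculating the size of the memory area required for a certain part of a program.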
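
When both labels lie in the same segment, the same size can also be obtained as an assembly-time constant, in the same way that LSource is defined in the earlier program. The following is a minimal sketch; FragSize is a hypothetical name introduced here for illustration.

FirstInstr:
;.......
LastInstr:
FragSize	equ LastInstr - FirstInstr	; size of the fragment in bytes
;.......
	mov cx, FragSize		; load the size as an immediate value

Here the subtraction is done by the assembler, so no instructions are spent on it at run time.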

You can also use the data segment registers in instructions which control program flow, such as jumps or calls. For example, suppose that the NewInstr label is located in a data segment. In this case, the instruction:

jmp ds:NewInstr

is valid, but it behaves unexpectedly: the segment address of the next instruction to be performed is still taken from the CS register, while the offset address is equal to the offset of the NewInstr label in the data segment! This is because the assembler always tries to generate the most compact machine code possible. By default it treats JMP as a near jump (JMP SHORT when the target is close enough), and a near jump changes only the IP register, leaving CS untouched. If you want to write a program which creates new executable code in a data segment and then passes control to that code, use a far jump:

jmp far ptr ds:NewInstr

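Notice that this tells the assembler explicitly to generate a far jump. By default, program flow control instructions such as JMP or CALL change only the offset and keep the current contents of CS as the segment address. The specifier FAR PTR (Far Pointer) makes the assembler encode a full segment:offset pair in the instruction, so that both CS and IP are loaded when the jump is executed. The segment part is the one the assembler associates with the operand (here, the segment addressed through DS) and is written into the instruction itself; the DS register is not read when the jump is executed.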

This trick works only if the bytes located where control is passed can be interpreted as valid machine instructions. Make sure that these instructions do not perform tasks such as low-level formatting of a hard disk or transferring all your data through communication ports.
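
The FAR PTR form of the jump has its target fixed when the program is assembled and linked. If the segment of the new code is only known at run time, a different technique can be used instead: an indirect far jump through a pointer kept in memory. The sketch below assumes that DS already holds the segment relative to which the offset of NewInstr is calculated; NewCodePtr is a hypothetical variable introduced here for illustration.

.data
NewCodePtr	dd ?			; far pointer: offset word, then segment word
;.......
.code
;.......
	mov word ptr NewCodePtr, offset NewInstr	; low word = offset
	mov word ptr NewCodePtr+2, ds			; high word = segment
	jmp dword ptr NewCodePtr			; load IP and CS from the pointer

In this form the segment value really is taken from DS at run time, because it is copied into the pointer just before the jump.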

Revised - Sunday, November 14, 1999 05:46 PM Central Standard Time


Page courtesy of Luis Piñuel