We’re so close! We’re currently in long mode, but not ‘real’ long mode. We need to go from this ‘compatibility mode’ to honest-to-goodness long mode. To do this, we need to set up a ‘global descriptor table’.
This table, also known as a GDT, is kind of vestigial. The GDT is used for a style of memory handling called ‘segmentation’, which is in contrast to the paging model that we just set up. Even though we’re not using segmentation, however, we’re still required to have a valid GDT. Such is life.
So let’s set up a minimal GDT. Our GDT will have three entries:
- a ‘zero entry’
- a ‘code segment’
- a ‘data segment’
If we were going to be using the GDT for real stuff, it could have a number of code and data segment entries. But we need at least one of each to have a minimum viable table, so let’s get to it!
The first entry in the GDT is special: it needs to be a zero value. Add this to the bottom of your assembly file:

    section .rodata
    gdt64:
        dq 0

We have a new section: `rodata`. This stands for ‘read only data’, and since we’re not going to modify our GDT, having it be read-only is a good idea.
Next, we have a label: `gdt64`. We’ll use this label later, to tell the hardware where our GDT is located.
Finally, we have `dq 0`. This is ‘define quad-word’, in other words, a 64-bit value. Given that it’s a zero entry, it shouldn’t be too surprising that the value of this entry is zero!
That’s all there is to it.
Next, we need a code segment. Add this below the `dq 0`:

    .code: equ $ - gdt64
        dq (1<<44) | (1<<47) | (1<<41) | (1<<43) | (1<<53)

Let's talk about the `dq` line first. If you recall from the last section, `1<<44` means ‘left shift one 44 places’, which sets the 44th bit. But what about `|`? This means `or`. So, if we `or` a bunch of these values together, we’ll end up with a value that has the 44th, 47th, 41st, 43rd, and 53rd bits set.
Why `|` and not `or`, like before? Well, here, we’re not running assembly instructions: we’re defining some data. There’s no instruction to execute; the assembler works out the value itself while it assembles the file, so the language used is a bit different.
Finally, why these bits? Well, as we’ve seen with other table entries, each bit has a meaning. Here’s a summary:
- 44: ‘descriptor type’: This has to be `1` for code and data segments
- 47: ‘present’: This is set to `1` if the entry is valid
- 41: ‘read/write’: If this is a code segment, `1` means that it’s readable
- 43: ‘executable’: Set to `1` for code segments
- 53: ‘64-bit’: Set to `1` if this is a 64-bit code segment
That’s all we need for a valid code segment!
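If you'd like to double-check the bit arithmetic outside the assembler, here is a small Python sketch. It isn't part of our kernel; it just ORs the same five bits together and lists which ones ended up set. The constant names and the `set_bits` helper are made up for illustration.

```python
# Bit positions from the summary above. The names are illustrative only;
# they are not defined anywhere in our assembly code.
DESCRIPTOR_TYPE = 1 << 44
PRESENT = 1 << 47
READ_WRITE = 1 << 41
EXECUTABLE = 1 << 43
LONG_MODE = 1 << 53

# Same expression as the `dq` line, evaluated by Python instead of the assembler.
code_segment = DESCRIPTOR_TYPE | PRESENT | READ_WRITE | EXECUTABLE | LONG_MODE


def set_bits(value):
    """Return the position of every bit that is set in a 64-bit value."""
    return [bit for bit in range(64) if value & (1 << bit)]


print(f"{code_segment:#018x}")  # 0x00209a0000000000
print(set_bits(code_segment))   # [41, 43, 44, 47, 53]
```

Because every term sets a different bit, the order of the terms in the `dq` expression doesn't matter.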
Oh, but let's not forget about the other line:

    .code: equ $ - gdt64

What's up with this? In a bit, we'll need to reference this entry somehow. But we don't reference the entry by its address; we reference it by an offset. If we needed just an address, we could use `code:`. But we can't, so we need more. Also, note the period at the start: it's `.code:`. This tells the assembler to scope this label under the last label that appeared, so we'll write `gdt64.code` rather than just `code`. Some nice encapsulation.
So that's what's up with the label, but we still have this `equ $ - gdt64` bit. `$` is the current position. So we're subtracting the address of `gdt64` from the current position. Conveniently, that's the offset number we need for later: how far this segment is past the start of the GDT. The `equ` sets the value of the label; in other words, this line is saying "set the `.code` label's value to the current address minus the address of `gdt64`". Got it?
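To see why `$ - gdt64` gives us the offset we want, here is a rough Python model of what the assembler is doing. It assumes each entry is a single 8-byte `dq` and that the entries appear in the order we listed at the start: zero, code, data. The names are purely illustrative.

```python
ENTRY_SIZE = 8  # each `dq` emits one 8-byte quad-word

# Entries in the order they appear after the gdt64 label.
entries = ["zero", "code", "data"]

offsets = {}
position = 0  # plays the role of `$ - gdt64`: bytes emitted so far
for name in entries:
    offsets[name] = position  # what `equ $ - gdt64` records at this point
    position += ENTRY_SIZE    # the `dq` then moves the position forward

print(offsets)  # {'zero': 0, 'code': 8, 'data': 16}
```

So the code segment sits 8 bytes past the start of the table, right after the zero entry.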
Below the code segment, add this for a data segment:

    .data: equ $ - gdt64
        dq (1<<44) | (1<<47) | (1<<41)

We need fewer bits set for a data segment, and they’re all ones we covered before. The only difference is bit 41; for data segments, a `1` means that it’s writable.
We also use the same trick again with the labels, calculating the offset with `equ`.
Here’s our whole GDT:

    section .rodata
    gdt64:
        dq 0
    .code: equ $ - gdt64
        dq (1<<44) | (1<<47) | (1<<41) | (1<<43) | (1<<53)
    .data: equ $ - gdt64
        dq (1<<44) | (1<<47) | (1<<41)

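As a sanity check, here's a Python sketch that builds the same three entries and lays them out the way they'd sit in memory. It assumes little-endian byte order, which is what x86 uses. `struct` is from Python's standard library, and the variable names are just for illustration.

```python
import struct

# The same three values as the `dq` lines in the table.
zero_entry = 0
code_segment = (1 << 44) | (1 << 47) | (1 << 41) | (1 << 43) | (1 << 53)
data_segment = (1 << 44) | (1 << 47) | (1 << 41)

# "<3Q" packs three little-endian unsigned 64-bit values back to back,
# with no padding between them.
gdt = struct.pack("<3Q", zero_entry, code_segment, data_segment)

print(len(gdt))  # 24
for offset in range(0, len(gdt), 8):
    # Print each entry's offset and its bytes in memory order.
    print(offset, gdt[offset:offset + 8].hex())
# 0 0000000000000000
# 8 00000000009a2000
# 16 0000000000920000
```

The bytes look reversed compared to the usual way of writing the number because the lowest byte is stored first.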
We’re so close! Now, to tell the hardware about our GDT. There’s a special assembly instruction for this: `lgdt`. But it doesn’t take the GDT itself; it takes a special structure: two bytes for the length, and eight bytes for the address. So we have to set that up.
Below these `dq`s, add this:

    .pointer:
        dw .pointer - gdt64 - 1
        dq gdt64

To calculate the length, we take the value of this new label, `.pointer`, subtract the value of `gdt64`, and then subtract one more, because the hardware expects this field to hold the size of the table minus one. We could calculate this length manually, but if we do it this way and later add another GDT entry for some reason, it will automatically correct itself, which is nice.
The `dq` here has the address of our table. Straightforward.
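Here's the same calculation as a Python sketch, to make the layout concrete. The address `0x1000` is a made-up placeholder, since the real address of `gdt64` is only known once the kernel is built. The size assumes the three 8-byte entries we defined.

```python
import struct

GDT_SIZE = 3 * 8      # three 8-byte entries: zero, code, data
GDT_ADDRESS = 0x1000  # hypothetical; stands in for the address of gdt64

# Same arithmetic as `.pointer - gdt64 - 1`: the table size minus one.
limit = GDT_SIZE - 1

# "<HQ" is a little-endian 2-byte value followed by an 8-byte value,
# with no padding, matching the `dw` and `dq` pair.
pointer = struct.pack("<HQ", limit, GDT_ADDRESS)

print(limit)         # 23
print(len(pointer))  # 10
```

Two bytes plus eight bytes gives the ten-byte structure that `lgdt` expects.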
So! We’re finally ready to tell the hardware about our GDT. Add this line after all of the paging stuff we did in the last chapter:

    lgdt [gdt64.pointer]

We pass `lgdt` the value of our `.pointer` label. `lgdt` stands for ‘load global descriptor table’. That’s it!
We have all of the prerequisites done! In the next section, we will complete our transition by jumping to long mode.