OSv lzloader and early loader
The OSv loader is composed of two parts: the LZ loader, which simply decompresses the "real" loader, and the actual loader, which starts up OSv.
The first part of the OSv loader is pretty simple: a C LZ77 implementation that decompresses the loader. It is compiled to a standard ELF file and is called by `boot16.S`. The `lzloader.ld` linker script specifies `uncompress_loader` as `lzloader.elf`'s entry point; `boot16.S` reads the address stored at `lzloader.elf` start + `entry_offset` and calls the function at that address.
The function itself is fairly simple:
fastlz_decompress(&_binary_loader_stripped_elf_lz_start,
(size_t) &_binary_loader_stripped_elf_lz_size,
BUFFER_OUT/*(char *)0x200000*/, MAX_BUFFER);
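To give a feel for what an LZ77-style decompressor does, here is a minimal sketch. The token format below is invented for illustration and is not FastLZ's real stream format; `toy_lz77_decompress` is a hypothetical name. The key idea it shares with any LZ77 decoder is the back-reference: "copy `length` bytes starting `distance` bytes back in the output produced so far".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy LZ77-style token stream (NOT FastLZ's format, invented for this sketch):
//   0x00, n, <n bytes>          literal run: copy n bytes verbatim
//   0x01, distance, length      match: copy `length` bytes starting `distance`
//                               bytes back in the output produced so far
// Returns an empty vector on malformed input.
std::vector<std::uint8_t> toy_lz77_decompress(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const std::uint8_t tag = in[i++];
        if (tag == 0x00) {
            if (i >= in.size()) return {};
            const std::size_t n = in[i++];
            if (i + n > in.size()) return {};
            for (std::size_t k = 0; k < n; ++k) out.push_back(in[i + k]);
            i += n;
        } else if (tag == 0x01) {
            if (i + 2 > in.size()) return {};
            const std::size_t distance = in[i++];
            const std::size_t length = in[i++];
            if (distance == 0 || distance > out.size()) return {};
            // Copy byte by byte so that overlapping matches (length > distance)
            // repeat the pattern, as LZ77 decoders are expected to do.
            const std::size_t from = out.size() - distance;
            for (std::size_t k = 0; k < length; ++k) {
                const std::uint8_t b = out[from + k];
                out.push_back(b);
            }
        } else {
            return {};
        }
    }
    assert(i == in.size());  // every token was consumed exactly
    return out;
}
```

For instance, the stream `{0x00, 2, 'a', 'b', 0x01, 2, 4}` emits the literal `ab` and then a match that copies four bytes from two bytes back, which repeats the pattern.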
We decompress the stream at `_binary_loader_stripped_elf_lz_start` to the hardcoded address `0x200000`. Obviously, `_binary_loader_stripped_elf_lz_start` contains the compressed loader, but where does it come from? The `objcopy` utility comes to the rescue, enabling us to "copy" a raw binary file into a compiled object file. From the manual:
You can access this binary data inside a program by referencing the special symbols that are created by the conversion process. These symbols are called _binary_objfile_start, _binary_objfile_end and _binary_objfile_size. e.g. you can transform a picture file into an object file and then access it in your code using these symbols.
For example
$ echo hello world > hello-1.txt
$ cat ->main.c
#include <stdio.h>
extern char _binary_hello_1_txt_start;
extern char _binary_hello_1_txt_end;
extern char _binary_hello_1_txt_size;
int main(int argc, char** argv) {
size_t sz = (size_t)&_binary_hello_1_txt_size;
(&_binary_hello_1_txt_start)[sz-1] = '\0';
printf("packaged %s\n", &_binary_hello_1_txt_start);
return 0;
}
$ # out ELF arch input format output format
$ objcopy -B i386:x86-64 -I binary -O elf64-x86-64 hello-1.txt hello-1.o
$ gcc main.c hello-1.o
$ ./a.out
packaged hello world
Indeed the makefile includes:
loader-stripped.elf.lz.o: loader-stripped.elf fastlz/lz
# compress the loader's code with LZ
$(call quiet, fastlz/lz loader-stripped.elf, LZ $@)
# convert to object file, exporting _binary_loader_stripped_elf_lz_{start,end,size}
$(call quiet, objcopy -B i386 -I binary -O elf32-i386 loader-stripped.elf.lz $@, OBJCOPY $@)
After lzloader decompresses the loader, `boot16.S` calls the ELF file that now resides at `0x200000` by jumping to the entry point address recorded in its ELF header. This time, we're running a full-fledged C++ program.
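As a rough illustration of "the entry point address recorded in the ELF header", the sketch below reads the `e_entry` field from an ELF image held in memory. It is not the code in `boot16.S` (which is assembly); `elf_entry_point` and `read_le` are hypothetical helper names. It relies only on the standard ELF header layout, where `e_entry` sits at byte offset 24 in both ELF32 and ELF64 and is 4 or 8 bytes wide respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a little-endian unsigned integer of `bytes` bytes (at most 8).
static std::uint64_t read_le(const unsigned char* p, std::size_t bytes) {
    assert(bytes <= 8);
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < bytes; ++i) {
        v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    }
    return v;
}

// Extract e_entry from an ELF image in memory.
// Returns false if the buffer is too short or is not a little-endian ELF file.
bool elf_entry_point(const unsigned char* image, std::size_t size,
                     std::uint64_t* entry) {
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    if (size < 32 || std::memcmp(image, magic, 4) != 0) return false;
    if (image[5] != 1) return false;          // EI_DATA: 1 = little-endian
    const std::size_t e_entry_offset = 24;    // same offset for ELF32 and ELF64
    if (image[4] == 1) {                      // EI_CLASS: 1 = 32-bit
        *entry = read_le(image + e_entry_offset, 4);
    } else if (image[4] == 2) {               // EI_CLASS: 2 = 64-bit
        *entry = read_le(image + e_entry_offset, 8);
    } else {
        return false;
    }
    return true;
}
```

A boot stub would then jump to the returned address; in a hosted program the function is only useful for inspecting images.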
But wait. One cannot just issue `call main` and expect a C++ program to start. Usually the OS sets up a proper environment for your C++ program: it loads the different sections to different locations in virtual memory, zeros the BSS section, sets the `argv` and `argc` command-line variables, and so on. Who does all those things? In OSv, you are the OS.
The entry point of `loader.elf` is `start32` in `boot.S`. Remember that `boot16.S` left the CPU in 32-bit mode before calling `lzloader.elf` and then `loader.elf`, so before we start we have some bookkeeping to do:
Set GDT to flat segments for 64 and 32 bits, initialize segment registers to this flat segment and start using the new GDT:
gdt = . - 8
// base flag limit type base limit
.quad 0x00 a f 9b 000000 ffff # 64-bit code segment
// first descriptor special to enable long mode, see
// http://wiki.osdev.org/X86-64#How_do_I_enable_Long_Mode_.3F
.quad 0x00 c f 93 000000 ffff # 64-bit data segment
.quad 0x00 c f 9b 000000 ffff # 32-bit code segment
gdt_end = .
lgdt gdt_desc
// set all segment registers to 0x10, the second segment
...
// first, we enable 32-bit mode on the new GDT table
ljmp $0x18, $1f
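The three descriptors above can be rebuilt from their fields, which makes the "base flag limit type base limit" comment easier to follow. The sketch below is only an illustration of the standard x86 descriptor layout; `make_descriptor` and `selector` are hypothetical helper names, not part of OSv. Because `gdt` is defined as `. - 8`, the first listed entry has index 1, so its selector is `0x08`.

```cpp
#include <cstdint>

// Pack an x86 segment descriptor from its fields.
//   base:   32-bit segment base
//   limit:  20-bit segment limit
//   access: access byte (0x9b = present code segment, 0x93 = present data segment)
//   flags:  4-bit nibble G, D/B, L, AVL (0xa = G+L for 64-bit code, 0xc = G+D/B)
constexpr std::uint64_t make_descriptor(std::uint32_t base, std::uint32_t limit,
                                        std::uint8_t access, std::uint8_t flags) {
    return (static_cast<std::uint64_t>(limit & 0xffff))
         | (static_cast<std::uint64_t>(base & 0xffffff) << 16)
         | (static_cast<std::uint64_t>(access) << 40)
         | (static_cast<std::uint64_t>((limit >> 16) & 0xf) << 48)
         | (static_cast<std::uint64_t>(flags & 0xf) << 52)
         | (static_cast<std::uint64_t>(base >> 24) << 56);
}

// Selector for GDT entry `index`, with RPL = 0 and TI = 0 (GDT, not LDT).
constexpr std::uint16_t selector(unsigned index) {
    return static_cast<std::uint16_t>(index * 8);
}

// The three entries from the listing, checked at compile time.
static_assert(make_descriptor(0, 0xfffff, 0x9b, 0xa) == 0x00af9b000000ffffULL,
              "64-bit code segment");
static_assert(make_descriptor(0, 0xfffff, 0x93, 0xc) == 0x00cf93000000ffffULL,
              "data segment");
static_assert(make_descriptor(0, 0xfffff, 0x9b, 0xc) == 0x00cf9b000000ffffULL,
              "32-bit code segment");
```

With this numbering, `selector(2)` is the `0x10` loaded into the segment registers and `selector(3)` is the `0x18` used by the `ljmp`.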
Now we need to enable 64-bit mode:
- Disable paging. Nothing to do here, since paging was never enabled.
- Set the PAE enable bit in CR4
```
mov $BOOT_CR4, %eax
mov %eax, %cr4
```
- Load CR3 with the physical address of Page Map Level 4
```
lea ident_pt_l4, %eax
mov %eax, %cr3
```
- Enable long mode by setting the EFER.LME flag in MSR `0xC0000080` (the value `0x900` written here also sets EFER.NXE)
```
mov $0xc0000080, %ecx
mov $0x00000900, %eax
xor %edx, %edx
wrmsr
```
- Enable paging by setting the PG bit in `CR0`
```
mov $BOOT_CR0, %eax
mov %eax, %cr0
```
- Now the CPU is in compatibility mode. Jump through the special 64-bit code descriptor in the GDT, and we're in 64-bit mode.
```
ljmpl $8, $start64
```
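The bits involved in the sequence above can be written down as constants. This is a sketch based on the architectural bit positions only; the actual values of `BOOT_CR0` and `BOOT_CR4` are not shown on this page and are not reproduced here, and `enters_long_mode` is a hypothetical helper that models just the three conditions listed in the steps.

```cpp
#include <cstdint>

constexpr std::uint32_t bit(unsigned n) { return std::uint32_t{1} << n; }

constexpr std::uint32_t CR0_PE   = bit(0);      // protected mode enable
constexpr std::uint32_t CR0_PG   = bit(31);     // paging enable
constexpr std::uint32_t CR4_PAE  = bit(5);      // physical address extension
constexpr std::uint32_t MSR_EFER = 0xc0000080;  // MSR number loaded into %ecx
constexpr std::uint32_t EFER_LME = bit(8);      // long mode enable
constexpr std::uint32_t EFER_NXE = bit(11);     // no-execute enable

// The value moved into %eax before wrmsr is LME plus NXE.
static_assert((EFER_LME | EFER_NXE) == 0x00000900, "EFER value from boot.S");

// Simplified model: long mode becomes active once paging is turned on
// while PAE and EFER.LME are both set.
constexpr bool enters_long_mode(std::uint32_t cr0, std::uint32_t cr4,
                                std::uint32_t efer) {
    return (cr0 & CR0_PG) != 0 && (cr4 & CR4_PAE) != 0 && (efer & EFER_LME) != 0;
}
```

The order matters on real hardware: PAE and LME are set first, and writing the PG bit to `CR0` is the step that actually activates long mode.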
Some more bookkeeping to do once in 64-bit mode:
- Zero the BSS section:

      lea .bss, %rdi
      lea .edata, %rcx
      sub %rdi, %rcx
      xor %eax, %eax
      rep stosb

- Init global variables with information from `boot16.S`. Note that `boot16.S` keeps the ELF header and multiboot addresses in `%ecx` and `%ebx` respectively. `boot.S` saves `%ebx` to `%ebp`.

      mov %rbp, elf_header
      mov %rbx, osv_multiboot_info

- Set the stack pointer. At first, for `main`, it is simply set to an empty `16K` region in the image. Remember that on x64 the stack grows down:

      .align 16
      . = . + 4096*4
      init_stack_top = .
      lea init_stack_top, %rsp

- Call the `premain` function.

      call premain
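Two of the steps above have a direct C++ reading, sketched here under the assumption that we are in an ordinary hosted program rather than the real loader. `zero_range` mirrors what `rep stosb` does with `%al` set to zero, and `stack_top` shows why the label `init_stack_top` is placed after the reserved region; both names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Equivalent of `rep stosb` with %al = 0: store zero bytes from begin up to
// (but not including) end, one byte at a time.
void zero_range(unsigned char* begin, unsigned char* end) {
    assert(begin <= end);
    for (unsigned char* p = begin; p != end; ++p) {
        *p = 0;
    }
}

// Size of the initial stack region reserved in the image: 4096*4 bytes = 16K.
constexpr std::size_t init_stack_size = 4096 * 4;

// The stack grows down, so the initial stack pointer is the highest address
// of the reserved region, not its start.
std::uintptr_t stack_top(std::uintptr_t region_start) {
    return region_start + init_stack_size;
}
```

In the real loader the region boundaries come from linker symbols, and the loop cannot be a library call because no C runtime has been set up yet.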
Now we're in `premain`, and it looks like we're running a regular C++ program. There are, however, a few differences:
- Global variables are not initialized, since the `.init` section in `loader.elf` hasn't run yet.
- The stack is limited to `16K`.
- We haven't initialized the other CPUs, so we're running on the BSP, the CPU that started the system.
What does `premain` do?
- Init terminal
```C++
arch_init_early_console();
```
- We need to use the newer APIC interrupt controller, hence we disable the PIC
```C++
disable_pic();
auto inittab = elf::get_init(elf_header);
```
- Setup thread local storage. This is an interesting topic worthy of a discussion in a separate wiki page.
```C++
setup_tls(inittab);
```
- Run the `.init` functions from `loader.elf`, thereby initializing global variables.
```C++
for (auto init = inittab.start; init < inittab.start + inittab.count; ++init) {
(*init)();
}
```
Which functions run from the `.init` section, and in what order? OSv uses the GCC attribute extension to C++ that allows you to define the initialization order across translation units. All order constants are defined in `include/osv/prio.hh`.
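As a small illustration of priority-ordered initialization, the sketch below uses GCC's `init_priority` attribute, which is the extension this kind of ordering is normally built on; that OSv uses exactly this form is an assumption here. The `Stage` type, the `init_log` helper, and the priorities 200 and 300 are made up for the example (only `console = 101` comes from the list on this page). Lower numbers are constructed first, regardless of declaration order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Construct-on-first-use, so the log exists no matter which global runs first.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// A global object whose constructor records when it ran.
struct Stage {
    explicit Stage(const char* name) {
        assert(name != nullptr);
        init_log().push_back(name);
    }
};

// Declared out of order on purpose. The priorities 300 and 200 are
// hypothetical; GCC constructs lower priority numbers first.
Stage sched_stage   __attribute__((init_priority(300))) = Stage("sched");
Stage console_stage __attribute__((init_priority(101))) = Stage("console");
Stage mempool_stage __attribute__((init_priority(200))) = Stage("mempool");
```

By the time `main` starts, `init_log()` holds the stage names in priority order rather than declaration order. This relies on a compiler extension (GCC, and Clang on most ELF targets), so it is not portable C++.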
Very broadly speaking, we have:
console = 101,
Initializes the platform-specific console (16550A UART on x86-64), which allows us to print debug messages.
sort
Sorts the `.fixup` section for the `safe_load` mechanism.
fpranges
Initialize the free page ranges. See Memory Pages Handling
pt_root
Initialize the page table.
mempool,
vma_list,
reclaimer,
malloc_pools,
Initialize OSv's memory allocator.
pagecache
OSv's implementation of the page cache, which is used for file-backed memory maps.
cpus,
threadlist,
pthread,
notifiers,
sched,
Initialize OSv's threads and scheduler.
acpi,
hpet, // High Precision Timer
idt,
clock,
Initialize hardware devices such as ACPI and the HPET, and set up interrupt handling.
tracepoint_base,
OSv's implementation of tracepoints, a mechanism to trace events in the system.
The last phase is running the loader's `main` function, which takes care of processing OSv's command-line arguments and loading the single executable you wanted to run.