Syscalls
By design, most applications running on OSv do not execute system calls when they call libc functions. For example, an invocation of mmap() is a direct local function call, resolved by the OSv dynamic linker, that involves very few instructions and is therefore very fast. On Linux, the same call is considerably more expensive: it goes through a wrapper function in glibc, which then invokes the system call SYS_mmap, and that involves a CPU ring switch and a virtual address space switch, among other things. This OSv optimization may not be as relevant as one would hope, especially when applications make few mmap() calls, as is often the case, but that is a topic for another story.
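To make the two paths concrete, here is a minimal sketch in C, assuming a Linux-compatible x86_64 or aarch64 environment where SYS_mmap is defined. The helper names map_via_libc() and map_via_syscall() are hypothetical and exist only for this illustration. The first helper is the path most applications take, which on OSv resolves to a direct function call; the second asks for the system call explicitly.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>      /* mmap(), PROT_*, MAP_* */
#include <sys/syscall.h>   /* SYS_mmap */
#include <unistd.h>        /* syscall() */

/* Map anonymous memory through the ordinary libc function. */
void *map_via_libc(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Map anonymous memory by requesting the system call explicitly.
   syscall() returns -1 and sets errno on failure, which is the same
   bit pattern as MAP_FAILED once cast to a pointer. */
void *map_via_syscall(size_t len)
{
    return (void *)syscall(SYS_mmap, NULL, (long)len,
                           (long)(PROT_READ | PROT_WRITE),
                           (long)(MAP_PRIVATE | MAP_ANONYMOUS),
                           -1L, 0L);
}
```

Both helpers return a usable mapping or MAP_FAILED, so a caller can treat them interchangeably and release the memory with munmap().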
Some applications, such as Golang programs or statically linked executables (see this for more details), bypass the libc layer and invoke system calls directly using the SYSCALL (x86_64) or SVC (aarch64) instruction. To support those, OSv implements the system call handler machinery in assembly for both x86_64 and aarch64.
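As an illustration of what bypassing libc looks like, the sketch below issues a zero-argument system call with the raw instruction, assuming GCC or Clang inline assembly and the Linux system call register conventions (number in rax on x86_64, in x8 on aarch64). The names raw_syscall0() and raw_getpid_matches_libc() are hypothetical and not part of OSv.

```c
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* getpid(), used only for comparison */

/* Issue a zero-argument system call without going through libc. */
long raw_syscall0(long number)
{
#if defined(__x86_64__)
    long ret;
    /* Number goes in rax and the result comes back in rax;
       the SYSCALL instruction clobbers rcx and r11. */
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(number)
                     : "rcx", "r11", "memory");
    return ret;
#elif defined(__aarch64__)
    register long x8 __asm__("x8") = number;  /* system call number */
    register long x0 __asm__("x0");           /* return value */
    __asm__ volatile("svc #0"
                     : "=r"(x0)
                     : "r"(x8)
                     : "memory");
    return x0;
#else
#error "this sketch only covers x86_64 and aarch64"
#endif
}

/* Returns 1 when the raw path and the libc path report the same PID. */
int raw_getpid_matches_libc(void)
{
    return raw_syscall0(SYS_getpid) == (long)getpid();
}
```

Code of this shape never touches the dynamic linker, which is why the kernel has to handle the instruction itself.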
On Linux, libc functions like mmap() delegate to the corresponding system calls (SYS_mmap in the example above); in OSv the opposite happens. Just like Linux, OSv handles the SYSCALL and SVC instructions for x86_64 and aarch64 respectively (see syscall_entry in arch/x64/entry.S and handle_system_call in arch/aarch64/entry.S). This tricky low-level assembly code switches to a dedicated system call stack, saves all necessary registers, and delegates to syscall_wrapper() and eventually to the syscall() function implemented in linux.cc. The syscall() function has a switch statement whose cases invoke the relevant libc functions.
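The dispatch step can be sketched as a switch over the system call number. This is a simplified, hypothetical illustration and not the code in linux.cc: the function name dispatch_syscall(), the fixed three-argument signature, and the ENOSYS fallback are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_getpid, SYS_write, SYS_close */
#include <unistd.h>        /* getpid(), write(), close() */

/* Hypothetical dispatcher: route a Linux system call number to the
   libc function of the same name, the reverse of what glibc does. */
long dispatch_syscall(long number, long a0, long a1, long a2)
{
    switch (number) {
    case SYS_getpid:
        return getpid();
    case SYS_write:
        return write((int)a0, (const void *)a1, (size_t)a2);
    case SYS_close:
        return close((int)a0);
    default:
        /* Assumed behaviour for numbers this sketch does not cover. */
        errno = ENOSYS;
        return -1;
    }
}
```

For instance, dispatch_syscall(SYS_write, 1, (long)"hi\n", 3) ends up in the same write() that an application would reach by calling libc directly.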
accept4
bind
clock_getres
clock_gettime
close
connect
dup3
epoll_create1
epoll_ctl
epoll_pwait
epoll_wait
eventfd2
exit
exit_group
fcntl
fdatasync
flock
fstat
fstatat
fsync
ftruncate
futex
getcwd
getdents64
getgid
get_mempolicy
getpeername
getpid
getrandom
getsockname
getsockopt
gettid
getuid
ioctl
listen
lseek
madvise
mincore
mkdir
mkdirat
mmap
munmap
nanosleep
open
openat
pipe2
pread64
pselect6
pwrite64
read
readlinkat
recvfrom
recvmsg
renameat
rt_sigaction
rt_sigprocmask
sched_getaffinity
sched_setaffinity
sched_yield
select
sendmsg
sendto
set_mempolicy
setsockopt
sigaltstack
socket
stat
statfs
symlinkat
tgkill
uname
unlinkat
write
accept
access
alarm
chdir
creat
dup
dup2
epoll_create
eventfd
fallocate
faccessat
fchdir
fstatfs
futimesat
getitimer
getpriority
getrlimit
getrusage
gettimeofday
kill
lstat
mprotect
msync
pause
pipe
poll
ppoll
prctl
readlink
readv
rename
rmdir
sched_get_priority_max
sched_get_priority_min
sendfile
sethostname
setitimer
setpriority
setrlimit
shmget
shmat
shmctl
shmdt
shutdown
socketpair
symlink
sync
sysinfo
time
timerfd_create
timerfd_gettime
timerfd_settime
times
truncate
umask
unlink
utime
utimensat
utimes
writev