7

Are there any lists compiled that provide a list of linux system calls used per function in a standard glibc build?

For example, free() requires mmap, munmap, mprotect, prlimit64, and brk.

If necessary, I can figure this out by grepping the source code or some strace wizardry, but I don't want to reinvent the wheel. I've been searching the web for about a week to no avail, mostly just turning up info on system call wrappers.

I am aware that officially there is no such certainty, but I know from practical experience that this info changes for most functions very rarely.

4
  • 2
    While you might be able to build a list based on the glibc source, I doubt you could build an exhaustive list for all functions. Certain functions may invoke plugins; a third-party plugin is possible, and who knows what it might do. Additionally, in some cases, glibc will make different syscalls depending on your kernel and your architecture. Commented Jul 5 at 23:42
  • 1
    You could grovel through the include files, find all the syscalls, and grep for them in the source. Then all that's left is to predict the path through the code, for all possible inputs. Commented Jul 6 at 0:32
  • Why do you need to know that? Just out of curiosity? Or is there a reason for you to know that? Commented Jul 7 at 10:56
  • @aviro seccomp filtering Commented Jul 7 at 15:51

1 Answer

10

Are there any lists compiled that provide a list of linux system calls used per function in a standard glibc build?

No, because that depends on the target glibc is built for, on the kernel capabilities glibc detects at runtime, and, rarely, due to what I'd consider bugs, also on whether the platform glibc was built on is the same as the one it is targeting¹.

For example, free() requires mmap, munmap, mprotect, prlimit64, and brk.

On your version of Linux, at least.

If necessary I can figure this out by grepping the source code

Nope, some of this is decided at runtime,

or some strace wizardry,

"wizardry" such as simply running strace attached to your process of interest. Do note that even if the libc call a(param *p) makes syscalls sa, sb and sc when *p=42, that need not hold for other values, and especially not when a passed pointer is NULL. (For example, free(0) makes no syscalls; if you're mallocing memory that is already available in the allocation arena, then potentially no syscalls are made, either.)
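Since the stated goal is seccomp filtering, one way to use such strace runs is to extract the syscall names from each trace and compare or merge them across inputs. The following is only a sketch: `syscalls_in_trace` is a hypothetical helper name, and it assumes the plain line format written by `strace -f -o trace.log ./program` (no timestamp options). Whatever it returns is a lower bound for the inputs and environment that were actually traced, not a complete list.

```python
import re

# Assumed line shapes (default strace output, optionally with a pid prefix):
#   brk(NULL) = 0x55...
#   1234 read(3,  <unfinished ...>
#   1234 <... read resumed>"x", 1) = 1
# Signal lines ("--- SIG... ---") and exit lines ("+++ ... +++") do not match.
_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?:<\.\.\. )?(\w+)(?:\(| resumed>)"
)


def syscalls_in_trace(text):
    """Return the set of syscall names that appear in one strace log."""
    names = set()
    for line in text.splitlines():
        match = _CALL.match(line)
        if match:
            names.add(match.group(1))
    return names


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python3 this_script.py run1.log run2.log
    seen = set()
    for path in sys.argv[1:]:
        with open(path) as log:
            seen |= syscalls_in_trace(log.read())
    print(sorted(seen))
```

Tracing the same program with different arguments and comparing the resulting sets is a quick way to see how much the syscall footprint depends on the input.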

You will also notice that even on the same machine, with the same program and the same execution path so far, libc calls might or might not need to make syscalls – for example, depending on whether sysfs is visible to the process, or on the presence of a vDSO that allows querying timers simply by reading userland-mapped hardware memory regions.
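Whether a vDSO is mapped into a process can be checked from its memory map. A minimal sketch, assuming Linux with /proc mounted; `has_vdso` and `current_process_has_vdso` are hypothetical helper names. A mapped vDSO only means that time queries can be answered without entering the kernel; libc may still fall back to a real syscall in some cases.

```python
def has_vdso(maps_text):
    """Return True if a /proc/<pid>/maps listing contains a [vdso] mapping."""
    return any(
        line.rstrip().endswith("[vdso]") for line in maps_text.splitlines()
    )


def current_process_has_vdso():
    """Check this process; returns None if the map cannot be read."""
    # Assumption: /proc is mounted and readable by this process.
    try:
        with open("/proc/self/maps") as maps:
            return has_vdso(maps.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(current_process_has_vdso())
```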

So,

  • neither is it possible from code reading to know what actually happens in the execution context, nor
  • is it static within an execution context what syscalls are involved in execution of a libc function.

but I know from practical experience that this info changes for most functions very rarely.

My experience is different here: especially for libc functions involved in handling files and paths, a lot depends on the kernel capabilities and the privileges of the executing process. Also, glibc (like any other libc) is intended as an abstraction over the underlying operating system API, so changing the usage of that API without making a fuss about it is actually pretty sensible.

You'll notice, for example, that glibc updates on Debian often lead to an update of a lot of AppArmor profiles. For exactly that reason!

So, no, this is not documentable behaviour. You should not depend on implementation details of your libc with respect to the syscalls made.


¹ For example, that can happen when you're cross-compiling glibc for Linux. The configure scripts fail to understand that even when cross-compiling, the getcwd Linux syscall is available; thus, with the right combination of build host and target, you can end up in a situation where the libc call getcwd, instead of making the eponymous Linux syscall, falls back to recursively building the path: listing the parent directory, looking for the entry with the current directory's inode number, and repeating until the root of the file system is reached. This leads to the most interesting effects when any of the directories in that chain happens to be deleted or moved around during that very-many-syscalls operation.
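The fallback described in this footnote can be sketched roughly as follows. This is not glibc's code, only an illustration of the idea in Python; `getcwd_by_walking` is a hypothetical name, and the sketch assumes every directory up to the root can be listed by the calling process.

```python
import os


def getcwd_by_walking():
    """Rebuild the current directory's path without asking the kernel for it."""
    root = os.stat("/")
    root_id = (root.st_dev, root.st_ino)
    here = "."   # relative path to the directory being resolved
    parts = []   # path components, innermost first
    while True:
        st = os.stat(here)
        me = (st.st_dev, st.st_ino)
        if me == root_id:
            break
        parent = os.path.join(here, "..")
        # One directory listing plus one lstat per entry, for every level.
        for name in os.listdir(parent):
            try:
                cand = os.lstat(os.path.join(parent, name))
            except OSError:
                continue  # entry vanished or is inaccessible; keep looking
            if (cand.st_dev, cand.st_ino) == me:
                parts.append(name)
                break
        else:
            # The directory was removed or moved while we were walking.
            raise FileNotFoundError("no matching entry in parent directory")
        here = parent
    return "/" + "/".join(reversed(parts))


if __name__ == "__main__":
    print(getcwd_by_walking())
```

Every level of the path costs a directory listing and a stat per entry, which is why a rename or removal anywhere in the chain during the walk can produce a wrong or failed result.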

2
  • 1
    It gets further complicated by the symbol versioning mechanism: depending on whether the dynamically linked binary requests function@GLIBC_2.4 or @GLIBC_2.12 (or no version at all), a different implementation is used, which may in the end make other syscalls. Commented Jul 7 at 10:17
  • 2
    @PlasmaHH oh if we start taking the shared object mechanism into account, we can give up right there; anything to do with name resolution or user handling might call into NSS or PAM, respectively, including doing things like loading libraries on demand, and these libraries then doing arbitrarily complex things, locally, via network, via shared memory, or pixie dust, that glibc has no knowledge of. Commented Jul 7 at 10:25
