Taming the ‘stat’ storm with a loader cache
It was one of these days where some of us on IRC were rehashing that old
problem—that application startup in Guix causes a
“stat
storm”—and lamenting the
lack of a solution when suddenly, Ricardo
proposes what,
in hindsight, looks like an obvious solution: “maybe we could use a
per-application ld cache?”. A moment where collective thinking exceeds
the sum of our individual thoughts. The result is one of the many
features that made it in the core-updates
branch, slated to be merged
in the coming weeks, one that reduces application startup time.
ELF files and their dependencies
Before going into detail, let’s look at what those “stat
storms” look
like and where they come from. Loading an
ELF
executable involves loading the shared libraries (the .so
files, for
“shared objects”) it depends on, recursively. This is the job of the
loader (or dynamic linker), ld.so
, which is part of the GNU C
Library (glibc) package. What shared libraries an executable like that
of Emacs depends on? The ldd
command answers that question:
$ ldd $(type -P .emacs-27.2-real)
linux-vdso.so.1 (0x00007fff565bb000)
libtiff.so.5 => /gnu/store/l1wwr5c34593gqxvp34qbwdkaf7xhdbd-libtiff-4.2.0/lib/libtiff.so.5 (0x00007fd5aa2b1000)
libjpeg.so.62 => /gnu/store/5khkwz9g6vza1n4z8xlmdrwhazz7m8wp-libjpeg-turbo-2.0.5/lib/libjpeg.so.62 (0x00007fd5aa219000)
libpng16.so.16 => /gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/libpng16.so.16 (0x00007fd5aa1e4000)
libz.so.1 => /gnu/store/rykm237xkmq7rl1p0nwass01p090p88x-zlib-1.2.11/lib/libz.so.1 (0x00007fd5aa1c2000)
libgif.so.7 => /gnu/store/bpw826hypzlnl4gr6d0v8m63dd0k8waw-giflib-5.2.1/lib/libgif.so.7 (0x00007fd5aa1b8000)
libXpm.so.4 => /gnu/store/jgdsl6whyimkz4hxsp2vrl77338kpl0i-libxpm-3.5.13/lib/libXpm.so.4 (0x00007fd5aa1a4000)
[…]
$ ldd $(type -P .emacs-27.2-real) | wc -l
89
(If you’re wondering why we’re looking at .emacs-27.2-real
rather than
emacs-27.2
, it’s because in Guix the latter is a tiny shell wrapper
around the former.)
To load a graphical program like Emacs, the loader needs to load more
than 80 shared libraries! Each is in its own /gnu/store
sub-directory
in Guix, one directory per package.
But how does ld.so
know where to find these libraries in the first
place? In Guix, during the link phase that produces an ELF file
(executable or shared library), we tell the
linker to
populate the RUNPATH
entry of the ELF file with the list of
directories where its dependencies may be found. This is done by
passing
-rpath
options to the linker, which Guix’s “linker
wrapper”
takes care of. The RUNPATH
is the run-time library search path:
it’s a colon-separated list of directories where ld.so
will look for
shared libraries when it loads an ELF file. We can look at the
RUNPATH
of our Emacs executable like this:
$ objdump -x $(type -P .emacs-27.2-real) | grep RUNPATH
RUNPATH /gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib:/gnu/store/01b4w3m6mp55y531kyi1g8shh722kwqm-gcc-7.5.0-lib/lib:/gnu/store/l1wwr5c34593gqxvp34qbwdkaf7xhdbd-libtiff-4.2.0/lib:/gnu/store/5khkwz9g6vza1n4z8xlmdrwhazz7m8wp-libjpeg-turbo-2.0.5/lib:[…]
This RUNPATH
has 39 entries, which roughly corresponds to the number
of direct dependencies of the executable—dependencies are listed as
NEEDED
entries in the ELF file:
$ objdump -x $(type -P .emacs-27.2-real) | grep NEED | head
NEEDED libtiff.so.5
NEEDED libjpeg.so.62
NEEDED libpng16.so.16
NEEDED libz.so.1
NEEDED libgif.so.7
NEEDED libXpm.so.4
NEEDED libgtk-3.so.0
NEEDED libgdk-3.so.0
NEEDED libpangocairo-1.0.so.0
NEEDED libpango-1.0.so.0
$ objdump -x $(type -P .emacs-27.2-real) | grep NEED | wc -l
52
(Some of these .so
files live in the same directory, which is why
there are more NEEDED
entries than directories in the RUNPATH
.)
A system such as Debian that follows the file system hierarchy
standard
(FHS), where all libraries are in /lib
or /usr/lib
, does not have to
bother with RUNPATH
: all .so
files are known to be found in one of
these two “standard” locations. Anyway, let’s get back to our initial
topic: the “stat
storm”.
Walking search paths
As you can guess, when we run Emacs, the loader first needs to locate
and load the 80+ shared libraries it depends on. That’s where things
get pretty inefficient: the loader will search each .so
file Emacs
depends on in one of the 39 directories listed in its RUNPATH
.
Likewise, when it finally finds libgtk-3.so
, it’ll look for its
dependencies in each of the directories in its RUNPATH
. We can see
that at play by tracing system calls with the
strace
command:
$ strace -c emacs --version
GNU Emacs 27.2
Copyright (C) 2021 Free Software Foundation, Inc.
GNU Emacs comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GNU Emacs
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
55.46 0.006629 3 1851 1742 openat
16.06 0.001919 4 422 mmap
11.46 0.001370 2 501 477 stat
4.79 0.000573 4 122 mprotect
3.84 0.000459 4 111 read
2.45 0.000293 2 109 fstat
2.34 0.000280 2 111 close
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.011952 3 3325 2227 total
For this simple emacs --version
command, the loader and emacs
probed
for more than 2,200 files, with the
openat
and
stat
system calls, and most of
these probes were unsuccessful (counted as “errors” here, meaning that
the call returned an error). The fraction of “erroneous” system calls
is no less than 67% (2,227 over 3,325). We can see the desperate search
of .so
files by looking at individual calls:
$ strace -e openat,stat emacs --version
[…]
openat(AT_FDCWD, "/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/01b4w3m6mp55y531kyi1g8shh722kwqm-gcc-7.5.0-lib/lib/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/l1wwr5c34593gqxvp34qbwdkaf7xhdbd-libtiff-4.2.0/lib/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/5khkwz9g6vza1n4z8xlmdrwhazz7m8wp-libjpeg-turbo-2.0.5/lib/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/haswell/x86_64/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/haswell/x86_64", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/haswell/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/haswell", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/x86_64/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/x86_64", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/tls", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/haswell/x86_64/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/haswell/x86_64", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/haswell/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/haswell", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/x86_64/libpng16.so.16", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/x86_64", 0x7ffe428a1c70) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/gnu/store/3x2kak8abb6z2klch72kfff2qxzv00pj-libpng-1.6.37/lib/libpng16.so.16", O_RDONLY|O_CLOEXEC) = 3
[…]
Above is the sequence where we see ld.so
look for libpng16.so.16
,
searching in locations where we know it’s not going to find it. A bit
ridiculous. How does this affect performance? The impact is small in
the most favorable case—on a hot cache, with fast solid state device
(SSD) storage. But it likely has a visible effect in other cases—on a
cold cache, with a slower spinning hard disk drive (HDD), on a network
file system (NFS).
Enter the per-package loader cache
The idea that Ricardo submitted, using a loader cache, makes a lot of
sense: we know from the start that libpng.so
may only be found in
/gnu/store/…-libpng-1.6.37
, no need to look elsewhere. In fact, it’s
not new: glibc has had such a cache “forever”; it’s the
/etc/ld.so.cache
file you can see on FHS distros and which is
typically created by running
ldconfig
when a package has
been installed. Roughly, the cache maps library SONAME
s, such as
libpng16.so.16
, to their file name on disk, say
/usr/lib/libpng16.so.16
.
The problem is that this cache is inherently system-wide: it assumes
that there is only one libpng16.so
on the system; any binary that
depends on libpng16.so
will load it from its one and only location.
This models perfectly matches the FHS, but it’s at odds with the
flexibility offered by Guix, where several variants or versions of the
library can coexist on the system, used by different applications.
That’s the reason why Guix and other non-FHS distros such as NixOS or
GoboLinux typically turn
off
that feature altogether… and pay the cost of those stat
storms.
The insight we gained on that Tuesday evening IRC conversation is that
we could adapt glibc’s loader cache to our setting: instead of a
system-wide cache, we’d have a per-application loader cache. As one
of the last package build
phases,
we’d run ldconfig
to create etc/ld.so.cache
within that package’s
/gnu/store
sub-directory. We then need to modify the loader so it
would look for ${ORIGIN}/../etc/ld.so.cache
instead of
/etc/ld.so.cache
, where ${ORIGIN}
is the location of the ELF file
being loaded. A discussion of these changes is in the issue
tracker; you can see the glibc
patch
and the new make-dynamic-linker-cache
build
phase.
In short, the make-dynamic-linker-cache
phase computes the set of
direct and indirect dependencies of an ELF file using the
file-needed/recursive
procedure and derives from that the library search path, creates a
temporary ld.so.conf
file containing this search path for use by
ldconfig
, and finally runs ldconfig
to actually build the cache.
How does this play out in practice? Let’s try an emacs
build that
uses this new loader cache:
$ strace -c /gnu/store/ijgcbf790z4x2mkjx2ha893hhmqrj29j-emacs-27.2/bin/emacs --version
GNU Emacs 27.2
Copyright (C) 2021 Free Software Foundation, Inc.
GNU Emacs comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GNU Emacs
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
28.68 0.002909 26 110 13 openat
25.13 0.002549 26 96 read
20.41 0.002070 4 418 mmap
9.34 0.000947 10 90 pread64
6.60 0.000669 5 123 mprotect
4.12 0.000418 3 107 1 newfstatat
2.19 0.000222 2 99 close
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.010144 8 1128 24 total
Compared to what we have above, the total number of system calls has been divided by 3, and the fraction of erroneous system calls goes from 67% to 0.2%. Quite a difference! We count on you, dear users, to let us know how this impacts load time for you.
Flexibility without stat
storms
With GNU Stow in the 1990s, and then Nix, Guix, and other distros, the benefits of flexible file layouts rather than the rigid Unix-inherited FHS have been demonstrated—nowadays I see it as an antidote to opaque and bloated application bundles à la Docker. Luckily, few of our system tools have FHS assumptions baked in, probably in large part thanks to GNU’s insistence on a rigorous installation directory categorization in the early days rather than hard-coded directory names. The loader cache is one of the few exceptions. Adapting it to a non-FHS context is fruitful for Guix and for the other distros and packaging tools in a similar situation; perhaps it could become an option in glibc proper?
This is not the end of stat
storms, though. Interpreters and language
run-time systems rely on search paths—GUILE_LOAD_PATH
for Guile,
PYTHONPATH
for Python, OCAMLPATH
for OCaml, etc.—and are equally
prone to stormy application startups. Unlike ELF, they do not have a
mechanism akin to RUNPATH
, let alone a run-time search path cache. We
have yet to find ways to address these.
About GNU Guix
GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, AArch64 and POWER9 machines.
In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.
Sauf indication contraire, les billets de blog de ce site sont la propriété de leurs auteurs respectifs et publiés sous les termes de la licence CC-BY-SA 4.0 et ceux de la GNU Free Documentation License (version 1.3 ou supérieur, sans section invariante, sans texte de préface ni de postface).