Grafts, continued
Guix includes a mechanism called grafts that allows us to provide users with security updates in a timely fashion, even for core packages deep down in the dependency graph. Most users value the benefits of grafts, but newcomers were also unavoidably struck by what turned out to be the undesirable side effect of our graft implementation on user experience. This had been a well-known problem for a while, but 1.1.0 finally addressed these issues.
This article recaps how grafts are implemented, what problems that caused, and how we solved it. It’s a deep dive into core Guix, and I hope it’ll be insightful to all and intriguing to the functional programming geeks among us!
What’s this “graft” thing anyway?
Grafts were introduced in the early days of Guix to address probably the main practical shortcomings of functional software deployment. In a nutshell, functional deployment as implemented by Nix and Guix means that, when a package changes, everything that depends on it must be rebuilt (or re-downloaded). To deploy a security fix in the C library or in Bash, you would thus need to rebuild everything. Even with a huge build farm, that can significantly delay the deployment of fixes; users would find themselves either rebuilding things locally or, at best, re-downloading binaries of everything.
To address this, Guix developers can instead specify a replacement in
a package
definition.
If we have a bug-fix for, say, libc, developers would (1) define a
package for the fixed libc, and (2) add a replacement
field in the
original libc package pointing to that fixed package. The effect is
that only the bug-fix libc needs to be built. When building a
package, the bug-fix libc is automatically grafted onto that package,
such that the resulting package refers to the bug-fix libc. See the
manual
for more.
When “lowering” a high-level package
definition
to a low-level
derivation,
Guix traverses the package dependency graph and identifies a set of
potentially applicable grafts. Why “potentially applicable”? Consider
this scenario: assume perl
has a replacement
; coreutils
has a
dependency on perl
, but it’s a build-time dependency: coreutils
does
not depend on perl
at run time. Thus, coreutils
can be used as is,
there is no need to graft it.
But how do we know whether a dependency is a built-time-only dependency?
The native-inputs
field
of a package usually lists build-time dependencies, but it’s more of a
hint. Ultimately, the set of run-time dependencies, which we call the
references, is the subset of the build-time dependencies that the
garbage collector (GC) in the build daemon finds in the build
result—Section 5.5.1 of Eelco Dolstra’s PhD
thesis describes how the
GC
scans for references. In our example, we first have to actually build
coreutils
before we can tell whether it depends on perl
at
run time.
Guix arranges to graft only when necessary. In this example, guix build coreutils
would return the same as guix build coreutils --no-grafts
. Conversely, since inkscape
has a run-time dependency on
perl
, guix build inkscape
returns a derivation that grafts the
perl
replacement onto the original inkscape
build result, the one
returned by guix build inkscape --no-grafts
. The (simplified)
dependency graph of the derivation for the grafted inkscape
looks like
this:
Grafts are a form of what Build Systems à la Carte by Mokhov et al. (a good read!) refers to as dynamic dependencies: grafting depends on intermediate build results.
Still here? With the background in place, let’s look at the problems that arose.
Grafts, the user interface, and performance
Conceptually, to decide whether to graft a package, we examine the
references of the build result of the ungrafted package. However, we
usually want guix install
& co. to first display an overall build
plan, especially when invoked with --dry-run
:
$ guix install inkscape
The following package will be installed:
inkscape 1.0
71.3 MB will be downloaded:
/gnu/store/xq64iaxx2gmlcgnipj31wjxlf1yd2g2p-gsl-2.6
/gnu/store/zrmhnj3pwchn2msphgnwzwd3q89m46rn-aspell-0.60.8
/gnu/store/5g31zf21lk8nrfd2b8qrm19nwdm9gir9-potrace-1.16
/gnu/store/qpr7387bsjs7rpl6flrwdvn2zbzh5bch-ghostscript-with-cups-9.52
/gnu/store/7y3lvk3xf4im8n44337mc6y0ccysvfia-font-dejavu-2.37
/gnu/store/95n3zvzxzq2bxvdvz292cg04757ij30x-cups-minimal-2.3.1
…
To accommodate that, the pre-1.1.0 implementation of grafts did the
following: when
substitutes
were enabled, it would get the list of references of ungrafted packages
from substitutes; only when substitutes for an ungrafted package are
missing would it first try to build that package. Thus, when
substitutes are available, guix install
and similar commands would be
able to display the build plan upfront. However, when a packages had no
substitutes, you’d see Guix start building it without saying a word
about the build plan, which was arguably
confusing.
But it’s worse than that. Grafting is per-package, so every time you would lower a package to a derivation, you would need to answer the question “does this specific package have substitutes, and if so, should it be grafted?” The end result was poor resource usage and terrible user interface feedback. For every package that is a graft candidate, the user would see that infamous line:
updating substitutes from 'https://ci.guix.gnu.org'...
The problem was particularly acute when building whole systems with
guix system
because there would typically be a large number of such
packages. Furthermore, each of these lines would correspond to
(roughly) a single HTTP GET request on a fresh TLS connection. That can
be slow… and annoying. Perhaps to some users this updating substitutes
stuttering was the proof of the developers’ incompetence
and perhaps, truth be told, to some of us developers it was a small
price to pay for the sophistication of grafts.
For users who disable substitutes and build everything locally, the
situation wasn’t much better: all the packages candidate for grafting
would be built one by one, thereby missing parallelization opportunities
as specified by
--max-jobs
.
Gathering dynamic dependencies
To address this, all these individual dynamic dependencies need to be gathered somehow instead of being treated one by one. Conceptually, we would like to, roughly, do a first pass lowering packages to derivations as if grafting was disabled, build all these derivations, and then do a second pass to determine which packages in the graph need to be grafted and to compute the relevant grafting derivation. That would address the performance issue: we’d now have as much parallelism as possible, so we wouldn’t query substitutes or build packages one by one. If we reify that second pass to the user interface code, it also addresses the user interface issue by allowing it to display, possibly, two build plans: the “ungrafted” one followed by the grafted one.
The problem is that our API is inherently serial: the
package-derivation
function takes one package, lowers it, and
returns its derivation:
(use-modules (guix)
(gnu packages base)
(gnu packages inkscape))
(define s (open-connection))
(package-derivation s coreutils)
⇒ #<derivation /gnu/store/rpfdbax1py483m9ydhvp73s7dgmn6xh4-coreutils-8.31.drv => /gnu/store/jkj7wxybgcpdamkl6fz7wwbb1ak5wxvk-coreutils-8.31-debug /gnu/store/zibwkb5xavnv6z3gzknfqjsxb9b0izh0-coreutils-8.31 7f6c92e3a000>
(package-derivation s coreutils #:graft? #f)
⇒ #<derivation /gnu/store/rpfdbax1py483m9ydhvp73s7dgmn6xh4-coreutils-8.31.drv => /gnu/store/jkj7wxybgcpdamkl6fz7wwbb1ak5wxvk-coreutils-8.31-debug /gnu/store/zibwkb5xavnv6z3gzknfqjsxb9b0izh0-coreutils-8.31 7f6c92e3a000>
(package-derivation s inkscape)
⇒ #<derivation /gnu/store/jzm2zsq8m0rj8wdsmikc0p2ik0cprrcf-inkscape-0.92.4.drv => /gnu/store/clj8rjnsip8a35hyd9nf4l65w7ahn0gs-inkscape-0.92.4 7f6c9c15b730>
(package-derivation s inkscape #:graft? #f)
⇒ #<derivation /gnu/store/psd31x1fq0v2g594z217mh56xzg21dym-inkscape-0.92.4.drv => /gnu/store/zz28ckjwfxwkx3gsm8sc452kmvfiky6y-inkscape-0.92.4 7f6c90ad4f50>
Lowering includes dealing with grafts, and
that’s why we ended up with one-by-one inefficiencies. An option would
be to make all the API “plural”: have package-derivation
and its
friends accept a list of packages instead of a single one. That would
be a huge amount of work and the end result would be unpleasant to use:
it’s easier to reason one-by-one.
The solution implemented in 1.1.0 instead starts from this observation:
the call graph of package-derivation
mirrors the package graph. Thus,
we could gather dynamic dependencies using monad
trickery
or using “control effects”. We went for the latter, which didn’t have
the “contamination” problem of monads and led to simpler code.
The starting point is that, by definition, code with dynamic
dependencies necessarily calls
build-derivations
.
Taking advantage of delimited continuations in
Guile,
build-derivations
is instrumented to abort to a “build handler”
prompt
when it’s called. The build handler receives the list of derivations to
build along with a continuation to invoke to resume the aborted
computation and start building things. User interface code can install
a build handler that displays what’s going to be built:
(with-build-handler (lambda (continue store things mode)
(show-what-to-build store things)
(continue #t))
…)
To implement dry runs, simply omit the call to continue
and nothing
will be built. (This is a slightly simplified artist view, see
build-notifier
for the real thing.)
Now, we need to take advantage of this mechanism to gather the
individual build-derivations
calls so we can later emit a single
build-derivations
call for all the gathered derivations. The goal is
to effectively gather all the calls for ungrafted packages, build them
all at once, and then resume graft computation.
To achieve that, we write a build handler that, when invoked, returns an
<unresolved>
object that captures what to build and the continuation.
In addition, we provide a primitive to introduce parallelism such
that, if a dynamic dependency is encountered, we keep going and attempt
to compute as much as possible without resolving that dynamic
dependency. These are build-accumulator
and
map/accumulate-builds
.
map/accumulate-builds
is like map
, except that it accumulates and
gathers build-derivations
request.
By using map/accumulate-builds
instead of map
in a few
key
places,
we obtain a good approximation of what we wanted, as illustrated in this
run:
$ guix install zile-on-guile vim-full
The following packages will be installed:
zile-on-guile 2.4.14-0.fd09781
vim-full 8.2.0411
9.4 MB will be downloaded:
/gnu/store/vf7w4yiax38ra7x8aqqvbnc38c0ldgpm-zile-on-guile-2.4.14-0.fd09781
/gnu/store/dnj9wljcck9vdwgp7dwxk00qnnk1g3c5-vim-full-8.2.0411
downloading from https://ci.guix.gnu.org/nar/lzip/dnj9wljcck9vdwgp7dwxk00qnnk1g3c5-vim-full-8.2.0411...
vim-full-8.2.0411 8.9MiB 7.6MiB/s 00:01 [##################] 100.0%
downloading from https://ci.guix.gnu.org/nar/lzip/vf7w4yiax38ra7x8aqqvbnc38c0ldgpm-zile-on-guile-2.4.14-0.fd09781...
zile-on-guile-2.4.14-0.fd09781 140KiB 1.8MiB/s 00:00 [##################] 100.0%
The following derivation will be built:
/gnu/store/d9xms78917w67xq71pqsx5x9s6dmq6d7-profile.drv
The following graft will be made:
/gnu/store/4n6dmg6iwjg0adpcvqygr9wgsnclswss-vim-full-8.2.0411.drv
applying 8 grafts for /gnu/store/4n6dmg6iwjg0adpcvqygr9wgsnclswss-vim-full-8.2.0411.drv...
building /gnu/store/d9xms78917w67xq71pqsx5x9s6dmq6d7-profile.drv...
What we see above is first a build plan that downloads binaries for the two ungrafted packages, followed by a build plan for one grafting derivations: we have successfully preserved parallelism.
The solution resembles the suspending
scheduler discussed in the à
la Carte paper, though decomposition is not as principled as what the
paper describes. It remains an approximation and not the
optimal way to deal with dynamic dependencies. There are still
situations where that shows,
but overall, it’s a significant improvement. Unlike other solutions
prototyped before, this one
has the advantage of being orthogonal and simple: less than 100 new
lines of
code,
and even about 30 lines
removed
from the graft implementation. That alone contributes a lot to the
author’s satisfaction. :-)
Interlude: a scatter/gather pattern?
In the end, we’re just gathering all the build-derivations
calls,
turning them into a single call, and finally calling all the original
site continuations with the result. The same kind of issue shows up
when dealing with sequences of remote procedure calls (RPCs) and HTTP
requests, and it seems there’s a more general pattern lurking here.
Consider code like this:
(map (lambda (thing)
(http-get (thing->url thing)))
lst)
Wouldn’t it be nice if we could somehow capture all the http-get
calls, turn them into a series of pipelined GET
requests, and resume the
continuations with their result?
I haven’t found a standard functional pattern to address this and would welcome ideas!
Dynamic dependencies of all shapes
We have seen how Guix deals with dynamic dependencies. Nix supports a
similar but limited form of dynamic dependencies through
the import
primitive of the
Nix language, which can take the result of a derivation
build;
it does not attempt to gather the resulting buildPaths
calls.
If we take a step back, we can see that Nix and Guix actually support
other forms of dynamic dependencies. For example, it’s possible to
write derivations whose result is a function of the reference graph of
another derivation’s build result. This is achieved in Guix by passing
the #:references-graphs
argument of
gexp->derivation
,
which leads the build daemon to include the reference graph in the
build
environment.
Another form of dynamic dependency is derivation-building derivations or recursive derivations, which were recently implemented in Nix. It supports another form of dynamic dependency where the build process of a derivation can itself create and build derivations (these are moldable tasks in scheduling parlance). It’s a great feature because in a nutshell, it allows Nix to be used not only to compose packages, but also at a finer grain as part of a package build process.
Guix supports yet another form of dynamic dependencies. The newfangled
guix deploy
tool
works by evaluating g-expressions (gexps)
remotely.
For example, before actually deploying an operating system, it first
runs code on the remote node to perform sanity checks: checking whether
the declared file system UUIDs or labels are valid, checking whether
additional kernel modules should be added to the initial RAM disk, and
so forth. To do that,
remote-eval
first builds a derivation that produces a Scheme program, deploys it
along with all its dependencies on that target machine, runs it, and
retrieves the result. This form of dynamic dependency also benefits
from the gathering machinery discussed above.
Conclusion
This is a long article on what may look like a fine point of Guix design and implementation, but there’s much to say about it! Grafts are key to the use of functional deployment in production because they enable quick security updates, and it’s a lot better if they don’t harm the user experience.
The pre-1.1.0 implementation of grafts had a negative impact on the user interface and on performance, which was due to the sequential handling of grafts, one package at a time. In 1.1.0 we addressed it by using delimited continuations to gather dynamic dependencies such as grafts, perform builds in bulk, and resume each derivation computation.
As it turned out, the implementation of dynamic dependencies raises lots of interesting design and implementation issues, and it’s probably not the end of the road!
About GNU Guix
GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the kernel Linux, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, and AArch64 machines.
In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.
Sauf indication contraire, les billets de blog de ce site sont la propriété de leurs auteurs respectifs et publiés sous les termes de la licence CC-BY-SA 4.0 et ceux de la GNU Free Documentation License (version 1.3 ou supérieur, sans section invariante, sans texte de préface ni de postface).