doc: cookbook: Add "Installing Guix on a Cluster" chapter.

This is derived from the article at
<https://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster/>, with
clarifications and updates.

* doc/guix-cookbook.texi (Installing Guix on a Cluster): New chapter.
Ludovic Courtès 2023-01-05 12:39:06 +01:00 committed by Ludovic Courtès
parent 8b314efd50
commit 47c1de22df
1 changed files with 414 additions and 19 deletions


@ -21,7 +21,8 @@ Copyright @copyright{} 2020 Brice Waegeneire@*
Copyright @copyright{} 2020 André Batista@*
Copyright @copyright{} 2020 Christine Lemmer-Webber@*
Copyright @copyright{} 2021 Joshua Branson@*
Copyright @copyright{} 2022 Maxim Cournoyer*
Copyright @copyright{} 2022 Maxim Cournoyer@*
Copyright @copyright{} 2023 Ludovic Courtès
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
@ -73,8 +74,9 @@ Weblate} (@pxref{Translating Guix,,, guix, GNU Guix reference manual}).
* Packaging:: Packaging tutorials
* System Configuration:: Customizing the GNU System
* Containers:: Isolated environments and nested systems
* Advanced package management:: Power to the users!
* Advanced package management:: Power to the users!
* Environment management:: Control environment
* Installing Guix on a Cluster:: High-performance computing.
* Acknowledgments:: Thanks!
* GNU Free Documentation License:: The license of this document.
@ -83,28 +85,45 @@ Weblate} (@pxref{Translating Guix,,, guix, GNU Guix reference manual}).
@detailmenu
--- The Detailed Node Listing ---
Scheme tutorials
* A Scheme Crash Course:: Learn the basics of Scheme
Packaging
* Packaging Tutorial:: Let's add a package to Guix!
* Packaging Tutorial:: A tutorial on how to add packages to Guix.
System Configuration
* Auto-Login to a Specific TTY:: Automatically Login a User to a Specific TTY
* Customizing the Kernel:: Creating and using a custom Linux kernel on Guix System.
* Guix System Image API:: Customizing images to target specific platforms.
* Using security keys:: How to use security keys with Guix System.
* Connecting to Wireguard VPN:: Connecting to a Wireguard VPN.
* Customizing a Window Manager:: Handle customization of a Window manager on Guix System.
* Running Guix on a Linode Server:: Running Guix on a Linode Server. Running Guix on a Linode Server
* Setting up a bind mount:: Setting up a bind mount in the file-systems definition.
* Getting substitutes from Tor:: Configuring Guix daemon to get substitutes through Tor.
* Setting up NGINX with Lua:: Configuring NGINX web-server to load Lua modules.
* Auto-Login to a Specific TTY:: Automatically Login a User to a Specific TTY
* Customizing the Kernel:: Creating and using a custom Linux kernel on Guix System.
* Guix System Image API:: Customizing images to target specific platforms.
* Using security keys:: How to use security keys with Guix System.
* Connecting to Wireguard VPN:: Connecting to a Wireguard VPN.
* Customizing a Window Manager:: Handle customization of a Window manager on Guix System.
* Running Guix on a Linode Server:: Running Guix on a Linode Server
* Setting up a bind mount:: Setting up a bind mount in the file-systems definition.
* Getting substitutes from Tor:: Configuring Guix daemon to get substitutes through Tor.
* Setting up NGINX with Lua:: Configuring NGINX web-server to load Lua modules.
* Music Server with Bluetooth Audio:: Headless music player with Bluetooth output.
Containers
* Guix Containers:: Perfectly isolated environments
* Guix System Containers:: A system inside your system
Advanced package management
* Guix Profiles in Practice:: Strategies for multiple profiles and manifests.
Environment management
* Guix environment via direnv:: Setup Guix environment with direnv
Installing Guix on a Cluster
* Setting Up a Head Node:: The node that runs the daemon.
* Setting Up Compute Nodes:: Client nodes.
* Cluster Network Access:: Dealing with network access restrictions.
* Cluster Disk Usage:: Disk usage considerations.
* Cluster Security Considerations:: Keeping the cluster secure.
@end detailmenu
@end menu
@ -3635,6 +3654,380 @@ will have predefined environment variables and procedures.
Run @command{direnv allow} to setup the environment for the first time.
@c *********************************************************************
@node Installing Guix on a Cluster
@chapter Installing Guix on a Cluster
@cindex cluster installation
@cindex high-performance computing, HPC
@cindex HPC, high-performance computing
Guix is appealing to scientists and @acronym{HPC, high-performance
computing} practitioners: it makes it easy to deploy potentially complex
software stacks, and it lets you do so in a reproducible fashion---you
can redeploy the exact same software on different machines and at
different points in time.
In this chapter we look at how a cluster sysadmin can install Guix for
system-wide use, such that it can be used on all the cluster nodes, and
discuss the various tradeoffs@footnote{This chapter is adapted from a
@uref{https://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster/,
blog post published on the Guix-HPC web site in 2017}.}.
@quotation Note
Here we assume that the cluster is running a GNU/Linux distro other than
Guix System and that we are going to install Guix on top of it.
@end quotation
@menu
* Setting Up a Head Node:: The node that runs the daemon.
* Setting Up Compute Nodes:: Client nodes.
* Cluster Network Access:: Dealing with network access restrictions.
* Cluster Disk Usage:: Disk usage considerations.
* Cluster Security Considerations:: Keeping the cluster secure.
@end menu
@node Setting Up a Head Node
@section Setting Up a Head Node
The recommended approach is to set up one @emph{head node} running
@command{guix-daemon} and exporting @file{/gnu/store} over NFS to
compute nodes.
Remember that @command{guix-daemon} is responsible for spawning build
processes and downloads on behalf of clients (@pxref{Invoking
guix-daemon,,, guix, GNU Guix Reference Manual}), and more generally
accessing @file{/gnu/store}, which contains all the package binaries
built by all the users (@pxref{The Store,,, guix, GNU Guix Reference
Manual}). ``Client'' here refers to all the Guix commands that users
see, such as @code{guix install}. On a cluster, these commands may be
running on the compute nodes and we'll want them to talk to the head
node's @code{guix-daemon} instance.
To begin with, the head node can be installed following the usual binary
installation instructions (@pxref{Binary Installation,,, guix, GNU Guix
Reference Manual}). Thanks to the installation script, this should be
quick. Once installation is complete, we need to make some adjustments.
Since we want @code{guix-daemon} to be reachable not just from the head
node but also from the compute nodes, we need to arrange for it to
listen for connections over TCP/IP. To do that, we'll edit the systemd
startup file for @command{guix-daemon},
@file{/etc/systemd/system/guix-daemon.service}, and add a
@code{--listen} argument to the @code{ExecStart} line so that it looks
something like this:
@example
ExecStart=/var/guix/profiles/per-user/root/current-guix/bin/guix-daemon --build-users-group=guixbuild --listen=/var/guix/daemon-socket/socket --listen=0.0.0.0
@end example
For these changes to take effect, the service needs to be restarted:
@example
systemctl daemon-reload
systemctl restart guix-daemon
@end example
@quotation Note
The @code{--listen=0.0.0.0} bit means that @code{guix-daemon} will
process @emph{all} incoming TCP connections on port 44146
(@pxref{Invoking guix-daemon,,, guix, GNU Guix Reference Manual}). This
is usually fine in a cluster setup where the head node is reachable
exclusively from the cluster's local area network---you don't want that
to be exposed to the Internet!
@end quotation
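To confirm that the daemon is now listening over TCP, one can inspect
the listening sockets on the head node. This is a minimal sketch; it
assumes that the @command{ss} tool (from iproute2) is installed and that
the default port, 44146, is in use:
@example
# List listening TCP sockets and keep only the guix-daemon port.
ss -tln | grep ':44146'
@end example
@noindent
A line mentioning @code{0.0.0.0:44146} suggests that the
@code{--listen} change took effect.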
The next step is to define our NFS exports in
@uref{https://linux.die.net/man/5/exports,@file{/etc/exports}} by adding
something along these lines:
@example
/gnu/store *(ro)
/var/guix *(rw,async)
/var/log/guix *(ro)
@end example
The @file{/gnu/store} directory can be exported read-only since only
@command{guix-daemon} on the head node will ever modify it.
@file{/var/guix} contains @emph{user profiles} as managed by @code{guix
package}; thus, to allow users to install packages with @code{guix
package}, this must be read-write.
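After editing @file{/etc/exports}, the NFS server has to re-read it.
The sketch below assumes that the head node runs the standard Linux NFS
server with the @command{exportfs} tool from nfs-utils; other NFS
implementations will need different commands:
@example
# Re-export everything listed in /etc/exports.
exportfs -ra
# Show what is currently exported, with the options in effect.
exportfs -v
@end example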
Users can create as many profiles as they like in addition to the
default profile, @file{~/.guix-profile}. For instance, @code{guix
package -p ~/dev/python-dev -i python} installs Python in a profile
reachable from the @code{~/dev/python-dev} symlink. To make sure that
this profile is protected from garbage collection---i.e., that Python
will not be removed from @file{/gnu/store} while this profile exists---,
@emph{home directories should be mounted on the head node} as well so
that @code{guix-daemon} knows about these non-standard profiles and
avoids collecting software they refer to.
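One way to check that such a profile is known to the daemon is to list
the garbage collector roots on the head node. This is a sketch that
assumes the @file{~/dev/python-dev} profile from the example above, and
that it is run by the user who owns that profile:
@example
# List this user's GC roots and look for the custom profile.
guix gc --list-roots | grep python-dev
@end example
@noindent
If nothing shows up, the home directory is probably not visible from
the head node.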
It may be a good idea to periodically remove unused bits from
@file{/gnu/store} by running @command{guix gc} (@pxref{Invoking guix
gc,,, guix, GNU Guix Reference Manual}). This can be done by adding a
crontab entry on the head node:
@example
root@@master# crontab -e
@end example
@noindent
... with something like this:
@example
# Every day at 5AM, run the garbage collector to make sure
# at least 10 GB are free on /gnu/store.
0 5 * * * /usr/local/bin/guix gc -F10G
@end example
We're done with the head node! Let's look at compute nodes now.
@node Setting Up Compute Nodes
@section Setting Up Compute Nodes
First of all, we need compute nodes to mount those NFS directories that
the head node exports. This can be done by adding the following lines
to @uref{https://linux.die.net/man/5/fstab,@file{/etc/fstab}}:
@example
@var{head-node}:/gnu/store /gnu/store nfs defaults,_netdev,vers=3 0 0
@var{head-node}:/var/guix /var/guix nfs defaults,_netdev,vers=3 0 0
@var{head-node}:/var/log/guix /var/log/guix nfs defaults,_netdev,vers=3 0 0
@end example
@noindent
... where @var{head-node} is the name or IP address of your head node.
From there on, assuming the mount points exist, you should be able to
mount each of these on the compute nodes.
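As a sketch of that step, assuming the @file{/etc/fstab} entries above
are in place and the commands are run as root on a compute node:
@example
# Create the mount points if they do not exist yet.
mkdir -p /gnu/store /var/guix /var/log/guix
# Mount each file system; options are taken from /etc/fstab.
mount /gnu/store
mount /var/guix
mount /var/log/guix
# Check that the store is now served by the head node.
findmnt /gnu/store
@end example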
Next, we need to provide a default @command{guix} command that users can
run when they first connect to the cluster (eventually they will invoke
@command{guix pull}, which will provide them with their ``own''
@command{guix} command). Similar to what the binary installation script
did on the head node, we'll store that in @file{/usr/local/bin}:
@example
mkdir -p /usr/local/bin
ln -s /var/guix/profiles/per-user/root/current-guix/bin/guix \
/usr/local/bin/guix
@end example
We then need to tell @code{guix} to talk to the daemon running on our
head node, by adding these lines to @code{/etc/profile}:
@example
GUIX_DAEMON_SOCKET="guix://@var{head-node}"
export GUIX_DAEMON_SOCKET
@end example
To avoid warnings and make sure @code{guix} uses the right locale, we
need to tell it to use locale data provided by Guix (@pxref{Application
Setup,,, guix, GNU Guix Reference Manual}):
@example
GUIX_LOCPATH=/var/guix/profiles/per-user/root/guix-profile/lib/locale
export GUIX_LOCPATH
# Here we must use a valid locale name. Try "ls $GUIX_LOCPATH/*"
# to see what names can be used.
LC_ALL=fr_FR.utf8
export LC_ALL
@end example
For convenience, @code{guix package} automatically generates
@file{~/.guix-profile/etc/profile}, which defines all the environment
variables necessary to use the packages---@code{PATH},
@code{C_INCLUDE_PATH}, @code{PYTHONPATH}, etc. Thus it's a good idea to
source it from @code{/etc/profile}:
@example
GUIX_PROFILE="$HOME/.guix-profile"
if [ -f "$GUIX_PROFILE/etc/profile" ]; then
. "$GUIX_PROFILE/etc/profile"
fi
@end example
Last but not least, Guix provides command-line completion notably for
Bash and zsh. In @code{/etc/bashrc}, consider adding this line:
@verbatim
. /var/guix/profiles/per-user/root/current-guix/etc/bash_completion.d/guix
@end verbatim
Voilà!
You can check that everything's in place by logging in on a compute node
and running:
@example
guix install hello
@end example
The daemon on the head node should download pre-built binaries on your
behalf and unpack them in @file{/gnu/store}, and @command{guix install}
should create @file{~/.guix-profile} containing the
@file{~/.guix-profile/bin/hello} command.
@node Cluster Network Access
@section Network Access
Guix requires network access to download source code and pre-built
binaries. The good news is that only the head node needs that since
compute nodes simply delegate to it.
It is customary for cluster nodes to have access at best to a
@emph{white list} of hosts. Our head node needs at least
@code{ci.guix.gnu.org} in this white list since this is where it gets
pre-built binaries from by default, for all the packages that are in
Guix proper.
Incidentally, @code{ci.guix.gnu.org} also serves as a
@emph{content-addressed mirror} of the source code of those packages.
Consequently, it is sufficient to have @emph{only}
@code{ci.guix.gnu.org} in that white list.
Software packages maintained in a separate repository such as one of the
various @uref{https://hpc.guix.info/channels, HPC channels} are of
course unavailable from @code{ci.guix.gnu.org}. For these packages, you
may want to extend the white list such that source and pre-built
binaries (assuming third-party servers provide binaries for these
packages) can be downloaded. As a last resort, users can always
download source on their workstation and add it to the cluster's
@file{/gnu/store}, like this:
@verbatim
GUIX_DAEMON_SOCKET=ssh://compute-node.example.org \
guix download http://starpu.gforge.inria.fr/files/starpu-1.2.3/starpu-1.2.3.tar.gz
@end verbatim
The above command downloads @code{starpu-1.2.3.tar.gz} @emph{and} sends
it to the cluster's @code{guix-daemon} instance over SSH.
Air-gapped clusters require more work. At the moment, our suggestion
would be to download all the necessary source code on a workstation
running Guix. For instance, using the @option{--sources} option of
@command{guix build} (@pxref{Invoking guix build,,, guix, GNU Guix
Reference Manual}), the example below downloads all the source code the
@code{openmpi} package depends on:
@example
$ guix build --sources=transitive openmpi
@dots{}
/gnu/store/xc17sm60fb8nxadc4qy0c7rqph499z8s-openmpi-1.10.7.tar.bz2
/gnu/store/s67jx92lpipy2nfj5cz818xv430n4b7w-gcc-5.4.0.tar.xz
/gnu/store/npw9qh8a46lrxiwh9xwk0wpi3jlzmjnh-gmp-6.0.0a.tar.xz
/gnu/store/hcz0f4wkdbsvsdky3c0vdvcawhdkyldb-mpfr-3.1.5.tar.xz
/gnu/store/y9akh452n3p4w2v631nj0injx7y0d68x-mpc-1.0.3.tar.gz
/gnu/store/6g5c35q8avfnzs3v14dzl54cmrvddjm2-glibc-2.25.tar.xz
/gnu/store/p9k48dk3dvvk7gads7fk30xc2pxsd66z-hwloc-1.11.8.tar.bz2
/gnu/store/cry9lqidwfrfmgl0x389cs3syr15p13q-gcc-5.4.0.tar.xz
/gnu/store/7ak0v3rzpqm2c5q1mp3v7cj0rxz0qakf-libfabric-1.4.1.tar.bz2
/gnu/store/vh8syjrsilnbfcf582qhmvpg1v3rampf-rdma-core-14.tar.gz
@end example
(In case you're wondering, that's more than 320@ MiB of
@emph{compressed} source code.)
We can then make a big archive containing all of this (@pxref{Invoking
guix archive,,, guix, GNU Guix Reference Manual}):
@verbatim
$ guix archive --export \
`guix build --sources=transitive openmpi` \
> openmpi-source-code.nar
@end verbatim
@dots{} and we can eventually transfer that archive to the cluster on
removable storage and unpack it there:
@verbatim
$ guix archive --import < openmpi-source-code.nar
@end verbatim
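Note that @command{guix archive --import} rejects archives signed by a
key that the importing side has not authorized. A possible way to
handle this is sketched below; it assumes that the workstation's public
key is in its default location, @file{/etc/guix/signing-key.pub}, and
the file name @file{workstation-key.pub} is only an illustration:
@example
# On the workstation: copy the public key next to the archive.
cp /etc/guix/signing-key.pub workstation-key.pub
# On the head node, as root: authorize that key, then import.
guix archive --authorize < workstation-key.pub
guix archive --import < openmpi-source-code.nar
@end example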
This process has to be repeated every time new source code needs to be
brought to the cluster.
As we write this, the research institutes involved in Guix-HPC do not
have air-gapped clusters though. If you have experience with such
setups, we would like to hear feedback and suggestions.
@node Cluster Disk Usage
@section Disk Usage
@cindex disk usage, on a cluster
A common concern among sysadmins is whether this is all going to eat a
lot of disk space. In our experience, after almost ten years of Guix
usage on HPC clusters, scientific data sets are far more likely to
exhaust disk space than compiled software. Nevertheless, it's worth
taking a look at how Guix contributes to disk usage.
First, having several versions or variants of a given package in
@file{/gnu/store} does not necessarily cost much, because
@command{guix-daemon} implements deduplication of identical files, and
package variants are likely to have a number of common files.
As mentioned above, we recommend having a cron job to run @code{guix gc}
periodically, which removes @emph{unused} software from
@file{/gnu/store}. However, there's always a possibility that users will
keep lots of software in their profiles, or lots of old generations of
their profiles, all of which are ``live'' and cannot be deleted from the
viewpoint of @command{guix gc}.
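To get a rough idea of where things stand, a sysadmin might compare the
space used by the store with the number of items that the garbage
collector considers dead. This is only a sketch, meant to be run on the
head node:
@example
# Space used and available on the file system holding the store.
df -h /gnu/store
# Number of store items that "guix gc" would be free to delete.
guix gc --list-dead | wc -l
@end example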
The solution to this is for users to regularly remove old generations of
their profile. For instance, the following command removes generations
that are more than two months old:
@example
guix package --delete-generations=2m
@end example
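Before deleting anything, users may want to see which generations
exist. The sketch below lists them for the default profile, and then
cleans up a non-default profile; @file{~/dev/python-dev} is just the
example profile name used earlier in this chapter:
@example
# Show the generations of the default profile.
guix package --list-generations
# Same clean-up, applied to a non-default profile.
guix package -p ~/dev/python-dev --delete-generations=2m
@end example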
Likewise, it's a good idea to invite users to regularly upgrade their
profile, which can reduce the number of variants of a given piece of
software stored in @file{/gnu/store}:
@example
guix pull
guix upgrade
@end example
As a last resort, it is always possible for sysadmins to do some of this
on behalf of their users. Nevertheless, one of the strengths of Guix is
the freedom and control users get over their software environment, so we
strongly recommend leaving users in control.
@node Cluster Security Considerations
@section Security Considerations
@cindex security, on a cluster
On an HPC cluster, Guix is typically used to manage scientific software.
Security-critical software such as the operating system kernel and
system services such as @code{sshd} and the batch scheduler remain under
control of sysadmins.
The Guix project has a good track record delivering security updates in
a timely fashion (@pxref{Security Updates,,, guix, GNU Guix Reference
Manual}). To get security updates, users have to run @code{guix pull &&
guix upgrade}.
Because Guix uniquely identifies software variants, it is easy to see if
a vulnerable piece of software is in use. For instance, to check whether
the glibc@ 2.25 variant without the mitigation patch against
``@uref{https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt,Stack
Clash}'' is in use, one can check whether user profiles refer to it at all:
@example
guix gc --referrers /gnu/store/…-glibc-2.25
@end example
This will report whether profiles exist that refer to this specific
glibc variant.
@c *********************************************************************
@node Acknowledgments
@chapter Acknowledgments
@ -3656,8 +4049,10 @@ information on these fine people. The @file{THANKS} file lists people
who have helped by reporting bugs, taking care of the infrastructure,
providing artwork and themes, making suggestions, and more---thank you!
This document includes adapted sections from articles that have previously
been published on the Guix blog at @uref{https://guix.gnu.org/blog}.
This document includes adapted sections from articles that have
previously been published on the Guix blog at
@uref{https://guix.gnu.org/blog} and on the Guix-HPC blog at
@uref{https://hpc.guix.info/blog}.
@c *********************************************************************