24 Commits

Author SHA1 Message Date
Ludovic Courtès 472a0e82a5
daemon: Do not deduplicate files smaller than 8 KiB.
Files smaller than 8 KiB typically represent ~70% of the entries in
/gnu/store/.links but only contribute to ~4% of the space savings
afforded by deduplication.

Not considering these files for deduplication speeds up file insertion
in the store and, more importantly, leaves 'removeUnusedLinks' with
fewer entries to traverse, thereby speeding it up proportionally.

Partly fixes <https://issues.guix.gnu.org/24937>.

* config-daemon.ac: Remove symlink hard link check and CAN_LINK_SYMLINK
definition.
* guix/store/deduplication.scm (%deduplication-minimum-size): New
variable.
(deduplicate)[loop]: Do not recurse when FILE's size is below
%DEDUPLICATION-MINIMUM-SIZE.
(dump-port): New procedure.
(dump-file/deduplicate)[hash]: Turn into...
[dump-and-compute-hash]: ... this thunk.
Call 'deduplicate' only when SIZE is greater than
%DEDUPLICATION-MINIMUM-SIZE; otherwise call 'dump-port'.
* nix/libstore/gc.cc (LocalStore::removeUnusedLinks): Drop files where
st.st_size < deduplicationMinSize.
* nix/libstore/local-store.hh (deduplicationMinSize): New declaration.
* nix/libstore/optimise-store.cc (deduplicationMinSize): New variable.
(LocalStore::optimisePath_): Return when PATH is a symlink or smaller
than 'deduplicationMinSize'.
* tests/derivations.scm ("identical files are deduplicated"): Produce
files bigger than %DEDUPLICATION-MINIMUM-SIZE.
* tests/nar.scm ("restore-file-set with directories (signed, valid)"):
Likewise.
* tests/store-deduplication.scm ("deduplicate, below %deduplication-minimum-size"):
New test.
("deduplicate", "deduplicate, ENOSPC"): Produce files bigger than
%DEDUPLICATION-MINIMUM-SIZE.
* tests/store.scm ("substitute, deduplication"): Likewise.
2021-11-16 14:34:28 +01:00
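The size cutoff described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Guile or C++ code the commit touches: `MIN_SIZE`, `deduplicate_file`, the `.dedup-tmp` suffix, and the use of a plain SHA-256 of the file contents as the link name are all assumptions made for the example (the real code hashes the nar serialization).

```python
import hashlib
import os
import stat

MIN_SIZE = 8192  # hypothetical stand-in for %deduplication-minimum-size (8 KiB)


def deduplicate_file(path, links_dir, min_size=MIN_SIZE):
    """Hard-link PATH to a content-addressed entry under LINKS_DIR.

    Return False when PATH is skipped (not a regular file, or smaller
    than MIN_SIZE), True when it was considered for deduplication.
    """
    st = os.lstat(path)
    if not stat.S_ISREG(st.st_mode) or st.st_size < min_size:
        return False  # small files and symlinks never enter LINKS_DIR

    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    os.makedirs(links_dir, exist_ok=True)
    link = os.path.join(links_dir, digest)
    try:
        os.link(path, link)        # first file seen with this content
    except FileExistsError:
        tmp = path + ".dedup-tmp"  # hypothetical temporary name
        os.link(link, tmp)         # new name for the existing content
        os.replace(tmp, path)      # atomically swap it in place of PATH
    return True
```

Because skipped files never get an entry in the links directory, a later pass over that directory has fewer entries to visit, which is the speed-up the commit message describes.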
Ludovic Courtès 4f621a2b00
maint: Require Guile >= 2.2.6.
* configure.ac: For Guile 2.2, require 2.2.6 or later.
* guix/gexp.scm (define-syntax-parameter-once): Remove.
Use 'define-syntax-parameter' instead.
* guix/monads.scm: Likewise.
* guix/inferior.scm (proxy)[select*]: Remove.
* guix/scripts/publish.scm <top level>: Remove replacement for (@@ (web
http) read-header-line).
* guix/store/deduplication.scm (counting-wrapper-port): Remove.
(nar-sha256): Call 'port-position' on PORT to compute SIZE.
2020-12-19 23:25:01 +01:00
Ludovic Courtès 7530e491b5
deduplicate: Create the '.links' directory lazily.
This avoids repeated (mkdir-p "/gnu/store/.links") calls when
deduplicating lots of files.

* guix/store/deduplication.scm (deduplicate): Remove initial call to
'mkdir-p'.  Add ENOENT case in 'link' exception handler.  Reindent.
* tests/store-deduplication.scm ("deduplicate, ENOSPC"): Check
for (<= links 4) to account for the initial 'link' call.
2020-12-15 17:32:12 +01:00
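The lazy-creation idea in the commit above (try 'link' first, create the directory only on ENOENT) can be sketched in Python. The function name `link_lazily` is hypothetical; this illustrates the pattern and is not a transcription of the Guile code.

```python
import os


def link_lazily(source, link):
    """Create hard link LINK to SOURCE, creating LINK's directory on demand.

    The common case (directory already present) costs a single link(2)
    call; the directory is created only after an ENOENT failure.
    """
    try:
        os.link(source, link)
    except FileNotFoundError:  # ENOENT: the links directory is missing
        os.makedirs(os.path.dirname(link), exist_ok=True)
        os.link(source, link)  # retry once; a second failure propagates
```

If SOURCE itself is missing, the retry fails with the same error, so genuine ENOENT conditions are still reported to the caller.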
Ludovic Courtès 6a060ff27f
store-copy: 'populate-store' can optionally deduplicate files.
Until now deduplication was performed as an additional pass after
copying files, which involved re-traversing all the files that had just
been copied.

* guix/store/deduplication.scm (copy-file/deduplicate): New procedure.
* tests/store-deduplication.scm ("copy-file/deduplicate"): New test.
* guix/build/store-copy.scm (populate-store): Add #:deduplicate?
parameter and honor it.
* tests/gexp.scm ("gexp->derivation, store copy"): Pass #:deduplicate? #f
to 'populate-store'.
* gnu/build/image.scm (initialize-root-partition): Pass #:deduplicate?
to 'populate-store'.  Pass #:deduplicate? #f to 'register-closure'.
* gnu/build/vm.scm (root-partition-initializer): Likewise.
* gnu/build/install.scm (populate-single-profile-directory): Pass
 #:deduplicate? #f to 'populate-store'.
* gnu/build/linux-initrd.scm (build-initrd): Likewise.
* guix/scripts/pack.scm (self-contained-tarball)[import-module?]: New
procedure.
[build]: Pass it as an argument to 'source-module-closure'.
* guix/scripts/pack.scm (squashfs-image)[build]: Wrap in
'with-extensions'.
* gnu/system/linux-initrd.scm (expression->initrd)[import-module?]: New
procedure.
[builder]: Pass it to 'source-module-closure'.
* gnu/system/install.scm (cow-store-service-type)[import-module?]: New
procedure.  Pass it to 'source-module-closure'.
2020-12-15 17:32:10 +01:00
Ludovic Courtès 2718c29c3f
nar: Deduplicate files right as they are restored.
This avoids having to traverse and re-read the files that we have just
restored, thereby reducing I/O.

* guix/serialization.scm (dump-file): New procedure.
(restore-file): Add #:dump-file parameter and honor it.
* guix/store/deduplication.scm (tee, dump-file/deduplicate): New
procedures.
* guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'.
(finalize-store-file): Pass #:deduplicate? #f to 'register-items'.
* tests/nar.scm <top level>: Call 'setenv' to set "NIX_STORE".
2020-12-15 17:32:09 +01:00
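The single-pass approach in the commit above, hashing data while it is being written so that restored files need not be read again, can be sketched as follows. The name `dump_and_hash` and its signature are assumptions for illustration; they are not the interface of 'tee' or 'dump-file/deduplicate'.

```python
import hashlib


def dump_and_hash(source, destination, size, chunk_size=65536):
    """Copy SIZE bytes from SOURCE to DESTINATION in one pass.

    Return the SHA-256 hex digest of the copied bytes, computed while
    copying, so the written file does not have to be re-read afterwards.
    """
    sha = hashlib.sha256()
    remaining = size
    while remaining > 0:
        chunk = source.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("input ended before SIZE bytes were read")
        sha.update(chunk)         # feed the hash ...
        destination.write(chunk)  # ... and the output from the same buffer
        remaining -= len(chunk)
    return sha.hexdigest()
```

The digest returned here could then serve as the key for deciding whether the freshly written file should be replaced by a link to an existing copy.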
Caleb Ristvedt 14c422c12c
deduplication: pass store directory to replace-with-link.
This causes with-writable-file to take into consideration the actual store
being used, as passed to 'deduplicate', rather than
whatever (%store-directory) may return.

* guix/store/deduplication.scm (replace-with-link): new keyword argument
  'store'.  Pass to with-writable-file.
  (with-writable-file, call-with-writable-file): new store argument.
  (deduplicate): pass store to replace-with-link.

Signed-off-by: Ludovic Courtès <ludo@gnu.org>
2020-09-14 10:51:26 +02:00
Mathieu Othacehe 8b221b64a5
store: deduplication: Handle fs without d_type support.
scandir* uses readdir, which means that the file type property can be 'unknown
if the underlying file-system does not support d_type. Make sure to fall back
to lstat in that case.

Fixes: https://issues.guix.gnu.org/issue/42579.

* guix/store/deduplication.scm (deduplicate): Handle the case where properties
is 'unknown because the underlying file-system does not support d_type.
2020-07-28 14:10:28 +02:00
Ludovic Courtès 3b7145d821
deduplication: Leave the store permissions unchanged.
Suggested by Caleb Ristvedt <caleb.ristvedt@cune.org>.

* guix/store/deduplication.scm (call-with-writable-file): Call THUNK
directly when FILE is (%store-directory).
2020-06-25 12:29:23 +02:00
Ludovic Courtès 6b654a3332
deduplication: Fix default value of #:store in 'deduplicate'.
* guix/store/deduplication.scm (deduplicate): Change #:store default
value to (%store-directory).
2020-06-25 12:29:23 +02:00
Ludovic Courtès d52e16d3b6
deduplication: Use 'dynamic-wind' when changing permissions of the parent.
Suggested by Caleb Ristvedt <caleb.ristvedt@cune.org>.

* guix/store/deduplication.scm (call-with-writable-file): New procedure.
(with-writable-file): New macro.
(replace-with-link): Use it.
2020-06-25 12:29:22 +02:00
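The role 'dynamic-wind' plays in the commit above, restoring the parent directory's permissions even on a non-local exit, corresponds roughly to try/finally in Python. The sketch below uses a hypothetical `writable_parent` context manager and is not the Guile macro itself.

```python
import contextlib
import os
import stat


@contextlib.contextmanager
def writable_parent(path):
    """Temporarily make the parent directory of PATH writable by its owner.

    The original mode is restored when the block exits, whether it
    returns normally or raises.
    """
    parent = os.path.dirname(os.path.abspath(path))
    mode = stat.S_IMODE(os.stat(parent).st_mode)
    os.chmod(parent, mode | stat.S_IWUSR)
    try:
        yield parent
    finally:
        os.chmod(parent, mode)  # always undo the permission change
```

A caller would wrap the rename or link operation in `with writable_parent(file):` so that a failure in the body cannot leave the directory writable.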
Ludovic Courtès fe5de925aa
deduplicate: Avoid traversing directories twice.
Until now, we'd call (nar-sha256 file) unconditionally.  Thus, if FILE
was a directory, we would traverse it for no reason, and then call
'deduplicate' on FILE, which would again traverse it.

This change also removes redundant (mkdir-p store) calls from the loop,
and avoids 'lstat' calls by using 'scandir*'.

* guix/store/deduplication.scm (deduplicate): Add named loop.  Move
'mkdir-p' outside the loop.  Use 'scandir*' instead of 'scandir'.  Do
not call 'nar-sha256' when FILE has type 'directory.
2020-06-22 15:42:55 +02:00
Ludovic Courtès 4cb63a564d
deduplication: Use nix-base32 encoding for link names.
Fixes <https://bugs.gnu.org/39725>.

* guix/store/deduplication.scm (deduplicate): Use
'bytevector->nix-base32-string' instead of 'bytevector->base16-string'.
2020-02-22 00:46:06 +01:00
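For readers unfamiliar with the encoding named in the commit above, here is a small Python sketch of nix-base32 as it is commonly described for Nix: 5-bit groups are taken starting from the end of the hash, with an alphabet that omits the letters e, o, t, and u. The function name `nix_base32` is hypothetical, and the sketch should be read as an illustration, not as the code behind 'bytevector->nix-base32-string'.

```python
NIX_BASE32_CHARS = "0123456789abcdfghijklmnpqrsvwxyz"  # no e, o, t, u


def nix_base32(data):
    """Encode the bytes DATA using the nix-base32 scheme."""
    length = (len(data) * 8 - 1) // 5 + 1  # number of output characters
    out = []
    for n in range(length - 1, -1, -1):
        bit = n * 5
        i, j = divmod(bit, 8)  # byte index and bit offset within that byte
        c = data[i] >> j
        if i < len(data) - 1:
            c |= data[i + 1] << (8 - j)  # pull in bits from the next byte
        out.append(NIX_BASE32_CHARS[c & 0x1F])
    return "".join(out)
```

Under this scheme a 32-byte SHA-256 hash is rendered as 52 characters, compared with 64 characters in base16.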
Tobias Geerinckx-Rice 52beae7b8a
gnu, guix: Yearly ritual purging of the filesystems.
* gnu/packages/android.scm (android-ext4-utils)[synopsis]: Fix ‘file
system’ spelling.
* gnu/packages/disk.scm (rmlint)[synopsis, description]: Likewise.
* gnu/packages/golang.scm (go-github-com-kr-fs)[synopsis, description]:
Likewise & edit for grammar.
* gnu/packages/ipfs.scm (gx, go-ipfs)[description]: Likewise.
* gnu/packages/java.scm (java-commons-vfs)[synopsis]: Likewise.
* gnu/packages/linux.scm (fuseiso)[description]: Likewise.
(genext2fs)[synopsis, description]: Likewise.
* gnu/packages/package-management.scm (libostree)[description]: Likewise.
* gnu/packages/python-xyz.scm (python-requests-file)[description]:
Likewise & mark up.
* gnu/packages/rails.scm (ruby-with-advisory-lock)[description]:
Likewise.
* gnu/packages/ruby.scm (ruby-rerun)[description]: Likewise.
* guix/build/go-build-system.scm (setup-go-environment)<docstring>:
Likewise.
* guix/store/deduplication.scm (get-temp-link)<docstring>: Likewise.
2019-04-25 04:42:16 +02:00
Ludovic Courtès ba5e89be8c
deduplication: Ignore EMLINK.
Until now 'guix offload' would fail (transient failure) upon EMLINK.

* guix/store/deduplication.scm (replace-with-link)
(deduplicate): Ignore EMLINK.
2019-01-23 23:35:12 +01:00
Ludovic Courtès adb158b739
deduplication: Gracefully handle ENOSPC raised by 'link' calls.
Reported by Andreas Enge <andreas@enge.fr>
in <https://bugs.gnu.org/33676>.

* guix/store/deduplication.scm (replace-with-link): Catch ENOSPC around
'get-temp-link'.  Do nothing when 'get-temp-link' throws ENOSPC.  Move
code to restore PARENT's permissions outside of 'catch'.
* tests/store-deduplication.scm ("deduplicate, ENOSPC"): New test.
2018-12-14 12:07:24 +01:00
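The graceful handling of ENOSPC described in the commit above can be sketched in Python: create a temporary link, and if that step fails for lack of space, leave the duplicate file alone instead of failing. The function name, the `.tmp-link` suffix, and the `ignored` parameter are assumptions for the example; EMLINK is included in the default because a later commit in this log ignores it as well.

```python
import errno
import os


def replace_with_link(target, to_replace,
                      ignored=(errno.ENOSPC, errno.EMLINK)):
    """Atomically replace TO_REPLACE with a hard link to TARGET.

    Return True on success and False when the temporary link could not
    be created because of an error in IGNORED; deduplication is only an
    optimization, so skipping it is harmless.
    """
    tmp = to_replace + ".tmp-link"  # hypothetical temporary name
    try:
        os.link(target, tmp)
    except OSError as e:
        if e.errno in ignored:
            return False            # leave TO_REPLACE untouched
        raise
    os.replace(tmp, to_replace)     # atomic rename over the duplicate
    return True
```

Any error not listed in `ignored` still propagates, so unexpected failures are not silently swallowed.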
Ludovic Courtès f5a2724ae4
deduplication: Restore directory mtime and permissions after deduplication.
Fixes <https://bugs.gnu.org/33361>.

* guix/store/deduplication.scm (replace-with-link): Call 'set-file-time'
and 'chmod' after 'rename-file'.
* tests/nar.scm ("restore-file-set with directories (signed, valid)"):
New test.
2018-11-13 14:59:46 +01:00
Ludovic Courtès ca71942445
Switch to Guile-Gcrypt.
This removes (guix hash) and (guix pk-crypto), which now live as part of
Guile-Gcrypt (version 0.1.0).

* guix/gcrypt.scm, guix/hash.scm, guix/pk-crypto.scm,
tests/hash.scm, tests/pk-crypto.scm: Remove.
* configure.ac: Test for Guile-Gcrypt.  Remove LIBGCRYPT and
LIBGCRYPT_LIBDIR assignments.
* m4/guix.m4 (GUIX_ASSERT_LIBGCRYPT_USABLE): Remove.
* README: Add Guile-Gcrypt to the dependencies; move libgcrypt as
"required unless --disable-daemon".
* doc/guix.texi (Requirements): Likewise.
* gnu/packages/bash.scm, guix/derivations.scm, guix/docker.scm,
guix/git.scm, guix/http-client.scm, guix/import/cpan.scm,
guix/import/cran.scm, guix/import/crate.scm, guix/import/elpa.scm,
guix/import/gnu.scm, guix/import/hackage.scm,
guix/import/texlive.scm, guix/import/utils.scm, guix/nar.scm,
guix/pki.scm, guix/scripts/archive.scm,
guix/scripts/authenticate.scm, guix/scripts/download.scm,
guix/scripts/hash.scm, guix/scripts/pack.scm,
guix/scripts/publish.scm, guix/scripts/refresh.scm,
guix/scripts/substitute.scm, guix/store.scm,
guix/store/deduplication.scm, guix/tests.scm, tests/base32.scm,
tests/builders.scm, tests/challenge.scm, tests/cpan.scm,
tests/crate.scm, tests/derivations.scm, tests/gem.scm,
tests/nar.scm, tests/opam.scm, tests/pki.scm,
tests/publish.scm, tests/pypi.scm, tests/store-deduplication.scm,
tests/store.scm, tests/substitute.scm: Adjust imports.
* gnu/system/vm.scm: Likewise.
(guile-sqlite3&co): Rename to...
(gcrypt-sqlite3&co): ... this.  Add GUILE-GCRYPT.
(expression->derivation-in-linux-vm)[config]: Remove.
(iso9660-image)[config]: Remove.
(qemu-image)[config]: Remove.
(system-docker-image)[config]: Remove.
* guix/scripts/pack.scm: Adjust imports.
(guile-sqlite3&co): Rename to...
(gcrypt-sqlite3&co): ... this.  Add GUILE-GCRYPT.
(self-contained-tarball)[build]: Call 'make-config.scm' without
 #:libgcrypt argument.
(squashfs-image)[libgcrypt]: Remove.
[build]: Call 'make-config.scm' without #:libgcrypt.
(docker-image)[config, json]: Remove.
[build]: Add GUILE-GCRYPT to the extensions.  Remove (guix config) from
the imported modules.
* guix/self.scm (specification->package): Remove "libgcrypt", add
"guile-gcrypt".
(compiled-guix): Remove #:libgcrypt.
[guile-gcrypt]: New variable.
[dependencies]: Add it.
[*core-modules*]: Remove #:libgcrypt from 'make-config.scm' call.
Add #:extensions.
[*config*]: Remove #:libgcrypt from 'make-config.scm' call.
(%dependency-variables): Remove %libgcrypt.
(make-config.scm): Remove #:libgcrypt.
* build-aux/build-self.scm (guile-gcrypt): New variable.
(make-config.scm): Remove #:libgcrypt.
(build-program)[fake-gcrypt-hash]: New variable.
Add (gcrypt hash) to the imported modules.  Adjust load path
assignments.
* gnu/packages/package-management.scm (guix)[propagated-inputs]: Add
GUILE-GCRYPT.
[arguments]: In 'wrap-program' phase, add GUILE-GCRYPT to the search
path.
2018-09-04 17:25:11 +02:00
Ludovic Courtès 4f89a8eec6
deduplication: Work around Guile bug in 'seek'.
Fixes <https://bugs.gnu.org/32161>.
Reported by Ricardo Wurmus <rekado@elephly.net>.

This mostly reverts 83099892e0.

* guix/store/deduplication.scm (counting-wrapper-port): New procedure.
(nar-sha256): Use it.
2018-07-20 15:01:33 +02:00
Ludovic Courtès 83099892e0
deduplication: Remove 'counting-wrapper-port'.
* guix/store/deduplication.scm (counting-wrapper-port): Remove.
(nar-sha256): Call 'port-position' directly on PORT.
2018-07-19 17:12:48 +02:00
Ludovic Courtès a5b34d9d24
deduplication: Remove 'false-if-system-error', now unused.
* guix/store/deduplication.scm (false-if-system-error): Remove.
2018-07-03 17:50:04 +02:00
Ludovic Courtès 3dbf331942
deduplication: Place link files under /gnu/store/.links.
Previously they'd always be placed next to TO-REPLACE, which would lead
to EPERM in some cases.

* guix/store/deduplication.scm (replace-with-link): Add #:swap-directory
parameter and honor it.  Add call to 'make-file-writable'.  Catch
'system-error' around 'rename-file'.
(deduplicate): Pass #:swap-directory and remove uses of
'false-if-system-error'.
* tests/store-deduplication.scm ("deduplicate"): Add 'chmod' call.
2018-07-03 00:39:11 +02:00
Ludovic Courtès af2f8ae5f1
deduplication: Fix incorrect use of 'throw'.
* guix/store/deduplication.scm (get-temp-link): In handler, fix call to
'throw'.
2018-07-03 00:39:11 +02:00
Ludovic Courtès 0d0438ed8c
deduplicate: Fix a couple of thinkos.
* guix/store/deduplication.scm (get-temp-link): Turn 'args' in the 'catch'
handler into a rest argument.
(deduplicate): Use 'lstat' instead of 'file-is-directory?' to properly
handle symlinks.  When iterating over the result of 'scandir', exclude
the ".links" sub-directory.
* tests/store-deduplication.scm ("deduplicate"): Create sub-directories
and call 'deduplicate' directly on STORE.
2018-06-14 11:16:59 +02:00
Caleb Ristvedt bf5bf5778c
Add (guix store deduplication).
* guix/store/database.scm (register-path): Add #:deduplicate? and call
'deduplicate' when it's true.
(counting-wrapper-port, nar-sha256): Move to...
* guix/store/deduplication.scm: ... here.  New file.
* tests/store-deduplication.scm: New file.
* Makefile.am (STORE_MODULES): Add deduplication.scm.
(SCM_TESTS) [HAVE_GUILE_SQLITE3]: Add store-deduplication.scm.

Co-authored-by: Ludovic Courtès <ludo@gnu.org>
2018-06-01 15:35:54 +02:00