guix/guix
Ludovic Courtès 472a0e82a5
daemon: Do not deduplicate files smaller than 8 KiB.
Files smaller than 8 KiB typically represent ~70% of the entries in
/gnu/store/.links but contribute only ~4% of the space savings
afforded by deduplication.

Not considering these files for deduplication speeds up file insertion
in the store and, more importantly, leaves 'removeUnusedLinks' with
fewer entries to traverse, thereby speeding it up proportionally.

Partly fixes <https://issues.guix.gnu.org/24937>.

* config-daemon.ac: Remove symlink hard link check and CAN_LINK_SYMLINK
definition.
* guix/store/deduplication.scm (%deduplication-minimum-size): New
variable.
(deduplicate)[loop]: Do not recurse when FILE's size is below
%DEDUPLICATION-MINIMUM-SIZE.
(dump-port): New procedure.
(dump-file/deduplicate)[hash]: Turn into...
[dump-and-compute-hash]: ... this thunk.
Call 'deduplicate' only when SIZE is greater than
%DEDUPLICATION-MINIMUM-SIZE; otherwise call 'dump-port'.
* nix/libstore/gc.cc (LocalStore::removeUnusedLinks): Drop files where
st.st_size < deduplicationMinSize.
* nix/libstore/local-store.hh (deduplicationMinSize): New declaration.
* nix/libstore/optimise-store.cc (deduplicationMinSize): New variable.
(LocalStore::optimisePath_): Return when PATH is a symlink or smaller
than 'deduplicationMinSize'.
* tests/derivations.scm ("identical files are deduplicated"): Produce
files bigger than %DEDUPLICATION-MINIMUM-SIZE.
* tests/nar.scm ("restore-file-set with directories (signed, valid)"):
Likewise.
* tests/store-deduplication.scm ("deduplicate, below %deduplication-minimum-size"):
New test.
("deduplicate", "deduplicate, ENOSPC"): Produce files bigger than
%DEDUPLICATION-MINIMUM-SIZE.
* tests/store.scm ("substitute, deduplication"): Likewise.
2021-11-16 14:34:28 +01:00
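
The size cutoff described in the commit message above can be illustrated with a minimal Python sketch. It is not the daemon's code, which lives in guix/store/deduplication.scm and nix/libstore/optimise-store.cc. The names `deduplicate` and `demo`, the use of a SHA-256 hex digest as the link name, and the strict `<` comparison at the boundary are assumptions made for illustration only.

```python
import hashlib
import os
import stat
import tempfile

# Threshold named in the commit message: 8 KiB.
DEDUPLICATION_MINIMUM_SIZE = 8192


def deduplicate(path, links_dir):
    """Hard-link PATH to a content-addressed entry in LINKS_DIR.

    Return False when PATH is skipped (symlink, non-regular file, or
    smaller than the threshold), True otherwise.  Hypothetical helper.
    """
    st = os.lstat(path)
    if not stat.S_ISREG(st.st_mode):
        return False  # symlinks and special files are never deduplicated
    if st.st_size < DEDUPLICATION_MINIMUM_SIZE:
        return False  # small file: no hashing, no entry in LINKS_DIR

    # Assumption: name the link after the SHA-256 of the file contents.
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    link = os.path.join(links_dir, digest)

    if not os.path.exists(link):
        os.link(path, link)  # first file with this content: register it
        return True
    if os.path.samestat(os.lstat(link), st):
        return True  # already the same inode, nothing to do

    # Replace PATH with a hard link to the registered entry.
    tmp = path + ".tmp"
    os.link(link, tmp)
    os.replace(tmp, path)
    return True


def demo():
    """Deduplicate two large and two small identical files in a temp dir."""
    with tempfile.TemporaryDirectory() as root:
        links = os.path.join(root, ".links")
        os.mkdir(links)
        inodes = {}
        for name, size in (("big-a", 10000), ("big-b", 10000),
                           ("small-a", 100), ("small-b", 100)):
            p = os.path.join(root, name)
            with open(p, "wb") as f:
                f.write(b"x" * size)
            deduplicate(p, links)
            inodes[name] = os.lstat(p).st_ino
        return (inodes["big-a"] == inodes["big-b"],
                inodes["small-a"] == inodes["small-b"],
                len(os.listdir(links)))
```

In this sketch, `demo()` leaves the two large files sharing one inode while the two small files stay separate, and the link directory gains a single entry. That mirrors the trade-off stated in the commit message: fewer entries to traverse, in exchange for giving up the small savings from small files.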
build build-system/julia: Enable Julia Pkg to find installed packages. 2021-11-16 14:39:51 +02:00
build-system build-system/julia: Enable Julia Pkg to find installed packages. 2021-11-16 14:39:51 +02:00
import import: utils: Add more licenses and extend their detection. 2021-11-12 23:34:18 +01:00
scripts environment: Fix ‘--check’ with exported PS1 variable. 2021-11-14 23:18:08 +01:00
store daemon: Do not deduplicate files smaller than 8 KiB. 2021-11-16 14:34:28 +01:00
tests tests: git: Make 'tag' directive non-interactive. 2021-09-18 19:37:45 +02:00
android-repo-download.scm android-repo-download: Add guile-json extension. 2021-05-02 18:45:27 +02:00
avahi.scm
base16.scm base16: Reduce GC pressure in bytevector->base16-string. 2021-09-10 17:30:54 +02:00
base32.scm base32: Work around (ash x N) miscompilation at '-O1' and below. 2021-09-21 15:15:52 +02:00
base64.scm
build-system.scm
bzr-download.scm
cache.scm cache: Gracefully handle non-existent cache. 2021-10-25 19:02:33 +02:00
channels.scm channels: 'channel-news-entry-commit' correctly resolves annotated tags. 2021-09-18 19:37:45 +02:00
ci.scm ci: Add jobs history support. 2021-08-22 21:36:29 +02:00
colors.scm
combinators.scm
config.scm.in
cpio.scm syscalls: Deduplicate device number conversion. 2021-09-23 18:17:16 +02:00
cve.scm cve: Gracefully handle bogus CVE entries. 2021-04-25 14:35:42 +02:00
cvs-download.scm cvs-download: Fix module exports 2021-05-05 16:56:43 +02:00
d3.v3.js
deprecation.scm
derivations.scm derivations: Make 'coalesce-duplicate-inputs' linear in the number of inputs. 2021-07-27 18:26:08 +02:00
describe.scm describe: 'current-channel-entries' ignores non-channel profile entries. 2021-06-13 23:57:44 +02:00
diagnostics.scm diagnostics, ui: Adjust to 'read-error' and 'syntax-error' in Guile 3.0.6. 2021-05-09 23:45:36 +02:00
discovery.scm discovery: Hide Guile warnings when loading modules. 2021-09-30 23:44:49 +02:00
docker.scm guix: docker: Ensure repository name length limits are met. 2021-07-05 16:34:07 -04:00
download.scm download: "GUIX_DOWNLOAD_FALLBACK_TEST=none" disables fallback mechanisms. 2021-10-15 23:16:28 +02:00
elf.scm
extracting-download.scm Add (guix extracting-download). 2021-10-07 22:24:23 +02:00
ftp-client.scm
gexp.scm guix: gexp: Define gexp->approximate-sexp. 2021-06-30 13:53:00 +02:00
git-authenticate.scm
git-download.scm git-download: Support submodules in 'git-predicate'. 2021-05-28 11:36:02 +02:00
git.scm git: 'reference-available?' recognizes 'tag-or-commit'. 2021-09-18 23:08:32 +02:00
glob.scm
gnu-maintenance.scm gnu-maintenance: 'generic-html' computes the right source URL. 2021-06-03 13:04:20 +02:00
gnupg.scm
grafts.scm grafts: Cache the derivation/graft mapping for the whole session. 2021-06-08 09:25:50 +02:00
graph.js
graph.scm graph: Add '--max-depth'. 2021-09-21 15:15:52 +02:00
hg-download.scm hg-download: Make (guix swh) output visible. 2021-06-14 18:35:18 +02:00
http-client.scm http-client: Remove exception mishandling in 'http-multiple-get'. 2021-04-25 14:36:45 +02:00
i18n.scm
inferior.scm inferior: 'cached-channel-instance' no longer calls 'show-what-to-build'. 2021-08-09 18:14:37 +02:00
ipfs.scm Add (guix ipfs). 2021-04-12 18:42:22 +02:00
licenses.scm import: utils: Add more licenses and extend their detection. 2021-11-12 23:34:18 +01:00
lint.scm lint: Add description check for common typos. 2021-10-24 14:26:12 -07:00
man-db.scm
memoization.scm
modules.scm
monad-repl.scm
monads.scm
nar.scm
narinfo.scm substitute: Choose compression method based on past CPU usage. 2021-03-21 23:41:01 +01:00
openpgp.scm openpgp: Remove now unnecessary procedure. 2021-03-02 23:12:37 +01:00
packages.scm guix: packages: Clarify that list is a list of <license> records. 2021-11-13 09:52:19 +01:00
pki.scm
profiles.scm profiles: Build the man database only if 'man-db' is in the profile. 2021-11-06 23:01:21 +01:00
profiling.scm
progress.scm progress: Add a download-size argument to progress-report-port. 2021-06-01 09:10:32 +02:00
quirks.scm
records.scm records: Support field sanitizers. 2021-08-12 12:34:13 +02:00
remote.scm store: 'map/accumulate-builds' handler checks the store received. 2021-10-28 21:30:27 +02:00
repl.scm
scripts.scm guix: scripts: Fix corner cases of hint for option typo. 2021-02-24 23:50:13 +01:00
search-paths.scm
self.scm maint: Factorize po xref translation. 2021-10-17 18:26:44 +02:00
serialization.scm serialization: Micro-optimize string literal output in 'write-file-tree'. 2021-03-01 17:45:51 +01:00
sets.scm
ssh.scm ssh: Fix type that broke offloading. 2021-05-11 12:49:53 +02:00
status.scm status: Add missing newline after substitution completion message. 2021-07-04 23:00:36 +02:00
store.scm store: 'map/accumulate-builds' handler checks the store received. 2021-10-28 21:30:27 +02:00
substitutes.scm substitutes: Properly construct URLs. 2021-07-16 19:36:11 +02:00
svn-download.scm
swh.scm swh: Allows token from Software Heritage authentication service. 2021-10-15 23:16:28 +02:00
tests.scm tests: Factorize 'file=?'. 2021-11-16 14:34:28 +01:00
transformations.scm transformations: Git tags and 'git describe' style IDs are used as version. 2021-09-08 18:03:50 +02:00
ui.scm ui: 'load*' correctly reports 'read-error' in all cases. 2021-11-07 23:10:41 +01:00
upstream.scm Revert the #51061 patch series for now. 2021-10-08 23:31:34 +02:00
utils.scm utils: Define a target-x86-32? and target-x86-64? predicate. 2021-11-07 01:38:23 -04:00
workers.scm