#+TITLE: Continuous Integration * Table of Contents :TOC_5_gh:noexport: - [[#description][Description]] - [[#overview][Overview]] - [[#tldr][TLDR]] - [[#current-stack][Current stack]] - [[#circleci][CircleCI]] - [[#github-actions][GitHub Actions]] - [[#docker][Docker]] - [[#clojure][Clojure]] - [[#ci-files-and-directories][CI files and directories]] - [[#workflows-groups-of-ci-jobs][Workflows (groups of CI jobs)]] - [[#pull-request-jobs][Pull request jobs]] - [[#emacs-lisp-tests][Emacs Lisp Tests]] - [[#documentation-validation][Documentation validation]] - [[#pr-validation][PR validation]] - [[#branch-updates-runs-on-merge][Branch updates (runs on merge)]] - [[#emacs-lisp-tests-1][Emacs Lisp Tests]] - [[#project-files-updates][Project files updates]] - [[#how-updates-end-up-in-spacemacs-repositories][How updates end up in Spacemacs repositories]] - [[#built-in-updates][Built-in updates]] - [[#documentation-updates][Documentation updates]] - [[#web-site-updates][Web site updates]] - [[#scheduled-jobs][Scheduled jobs]] - [[#potential-improvements-pr-ideas][Potential improvements (PR ideas)]] - [[#side-notes][Side notes]] - [[#we-used-to-have-travisci-3-ci-providers-at-the-same-time][We used to have TravisCI (3 CI providers at the same time)]] - [[#circleci-setup-config-and-cron-jobs][CircleCI setup config and cron jobs]] * Description This file explains how our continuous integration operates and what problems it solves. * Overview ** TLDR Spacemacs is big - the active maintainers team is small. The more we can automate - the better. We useĀ  [[https://circleci.com/][CircleCI]], [[https://github.com/features/actions][GitHub Actions]] and [[https://www.docker.com/][Docker]] to test PRs, update/fixi documentation and exporting it to [[https://develop.spacemacs.org/]]. Most of the code is just a bunch of bash/ELisp scripts and yml files, but some of the documentation tools are written in Clojure. Check out [[https://github.com/syl20bnr/spacemacs/tree/develop/.circleci][CircleCI]] and [[https://github.com/syl20bnr/spacemacs/tree/develop/.github/workflows][GitHub Actions]] directories, the code is pretty much self-explanatory but will be examined in depth below. ** Current stack Wait, what? Why Clojure, why 2 CI providers? I knew you would ask this question, dear reader, so here's my rationale: *** CircleCI It has a cool set of features and a generous quota for open source projects. But most importantly, unlike GitHub Actions, there is a straight forward way to cache build dependencies between runs and using it in tandem with GH Actions provides us with even more concurrency. It means that PR authors have to wait less time for feed back. This is crucial since we have a lot of test and platforms to cover. Also, CircleCI can run jobs on user provided Docker images that it caches, so we do not hit the DockerHub pull quota. On the downside, the CircleCI configuration file can be pretty involved, has unexpected limitations that can leave you puzzled for quite a while. *** GitHub Actions Oh man, that's good. It is clear that GH team had the benefit of hindsight when developed their CI platform. And it runs rely fast (at least for now). Maybe one day we'll fully switch to Actions. The biggest concern here is the vendor lock-in since all the good stuff is highly specific. While CircleCI allows you to run a job locally for free. And run whole CI on your own hardware with "strings attached". *** Docker Having a stable pre-build environment for jobs reduces headaches and improves set up time. Duh! Also DockerHub used to be a cool place to store and build huge images for free, but now it has all sorts of quotas + RAM is pretty limited for memory hungry JVM builds (((foreshadowing))). *** Clojure Besides the obvious fact that Rich Hickey's talks are the best. Before we started with automation, Spacemacs already had a huge set of documentation files that couldn't be fixed by a bunch of regular expressions wrapped into bash/ELisp code. The options were to either fix all README.org files by hand and keep fixing them forever, since contributors often forget to format stuff properly and nagging them constantly both wastes PR reviewer time and makes the contributor less likely to stick. Or go all-in and create a system that can extract data out of documentation files and rebuild them from scratch. Clojure designed to push data around and it has specs that can be used to validate files, generate test data and constructors for org-mode elements. The code is compiled to [[https://www.graalvm.org/reference-manual/native-image/][native-image]] so pretty much all of the JVM drawbacks are mitigated, for the particular use case anyway. * CI files and directories - [[https://github.com/syl20bnr/spacemacs/tree/develop/.ci][.ci]] is a shared CI directory that holds two config files: 1. [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/built_in_manifest][built_in_manifest]] list of upstream URL and target locations for built-in files. 2. [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/spacedoc-cfg.edn][spacedoc-cfg.edn]] configuration file for Spacemacs documentation tools. More details in [[#documentation-updates][Documentation updates]]. - [[https://github.com/syl20bnr/spacemacs/tree/develop/.github/workflows][.github/workflows]] place where GitHub workflow files live: - =workflows/scripts/test= runner script for EmacsLisp tests. - =workflows/scripts/dot_lock.el= package lock file that adds local ELPA mirror. - =elisp_test.yml= runs EmacsLisp tests on PR and branch updates. - =rebase.yml= we don't really use it :) It rebases PR onto current HEAD, it doesn't always work and requires personal token to run automatically. - =stale.yml= manages stale issues and PR. - [[https://github.com/syl20bnr/spacemacs/tree/develop/.circleci][.circleci]] everything specific for CircleCI. Documentation related files stored in the =org= sub folder, =web= is where HTML export stuff hides and =built_in= is all about updating built-in files. The rest is a bunch of shared script files. The specific cases are =shared= file that loads before each script run for every job, =config.yml= - CircleCI bootstrap script that generates the config that CircleCI will run by populating =config_tmpl.yml= file - actual config. We need the bootstrap step to inject =IS_BRANCH_UDATE= environment variable value into generated config file that CircleCI will run because it used to choose which jobs should be executed but environment variables aren't loaded soon enough. * Workflows (groups of CI jobs) ** Pull request jobs *** Emacs Lisp Tests Code tests are handled by GitHub Actions exclusively. The stages are: 1. Emacs installation with [[https://github.com/purcell/setup-emacs][purcell/setup-emacs]] - for UNIX and [[https://github.com/jcs090218/setup-emacs-windows][jcs090218/setup-emacs-windows]] for Windows. The step is configured by the job matrix. With two keys =os= and =emacs_version=. CI runs test for every possible combination. The stage ends up seriously bloated with repetition since the actions sometimes fail (especially for MacOS) so I added 3 sets of retires for the both actions. Currently GitHub [[https://github.community/t/how-to-retry-a-failed-step-in-github-actions-workflow/125880][doesn't provide a better way to implement this]]. 2. Checkout - clones the repo. 3. Installation of a local ELPA mirror with packages used be the tests. The archive is build daily in [[https://github.com/JAremko/testelpa-develop][JAremko/testelpa-develop]] repository and configured by .spacemacs files used in test. The mirror is set as a top priority package repository via [[https://github.com/syl20bnr/spacemacs/blob/develop/.github/workflows/scripts/dot_lock.el][Spacemacs lock file]] this way we actually install the packages (it is important to test that the system works) and if some packages are missing (for example, the mirror can be outdated) then they will be installed from a remote repository. 4. Run the tests! CI run core, base and layer tests sequentially because heaving 20+ CI results in PRs makes people ignore them. And this way they start faster since we cut on setup time. But the tests have to =always= clean after themselves to avoid affecting the next ones. For more details see the [[https://github.com/syl20bnr/spacemacs/blob/develop/.github/workflows/elisp_test.yml][workflow]] file *** Documentation validation This job uses [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/select_pr_changed][.circleci/select_pr_changed]] to find out what files are changed in the tested PR and if any of them are .org files it will check that they can be processed by exporting and validating them. The process will be explored further in the [[#documentation-updates][Documentation updates]] section. *** PR validation There are only two jobs here.[[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/PR_base][.circleci/PR_base]] makes sure that the PR is against develop branch and [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/PR_rebased][.circleci/PR_rebased]] checks if the PR needs a rebase (only when it's updated, so Spacemacs HEAD can actually get, well... Ahead). ** Branch updates (runs on merge) *** Emacs Lisp Tests See [[#emacs-lisp-tests][Emacs Lisp Tests]] it is the same. *** Project files updates All updates are handled by CircleCI. There are two config files: [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config.yml][.circleci/config.yml]] that injects =IS_BRANCH_UDATE= environment variable into the second file [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config_tmpl.yml][.circleci/config_tmpl.yml]] - actual config that CI will use. It has to be done this way because environment variables aren't accessible outside workflows, but CI needs =IS_BRANCH_UDATE= to choose what workflows to run. [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config_tmpl.yml][.circleci/config_tmpl.yml]] begins with declarations of =parameters= (they are used to configure jobs) and =spacetools= executor. Every job runs inside of a freshly spawned =jare/spacemacs-circleci:latest= container that has Emacs and documentation tools, hub CLI and some other stuff. Here's its [[https://github.com/JAremko/spacemacs-circleci/blob/master/Dockerfile][docker file]] and its base image's [[https://github.com/JAremko/spacetools/blob/master/Dockerfile.noemacs][docker file]]. The middle section of the config defines jobs and their names. At the end of the file we have workflow definitions that aggregate jobs by names. Here you can see how =is_branch_update= parameter is used to select which workflows should be ran. Its value is set by inlined =IS_BRANCH_UDATE= environment variable that comes from environment variables page under CircleCI project settings. **** How updates end up in Spacemacs repositories Merging updates is semi-automatic. Bot (specified by =UPD_BOT_LOGIN= job environment variable) uses GitHub token (stored in CircleCI project settings) to push updated version of Spacemacs develop branch into its fork (=UPD_BOT_REPO=) then it opens pull request to =PRJ_REPO= owned by =PRJ_OWNER= (the fork is based on it). =PUBLISH= variable also used as a name for fork repo branch while =PR_BRANCH= is the branch against which PR will be opened. See [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/push][.circleci/push]] and [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/maybe_pr][.circleci/maybe_pr]] files for inner works. Most of bash variables are configured in the [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/shared][.circleci/shared]] file. The PRs are merged manually. **** Built-in updates The setup is really simple. We have [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/built_in/upd_built_in][.circleci/built_in/upd_built_in]] bash script that reads [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/built_in_manifest][.ci/built_in_manifest]] file line by line and downloads every file into the specified location. **** Documentation updates First files are exported into [[https://github.com/edn-format/edn][edn]]. File extension is .sdn "Spacemacs Documentation Notation" if you will, it's done to avoid collisions with config .edn files that can be in exported directories. The exporting is done by Emacs Lisp program based of [[https://github.com/emacsmirror/org/blob/master/lisp/ox.el][ox.el]]. [[https://github.com/JAremko/sdnize.el][Here's repository]]. The program extracts data and perform basic validations. The resulting .sdn files then process by [[https://github.com/JAremko/spacetools][spacetools]] (I'll work on documentation). The steps are: 1. parse and validate .sdn files 2. Generae LAYERS.sdn file. 3. Generate new set of .org files and replace old ones. The tool is configured by [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/spacedoc-cfg.edn][.ci/spacedoc-cfg.edn]] file. For details on how LAYERS.org generation works see [[https://github.com/syl20bnr/spacemacs/blob/develop/CONTRIBUTING.org#readmeorg-tags]["README.org tags" section of CONTRIBUTING.org]] The rest of configs(and their default values) are listed [[https://github.com/JAremko/spacetools/blob/master/components/spacedoc/src/spacetools/spacedoc/config.clj][here]]. **** Web site updates HTML generation handled by [[https://github.com/syl20bnr/spacemacs/blob/develop/core/core-documentation.el][core/core-documentation.el]] the entry function is =spacemacs/publish-doc= all the interesting parts are in preprocessors. Search for =Add preprocessors here= comment. Overall - pretty basic. When I finish with documenting/refactoring =spacetools= I'll probably use it to generate HTML similarly to how it generates .org files. What makes this job special is that CircleCI caches EmacsLisp dependencies of the HTML exporter script. See =save_cache= and =restore_cache= sections in [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config_tmpl.yml][config file]]. Even with this export is pretty slow since it processes files sequentially one by one. ** Scheduled jobs We have 2 cron(scheduled) jobs: [[https://github.com/syl20bnr/spacemacs/blob/develop/.github/workflows/stale.yml][Managing stale issues]] with [[https://github.com/actions/stale][actions/stale]] and running built-in update job. The last one is managed by CircleCI and currently doesn't run since CircleCI [[https://discuss.circleci.com/t/setup-workflow-and-scheduled-workflow-in-the-same-configuration/39932/6][doesn't support cron jobs with setup configs]]. Instead built-in files are updated every time Spacemacs develop branch is pushed. * Potential improvements (PR ideas) - CircleCI config generation stage can test if a PR changes any .org file and schedule documentation testing job only if it does. - PR validation job can be moved to CircleCI config generation stage. If it isn't valid all CircleCI jobs can be skipped. - Web site repo becomes too heavy and PR diffs are meaningless. Removing update dates that are embedded into each exported HTML files would reduce the patch size drastically. - Figure out how to retry installation of Emacs for EmacsLisp tests in more concise manner. - Emacs Lisp tests step that runs the test isn't DRY. - Emacs Install retries can use some delay between the attempts since it is likely that a failed upstream repo will fail again if you don't give it any time to recover/change state. But it shouldn't add delay to runs without failures since they vastly outnumber failed ones. - See if we actually properly clean all they side effects between running EmacsLisp tests. - CircleCI script files can have better names. Also =.circleci= directory gets a bit crowded. Some of them should be moved into separate directory. It can be called "shared" since most of the scripts are reused across different jobs. * Side notes ** We used to have TravisCI (3 CI providers at the same time) We ran long running jobs there but ended up dropping the CI since TravisCI doesn't allow collaborators to read/set environment variables anymore, [[https://pbs.twimg.com/media/Eoq3OnWW4AIy7ih?format=jpg&name=large][the could be in some kind of trouble]] or [[https://blog.travis-ci.com/oss-announcement][maybe not]]. Anyway, when TravisCI stopped running jobs on their old domain (as a part of the migration from [[https://travis-ci.org/]] to [[https://www.travis-ci.com/]]) I decided to use it as an opportunity to have fewer kinds of configs. Still, it's good environment for building heavy (both in build time and RAM) Docker images. ** CircleCI setup config and cron jobs - Currently configs with setup step [[https://discuss.circleci.com/t/setup-workflow-and-scheduled-workflow-in-the-same-configuration/39932/6][don't run cron jobs]]. - We have setup config because environment variables aren't accessible at the top level of config files. But we need =IS_BRANCH_UDATE= environment variable to figure out if CI runs on PR or branch update. So config generation step bakes it into the config that CircleCI will use.