251 lines
16 KiB
Org Mode
251 lines
16 KiB
Org Mode
#+TITLE: Continuous Integration
|
||
|
||
* Table of Contents :TOC_5_gh:noexport:
|
||
- [[#description][Description]]
|
||
- [[#overview][Overview]]
|
||
- [[#tldr][TLDR]]
|
||
- [[#current-stack][Current stack]]
|
||
- [[#circleci][CircleCI]]
|
||
- [[#github-actions][GitHub Actions]]
|
||
- [[#docker][Docker]]
|
||
- [[#clojure][Clojure]]
|
||
- [[#ci-files-and-directories][CI files and directories]]
|
||
- [[#workflows-groups-of-ci-jobs][Workflows (groups of CI jobs)]]
|
||
- [[#pull-request-jobs][Pull request jobs]]
|
||
- [[#emacs-lisp-tests][Emacs Lisp Tests]]
|
||
- [[#documentation-validation][Documentation validation]]
|
||
- [[#pr-validation][PR validation]]
|
||
- [[#branch-updates-runs-on-merge][Branch updates (runs on merge)]]
|
||
- [[#emacs-lisp-tests-1][Emacs Lisp Tests]]
|
||
- [[#project-files-updates][Project files updates]]
|
||
- [[#how-updates-end-up-in-spacemacs-repositories][How updates end up in Spacemacs repositories]]
|
||
- [[#built-in-updates][Built-in updates]]
|
||
- [[#documentation-updates][Documentation updates]]
|
||
- [[#web-site-updates][Web site updates]]
|
||
- [[#scheduled-jobs][Scheduled jobs]]
|
||
- [[#potential-improvements-pr-ideas][Potential improvements (PR ideas)]]
|
||
- [[#side-notes][Side notes]]
|
||
- [[#we-used-to-have-travisci-3-ci-providers-at-the-same-time][We used to have TravisCI (3 CI providers at the same time)]]
|
||
|
||
* Description
|
||
Overview of our continuous integration, how it operates and what problems
|
||
solves.
|
||
|
||
* Overview
|
||
** TLDR
|
||
Spacemacs is big - the active maintainers team is small. The more we can
|
||
automate - the better. We use [[https://circleci.com/][CircleCI]], [[https://github.com/features/actions][GitHub Actions]] and [[https://www.docker.com/][Docker]] to test PRs,
|
||
update built-in files, fix documentation and export it to [[https://develop.spacemacs.org/][develop.spacemacs.org]].
|
||
Most of the code is just a bunch of bash/EmacsLisp scripts and yml files, but
|
||
some of the documentation tools are written in Clojure.
|
||
If you prefer reading code instead of documentation, checkout out [[https://github.com/syl20bnr/spacemacs/tree/develop/.circleci][CircleCI]] and
|
||
[[https://github.com/syl20bnr/spacemacs/tree/develop/.github/workflows][GitHub Actions]] directories.
|
||
|
||
** Current stack
|
||
Wait, what? Why Clojure, why 2 CI providers?
|
||
I knew you would ask this question, dear reader, so here's my rationale:
|
||
|
||
*** CircleCI
|
||
It has a cool set of features and a generous quota for open source projects.
|
||
But most importantly, unlike GitHub Actions, there is a straight forward way
|
||
to cache build dependencies between runs and using it in tandem with
|
||
GH Actions provides us with even more concurrency. It means that PR authors
|
||
have to wait less time for feedback. This is crucial since we have a lot of
|
||
test and platforms to cover. Also, CircleCI can run jobs on user provided Docker
|
||
images that it caches, so we do not hit the DockerHub pull quota.
|
||
On the downside, the CircleCI configuration file can be pretty involved,
|
||
has unexpected limitations that can leave you puzzled for quite a while.
|
||
|
||
*** GitHub Actions
|
||
Oh man, that's good. It is clear that GH team had the benefit of hindsight
|
||
when developed their CI platform. And it runs rely fast (at least for now).
|
||
Maybe, one day we'll fully switch to Actions. The biggest concern here is
|
||
the vendor lock-in since all of the good stuff is highly specific.
|
||
While CircleCI allows you to run a job locally for free. And run whole CI
|
||
on your own hardware with "strings attached".
|
||
|
||
*** Docker
|
||
Having a stable pre-build environment reduces headaches and improves
|
||
setup time. Duh!
|
||
Also DockerHub used to be a cool place to store and build huge images for
|
||
free, but now it has all sorts of quotas + RAM is pretty limited for memory
|
||
hungry JVM builds (((foreshadowing))). And it looks like they're going to
|
||
stop all automatic builds for free accounts.
|
||
|
||
*** Clojure
|
||
Besides the obvious fact that Rich Hickey's talks are the best.
|
||
Before we started with automation, Spacemacs already had a huge set of
|
||
documentation files that couldn't be fixed by a bunch of regular expressions
|
||
wrapped into bash/ELisp code.
|
||
The options were to either fix all README.org files by hand and keep fixing
|
||
them forever, since contributors often forget to format stuff properly and
|
||
nagging them constantly both wastes PR reviewer time and makes the
|
||
contributor less likely to stick. Or go all-in and create a system that
|
||
can extract data out of documentation files and rebuild them from scratch.
|
||
Clojure designed to push data around and it has specs that can be used
|
||
to validate documents, generate test data and constructors for org-mode
|
||
elements. The code is compiled to [[https://www.graalvm.org/reference-manual/native-image/][native-image]] so pretty much all of
|
||
the JVM drawbacks are mitigated, for the particular use case anyway.
|
||
|
||
* CI files and directories
|
||
- [[https://github.com/syl20bnr/spacemacs/tree/develop/.ci][.ci]] is a shared CI directory that holds two config files:
|
||
1. [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/built_in_manifest][built_in_manifest]] list of upstream URL and target locations for
|
||
built-in files.
|
||
2. [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/spacedoc-cfg.edn][spacedoc-cfg.edn]] configuration file for Spacemacs documentation tools.
|
||
More details in [[#documentation-updates][Documentation updates]].
|
||
- [[https://github.com/syl20bnr/spacemacs/tree/develop/.github/workflows][.github/workflows]] place where GitHub workflow files live:
|
||
- =workflows/scripts/test= runner script for EmacsLisp tests.
|
||
- =workflows/scripts/dot_lock.el= package lock file that adds local ELPA
|
||
mirror.
|
||
- =elisp_test.yml= runs EmacsLisp tests on PR and branch updates.
|
||
- =rebase.yml= we don't really use it :) It rebases PR onto current HEAD,
|
||
it doesn't always work and requires personal token to run automatically.
|
||
- =stale.yml= manages stale issues and PRs.
|
||
- [[https://github.com/syl20bnr/spacemacs/tree/develop/.circleci][.circleci]] everything specific for CircleCI. Documentation related files
|
||
stored in the =org= sub folder, =web= is where HTML export stuff hides,
|
||
=built_in= is all about updating built-in files and =update= contains helpers
|
||
related to making patches and pushing changes. The rest is a bunch of
|
||
shared script files. The specific cases are =shared= file that loads before
|
||
each script run for every job, =config.yml= - CircleCI bootstrap script that
|
||
generates the config that CircleCI runs for actual jobs. It does so by
|
||
rendering =config_tmpl.yml= template file.
|
||
|
||
* Workflows (groups of CI jobs)
|
||
** Pull request jobs
|
||
*** Emacs Lisp Tests
|
||
Code tests are handled by GitHub Actions exclusively.
|
||
The stages are:
|
||
1. Emacs installation with [[https://github.com/purcell/setup-emacs][purcell/setup-emacs]] - for UNIX and
|
||
[[https://github.com/jcs090218/setup-emacs-windows][jcs090218/setup-emacs-windows]] for Windows. The step is configured
|
||
by a job matrix. With two keys =os= and =emacs_version=. CI runs test for
|
||
every possible combination. The stage ends up seriously bloated with
|
||
repetition since the actions sometimes fail (especially for MacOS)
|
||
so I added 3 sets of retires for the both actions. Currently GitHub
|
||
[[https://github.community/t/how-to-retry-a-failed-step-in-github-actions-workflow/125880][doesn't provide a better way to implement this]].
|
||
2. Checkout - clones the repo.
|
||
3. Installation of a local ELPA mirror with packages used be the tests.
|
||
The archive is build daily in [[https://github.com/JAremko/testelpa-develop][JAremko/testelpa-develop]] repository and
|
||
configured by .spacemacs files used in test. The mirror is set as a top
|
||
priority package repository via [[https://github.com/syl20bnr/spacemacs/blob/develop/.github/workflows/scripts/dot_lock.el][Spacemacs lock file]] this way we actually
|
||
install the packages (it is important to test that the system works) and
|
||
if some packages are missing (for example, the mirror can be outdated)
|
||
then they will be installed from a remote repository.
|
||
4. Run the tests! CI run core, base and layer tests sequentially because
|
||
heaving 20+ CI results for a PR makes people ignore them. And this way
|
||
they start faster since we cut on setup time. But the tests have to
|
||
=always= clean after themselves to avoid affecting the next ones.
|
||
|
||
For more details see the [[https://github.com/syl20bnr/spacemacs/blob/develop/.github/workflows/elisp_test.yml][workflow]] file.
|
||
|
||
*** Documentation validation
|
||
This job uses [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/select_pr_changed][.circleci/select_pr_changed]] to find out what files are changed in
|
||
the tested PR and if any of them are .org files it will check that they can be
|
||
processed by exporting and validating them. The process will be explored further
|
||
in the [[#documentation-updates][Documentation updates]] section.
|
||
|
||
*** PR validation
|
||
There are only two jobs here.[[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/PR_base][.circleci/PR_base]] makes sure that the PR
|
||
is against develop branch and [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/PR_rebased][.circleci/PR_rebased]] checks if the PR
|
||
needs a rebase (only when it's updated, so Spacemacs HEAD can actually get,
|
||
well... Ahead).
|
||
|
||
** Branch updates (runs on merge)
|
||
*** Emacs Lisp Tests
|
||
Same as [[#emacs-lisp-tests][Emacs Lisp Tests]] on PRs.
|
||
|
||
*** Project files updates
|
||
All updates are handled by CircleCI. There are two config files:
|
||
[[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config.yml][.circleci/config.yml]] that injects =IS_BRANCH_UDATE= environment variable into
|
||
the second file [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config_tmpl.yml][.circleci/config_tmpl.yml]] - actual config that CI will use.
|
||
It has to be done this way because environment variables aren't accessible
|
||
outside workflows, but CI needs =IS_BRANCH_UDATE= to choose what workflows
|
||
to run.
|
||
[[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config_tmpl.yml][.circleci/config_tmpl.yml]] begins with declarations of =parameters= (they
|
||
are used to configure jobs) and =spacetools= executor. Every job runs inside of
|
||
a freshly spawned =jare/spacemacs-circleci:latest= container that has Emacs and
|
||
documentation tools, hub CLI and some other stuff. Here's its [[https://github.com/JAremko/spacemacs-circleci/blob/master/Dockerfile][docker file]] and
|
||
its base image's [[https://github.com/JAremko/spacetools/blob/master/Dockerfile.noemacs][docker file]].
|
||
The middle section of the config defines jobs and their names. At the end of the
|
||
file we have workflow definitions that aggregate jobs by names. Here you can see
|
||
how =is_branch_update= parameter is used to select which workflows should be
|
||
ran. Its value is set by inlined =IS_BRANCH_UDATE= environment variable that
|
||
comes from environment variables page under CircleCI project settings.
|
||
|
||
**** How updates end up in Spacemacs repositories
|
||
Merging updates is semi-automatic. Bot (specified by =UPD_BOT_LOGIN= job
|
||
environment variable) uses GitHub token (stored in CircleCI project settings) to
|
||
push updated version of Spacemacs develop branch into its fork (=UPD_BOT_REPO=)
|
||
then it opens pull request to =PRJ_REPO= owned by =PRJ_OWNER= (the fork is based
|
||
on it). =PUBLISH= variable also used as a name for the fork repo branch while
|
||
=PR_BRANCH= is the branch against which PR will be opened. See [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/update/push][.circleci/update/push]]
|
||
and [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/update/maybe_pr][.circleci/update/maybe_pr]] files for inner-works. Most of bash variables are
|
||
configured in the [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/shared][.circleci/shared]] file. The PRs are merged manually.
|
||
|
||
**** Built-in updates
|
||
Setup is really simple here. We have [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/built_in/upd_built_in][.circleci/built_in/upd_built_in]] bash
|
||
script that reads [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/built_in_manifest][.ci/built_in_manifest]] file line by line and downloads every
|
||
file into the specified location.
|
||
|
||
**** Documentation updates
|
||
Firstly, files are exported into [[https://github.com/edn-format/edn][edn]] format. The file extension is .sdn
|
||
"Spacemacs Documentation Notation" - if you will, it's done to avoid collisions
|
||
with config .edn files. The exporting is done by Emacs Lisp program based on
|
||
[[https://github.com/emacsmirror/org/blob/master/lisp/ox.el][ox.el]]. [[https://github.com/JAremko/sdnize.el][Here's repository]]. The program extracts data and perform basic
|
||
validations. The resulting .sdn files then process by [[https://github.com/JAremko/spacetools][spacetools]] (I'll work on
|
||
documentation). The steps are:
|
||
1. parse and validate .sdn files
|
||
2. Generae LAYERS.sdn file from them.
|
||
3. Generate new set of .org files and replace the old ones.
|
||
|
||
=spacetools= configured by [[https://github.com/syl20bnr/spacemacs/blob/develop/.ci/spacedoc-cfg.edn][.ci/spacedoc-cfg.edn]] file. For details on how
|
||
LAYERS.org generation works see [[https://github.com/syl20bnr/spacemacs/blob/develop/CONTRIBUTING.org#readmeorg-tags]["README.org tags" section of CONTRIBUTING.org]]
|
||
The rest of configs(and their default values) are listed [[https://github.com/JAremko/spacetools/blob/master/components/spacedoc/src/spacetools/spacedoc/config.clj][here]].
|
||
|
||
**** Web site updates
|
||
HTML generation code lives in [[https://github.com/syl20bnr/spacemacs/blob/develop/core/core-documentation.el][core/core-documentation.el]].
|
||
=spacemacs/publish-doc= is the entry function. All the interesting parts are in
|
||
preprocessors. Search for =Add preprocessors here= comment.
|
||
Overall - pretty basic. When I finish with documenting/refactoring =spacetools=
|
||
I'll probably use it to generate HTML similarly to how it generates .org files.
|
||
What makes this job special is that CircleCI caches EmacsLisp dependencies of
|
||
the HTML exporter script. See =save_cache= and =restore_cache= sections
|
||
in the [[https://github.com/syl20bnr/spacemacs/blob/develop/.circleci/config_tmpl.yml][config file]]. Even with this, export is pretty slow since Emacs processes
|
||
files sequentially.
|
||
|
||
** Scheduled jobs
|
||
We have 2 cron(scheduled) jobs: [[https://github.com/syl20bnr/spacemacs/blob/develop/.github/workflows/stale.yml][Managing stale issues]] with [[https://github.com/actions/stale][actions/stale]] and
|
||
running built-in update job. The last one is ran by CircleCI and currently seems
|
||
to bug out since CircleCI [[https://discuss.circleci.com/t/setup-workflow-and-scheduled-workflow-in-the-same-configuration/39932/6][doesn't support cron jobs with setup configs]].
|
||
As a fall-back mechanism, CI updates built-in files every time Spacemacs
|
||
develop branch is pushed.
|
||
|
||
* Potential improvements (PR ideas)
|
||
- CircleCI config generation stage can test if a PR changes any .org file
|
||
and schedule documentation testing job only if it does.
|
||
- PR validation job can be moved to CircleCI config generation stage. If
|
||
it isn't valid - all CircleCI jobs can be skipped.
|
||
- Web site repo becomes too heavy and PR diffs are meaningless. Removing update
|
||
dates that are embedded into each exported HTML files would reduce the
|
||
patch size drastically.
|
||
- Figure out how to retry installation of Emacs for EmacsLisp tests in more
|
||
concise manner.
|
||
- EmacsLisp step that executes the tests isn't DRY.
|
||
- Emacs Install retries can use some delay between the attempts since it is
|
||
likely that a failed upstream repo will fail again if you don't give it any
|
||
time to recover/change state. But it shouldn't add delay to runs without
|
||
failures since they vastly outnumber failed ones.
|
||
- See if we actually properly clean all they side effects between running
|
||
EmacsLisp tests.
|
||
- CircleCI script files can have better names.
|
||
- Better error reporting in scripts. It is hard to debug CI so knowing what
|
||
exactly went wrong would help a lot.
|
||
|
||
* Side notes
|
||
** We used to have TravisCI (3 CI providers at the same time)
|
||
We ran long running jobs there but ended up dropping the CI since TravisCI
|
||
doesn't allow collaborators to read/set environment variables anymore,
|
||
[[https://pbs.twimg.com/media/Eoq3OnWW4AIy7ih?format=jpg&name=large][they could be in some kind of trouble]] or [[https://blog.travis-ci.com/oss-announcement][maybe not]]. Anyway, when TravisCI
|
||
stopped running jobs on their old domain (as a part of the migration from
|
||
[[https://travis-ci.org/]] to [[https://www.travis-ci.com/]]) I decided to use it
|
||
as an opportunity to have fewer kinds of configs. Still, it's good environment
|
||
for building heavy (both in build time and RAM) Docker images.
|