@tlaurion
Collaborator

@tlaurion tlaurion commented Feb 24, 2022

Heads build system:

  • Makefile logic downloads module packages under ./packages, checks their integrity, then extracts them and patches the extraction directory ONLY if no corresponding .*_verify file is found under the ./packages directory. Packages are extracted under build/modulename-ver/, where patches are applied prior to building them.
  • build/modulename-ver/.configured is written when a package is configured.
  • build/modulename-ver/.build is written when a package is built.
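
As a rough illustration of the marker check described above, here is a minimal Python sketch. It is not the Makefile's actual logic: the function names are hypothetical, and the marker file name only follows the packages/.module-version_verify pattern mentioned in this PR.

```python
from pathlib import Path
import tempfile


def verify_marker(packages_dir: Path, module: str, version: str) -> Path:
    # Assumption: marker name follows the packages/.module-version_verify pattern.
    return packages_dir / f".{module}-{version}_verify"


def needs_download_extract_patch(packages_dir: Path, module: str, version: str) -> bool:
    # The package is downloaded, checked, extracted and patched
    # only when no marker exists under packages/.
    return not verify_marker(packages_dir, module, version).exists()


def mark_verified(packages_dir: Path, module: str, version: str) -> None:
    # Record that the package was already downloaded and verified.
    packages_dir.mkdir(parents=True, exist_ok=True)
    verify_marker(packages_dir, module, version).touch()


def demo():
    # First call: no marker yet. Second call: marker present, so work is skipped.
    with tempfile.TemporaryDirectory() as tmp:
        packages = Path(tmp) / "packages"
        first = needs_download_extract_patch(packages, "coreboot", "4.11")
        mark_verified(packages, "coreboot", "4.11")
        second = needs_download_extract_patch(packages, "coreboot", "4.11")
        return [first, second]
```

If the packages directory (and so the marker) is lost between runs while build/ is kept, the first branch is taken again and patches get applied on top of an already patched tree.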

CircleCI caching subsystem notes:

  • A cache name tag is calculated in the prep_env stage at the beginning of each workflow. It consists of a cache name followed by a calculated digest signature, which is the hash of a digest (that is, the final hash of the hashed files).
    • Look for the following under .circleci/config.yml:
      • "Creating .... digest" statements: these are basically files passed through sha256sum to create a digest.
      • restore_cache keys: each is basically a string concatenating name + checksum of digest + CACHE_VERSION. Only the first matching cache, following the declared order, is extracted.
      • save_cache keys: same as above, but only non-existing caches are saved. That is, existing ones are skipped and missing ones are created.
  • A cache is extracted at the beginning of a workflow if an archive matches an archive name, which consists of name tag + digest hash + CACHE_VERSION.
  • A cache is created only at the end of a workflow ("Saving cache...").
    • Caches are specialized: each one is linked to the checksum of some content. The largest available cache is extracted on the next workflow, extracting only the directories/files that were contained in that cache.
  • A workspace cache ("Attaching workspace..."), as opposed to an end-of-workflow cache, is passed along to the steps that depend on prior work in the workflow, as specified in the CircleCI config. The current CircleCI config creates a workspace cache for:
    • make + gawk + musl-cross-make (passed along to the next step)
    • the most massive board config for each coreboot version (passed along to the next step)
    • which finally leads to the workflow caches, each specialized for different content that should not change across builds.
      • That is 3 caches:
        • musl-cross-make and bootstrapping tools (make and gawk, built locally), as long as the musl-cross module has the same checksum
        • a coreboot cache, containing all coreboot build directories, as long as the coreboot module and patches have the same hashes
        • a global cache containing all build artifacts (build dir, install dir, musl-cross dir, etc.)
  • Consequently, a workspace cache contains all the files under a specified path. For heads running under CircleCI, this is ~/project, which is basically the checked-out "heads" GitHub project and everything built under it.
  • When a workflow is successful, save_cache is run, constructing caches for digest hashes that are not yet saved. These correspond to the musl-cross module hash, the coreboot + patches digest hash, and another one for the digest hash of all modules and patches.
  • On the next workflow iteration, the prep_env step includes a "Restore cache" step, which uses the largest cache available and extracts it prior to passing it along as workspace caches. This is why there is not much difference in build time between a clean build and a build reusing the whole cache. On a clean build, the workspace cache layers are smaller: each one is saved and passed along, and the next step downloads it, extracts it and builds on top of it. A build reusing the whole cache repasses bigger workspaces: a bigger initial cache extract, which is then compressed and saved to be passed as a workspace layer, which is in turn downloaded, extracted, built upon, compressed and saved again to be passed as a workspace cache to the next layer depending on it.
  • And finally, the caching system (save_cache, restore_cache) is based on a CircleCI environment variable named CACHE_VERSION, which is appended at the end of the checksum fingerprint of a named cache. It can be changed at any moment to invalidate the caches currently in use if they are broken for some reason.
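
To make the key construction concrete, here is a small Python sketch of "name + checksum of digest + CACHE_VERSION" and of a restore that takes the first hit in declared order. It is only a model: the "-" separator, the function names and the exact-match lookup are assumptions for illustration (CircleCI itself matches restore keys by prefix), and the real digests are produced by sha256sum in .circleci/config.yml.

```python
import hashlib


def digest_of_files(file_contents):
    # One "hash  name" line per file, similar to sha256sum output.
    lines = [
        f"{hashlib.sha256(data).hexdigest()}  {name}\n"
        for name, data in sorted(file_contents.items())
    ]
    return "".join(lines)


def cache_key(name, file_contents, cache_version):
    # The key embeds the hash of the digest, so any change to a hashed
    # file or to CACHE_VERSION yields a different key.
    digest = digest_of_files(file_contents)
    checksum = hashlib.sha256(digest.encode()).hexdigest()
    return f"{name}-{checksum}-{cache_version}"


def restore_first_match(keys_in_order, stored):
    # Simplified exact-match model: the first declared key that exists wins.
    for key in keys_in_order:
        if key in stored:
            return key, stored[key]
    return None
```

With this model, changing a single module file changes the key of any cache whose digest includes that file, while caches whose digest does not include it keep their key and can still be restored. Bumping CACHE_VERSION changes every key at once.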

Ideally, we would not use workspace caches when a full cache is available, but I have no idea how to do that from reading the docs. If we have a full cache, it would be better to just download and extract it, instead of downloading that cache and building bigger workspace caches to be passed along with the final goal of creating a new cache. That new cache is ignored anyway: since the digests of modules and patches are the same, the final hash (the signature of that cache) is the same and no cache is created.

Consequently:

  • The CircleCI cache should include the packages cache (so that packages are downloaded and verified only once).
  • The Heads Makefile only downloads, checks and extracts packages, and then patches the extracted directory content, if packages/.module-version_verify doesn't exist. Caching of packages was missing, causing coreboot tarballs to be redownloaded (not present under packages), then reextracted and repatched (since no _verify file was present under packages/).

@tlaurion tlaurion marked this pull request as draft February 24, 2022 14:39
@tlaurion
Collaborator Author

We need to build twice from CircleCI to see if this fixes the problem.

The situation that triggered this problem is that we:

  • use the same musl-cross-make (module and patches haven't changed, so cache layer 1 can be reused)
  • use the same coreboot versions and patches (modules and patches haven't changed, but the layer 2 cache doesn't cache the packages directory)
  • changed modules to force O2 -> Os.

Consequently, we reused cache layer 2, not 3. And only layer 3 was caching the packages directory and its _verify files.
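
The scenario can be modeled with a short Python sketch. The layer names mirror the three layers discussed in this PR, but the cached path lists and function names are illustrative assumptions, not the contents of the real config.

```python
def pick_restored_layer(layers):
    # layers: (name, digest_still_matches, cached_paths), largest layer first.
    # Returns the first layer whose digest still matches.
    for name, digest_matches, paths in layers:
        if digest_matches:
            return name, paths
    return None, []


def will_repatch(restored_paths):
    # Without packages/ (and its _verify markers), packages are
    # downloaded, extracted and patched again over the cached build dirs.
    return "packages" not in restored_paths


# Before the fix: a module changed, so layer 3 misses and layer 2 is restored.
before_fix = [
    ("layer 3 (all modules)", False, ["build", "install", "packages"]),
    ("layer 2 (coreboot)", True, ["build"]),
    ("layer 1 (musl-cross-make)", True, ["crossgcc"]),
]

# After the fix: every layer also caches the packages directory.
after_fix = [
    ("layer 3 (all modules)", False, ["build", "install", "packages"]),
    ("layer 2 (coreboot)", True, ["build", "packages"]),
    ("layer 1 (musl-cross-make)", True, ["crossgcc", "packages"]),
]
```

In this model, the restored layer before the fix is layer 2 without packages, so repatching is triggered. After the fix the same layer is restored, but its _verify markers come along with it.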


For coreboot 4.11, which is the only version whose patches actually add new files, CircleCI bails out of the build because patch attempts to create a new file that already exists.

This needs special attention when inspecting logs on the second build.

Checkpoints:

  • make, gawk: rebuilt or reused? (neither layer 1 nor 2 is caching the build/make and build/gawk directories)

@tlaurion
Collaborator Author

Included in #1124, which is building the cache. Stopped the first CircleCI build (creating cache) for this PR. Will reuse this PR to test reusing the cache.

@tlaurion
Collaborator Author

Wow. It took 3 runs to build t430-flash because of race conditions...

@tlaurion tlaurion force-pushed the CircleCI_add_packages_to_all_cache_layers branch from 95ef953 to 5df8791 Compare February 24, 2022 18:18
@tlaurion
Collaborator Author

The second run, reusing the cache, is happening at https://app.circleci.com/pipelines/github/tlaurion/heads/999/workflows/51140d16-151a-4783-9745-c90d631f79f5

We are waiting for the kgpe-d16 builds to see if coreboot 4.11 still fails at patching, which should no longer be the case.

@tlaurion
Collaborator Author

Successful, though CircleCI shows more and more weird behavior and fails on other race conditions, now sometimes on x230-flash it seems.

Rebuilding one last time prior to asking for a merge, since this simple fix makes sure that we won't fail because of repatching and trying to recreate files that already exist in CircleCI caches.

@tlaurion tlaurion force-pushed the CircleCI_add_packages_to_all_cache_layers branch from 66ddce9 to 92c817a Compare February 24, 2022 20:25
@tlaurion
Collaborator Author

tlaurion commented Apr 2, 2022

Hitting us again since this was not merged.

https://app.circleci.com/pipelines/github/osresearch/heads/399/workflows/63d3c2ff-16ca-4680-9666-903f236b56f9/jobs/2772/parallel-runs/0/steps/0-108

We can see in the prep step that, since a module was changed, only the coreboot cache is downloaded, which doesn't cache packages. Since packages is not cached, the coreboot 4.11 packages are redownloaded and reextracted, and build/coreboot-* are repatched. Since the patches create some files for 4.11, and those files were already created by the previous patching stored in the cache, the build stalls because it waits for user interaction at:

The next patch would create the file src/security/tpm/sha1.c, which already exists!
Assume -R? [n] EOF
Apply anyway? [n] EOF
Skipping patch.
1 out of 1 hunk ignored
The next patch would create the file src/security/tpm/sha1.h, which already exists!
Assume -R? [n]
make: *** [Makefile:507: /root/project/build/coreboot-4.11/.canary] Hangup
context deadline exceeded

@tlaurion tlaurion force-pushed the CircleCI_add_packages_to_all_cache_layers branch from 92c817a to f6d049b Compare April 2, 2022 18:58
tlaurion referenced this pull request Apr 2, 2022
Fix current builds (zlib 1.2.11 cannot be downloaded, busybox patch not applied)
@tlaurion tlaurion changed the title WiP: CircleCI cache: have all cache layers caching packages directory. CircleCI cache: have all cache layers caching packages directory. Apr 2, 2022
@tlaurion tlaurion marked this pull request as ready for review April 2, 2022 19:13
@tlaurion tlaurion merged commit 493eb3e into linuxboot:master Apr 2, 2022