Feature request: Cached recipes #1861

Open
tgross35 opened this issue Jan 21, 2024 · 4 comments
@tgross35
Contributor

For quite a few CMake projects, I have separate recipes for configuration and build, with configure being a dependency of build. Normally the configuration does not need to be rerun unless variables change, and I would prefer that it run as rarely as possible because it can sometimes take longer than the build.

Currently I have something a bit messy that stores a hash of all captured variables to check if anything changed:

# Configure CMake
configure build-type="Debug" projects="clang":
	#!/bin/sh
	# Hash all configurable parts
	hash="{{ sha256(source_dir + build_dir + build-type + install_dir + projects + linker_arg) }}"
	# A missing hash file reads as empty, so the first run always configures
	if [ "$hash" = "$(cat '{{config_hash_file}}' 2>/dev/null)" ]; then
		echo "configuration up to date, exiting"
		exit 0
	else
		echo "config outdated, rerunning"
	fi

	# Record the hash only if CMake succeeds, so a failed run is retried
	cmake "-S{{source_dir}}/llvm" "-B{{build_dir}}" \
		-G Ninja \
		-DCMAKE_C_COMPILER_LAUNCHER=sccache \
		-DCMAKE_CXX_COMPILER_LAUNCHER=sccache \
		-DCMAKE_EXPORT_COMPILE_COMMANDS=true \
		"-DCMAKE_BUILD_TYPE={{build-type}}" \
		"-DCMAKE_INSTALL_PREFIX={{install_dir}}" \
		"-DLLVM_ENABLE_PROJECTS={{projects}}" \
		"{{linker_arg}}" \
		&& printf '%s' "$hash" > "{{config_hash_file}}"

My suggestion is to add a way to do this by default. The above would become:

[cached]
configure build-type="Debug" projects="clang":
	cmake "-S{{source_dir}}/llvm" "-B{{build_dir}}" \
	# ...

And Just would need to do the following:

  • Evaluate all captures in the recipe
  • Create a deterministic hash of the captures (and maybe also environment variables?)
  • If never run before, store this information in the cache directory. Roughly
    {
      "cached_recipes": [
        { "path": "/home/user/project/justfile", "recipe": "configure", "hash": "09ca7e4e...", "last_run": "2024-01-21T08:40:52Z" },
        // ...
      ]
    }
  • If an entry for that file and recipe already exists, compare the hash. Skip if it is the same
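
The steps above can be approximated today with a small shell helper. This is only a sketch, not how Just would implement it: `run_cached`, the `CACHE_DIR` variable, and the one-file-per-recipe layout are assumptions made for illustration (the proposal above uses a single JSON store instead), and it assumes `sha256sum` is installed.

```shell
#!/bin/sh
# Hypothetical helper: run a command only when its cache key has changed.
# CACHE_DIR and run_cached are illustrative names, not part of Just.
CACHE_DIR="${CACHE_DIR:-.just-cache}"

# usage: run_cached <recipe-name> <key-material> <command> [args...]
run_cached() {
    name=$1
    material=$2
    shift 2

    # Deterministic hash of the key material.
    hash=$(printf '%s' "$material" | sha256sum | cut -d' ' -f1)
    hash_file="$CACHE_DIR/$name.hash"

    # Skip when a stored hash exists and matches the current one.
    if [ -f "$hash_file" ] && [ "$hash" = "$(cat "$hash_file")" ]; then
        echo "$name: up to date, skipping"
        return 0
    fi

    # Run the command; record the hash only if it succeeded.
    "$@" || return $?
    mkdir -p "$CACHE_DIR"
    printf '%s' "$hash" > "$hash_file"
}
```

Called as `run_cached configure "$build_type|$projects" cmake ...`, the command is skipped on the second call unless the key material differs. Like any scheme of this kind, it cannot notice changes that are not part of the key, such as edited files or a replaced binary.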

A more flexible alternative is to have the user specify what gets set as a cache key. This would be easier for Just to implement too, but is less user friendly.

[cache_keys(source_dir, build_dir, build-type, install_dir, projects, linker_arg, `$SOME_ENV`)]
configure build-type="Debug" projects="clang":
	cmake "-S{{source_dir}}/llvm" "-B{{build_dir}}" \
	# ...
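
One detail matters for either variant: hashing a plain concatenation of values, as in the `sha256(source_dir + build_dir + ...)` call in the first recipe, lets different inputs collide (`ab` + `c` and `a` + `bc` produce the same string). A sketch of a key function that length-prefixes each value avoids this. `cache_key` is a hypothetical name, and `sha256sum` is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: build a cache key from an explicit list of values.
# Each value is prefixed with its length so that ("ab", "c") and
# ("a", "bc") produce different keys.
cache_key() {
    for value in "$@"; do
        printf '%s:%s;' "${#value}" "$value"
    done | sha256sum | cut -d' ' -f1
}
```

For example, `cache_key "$source_dir" "$build_dir" "$SOME_ENV"` would give a key that changes whenever any one of the listed values changes.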

This is slightly related to #867, since file dependencies are mostly used for caching.

@casey
Owner

casey commented Jan 27, 2024

I think this would probably create a long tail of tricky issues. I've had the experience with a few build systems that sometimes things get cached when they shouldn't, like if a file on disk, environment variable, or binary changes, but the build system isn't aware of it. So I'm open to adding this, but only if it has a simple, minimal implementation which is easy for users to understand. I kind of suspect that this isn't possible, but I'll leave this open in case someone can come up with something clever.

@nmay231 nmay231 mentioned this issue Feb 16, 2024
18 tasks
@rhysparry

Something that I have done in the past is use || in the command to short-circuit the recipe. This can work well for simple things, but can get a little clumsy as the logic gets more complicated (as in @tgross35's original example).

E.g.

configure:
    [ -f configured ] || ./run-slow-process-to-configure.sh 

In this example, it just checks for the existence of the file configured, and if it exists, it doesn't do the rest.
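
For the check to ever succeed, the slow step has to create the marker once it finishes. A minimal sketch of the complete pattern in plain shell, where `slow_configure` is a hypothetical stand-in for the real script and `configured` is the marker file from the example:

```shell
#!/bin/sh
# Stand-in for ./run-slow-process-to-configure.sh (hypothetical).
slow_configure() {
    echo "configuring..."
}

# Run the slow step only when the marker is missing, and create the
# marker only if the step succeeded.
configure_once() {
    [ -f configured ] || { slow_configure && touch configured; }
}
```

Calling `configure_once` twice in the same directory runs `slow_configure` only the first time. Deleting the `configured` file forces a rerun.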

If we could better structure this sort of short-circuiting in just this might provide a straightforward way for users to define how they want to control any skipping behaviour in recipes.

Creating another recipe that defines the logic might be a natural way to achieve this.

configured-file-exists:
    [ -f configured ]

Then it would be a case of deciding how to best express this requirement.

E.g. an attribute

[short_circuit(configured-file-exists)]
configure:
    ./run-slow-process-to-configure.sh

As part of the recipe line:

configured-file-exists || configure:
    ./run-slow-process-to-configure.sh

Or some other way.

@rhysparry

Another approach leverages the existing mechanism for defining dependencies: create independent recipes for each of the alternative branches, and then a "gate" recipe to combine them.

[private]
check-if-already-configured:
    # do the check

[private]
do-actual-configure:
    # do the configure

configure: (check-if-already-configured || do-actual-configure)

I like that this approach leverages the existing recipe mechanism, although it does lean into having a recipe failing (albeit with a path to recover the overall run).

@jrouaix

jrouaix commented Sep 3, 2024

Hi, I found this issue after implementing a first draft of our own solution:

# Handle recipe cached completions

# completion cache directory
completions_cache_dir := justfile_directory() / ".recipe.completions"

clear_cached_runs: 
  @rm -rf {{completions_cache_dir}}
  @echo "Completions cleared"

# run a recipe only if it has not been successfully completed yet
cached_run recipe:
  #!/usr/bin/env sh
  if 
    test -f "{{completions_cache_dir}}/{{recipe}}.completed"
  then 
    echo "'{{recipe}}' already completed"
  else
    # mark the recipe as completed only if it succeeded
    just "{{recipe}}" || exit $?
    mkdir -p "{{completions_cache_dir}}"
    touch "{{completions_cache_dir}}/{{recipe}}.completed"
  fi

######################################
# -----         EXAMPLES       ----- #

# test recipe (this is a syntax test; we'll have to declare one more line for each recipe)
init_mob: 
  @echo "INIT_MOB"

test_mob: (cached_run "init_mob")
  @echo "TEST_MOB"

build_mob: (cached_run "test_mob")
  @echo "BUILD_MOB"

publish_mob: init_mob build_mob

Using this example:

$ just test_mob
'init_mob' already completed
TEST_MOB

$ just build_mob
'init_mob' already completed
TEST_MOB
BUILD_MOB

$ just publish_mob
INIT_MOB
'test_mob' already completed
BUILD_MOB

It would be awesome to have a special syntax for that! 🚀

@casey casey added the coveted label Dec 21, 2024

4 participants