Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolated variables should be factored into checksum #548

Open
Porges opened this issue Aug 9, 2021 · 14 comments · May be fixed by #1401
Open

Interpolated variables should be factored into checksum #548

Porges opened this issue Aug 9, 2021 · 14 comments · May be fixed by #1401
Labels
breaking change Changes that are not backward compatible.

Comments

@Porges
Copy link
Contributor

Porges commented Aug 9, 2021

  • Task version: v3.7.0 (h1:8dDFv12/Lxi6PHkYNIIm7N6v6oRGxePpLK67LuMh/Rs=)
  • Operating System: Linux 2d7a33008788 5.10.43.3-microsoft-standard-WSL2 # 1 SMP Wed Jun 16 23:47:55 UTC 2021 x86_64 GNU/Linux

Example Taskfile showing the issue

version: '3.7'

run: when_changed

vars:
  VERSION: 1

tasks:
  generate-file:
    sources:
    - base.txt
    cmds:
    - <base.txt sed -e 's/VERSION/{{.VERSION}}/g' > versioned.txt

When changing VERSION from 1 to 2, this should cause generate-file to re-run. At the moment it doesn’t because interpolated variables are not incorporated into the checksum.

@Porges Porges added the type: bug Something not working as intended. label Aug 9, 2021
@Porges
Copy link
Contributor Author

Porges commented Aug 17, 2021

Also, ideally, the task itself would be checksummed somehow, so that if I change the commands it will re-run.

@andreynering andreynering added enhancement and removed type: bug Something not working as intended. labels Oct 12, 2021
@andreynering
Copy link
Member

Hi @Porges,

It's currently possible to instruct Task to invalidate the checksum on specific variables changes by adding a label: prop to the task:

version: '3'

tasks:
  generate-file:
    cmds:
      - ...
    label: 'generate-file-{{.VERSION}}'

It's a valid discussion whether we should do that by default or not, as you suggested. I think it may be unexpected to some users. For example: if a variable isn't deterministic (a date/time or something) the task will always run and that may not always be the expected behavior.

This makes me think that, if done, there should probably be a setting to choose the appropriate behavior. But maybe label: is enough and we would not even need that...

@ghostsquad
Copy link
Contributor

I found myself changing my task file, and later wondering how I was going to invalidate the cache automatically. It's not really just vars, it's actually the entire task definition. I imagine that you may want to checksum the output of --dry-run as part of the caching behavior, or at least some flag to enable or disable that.

I'm curious what people think about having the default behavior include the task definition itself (and variable values) as part of the checksum, with a flag to disable 1 or both of those?

@ghostsquad ghostsquad added the v4 label Apr 29, 2022
@ghostsquad ghostsquad added the breaking change Changes that are not backward compatible. label May 15, 2022
@ahus1
Copy link

ahus1 commented Jun 26, 2022

@ghostsquad: +1 for making the task definition part of the checksum (not sure about interpolations as a new user yet). I admit that I'm new to task, still the first thing I did was adding Taskfile.yaml as one of the sources (which would build too much).

@ghostsquad
Copy link
Contributor

For now, I'm including Taskfile.yml itself in sources. This is equivalent of using a sledge hammer when a rubber mallet would be sufficient.

@ahus1
Copy link

ahus1 commented Jun 27, 2022

This is equivalent of using a sledge hammer when a rubber mallet would be sufficient.

I know, and it wasn't fun. Next iteration was task -f taskname, but found that it also forces the dependencies of a task to run.

Current iteration: a small script that extracts each task from the Taskfile.yaml and use that as a source: once the task changes, run the task again. Works OK for now.

# Taskfile.yaml
tasks:
  split:
    desc: Split Taskfile to one-file-per-task for dirty checking
    run: once
    cmds:
      - bash -c ./split.sh
    sources:
      - Taskfile.yaml
      - split.sh
    silent: true

  ipchange:
    deps:
      - split
    cmds:
      # ...
    sources:
      - .task/subtask-{{.TASK}}.yaml
# split.sh
set -euo pipefail
mkdir -p .task
rm -f .task/subtask-*.yaml
for i in $(yq -M e '.tasks | keys' Taskfile.yaml); do
   yq -M e ".tasks.${i}" Taskfile.yaml > .task/subtask-${i}.yaml
done

@andreynering
Copy link
Member

Hi @ahus1,

Have you tried my suggestion done here? #548 (comment)

I believe that's what you're are looking for.

@ahus1
Copy link

ahus1 commented Jul 6, 2022

@andreynering - your suggestion will solve AFAIK the problem when a variable changes, as the labels changes with the variable. I don't know what will happen if the variable changes back to the old value: Would it then run again, or will the state with the old variable value / label name will be still there and it will not run?

This thread evolved form interpolating variables to also include the body of the task definition, so that whenever the definition changes, the task should run again. The variable/label approach doesn't cover that, so I looked for something else. So I came up with the idea to split the taskfile by tasks, and then use that as a "sources" element. I did some iterations with that, and it looks ok for me. although a bit verbose.

@ghostsquad
Copy link
Contributor

ghostsquad commented Jul 6, 2022

I think what is realistically needed is a "cache key". I don't know what else that label does. But generally, if we are able to derive our own "key" tied to the task definition itself, that means that we can invalidate the cache arbitrarily.

@pd93 pd93 removed the v4 label Oct 15, 2022
@aminya
Copy link
Contributor

aminya commented Mar 14, 2023

I tried using label: {{.TASK}}-{{.VARIABLE}} in addition to sources and generates, but still the task is incorrectly skipped. Is there a way to fix this issue?
I have to disable the whole caching because of this bug

@ypid-work
Copy link

#548 (comment)

Also, ideally, the task itself would be checksummed somehow, so that if I change the commands it will re-run.

#455 already exists where this is discussed.

aminya added a commit to aminya/task that referenced this issue Nov 15, 2023
This adds a new checker for Task definitions, status, and any other variable that affect a task.

The implementation hashes the task structure and writes it to `.task/definition`. In later runs, this is checked to make sure the definition is up-to-date. If the definition has changed (e.g. one of the variables has changed), the task will run again.

Fixes go-task#548
Fixes go-task#455
aminya added a commit to aminya/task that referenced this issue Nov 16, 2023
This adds a new checker for Task definitions, status, and any other variable that affect a task.

The implementation hashes the task structure and writes it to `.task/definition`. In later runs, this is checked to make sure the definition is up-to-date. If the definition has changed (e.g. one of the variables has changed), the task will run again.

Fixes go-task#548
Fixes go-task#455
@dcecile
Copy link

dcecile commented Jan 9, 2024

My workaround is to add a silent command that does nothing except reference the used variables:

      - cmd: 'true # /{{.WATCH}}/{{.RESTART}}/{{.MODE}}/{{.TEST}}/{{.NAME}}/{{.ARGS}}/'
        silent: true

So it seems like interpolated variables are factored in with 3.29.1? I think my issue was some of my other usages weren't factored in.

Other usages
      - task: :watchman:client
        vars:
          WATCH: '{{.WATCH}}'
          RESTART: '{{.RESTART}}'
          MODE: executable
          COMMAND: build/target/{{.MODE}}/intertwine-{{.NAME | trimSuffix "-rs"}}{{if eq .TEST "true"}}-test{{end}} {{.ARGS}} {{.CLI_ARGS}}

@aminya
Copy link
Contributor

aminya commented Jan 9, 2024

I fixed this issue in
#1401

It just needs addressing some comments otherwise it works as expected.

aminya added a commit to aminya/task that referenced this issue Feb 7, 2024
This adds a new checker for Task definitions, status, and any other variable that affect a task.

The implementation hashes the task structure and writes it to `.task/definition`. In later runs, this is checked to make sure the definition is up-to-date. If the definition has changed (e.g. one of the variables has changed), the task will run again.

Fixes go-task#548
Fixes go-task#455
aminya added a commit to aminya/task that referenced this issue Feb 7, 2024
This adds a new checker for Task definitions, status, and any other variable that affect a task.

The implementation hashes the task structure and writes it to `.task/definition`. In later runs, this is checked to make sure the definition is up-to-date. If the definition has changed (e.g. one of the variables has changed), the task will run again.

Fixes go-task#548
Fixes go-task#455
aminya added a commit to aminya/task that referenced this issue Apr 25, 2024
This adds a new checker for Task definitions, status, and any other variable that affect a task.

The implementation hashes the task structure and writes it to `.task/definition`. In later runs, this is checked to make sure the definition is up-to-date. If the definition has changed (e.g. one of the variables has changed), the task will run again.

Fixes go-task#548
Fixes go-task#455
aminya added a commit to aminya/task that referenced this issue May 3, 2024
This adds a new checker for Task definitions, status, and any other variable that affect a task.

The implementation hashes the task structure and writes it to `.task/definition`. In later runs, this is checked to make sure the definition is up-to-date. If the definition has changed (e.g. one of the variables has changed), the task will run again.

Fixes go-task#548
Fixes go-task#455
@reitzig
Copy link

reitzig commented Oct 1, 2024

FWIW: The docs say (emphasis mine):

when_changed only invokes the task once for each unique set of variables passed into the task

As a newcomer, it was quite surprising to me that changes in vars did not cause a re-run!

The workaround with label works, I guess, but a) is not intuitive from the docs, b) does not look nice in the CLI output (in my case), and c) causes lots of files in .task/checksum over time.

My workaround is to (always) write the values (resp. a checksum of theirs) to a file and include that as source:

tasks:
  init-submodules-status:
    internal: true
    silent: true
    cmds:
      - git submodule status --cached | sha256sum | cut -d' ' -f1 > .task/git-submodules-status
    run: always
  init-submodules:
    deps:
      - init-submodules-status
    sources:
      - .gitmodules
      - .task/git-submodules-status
    cmds:
      - git submodule update --init
    run: when_changed

(Well, I need to re-run basd on changes in the output of a command, but I guess the approach carries over to writing the values of variables into files?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes that are not backward compatible.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants