Execute parallel tasks for nested tasks in a job spec #21

Open
pandemicsyn opened this issue Jun 9, 2022 · 1 comment

@pandemicsyn

Given a YAML job spec such as:

jobs:
- name: g-to-p-job
  tasks:
  - - tap-gitlab target-postgres
    - tap-github target-postgres
  - dbt-postgres:run

Today we create one DAG with two connected tasks:

#Dag -> task1 -> task2

task1 = meltano run tap-gitlab target-postgres tap-github target-postgres
task2 (depends on task1) = meltano run dbt-postgres:run

But it should be "trivial" to parallelize the subtasks rather than flattening them into a single task:

#Dag -> task1 (parallel subtask1, parallel subtask2) -> task2

task1 = (subtask_1, subtask_2)
    subtask_1 = meltano run tap-gitlab target-postgres
    subtask_2 = meltano run tap-github target-postgres
task2 (depends on task1) = meltano run dbt-postgres:run

To do so, we'd just need to trigger subtask creation at this check:

if isinstance(task, Iterable) and not isinstance(task, str):
    run_args = " ".join(task)
else:

This parallel execution isn't native to Meltano and isn't yet supported by meltano run itself, but it would at least allow Airflow users to leverage parallel subtasks.
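
A minimal sketch of the idea, assuming the job's tasks have already been loaded from the YAML into a plain Python list. The helper name `build_stages` is hypothetical and not part of the existing generator. It only shows how the same check could emit one command per nested entry instead of joining them; creating the actual Airflow tasks and wiring their dependencies is left out.

```python
from collections.abc import Iterable


def build_stages(tasks):
    """Turn a job's task list into ordered stages of `meltano run` commands.

    Each stage is a list of commands. Commands within one stage do not
    depend on each other, so they could be scheduled as parallel subtasks.
    Hypothetical helper for illustration only.
    """
    stages = []
    for task in tasks:
        if isinstance(task, Iterable) and not isinstance(task, str):
            # Nested list: one command per entry instead of " ".join(task).
            stages.append([f"meltano run {subtask}" for subtask in task])
        else:
            # Plain string: a stage with a single command.
            stages.append([f"meltano run {task}"])
    return stages


# The task list from the job spec above, as it would look after YAML loading.
tasks = [
    ["tap-gitlab target-postgres", "tap-github target-postgres"],
    "dbt-postgres:run",
]

stages = build_stages(tasks)
for number, commands in enumerate(stages, start=1):
    print(f"task{number}: {commands}")
```

Each stage would then depend on every command in the stage before it, which matches the task1 (subtask_1, subtask_2) -> task2 layout described above.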

@pandemicsyn

@aaronsteers @tayloramurphy might be nice to try to squeeze this into a future iteration. It would be a pretty simple change (weight 2) but should be teamed up with #17 (adding some test coverage).
