Replies: 2 comments
-
Hi @mehrzadai, thanks for opening this issue and sorry for the delay. On first inspection your use case makes sense, but it might be problematic for us to introduce a special case for partitioned datasets to allow different nodes to write to different partitions of the same dataset. We'll have a look at this soon.
-
I'm moving this to a discussion for now, let's continue the conversation there.
-
I've run into an issue that may be solved in the future, or may already have a solution I'm not aware of.
I have a scenario with several categories of big data, e.g. `rates`, `sales`, `views`, and `reviews`, and I want to join them together. I don't want a separate dataset for each category in my catalog; instead, I want to save each one as a partition of the same partitioned dataset, something like this:
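(A rough sketch of what I mean; the function and dataset names are placeholders, and `concat` would be declared as a single `PartitionedDataset` entry in the catalog.)

```python
from kedro.pipeline import Pipeline, node

# Placeholder processing functions, one per category of data.
def make_rates(raw_rates): ...
def make_sales(raw_sales): ...
def make_views(raw_views): ...
def make_reviews(raw_reviews): ...
def join_all(partitions): ...

# What I would like to write: each node saves one partition of the same
# partitioned dataset "concat", and a downstream node joins them.
# Today, constructing this pipeline raises OutputNotUniqueError,
# because "concat" is returned by more than one node.
pipeline = Pipeline([
    node(make_rates, "raw_rates", "concat"),      # -> partition "rates"
    node(make_sales, "raw_sales", "concat"),      # -> partition "sales"
    node(make_views, "raw_views", "concat"),      # -> partition "views"
    node(make_reviews, "raw_reviews", "concat"),  # -> partition "reviews"
    node(join_all, "concat", "joined"),
])
```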
This way, I keep the connectivity between my nodes and get lazy save/load at the same time.
But currently, the rule is:
`kedro.pipeline.pipeline.OutputNotUniqueError: Output(s) ['concat'] are returned by more than one nodes. Node outputs must be unique.`
I can save my partitions like this:
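(Again a rough sketch with placeholder names; as far as I understand, a node can return a dict mapping partition names to callables, and `PartitionedDataset` will then save each partition lazily.)

```python
from kedro.pipeline import node

# One node builds every partition and returns a dict of
# partition name -> callable; the partitioned dataset "concat"
# only invokes each callable when it saves that partition.
def build_all_partitions(rates, sales, views, reviews):
    return {
        "rates": lambda: rates,
        "sales": lambda: sales,
        "views": lambda: views,
        "reviews": lambda: reviews,
    }

save_node = node(
    build_all_partitions,
    ["rates_input", "sales_input", "views_input", "reviews_input"],
    "concat",
)
```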
and then load the partitioned dataset in another node, but this way I lose the connectivity of my nodes, since all the per-category work collapses into a single node.
I think it would be better to relax this rule for partitioned datasets, so that each partition can be saved by a different node.