Replies: 2 comments
-
Hi @mehrzadai, thanks for opening this issue and sorry for the delay. On first inspection your use case makes sense, but it might be problematic for us to introduce a special case for partitioned datasets to allow different nodes to write to different partitions of the same dataset. We'll have a look at this soon.
-
I'm moving this to a discussion for now, let's continue the conversation there.
-
I've run into an issue that may be solved in the future, or may already have a solution I'm not aware of.
I have a scenario with several categories of big data, e.g. `rates`, `sales`, `views`, and `reviews`, and I want to join them together. I don't want a separate dataset for each category in my catalog; instead, I want to save each one as a partition of the same partitioned dataset, something like this:
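(A rough sketch of what I mean; the function and dataset names are placeholders, and `concat` would be declared as a single `PartitionedDataset` entry in the catalog.)

```python
from kedro.pipeline import Pipeline, node

# Placeholder processing functions, one per category of data.
def make_rates(raw_rates): ...
def make_sales(raw_sales): ...
def make_views(raw_views): ...
def make_reviews(raw_reviews): ...
def join_all(partitions): ...

# What I would like to write: each node saves one partition of the same
# partitioned dataset "concat", and a downstream node joins them.
# Today, constructing this pipeline raises OutputNotUniqueError,
# because "concat" is returned by more than one node.
pipeline = Pipeline([
    node(make_rates, "raw_rates", "concat"),      # -> partition "rates"
    node(make_sales, "raw_sales", "concat"),      # -> partition "sales"
    node(make_views, "raw_views", "concat"),      # -> partition "views"
    node(make_reviews, "raw_reviews", "concat"),  # -> partition "reviews"
    node(join_all, "concat", "joined"),
])
```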
This way, I keep the connectivity between my nodes and get lazy save/load at the same time.
But currently, the rule is:
`kedro.pipeline.pipeline.OutputNotUniqueError: Output(s) ['concat'] are returned by more than one nodes. Node outputs must be unique.`
I can save my partitions like this:
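(Again a rough sketch with placeholder names; as far as I understand, a node can return a dict mapping partition names to callables, and `PartitionedDataset` will then save each partition lazily.)

```python
from kedro.pipeline import node

# One node builds every partition and returns a dict of
# partition name -> callable; the partitioned dataset "concat"
# only invokes each callable when it saves that partition.
def build_all_partitions(rates, sales, views, reviews):
    return {
        "rates": lambda: rates,
        "sales": lambda: sales,
        "views": lambda: views,
        "reviews": lambda: reviews,
    }

save_node = node(
    build_all_partitions,
    ["rates_input", "sales_input", "views_input", "reviews_input"],
    "concat",
)
```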
and then load the partitioned dataset in another node, but this way I lose the connectivity of my nodes, since all the per-category work collapses into a single node.
I think it would be better to relax this rule for partitioned datasets, so that each partition can be saved by a different node.