Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: AggregateRel should use grouping references #94

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

srikrishnak
Copy link
Contributor

No description provided.

Copy link

codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 67.92453% with 34 lines in your changes missing coverage. Please review.

Project coverage is 62.21%. Comparing base (317c209) to head (4b6da37).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
plan/plan.go 60.46% 16 Missing and 1 partial ⚠️
plan/builders.go 66.66% 8 Missing and 4 partials ⚠️
plan/relations.go 81.48% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #94      +/-   ##
==========================================
+ Coverage   61.88%   62.21%   +0.32%     
==========================================
  Files          44       44              
  Lines        9811     9977     +166     
==========================================
+ Hits         6072     6207     +135     
- Misses       3462     3487      +25     
- Partials      277      283       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

plan/builders.go Outdated Show resolved Hide resolved
plan/builders.go Outdated
AggregateColumns(input Rel, measures []AggRelMeasure, groupByCols ...int32) (*AggregateRel, error)
// Deprecated: Use AggregateExprs(...).Remap() instead.
AggregateExprsRemap(input Rel, remap []int32, measures []AggRelMeasure, groups ...[]expr.Expression) (*AggregateRel, error)
// Deprecated: Use Aggregate(...) instead.
AggregateExprs(input Rel, measures []AggRelMeasure, groups ...[]expr.Expression) (*AggregateRel, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: Is a Grouping something we could provide a builder for instead of requiring the caller to construct a slice of expression slices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could only think of an API which takes input as grouping Expressions and grouping References and gives the slice of expression slices but that would be same as Aggregate() API. Any other ways to do it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I was using the interface I'd prefer to just call:

func (b *builder) Aggregate(input Rel, measures []AggRelMeasure, groupingExpressions []expr.Expression) (*AggregateRel, error) {

Then Aggregate could either:

  1. simply assign ascending values to the grouping references for each expression 1:1 (not gaining any benefit of the identical expressions being able to be deduped and other optimization savings)
  2. assign values to the grouping references for each expression. If an expression duplicates one seen previously then it isn't added to the expression list and a previous grouping reference value is used instead.

The notion of duplication for the second approach could mark equivalence incorrectly. (For instance, perhaps marking a literal 1 as identical to another one even though it is of a different unit.). However I'm not sure it matters.

Copy link
Contributor Author

@srikrishnak srikrishnak Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EpsilonPrime Adding to your suggestion a bit. How about we do introduce more API on AggregateRel like below?

func (b *builder) Aggregate(input Rel, measures []AggRelMeasure, groupingExpressions []expr.Expression) (*AggregateRel, error)

func (ar *AggregateRel) addCube(groupingReferences []uint32) 
func (ar *AggregateRel) addRollup(groupingReferences []uint32)
func (ar *AggregateRel) addGroupingSet(groupingReferences []uint32)

This decoupling, however, means we are not ensuring AggregateRel always has groupingExpressions of which every expression is always referenced by atleast one groupingReference.

Maybe alternatively use something like

func (b *builder) Aggregate(input Rel, measures []AggRelMeasure, groupingExpressions []expr.Expression) (*AggregateRelBuilder, error)

func (arb *AggregateRelBuilder) addCube(groupingReferences []uint32) 
func (arb *AggregateRelBuilder) addRollup(groupingReferences []uint32)
func (arb *AggregateRelBuilder) addGroupingSet(groupingReferences []uint32)
func (arb *AggregateRelBuilder) build() (*AggregateRel, error)

But this is introducing a new builder.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like the idea of adding a builder to ease construction. I would try to avoid having to deal with grouping references unless the builder is going to tell me which reference a given expression is. Perhaps we can return the references as part of adding the expressions?

func (arb *AggregateBuilder) addExpression(expr.Expression) uint32

}, nil
}

func (b *builder) AggregateExprs(input Rel, measures []AggRelMeasure, groups ...[]expr.Expression) (*AggregateRel, error) {
return b.AggregateExprsRemap(input, nil, measures, groups...)
}

func (b *builder) Aggregate(input Rel, measures []AggRelMeasure, groupingExpressions []expr.Expression, groupingReferences [][]uint32) (*AggregateRel, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be part of the interface?

@@ -286,11 +287,13 @@ func (b *builder) AggregateColumnsRemap(input Rel, remap []int32, measures []Agg
}
}

groupingExpressions, groupingReferences := groupingExprs(exprs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could ditch the Remap versions since they've been deprecated for a while (hehe, for at least two major versions) to avoid needing passing these changes along.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants