Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Represent all assignment operator calls in the IR #1177

Open
smeenai opened this issue Nov 27, 2024 · 3 comments
Open

Represent all assignment operator calls in the IR #1177

smeenai opened this issue Nov 27, 2024 · 3 comments
Labels
IR design Considerations around the design of ClangIR

Comments

@smeenai
Copy link
Collaborator

smeenai commented Nov 27, 2024

ClangIR tries to represent all function calls in the IR (e.g. CodeGen omits calls to trivial constructors but CIRGen keeps them). One exception I've discovered is around calling operator= for an array member when generating a default operator=. https://godbolt.org/z/nEE46ff9d is an example; for the following code:

struct S { int i; };
struct T {
    S arr[4];
    T &operator=(const T &);
};
T &T::operator=(const T &) = default;

The generated code for T::operator= does a memcpy for the arr field instead of emitting the individual S::operator= calls for each array member.

This stems from how CIRGen (and CodeGen) generate default assignment operators. Clang generates an AST representation of the assignment operator, which already contains a __builtin_memcpy call for the arr field, and then CIRGen emits code for that AST. We'd need to change the AST itself to not generate a memcpy here.

@smeenai smeenai added the IR design Considerations around the design of ClangIR label Nov 27, 2024
@bcardosolopes
Copy link
Member

bcardosolopes commented Nov 27, 2024

We'd need to change the AST itself to not generate a memcpy here.

Not necessarily, we could look at the node children to realize a __builtin_memcpy is there (a small hand written visitor could be necessary) and just bypass it because the new CIR operation emitted would imply such memcpy during lowering.

EDIT: Maybe changing the AST could be the long term solution, my point is that it shouldn't block us to make this happen since it's fairly orthogonal

@smeenai
Copy link
Collaborator Author

smeenai commented Nov 28, 2024

The visitor already exists here. The problem is that the AST doesn't distinguish between arrays of primitive types (which we'd still want to memcpy) and arrays of trivial structs: https://godbolt.org/z/xcMbd8qPT. We can examine the field type in that visitor, but if the visitor says the field isn't memcpyable we'll fall back to CGF.emitStmt, which will translate that AST node into a memcpy anyway. We can special case that again, but it seems kinda ugly vs. changing the AST.

https://godbolt.org/z/fhbM8ajja is an example with a non-trivial assignment operator for reference, and the generated AST contains a full initialization loop. I'm also including the default copy constructor for reference, and it's interesting that it has a much higher-level representation (ArrayInitLoopExpr, which #1055 discusses the appropriate CIR equivalent for). I imagine we'll also want to raise assignment operator loops in the future.

smeenai added a commit that referenced this issue Dec 3, 2024
Our previous logic here was matching CodeGen, which folds trivial
assignment operator calls into memcpys, but we want to avoid that. Note
that we still end up emitting memcpys for arrays of classes with trivial
assignment operators; #1177 tracks
fixing that.
@bcardosolopes
Copy link
Member

bcardosolopes commented Dec 6, 2024

I think it's totally reasonable if high level code for this is just a thin wrapper for starters (e.g. a new operation with all this logic happen within it's region), this way we don't have to deviate from CIRGen, and will delimit initialization code altogether, such that when we get fancy with analysis we can then play more with additional abstractions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IR design Considerations around the design of ClangIR
Projects
None yet
Development

No branches or pull requests

2 participants