ByteBuf returned by BookKeeper triggers CRC Checksum calculation when calling "readBytes" #4372

Open
eolivelli opened this issue May 17, 2024 · 6 comments

@eolivelli
Contributor

BUG REPORT

Describe the bug
I have been developing a Pulsar BrokerInterceptor. The BrokerInterceptor is able to process the data that Pulsar reads from BookKeeper without memory copies.
While analysing a flamegraph I saw that the ByteBuf returned by BookKeeper shows this weird behaviour and uses a lot of CPU.

This is a flame graph. The version of BookKeeper is based on the latest 4.16.x.

image

To Reproduce

See the flamegraph

Expected behavior

readBytes should have very little overhead.

@lhotari
Member

lhotari commented May 17, 2024

That seems like a strange flamegraph. I don't see how readBytes could trigger a CRC calculation.

ByteBufVisitor is used in checksum calculations:

UpdateContext updateContext = new UpdateContext(digest);
ByteBufVisitor.visitBuffers(buffer, offset, len, byteBufVisitorCallback, updateContext);
return updateContext.digest;

PR was #4196
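
For readers unfamiliar with the visitor-style checksum above, here is a minimal sketch of the general idea, not BookKeeper's actual implementation. It uses only the JDK (`java.util.zip.CRC32C` and `java.nio.ByteBuffer`) in place of Netty's `ByteBuf` and BookKeeper's `ByteBufVisitor`. The class and method names are hypothetical. It shows one running digest being fed segment by segment, which yields the same value as a single pass over the whole payload.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32C;

// Hypothetical illustration class; not part of BookKeeper or Netty.
public class SegmentedChecksumSketch {

    /** Feeds each segment into one running CRC32C, similar in spirit to visiting the parts of a composite buffer. */
    static long checksumOfSegments(List<ByteBuffer> segments) {
        CRC32C crc = new CRC32C();
        for (ByteBuffer segment : segments) {
            // duplicate() shares the bytes but has its own position/limit,
            // so the caller's buffer is not consumed by the checksum.
            crc.update(segment.duplicate());
        }
        return crc.getValue();
    }

    /** Computes the CRC32C of the whole array in a single call, for comparison. */
    static long checksumOfWhole(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] payload = "hello bookkeeper".getBytes(StandardCharsets.UTF_8);
        // Two views over the same array, standing in for the components of a composite buffer.
        List<ByteBuffer> segments = List.of(
                ByteBuffer.wrap(payload, 0, 5),
                ByteBuffer.wrap(payload, 5, payload.length - 5));
        System.out.println(checksumOfSegments(segments) == checksumOfWhole(payload));
    }
}
```

The point of the sketch is that checksum code of this shape only reads from the buffers it visits. Nothing in it would be reached from a plain copy such as readBytes, which is why the stack in the flamegraph looks suspicious.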

@eolivelli
Contributor Author

The ByteBuf was not coming from a read from the BookKeeper client, because the ByteBuf is coming from the network (it is the Pulsar producer that is sending a message and the interceptor processes it).

But maybe it is a recycled ByteBuf?

@lhotari
Member

lhotari commented May 17, 2024

> The ByteBuf was not coming from a read from the BookKeeper client, because the ByteBuf is coming from the network (it is the Pulsar producer that is sending a message and the interceptor processes it)
>
> But maybe it is a ByteBuf recycled ?

It's hard to see how it could result in the stacktrace even if there was a recycling bug.
The GetBytesCallbackByteBuf instance is not stored as a reference anywhere; it is only passed as a parameter here:

visitBuffer.getBytes(visitIndex, callbackByteBuf, 0, visitLength);

Perhaps it's a profiler issue.

Please add -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints to JVM options to prevent any issues in this area:

  • When agent is not loaded at JVM startup (by using -agentpath option) it is
    highly recommended to use -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints JVM flags.
    Without those flags the profiler will still work correctly but results might be
    less accurate. For example, without -XX:+DebugNonSafepoints there is a high chance
    that simple inlined methods will not appear in the profile. When the agent is attached at runtime,
    CompiledMethodLoad JVMTI event enables debug info, but only for methods compiled after attaching.
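
As a hedged aid for the recommendation above, the following sketch checks from inside a running JVM whether the two flags were actually passed. It relies only on the standard `java.lang.management` API. The class name `ProfilerFlagCheck` and its helper method are hypothetical. It only looks for the literal flag strings among the JVM input arguments, so it does not prove the profiler output is accurate.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper; not part of async-profiler, Pulsar, or BookKeeper.
public class ProfilerFlagCheck {

    static final String UNLOCK = "-XX:+UnlockDiagnosticVMOptions";
    static final String DEBUG_NON_SAFEPOINTS = "-XX:+DebugNonSafepoints";

    /** Returns true only if both flags appear verbatim in the given JVM argument list. */
    static boolean hasProfilerFlags(List<String> jvmArgs) {
        return jvmArgs.contains(UNLOCK) && jvmArgs.contains(DEBUG_NON_SAFEPOINTS);
    }

    public static void main(String[] args) {
        // Input arguments of the current JVM (not the program arguments passed to main).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("profiler flags present: " + hasProfilerFlags(jvmArgs));
    }
}
```

Running this once in the profiled process, for example from a startup hook, is a cheap way to confirm that the options reached the JVM before comparing flamegraphs.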

It might also be useful to compare Async Profiler 2.9 and 3.0 results, just to be sure that the new stacktrace solution in 3.0 isn't causing the problem.

@hangc0276
Contributor

I also found that the checksum calculation costs a lot of CPU:
image

@lhotari
Member

lhotari commented May 29, 2024

> I also found the checksum cost a lot of CPU
> image

@hangc0276 I guess it is expected that the checksum calculation consumes a lot of CPU? In Enrico's case, though, the flamegraph doesn't seem to be valid. My assumption was that adding -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints to the JVM options would fix it, since these options are recommended for getting accurate results while profiling.
