Document possible StackOverflowError on the Spark benchmarks #375

Open
farquet opened this issue Mar 4, 2023 · 0 comments

The context is the following OpenJDK bug, which has been closed as a non-issue: https://bugs.openjdk.org/browse/JDK-8303076

Under certain circumstances (special JVM configurations like the one described in the bug), it is possible that Spark benchmarks fail with a StackOverflowError. The reason is that the benchmarks themselves allocate a lot on the stack and are already close to the overflow limit. This seems to be a common issue with Spark applications.

The simple workaround is to increase the stack size with -Xss. It would be great if this workaround were documented somewhere in the repo (main README?).
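
To illustrate why a larger stack helps, here is a minimal sketch (not taken from the benchmark code) that measures how deep a recursive call chain can go on threads with different stack sizes. The class name `StackSizeDemo` and the sizes 512 KB and 32 MB are assumptions chosen for illustration. It uses the `Thread` constructor that accepts a stack size, which the JVM treats as a hint and may ignore on some platforms; `-Xss` sets the default for threads that do not request a size explicitly.

```java
// Hypothetical demo class; not part of the benchmark suite or Spark.
final class StackSizeDemo {

    // Recurse until the stack is exhausted, counting the frames reached.
    static void recurse(int[] counter) {
        counter[0]++;
        recurse(counter);
    }

    // Run the recursion on a fresh thread with the requested stack size
    // (in bytes) and return the depth reached before StackOverflowError.
    static int maxDepth(long stackBytes) {
        int[] counter = new int[1];
        Thread probe = new Thread(null, () -> {
            try {
                recurse(counter);
            } catch (StackOverflowError expected) {
                // The overflow is the point of the experiment.
            }
        }, "stack-probe", stackBytes);
        probe.start();
        try {
            probe.join(); // join() makes the final counter value visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        // Illustrative sizes only; the actual depths vary by JVM and platform.
        System.out.println("512 KB stack: depth " + maxDepth(512 * 1024L));
        System.out.println("32 MB stack:  depth " + maxDepth(32 * 1024 * 1024L));
    }
}
```

On a typical HotSpot JVM the second depth is much larger than the first. This is the effect that raising `-Xss` has on the deeply nested `ObjectInputStream.readObject` calls in the trace below.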

Hopefully opening this issue already helps potential users hitting this in the future.

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
NOTE: 'als' benchmark uses Spark local executor with 12 (out of 12) threads.
====== als (apache-spark) [default], iteration 0 started ======
GC before operation: completed in 33.258 ms, heap usage 179.102 MB -> 38.409 MB.
23/02/22 14:54:31 WARN GarbageCollectionMetrics: To enable non-built-in garbage collector(s) List(G1 Concurrent GC), users should configure it(them) to spark.eventLog.gcMetrics.youngGenerationGarbageCollectors or spark.eventLog.gcMetrics.oldGenerationGarbageCollectors
23/02/22 14:54:50 ERROR Executor: Exception in task 8.0 in stage 30.0 (TID 247)
java.lang.StackOverflowError
at org.apache.spark.util.ByteBufferInputStream.read(ByteBufferInputStream.scala:49)
at java.base/java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2911)
at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2927)
at java.base/java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:3448)
at java.base/java.io.ObjectInputStream.readHandle(ObjectInputStream.java:1866)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1927)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2248)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1760)
at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2614)
at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2465)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2280)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1760)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:538)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:496)
at scala.collection.generic.DefaultSerializationProxy.readObject(DefaultSerializationProxy.scala:58)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:578)
at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1100)
at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2440)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2280)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1760)
at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2614)
at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2465)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2280)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1760)
at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2614)
at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2465)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2280)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1760)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:538)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:496)
at scala.collection.generic.DefaultSerializationProxy.readObject(DefaultSerializationProxy.scala:58)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
...