spark.closure.serializer

True, but registering your Java class with Kryo causes the Java serializer to be used for it. I did this with Guava's BiMap; without registering BiMap, the task failed to serialize.

Another issue I'm running into: a task built from a closure fails to serialize. I'm using an external Scala library in the closure. How do I determine which class is causing the problem? For all I know, it could be in a transitive dependency.

Using an external Scala library in a closure is fine as long as you don't also use its objects outside the closure. To debug, you can pass the JVM option -Dsun.io.serialization.extendedDebugInfo=true.
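To illustrate how the offending class can be located, here is a minimal sketch in plain Java, without Spark. It tries to Java-serialize an object and reports the class named in the resulting NotSerializableException. The names SerializationProbe, Parser, Task, and findOffender are hypothetical, and Parser only stands in for a class from a third-party library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationProbe {
    // Hypothetical stand-in for a non-serializable class from an external library.
    static class Parser { }

    // Hypothetical closure-like object: serializable itself, but it holds a Parser.
    static class Task implements Serializable {
        final Parser parser = new Parser();
    }

    // Returns null if obj can be Java-serialized; otherwise returns the
    // exception message, which names the class that could not be serialized.
    static String findOffender(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A String is serializable, so no offender is reported.
        System.out.println(findOffender("plain string"));
        // Task drags in a Parser, so the message should name the Parser class.
        System.out.println(findOffender(new Task()));
    }
}
```

Running the same program with -Dsun.io.serialization.extendedDebugInfo=true should add the chain of fields that led to the failing object to the message. In a Spark job, the option would typically be passed through spark.driver.extraJavaOptions and spark.executor.extraJavaOptions.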

I'm reading in a text file using rdd.textFile(source).map{closure}. Inside the closure I use json4s (which uses Java's Jackson) and get a "task not serializable" failure. The closure returns an RDD of case classes, so I don't need to use it outside. Why is the task not serializable, and why can't I use json4s outside the closure? Do I need to register something with Kryo?

For example, suppose you create a json4s object outside the closure and then use it inside the closure. To serialize the closure, Spark must also serialize that json4s object. If the object is not serializable, you get the serialization failure.
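The difference between capturing an object and creating it inside the closure can be sketched in plain Java, without Spark or json4s. This is only an analogy to what happens with a Scala closure in Spark. ClosureCapture, Parser, SerFn, and the helper methods are hypothetical names, and Parser stands in for a non-serializable library object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class ClosureCapture {
    // Hypothetical stand-in for a non-serializable library object.
    static class Parser {
        int length(String s) { return s.length(); }
    }

    // A function type whose lambdas are serializable.
    interface SerFn extends Function<String, Integer>, Serializable { }

    // True if Java serialization of o succeeds.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // The parser is created outside the closure and captured by it,
    // so serializing the closure must also serialize the parser.
    static SerFn capturesOutside() {
        Parser parser = new Parser();
        return line -> parser.length(line);
    }

    // The parser is created inside the closure, so nothing is captured.
    static SerFn createsInside() {
        return line -> new Parser().length(line);
    }

    public static void main(String[] args) {
        System.out.println("captured parser serializes: " + serializes(capturesOutside()));
        System.out.println("local parser serializes: " + serializes(createsInside()));
    }
}
```

Creating the object per call can be costly. A common alternative in Spark is to create it once per partition, for example inside mapPartitions, so that it is still never captured from the driver.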

java - kryo serializing of class (task object) in apache spark returns...

java serialization deserialization apache-spark kryo