We hit an out-of-memory issue caused by unstable streaming, which resulted in Spark cancelling the job. The driver kept running with a stopped context. We need to recreate the context (or restart the driver) whenever the context stops unexpectedly.
Adding context.
From Kevin:
I believe JC Jimenez may have handled this back when he added the Stream Change listener stuff. JC, does that sound familiar? The behavior I was seeing was that StreamingContext (and also SparkContext?) had auto-stopped, but the driver didn't terminate.
From JC:
Yeah, I think you are correct: the context would close, but it wasn't possible to restart it. I think I ended up opting to exit with a non-zero code. However, spark-submit may not have been restarting the driver (despite having the --supervise arg). I would test it to make sure it works. The supervise option didn't seem to work in single-node land.
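JC's "exit with non-zero" approach can be sketched roughly as below. This is a minimal illustration, not Fortis's actual code: the app name, batch interval, and poll timeout are placeholders, and real DStream setup is elided. Since a stopped StreamingContext cannot be restarted, the driver exits non-zero and relies on spark-submit --supervise to relaunch it.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext, StreamingContextState}

object StreamWatchdog {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("fortis-stream") // placeholder name
    val ssc = new StreamingContext(conf, Seconds(30))      // placeholder batch interval

    // ... define DStreams and output operations here (required before start) ...

    ssc.start()

    // Poll for unexpected termination: if the context stops while the driver
    // process is still alive, exit non-zero so --supervise restarts the driver.
    while (true) {
      val terminated = ssc.awaitTerminationOrTimeout(10000L)
      if (terminated || ssc.getState == StreamingContextState.STOPPED) {
        sys.exit(1)
      }
    }
  }
}
```

Note this only helps if the driver is actually supervised; with --master local[*] the exit just kills the job, which matches the single-node behavior JC saw.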
The --supervise argument not working, as @jcjimenez observed, may be linked to the fact that the argument requires the cluster to run in Spark standalone mode via --deploy-mode cluster, which is incompatible with the --master local[*] setting used in single-node land. Source: StackOverflow + Spark docs.
Note that --supervise and --deploy-mode cluster are already being set for Fortis in production by install-spark.sh, so we should be good here.
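For reference, the flag combination discussed above looks like this on the spark-submit command line. This is an illustrative fragment, not the actual install-spark.sh invocation; the master URL, class, and jar names are made up:

```shell
# --supervise only takes effect with a standalone cluster master and
# --deploy-mode cluster; it is silently ignored with --master local[*].
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --class com.example.fortis.Main \
  fortis-spark.jar
```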
Copied from CatalystCode/project-fortis-spark#98