Recreate SparkContext if job is cancelled unexpectedly #27

Open
c-w opened this issue Feb 22, 2018 · 0 comments

Comments

c-w commented Feb 22, 2018

We hit an out-of-memory issue caused by unstable streaming, which resulted in Spark cancelling the job. The driver kept running with a stopped context. We need to recreate the context if it is stopped unexpectedly.
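
To make the request concrete, here is a minimal sketch of the "recreate on unexpected stop" pattern. It is not the Fortis code and does not use Spark: `FakeContext` and `ensure_running` are hypothetical names standing in for the real context and whatever factory builds it. In the actual Scala driver the stopped check would presumably be something like `SparkContext.isStopped` or `StreamingContext.getState()`. The sketch builds a new context rather than restarting the old one, since a stopped streaming context cannot be started again.

```python
from typing import Callable


class FakeContext:
    """Stand-in for a streaming context (hypothetical, not a Spark class)."""

    def __init__(self) -> None:
        self.stopped = True  # a context is not running until start() is called

    def start(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


def ensure_running(
    context: FakeContext,
    create_context: Callable[[], FakeContext],
) -> FakeContext:
    """Return a running context, building a fresh one if the current one stopped.

    The stopped context is never restarted in place; a new one is created
    and started instead.
    """
    if not context.stopped:
        return context
    fresh = create_context()
    fresh.start()
    return fresh
```

A driver loop could call `ensure_running` periodically and swap in the returned context. Whether the job's streams and listeners can be re-registered cleanly on the new context is something this sketch does not address.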


Adding context.

From Kevin:

I believe JC Jimenez may have handled this back when he added the Stream Change listener stuff. JC, does that sound familiar? The behavior I was seeing was that StreamingContext (and also SparkContext?) had auto-stopped, but the driver didn't terminate.

From JC:

Yeah, I think you are correct, the context would close but it wasn’t possible to restart it. I think I ended up opting to exit with non-zero. However, the spark-submit tool may not have been restarting the driver, despite having the --supervise arg. I would test it to make sure it works. The supervise option didn’t seem to work in single-node land.
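
The "exit with non-zero" approach JC describes could look roughly like the sketch below. It is an illustration only: `exit_code_when_stopped` and its parameters are hypothetical, and `is_stopped` stands in for whatever check the driver uses against the real context. The idea is that the driver blocks until the context is found stopped, then exits with a failure code so an external supervisor can restart it.

```python
import time
from typing import Callable


def exit_code_when_stopped(
    is_stopped: Callable[[], bool],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until is_stopped() reports True, then return a non-zero exit code.

    The sleep function is injectable so the loop can be exercised without
    actually waiting.
    """
    while not is_stopped():
        sleep(poll_seconds)
    return 1  # non-zero: signals failure to whatever launched the driver
```

The driver's entry point would then end with something like `sys.exit(exit_code_when_stopped(check))`, where `check` is the assumed stopped-context probe.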


It looks like the --supervise argument not working, as @jcjimenez observed, may be because the argument requires the cluster to run in Spark standalone mode with --deploy-mode cluster, which is incompatible with the --master local[*] setting used in single-node land. Source: StackOverflow + Spark Docs
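
As a rough illustration of that constraint, the sketch below checks a list of spark-submit arguments for the combination described in this comment. The helper name `supervise_can_apply` is hypothetical, and the rule it encodes is only the one stated here (cluster deploy mode, non-local master), not a full model of spark-submit's behavior. It only understands the space-separated `--flag value` form.

```python
from typing import List


def supervise_can_apply(submit_args: List[str]) -> bool:
    """Rough check: --supervise present, --deploy-mode cluster, non-local master."""

    def value_of(flag: str) -> str:
        # Return the argument following the flag, or "" if absent.
        if flag in submit_args:
            idx = submit_args.index(flag)
            if idx + 1 < len(submit_args):
                return submit_args[idx + 1]
        return ""

    master = value_of("--master")
    deploy_mode = value_of("--deploy-mode")
    return (
        "--supervise" in submit_args
        and deploy_mode == "cluster"
        and not master.startswith("local")
    )
```

With this check, a single-node invocation such as `["--master", "local[*]", "--supervise"]` is flagged as one where supervision would not take effect.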

Note that --supervise and --deploy-mode cluster are already set for Fortis in production by install-spark.sh, so we should be good here.


Copied from CatalystCode/project-fortis-spark#98
