Unable to run some PythonStep configurations in background using the konduit-serving background flag -b #478

@ShamsUlAzeem

The logs are as follows:

(base) root@ecs-8bba:~/konduit-serving-demo/demos/6-bmi-onnx-pytorch# ../../bin/konduit logs bmi-onnx-pytorch2 -l 1000
18:08:27.682 [main] INFO  a.k.s.c.l.command.KonduitRunCommand - Processing configuration: /root/konduit-serving-demo/demos/6-bmi-onnx-pytorch/bmi-onnx-pytorch2.yaml
18:08:27.692 [main] INFO  u.o.l.s.context.SysOutOverSLF4J - Replaced standard System.out and System.err PrintStreams with SLF4JPrintStreams
18:08:27.694 [main] INFO  u.o.l.s.context.SysOutOverSLF4J - Redirected System.out and System.err to SLF4J for this context
18:08:27.695 [main] INFO  a.k.s.c.l.command.KonduitRunCommand - Starting konduit server with an id of 'bmi-onnx-pytorch2'
18:08:28.236 [vert.x-worker-thread-0] INFO  a.k.s.p.registry.PipelineRegistry - Loaded 27 PipelineStepRunnerFactory instances
18:08:28.601 [vert.x-worker-thread-0] INFO  a.k.serving.python.PythonRunner - Over riding python path :/root/miniconda3/lib/python37.zip:/root/miniconda3/lib/python3.7:/root/miniconda3/lib/python3.7/lib-dynload:/root/miniconda3/lib/python3.7/site-packages
18:08:29.391 [vert.x-worker-thread-0] INFO  a.k.serving.python.PythonRunner - Resolving execution code from run_script.py
18:08:29.391 [vert.x-worker-thread-0] INFO  a.k.serving.python.PythonRunner - Resolving import code from init_script.py
18:08:29.392 [vert.x-worker-thread-0] INFO  org.nd4j.python4j.PythonGIL - Pre Gil State ensure for thread 17
18:08:29.392 [vert.x-worker-thread-0] INFO  org.nd4j.python4j.PythonGIL - Thread 17 acquired GIL
18:08:29.782 [vert.x-worker-thread-0] INFO  org.nd4j.python4j.PythonGIL - Releasing GIL on thread 17
18:08:29.786 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable - org.nd4j.python4j.PythonException: Execution failed, unable to retrieve python exception.
18:08:29.787 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at org.nd4j.python4j.PythonExecutioner.simpleExec(PythonExecutioner.java:167)
18:08:29.787 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at org.nd4j.python4j.PythonExecutioner.exec(PythonExecutioner.java:200)
18:08:29.787 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.python.PythonRunner.<init>(PythonRunner.java:97)
18:08:29.787 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.python.PythonRunnerFactory.create(PythonRunnerFactory.java:33)
18:08:29.787 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.pipeline.impl.pipeline.BasePipelineExecutor.getRunner(BasePipelineExecutor.java:73)
18:08:29.787 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.pipeline.impl.pipeline.SequencePipelineExecutor.<init>(SequencePipelineExecutor.java:55)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.pipeline.impl.pipeline.SequencePipeline.executor(SequencePipeline.java:60)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.vertx.verticle.InferenceVerticle.initialize(InferenceVerticle.java:47)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at ai.konduit.serving.vertx.protocols.http.verticle.InferenceVerticleHttp.lambda$start$0(InferenceVerticleHttp.java:68)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
18:08:29.788 [vert.x-eventloop-thread-1] ERROR java.lang.Throwable -    at java.lang.Thread.run(Thread.java:745)

To reproduce this issue, place a konduit.jar in the root folder of the repo, then run the following command from this demo folder: https://github.com/ShamsUlAzeem/konduit-serving-demo/tree/master/demos/6-bmi-onnx-pytorch

java -cp classes:../../konduit.jar ai.konduit.serving.cli.launcher.KonduitServingLauncher serve --config bmi-onnx-pytorch.yaml -rwm -id bmi-onnx-pytorch -b

And then check logs with:

../../bin/konduit logs bmi-onnx-pytorch

Note that the same command, started through nohup instead of with the -b flag, runs in the background with no issues:

nohup java -cp classes:../../konduit.jar ai.konduit.serving.cli.launcher.KonduitServingLauncher serve --config bmi-onnx-pytorch.yaml -rwm -id bmi-onnx-pytorch &

And then check logs with:

../../bin/konduit logs bmi-onnx-pytorch
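
To tell the two runs apart without reading the whole log, the output can be piped through a small check. This is only a sketch: has_python_exception is a hypothetical helper and not part of konduit-serving. It assumes the failure always surfaces as the org.nd4j.python4j.PythonException line shown in the trace above.

```shell
# Hypothetical helper (not part of konduit-serving): reads log text on stdin
# and exits 0 if it contains the PythonException seen in the trace above,
# non-zero otherwise.
has_python_exception() {
  grep -q 'org.nd4j.python4j.PythonException'
}

# Example usage, assuming the server id used in the commands above:
#   ../../bin/konduit logs bmi-onnx-pytorch -l 1000 | has_python_exception \
#     && echo "PythonStep failed to initialize"
```

With the -b run this should report the failure, and with the nohup run it should print nothing.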
