Bad folder reference when installing in Docker? #6

Open
josejuanmartinez opened this issue Oct 18, 2022 · 4 comments
@josejuanmartinez
Contributor

josejuanmartinez commented Oct 18, 2022

A prospect asked for a Docker installation.

I prepared everything, but jsl.install() fails while trying to resolve the folder where everything is downloaded and installed. It is getting a weird .johnsnowlabs path. Maybe this is due to Docker volumes?
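
For anyone debugging a similar path problem, here is a minimal diagnostic sketch, not the library's own logic. It assumes the install folder is resolved relative to the user's home directory, which the .johnsnowlabs name suggests but this thread does not confirm. The helper name describe_install_dir is hypothetical.

```python
import os
from pathlib import Path


def describe_install_dir(folder_name: str = ".johnsnowlabs") -> dict:
    """Report where a home-relative folder resolves and whether it is writable.

    Assumption: the folder lives under the user's home directory.
    This helper is hypothetical and not part of johnsnowlabs.
    """
    # expanduser returns "~" unchanged when no home directory can be determined.
    home = Path(os.path.expanduser("~"))
    target = home / folder_name

    # Walk up to the nearest existing ancestor so write access can be tested
    # even when the target folder has not been created yet.
    probe = target
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent

    return {
        "HOME": os.environ.get("HOME"),
        "resolved_home": str(home),
        "target": str(target),
        "target_exists": target.exists(),
        "nearest_existing_parent": str(probe),
        "parent_writable": os.access(probe, os.W_OK),
    }


if __name__ == "__main__":
    for key, value in describe_install_dir().items():
        print(f"{key}: {value}")
```

Running this inside the container, as the same user that calls jsl.install(), shows whether HOME is set and whether the resolved folder sits on a writable volume.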


@josejuanmartinez changed the title from "Bad host resolution in Docker?" to "Bad folder reference when installing in Docker?" on Oct 18, 2022
@C-K-Loan
Member

Thank you, @josejuanmartinez. Can you share some information about the Docker image that was used?
The base image would be especially interesting.
I am taking a look and trying to reproduce this on an Ubuntu image today.

@C-K-Loan
Member

Thanks for the report. There was a bug in handling some paths, like the one in your Docker image.

It's fixed with pip install johnsnowlabs==4.2.3rc1

Also see the updated Dockerfile for install reference:
Dockerfile.txt
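
To confirm that the container actually picked up the fixed release, a quick check along these lines may help. This is a sketch using only the standard library; installed_version is a hypothetical helper, and it only reports the version string rather than comparing it against 4.2.3rc1.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "johnsnowlabs") -> Optional[str]:
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"johnsnowlabs version: {found or 'not installed'}")
```

Run it with the same interpreter (for example inside the activated virtual environment) that will later call jsl.install().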

@josejuanmartinez
Contributor Author

josejuanmartinez commented Oct 19, 2022

Hey, this is what I'm getting with jsl.start() (BUT EVERYTHING WORKS AFTER THAT!)

How to reproduce:

1) docker-compose up -d
2) docker exec -it johnsnowlabs /bin/bash
3) source jslenv/bin/activate
4) python (to open the python console)
5) >>> from johnsnowlabs import *
6) >>> jsl.start()

docker_example.zip

>>> from johnsnowlabs import *
>>> jsl.start()
👌 Detected license file /home/jsl/license.json
22/10/19 11:25:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/10/19 11:26:05 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:103)
        at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:87)
        at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:78)
        at org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:589)
        at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1000)
        at org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:212)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
        at org.apache.spark.Heartbeater$$anon$1.run(Heartbeater.scala:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
        at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:524)
        at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:116)
        at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
        at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more
22/10/19 11:26:05 ERROR Inbox: Ignoring error
java.lang.NullPointerException
        at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:524)
        at org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:116)
        at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
        at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
👌 Launched cpu-Optimized JVM SparkSession with Jars for: 🚀Spark-NLP==4.2.1, 💊Spark-Healthcare==4.2.0, 🕶Spark-OCR==4.1.0, running on ⚡ PySpark==3.1.2
<pyspark.sql.session.SparkSession object at 0x7f42aea8ca90>

@JohnSnowLabs JohnSnowLabs deleted a comment from josejuanmartinez Oct 19, 2022
@C-K-Loan
Member

@josejuanmartinez thanks for the report, and glad to hear it works.
This error message pops up randomly on some systems when starting a Spark session, but it is not critical.
Let's keep this ticket open to track the message; maybe we can improve the UX here in the future.
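
Until the message is handled better, one way to tell this case apart from a real startup failure is to scan the captured console output. The sketch below is only an illustration: summarize_startup is a hypothetical helper, the two patterns are copied from the log quoted above, and treating "warning plus launch line" as non-critical is an assumption based on this thread, not a guarantee.

```python
import re

# Text fragments taken from the log output quoted in this thread.
HEARTBEAT_WARNING = "Issue communicating with driver in heartbeater"
LAUNCH_PATTERN = re.compile(r"Launched .* SparkSession")


def summarize_startup(log_text: str) -> dict:
    """Flag the heartbeater warning and whether the session launch line appears."""
    lines = log_text.splitlines()
    warned = any(HEARTBEAT_WARNING in line for line in lines)
    launched = any(LAUNCH_PATTERN.search(line) is not None for line in lines)
    return {
        "heartbeat_warning": warned,
        "session_launched": launched,
        # Assumption: the warning is harmless only if the session still launched.
        "likely_non_critical": warned and launched,
    }


if __name__ == "__main__":
    sample = (
        "22/10/19 11:26:05 WARN Executor: Issue communicating with driver in heartbeater\n"
        "Launched cpu-Optimized JVM SparkSession with Jars for: Spark-NLP==4.2.1\n"
    )
    print(summarize_startup(sample))
```

If the warning appears without the launch line, the startup deserves a closer look.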

@C-K-Loan C-K-Loan self-assigned this Jan 30, 2023