Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.util.start.registerListenerAndStartRefresh. : java.net.SocketTimeoutException: connect timed out #10

uzairahmadxy · 2022-10-24T17:45:19Z

Hi guys. I'm trying to run Spark NLP for Healthcare locally. I seem to have compatible versions of Spark and Java, but it still throws an error (screenshots attached).
Has anyone else faced this?


import json
import os

# Loading license key
with open('key.json') as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)
os.environ.update(license_keys)

# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

!pyspark --version

!pip show spark-nlp-jsl

!pip show spark-nlp

import json
import os

from pyspark.ml import Pipeline, PipelineModel
from pyspark.sql import SparkSession

import sparknlp
import sparknlp_jsl

from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from sparknlp.base import *
from sparknlp.util import *
from sparknlp.pretrained import ResourceDownloader
from pyspark.sql import functions as F

import pandas as pd

pd.set_option('display.max_columns', None)  
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', None)

import string
import numpy as np

params = {"spark.driver.memory":"16G",
          "spark.kryoserializer.buffer.max":"2000M",
          "spark.driver.maxResultSize":"2000M"}

spark = sparknlp_jsl.start(secret = SECRET, params=params)

print ("Spark NLP Version :", sparknlp.version())
print ("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark

uzairahmadxy · 2022-10-24T18:00:09Z

I forgot to mention I have a trial Healthcare license.

C-K-Loan · 2022-10-27T23:31:32Z

@uzairahmadxy, can you share the full error trace from the notebook? Please also check your Jupyter shell for any errors and share those.

uzairahmadxy · 2022-10-28T14:53:25Z

Hi @C-K-Loan. Here's the additional information

C-K-Loan · 2022-10-28T20:54:21Z

Thank you for sharing, @uzairahmadxy.
It looks like something is not set up correctly with your Hadoop utils.
Make sure to follow every step listed here precisely: https://nlp.johnsnowlabs.com/docs/en/install#windows-support
This should fix all your issues.

uzairahmadxy · 2022-10-31T18:09:54Z

Hi @C-K-Loan

I re-installed everything using the instructions. It still throws the error (note: I don't see the Hadoop utils error now in the jupyter kernel though).

C-K-Loan · 2022-10-31T18:31:16Z

Nice, that's one less error!
@uzairahmadxy, can you try running this open source notebook and see whether it works?

https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Public/1.SparkNLP_Basics.ipynb
You can skip the cells with pip install.

Also, could you paste the entire error trace you get, either here or on https://pastebin.com/

uzairahmadxy · 2022-11-01T15:19:00Z

Hi @C-K-Loan This is for the healthcare notebook kernel (https://pastebin.com/cV6ymZvR)

Also, the training notebook doesn't run. Here are the traces for the open source notebook:
Python Interpreter Error: https://pastebin.com/XiXLxnnT
Jupyter Kernel: https://pastebin.com/v7jn0EBr

Side note: PySpark works OK, as shown in the screenshot (I thought there was an issue with Spark before).

C-K-Loan · 2022-11-03T01:48:47Z

Thank you for sharing, @uzairahmadxy.

It looks like the jar loaded into your Spark session is missing some classes.
But you should have downloaded the fat jar, i.e. the one with all the dependencies, when running sparknlp.start().

@uzairahmadxy
Can you try manually downloading the Spark NLP jar and then starting a Spark session by passing the path to it?
I.e. download: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-4.2.2.jar

Then, instead of sparknlp.start(), run the following and try to continue with the rest of Notebook 1:

from pyspark.sql import SparkSession

# Start a local session with the manually downloaded jar instead of sparknlp.start()
spark = SparkSession.builder \
    .appName("Spark NLP") \
    .master("local[*]") \
    .config("spark.driver.memory", "16G") \
    .config("spark.driver.maxResultSize", "0") \
    .config("spark.kryoserializer.buffer.max", "2000M") \
    .config("spark.jars", "path/to/the/spark-nlp.jar") \
    .getOrCreate()

Maybe this is a Windows-specific bug. I think @josejuanmartinez is on Windows. Have you maybe seen this?
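
For anyone who prefers to script the manual download step, here is a minimal sketch using only the standard library. The helper name download_jar and the "jars" folder are made up for illustration; only the URL comes from the comment above. It returns an absolute path that can be passed as the spark.jars value.

```python
import os
import urllib.request
from urllib.parse import urlparse

# URL copied from the comment above.
SPARK_NLP_JAR_URL = (
    "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/"
    "public/jars/spark-nlp-assembly-4.2.2.jar"
)


def download_jar(url, dest_dir="."):
    """Download `url` into `dest_dir` (skipped if the file is already there).

    Returns the absolute path of the local jar file.
    Hypothetical helper, not part of Spark NLP.
    """
    filename = os.path.basename(urlparse(url).path)
    if not filename.endswith(".jar"):
        raise ValueError(f"URL does not point at a .jar file: {url}")
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.abspath(os.path.join(dest_dir, filename))
    if not os.path.exists(target):
        urllib.request.urlretrieve(url, target)
    return target


# Usage (needs network access, so it is left commented out):
# jar_path = download_jar(SPARK_NLP_JAR_URL, "jars")
# ...then pass jar_path to .config("spark.jars", jar_path)
```

Using an absolute path avoids surprises when the notebook's working directory differs from the folder the jar was saved in.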

josejuanmartinez · 2022-11-03T09:45:48Z

Hey I am not on Windows anymore sorry

uzairahmadxy · 2022-11-03T14:18:35Z

Thanks @C-K-Loan. Manually loading the jar worked for the basic spark nlp.

I guess the same will have to be done for using the healthcare library as well. Can you please share where I can get these from?

C-K-Loan · 2022-11-07T23:25:16Z

Hi @uzairahmadxy, great, good to know that this works, and sorry for the bug.

To get the healthcare jar:
Replace {secret} with your healthcare secret and {lib_version} with the library version, and you will have the URL.
https://pypi.johnsnowlabs.com/{secret}/spark-nlp-jsl-{lib_version}.jar
E.g. if the secret is 4.2.1.agdfgdgdl, the URL would be
https://pypi.johnsnowlabs.com/4.2.1.agdfgdgdl/spark-nlp-jsl-4.2.1.jar

@Meryem1425 can you see if you run into the same issue on Windows?
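
The URL pattern above can be filled in with a couple of lines of Python. This is only a sketch: healthcare_jar_url and spark_jars_value are hypothetical helper names, and the second helper relies on spark.jars accepting a comma-separated list of jars, which is how both jars could be loaded into one session.

```python
def healthcare_jar_url(secret: str, lib_version: str) -> str:
    """Fill the URL pattern from the comment above (hypothetical helper)."""
    return f"https://pypi.johnsnowlabs.com/{secret}/spark-nlp-jsl-{lib_version}.jar"


def spark_jars_value(*jar_paths: str) -> str:
    """Join local jar paths into one value for the spark.jars setting,
    which takes a comma-separated list."""
    return ",".join(jar_paths)


# Example with the dummy secret used in the comment above:
print(healthcare_jar_url("4.2.1.agdfgdgdl", "4.2.1"))

# Placeholder paths, to be replaced with wherever the jars were saved:
print(spark_jars_value("path/to/the/spark-nlp.jar", "path/to/the/spark-nlp-jsl.jar"))
```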

uzairahmadxy · 2022-11-08T18:08:19Z

Thank you for sharing @C-K-Loan

The jars now load, but the problem persists when I try to load pretrained healthcare models/pipelines.

Error Trace: https://pastebin.com/xtkJKVLk
Jupyter Kernel: https://pastebin.com/fznqEBvq

Side note: to manually download a healthcare model from the Models Hub, I assume I have to specify the secret. How do we download it?
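
Since the error in the title is a java.net.SocketTimeoutException ("connect timed out"), one thing that may be worth ruling out is whether the machine can open outbound connections at all, for example because of a firewall or proxy. The thread does not say which host the failing call contacts, so the hosts below are just the two that appear in the URLs in this thread; treat this as a rough sketch, with can_connect being a made-up helper name.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures.
        return False


# Hosts taken from URLs mentioned in this thread (assumption: the failing
# call may use a different host).
HOSTS_FROM_THREAD = ["pypi.johnsnowlabs.com", "s3.amazonaws.com"]

# Usage (needs network access, so it is left commented out):
# for host in HOSTS_FROM_THREAD:
#     print(host, can_connect(host))
```

A False result from the same machine and user account that runs Jupyter would point towards a network restriction rather than a Spark NLP problem, but a True result does not prove the JVM can connect, since Java may use different proxy settings.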

Cabir40 · 2022-11-10T15:21:37Z

Can you test whether your license is valid by running it on this notebook?

Can you share the latest versions you used?
(Java, PySpark, spark-nlp, spark-nlp-jsl?)

If you want to download the model manually, you can use this script. The same example is in this notebook:

from sparknlp.pretrained import ResourceDownloader
ResourceDownloader.downloadModelDirectly("clinical/models/embeddings_clinical_en_2.4.0_2.4_1580237286004.zip", "clinical/models")

uzairahmadxy · 2022-11-10T21:05:57Z

The license works in the notebook (tried on Colab).

Here are the versions used:

Java 8 (OpenJDK 64-Bit Server VM, 1.8.0_345)
Pyspark (Version 3.3.1)
Spark-NLP (4.2.0)
Spark-NLP-JSL (4.2.0)
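
Note that the first post pins pyspark==3.1.2 while the list above reports PySpark 3.3.1, so it may be worth confirming what is actually installed in the environment the kernel uses. A small sketch with the standard library (installed_versions is a hypothetical helper; the package names are the pip names used in the first post):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each pip package name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# Run this inside the notebook so it inspects the kernel's own environment.
for name, ver in installed_versions(["pyspark", "spark-nlp", "spark-nlp-jsl"]).items():
    print(f"{name}: {ver}")
```

This only reports the Python packages. The Java version still has to be checked separately, e.g. with java -version.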

Meryem1425 · 2022-11-21T16:23:48Z

@uzairahmadxy, I followed the instructions at https://nlp.johnsnowlabs.com/docs/en/install#windows-support and everything was set up correctly. I didn't hit any bug. Please make sure every step was applied correctly.

You have to create the java, spark, hadoop and tmp folders under the C: drive. Then make sure the environment variables are set. Look at steps 4 and 5.

Could you delete everything and then follow the installation steps again? Thank you.
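
A quick way to double-check the environment variable step from inside the notebook is sketched below. The variable names JAVA_HOME, SPARK_HOME and HADOOP_HOME are an assumption (they are the ones typically needed for Spark on Windows; this thread does not name them), and missing_env_vars is a made-up helper.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names from `names` that are unset or empty in `env`.

    `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumed variable names, see the note above.
EXPECTED = ["JAVA_HOME", "SPARK_HOME", "HADOOP_HOME"]

missing = missing_env_vars(EXPECTED)
if missing:
    print("Not set in this kernel:", ", ".join(missing))
else:
    print("All expected variables are set.")
```

Variables set after Jupyter was started are not visible to a running kernel, so restart Jupyter after changing them.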

C-K-Loan · 2023-01-30T02:29:16Z

@uzairahmadxy I notice you are using OpenJDK, but Adopt OpenJDK is recommended.

uzairahmadxy assigned maziyarpanahi Oct 24, 2022

maziyarpanahi transferred this issue from JohnSnowLabs/spark-nlp Oct 24, 2022

maziyarpanahi assigned C-K-Loan and unassigned maziyarpanahi Oct 24, 2022

C-K-Loan assigned C-K-Loan and unassigned C-K-Loan Oct 27, 2022
