
Cannot use in Databricks JedisConnectionException: Could not get a resource from the pool #357

Open
juancresc opened this issue Oct 11, 2022 · 4 comments


juancresc commented Oct 11, 2022

I'm currently testing this in PySpark:

df.write\
  .format("org.apache.spark.sql.redis")\
  .option("table", "mytable")\
  .option("infer.schema", True)\
  .option("spark.redis.host","somehost")\
  .option("host","somehost")\
  .option("spark.redis.port", "6666")\
  .option("port", "6666")\
  .option("spark.redis.ssl", False)\
  .option("auth", "")\
  .option("timeout", 5000)\
  .option("key.column", "key")\
  .save()
# JedisConnectionException: Could not get a resource from the pool

I've installed spark_redis_2_4_0_jar_with_dependencies.jar from https://repo1.maven.org/maven2/com/redislabs/spark-redis/2.4.0/
The notebook currently runs on Databricks Runtime 10.4 LTS ML (includes Apache Spark 3.2.1, Scala 2.12).

I'm able to connect to Redis from the notebook using the redis library for Python.
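To separate a plain network problem from a connector configuration problem, a quick TCP reachability check can help. The following is a minimal sketch using only the Python standard library; `can_reach` is a hypothetical helper name, and the host and port in the usage note are the placeholders from the snippet above. Note that code run in a notebook cell like this executes on the driver, so it says nothing about what the Spark connector itself is configured to use.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

In the notebook this could be called as `can_reach("somehost", 6666)`. A `True` result only shows that the port is open from the driver; it does not check authentication or TLS.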


tonofll commented Oct 26, 2022

OK, so I was facing exactly the same issue and I managed to solve it. I tested it with spark-redis 3.1.0, Scala 2.12 and Spark 3.2.1 (Databricks Runtime 10.4 LTS).

You must set the variables in the cluster's Spark configuration before launching the cluster. If you instead set them in your Spark session through spark.conf.set("", ""), or pass them as .option(...) when reading/writing your DataFrame, it raises JedisConnectionException.

spark.redis.host <your_host>
spark.redis.port <your_port> // usually 6379
spark.redis.auth <your_auth_token> // if needed
spark.redis.ssl true // in case you connect using TLS (port 6380)
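After the cluster has restarted, it may be worth confirming from a notebook that the settings were actually picked up. The sketch below is one way to do that in PySpark, assuming the `spark` session object that Databricks notebooks provide and assuming `spark.sparkContext` is accessible on your cluster type. `missing_redis_conf` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration; they are not part of spark-redis.

```python
# Keys that the comment above says must be set at cluster level.
REQUIRED_KEYS = ("spark.redis.host", "spark.redis.port")


def missing_redis_conf(spark, keys=REQUIRED_KEYS):
    """Return the keys from `keys` that have no value in the SparkConf.

    `spark` is expected to be a SparkSession (assumption: the notebook's
    predefined `spark` object).
    """
    conf = spark.sparkContext.getConf()
    # SparkConf.get returns None when the key is not set and no default is given.
    return [key for key in keys if conf.get(key) is None]
```

In a notebook cell, `missing_redis_conf(spark)` returning an empty list suggests the cluster-level values are in place; any key it returns still needs to be added to the cluster's Spark configuration.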

Example code (in Scala)

case class Person(name: String, age: Int)

val personSeq = Seq(Person("John", 30), Person("Peter", 45))
val df = spark.createDataFrame(personSeq)

df.write
  .format("org.apache.spark.sql.redis")
  .option("table", "person-db")
  .save()

// Read the same table afterwards (a separate name avoids redefining df in the same scope)
val loadedDf = spark.read
  .format("org.apache.spark.sql.redis")
  .option("table", "person-db")
  .load()
loadedDf.show()
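Since the original question used PySpark, a rough Python equivalent of the Scala example might look like the sketch below. It assumes the spark.redis.* settings are already present in the cluster's Spark configuration, so no host, port, or auth options are passed on the DataFrame. `write_table` and `read_table` are hypothetical helper names introduced for illustration, not part of spark-redis.

```python
REDIS_FORMAT = "org.apache.spark.sql.redis"


def write_table(df, table, key_column=None):
    """Write a DataFrame to Redis under `table`, relying on cluster-level connection settings."""
    writer = df.write.format(REDIS_FORMAT).option("table", table)
    if key_column is not None:
        # Optional: use an existing column as the Redis key, as in the question.
        writer = writer.option("key.column", key_column)
    writer.save()


def read_table(spark, table):
    """Load the DataFrame stored under `table` back from Redis."""
    return spark.read.format(REDIS_FORMAT).option("table", table).load()


# Usage in a notebook (assumes the predefined `spark` session):
# df = spark.createDataFrame([("John", 30), ("Peter", 45)], ["name", "age"])
# write_table(df, "person-db")
# read_table(spark, "person-db").show()
```

Keeping the connection details out of the `.option(...)` calls follows the advice above that they belong in the cluster configuration.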

@adamwrobel-ext-gd

@tonofll hey sorry for asking in an old topic, I am having issues even adding the JAR to the cluster. How did you do it?


tonofll commented Apr 20, 2023

To install the JAR on the cluster, go to the cluster configuration and open the Libraries tab.

Then click Install new and search for the spark-redis library in the Maven Central repository.

Once it is installed, restart the cluster and it should work properly. To avoid the JedisConnectionException, follow the steps in my previous comment.

@adamwrobel-ext-gd

Oh yeah, I just noticed you switched from Spark Packages to Maven Central. In Spark Packages, the latest version is 2.3.0. Today I managed to work around this by pasting the coordinates and repository directly and clicking Install, without browsing. It worked too. Thanks!
