In Scala, we would write an RDD to Redis like this:

datardd.foreachPartition(iter => {
  val r = new RedisClient("hosturl", 6379)
  iter.foreach(i => {
    val (str, it) = i
    val map = it.toMap
    r.hmset(str, map)
  })
})

I tried the same thing in PySpark with datardd.foreachPartition(storeToRedis), where the function storeToRedis is defined as:

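import redis

def storeToRedis(x):
    # One connection per partition, created on the worker
    r = redis.StrictRedis(host='hosturl', port=6379)
    for i in x:
        # Write each (key, pairs) record as a Redis hash, like hmset in the Scala version
        r.hmset(i[0], dict(i[1]))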

It gives me this:

ImportError: ('No module named redis', <function subimport at 0x47879b0>, ('redis',))

Of course, I have imported redis.

Comments

  • Is redis installed on every worker? Commented Aug 28, 2015 at 16:43
  • @zero323 Is that the way to do it? Install redis on every worker. Commented Aug 29, 2015 at 7:33
  • Python modules to be used in the workers must be on all the workers, so he means the Python redis module, not a Redis DB installation. Commented Aug 29, 2015 at 10:50
  • @Paul: I understood what he meant, and that's what I am asking. Do I have to install the Python redis module on all the workers manually? There should be an easier shortcut, like the Scala API's addJars method. Commented Aug 29, 2015 at 11:26
  • @kamalbanga I'm unaware of a good way. Of course you could try to use Spark to make the workers run pip or easy_install, but unless you can limit workers to one per machine, it might not behave very well. Commented Aug 29, 2015 at 11:37

1 Answer

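PySpark's SparkContext has an addPyFile method for exactly this purpose: it ships a .py or .zip dependency to the workers for all tasks run on that context afterwards. Package the redis module as a zip file and call the method before running foreachPartition: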

from pyspark import SparkContext

sc = SparkContext(appName="analyze")
sc.addPyFile("/path/to/redis.zip")
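
The zipping step can be sketched as follows. This is a minimal sketch, not the only way to build the archive: zip_package is a hypothetical helper (not part of PySpark or redis-py), and it assumes the package is pure Python, since modules with compiled extensions generally cannot be imported from a zip. The key detail is that the package folder must sit at the root of the archive, so that "import redis" resolves once the zip is on the Python path.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip a Python package so the package folder sits at the archive root.

    Hypothetical helper for illustration; names and paths are assumptions.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                if name.endswith(".pyc"):
                    continue  # skip compiled bytecode
                full = os.path.join(root, name)
                # Store paths relative to the parent, e.g. "redis/__init__.py"
                zf.write(full, os.path.relpath(full, parent))
    return zip_path
```

On a driver where redis is installed, something like zip_package(os.path.dirname(redis.__file__), "/path/to/redis.zip") should produce an archive suitable for sc.addPyFile.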