
I am using Spark 1.3.0.

I have a problem running a Python program with Spark.

This is how I submit the job :

/bin/spark-submit progname.py

The error I get is:

NameError: name 'sc' is not defined

on the line where sc is first used.

Any idea? Thanks in advance

3 Answers

## Imports
from pyspark import SparkConf, SparkContext

## CONSTANTS
APP_NAME = "My Spark Application"

## OTHER FUNCTIONS/CLASSES

## Main functionality
def main(sc):
    rdd = sc.parallelize(range(1000), 10)
    print(rdd.mean())

if __name__ == "__main__":
    # Configure OPTIONS
    conf = SparkConf().setAppName(APP_NAME)
    conf = conf.setMaster("local[*]")
    # in a cluster this would be something like
    # "spark://ec2-0-17-03-078.compute-1.amazonaws.com:7077"
    sc = SparkContext(conf=conf)
    # Execute Main functionality
    main(sc)
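
To run it, submit the script with spark-submit exactly as in the question (progname.py is just the file name used there). Note that sc is only predefined for you in the interactive pyspark shell; a script launched with spark-submit has to create its own SparkContext, which is what the __main__ block above does.

/bin/spark-submit progname.py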

6 Comments

I tried to copy-paste the above program and run it, and I am getting this error: IndentationError: expected an indented block. Sorry, I am troubling you more, but thank you so much for the help.
Have you indented (a tab or 4 spaces) after the if statement?
Yes sir. Now I am getting this error: zipimport.ZipImportError: can't decompress data; zlib not available
Can you give me your email ID if you don't mind? I shall send you the screenshots. Sorry for troubling you again.
You will find the answer here: askubuntu.com/questions/661039/…
conf = pyspark.SparkConf()

This is how you should create the SparkConf object.

Further, you can use chaining to do things like setting the application name, etc.:

conf = pyspark.SparkConf().setAppName("My_App_Name")

Then pass this config object when creating the SparkContext.
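
For example, a minimal sketch of the whole chain (the application name is only a placeholder, and local[*] is assumed for running locally):

import pyspark

# Build the configuration, chaining setters as needed
conf = pyspark.SparkConf().setAppName("My_App_Name").setMaster("local[*]")

# Pass the configuration when creating the context
sc = pyspark.SparkContext(conf=conf)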

Comments


The first thing a Spark program must do is to create a SparkContext object, which tells Spark how to access a cluster. To create a SparkContext you first need to build a SparkConf object that contains information about your application.

conf = SparkConf().setAppName(appName).setMaster(master)
sc = SparkContext(conf=conf)
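
Put together, a minimal self-contained sketch might look like the following (local mode and the app name are assumptions here; note the import at the top, without which SparkConf and SparkContext are not defined):

from pyspark import SparkConf, SparkContext

# Build the configuration and create the context
conf = SparkConf().setAppName("MyApp").setMaster("local[*]")
sc = SparkContext(conf=conf)

# Simple sanity check
rdd = sc.parallelize(range(100))
print(rdd.count())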

5 Comments

Sorry to ask again. Can you tell me how to build a SparkConf? In the terminal, or where? Thanks again.
Create a SparkConf object with SparkConf(), which will load values from any spark.* system properties.
The appName parameter is a name for your application to show on the cluster UI. master is a Spark, Mesos or YARN cluster URL, or a special “local” string to run in local mode
I am sorry. Thank you for your help, but it is again throwing an error and I don't know how to resolve it; maybe I don't know the correct syntax. :( I wrote conf = SparkConf().setAppname("README.md").setMaster("/home/nikitha/Downloads/spark-1.5.0-bin-hadoop2.4") sc = SparkContext(conf=conf) textFile=sc.textFile("README.md") and tried to run it using /bin/spark-submit progname.py, but the error is NameError: name 'SparkConf' is not defined
Set the master to "local".
