I am using Apache Spark on a Windows 10 64-bit machine. I have installed Java, Python 3.6, and spark-2.3.1-bin-hadoop2.7. I am using the VSCode editor for PySpark coding.
When I run the Python Spark code from VSCode with spark-submit, it prints
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
and then execution terminates.
Relevant code:
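from pyspark import SparkContext, SparkConf

if __name__ == "__main__":
    # Run locally with two worker threads
    conf = SparkConf().setAppName("word count").setMaster("local[2]")
    sc = SparkContext(conf=conf)

    # Path is relative to the directory spark-submit is launched from
    lines = sc.textFile("in/word_count.text")
    words = lines.flatMap(lambda line: line.split(" "))

    # countByValue() returns the counts to the driver as a dict-like object
    wordcounts = words.countByValue()
    for word, count in wordcounts.items():
        print("{} : {}".format(word, count))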
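For comparison, here is a minimal sketch of what I expect the job above to compute, written in plain Python so it can be run without Spark. It only mirrors the split-and-count logic and does not reproduce or diagnose the Spark failure. The function name `count_words` and the sample lines are hypothetical, made up for illustration.

```python
from collections import Counter


def count_words(lines):
    """Count words by splitting each line on a single space (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # Mirrors: lines.flatMap(lambda line: line.split(" "))
        counts.update(line.split(" "))
    # Mirrors: words.countByValue(), which yields a word -> count mapping
    return counts


if __name__ == "__main__":
    # Hypothetical sample input standing in for in/word_count.text
    sample = ["to be or not", "to be"]
    for word, count in count_words(sample).items():
        print("{} : {}".format(word, count))
```

Note that splitting on a single space, as the Spark code does, produces empty-string "words" wherever the input has consecutive spaces.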
Spark Execution Error:
