
I am new to Apache Spark and Scala and still learning both. I am trying to analyze a dataset and have loaded it into Spark using Scala. However, when I try to perform a basic analysis such as max, min, or average, I get this error:

error: value select is not a member of org.apache.spark.rdd.RDD[Array[String]]

Could anyone please shed some light on this? I am running Spark on the cloudlab of an organization.

Code:

// Reading in the csv file

val df = sc.textFile("/user/Spark/PortbankRTD.csv").map(x => x.split(","))  

// Select Max of Age

df.select(max($"age")).show()                                                                                                        

Error:

<console>:40: error: value select is not a member of org.apache.spark.rdd.RDD[Array[String]]                                                
          df.select(max($"age")).show()  

Please let me know if you need any more information. Thanks

  • To add to @thebluephantom's incredibly helpful comment, you should read about converting an RDD to a DataFrame. select is a method on DataFrame; you're handing it an RDD (as the error message indicates). Commented Apr 15, 2020 at 14:17

1 Answer


Following up on my comment: the textFile method returns an RDD[String] (an RDD[Array[String]] after your map with split), whereas select is a method on DataFrame. You will need to convert your RDD into a DataFrame. You can do this in a number of ways. One example is

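// Brings toDF() and the $"column" syntax into scope
import spark.implicits._

val rdd = sc.textFile("/user/Spark/PortbankRTD.csv")
// Each line becomes one row in a single string column named "value"
val df = rdd.toDF()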
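Note that toDF() on an RDD[String] produces a single string column named value, so there is no age column to select yet. A minimal sketch of splitting the lines into named, typed columns follows. It assumes the first field is an integer age and that there is no header line; the sample rows and the job column are made up for illustration, since the question does not show the file. In spark-shell, spark already exists and the builder line can be skipped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.max

// Only needed outside spark-shell, where `spark` is not predefined
val spark = SparkSession.builder().master("local[*]").appName("rdd-to-df-sketch").getOrCreate()
import spark.implicits._

// Hypothetical rows standing in for the file contents (no header line)
val rdd = spark.sparkContext.parallelize(Seq("30,admin", "45,technician", "27,services"))

// Split each line, convert the first field to Int, and name the columns
val people = rdd
  .map(_.split(","))
  .map(fields => (fields(0).trim.toInt, fields(1).trim))
  .toDF("age", "job")

// select now works because `people` is a DataFrame with an `age` column
val maxAgeFromRdd = people.select(max($"age")).first().getInt(0)
people.select(max($"age")).show()
```

If the real file starts with a header line, that line would have to be filtered out before the toInt conversion, which is one reason the built-in CSV reader is usually the simpler route.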

There are also built-in readers for many types of input files:

spark.read.csv("/user/Spark/PortbankRTD.csv")

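returns a DataFrame directly. By default the columns are named _c0, _c1, and so on; set option("header", "true") if the first line of the file holds the column names, so that a column such as age can be selected by name.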
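Putting that together with the original goal, here is a minimal sketch of the reader approach. It assumes the CSV has a header row containing an age column, which is a guess because the question does not show the file. To keep it runnable anywhere, it parses a small made-up in-memory sample instead of /user/Spark/PortbankRTD.csv; with the real file, pass the path to csv(...) instead. In spark-shell, spark already exists and the builder line can be skipped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.max

// Only needed outside spark-shell, where `spark` is not predefined
val spark = SparkSession.builder().master("local[*]").appName("csv-max-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample lines standing in for the real file, header included
val sampleLines = Seq("age,job", "30,admin", "45,technician", "27,services").toDS()

val bank = spark.read
  .option("header", "true")      // first line holds the column names
  .option("inferSchema", "true") // parse age as a number rather than a string
  .csv(sampleLines)

// The same select from the question, now on a DataFrame
val maxAge = bank.select(max($"age")).first().getInt(0)
bank.select(max($"age")).show()
```

Without inferSchema (or an explicit cast), every column is read as a string, and max would then compare the values as text rather than as numbers.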


2 Comments

Thanks so much, user4601931! I really appreciate the guidance. I will try this and report back.
No problem! If you think the answer addresses your question, you should click the checkmark to accept it. This way the question doesn't pop up again in the feed. This isn't definitive, and if a better answer comes along, you can accept that one instead.
