This is a little hard for me to explain, so I'll do my best. Here is the given data set:

Name  Car Brand  Car Model  Car Color  Year Bought
Tom   Toyota     Corolla    Black      2009
Tom   Hyundai    Kona       Blue       2010
Tom   Kia        Soul       Red        2011
Bob   Mazda      CX-30      Red        2008
Bob   BMW        X1         Blue       2014

I want to condense this data set by name, collecting each person's cars into a list, and write the result to a file as JSON objects on separate lines. For the data set above, the output should look like this:

{
    "name": "Tom",
    "Cars": [{
        "CarSpecifications": {
            "Brand": "Toyota",
            "Model": "Corolla",
            "Color": "Black"
        },
        "YearBought": 2009
    },
    {
        "CarSpecifications": {
            "Brand": "Hyundai",
            "Model": "Kona",
            "Color": "Blue"
        },
        "YearBought": 2010
    },
    {
        "CarSpecifications": {
            "Brand": "Kia",
            "Model": "Soul",
            "Color": "Red"
        },
        "YearBought": 2011
    }]
}

{
    "name": "Bob",
    "Cars": [{
        "CarSpecifications": {
            "Brand": "Mazda",
            "Model": "CX-30",
            "Color": "Red"
        },
        "YearBought": 2008
    },
    {
        "CarSpecifications": {
            "Brand": "BMW",
            "Model": "X1",
            "Color": "Blue"
        },
        "YearBought": 2014
    }]
}

How could I accomplish these transformations using Scala and Spark DataFrames?

1 Answer

You can aggregate the dataset using groupBy & collect_list and generate JSON strings with toJSON:

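import org.apache.spark.sql.functions.{collect_list, struct}
import spark.implicits._  // enables the $"..." column syntax; assumes a SparkSession named spark

df.groupBy("Name").agg(collect_list(
    struct(
      // nested struct holding the car's specifications
      struct(
        $"Car Brand".as("Brand"),
        $"Car Model".as("Model"),
        $"Car Color".as("Color")
      ).as("CarSpecifications"),
      $"Year Bought".as("YearBought")
    )
  ).as("Cars"))  // one array of car structs per name
  .toJSON        // serialize each row to a JSON string
  .show(false)   // print without truncating the column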

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|value                                                                                                                                                                                                                                                                                                    |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|{"Name":"Tom","Cars":[{"CarSpecifications":{"Brand":"Toyota","Model":"Corolla","Color":"Black"},"YearBought":"2009"},{"CarSpecifications":{"Brand":"Hyundai","Model":"Kona","Color":"Blue"},"YearBought":"2010"},{"CarSpecifications":{"Brand":"Kia","Model":"Soul","Color":"Red"},"YearBought":"2011"}]}|
|{"Name":"Bob","Cars":[{"CarSpecifications":{"Brand":"Mazda","Model":"CX-30","Color":"Red"},"YearBought":"2008"},{"CarSpecifications":{"Brand":"BMW","Model":"X1","Color":"Blue"},"YearBought":"2014"}]}                                                                                                  |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
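
The output above only prints the JSON strings, and YearBought comes out quoted, presumably because the source column is a string. As a rough sketch of how the result could be written to a file with numeric years, you can cast that column to int and use the DataFrame writer's json method, which emits one JSON object per line. The object name CarsToJsonLines, the helper groupCars, and the output path /tmp/cars_by_name are placeholders made up for illustration, and the local SparkSession is only there to make the example self-contained:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, struct}

// Hypothetical wrapper object; the name is not from the question or the answer.
object CarsToJsonLines {

  // Groups rows by Name and collects each person's cars into an array of structs.
  def groupCars(df: DataFrame): DataFrame =
    df.groupBy("Name").agg(
      collect_list(
        struct(
          struct(
            col("Car Brand").as("Brand"),
            col("Car Model").as("Model"),
            col("Car Color").as("Color")
          ).as("CarSpecifications"),
          // Cast so the year is serialized as a JSON number, not a string.
          col("Year Bought").cast("int").as("YearBought")
        )
      ).as("Cars")
    )

  def main(args: Array[String]): Unit = {
    // Local session for illustration; a real job would normally reuse an existing one.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cars-to-json-lines")
      .getOrCreate()
    import spark.implicits._

    // Sample data from the question, with the year kept as a string as in the answer.
    val df = Seq(
      ("Tom", "Toyota", "Corolla", "Black", "2009"),
      ("Tom", "Hyundai", "Kona", "Blue", "2010"),
      ("Tom", "Kia", "Soul", "Red", "2011"),
      ("Bob", "Mazda", "CX-30", "Red", "2008"),
      ("Bob", "BMW", "X1", "Blue", "2014")
    ).toDF("Name", "Car Brand", "Car Model", "Car Color", "Year Bought")

    // Placeholder output path; Spark creates a directory of part files there.
    groupCars(df).write.mode("overwrite").json("/tmp/cars_by_name")

    spark.stop()
  }
}
```

Note that write.json produces a directory containing one or more part files rather than a single file, and each line in those files is a compact JSON object, not the pretty-printed form shown in the question. Also keep in mind that collect_list does not guarantee the order of the collected cars after the shuffle, so sort the array explicitly if the order matters.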