I'm using PySpark 2.0 and have a DataFrame, df, like this:
+---+----------+----------+
|pid|      date|p_category|
+---+----------+----------+
|1ba|2016-09-30|      flat|
|3ed|2016-09-30|ultra_thin|
+---+----------+----------+
I ran this aggregation:
from pyspark.sql.functions import countDistinct
df.groupBy("p_category","date") \
.agg(countDistinct("pid").alias('cnt'))
and I got this:
+----------+----------+------+
|p_category|      date|   cnt|
+----------+----------+------+
|      flat|2016-09-30|116251|
|ultra_thin|2016-09-30|113017|
+----------+----------+------+
But I want a pivot table like this:
+----------+------+----------+
|      date|  flat|ultra_thin|
+----------+------+----------+
|2016-09-30|116251|    113017|
+----------+------+----------+
I tried calling pivot on the aggregated result:
df.groupBy("p_category","date") \
.agg(countDistinct("pid").alias('cnt')).pivot("p_category")
but I got this error:
'DataFrame' object has no attribute 'pivot'
How can I do a pivot in this case, or is there another solution? Thanks.