I am trying to convert a conversation into a DataFrame in Spark using Scala. Each person's name and their message are separated by a tab, and each message is on its own line.
The text file looks like this:
alpha hello,beta! how are you?
beta I am fine alpha.How about you?
alpha I am also doing fine...
alpha Actually, beta, I am bit busy nowadays and sorry I hadn't call U
I need the DataFrame to look like the following, with each person's name replaced by a numeric ID (alpha becomes 1, beta becomes 2):
------------------------------------------------------------------------------
|Person | Message
------------------------------------------------------------------------------
|1      | hello,beta! how are you?
|2      | I am fine alpha.How about you?
|1      | I am also doing fine...
|1      | Actually, beta, I am bit busy nowadays and sorry I hadn't call U
------------------------------------------------------------------------------
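
For reference, here is a minimal sketch of one way this could be done. It is not a definitive solution and it rests on a few assumptions: the separator is a single tab character, as described above; IDs are assigned in order of first appearance in the file; and the set of names is small enough to collect on the driver. The object name `ConversationToDataFrame`, the helper `toPersonMessage`, and the path `conversation.txt` are hypothetical placeholders.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object ConversationToDataFrame {

  // Turns "name<TAB>message" lines into a (Person, Message) DataFrame.
  // Hypothetical helper; assumes a tab separates the name from the message.
  def toPersonMessage(spark: SparkSession, lines: Dataset[String]): DataFrame = {
    import spark.implicits._

    // Split each line on the FIRST tab only, so tabs inside a message survive.
    // Lines without a tab are skipped.
    val pairs = lines
      .filter(_.contains("\t"))
      .map { line =>
        val Array(name, message) = line.split("\t", 2)
        (name.trim, message.trim)
      }

    // Assumption: names are collected to the driver and numbered 1, 2, ...
    // in the order they are first returned (file order for a single text file).
    val ids: Map[String, Int] = pairs.map(_._1).collect().distinct.zipWithIndex
      .map { case (name, i) => name -> (i + 1) }
      .toMap

    // Replace each name with its numeric ID.
    pairs
      .map { case (name, message) => (ids(name), message) }
      .toDF("Person", "Message")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ConversationToDataFrame")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input path; replace with the real file location.
    val lines = spark.read.textFile("conversation.txt")

    // Print full messages instead of truncating them at 20 characters.
    toPersonMessage(spark, lines).show(false)

    spark.stop()
  }
}
```

If the mapping from names to IDs is known in advance, a fixed `Map("alpha" -> 1, "beta" -> 2)` could replace the collected one, which avoids the extra pass over the data.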