
I am just getting started with Apache Flink (Scala API). My issue is the following: I am trying to stream data from Kafka into Apache Flink, based on an example from the Flink site:

val stream =
  env.addSource(new FlinkKafkaConsumer09("testing", new SimpleStringSchema() , properties))

Everything works correctly, the stream.print() statement displays the following on the screen:

2018-05-16 10:22:44 AM|1|11|-71.16|40.27

I would like to use a case class to load the data, so I tried

flatMap(p => p.split("|"))

but it only splits the data one character at a time.

Basically, the expected result is to populate the 5 fields of the case class as follows:

  field(0)=2018-05-16 10:22:44 AM
  field(1)=1
  field(2)=11
  field(3)=-71.16 
  field(4)=40.27

but instead it's producing:

   field(0) = 2
   field(1) = 0
   field(2) = 1
   field(3) = 8

and so on, one character per field.

Any advice would be greatly appreciated.

Thank you in advance

Frank

  • What do you mean by "splitting the data one character at a time"? Maybe you could give an example of the expected output and the actual output. Commented May 17, 2016 at 11:42
  • Thanks for your message, I've edited my post. Commented May 17, 2016 at 12:09

1 Answer


The problem is the usage of String.split. When you call it with a String argument, the method interprets that argument as a regular expression, and in a regex the pipe character | means alternation. Thus, p.split("\\|") is the correct call for your input data, because it escapes the pipe. Alternatively, you can call the split variant that takes a separating character: p.split('|'). Both solutions should give you the desired result.
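To make the difference concrete, here is a small sketch of both variants, mapped into a hypothetical case class. The class name, field names, and types below are assumptions based on the sample line in the question, not something from the original post:

```scala
// Hypothetical case class for the five pipe-separated fields;
// names and types are guesses from the sample line.
case class Reading(timestamp: String, id: Int, count: Int, lon: Double, lat: Double)

val line = "2018-05-16 10:22:44 AM|1|11|-71.16|40.27"

// Wrong: split(String) treats "|" as a regex alternation, which matches
// the empty string at every position, so you get one-character tokens.
val wrong = line.split("|") // Array("2", "0", "1", "8", ...)

// Correct: escape the pipe in the regex, or pass it as a Char.
val fields  = line.split("\\|") // Array("2018-05-16 10:22:44 AM", "1", "11", "-71.16", "40.27")
val fields2 = line.split('|')   // same result

val reading = Reading(fields(0), fields(1).toInt, fields(2).toInt,
                      fields(3).toDouble, fields(4).toDouble)
```

On the stream itself, a map is a better fit than flatMap here, since each input line yields exactly one record: stream.map { s => val f = s.split('|'); Reading(f(0), f(1).toInt, f(2).toInt, f(3).toDouble, f(4).toDouble) }.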


1 Comment

Thank you so much, I truly appreciate your helping a newbie.
