I use the ExecuteGroovyScript processor to extract only the columns I need for further calculations.

Groovy code:

def flowFile = session.get()

if(!flowFile) return

flowFile = session.write(flowFile, {inputStream, outputStream ->
    outputStream.withWriter("UTF-8"){ w ->
        inputStream.eachLine("UTF-8"){ line ->
            def row = line.split(';',-1)
            w << row[0,1,6,8,9,11].join(',') << '\n'
        }
    }
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)

But for some CSV files, I get a java.lang.ArrayIndexOutOfBoundsException.

My CSV:

id,name,email,address
1,sachith,[email protected],{"Lane":"ABC Lane","No":"24"}
2,nalaka,[email protected],{"Lane":
"DEF Lane","No":"34"}

How can I keep just the first row and ignore the other two? I tried the ValidateCsv processor for validation, but it could not catch this.

Comment: Please provide a sample of the CSV file. (Commented Feb 13, 2020 at 12:51)

1 Answer

I was able to use the ValidateCsv processor to validate the rows. This is tricky because the comma in the middle of {"Lane":"ABC Lane","No":"24"} causes that field to be treated as two different fields.
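To make the splitting issue concrete, here is a minimal Groovy sketch that counts the fields in each sample line when splitting on every comma. The e-mail values are placeholders (the addresses in the question are obscured), and this is plain `String.split`, not what ValidateCsv does internally.

```groovy
// Sample lines from the question; the e-mail values are placeholders.
def lines = [
    '1,sachith,sachith@example.com,{"Lane":"ABC Lane","No":"24"}',
    '2,nalaka,nalaka@example.com,{"Lane":',
    '"DEF Lane","No":"34"}'
]

// Split on every comma, keeping trailing empty fields (limit -1).
def fieldCounts = lines.collect { it.split(',', -1).length }

// The complete row yields 5 fields (the JSON value is cut in two),
// while the two halves of the broken row yield 4 and 2.
println fieldCounts
```

So a well-formed row has five comma-separated cells, not four, which is why the schema needs five entries.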

Every invalid row is routed to the invalid relationship.

ValidateCsv processor configuration:

Schema:

ParseInt(),StrNotNullOrEmpty(),StrNotNullOrEmpty(),StrRegex("\{.*"),StrRegex(".*\}")
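As a rough illustration of what this schema checks, the following Groovy sketch applies one test per cell in the same order. It is only an approximation written with plain closures, not the cell processors that ValidateCsv actually uses; the names `checks` and `isValid` and the placeholder e-mail values are my own assumptions.

```groovy
// One check per cell, in the same order as the schema.
// Plain closures standing in for the real cell processors.
def checks = [
    { String c -> c.isInteger() },        // ParseInt(): cell must parse as an integer
    { String c -> !c.isEmpty() },         // StrNotNullOrEmpty()
    { String c -> !c.isEmpty() },         // StrNotNullOrEmpty()
    { String c -> c ==~ /\{.*/ },         // StrRegex("\{.*"): first half of the JSON value
    { String c -> c ==~ /.*\}/ }          // StrRegex(".*\}"): second half of the JSON value
]

// A line is valid when it has exactly one cell per check
// and every cell passes its check.
def isValid = { String line ->
    def cells = line.split(',', -1)
    cells.length == checks.size() &&
        (0..<cells.length).every { i -> checks[i].call(cells[i]) }
}

// Sample lines from the question; the e-mail values are placeholders.
def rows = [
    '1,sachith,sachith@example.com,{"Lane":"ABC Lane","No":"24"}',
    '2,nalaka,nalaka@example.com,{"Lane":',
    '"DEF Lane","No":"34"}'
]

def validRows = rows.findAll(isValid)
println validRows
```

Only the first row has five cells that all pass, so the two halves of the broken row are rejected.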

See the processor documentation for more detail if you want a stricter schema:

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ValidateCsv/additionalDetails.html
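If you would rather guard against short rows inside the Groovy script itself, one option is to check the field count before indexing, since indexing an array past its end is what throws the ArrayIndexOutOfBoundsException. This is a minimal sketch, not tested inside NiFi; `selectColumns` is a hypothetical helper name.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: returns the wanted columns joined by ',',
// or null when the line has too few fields to index safely.
def selectColumns = { String line, List<Integer> wanted, String sep ->
    // Quote the separator so it is treated literally, not as a regex.
    def row = line.split(Pattern.quote(sep), -1)
    wanted.max() < row.length ? wanted.collect { row[it] }.join(',') : null
}

println selectColumns('a;b;c', [0, 2], ';')   // enough fields
println selectColumns('a;b', [0, 2], ';')     // too few fields, returns null
```

Inside the `eachLine` closure of the script, you could call such a helper and write the result only when it is not null, so incomplete lines are skipped instead of failing the whole flow file.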
