
I would like to generate two outputs in Pentaho: one with the rows whose CPF is unique, and another with the rows whose CPF is repeated. So far I have used the "Data grid" and "Sort rows" steps, but I don't know how to proceed from there. Here is the data:

Input data:


| CPF            | Nome           | Ano  |
|----------------|----------------|------|
| 636.624.160-00 | Alexandre Dias | 2023 |
| 438.815.860-75 | José da Silva  | 2023 |
| 438.815.860-75 | José da Silva  | 2022 |
| 311.520.000-55 | Maria Pereira  | 2022 |
| 835.894.510-84 | Otávio Campos  | 2023 |
| 835.894.510-84 | Otávio Campos  | 2022 |

Outputs I want:

Output with the rows whose CPF appears only once:


| CPF            | Nome           | Ano  |
|----------------|----------------|------|
| 636.624.160-00 | Alexandre Dias | 2023 |
| 311.520.000-55 | Maria Pereira  | 2022 |

Output with the rows whose CPF is repeated:

| CPF            | Nome           | Ano  |
|----------------|----------------|------|
| 438.815.860-75 | José da Silva  | 2023 |
| 438.815.860-75 | José da Silva  | 2022 |
| 835.894.510-84 | Otávio Campos  | 2023 |
| 835.894.510-84 | Otávio Campos  | 2022 |

Note: the CPFs were randomly generated.

1 Answer

Load the data with a text input step, so that each line becomes one row.

Then, in a side flow, use a "Memory group by" step to group the data on CPF and add a count field. After that, add a 0/1 flag based on the count (if count = 1 then 1, else 0).

Connect your main flow to the side flow with a lookup on CPF, and retrieve the 0/1 flag. Use a "Switch / case" step on that flag to send the rows down one of two paths, and write each path to a different file.
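To make the logic easier to follow outside of Pentaho, here is a minimal Python sketch of the same idea: count the rows per CPF (the side flow), look the count up for each row (the lookup), and route the row to one of two lists (the switch). It is only an illustration of the data flow, not Pentaho code; the names `ROWS` and `split_by_cpf` are made up for this example, and the rows are the sample data from the question.

```python
from collections import Counter

# Sample rows from the question: (CPF, Nome, Ano).
ROWS = [
    ("636.624.160-00", "Alexandre Dias", 2023),
    ("438.815.860-75", "José da Silva", 2023),
    ("438.815.860-75", "José da Silva", 2022),
    ("311.520.000-55", "Maria Pereira", 2022),
    ("835.894.510-84", "Otávio Campos", 2023),
    ("835.894.510-84", "Otávio Campos", 2022),
]


def split_by_cpf(rows):
    """Split rows into (unique, repeated) based on how often each CPF occurs."""
    # Side flow: group on CPF and count the rows in each group.
    counts = Counter(cpf for cpf, _nome, _ano in rows)

    unique, repeated = [], []
    for row in rows:
        # Lookup: fetch the count for this row's CPF and derive the 0/1 flag.
        is_unique = 1 if counts[row[0]] == 1 else 0
        # Switch: route the row to one of the two outputs.
        if is_unique:
            unique.append(row)
        else:
            repeated.append(row)
    return unique, repeated


if __name__ == "__main__":
    unique_rows, repeated_rows = split_by_cpf(ROWS)
    print("Unique CPF:")
    for row in unique_rows:
        print(row)
    print("Repeated CPF:")
    for row in repeated_rows:
        print(row)
```

Both outputs keep the original row order, because the main flow is never sorted or aggregated; only the side flow is grouped.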
