Input file:

"Run Date/Time: 2022-02-09 12:47",,,GOOD_MORNING_WORLD
"File Processed: AB-FILE2.20220209.110516",,,GOOD_MORNING_WORLD
AB1234,5,"        PQR2",GOOD_MORNING_WORLD
AB-345,10,"        PQR2",GOOD_MORNING_WORLD
XY890,20,"        PQR2",GOOD_MORNING_WORLD

Expected Output File:

Codes Produced, Count, PQR, Run Date, Run Time, File Processed
AB1234,5,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
AB-345,10,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
XY890,20,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516

Please help me produce the output format above.

I tried the command below, but it keeps the labels ("Run Date/Time:" and "File Processed:") along with their values. I need to remove those labels from each row.

awk -F, 'NR==1 {FN = $1; next} NR==2 {DT = $1; next} {print $1,$2,$3,FN,DT,$4}' OFS=, InputFile.csv > InputFile_Int.csv

AB1234,5,"        PQR2","Run Date/Time: 2022-02-09,12:47",File Processed: AB-FILE2.20220209.110516,GOOD_MORNING_WORLD
AB-345,10,"        PQR2","Run Date/Time: 2022-02-09,12:47",File Processed: AB-FILE2.20220209.110516,GOOD_MORNING_WORLD
XY890,20,"        PQR2","Run Date/Time: 2022-02-09,12:47",File Processed: AB-FILE2.20220209.110516,GOOD_MORNING_WORLD

Thanks in advance.

2 Answers

Don't put FS= or OFS= assignments after your script. It makes the code harder to read: people read the script assuming the default FS and OFS values and only discover at the very end that you changed them. Set both up front instead, i.e. write awk -Fx -v OFS=y 'script' file or awk 'BEGIN{FS="x";OFS="y"} script' file, not awk -Fx 'script' OFS=y file. The only exception is when you need different values for different input files, in which case you set either or both of them between the input file names.
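
There is also a behavioural difference beyond readability: awk processes a trailing var=value operand only when it reaches it in the argument list, which is after the BEGIN block has run. A minimal sketch of that difference (the header names h1 and h2 and the input x,y are made up for illustration):

```shell
# Trailing assignment: OFS still has its default value (a space) while BEGIN
# runs, so the header line is joined with a space, not a semicolon.
late=$(printf 'x,y\n' | awk -F, 'BEGIN { print "h1", "h2" } { print $1, $2 }' OFS=';' -)

# Up-front assignment with -v: OFS is already set when BEGIN runs.
early=$(printf 'x,y\n' | awk -F, -v OFS=';' 'BEGIN { print "h1", "h2" } { print $1, $2 }')

printf 'late:\n%s\nearly:\n%s\n' "$late" "$early"
```

Here late starts with h1 h2 while early starts with h1;h2; the data line is x;y in both.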

Also, don't use all-capitals names for user-defined variables. It obfuscates your code by making it look like you're using builtin variables when you aren't, and it risks clashes where a name you thought you were defining clobbers, or is clobbered by, a builtin variable.
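
As a sketch of such a clash, suppose a script wants a variable for a file size and calls it FS (a hypothetical choice, as is the value 100). That silently replaces the field separator, so the record is no longer split on blanks; a lowercase name leaves the builtin alone:

```shell
# FS is the builtin field separator: assigning 100 makes awk split on the
# string "100", which never occurs here, so $1 is the whole line.
clobbered=$(printf 'a b\n' | awk 'BEGIN { FS = 100 } { print $1 }')

# A lowercase user variable does not touch FS, so $1 is the first word.
safe=$(printf 'a b\n' | awk 'BEGIN { fs = 100 } { print $1 }')

printf 'clobbered=[%s] safe=[%s]\n' "$clobbered" "$safe"
```

This prints clobbered=[a b] safe=[a].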

$ cat tst.awk
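BEGIN {
    # A field separator is a comma plus any double quotes or whitespace
    # around it, so "        PQR2" is read as just PQR2.
    FS = "[\"[:space:]]*,[\"[:space:]]*"
    OFS = ","
}
NR < 3 {
    # The first two lines hold "label: value" text in $1; split it on blanks.
    split($1,parts," ")
    if ( NR == 1 ) {
        # "Run Date/Time: 2022-02-09 12:47
        date = parts[3]
        time = parts[4]
    }
    else {
        # "File Processed: AB-FILE2.20220209.110516
        file = parts[3]
        print "Codes Produced"," Count"," PQR"," Run Date"," Run Time"," File Processed"
    }
    next
}
# Data lines: restore the quotes around the PQR field and append the saved values.
{ print $1, $2, "\"" $3 "\"", date, time, file }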

$ awk -f tst.awk InputFile.csv
Codes Produced, Count, PQR, Run Date, Run Time, File Processed
AB1234,5,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
AB-345,10,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
XY890,20,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
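
To see what that FS does, this sketch prints each field of one data line from the question in brackets. The separator regex swallows the comma together with any double quotes and whitespace next to it, which is why $3 arrives as PQR2 with neither the quote nor the padding:

```shell
# Split one sample data line with the same FS as tst.awk and show every field.
fields=$(printf '%s\n' 'AB1234,5,"        PQR2",GOOD_MORNING_WORLD' |
    awk 'BEGIN { FS = "[\"[:space:]]*,[\"[:space:]]*" }
         { for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

The four lines printed are 1=[AB1234], 2=[5], 3=[PQR2] and 4=[GOOD_MORNING_WORLD]. Note that a quote at the very start or end of a line is not next to a comma, so it stays in the field; that is harmless here because only the third and fourth blank-separated words of the first two lines are used.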

Try

awk -F'[ ,]+' '
        {gsub(/"/,"")
        }
NR==1   {print "Codes Produced, Count, PQR, Run Date, Run Time, File Processed"
         DT = $3
         TM = $4
         next
        }
NR==2   {FN = $3
         next
        }
        {print $1,$2,"\"" $3 "\"",DT,TM,FN
        }
' OFS=, file
Codes Produced, Count, PQR, Run Date, Run Time, File Processed
AB1234,5,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
AB-345,10,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
XY890,20,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
