
I have a CSV file that looks like the one below, with up to 15,000 lines.
The numeric Start and End values are between 0 and 300.
I am looking for a way to parse the file, find the rows that start with white, and then change the Start value and the End value of each such row according to the following conditions:

  1. if the value is ≤ 150, add 150
  2. if the value is > 150, subtract 150

Finally, the source file should be overwritten with the edits.
I would like to do this with Bash or Python. Any help is much appreciated!

Raw Data:

Color, Start, End
white, 0, 1, 
black, 23, 150, 
black, 150, 24, 
white, 24, 152, 
black, 152, 25, 
black, 25, 154, 
black, 154, 81, 
white, 99, 220,
...

Final Data:

Color, Start, End
white, 150, 151, 
black, 23, 150, 
black, 150, 24, 
white, 174, 2, 
black, 152, 25, 
black, 25, 154, 
black, 154, 81, 
white, 249, 70,
...
Hello, welcome to SO. What did you try? What was the result? Commented Aug 5, 2021 at 11:15

3 Answers

Sounds like a perfect job for awk:

awk -F, -v OFS=', ' '$1=="white" {for(i=2;i<=3;i++) if($i<=150) $i+=150; else $i-=150} 1' file.csv

1 Comment

That worked perfectly, thanks a lot! I added > output.csv to the end of the command to write the results to a file.
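
Redirecting to output.csv leaves the source file untouched, while the question asked for it to be overwritten. As a rough sketch only (not part of the answer above), the same edit could be done in Python by writing to a temporary file and then swapping it in with os.replace. The names shift and rewrite_in_place are made up for illustration, and the sketch assumes the comma-plus-space layout shown in the question; it also drops the trailing space on rewritten rows.

```python
import os
import tempfile


def shift(value: int, pivot: int = 150) -> int:
    """Add pivot to values <= pivot, subtract it from larger values."""
    return value + pivot if value <= pivot else value - pivot


def rewrite_in_place(path: str) -> None:
    """Rewrite rows starting with 'white', then replace the original file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so that the final
    # os.replace is a rename on the same filesystem.
    with open(path, newline="") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False, newline=""
    ) as tmp:
        for line in src:
            fields = [f.strip() for f in line.rstrip("\r\n").split(",")]
            if fields[0] == "white" and len(fields) >= 3:
                fields[1] = str(shift(int(fields[1])))
                fields[2] = str(shift(int(fields[2])))
                # Rebuild the row; the empty field after the trailing
                # comma is kept, the trailing space is not.
                line = ", ".join(fields).rstrip() + "\n"
            tmp.write(line)
    # Only swap the files once the temporary file is fully written.
    os.replace(tmp.name, path)
```

Calling rewrite_in_place("file.csv") would then update the file itself; if the script fails halfway, the original is still intact because the swap only happens at the end.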
$ awk 'BEGIN{FS=" *, *"; OFS=", "; n=150} $1=="white"{ for (i=2; i<NF; i++) $i+=($i>n ? -n : n) } 1' file
Color, Start, End
white, 150, 151,
black, 23, 150,
black, 150, 24,
white, 174, 2,
black, 152, 25,
black, 25, 154,
black, 154, 81,
white, 249, 70,
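
Why the loop in this command stops at i<NF rather than i<=NF may not be obvious. A small Python sketch of the same splitting (the helper name split_row is invented here, and re.split only approximates how awk applies FS) suggests the reason: the trailing comma on each data row produces an empty last field.

```python
import re

# Same separator as the awk FS: a comma with optional spaces on either side.
FIELD_SEP = re.compile(r" *, *")


def split_row(line: str) -> list:
    """Split one CSV row the way FS=" *, *" would."""
    return FIELD_SEP.split(line.rstrip("\n"))


row = split_row("white, 0, 1, \n")
print(row)       # ['white', '0', '1', '']
print(len(row))  # 4
```

So with four fields, looping over fields 2 and 3 covers Start and End and skips the empty one. A row without the trailing comma would have only three fields, and i<NF would then leave End unchanged, so this relies on every data row ending in a comma as in the sample.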

Python code that does it:

def shift(value):
    # Add 150 to values up to 150, subtract 150 from larger ones.
    return value + 150 if value <= 150 else value - 150

with open("colors.csv") as f:
    # splitlines() avoids the empty last element that split("\n") leaves,
    # which would add one more blank line to the file on every run.
    lines = f.read().splitlines()

output = []
for line in lines:
    parts = line.split(",")
    if len(parts) >= 3 and parts[0] == "white":
        start = shift(int(parts[1]))
        end = shift(int(parts[2]))
        output.append(f"white, {start}, {end},")
    else:
        output.append(line)

# Overwrite the source file with the edited rows.
with open("colors.csv", "w") as f:
    f.write("\n".join(output) + "\n")

Note: if you run it twice, every value is restored to its original, except 0, which ends up as 300 (0 becomes 150, and 150 then becomes 300).
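
A quick way to check that note, as a sketch: apply the rule twice to every value in the stated 0 to 300 range and list the ones that do not come back. The function name shift is only illustrative.

```python
def shift(value: int, pivot: int = 150) -> int:
    """Add pivot to values <= pivot, subtract it from larger values."""
    return value + pivot if value <= pivot else value - pivot


# Values that are not restored after applying the rule twice.
not_restored = [v for v in range(0, 301) if shift(shift(v)) != v]
print(not_restored)      # [0]
print(shift(shift(0)))   # 300
```

The odd one out comes from 150 falling on the "add" side of the rule, so 0 goes to 150 and then on to 300 instead of back to 0.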

1 Comment

If it worked, you should mark it as the accepted answer (or mark the awk one, which was more elegant).
