I am reading a file into an array. This file contains comma delimited data formatted like this:

16.01,1.8
20,1.84
25.01,1.9
31.52,1.93
800.1,1.99
1000,1.98
1250,1.98
16000,2
20010,2

I need to find the number closest to "1000" in the first column, and I have a working function for that. The value found there is then used for further processing of the input file. However, I can't get this variable until the file has been processed up to that point, which means that the second part of my script only processes the array data after the "1000" point has been found.

The only way I could see to do this was to open the file a second time with another while loop (unless there is a way to store and re-use the array content?).

My script:

    #!/bin/bash
    #
    find1k () {
        if ((980<=$freq && $freq<=1050)); then
            scalevolts=$volts
        fi
    }

    while IFS=$',' read -r -a lines; do
        [[ "$lines" =~ ^#.*$ ]] && continue

        freq="${lines[0]}"
        volts="${lines[1]}"
        freq=$(printf "%.0f\n" "$freq")

        if [ -z "$scalevolts" ]; then
            find1k
        else
# I need to loop through the entire array again from here.
#
            normalised=$(echo "scale=3; ($volts/$scalevolts)"|bc -l)
            echo $freq , $volts , $normalised
        fi
    done < "$1"

Is there a way to do this without having to open the file twice? (i.e. re-use the array content).

bash version is 4.2

Thanks.

  • Is the file sorted by the first column? Commented Jul 14, 2016 at 10:08
  • Yes, the content is always generated in the same order - first column sorted low to high. Commented Jul 14, 2016 at 10:09
  • which part of your script requires to read from first again? Is it the part where you calculate normalised? Commented Jul 14, 2016 at 10:10
  • @Fazlin - yes - it needs to loop through the entire array again from that point. Commented Jul 14, 2016 at 10:14

2 Answers

I'd do the whole thing in awk. Then you don't have to worry about bash's integer-only arithmetic.

awk '
    BEGIN {FS = ","; n=0}
    function abs(x) {
        if (x < 0) return -x
        return x
    }
    /^[[:blank:]]*(#|$)/ {next}                # skip comments and blank lines
    n == 0 {min = abs(1000 - $1) }
    {diff = abs(1000 - $1)}
    diff <= min {min = diff; base_voltage = $2}
    {n++; freq[n] = $1; volts[n] = $2}
    END {
        for (i=1; i<=n; i++) {
            printf "%s,%s,%.3f\n", freq[i], volts[i], volts[i]/base_voltage
        }
    }
' "$1"

outputs

16.01,1.8,0.909
20,1.84,0.929
25.01,1.9,0.960
31.52,1.93,0.975
800.1,1.99,1.005
1000,1.98,1.000
1250,1.98,1.000
16000,2,1.010
20010,2,1.010

13 Comments

I just tried this, and rather oddly it prints "inf" as the third field, like this: " 16.01,1.8,inf ". Another potential issue is that the "1000" can vary ever so slightly either way, which is why my original script searched the array either side of 1000, which I don't think your awk script is doing? I have a reasonably fair understanding of awk, but not quite at your level!
Ah.. is it this bit "diff = abs(1000 - $1)} diff < prev {one_k = $1" that's doing the checking for 1000?
That's right. I'm finding the freq that's closest to 1000. Your code finds the last freq that falls within the given range.
Got it, updated my answer. Accounts for comments now.
I tweaked how I found the closest. See where I use variable named "min".

At the moment, you're using read -a, so every line is read into an array, but the contents of this array are overwritten on each iteration. The elements of ${lines[@]} only ever contain the values for the current line, so I guess that ${line[@]} would be a better name for the variable.
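To make the overwriting visible, here is a minimal sketch. The function name `show_fields` is made up for this illustration and is not part of the original script; it reads comma-separated rows from standard input and reports how many elements the array holds on each pass.

```shell
#!/bin/bash
# show_fields is a hypothetical helper used only for this illustration.
# Each call to `read -a` replaces the whole array, so the element count
# stays at one row's worth of fields no matter how many rows are read.
show_fields() {
    local -a line
    while IFS=, read -r -a line; do
        echo "${#line[@]} fields: ${line[*]}"
    done
}

show_fields <<'EOF'
16.01,1.8
20,1.84
EOF
```

With the two rows above, each iteration reports 2 fields rather than a growing total, which is why nothing from earlier rows is available later in the loop.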

If you want, you can save each part of the line into a separate array like this:

freq=()
volts=()

while IFS=, read -r f v rest; do
    freq+=( "$f" )
    volts+=( "$v" )
    # etc.
done < "$1"

But really I think that you're using the wrong tool for the job; the shell is slow at reading files line by line. I would recommend that you use awk instead. If you show us what your desired output is for the input you have shown, then we can help you better.
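If you do want to stay in bash, the array idea above can be taken one step further so the file is opened only once. This is a sketch under some assumptions, not a drop-in replacement: the function name `normalise` and the variables `target`, `best` and `skip` are invented for the example, the first column is assumed to be numeric, and awk is swapped in for bc for the division so the result is rounded to three decimals with a leading zero.

```shell
#!/bin/bash
# Sketch: read the file once, keep both columns, then reuse the arrays.
# normalise, target, best and skip are hypothetical names for this example.
normalise() {
    local target=1000                      # assumed reference frequency
    local skip='^[[:blank:]]*(#|$)'        # comment or blank line
    local -a freq=() volts=()
    local f v rest rounded diff best= scalevolts= i

    # The only read of the file: store every row and track the row
    # whose rounded frequency is nearest to the target.
    while IFS=, read -r f v rest; do
        [[ $f =~ $skip ]] && continue
        freq+=( "$f" )
        volts+=( "$v" )

        rounded=$(printf '%.0f' "$f")      # bash arithmetic is integer-only
        diff=$(( rounded > target ? rounded - target : target - rounded ))
        if [[ -z $best ]] || (( diff < best )); then
            best=$diff
            scalevolts=$v
        fi
    done < "$1"

    # Second loop runs over the stored arrays, not the file.
    for i in "${!freq[@]}"; do
        printf '%s,%s,%s\n' "${freq[i]}" "${volts[i]}" \
            "$(awk -v v="${volts[i]}" -v s="$scalevolts" 'BEGIN { printf "%.3f", v / s }')"
    done
}

# Run only when a file name is given on the command line.
if [[ $# -gt 0 ]]; then
    normalise "$1"
fi
```

For the sample data in the question, the row nearest 1000 is `1000,1.98`, so every voltage is divided by 1.98. On a tie this sketch keeps the first of the equally close rows. Starting awk once per row is slow on large files, which is another reason to prefer doing the whole job in awk.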

3 Comments

Bash arithmetic doesn't handle floats.
@choroba I know! I wasn't suggesting that it did. I was just showing the OP how to save the values from each line into an array.
You'll note I was using 'bc' for the maths, not bash, so I don't think that bash not handling floats matters?
