I am looking for something like this...

Given this file (let's call it "foo.log"):

START_OF_ENTRY
line2
END_OF_ENTRY
START_OF_ENTRY
no match
END_OF_ENTRY
START_OF_ENTRY
line2
END_OF_ENTRY

Executing the following command:

pcregrep -M -o '(?m)^START_OF_ENTRY\nline2\nEND_OF_ENTRY$' foo.log | for match in STDIN; do echo "match: $match"; done

would produce

match: START_OF_ENTRY
line2
END_OF_ENTRY
match: START_OF_ENTRY
line2
END_OF_ENTRY

Is this possible in bash?

  • What exactly do you mean by "in bash"? Are you looking for a solution that uses only shell features? Commented May 11, 2020 at 1:14
  • Yes, I'm curious whether it's possible with the shell alone, preferably with bash only. Commented May 11, 2020 at 9:49

1 Answer

Using bash regular expression matching, with the proviso that bash does not appear to have a multiline equivalent of the ^ and $ anchors, so the pattern is left unanchored:

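# Read the whole file into a variable.
x=$(<foo.log)

# printf -v turns each \n into a literal newline in the pattern.
printf -v re 'START_OF_ENTRY\nline2\nEND_OF_ENTRY'

# Print each match, then drop everything up to and including it.
while [[ $x =~ $re ]]; do
  printf 'match: %s\n' "${BASH_REMATCH[0]}"
  # Quote the match so it is removed literally, not treated as a glob pattern.
  x=${x#*"${BASH_REMATCH[0]}"}
done

Output: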
match: START_OF_ENTRY
line2
END_OF_ENTRY
match: START_OF_ENTRY
line2
END_OF_ENTRY
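
Because the pattern above is unanchored, it would also match inside longer lines such as "XSTART_OF_ENTRY". One possible way to get whole-line matching in pure bash is to read the input line by line and compare a sliding window of three lines, which also avoids holding the whole file in a variable. This is only a sketch and not part of the original answer: the function name match_entries is hypothetical, and the three expected lines are hard-coded rather than taken from a regular expression.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read entries from stdin and print every run of three
# consecutive lines that are exactly START_OF_ENTRY, line2, END_OF_ENTRY.
match_entries() {
  local line prev2='' prev1=''
  # "|| [[ -n $line ]]" also handles a final line with no trailing newline.
  while IFS= read -r line || [[ -n $line ]]; do
    # Whole-line string comparison plays the role of the ^ and $ anchors.
    if [[ $prev2 == "START_OF_ENTRY" && $prev1 == "line2" && $line == "END_OF_ENTRY" ]]; then
      printf 'match: %s\n%s\n%s\n' "$prev2" "$prev1" "$line"
    fi
    # Slide the window forward by one line.
    prev2=$prev1
    prev1=$line
  done
}

# Demo with the sample data from the question, fed through a here-document.
match_entries <<'EOF'
START_OF_ENTRY
line2
END_OF_ENTRY
START_OF_ENTRY
no match
END_OF_ENTRY
START_OF_ENTRY
line2
END_OF_ENTRY
EOF
```

To run it against the file from the question instead of the here-document, redirect the file into the function: match_entries < foo.log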

With Perl, where the m modifier makes ^ and $ match at the start and end of each line within a multiline string (-0777 reads the whole file as a single string):

perl -0777 -nE 'while ($_ =~ /^(START_OF_ENTRY\nline2\nEND_OF_ENTRY)$/mg) {say "match: $1"}' foo.log
  • Reading the whole file into memory could be an issue with big files. Also, processing text in the shell is usually a bad idea, as the shell is quite slow at such tasks. Commented May 11, 2020 at 7:18
