0

I have several text files and would like to extract the email address from each file that contains the text "Provider". The files are not formatted consistently, so "Provider" could appear anywhere in the text.

Here is a short sample:
file 1.txt
Name: Joe1
Provider
...
Email [email protected]

file 2.txt
Name: Joe2
...
Client
...
Email [email protected]

file 3.txt
Name: Joe3
...
Provider
Email [email protected]

I am using this short snippet, but it returns the emails from all files:

$ awk -F, '{
  for (i=1; i<=NF; i++)
    if ($i ~ /@/)
       print $i
}' *

Can you help me out?

Thanks

3 Answers

1
$ awk 'FNR==1 { provider = 0 }          # reset the flag at the start of each file
       /Provider/ { provider = 1 }      # this file mentions Provider
       /@/ && provider {
         for (i=1; i<=NF; i++) {
           if ($i ~ /@/) print $i
         }
       }' *
  • For each file, reset provider to 0 on the first line (awk has no boolean constants, so the flag is a plain 0/1 number)
  • If a line contains Provider, set provider to 1
  • If a line contains an @ and Provider was seen earlier in the same file, iterate over the fields and print those that contain the @
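
If Provider can also appear after the Email line, the flag approach above misses that address. Here is a minimal sketch of one way around it, assuming a POSIX awk: buffer the addresses per file and print them only once the whole file has been read. The sample files, the example.com addresses and the function name emit_file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; the addresses are placeholders, not from the question.
dir=$(mktemp -d)
printf 'Name: Joe1\nEmail joe1@example.com\nProvider\n' > "$dir/1.txt"
printf 'Name: Joe2\nClient\nEmail joe2@example.com\n'   > "$dir/2.txt"
printf 'Name: Joe3\nProvider\nEmail joe3@example.com\n' > "$dir/3.txt"

result=$(awk '
  # Print the buffered addresses only if the file mentioned Provider,
  # then reset the per-file state. j is a local variable.
  function emit_file(    j) {
    if (provider) for (j = 1; j <= n; j++) print emails[j]
    provider = 0; n = 0
  }
  FNR == 1 && NR > 1 { emit_file() }   # a new file starts: settle the previous one
  /Provider/ { provider = 1 }
  /@/ { for (i = 1; i <= NF; i++) if ($i ~ /@/) emails[++n] = $i }
  END { emit_file() }                  # settle the last file
' "$dir"/*.txt)

printf '%s\n' "$result"
rm -rf "$dir"
```

With these three sample files, 1.txt has Provider after the Email line and is still reported, while 2.txt is skipped.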
0

You can try this (the -P option needs a grep with PCRE support, such as GNU grep):

for fname in file*.txt
do
    if grep -q 'Provider' "$fname"; then
       grep -oP 'Email[[:space:]]*\K(.*@.*)' "$fname"
    fi
done
0
grep -l Provider file*.txt | xargs grep -ohE '[^@[:space:]]+@[^@[:space:]]+'

There are more accurate regexps for email addresses out there, e.g. [0-9a-zA-Z._%+-]+@[0-9a-zA-Z.-]+\.[a-zA-Z]{2,}, if you need more precision. Like the pattern above, it uses extended regex syntax, so it needs grep -E.
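
As a rough sketch of how the stricter pattern could be plugged into the same pipeline: the temporary files and example.com addresses below are invented for the demonstration, and -o and -h are assumed to be available, as they are in GNU and BSD grep.

```shell
#!/bin/sh
# Hypothetical sample data, only for demonstration.
dir=$(mktemp -d)
printf 'Provider\nEmail joe1@example.com\nnot-an-address: foo@bar\n' > "$dir/file1.txt"
printf 'Client\nEmail joe2@example.com\n' > "$dir/file2.txt"

# -l lists the files that contain Provider; the second grep then prints
# only the matching text (-o), without file name prefixes (-h), using
# extended regex syntax (-E) so that + and {2,} act as quantifiers.
matches=$(grep -l Provider "$dir"/file*.txt |
  xargs grep -ohE '[0-9a-zA-Z._%+-]+@[0-9a-zA-Z.-]+\.[a-zA-Z]{2,}')

printf '%s\n' "$matches"
rm -rf "$dir"
```

Here foo@bar is ignored because it has no dot-separated domain suffix, and file2.txt is never searched for addresses because it does not contain Provider.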
