Input data (DNA covariance model, all in a single file)
Format is header : sequence, followed by header : covariance
NC_013791.2.2 : GCTCAGCTGGCtAGAG
NC_013791.2.2 : >>>>.........<<<
NC_013791.2.3 : GCTCAGCTGGCtAGAG
NC_013791.2.3 : >>>>..<<<<......
NC_013791.2.4 : GCTCAGCTGGCtAGGA
NC_013791.2.4 : >>>>.........<<<
NC_013791.2.5 : GCTCAGCTGACtACAG
NC_013791.2.5 : >>>>..<<<<......
Expected output for all of the above IDs, from the same single file:
NC_013791.2.2 : GAG
NC_013791.2.2 : <<<
NC_013791.2.3 : CTGG
NC_013791.2.3 : <<<<
NC_013791.2.4 : GGA
NC_013791.2.4 : <<<
NC_013791.2.5 : CTGA
NC_013791.2.5 : <<<<
I am able to delete the last character with:
sed 's/.$//'
as suggested on Stack Overflow, extract the last characters with:
rev sym.txt | cut -c 1-3 | rev
and extract only < with grep:
grep -Eo "<.{3}" sym.txt
But I am not able to extract the output as below:
GAG
<<<
GAGC
<<<<
or GAGC <<<<
Could someone help with sed, awk or grep? Thank you in advance.
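To make the intended mapping concrete, here is a minimal awk sketch of one possible approach, not a definitive solution. It assumes every line has the form "ID : value", that the sequence line comes directly before its covariance line, and that a covariance line contains only the characters <, > and . (any other line is treated as a sequence). The function name extract_closing is hypothetical. It keeps each sequence character whose position lines up with a < in the covariance line; if the < characters were not contiguous, they would all still be collected.

```shell
# Hypothetical helper: reads "ID : value" lines from stdin and prints,
# for each ID, the sequence characters aligned with '<' and the '<' run.
extract_closing() {
  awk -F' : ' '
    # A covariance line consists only of <, > and . characters.
    $2 ~ /^[<>.]+$/ {
      picked = ""; marks = ""
      n = length($2)
      for (i = 1; i <= n; i++) {
        if (substr($2, i, 1) == "<") {
          picked = picked substr(prev, i, 1)   # same column in the sequence
          marks  = marks "<"
        }
      }
      print $1 " : " picked
      print $1 " : " marks
      next
    }
    # Any other line is assumed to be a sequence line: remember its value.
    { prev = $2 }
  '
}

# Demo on two of the IDs from the question.
extract_closing <<'EOF'
NC_013791.2.2 : GCTCAGCTGGCtAGAG
NC_013791.2.2 : >>>>.........<<<
NC_013791.2.3 : GCTCAGCTGGCtAGAG
NC_013791.2.3 : >>>>..<<<<......
EOF
```

With the real file, the call would be extract_closing < sym.txt. For the two IDs in the demo this prints GAG / <<< and CTGG / <<<<, each prefixed with its ID, matching the expected output above.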