
I have some data like this and I want to extract the Var column for CCC but only for the first three months of 2018 and 2019.

ID   Date  Var
--- ------ ---
AAA 201701 110
BBB 201705 211
CCC 201710 312
AAA 201712 413
BBB 201801 514
CCC 201801 615
AAA 201802 716
BBB 201802 817
CCC 201803 918
AAA 201803 119
BBB 201804 220
CCC 201804 321
AAA 201901 222
BBB 201902 312
CCC 201903 111

The output should be 615,918,111.

I would like to make a pattern for the dates.

I've tried these so far

awk '/CCC/ && /201801/ && /201802/ && /201901/ && /201902/&& /201903/ { print $3 } ' file.txt

awk ' $1 ~ /CCC/ || /201801/ && /201802/ && /201901/ && /201902/&& /201903/ { print $3 } ' file.txt
  • But 918 and 111 are from 3rd month. Commented Sep 27, 2020 at 1:20
  • Sorry I meant first three Commented Sep 27, 2020 at 1:23

3 Answers

awk '$1 == "CCC" && $2 ~ /201[89]0[123]/{print $3}' filename

output

615
918
111

Python

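#!/usr/bin/env python3
import re

# Date must be 2018 or 2019, month 01-03.
date_re = re.compile(r'201[89]0[123]')

with open('filename') as f:
    for line in f:
        fields = line.split()  # split on any run of whitespace
        # Skip blank or malformed lines, then test ID and date.
        if len(fields) >= 3 and fields[0] == 'CCC' and date_re.fullmatch(fields[1]):
            print(fields[2])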

output

615
918
111

You can use the following awk command:

$ awk '$1 ~ /CCC/ && $2 ~ /201(8|9)0(1|2|3)/ {print $3}' file.txt
615
918
111

UPDATE

For average:

awk '$1 ~ /CCC/ && $2 ~ /201(8|9)0(1|2|3)/ {print $3; sum+=$3; n+=1} END { print "Average: " sum/n }' file.txt
615
918
111
Average: 548
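
For comparison, here is a minimal Python sketch of the same running-sum-and-count idea. It assumes the rows are whitespace-separated exactly as in the question; the names `ROWS`, `DATE_RE`, and `matching_values` are made up for illustration and are not part of the awk answer.

```python
import re

# Sample rows copied from the question: ID, Date (YYYYMM), Var.
ROWS = """\
AAA 201701 110
BBB 201705 211
CCC 201710 312
AAA 201712 413
BBB 201801 514
CCC 201801 615
AAA 201802 716
BBB 201802 817
CCC 201803 918
AAA 201803 119
BBB 201804 220
CCC 201804 321
AAA 201901 222
BBB 201902 312
CCC 201903 111
""".splitlines()

# Years 2018 or 2019, months 01 through 03 (hypothetical helper name).
DATE_RE = re.compile(r"201[89]0[123]")


def matching_values(lines, wanted_id="CCC"):
    """Return the Var values for wanted_id whose date matches DATE_RE."""
    values = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        row_id, date, var = fields
        if row_id == wanted_id and DATE_RE.fullmatch(date):
            values.append(int(var))
    return values


values = matching_values(ROWS)

total = 0  # plays the role of awk's `sum`
count = 0  # plays the role of awk's `n`
for v in values:
    total += v   # sum += $3
    count += 1   # n += 1

# Guard against an empty match set to avoid dividing by zero.
if count:
    print("Average:", total / count)
```

Unlike the awk one-liner, this sketch checks `count` before dividing, so an input with no matching rows prints nothing instead of raising an error.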
  • How could I get the average Commented Sep 27, 2020 at 2:43
  • @entropee I updated the answer. Commented Sep 27, 2020 at 2:54
  • Can you explain the n+=1 Commented Sep 27, 2020 at 3:50
  • @entropee Every time a line matches the specified conditions, the value in the Var column is added to sum and n is incremented by 1. We need n because the average is the sum divided by the number of matching items. Commented Sep 27, 2020 at 7:33
  • @entropee Please always accept an answer to the question you have asked, without adding new questions that require modifications on the existing answers. Chameleon questions should be avoided. You can ask a new question if you don't have the answer on a following task you have to do. Commented Sep 27, 2020 at 9:46

In fact, because ID and Date are adjacent and separated by a single space, you only need one pattern matched against the whole line ($0), which also saves a few characters of typing:

awk '/CCC 201[89]0[1-3]/{print $3; s+=$3; n++}END{print s/n}' file
615
918
111
548
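
The whole-line approach can be sketched in Python as well. This is only an illustration under the assumption that fields are separated by exactly one space, as in the sample data; `LINE_RE` and `var_if_match` are hypothetical names. Note that this sketch anchors the pattern to the start and end of the line, which is stricter than the unanchored awk pattern above.

```python
import re

# One pattern for the whole line: ID "CCC", a single space, a date in
# Jan-Mar of 2018 or 2019, a single space, then the numeric Var value.
LINE_RE = re.compile(r"^CCC 201[89]0[1-3] (\d+)$")


def var_if_match(line):
    """Return Var as an int if the whole line matches, otherwise None."""
    m = LINE_RE.match(line.strip())
    return int(m.group(1)) if m else None


for sample in ["CCC 201801 615", "CCC 201804 321", "BBB 201801 514"]:
    print(sample, "->", var_if_match(sample))
```

Because the separator is written into the pattern as a literal space, a line with tabs or repeated spaces between fields would not match; the field-based versions in the other answers do not have that limitation.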
