I know Python is almost made for these kind of purposes, but I am really struggling to understand how I get access to specific values in the dataset, and I tried both with pandas and csv modules. It is probably a matter of syntax. Here's the thing: I have a csv file in the form of
Nation, Year, No. of refugees
Afghanistan,2013,6657
Albania,2013,199
Algeria,2013,91
Angola,2013,47
Armenia,2013,156
...
...
Afghanistan,2012,6960
Albania,2012,157
Algeria,2012,67
Angola,2012,43
Armenia,2012,143
...
and so on. What I would like to do is to get the total amount of refugees per year, i.e. selecting all the rows with a certain year and summing all the elements in the related "no. of refugees" column. I managed to do this:
import csv
with open('refugees.csv', 'r') as f:
d_reader = csv.DictReader(f)
headers = d_reader.fieldnames
print headers
#2013
list2013=[]
for line in d_reader:
if (line['Year']=='2013'):
list2013.append(line['Refugees'])
list2013=map(int,list2013) #I have str values in my file
ref13=sum(list2013)
but I am looking for a more elegant (and, above all, iterative) solution. Moreover, if I perform that procedure multiple times for different years, I always get 0: it works for 2013 only, not sure why.
Edit: I tried this as well, without success, but I think this could be totally wrong:
import csv
refugees_dict={}
a=range(2005,2014)
a=map(str, a)
with open('refugees.csv', 'r') as f:
d_reader = csv.DictReader(f)
for element in a:
for line in d_reader:
if (line['Year']==element):
print 'hello!'
temp_list=[]
temp_list.append(line['Refugees'])
temp_list=map(int, temp_list)
refugees_dict[a]=sum(temp_list)
print refugees_dict
The next step of my work will involve further studies on the dataset, eg I am probably gonna need to access data nation-wise instead of year-wise, and I really appreciate any hint so I understand how to manipulate data. Thanks a lot.
it works for 2013 only, not sure why, because you hardcodedif (line['Year']=='2013'), maybe?....., also were you not able to load this into a pandas data frame?dataframe = pd.read_csv("refugees.csv")ordataframe = pd.read_csv("refugees.csv", header=None)if you don't want the headers.