I have file *.data, which include data in this order:
2.5,10,U1
3,4.5,U1
3,9,U1
3.5,5.5,U1
3.5,8,U1
4,7.5,U1
4.5,3.5,U1
4.5,4.5,U1
4.5,6,U1
5,5,U1
5,7,U1
7,6.5,U1
3.5,9.5,U2
3.5,10.5,U2
4.5,8,U2
4.5,10.5,U2
5,9,U2
5.5,5.5,U2
5.5,7.5,U2
In this data(I have different types of data, this is just example where are just 2 classes...), is 2 classes: U1 and U2, and for every class there is 2 values... What I need is to read this data and separate them to classes, in this case to U1 and U2.... Then after that I need to take from every class 2/3 data to new value(learning_set), and other 1/3 to other value(test_set).
I started with this code:
data = open('set.data', 'rt')
data_list=[]
border=2./3
data_list = [line.strip().split(',') for line in data]
learning_set=data_list[:int(round(len(data_list)*border))]
test_set=data_list[int(round(len(data_list)*border)):]
But there I take from all data 2/3 and 1/3, not from every class.
Many thanks for help