
Following the tutorial on multiclass AdaBoost, I'm trying to classify some images that fall into two classes (I don't see why the algorithm shouldn't work on a binary problem). Later I'm going to extend my samples to include other classes.

My current test is quite small: only 17 images in all, 10 for training and 7 for testing.

For now I have two classes: 0 (no vehicle) and 1 (vehicle present). I used integer labels because, according to the example in the link above, the training data consists of integer-based labels.

I've edited the provided example only a bit, to include my own image files, but I'm getting an error.

Traceback (most recent call last):
  File "C:\Users\app\Documents\Python Scripts\carclassify.py", line 66, in <module>
    bdt_discrete.fit(X_train, y_train)
  File "C:\Users\app\Anaconda\lib\site-packages\sklearn\ensemble\weight_boosting.py", line 389, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "C:\Users\app\Anaconda\lib\site-packages\sklearn\ensemble\weight_boosting.py", line 99, in fit
    X = np.ascontiguousarray(array2d(X), dtype=DTYPE)
  File "C:\Users\app\Anaconda\lib\site-packages\numpy\core\numeric.py", line 408, in ascontiguousarray
    return array(a, dtype, copy=False, order='C', ndmin=1)
ValueError: setting an array element with a sequence.

The following is my code, adapted from the example on the scikit-learn website:

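import numpy as np
import pylab as pl
from skimage import color, io
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

f = open("PATH_TO_SAMPLES\\samples.txt",'r')
out = f.read().splitlines()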

imgs = []
tmp_hogs = []
# 13 of the images are with vehicles, 4 are without
labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0]

for file in out:
        filepath = "C:\PATH_TO_SAMPLE_IMAGES\\" + file
        curr_img = color.rgb2gray(io.imread(filepath))
        imgs.append(resize(curr_img,(60,40)))
        fd, hog_image = hog(curr_img, orientations=8, pixels_per_cell=(16, 16),
                 cells_per_block=(1, 1), visualise=True)
        tmp_hogs.append(fd) 

img_hogs = np.array(tmp_hogs)
n_split = 10
X_train, X_test = img_hogs[:n_split], img_hogs[n_split:] # all first ten images with vehicles
y_train, y_test = labels[:n_split], labels[n_split:] # 3 images with vehicles, 4 without

#now all the code below is straight off the example on scikit-learn's website

bdt_real = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),
    n_estimators=600,
    learning_rate=1)

bdt_discrete = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),
    n_estimators=600,
    learning_rate=1.5,
    algorithm="SAMME")

bdt_real.fit(X_train, y_train)
bdt_discrete.fit(X_train, y_train)

real_test_errors = []
discrete_test_errors = []

for real_test_predict, discrete_train_predict in zip(
        bdt_real.staged_predict(X_test), bdt_discrete.staged_predict(X_test)):
    real_test_errors.append(
        1. - accuracy_score(real_test_predict, y_test))
    discrete_test_errors.append(
        1. - accuracy_score(discrete_train_predict, y_test))

n_trees = xrange(1, len(bdt_discrete) + 1)

pl.figure(figsize=(15, 5))

pl.subplot(131)
pl.plot(n_trees, discrete_test_errors, c='black', label='SAMME')
pl.plot(n_trees, real_test_errors, c='black',
        linestyle='dashed', label='SAMME.R')
pl.legend()
pl.ylim(0.18, 0.62)
pl.ylabel('Test Error')
pl.xlabel('Number of Trees')

pl.subplot(132)
pl.plot(n_trees, bdt_discrete.estimator_errors_, "b", label='SAMME', alpha=.5)
pl.plot(n_trees, bdt_real.estimator_errors_, "r", label='SAMME.R', alpha=.5)
pl.legend()
pl.ylabel('Error')
pl.xlabel('Number of Trees')
pl.ylim((.2,
        max(bdt_real.estimator_errors_.max(),
            bdt_discrete.estimator_errors_.max()) * 1.2))
pl.xlim((-20, len(bdt_discrete) + 20))

pl.subplot(133)
pl.plot(n_trees, bdt_discrete.estimator_weights_, "b", label='SAMME')
pl.legend()
pl.ylabel('Weight')
pl.xlabel('Number of Trees')
pl.ylim((0, bdt_discrete.estimator_weights_.max() * 1.2))
pl.xlim((-20, len(bdt_discrete) + 20))

# prevent overlapping y-axis labels
pl.subplots_adjust(wspace=0.25)
pl.show()

Edit

I typed

print tmp_hogs

and the output was this:

[ array([ 0.27621208,  0.11038658,  0.10698133, ...,  0.08661556,        0.04612063,  0.0280782 ]), 
        array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -1.29909838e-15,  -7.01780982e-17,  -1.24900943e-15]), 
        array([ 0.0503603 ,  0.1497235 ,  0.2372957 , ...,  0.07249325, 0.04545541,  0.00903818]), 
        array([ 0.27299191,  0.13122109,  0.0719268 , ...,  0.0848522 ,  0.04789403,  0.01387038]), 
        array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ...,  3.32140617e-17,  -6.58924128e-17,  -6.23567224e-16]), 
        array([ 0.37431874,  0.18094303,  0.01219871, ...,  0.06501856, 0.04855516,  0.02439321]), 
        array([ 0.41087302,  0.16478851,  0.03396399, ...,  0.09511273, 0.04077713,  0.03945513]), 
        array([ 0.17753915,  0.07025565,  0.09136909, ...,  0.03396507, 0.01379266,  0.01645722]), 
        array([ 0.40605587,  0.05915388,  0.03767763, ...,  0.08981079, 0.05452031,  0.01725399]), 
        array([ 0.        ,  0.        ,  0.        , ...,  0.00579303, 0.02053979,  0.0019091 ]), 
        array([ 0.31550735,  0.11988131,  0.07716529, ...,  0.09815158, 0.03058497,  0.02236517]), 
        array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -3.51175682e-16,   1.31619418e-03,   2.86127901e-16]), 
        array([ 0.21381704,  0.22352378,  0.11568828, ...,  0.06311083, 0.02696666,  0.00402261]), 
        array([ 0.17480064,  0.1469145 ,  0.16336016, ...,  0.05614001, 0.03244093,  0.00524034]), 
        array([ 0.        ,  0.        ,  0.        , ...,  0.03089959, 0.00509584,  0.00247698]), 
        array([ 0.04711166,  0.0218663 ,  0.05316   , ...,  0.04214594, 0.04892439,  0.25840958]), 
        array([ 0.05357464,  0.00530857,  0.07162301, ...,  0.06802692, 0.08331959,  0.26619977])]

Then I ran

print img_hogs

and the output was:

[ array([ 0.27621208,  0.11038658,  0.10698133, ...,  0.08661556, 0.04612063,  0.0280782 ])
 array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -1.29909838e-15,  -7.01780982e-17,  -1.24900943e-15])
 array([ 0.0503603 ,  0.1497235 ,  0.2372957 , ...,  0.07249325, 0.04545541,  0.00903818])
 array([ 0.27299191,  0.13122109,  0.0719268 , ...,  0.0848522 , 0.04789403,  0.01387038])
 array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., 3.32140617e-17,  -6.58924128e-17,  -6.23567224e-16])
 array([ 0.37431874,  0.18094303,  0.01219871, ...,  0.06501856, 0.04855516,  0.02439321])
 array([ 0.41087302,  0.16478851,  0.03396399, ...,  0.09511273, 0.04077713,  0.03945513])
 array([ 0.17753915,  0.07025565,  0.09136909, ...,  0.03396507, 0.01379266,  0.01645722])
 array([ 0.40605587,  0.05915388,  0.03767763, ...,  0.08981079, 0.05452031,  0.01725399])
 array([ 0.        ,  0.        ,  0.        , ...,  0.00579303, 0.02053979,  0.0019091 ])
 array([ 0.31550735,  0.11988131,  0.07716529, ...,  0.09815158, 0.03058497,  0.02236517])
 array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00, ..., -3.51175682e-16,   1.31619418e-03,   2.86127901e-16])
 array([ 0.21381704,  0.22352378,  0.11568828, ...,  0.06311083, 0.02696666,  0.00402261])
 array([ 0.17480064,  0.1469145 ,  0.16336016, ...,  0.05614001, 0.03244093,  0.00524034])
 array([ 0.        ,  0.        ,  0.        , ...,  0.03089959, 0.00509584,  0.00247698])
 array([ 0.04711166,  0.0218663 ,  0.05316   , ...,  0.04214594, 0.04892439,  0.25840958])
 array([ 0.05357464,  0.00530857,  0.07162301, ...,  0.06802692, 0.08331959,  0.26619977])]
  • Quite independently of the error: 17 samples is decidedly not enough to do anything meaningful. Why don't you download a standard image database and try it on that? An easy, well organized one is Caltech101. Commented Apr 11, 2014 at 11:02
  • Could you show what tmp_hogs and what img_hogs looks like? Commented Apr 11, 2014 at 12:52
  • Sure! I've edited the question to include the outputs at the end. Commented Apr 11, 2014 at 13:21
  • The second output doesn't look right (it is a copy of the first). It should say array([... ... ... ...], dtype= ...) Commented Apr 11, 2014 at 13:52
  • I just tried img_hogs = np.array(tmp_hogs, dtype =float), but it gave the same error, and on this line in fact. Commented Apr 11, 2014 at 14:25
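
A quick way to check what the comments above are hinting at: the printed img_hogs looks like a one-dimensional array whose elements are themselves arrays, which is usually what happens when the rows do not all have the same length. The sketch below counts descriptor lengths before any conversion to a 2-D array. The helper name descriptor_lengths and the sample lengths are made up for illustration; in the question's code you would pass tmp_hogs instead.

```python
from collections import Counter


def descriptor_lengths(descriptors):
    """Return a Counter mapping descriptor length -> number of descriptors."""
    return Counter(len(d) for d in descriptors)


# Hypothetical stand-ins for HOG descriptors; the real ones come from hog().
sample = [[0.0] * 12, [0.0] * 20, [0.0] * 12]

lengths = descriptor_lengths(sample)

# A rectangular (n_samples, n_features) matrix needs exactly one distinct length.
is_rectangular = len(lengths) == 1

print(dict(lengths))
print(is_rectangular)
```

If more than one length shows up, the list cannot be turned into a regular float matrix, which would be consistent with the "setting an array element with a sequence" error.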

1 Answer


Try preallocating the feature matrix and filling it row by row:

imgs = []
tmp_hogs = np.zeros((17, 256))
# 13 of the images are with vehicles, 4 are without
labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0]

i = 0
for file in out:
        filepath = "C:\PATH_TO_SAMPLE_IMAGES\\" + file
        curr_img = color.rgb2gray(io.imread(filepath))
        imgs.append(resize(curr_img,(60,40)))
        fd, hog_image = hog(curr_img, orientations=8, pixels_per_cell=(16, 16),
                 cells_per_block=(1, 1), visualise=True)
        tmp_hogs[i,:] = fd
        i+=1

img_hogs = tmp_hogs

3 Comments

Just tried this, but on the line tmp_hogs[i,:]=fd, I get the following error: ValueError: could not broadcast input array from shape (1728) into shape (256). So I adjusted the tmp_hogs declaration and gave it 1728 columns, then I got the same error again, this time saying it couldn't "broadcast from 1728 to 2080". So I'm guessing this means that my tmp_hogs doesn't have the same number of columns in each row? But I resized all images to 60,40! So how could this be?
Limit the number of HOG features; it's not consistent.
How so? I thought resizing the image to the same size would limit this.
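
One likely explanation, going only by the code shown: hog is called on curr_img, the unresized image, while the resized copy is only appended to imgs. If that is the case, the descriptor length follows the original image size, not 60x40. The sketch below estimates the HOG length from an image shape, assuming the descriptor has one histogram of `orientations` bins per cell in every block and that partial cells at the border are dropped. The function name hog_feature_length and the two larger image sizes are hypothetical; the actual sizes of the sample images are not given in the question.

```python
def hog_feature_length(image_shape, orientations=8,
                       pixels_per_cell=(16, 16), cells_per_block=(1, 1)):
    """Estimate the flattened HOG length for a 2-D image of the given shape."""
    rows, cols = image_shape
    # Whole cells that fit along each axis (partial cells are assumed dropped).
    n_cells_row = rows // pixels_per_cell[0]
    n_cells_col = cols // pixels_per_cell[1]
    # Blocks slide over the cell grid one cell at a time.
    n_blocks_row = n_cells_row - cells_per_block[0] + 1
    n_blocks_col = n_cells_col - cells_per_block[1] + 1
    return (n_blocks_row * n_blocks_col
            * cells_per_block[0] * cells_per_block[1] * orientations)


resized_length = hog_feature_length((60, 40))    # the resize target in the question
example_a = hog_feature_length((192, 288))       # hypothetical original size
example_b = hog_feature_length((208, 320))       # hypothetical original size

print(resized_length)
print(example_a)
print(example_b)
```

Under these assumptions a 60x40 image gives 48 features with the question's parameters, while the two hypothetical sizes give 1728 and 2080, the lengths reported in the error messages. Passing the resized image to hog instead of curr_img should therefore make every row the same length.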
