Python: Statistics T-test

Question

I am using Python 3.6 to run some statistical tests on a data set. What I am trying to accomplish is to run a t-test between the data set and the trend line to determine the statistical significance. I am using scipy to do this; however, I am not sure which variables I should pass to the test to get the outcome I need.

Here is my code so far:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

p = np.load('data.npy')

# Index 0 corresponds to the year 1901
start = 0
end = 100
year1 = 1901

# Mean of each year's data, ignoring NaNs
annualmean = []
for n in range(start, end):
    annualmean.append(np.nanmean(p[n]))

# Trend line plot
a = np.arange(start, end)  # NumPy array so that slope*a works below
plt.figure()
plt.plot(a, annualmean, '-')
slope, intercept, r_value, p_value, std_err = stats.linregress(a, annualmean)
plt.plot(a, intercept + slope*a, 'r')

# The t-test call in question
print(stats.ttest_ind(annualmean, a))

Right now the code runs with no error messages; however, I am getting an incredibly small p-value that I don't think is correct. If anyone knows which variables I should pass to the t-test, that would be very helpful. Thanks!

Tkanno · Accepted Answer · 2017-06-23 16:42:35Z

1

I don't have the reputation to comment, but according to your code, you are running a t-test that compares the mean of the annual-mean data with the mean of the integers 0 to 99 (the array a). scipy.stats.ttest_ind takes two arrays of independent samples and tests whether their means differ.

According to the documentation:

scipy.stats.ttest_ind(a, b, axis=0, equal_var=True)

Parameters: 
a, b : array_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).

As an additional note, it doesn't make sense to do a t-test between a trend line and your raw data, but that is a question for another forum.
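To illustrate why the p-value in the question comes out so small, here is a minimal sketch. The data are synthetic stand-ins (the real data.npy is not available), and the names years and annual_mean are hypothetical. The test compares the mean of the data with the mean of the index values 0 to 99, which is 49.5, so the two means are far apart regardless of any trend.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # seeded so the sketch is repeatable

# Hypothetical stand-in for the annual means: values near 10-12 with a
# slight upward trend plus noise (an assumption, not the asker's data).
years = np.arange(100)
annual_mean = 10.0 + 0.02 * years + rng.normal(0.0, 0.5, size=years.size)

# Same call as in the question: this compares the mean of the data
# (roughly 11) with the mean of the index values 0..99 (exactly 49.5).
result = stats.ttest_ind(annual_mean, years)
print(result.statistic, result.pvalue)
```

The tiny p-value here only says that the data values and the year indices have different means. It says nothing about whether the trend is significant.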



1 Comment

Josef Over a year ago

Note, the two arrays don't need to have the same length for ttest_ind. See the except clause in the docstring.
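A quick way to check this is a small sketch with two made-up samples of different lengths (the values of x and y below are invented purely for illustration):

```python
import numpy as np
from scipy import stats

# Two hypothetical independent samples with different lengths (5 and 8).
x = np.array([5.1, 4.9, 5.3, 5.0, 5.2])
y = np.array([5.8, 6.1, 5.9, 6.0, 6.2, 5.7, 6.3, 6.0])

# ttest_ind accepts them as-is; for 1-D input the lengths may differ.
res = stats.ttest_ind(x, y)
print(res.statistic, res.pvalue)
```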

CPG · Accepted Answer · 2017-07-07 18:06:20Z

0

So it turns out I was confused about how to test the statistical significance. I had already computed a p-value for the data in this line:

slope, intercept, r_value, p_value, std_err = stats.linregress(a,annualmean)

All I needed to do was: print(p_value)
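For anyone reading later, here is a minimal sketch of the same idea on synthetic data (the array names and the generated values are assumptions, not my real data). The p-value returned by linregress is for the null hypothesis that the slope is zero, which is the trend significance I was after.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # seeded so the sketch is repeatable

# Hypothetical annual means: a true slope of 0.02 per year plus noise.
years = np.arange(100)
annual_mean = 10.0 + 0.02 * years + rng.normal(0.0, 0.5, size=years.size)

# linregress returns the fitted slope together with a two-sided p-value
# for the null hypothesis that the slope is zero.
fit = stats.linregress(years, annual_mean)
print(fit.slope, fit.pvalue)
```

A small fit.pvalue here indicates the trend is unlikely to be zero. The result can also be unpacked into five values, as in the line from my original code.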

