3

I'm trying to calculate the autocorrelation time for a time series generated from molecular-dynamics trajectory frames. I used several methods to estimate the autocorrelation time because I need it to compute the statistical error of an observable. My first attemp was block averaging. The block-averaging plot for the standard error of the mean (SEM) shows no clear plateau. The code I used for block-averaging:

sem=[]
            
            #return sems
            for i in range(1,1000):
                block_size.append(i)
                block_mean=[]
                no_blocks=0
                for l in range(0, len(nums),i):
                    if len(nums[l:l+i])==i:
                        block_mean.append(np.mean(nums[l:l+i]))
                        no_blocks=no_blocks+1
                sem.append(np.std(block_mean)/np.sqrt(no_blocks))
                

Example output (block-averaging SEM vs block size):enter image description here My question: If there is no clear plateau in the SEM from block averaging, is it still meaningful to attempt to estimate the statistical error for these data? If so, what methods/precautions should I use to get a reliable error estimate (or a conservative bound) from a single long trajectory of Molecular Dynamics frames?

I tried using other methods such as statsmodel.tsa.stattools.acf for autocorrelation time calculation the error I end up getting is 0.0008712 which seems a bit unrealistic in my opinion. Is there a different way to calculate a more accurate error for these data?

I used the following code to calculate error using statsmodel.tsa.stattools.acf,

def tau_cal(acf_out,nums):
    positive_acf =np.where(acf_out > 0.0000)[0]
                
    result = [acf_out[x] for x in positive_acf]
    tau = 1 + 2 * np.sum(result[1:-1])
    Neff = len(nums)/tau
    error = (statistics.stdev(nums))/np.sqrt(Neff)
    return error
time = statsmodel.tsa.stattools.acf(nums)
error = tau_cal(acf_out,nums)
2
  • I don't quite follow the calculation you're doing. It sounds like you're trying to divide your time series up into blocks with N samples apiece, take the mean of those blocks individually, then compute summary statistics on the block means. However, the code you posted won't do that. e.g. If you have 100 numbers, and a block size of 5, the code you posted will use statistics.stdev(), which will divide variance by 100, rather than dividing it by 20, which will lead an underestimate in your block variance calculation. Commented Oct 22 at 6:46
  • 1
    I'm sorry about the confusion. I have attached the block averaging code in the question now. Commented Oct 22 at 13:22

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.