
Given a sample $X_1,\ldots,X_n$, how can I test the hypothesis that these are i.i.d. samples from a fixed (unknown) distribution?

To add context, assume this is a time series and I want evidence against things like periodicity, short-term dependence, predictability of $X_{i+1}$ from $X_1,\ldots, X_i$, or any "structural" bias.

I want to draw conclusions from the empirical distribution, $p_v=|\{j : X_j=v\}|/n$, but this seems meaningless without some evidence for i.i.d.

I know this is a broad, perhaps unanswerable question, but I'm looking for ways to support an i.i.d. assumption for temporal measurements of a physical process.


Ideas I've had, found, or been pointed to are ad hoc even in their limited scope. For instance:

  • $\chi^2$ tests for independence between various subsets, e.g. independence of even and odd index values or first and last halves of the data.
  • Tests stemming from autocorrelation, e.g. the Ljung–Box test.
  • Some test from spectral analysis, but I don't know of any.
  • Wald–Wolfowitz runs test.
  • Kolmogorov–Smirnov test after partitioning the data in some fashion. (A rough sketch of this and the $\chi^2$ idea appears after the list.)
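
To make a couple of these concrete, here is the kind of thing I have in mind in Python (the half/half split and the median-based binning are arbitrary choices on my part, and `x` just stands in for the actual measurements):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=500)              # placeholder for the actual measurements

# Kolmogorov-Smirnov: compare the first and second halves of the series
first, second = x[: len(x) // 2], x[len(x) // 2 :]
ks_stat, ks_p = stats.ks_2samp(first, second)
print(f"KS (first vs. second half): stat={ks_stat:.3f}, p={ks_p:.3f}")

# Chi-squared independence: bin consecutive pairs (x[2i], x[2i+1]) at the median
median = np.median(x)
even, odd = x[0::2], x[1::2]
m = min(len(even), len(odd))
table = np.zeros((2, 2), dtype=int)
for a, b in zip(even[:m] > median, odd[:m] > median):
    table[int(a), int(b)] += 1
chi2_res = stats.chi2_contingency(table)
print("chi-squared (even vs. odd neighbours): p =", chi2_res[1])
```
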
  • Unless you suppose particular forms of non-iid-ness, this is problematic. E.g., if you have independence, you could have a distribution that fits any set of data (the empirical cdf). If data arrive over time (or space, or along with some variable or collection of variables) you might suppose that there is some change over time (etc.). E.g., if location changes from time 1 to time 2, you might see bimodality. But bimodality doesn't of itself suggest non-iid. Similarly, you might see dependence over time, but how do you identify completely general dependence from a single $n$-dimensional data point?
  • If you have some model/assumptions and/or a particular structure of non-iid-ness you anticipate / need to be able to pick up, then you can get somewhere. Can you give more context for what prompts the question?
  • This question is similar to: What are some statistical tests for exchangeability of a data set? If you believe it's different, please edit the question, make it clear how it's different and/or how the answers on that question are not helpful for your problem.
  • @Ben Although these are indeed strongly related, "i.i.d." and "exchangeability" are usually used under different statistical paradigms, and therefore it seems worthwhile to me to have both these questions open. It will not be obvious to many at first sight that these are so similar, if not even equal in a certain sense.
  • Instead of "Given samples $X_1,\ldots,X_n,$" I would have said "Given a sample $X_1,\ldots,X_n.$"

2 Answers


First, it's important to understand that no test can confirm the i.i.d. assumption. All a test can do is indicate evidence against its null hypothesis (H0) or fail to do so; the latter does not mean that the H0 (e.g., "the data are i.i.d.") is true.

The i.i.d. assumption can only be tested against certain specific alternatives, i.e., any test requires a specification of the way in which i.i.d. might be violated. Also, one wouldn't test "i.i.d." as a whole, but rather test independence and identical distributions separately.

Here are some tests:

The runs test tests independence against dependence alternatives under which high values (and likewise low values) are more likely to occur in runs than they would in a random sequence. The opposite alternative (after a high value, a low value is more likely to follow, and the other way round) can also be detected in this way.
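
For illustration, here is a minimal hand-rolled version of the two-sided runs test based on the usual normal approximation, dichotomising at the median (`x` is just a placeholder series; in practice a packaged implementation would do as well):

```python
import numpy as np
from scipy import stats

def runs_test(x):
    """Two-sided Wald-Wolfowitz runs test using the normal approximation."""
    x = np.asarray(x, dtype=float)
    high = x > np.median(x)                         # dichotomise at the median
    n1, n2 = int(high.sum()), int((~high).sum())
    runs = 1 + int(np.sum(high[1:] != high[:-1]))   # observed number of runs
    # mean and variance of the number of runs under independence
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    z = (runs - mu) / np.sqrt(var)
    # too few runs -> values cluster; too many runs -> values alternate
    return z, 2 * stats.norm.sf(abs(z))

rng = np.random.default_rng(1)
z, p = runs_test(rng.normal(size=300))
print(f"z = {z:.3f}, p = {p:.3f}")
```
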

Another possibility would be to fit time series models and to test whether all the parameters that produce dependence/correlation between observations are zero.
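
For example (a sketch that assumes an AR(1) structure is the alternative of interest; the series `x` and the lag choices are placeholders, and `statsmodels` is just one convenient option):

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(2)
x = rng.normal(size=400)          # placeholder for the observed series

# Fit an AR(1) model; the p-value of the lag coefficient tests H0: coefficient = 0
res = AutoReg(x, lags=1).fit()
print(res.params)                 # [intercept, lag-1 coefficient]
print(res.pvalues)

# Ljung-Box portmanteau test that the first 10 autocorrelations are all zero
print(acorr_ljungbox(x, lags=[10]))
```
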

Note that both of these tests test dependence patterns that manifest themselves through the time order of observations. If you hypothesise that certain given groups of observations are positively dependent, you could fit a random effect or mixed effects model and test whether the random effect (that models dependence) is actually a constant.
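
A sketch of that idea, assuming some grouping variable `g` is given that encodes the suspected dependence. Note that the likelihood ratio comparison below is non-standard because the variance component lies on the boundary of its parameter space, which is why the chi-squared p-value is halved:

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "y": rng.normal(size=200),            # placeholder observations
    "g": np.repeat(np.arange(20), 10),    # hypothetical grouping variable
})

# Random intercept per group (models the dependence) vs. plain i.i.d. errors
m1 = smf.mixedlm("y ~ 1", df, groups=df["g"]).fit(reml=False)
m0 = smf.ols("y ~ 1", df).fit()

# Likelihood ratio statistic; the variance component is on the boundary under H0,
# so the usual reference is a 50:50 mixture of chi2(0) and chi2(1),
# i.e. the chi2(1) p-value is halved.
lr = max(2 * (m1.llf - m0.llf), 0.0)
p = 0.5 * stats.chi2.sf(lr, df=1)
print(f"LR = {lr:.3f}, approximate p = {p:.3f}")
```
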

Note also that general dependence patterns can be so complex that they cannot be ruled out by any amount of data, and even some quite simple dependence structures cannot be identified from the data. You always need a specific idea of how the dependence is supposed to work.

The situation with "the other i", i.e., the "identical distribution" part, is similar. You may fit a regression model in which your observations are modelled as functions of some covariates (generating different distributions for $X_i$ with different covariate values), and then test whether the regression (slope) coefficients are zero (i.e., only a constant plus an i.i.d. error term remains). There are also tests for heteroscedasticity, i.e., variance growing or shrinking over time or depending on some covariate, which is obviously also a specific violation of "identity" of distributions (googling finds more heteroscedasticity tests than those mentioned in the link). As in the independence case, general structures of non-identity can be so flexible that they cannot be detected from the data (to give a rather bizarre example, a model that says that whatever is observed at a certain time point has to be observed there with probability 1 can never be rejected).
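
A sketch of both ideas, using the time index as the only covariate (just one possible choice); the Breusch–Pagan test is one of the heteroscedasticity tests meant above:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(4)
x = rng.normal(size=400)          # placeholder for the observed series
t = np.arange(len(x))

# Regress the observations on time; the slope's t-test probes a drift in the mean
X = sm.add_constant(t)
ols = sm.OLS(x, X).fit()
print("slope:", ols.params[1], "p-value:", ols.pvalues[1])

# Breusch-Pagan: does the residual variance change with time?
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(ols.resid, X)
print(f"Breusch-Pagan p-value: {lm_p:.3f}")
```
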

Unfortunately, if you want to test the i.i.d. assumption in order to make sure that your data truly are (at least approximately) i.i.d., you will be sabotaged by the Misspecification Paradox, which says that misspecification testing of model assumptions actively violates those assumptions even if they were not violated before. Regarding independence in particular, the annoying fact is that if you have a truly independent sequence but only apply a method assuming i.i.d. in case an independence test does not reject independence, then the observations, conditionally on not rejecting independence, are actually dependent! Say you use the runs test and, one observation before the end of the sequence, the result is borderline significant; under the condition of non-rejection, you know that the last observation needs to come in so that the result is pulled back to insignificance. This creates dependence between observations, as the last observation cannot vary freely (it needs to ensure that the test doesn't reject).

So testing i.i.d. in order to make sure your observations are fine for a certain method is potentially problematic even as far as it can be done (although arguably not detecting big and critical problems with i.i.d. may be more harmful than the misspecification paradox).

To some extent assuming i.i.d. is always a judgement call.


This is just a remark too long to fit in the space for a comment, and not a full answer.

I certainly agree with @ChristianHennig's very good answer (+1) that one cannot ever expect to truly test the i.i.d. assumption. It has to be taken on "faith".

Proper random sampling may give one a presumption of "independence".

It also should be noted that i.d. is a question of perspective. For example, we all know that the distribution of heights in a population is different for males and females. Say now that I randomly sample from this population and aggregate the data regardless of gender. The two distributions are close enough that, depending on the "altitude" from which I look at the problem, it may be fine to consider the data as i.d., or not... And I am not even talking about differences between age groups, ethnic origins, etc. So is the height of a given population a single distribution, or a mixture? Yes to both, and it is up to the researcher to see if (and how much) this matters for their specific purpose.

Now, you seem to be mostly interested in time series behaviors. Assuming that your data-generating process is stationary over time (yes, checking an assumption by introducing another one...), you may be interested in Shewhart's control charts. They are routinely used in statistical process control, with great success. They are based on various rules (e.g. here) to detect non-random values (non-independence) or "special causes" (i.e. non-i.d. values). You may need to tweak the rules to suit your particular needs (the rules are heuristics, so feel free to not use some, modify others, and add a few of your own choosing).
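
For illustration, here is a minimal sketch of an individuals chart with two of the common rules (points outside 3-sigma limits computed from the average moving range, and eight consecutive points on the same side of the center line); the data and the choice of rules are placeholders, so adapt as needed:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=200)                    # placeholder for the observed series

center = x.mean()
mr_bar = np.abs(np.diff(x)).mean()          # average moving range of size 2
sigma = mr_bar / 1.128                      # d2 constant for moving ranges of size 2
ucl, lcl = center + 3 * sigma, center - 3 * sigma

# Rule 1: any point outside the 3-sigma control limits
rule1 = np.flatnonzero((x > ucl) | (x < lcl))

# Rule 2: eight or more consecutive points on the same side of the center line
side = np.sign(x - center)
rule2, run = [], 1
for i in range(1, len(x)):
    run = run + 1 if side[i] == side[i - 1] and side[i] != 0 else 1
    if run >= 8:
        rule2.append(i)

print("points outside control limits:", rule1)
print("end points of runs of length >= 8 on one side:", rule2)
```
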

Last, I will again defer to @ChristianHennig's answer to let you decide if using such "tests" before considering data i.i.d. is a good idea or not...
