How To Calculate P-value For Two Lists Of Floats?
So I have lists of floats, like [1.33, 2.555, 3.2134, 4.123123], etc. Those lists are mean frequencies of something. How do I prove that two lists are different? I thought about calcul
Solution 1:
Let's say you have several lists of floats, stored in a dictionary like this:
>>> data = {
...     'a': [0.9, 1.0, 1.1, 1.2],
...     'b': [0.8, 0.9, 1.0, 1.1],
...     'c': [4.9, 5.0, 5.1, 5.2],
... }
Clearly, a is very similar to b, but both are different from c.
There are two kinds of comparisons you may want to do.
- Pairwise: Is a similar to b? Is a similar to c? Is b similar to c?
- Combined: Are a, b and c drawn from the same group? (This is generally a better question)
The former can be achieved using independent t-tests as follows:
>>> from itertools import combinations
>>> from scipy.stats import ttest_ind
>>> for list1, list2 in combinations(data.keys(), 2):
...     t, p = ttest_ind(data[list1], data[list2])
...     print list1, list2, p  # Python 2 syntax; on Python 3 use print(list1, list2, p)
...
a c 9.45895002589e-09
a b 0.315333596201
c b 8.15963804843e-09
This provides the relevant p-values and implies that a and c are different, and b and c are different, but a and b may be similar.
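To see where those p-values come from, here is a minimal sketch of the statistic behind the test, assuming the default equal-variance (Student's) form that ttest_ind uses unless told otherwise. The helper name pooled_t_statistic is hypothetical, and the sketch stops at the t statistic: converting it to a p-value needs the t distribution, which is what scipy supplies.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(x, y):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    nx, ny = len(x), len(y)
    # variance() is the sample variance, i.e. it divides by n - 1.
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    # Difference of means divided by its standard error.
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))

data = {
    'a': [0.9, 1.0, 1.1, 1.2],
    'b': [0.8, 0.9, 1.0, 1.1],
    'c': [4.9, 5.0, 5.1, 5.2],
}

t_ab = pooled_t_statistic(data['a'], data['b'])
t_ac = pooled_t_statistic(data['a'], data['c'])
print(t_ab, t_ac)
```

The a/b statistic is close to 1 while the a/c statistic is far from 0, which matches the large and tiny p-values above. Both have 4 + 4 - 2 = 6 degrees of freedom.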
The latter can be achieved using the one-way ANOVA as follows:
>>> from scipy.stats import f_oneway
>>> t, p = f_oneway(*data.values())
>>> p
7.959305946160327e-12
The p-value indicates that a, b, and c are unlikely to be from the same population.
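For intuition, the F statistic that a one-way ANOVA is built on can be sketched by hand: it is the ratio of the variation between group means to the variation within groups. The helper name one_way_f_statistic is hypothetical, and this only illustrates the textbook formula. It does not compute the p-value, which requires the F distribution.

```python
from statistics import mean

def one_way_f_statistic(groups):
    """F statistic for a one-way ANOVA over a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

data = {
    'a': [0.9, 1.0, 1.1, 1.2],
    'b': [0.8, 0.9, 1.0, 1.1],
    'c': [4.9, 5.0, 5.1, 5.2],
}

f_stat = one_way_f_statistic(list(data.values()))
print(f_stat)
```

A value far above 1 means the group means differ much more than the scatter inside each group would explain, which is why the p-value here is so small.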