How Should The Interquartile Range Be Calculated In Python?
Solution 1:
NumPy 1.9 added a handy 'interpolation' argument to numpy.percentile that lets you get to 4 (in NumPy 1.22 and later the same argument is named 'method').
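import numpy
a = numpy.array([1, 2, 3, 4, 5, 6, 7])
# 'higher' picks 6 for the 75th percentile, 'lower' picks 2 for the 25th, so the result is 4
numpy.percentile(a, 75, interpolation='higher') - numpy.percentile(a, 25, interpolation='lower')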
Solution 2:
You have 7 numbers which you are attempting to split into quartiles. Because 7 is not divisible by 4, there are a couple of different ways to do this, as mentioned here.
Your way is the first given by that link; Wolfram Alpha seems to be using the third. NumPy is doing basically the same thing as Wolfram Alpha, but it's interpolating based on percentiles (as shown here) rather than quartiles, so it's getting a different answer. You can choose how NumPy handles this using the interpolation option (I tried to link to the documentation but apparently I'm only allowed two links per post).
You'll have to choose which definition you prefer for your application.
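To see how much the choice of definition matters, here is a minimal sketch using the standard library's statistics.quantiles (Python 3.8+) rather than NumPy. The helper name iqr is made up for this illustration. On the 7 numbers from the question, the "exclusive" method gives 4 and the "inclusive" method gives 3, which is also what NumPy's default linear interpolation yields for this data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]

def iqr(values, method):
    # Hypothetical helper: with n=4, statistics.quantiles returns
    # the three cut points [Q1, median, Q3].
    q1, _, q3 = statistics.quantiles(values, n=4, method=method)
    return q3 - q1

iqr_exclusive = iqr(data, "exclusive")  # Q1 = 2.0, Q3 = 6.0
iqr_inclusive = iqr(data, "inclusive")  # Q1 = 2.5, Q3 = 5.5
print(iqr_exclusive, iqr_inclusive)     # prints: 4.0 3.0
```

Neither answer is wrong; they just follow different conventions, so pick one and use it consistently.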
Solution 3:
Not perfect but these functions should approximate it:
def quartile_1(l):
    return sorted(l)[int(len(l) * .25)]

def median(l):
    # // keeps the index an integer on Python 3
    return sorted(l)[len(l) // 2]

def quartile_3(l):
    return sorted(l)[int(len(l) * .75)]
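To go from these approximations to the interquartile range itself, subtract the first quartile from the third. A rough sketch, assuming a non-empty list (approx_iqr is a name invented here, and no interpolation is done, so results on small or even-length inputs may differ from NumPy's):

```python
def approx_iqr(values):
    # Sort once, then pick the elements roughly 25% and 75% of the way in.
    ordered = sorted(values)
    n = len(ordered)
    q1 = ordered[int(n * 0.25)]
    q3 = ordered[int(n * 0.75)]
    return q3 - q1

print(approx_iqr([1, 2, 3, 4, 5, 6, 7]))  # prints: 4
```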