
Wrong Decimal Calculations With Pandas

I have a data frame (df) in pandas with four columns and I want a new column to represent the mean of these four columns: df['mean'] = df.mean(1) 1 2 3 4 mean NaN NaN

Solution 1:

You could use the float_format parameter of to_csv to control how floats are written:

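import io

import pandas as pd

# Sample data from the question, including the precomputed mean column.
content = '''\
1    2    3    4   mean
NaN  NaN  NaN  NaN      NaN
5.9  5.4  2.4  3.2    4.225
0.6  0.7  0.7  0.7    0.675
2.5  1.6  1.5  1.2    1.700
0.4  0.4  0.4  0.4    0.400'''

# On Python 3, content is a str, so it must be wrapped in io.StringIO (not io.BytesIO).
df = pd.read_table(io.StringIO(content), sep=r'\s+')
# '%g' drops trailing zeros and keeps up to six significant digits.
df.to_csv('/tmp/test.csv', float_format='%g', index=False)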

yields

1,2,3,4,mean
,,,,
5.9,5.4,2.4,3.2,4.225
0.6,0.7,0.7,0.7,0.675
2.5,1.6,1.5,1.2,1.7
0.4,0.4,0.4,0.4,0.4
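
To check the effect of float_format without writing a file, here is a minimal sketch that rebuilds two rows of the question's data and asks to_csv for a string. The values come from the question; the variable names (df, csv_text) are only illustrative, and it assumes a pandas version where to_csv returns a string when no path is given.

```python
import pandas as pd

# Two rows from the question; column labels are strings to match the CSV header.
df = pd.DataFrame(
    {
        "1": [5.9, 0.6],
        "2": [5.4, 0.7],
        "3": [2.4, 0.7],
        "4": [3.2, 0.7],
    }
)

# Row-wise mean, as in the question (axis=1 is the same as df.mean(1)).
df["mean"] = df.mean(axis=1)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(float_format="%g", index=False)
print(csv_text)
```

Note that '%g' keeps at most six significant digits, so values that need more precision would be shortened; a format such as '%.3f' fixes the number of decimals instead.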

Solution 2:

The results are correct. Most decimal fractions, such as 5.9, cannot be represented exactly in binary floating point, so small differences in the last digits are expected. See The Floating-Point Guide for details.

>>> a = 5.9+5.4+2.4+3.2
>>> a / 4
4.2250000000000005

As you said, you could always format the results if you only want a fixed number of digits after the decimal point.

>>> "{:.3f}".format(a/4)
'4.225'
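
To see where the extra digits come from, here is a small sketch using only the standard library. It is an illustration, not part of the original answer: the decimal module and the names stored, exact_mean and float_mean are introduced here for demonstration.

```python
import math
from decimal import Decimal

# Same float arithmetic as in the session above.
a = 5.9 + 5.4 + 2.4 + 3.2
float_mean = a / 4

# Decimal(float) exposes the exact binary value actually stored for 5.9,
# which is close to, but not equal to, the decimal number 5.9.
stored = Decimal(5.9)

# Building Decimals from strings keeps the arithmetic in exact decimal.
exact_mean = (Decimal("5.9") + Decimal("5.4") + Decimal("2.4") + Decimal("3.2")) / 4

print(stored)
print(exact_mean)
# Compare floats with a tolerance rather than with ==.
print(math.isclose(float_mean, 4.225))
```

For display purposes formatting is usually enough; exact decimal arithmetic is only worth the extra cost when the stored values themselves must be exact.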
