Python Distribution Fitting With Sum Of Square Error (sse)
Solution 1:
Your situation is not the same as that in the one treated in the question you cited. You have both the ordinates and the abscissae of the data points, rather than the usual i.i.d. sample. I would suggest that you use scipy curve_fit
. Here's a sample.
x_axis = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0]
y_axis = [0, 0, 0, 0, 0.24, 0.53, 0.49, 0.64, 0.54, 0.78, 0.59, 0.44, 0.34, 0.88, 0.2, 0.49, 0.39, 0.39, 0.29, 0.2, 0.05, 0.05, 0.25, 0.05, 0.1, 0.15, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0]
## y_axis values must be normalised
sum_ys = sum(y_axis)
y_axis = [_/sum_ys for _ in y_axis]
print (sum(y_axis))
from scipy.stats import gamma, norm
from scipy.optimize import curve_fit
defgamma_f(x, a, loc, scale):
return gamma.pdf(x, a, loc, scale)
defnorm_f(x, loc, scale):
return norm.pdf(x, loc, scale)
fitting = norm_f
result = curve_fit(fitting, x_axis, y_axis)
print (result)
import matplotlib.pyplot as plt
plt.plot(x_axis, y_axis, 'ro')
plt.plot(x_axis, [fitting(_, *result[0]) for _ in x_axis], 'b-')
plt.axis([0,35,0,.5])
plt.show()
This version shows how to do one plot, for the normal fit to the data. (The gamma provides a poor fit.) Only two parameters are needed for the normal. In general you would need only the first part of the output results, the estimates of the parameters, shape, location and scale.
(array([ 2.3352639 , -3.08105104, 10.15024823]), array([[ 5954.86532869, -27818.92220973, -19675.22421994],
[ -27818.92220973, 133161.76500251, 90741.43608615],
[ -19675.22421994, 90741.43608615, 66054.79087992]]))
Notice that the pdf of the gamma distribution is also available in scipy, as are the others that you need, I think, saving you the work of coding them.
The most important thing I omitted from the first code was the need to normalise the y-values, that is, to make them sum to one, since they should approximate histogram heights.
Solution 2:
I tried your example using OpenTURNS platform Here what I got.
I started with the same data as you after importing openturns and openturs.viewer.View for plotting
import openturns as ot
from openturns.viewerimportView
x_axis = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0,
12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0,
22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0,
32.0, 33.0, 34.0]
y_axis = [0, 0, 0, 0, 0.24, 0.53, 0.49, 0.64, 0.54, 0.78, 0.59, 0.44,
0.34, 0.88, 0.2, 0.49, 0.39, 0.39, 0.29, 0.2, 0.05, 0.05,
0.25, 0.05, 0.1, 0.15, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0]
First step: we can define the corresponding distribution
distribution = ot.UserDefined(ot.Sample([[s] forsin x_axis]), y_axis)
graph = distribution.drawPDF()
graph.setColors(["black"])
graph.setLegends(["your input"])
at this stage, if you View(graph)
you would get:
Second step: we can derive a sample from the obtained distibution
sample = distribution.getSample(10000)
this sample will be used to fit any kind of distributions. I tried with WeibullMin and Gamma distributions
# WeibullMin Factory
distribution2 = ot.WeibullMinFactory().build(sample)
print(distribution2)
graph2 = distribution2.drawPDF() ; graph2.setLegends(["Best WeibullMin"])
>>> WeibullMin(beta = 8.83969, alpha = 1.48142, gamma = 4.76832)
# Gamma Factory
distribution3 = ot.GammaFactory().build(sample)
print(distribution3)
>>> Gamma(k = 2.08142, lambda = 0.25157, gamma = 4.9995)
graph3 = distribution3.drawPDF() ; graph3.setLegends(["Best Gamma"]) ;
graph3.setColors(["blue"])
# plotting all the results
graph.add(graph2) ; graph.add(graph3)
View(graph)
Solution 3:
I think its the best and simple way to calculate the sum of square error:
#write the function
defSSE(y_true, y_pred):
sse= np.sum((y_true-y_pred)**2)
print(sse)
#now call the function and get results
SSE(y_true, y_pred)
Post a Comment for "Python Distribution Fitting With Sum Of Square Error (sse)"