What's the reasoning behind MinimizerResult's ndata determination? #995
-
Hello, I was looking at some minimization results and I realized a few things didn't make much sense, namely the AIC, BIC and ndata values. Looking into the source code I found the method that calculates those:

```python
def _calculate_statistics(self):
    """Calculate the fitting statistics."""
    self.nvarys = len(self.init_vals)
    if not hasattr(self, 'residual'):
        self.residual = -np.inf
    if isinstance(self.residual, np.ndarray):
        self.chisqr = (self.residual**2).sum()
        self.ndata = len(self.residual)
        self.nfree = self.ndata - self.nvarys
    else:
        self.chisqr = self.residual
        self.ndata = 1
        self.nfree = 1
    self.redchi = self.chisqr / max(1, self.nfree)
    # this is -2*loglikelihood
    self.chisqr = max(self.chisqr, 1.e-250*self.ndata)
    _neg2_log_likel = self.ndata * np.log(self.chisqr / self.ndata)
    self.aic = _neg2_log_likel + 2 * self.nvarys
    self.bic = _neg2_log_likel + np.log(self.ndata) * self.nvarys
```

My question is: why do ndata (and with it AIC and BIC) fall back to 1 when the residual is not an array? E.g., in my case I am passing the data as part of args:

```python
args = (
    data,       # <-- df
    media_data  # <-- media
)
res_m1 = lf.minimize(
    fcn=cost_m1_weighted,
    params=parameters_m1_dp,
    method="lbfgsb",
    args=(args,)
)
```

I understand it would be impossible to have an automated, generic way to determine ndata from args, but couldn't there be an optional keyword argument to pass it explicitly? Something like:

```python
args = (
    data,       # <-- df
    media_data  # <-- media
)
res_m1 = lf.minimize(
    fcn=cost_m1_weighted,
    params=parameters_m1_dp,
    method="lbfgsb",
    args=(args,),
    naparams=len(args[0])
)
```

Thanks!
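To make the effect concrete, here is a minimal standalone sketch of the statistics formulas quoted above (the residual values and nvarys are invented purely for illustration):

```python
import numpy as np

residual = np.array([0.5, -0.3, 0.2, -0.1, 0.4])  # invented residuals
nvarys = 2                                        # invented nvarys
chisqr = (residual**2).sum()

# Array residual: ndata = len(residual)
ndata = len(residual)
neg2ll = ndata * np.log(chisqr / ndata)
print("array residual -> AIC:", neg2ll + 2 * nvarys,
      " BIC:", neg2ll + np.log(ndata) * nvarys)

# Scalar cost with the same total misfit: ndata = 1
ndata = 1
neg2ll = ndata * np.log(chisqr / ndata)
print("scalar cost    -> AIC:", neg2ll + 2 * nvarys,
      " BIC:", neg2ll + np.log(ndata) * nvarys)
```

The same chisqr gives very different AIC/BIC values depending on ndata, which is why a scalar-returning cost function distorts these statistics.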
-
Well, they do not default to 1. The code you quote says that ndata is the length of the residual array; it is set to 1 only when the objective function returns a single scalar value instead of an array.

So, um, what? If your cost function reduces everything to one number, how could we know how many observations went into it?

The number of parameters is most definitely a well-known value. So is the number of variable parameters. Every part of the code knows these values. I am completely baffled. I am always skeptical to see people using a scalar minimizer with a custom cost function for data fitting. Can you show that these fits are better than using the default leastsq?
-
> Well, ndata is the length of the residual. That is the number of observations that are used in the minimization.

> I do not understand.

Um, it was not clear to me. That was exactly my comment. When someone says they do not understand what you are saying, you need to believe them.
The difference can be enormous. If you have multiple observations (ndata > 1), the fitting algorithms can look at the residual for each of those points individually. Many of the algorithms (in fact, the best, most reliable, and most efficient ones) will take derivatives of "(change in misfit)/(change in parameter value)" for each data point. Some algorithms will simply refuse to run if there are more variable parameters than observations, as the problem is not well-determined.

When you write a "cost function" yourself that reduces all of this to a single value, you throw away information that can be (and indeed is) used by the fit. So do not return ((data-model)**2).sum(); return data-model and let the fit do its work. If you want some other "cost function", adjust the array you return so that squaring and summing it gives the cost function you want.

For fitting data, using a scalar minimizer is just a terrible approach. The best algorithms simply disallow it, and you are stuck with poor substitutes that are highly prone to failure.
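As a minimal sketch of that advice, assuming an invented weighted straight-line fit (the model, data, and weights here are illustrative; lmfit.minimize and lmfit.Parameters are the real API):

```python
import numpy as np
import lmfit as lf

def residual_weighted(params, x, data, weights):
    # Return the weighted residual ARRAY, not its summed square.
    # Squaring and summing this array reproduces the weighted
    # chi-square, so ndata, AIC, and BIC come out correctly.
    model = params['a'].value * x + params['b'].value
    return (data - model) * weights

# invented example data
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
data = 3.0 * x + 1.5 + rng.normal(scale=0.2, size=x.size)
weights = np.full_like(x, 1.0 / 0.2)  # 1/sigma

params = lf.Parameters()
params.add('a', value=1.0)
params.add('b', value=0.0)

# the default method, leastsq, uses the per-point residuals
res = lf.minimize(residual_weighted, params, args=(x, data, weights))
print(res.ndata, res.nvarys, res.aic, res.bic)  # ndata == 50, not 1
```

Squaring and summing the returned array gives exactly the weighted chi-square that the scalar version would have summed by hand, so nothing is lost and the per-point information stays available to the solver.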
> Do you mean an optional kwargs to minimize()?

My suggestion is to start with the default leastsq and an objective function that returns the residual array.