What's the reasoning behind `MinimizerResult`'s `ndata` determination? #995

JulianoGianlupi · 2025-02-27T15:45:34Z

JulianoGianlupi
Feb 27, 2025

Hello, I was looking at some minimization results and I realized a few things didn't make much sense, namely the AIC, BIC and ndata values. Looking into the source code I found the method that calculates those,

def _calculate_statistics(self):
        """Calculate the fitting statistics."""
        self.nvarys = len(self.init_vals)
        if not hasattr(self, 'residual'):
            self.residual = -np.inf
        if isinstance(self.residual, np.ndarray):
            self.chisqr = (self.residual**2).sum()
            self.ndata = len(self.residual)
            self.nfree = self.ndata - self.nvarys
        else:
            self.chisqr = self.residual
            self.ndata = 1
            self.nfree = 1
        self.redchi = self.chisqr / max(1, self.nfree)
        # this is -2*loglikelihood
        self.chisqr = max(self.chisqr, 1.e-250*self.ndata)
        _neg2_log_likel = self.ndata * np.log(self.chisqr / self.ndata)
        self.aic = _neg2_log_likel + 2 * self.nvarys
        self.bic = _neg2_log_likel + np.log(self.ndata) * self.nvarys

. My question is why are ndata and nfree defaulting to 1? Wouldn't it be a good addition to allow the user to set ndata?

E.g., in my case I am passing the data as part of args,

args = (
    data,# <-- df
    media_data # <-- media
)
res_m1 = lf.minimize(
    fcn=cost_m1_weighted,
    params=parameters_m1_dp,
    method="lbfgsb",
    args=(args,)
)

. I understand it would be impossible to have an automated generic way to determine nparams for any and all user-styles, so I think it would be nice if lmfit.minimize() was able to accept an argument nparams directly. That way I could do

args = (
    data,# <-- df
    media_data # <-- media
)
res_m1 = lf.minimize(
    fcn=cost_m1_weighted,
    params=parameters_m1_dp,
    method="lbfgsb",
    args=(args,),
    naparams=len(args[0])
)

Thanks!
Talk soon

Answered by newville

Mar 4, 2025

@JulianoGianlupi

Well, ndata is the length of the residual. That is the number of observations that are used in the minimization.

e.g., in my case I am passing the data as part of args,
How could we know that?
Exactly! I am glad you understood

I do not understand.

The number of parameters is most definitely a well-known value
Right, my mistake, I meant ndata. I do think that it's clear from the rest of the post what I meant though.

Um, it was not clear to me. That was exactly my comment. When someone says they do not understand what you are saying, you need to believe them.

I am always skeptical to see people using a scalar minimizer with a custom cost function for data fittin…

View full answer

newville · 2025-02-28T03:05:34Z

newville
Feb 28, 2025
Maintainer

@JulianoGianlupi

My question is why are ndata and nfree defaulting to 1?

Well, they do not default to 1. The code you quote says

            self.ndata = len(self.residual)
            self.nfree = self.ndata - self.nvarys

So, um, what?

If self.residual is not an array, ndata is 1. Literally, a scalar minimizer has 1 data point. That is what a scalar means.

e.g., in my case I am passing the data as part of args,

How could we know that?

.I understand it would be impossible to have an automated generic way to determine nparams for any and all user-styles, so I think it would be nice if lmfit.minimize() was able to accept an argument nparams directly.

The number of parameters is most definitely a well-known value. So is the number of variable parameters. Every part of the code knows these values. I am completely baffled.

I am always skeptical to see people using a scalar minimizer with a custom cost function for data fitting. Can you show that these fits are better than using the default leastsq solver?

1 reply

JulianoGianlupi Mar 4, 2025
Author

Thanks for the reply.

Well, they do not default to 1. The code you quote says

If self.residual is not an array, ndata is 1. Literally, a scalar minimizer has 1 data point. That is what a scalar means.

So it defaults to 1 if I do the sum in my cost function... As for your last sentence, let's tone down the condescension and try to be productive?

e.g., in my case I am passing the data as part of args,

How could we know that?

Exactly! I am glad you understood

The number of parameters is most definitely a well-known value

Right, my mistake, I meant ndata. I do think that it's clear from the rest of the post what I meant though.

I am always skeptical to see people using a scalar minimizer with a custom cost function for data fitting. Can you show that these fits are better than using the default leastsq solver?

What's the difference? It's just easier for me to do the sum given what I am trying to solve and the shape of my data

Now, do you have any constructive comment on my suggestion of making a optional kwargs for ndata?

Thanks again

newville · 2025-03-04T17:49:41Z

newville
Mar 4, 2025
Maintainer

@JulianoGianlupi

Well, ndata is the length of the residual. That is the number of observations that are used in the minimization.

e.g., in my case I am passing the data as part of args,
How could we know that?
Exactly! I am glad you understood

I do not understand.

The number of parameters is most definitely a well-known value
Right, my mistake, I meant ndata. I do think that it's clear from the rest of the post what I meant though.

Um, it was not clear to me. That was exactly my comment. When someone says they do not understand what you are saying, you need to believe them.

I am always skeptical to see people using a scalar minimizer with a custom cost function for data fitting. Can you show that these fits are better than using the default leastsq solver?
What's the difference? It's just easier for me to do the sum given what I am trying to solve and the shape of my data

The difference can be enormous. If you have multiple observations (ndata > 1), the fitting algorithms can look at the residual for each of these points individually. Many of the algorithms (in fact, the best, most reliable and most efficient ones) will take derivatives of "(change in misfit)/(change in parameter value)" for each data point. Some algorithms will simply refuse to run if there are more variable parameters than observations, as the problem is not well-determined.

When you do a "cost function" yourself that reduces this to a single value, you throw away that information that can be (and indeed, is) used by the fit.

So, do not return "((data-model)**2).sum()", return "data-model" and let the fit do its work. If you want some other "cost function", adjust that so that squaring and summing the value you do return gives the cost function you want.

For fitting data, using a scalar minimizer is just a terrible approach. The best algorithms simply disallow it, and you are stuck with poor substitutes that are highly prone to failure.

Now, do you have any constructive comment on my suggestion of making a optional kwargs for ndata?

Do you mean an optional kwargs to minimize()? [Yes, not clear to me]. The minimize function minimizes an objective function, it does not minimize model-data. If an objective function returns a single value, there is no way to know how many data points went into making that residual. If minimize() took an optional ndata parameter, what would we do with that? Do you mean "let the user lie about the number of data points"?

My suggestion is to start with leastsq, and maybe even the Model interface, and then move on to other cost functions when you are certain that is inappropriate. That way, you will always have the leastsq solution to compare to any other results you get.

1 reply

JulianoGianlupi Mar 4, 2025
Author

Ok, this was actually useful, thanks.

the fitting algorithms can look at the residual for each of these points individually. Many of the algorithms (in fact, the best, most reliable and most efficient ones)

I wasn't aware of that and probably should have been. And, if you look into how most scientific works are fit, I think you'll find that they live with the same ignorance I had on the subject. So, thank you.

Do you mean an optional kwargs to minimize()? [Yes, not clear to me].

Yup, but

"let the user lie about the number of data points"?

is a very good point. I guess I have more trust in the user than I should have

I'll look into leastsq, and Model.

Many thanks!

PS: As for the parts that there was friction between us, let's agree to let bygones be bygones ;)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lmfit

What's the reasoning behind `MinimizerResult`'s `ndata` determination? #995

{{title}}

Replies: 2 comments 2 replies

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

lmfit

What's the reasoning behind MinimizerResult's ndata determination? #995

JulianoGianlupi Feb 27, 2025

Replies: 2 comments · 2 replies

newville Feb 28, 2025 Maintainer

JulianoGianlupi Mar 4, 2025 Author

newville Mar 4, 2025 Maintainer

JulianoGianlupi Mar 4, 2025 Author

What's the reasoning behind `MinimizerResult`'s `ndata` determination? #995

JulianoGianlupi
Feb 27, 2025

Replies: 2 comments 2 replies

newville
Feb 28, 2025
Maintainer

JulianoGianlupi Mar 4, 2025
Author

newville
Mar 4, 2025
Maintainer

JulianoGianlupi Mar 4, 2025
Author