
Commit 66afef4

Merge pull request #23 from rickecon/mle
Merging
2 parents 51dc8ca + a38fda4

File tree

8 files changed: +48 -50 lines changed

docs/book/_toc.yml

Lines changed: 1 addition & 1 deletion

@@ -38,7 +38,7 @@ parts:
 numbered: True
 chapters:
 - file: struct_est/intro
-- file: struct_est/MaxLikelihood
+- file: struct_est/MLE
 - file: struct_est/GMM
 - file: struct_est/SMM
 - caption: Appendix

docs/book/basic_empirics/BasicEmpirMethods.md

Lines changed: 1 addition & 1 deletion

@@ -386,7 +386,7 @@ results = reg1.fit()
 type(results)
 ```
 
-We now have the fitted regression model stored in `results` (see [statsmodels.regression.linear_model.RegressionResultsWrapper](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html)). The `results` from the `reg1.fit()` command is a regression results object with a lot of information, similar to the results object of the `scipy.optimize.minimize()` function we worked with in the {ref}`Chap_MaxLikeli` and {ref}`Chap_GMM` chapters.
+We now have the fitted regression model stored in `results` (see [statsmodels.regression.linear_model.RegressionResultsWrapper](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html)). The `results` from the `reg1.fit()` command is a regression results object with a lot of information, similar to the results object of the `scipy.optimize.minimize()` function we worked with in the {ref}`Chap_MLE` and {ref}`Chap_GMM` chapters.
 
 To view the OLS regression results, we can call the `.summary()` method.

docs/book/basic_empirics/LogisticReg.md

Lines changed: 1 addition & 1 deletion

@@ -442,4 +442,4 @@ The footnotes from this chapter.
 
 [^GMM]: See the {ref}`Chap_GMM` chapter of this book.
 
-[^MaxLikeli]: See the {ref}`Chap_MaxLikeli` chapter of this book.
+[^MaxLikeli]: See the {ref}`Chap_MLE` chapter of this book.
docs/book/struct_est/MLE.md

Lines changed: 36 additions & 34 deletions
@@ -10,19 +10,19 @@ kernelspec:
 name: python3
 ---
 
-(Chap_MaxLikeli)=
+(Chap_MLE)=
 # Maximum Likelihood Estimation
 
-This chapter describes the maximum likelihood estimation (MLE) method. All data and images from this chapter can be found in the data directory ([./data/maxlikeli/](https://github.com/OpenSourceEcon/CompMethods/tree/main/data/maxlikeli/)) and images directory ([./images/maxlikeli/](https://github.com/OpenSourceEcon/CompMethods/tree/main/images/maxlikeli/)) for the GitHub repository for this online book.
+This chapter describes the maximum likelihood estimation (MLE) method. All data and images from this chapter can be found in the data directory ([./data/mle/](https://github.com/OpenSourceEcon/CompMethods/tree/main/data/mle/)) and images directory ([./images/mle/](https://github.com/OpenSourceEcon/CompMethods/tree/main/images/mle/)) for the GitHub repository for this online book.
 
 
-(SecMaxLikeli_GenModel)=
+(SecMLE_GenModel)=
 ## General characterization of a model and data generating process
 
 Each of the model estimation approaches that we will discuss in this section on Maximum Likelihood estimation (MLE) and in subsequent sections on generalized method of moments (GMM) and simulated method of moments (SMM) involves choosing values of the parameters of a model to make the model match some number of properties of the data. Define a model or a data generating process (DGP) as,
 
 ```{math}
-:label: EqMaxLikeli_GenMod
+:label: EqMLE_GenMod
 F(x_t, z_t|\theta) = 0
 ```
 
@@ -31,45 +31,45 @@ where $x_t$ and $z_t$ are variables, $\theta$ is a vector of parameters, and $F(
 In richer examples, a model could also include inequalities representing constraints. But this is sufficient for our discussion. The goal of maximum likelihood estimation (MLE) is to choose the parameter vector of the model $\theta$ to maximize the likelihood of seeing the data produced by the model $(x_t, z_t)$.
 
 
-(SecMaxLikeli_GenModel_SimpDist)=
+(SecMLE_GenModel_SimpDist)=
 ### Simple distribution example
 
 A simple example of a model is a statistical distribution [e.g., the normal distribution $N(\mu, \sigma)$].
 
 ```{math}
-:label: EqMaxLikeli_GenMod_NormDistPDF
+:label: EqMLE_GenMod_NormDistPDF
 Pr(x|\theta) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x - \mu)^2}{2\sigma^2}}
 ```
 
 The probability of drawing value $x_i$ from the distribution $f(x|\theta)$ is $f(x_i|\theta)$. The probability of drawing the following vector of two observations $(x_1,x_2)$ from the distribution $f(x|\theta)$ is $f(x_1|\theta)\times f(x_2|\theta)$. We define the likelihood function of $N$ draws $(x_1,x_2,...x_N)$ from a model or distribution $f(x|\theta)$ as $\mathcal{L}$.
 
 ```{math}
-:label: EqMaxLikeli_GenMod_NormDistLike
+:label: EqMLE_GenMod_NormDistLike
 \mathcal{L}(x_1,x_2,...x_N|\theta) \equiv \prod_{i=1}^N f(x_i|\theta)
 ```
 
 Because it can be numerically difficult to maximize a product of percentages (one small value can dominate the entire product), it is almost always easier to use the log likelihood function $\ln(\mathcal{L})$.
 
 ```{math}
-:label: EqMaxLikeli_GenMod_NormDistLnLike
+:label: EqMLE_GenMod_NormDistLnLike
 \ln\Bigl(\mathcal{L}(x_1,x_2,...x_N|\theta)\Bigr) \equiv \sum_{i=1}^N \ln\Bigl(f(x_i|\theta)\Bigr)
 ```
 
 The maximum likelihood estimate $\hat{\theta}_{MLE}$ is the following:
 
 ```{math}
-:label: EqMaxLikeli_GenMod_NormDistMLE
+:label: EqMLE_GenMod_NormDistMLE
 \hat{\theta}_{MLE} = \theta:\quad \max_\theta \: \ln\mathcal{L} = \sum_{i=1}^N\ln\Bigl(f(x_i|\theta)\Bigr)
 ```
 
 
-(SecMaxLikeli_GenModel_Econ)=
+(SecMLE_GenModel_Econ)=
 ### Economic example
 
 An example of an economic model that follows the more general definition of $F(x_t, z_t|\theta) = 0$ is {cite}`BrockMirman:1972`. This model has multiple nonlinear dynamic equations, 7 parameters, 1 exogenous time series of variables, and about 5 endogenous time series of variables. Let's look at a simplified piece of that model--the production function--which is commonly used in total factor productivity estimations.
 
 ```{math}
-:label: EqMaxLikeli_GenMod_EconProdFunc
+:label: EqMLE_GenMod_EconProdFunc
 Y_t = e^{z_t}(K_t)^\alpha(L_t)^{1-\alpha} \quad\text{where}\quad z_t = \rho z_{t-1} + (1 - \rho)\mu + \varepsilon_t \quad\text{and}\quad \varepsilon_t\sim N(0,\sigma^2)
 ```
 
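The MLE defined in `EqMLE_GenMod_NormDistMLE` above is usually computed by minimizing the negative log likelihood. A minimal sketch of that computation, assuming synthetic normal draws and illustrative names (`neg_log_lik`, `mu_true`) that are not part of the chapter's code:

```python
# Sketch: estimate mu and sigma of a normal distribution by maximum
# likelihood on synthetic draws (not the chapter's test-score data).
import numpy as np
import scipy.stats as sts
import scipy.optimize as opt

rng = np.random.default_rng(25)
mu_true, sig_true = 300.0, 30.0
x = rng.normal(mu_true, sig_true, size=200)

def neg_log_lik(params, x):
    # Minimizing -ln(L) is equivalent to maximizing ln(L)
    mu, sigma = params
    return -np.sum(sts.norm.logpdf(x, loc=mu, scale=sigma))

res = opt.minimize(neg_log_lik, np.array([250.0, 50.0]), args=(x,),
                   method='L-BFGS-B', bounds=((None, None), (1e-10, None)))
mu_mle, sig_mle = res.x
```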
@@ -82,54 +82,47 @@ The likelihood of a given data point is determined by $\varepsilon_t = z_t - \rh
 The likelihood function of all the data is:
 
 ```{math}
-:label: EqMaxLikeli_GenMod_EconProdFuncLike
+:label: EqMLE_GenMod_EconProdFuncLike
 \mathcal{L}\left(z_1,z_2,...z_T|\rho,\mu,\sigma\right) = \prod_{t=2}^T f(z_{t+1},z_t|\rho,\mu,\sigma)
 ```
 
 The log likelihood function of all the data is:
 
 ```{math}
-:label: EqMaxLikeli_GenMod_EconProdFuncLnLike
+:label: EqMLE_GenMod_EconProdFuncLnLike
 \ln\Bigl(\mathcal{L}\bigl(z_1,z_2,...z_T|\rho,\mu,\sigma\bigr)\Bigr) = \sum_{t=2}^T \ln\Bigl(f(z_{t+1},z_t|\rho,\mu,\sigma)\Bigr)
 ```
 
 The maximum likelihood estimate of $\rho$, $\mu$, and $\sigma$ is given by the following maximization problem.
 
 ```{math}
-:label: EqMaxLikeli_GenMod_EconProdFuncMLE
+:label: EqMLE_GenMod_EconProdFuncMLE
 (\hat{\rho}_{MLE},\hat{\mu}_{MLE},\hat{\sigma}_{MLE})=(\rho,\mu,\sigma):\quad \max_{\rho,\mu,\sigma}\ln\mathcal{L} = \sum_{t=2}^T \ln\Bigl(f(z_{t+1},z_t|\rho,\mu,\sigma)\Bigr)
 ```
 
 
-(SecMaxLikeli_DistData)=
+(SecMLE_DistData)=
 ## Comparisons of distributions and data
 
-Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters).
+Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters). Let's create a histogram of the data.
 
 ```{code-cell} ipython3
 :tags: []
 
 # Import the necessary libraries
 import numpy as np
-import scipy.stats as sts
+import matplotlib.pyplot as plt
 import requests
 
-# Download and save the data file Econ381totpts.txt
+# Download and save the data file Econ381totpts.txt as NumPy array
 url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
-       'main/data/maxlikeli/Econ381totpts.txt')
-# data_file = requests.get(url, allow_redirects=True)
-# open('../../../data/maxlikeli/Econ381totpts.txt', 'wb').write(data_file.content)
-
-# Load the data as a NumPy array
-data = np.loadtxt('../../../data/maxlikeli/Econ381totpts.txt')
-```
-
-Let's create a histogram of the data.
-
-```{code-cell} ipython3
-:tags: []
-
-import matplotlib.pyplot as plt
+       'main/data/mle/Econ381totpts.txt')
+data_file = requests.get(url)
+if data_file.status_code == 200:
+    # Load the downloaded data into a NumPy array
+    data = np.loadtxt(data_file.content)
+else:
+    print('Error downloading the file')
 
 num_bins = 30
 count, bins, ignored = plt.hist(data, num_bins, density=True,
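One caveat on the added download code above: `np.loadtxt` expects a path or file-like object, so passing `data_file.content` (raw bytes) directly may fail. A working variant, assuming an `io.BytesIO` wrapper that is not part of the commit:

```python
# Sketch: same download-and-load pattern, with the bytes payload wrapped
# in io.BytesIO so that np.loadtxt receives a file-like object.
import io
import numpy as np
import requests

url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
       'main/data/mle/Econ381totpts.txt')
data_file = requests.get(url)
if data_file.status_code == 200:
    data = np.loadtxt(io.BytesIO(data_file.content))
else:
    print('Error downloading the file')
```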
@@ -138,15 +131,24 @@ plt.title('Intermediate macro scores: 2011-2012', fontsize=15)
 plt.xlabel(r'Total points')
 plt.ylabel(r'Percent of scores')
 plt.xlim([0, 550]) # This gives the xmin and xmax to be plotted
+
+plt.show()
 ```
+<!-- ```{figure} ../../../images/mle/Econ381scores_hist.png
+---
+height: 500px
+name: FigMLE_EconScoreHist
+---
+Intermediate macroeconomics midterm scores over two semesters
+``` -->
 
 
-(SecMaxLikeli_Exerc)=
+(SecMLE_Exerc)=
 ## Exercises
 
 
 
-(SecMaxLikeliFootnotes)=
+(SecMLEfootnotes)=
 ## Footnotes
 
 The footnotes from this chapter.
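For the production-function example earlier in this file, the model $z_t = \rho z_{t-1} + (1 - \rho)\mu + \varepsilon_t$ with $\varepsilon_t\sim N(0,\sigma^2)$ implies that $z_t$ given $z_{t-1}$ is distributed $N\bigl(\rho z_{t-1} + (1 - \rho)\mu, \sigma^2\bigr)$, so the log likelihood can be evaluated directly. A minimal sketch, where the series `z` is an illustrative stand-in for estimated TFP data:

```python
# Sketch: log likelihood of z_t = rho * z_{t-1} + (1 - rho) * mu + eps_t
import numpy as np
import scipy.stats as sts

def log_lik_ar1(z, rho, mu, sigma):
    # Conditional mean of z_t given z_{t-1}
    cond_mean = rho * z[:-1] + (1 - rho) * mu
    return np.sum(sts.norm.logpdf(z[1:], loc=cond_mean, scale=sigma))

rng = np.random.default_rng(25)
z = rng.normal(size=100)  # stand-in series, not the chapter's data
print(log_lik_ar1(z, rho=0.9, mu=0.0, sigma=0.1))
```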

docs/book/struct_est/SMM.md

Lines changed: 9 additions & 13 deletions
@@ -28,7 +28,7 @@ Let the data be represented, in general, by $x$. This could have many variables,
 \theta \equiv \left[\theta_1, \theta_2, ...\theta_K\right]^T
 ```
 
-In the {ref}`Chap_MaxLikeli` chapter, we used data $x$ and model parameters $\theta$ to maximize the likelihood of drawing that data $x$ from the model given parameters $\theta$,
+In the {ref}`Chap_MLE` chapter, we used data $x$ and model parameters $\theta$ to maximize the likelihood of drawing that data $x$ from the model given parameters $\theta$,
 
 ```{math}
 :label: EqSMM_MLestimator
@@ -271,7 +271,7 @@ Let the parameter vector $\theta$ have length $K$ such that $K$ parameters are b
 
 Recall that each element of $e(\tilde{x},x|\theta)$ is an average moment error across all simulations. $\hat{\Omega}$ from the previous section is the $R\times R$ variance-covariance matrix of the $R$ moment errors used to identify the $K$ parameters $\theta$ to be estimated. The estimated variance-covariance matrix $\hat{\Sigma}$ of the estimated parameter vector is a $K\times K$ matrix. We say the model is *exactly identified* if $K = R$ (number of parameters $K$ equals number of moments $R$). We say the model is *overidentified* if $K<R$. We say the model is *not identified* or *underidentified* if $K>R$.
 
-Similar to the inverse Hessian estimator of the variance-covariance matrix of the maximum likelihood estimator from the {ref}`Chap_MaxLikeli` chapter, the SMM variance-covariance matrix is related to the derivative of the criterion function with respect to each parameter. The intuition is that if the second derivative of the criterion function with respect to the parameters is large, there is a lot of curvature around the criterion minimizing estimate. In other words, the parameters of the model are precisely estimated. The inverse of the Hessian matrix will be small.
+Similar to the inverse Hessian estimator of the variance-covariance matrix of the maximum likelihood estimator from the {ref}`Chap_MLE` chapter, the SMM variance-covariance matrix is related to the derivative of the criterion function with respect to each parameter. The intuition is that if the second derivative of the criterion function with respect to the parameters is large, there is a lot of curvature around the criterion minimizing estimate. In other words, the parameters of the model are precisely estimated. The inverse of the Hessian matrix will be small.
 
 Define $R\times K$ matrix $d(\tilde{x},x|\theta)$ as the Jacobian matrix of derivatives of the $R\times 1$ error vector $e(\tilde{x},x|\theta)$ from {eq}`EqSMM_MomError_vec`.
 
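The next hunk's context references a centered second-order finite difference approximation of this Jacobian. A minimal sketch of that approximation, where `err_vec` stands in for $e(\tilde{x},x|\theta)$ and is an assumption rather than the chapter's function:

```python
# Sketch: centered second-order finite difference Jacobian of an R x 1
# error vector with respect to a K x 1 parameter vector theta.
import numpy as np

def jac_err(err_vec, theta, h_frac=1e-6):
    theta = np.asarray(theta, dtype=float)
    R = err_vec(theta).shape[0]
    K = theta.shape[0]
    d = np.zeros((R, K))
    for k in range(K):
        h = h_frac * max(abs(theta[k]), 1.0)  # scale step to parameter size
        step = np.zeros(K)
        step[k] = h
        d[:, k] = (err_vec(theta + step) - err_vec(theta - step)) / (2 * h)
    return d
```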
@@ -324,12 +324,12 @@ The following is a centered second-order finite difference numerical approximati
 (SecSMM_CodeExmp)=
 ## Code Examples
 
-In this section, we will use SMM to estimate parameters of the models from the {ref}`Chap_MaxLikeli` chapter and from the {ref}`Chap_GMM` chapter.
+In this section, we will use SMM to estimate parameters of the models from the {ref}`Chap_MLE` chapter and from the {ref}`Chap_GMM` chapter.
 
 (SecSMM_CodeExmp_MacrTest)=
 ### Fitting a truncated normal to intermediate macroeconomics test scores
 
-Let's revisit the problem from the MLE and GMM notebooks of fitting a truncated normal distribution to intermediate macroeconomics test scores. The data are in the text file [`Econ381totpts.txt`](https://github.com/OpenSourceEcon/CompMethods/blob/main/data/smm/Econ381totpts.txt). Recall that these test scores are between 0 and 450. {numref}`Figure %s <FigSMM_EconScoreTruncNorm>` below shows a histogram of the data, as well as three truncated normal PDF's with different values for $\mu$ and $\sigma$. The black line is the maximum likelihood estimate of $\mu$ and $\sigma$ of the truncated normal pdf from the {ref}`Chap_MaxLikeli` chapter. The red and green lines are the PDF's of two "arbitrarily" chosen combinations of the truncated normal parameters $\mu$ and $\sigma$.[^TruncNorm]
+Let's revisit the problem from the MLE and GMM notebooks of fitting a truncated normal distribution to intermediate macroeconomics test scores. The data are in the text file [`Econ381totpts.txt`](https://github.com/OpenSourceEcon/CompMethods/blob/main/data/smm/Econ381totpts.txt). Recall that these test scores are between 0 and 450. {numref}`Figure %s <FigSMM_EconScoreTruncNorm>` below shows a histogram of the data, as well as three truncated normal PDF's with different values for $\mu$ and $\sigma$. The black line is the maximum likelihood estimate of $\mu$ and $\sigma$ of the truncated normal pdf from the {ref}`Chap_MLE` chapter. The red and green lines are the PDF's of two "arbitrarily" chosen combinations of the truncated normal parameters $\mu$ and $\sigma$.[^TruncNorm]
 
 ```{code-cell} ipython3
 :tags: ["hide-input", "remove-output"]
@@ -394,20 +394,16 @@ def trunc_norm_pdf(xvals, mu, sigma, cut_lb, cut_ub):
     return pdf_vals
 
-# Download and save the data file Econ381totpts.txt
+# Download and save the data file Econ381totpts.txt as NumPy array
 url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
        'main/data/smm/Econ381totpts.txt')
-data_file = requests.get(url, allow_redirects=True)
-open('../../../data/smm/Econ381totpts.txt', 'wb').write(data_file.content)
-
-# Load the data as a NumPy array
-data = np.loadtxt('../../../data/smm/Econ381totpts.txt')
+data = np.loadtxt(url)
 
 num_bins = 30
 count, bins, ignored = plt.hist(
     data, num_bins, density=True, edgecolor='k', label='data'
 )
-plt.title('Econ 381 scores: 2011-2012', fontsize=20)
+plt.title('Intermediate macro scores: 2011-2012', fontsize=20)
 plt.xlabel(r'Total points')
 plt.ylabel(r'Percent of scores')
 plt.xlim([0, 550]) # This gives the xmin and xmax to be plotted
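The `trunc_norm_pdf()` function whose tail appears in the hunk above is only partially visible here. An equivalent formulation via `scipy.stats.truncnorm` (a sketch, not the book's code) for a normal truncated to the interval from `cut_lb` to `cut_ub`:

```python
# Sketch: pdf of a normal(mu, sigma) truncated to [cut_lb, cut_ub].
# scipy.stats.truncnorm takes the cutoffs in standard-deviation units.
import numpy as np
import scipy.stats as sts

def trunc_norm_pdf_alt(xvals, mu, sigma, cut_lb, cut_ub):
    a = (cut_lb - mu) / sigma
    b = (cut_ub - mu) / sigma
    return sts.truncnorm.pdf(xvals, a, b, loc=mu, scale=sigma)

xvals = np.linspace(0.0, 450.0, 100)
pdf_vals = trunc_norm_pdf_alt(xvals, mu=300.0, sigma=30.0,
                              cut_lb=0.0, cut_ub=450.0)
```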
@@ -975,7 +971,7 @@ name: FigSMM_Econ381_SMM1
 SMM-estimated PDF function and data histogram, 2 moments, identity weighting matrix, Econ 381 scores (2011-2012)
 ```
 
-That looks just like the maximum likelihood estimate from the {ref}`Chap_MaxLikeli` chapter. {numref}`Figure %s <FigSMM_Econ381_crit1>` below shows what the minimizer is doing. The figure shows the criterion function surface for different values of $\mu$ and $\sigma$ in the truncated normal distribution. The minimizer is searching for the parameter values that give the lowest criterion function value.
+That looks just like the maximum likelihood estimate from the {ref}`Chap_MLE` chapter. {numref}`Figure %s <FigSMM_Econ381_crit1>` below shows what the minimizer is doing. The figure shows the criterion function surface for different values of $\mu$ and $\sigma$ in the truncated normal distribution. The minimizer is searching for the parameter values that give the lowest criterion function value.
 
 ```{code-cell} ipython3
 :tags: ["remove-output"]
@@ -1071,7 +1067,7 @@ In the next section, we see if we can get more accurate estimates (lower criteri
 
 (SecSMM_CodeExmp_MacrTest_2m2st)=
 #### Two moments, two-step optimal weighting matrix
-Similar to the maximum likelihood estimation problem in Chapter {ref}`Chap_MaxLikeli`, it looks like the minimum value of the criterion function shown in {numref}`Figure %s <FigSMM_Econ381_crit1>` is roughly equal for a specific proportional increase of $\mu$ and $\sigma$ together. That is, the estimation problem with these two moments probably has a correspondence of values of $\mu$ and $\sigma$ that give roughly the same minimum criterion function value. This issue has two possible solutions.
+Similar to the maximum likelihood estimation problem in Chapter {ref}`Chap_MLE`, it looks like the minimum value of the criterion function shown in {numref}`Figure %s <FigSMM_Econ381_crit1>` is roughly equal for a specific proportional increase of $\mu$ and $\sigma$ together. That is, the estimation problem with these two moments probably has a correspondence of values of $\mu$ and $\sigma$ that give roughly the same minimum criterion function value. This issue has two possible solutions.
 
 1. Maybe we need the two-step variance covariance estimator to calculate a "more" optimal weighting matrix $W$.
 2. Maybe our two moments aren't very good moments for fitting the data.
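A minimal sketch of the two-step weighting matrix named in solution 1: estimate the variance-covariance matrix $\hat{\Omega}$ of the moment errors at the step-1 estimates and use its inverse as $W$. The `err_mat` input (an $R\times S$ array of moment errors across $S$ simulations) is illustrative, not the chapter's code:

```python
# Sketch: two-step optimal weighting matrix W = inv(Omega_hat), where
# Omega_hat is the R x R variance-covariance matrix of moment errors
# evaluated at the step-1 SMM estimates.
import numpy as np

def two_step_W(err_mat):
    R, S = err_mat.shape
    omega_hat = (err_mat @ err_mat.T) / S
    return np.linalg.inv(omega_hat)
```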

images/mle/Econ381scores_hist.png

89.6 KB

0 commit comments
