Commit 886b6e1

Updated MLE.md and SMM.md
1 parent 66afef4 commit 886b6e1

2 files changed

+104
-9
lines changed

docs/book/struct_est/MLE.md

Lines changed: 96 additions & 6 deletions
@@ -19,7 +19,7 @@ This chapter describes the maximum likelihood estimation (MLE) method. All data
1919
(SecMLE_GenModel)=
2020
## General characterization of a model and data generating process
2121

22-
Each of the model estimation approaches that we will discuss in this section on Maximum Likelihood estimation (MLE) and in subsequent sections on generalized method of moments (GMM) and simulated method of moments (SMM) involves choosing values of the parameters of a model to make the model match some number of properties of the data. Define a model or a data generating process (DGP) as,
22+
Each of the model estimation approaches that we will discuss in this section on Maximum Likelihood estimation (MLE) and in subsequent sections on {ref}`Chap_GMM` (GMM) and {ref}`Chap_SMM` (SMM) involves choosing values of the parameters of a model to make the model match some number of properties of the data. Define a model or a data generating process (DGP) as,
2323

2424
```{math}
2525
:label: EqMLE_GenMod
@@ -107,7 +107,7 @@ The maximum likelihood estimate of $\rho$, $\mu$, and $\sigma$ is given by the f
107107
Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters). Let's create a histogram of the data.
108108

109109
```{code-cell} ipython3
110-
:tags: []
110+
:tags: ["remove-output"]
111111
112112
# Import the necessary libraries
113113
import numpy as np
@@ -117,13 +117,15 @@ import requests
117117
# Download and save the data file Econ381totpts.txt as NumPy array
118118
url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
119119
'main/data/mle/Econ381totpts.txt')
120-
data_file = requests.get(url)
120+
data_file = requests.get(url, allow_redirects=True)
121+
open('../../../data/mle/Econ381totpts.txt', 'wb').write(data_file.content)
121122
if data_file.status_code == 200:
122123
# Load the downloaded data into a NumPy array
123-
data = np.loadtxt(data_file.content)
124+
data = np.loadtxt('../../../data/mle/Econ381totpts.txt')
124125
else:
125126
print('Error downloading the file')
126127
128+
# Create a histogram of the data
127129
num_bins = 30
128130
count, bins, ignored = plt.hist(data, num_bins, density=True,
129131
edgecolor='k')
@@ -134,13 +136,101 @@ plt.xlim([0, 550]) # This gives the xmin and xmax to be plotted"
134136
135137
plt.show()
136138
```
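A note on the download step in this hunk: the removed line passed raw bytes (`data_file.content`) to `np.loadtxt`, which expects a file name or a file-like object, and the commit works around this by writing the file to disk first. A minimal alternative sketch, assuming the file is UTF-8 text containing only whitespace-delimited numbers, parses the response in memory instead. The helper names `parse_scores` and `fetch_scores` are hypothetical, and the standard-library `urllib` is swapped in for `requests` only to keep the sketch dependency-free.

```python
import io
import urllib.request

import numpy as np


def parse_scores(raw_bytes):
    """Parse whitespace-delimited numeric text (given as bytes) into an array."""
    # Decode to text and wrap it in a file-like object that loadtxt can read
    text = raw_bytes.decode('utf-8')
    return np.loadtxt(io.StringIO(text))


def fetch_scores(url):
    """Download the text file at url and parse it without writing to disk.

    urlopen raises urllib.error.HTTPError for error statuses such as 404,
    so a failed download stops here instead of leaving data undefined.
    """
    with urllib.request.urlopen(url) as response:
        return parse_scores(response.read())


# Offline check with made-up numbers (not the real Econ381 data)
sample = parse_scores(b"275.5\n351.0\n346.25\n")
print(sample.shape)
```

Raising on a failed download, rather than printing a message, keeps the later histogram code from running on an undefined `data` variable.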
137-
<!-- ```{figure} ../../../images/mle/Econ381scores_hist.png
139+
140+
```{figure} ../../../images/mle/Econ381scores_hist.png
138141
---
139142
height: 500px
140143
name: FigMLE_EconScoreHist
141144
---
142145
Intermediate macroeconomics midterm scores over two semesters
143-
``` -->
146+
```
147+
148+
Now let's code up a parametric distribution that is flexible enough to fit lots of different distributions of test scores, has the properties we would expect from a distribution of test scores, and is characterized by a minimal number of parameters. In this case, we will use a truncated normal distribution.[^TruncNorm]
149+
150+
```{code-cell} ipython3
151+
:tags: []
152+
153+
import scipy.stats as sts
154+
155+
156+
def trunc_norm_pdf(xvals, mu, sigma, cut_lb=None, cut_ub=None):
157+
'''
158+
--------------------------------------------------------------------
159+
Generate pdf values from the truncated normal pdf with mean mu and
160+
standard deviation sigma. If the cutoff is given, then the PDF
161+
values are inflated upward to reflect the zero probability on values
162+
above the cutoff. If there is no cutoff given, this function does
163+
the same thing as sp.stats.norm.pdf(x, loc=mu, scale=sigma).
164+
--------------------------------------------------------------------
165+
INPUTS:
166+
xvals = (N,) vector, values of the normally distributed random
167+
variable
168+
mu = scalar, mean of the normally distributed random variable
169+
sigma = scalar > 0, standard deviation of the normally distributed
170+
random variable
171+
cut_lb = scalar or None, =None if no cutoff is given, otherwise
172+
is scalar lower bound value of distribution. Values below
173+
this value have zero probability
174+
cut_ub = scalar or None, =None if no cutoff is given, otherwise
175+
is scalar upper bound value of distribution. Values above
176+
this value have zero probability
177+
178+
OTHER FUNCTIONS AND FILES CALLED BY THIS FUNCTION: None
179+
180+
OBJECTS CREATED WITHIN FUNCTION:
181+
prob_notcut = scalar
182+
pdf_vals = (N,) vector, normal PDF values for mu and sigma
183+
corresponding to xvals data
184+
185+
FILES CREATED BY THIS FUNCTION: None
186+
187+
RETURNS: pdf_vals
188+
--------------------------------------------------------------------
189+
'''
190+
if cut_ub is None and cut_lb is None:
191+
prob_notcut = 1.0
192+
elif cut_ub is None and cut_lb is not None:
193+
prob_notcut = 1.0 - sts.norm.cdf(cut_lb, loc=mu, scale=sigma)
194+
elif cut_ub is not None and cut_lb is None:
195+
prob_notcut = sts.norm.cdf(cut_ub, loc=mu, scale=sigma)
196+
elif cut_ub is not None and cut_lb is not None:
197+
prob_notcut = (sts.norm.cdf(cut_ub, loc=mu, scale=sigma) -
198+
sts.norm.cdf(cut_lb, loc=mu, scale=sigma))
199+
200+
pdf_vals = ((1/(sigma * np.sqrt(2 * np.pi)) *
201+
np.exp( - (xvals - mu)**2 / (2 * sigma**2))) /
202+
prob_notcut)
203+
204+
return pdf_vals
205+
```
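As a sanity check on the truncated normal density added in this hunk, the sketch below rewrites the same formula compactly and compares it with `scipy.stats.truncnorm`. It assumes missing cutoffs are passed as `None` (matching the new keyword defaults in the signature); the name `trunc_norm_pdf_sketch` and the values of `mu`, `sigma`, and the bounds are illustrative only and are not taken from the commit.

```python
import numpy as np
import scipy.stats as sts


def trunc_norm_pdf_sketch(xvals, mu, sigma, cut_lb=None, cut_ub=None):
    """Normal pdf rescaled by the probability mass between the cutoffs."""
    # Treat a missing cutoff as an infinite bound
    lb = -np.inf if cut_lb is None else cut_lb
    ub = np.inf if cut_ub is None else cut_ub
    prob_notcut = (sts.norm.cdf(ub, loc=mu, scale=sigma) -
                   sts.norm.cdf(lb, loc=mu, scale=sigma))
    # Like the function in the diff, this does not zero out the density for
    # xvals outside [lb, ub]; callers are assumed to pass in-range values
    return sts.norm.pdf(xvals, loc=mu, scale=sigma) / prob_notcut


# Illustrative parameter values (assumptions, not estimates from the data)
mu, sigma, lb, ub = 300.0, 50.0, 0.0, 450.0
xvals = np.linspace(lb, ub, 10)

ours = trunc_norm_pdf_sketch(xvals, mu, sigma, lb, ub)
# truncnorm takes its bounds in standard-deviation units around loc
ref = sts.truncnorm.pdf(xvals, (lb - mu) / sigma, (ub - mu) / sigma,
                        loc=mu, scale=sigma)
print(np.allclose(ours, ref))
```

With no cutoffs, `prob_notcut` equals 1 and the function reduces to the ordinary normal pdf.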
206+
207+
208+
(SecMLE_LinReg)=
209+
## Linear regression with MLE
210+
211+
Although linear regression is most often performed using the ordinary least squares (OLS) estimator, which is a particular type of generalized method of moments (GMM) estimator, this can also be done using MLE. A simple regression specification in which the dependent variable $y_i$ is a linear function of two independent variables $x_{1,i}$ and $x_{2,i}$ is the following:
212+
213+
```{math}
214+
:label: EqMLE_LinReg_eqn
215+
y_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \varepsilon_i \quad\text{where}\quad \varepsilon_i\sim N\left(0,\sigma^2\right)
216+
```
217+
218+
If we solve this regression equation for the error term $\varepsilon_i$, we can start to see how we might estimate the parameters of the model by maximum likelihood.
219+
220+
```{math}
221+
:label: EqMLE_LinReg_eps
222+
\varepsilon_i = y_i - \beta_0 - \beta_1 x_{1,i} - \beta_2 x_{2,i} \sim N\left(0,\sigma^2\right)
223+
```
224+
225+
The parameters of the regression model are $(\beta_0, \beta_1, \beta_2, \sigma)$. Given some data $(y_i, x_{1,i}, x_{2,i})$ and given some parameter values $(\beta_0, \beta_1, \beta_2, \sigma)$, we could plot a histogram of the distribution of the implied error terms $\varepsilon_i$ and compare that empirical histogram to the assumed distribution of the errors $N(0,\sigma^2)$. ML estimation of this regression equation chooses the parameters $(\beta_0, \beta_1, \beta_2, \sigma)$ to make that empirical distribution of errors $\varepsilon_i$ most closely match the assumed distribution of errors $N(0,\sigma^2)$.
226+
227+
Note that estimating a linear regression model using MLE is flexible: it can accommodate any assumed distribution of the error terms, not just normally distributed errors.
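The estimation idea described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the chapter's own implementation: the true parameter values, the sample size, the function name `neg_log_lik`, and the choice to optimize over $\log\sigma$ (so that $\sigma$ stays positive) are all assumptions made for the example. Under normal errors the ML estimates of the $\beta$ coefficients coincide with the OLS estimates, which gives a convenient check.

```python
import numpy as np
import scipy.optimize as opt
import scipy.stats as sts


def neg_log_lik(params, y, x1, x2):
    """Negative log likelihood of the errors under N(0, sigma^2)."""
    beta0, beta1, beta2, log_sigma = params
    sigma = np.exp(log_sigma)  # optimize log(sigma) so sigma > 0
    eps = y - beta0 - beta1 * x1 - beta2 * x2
    return -sts.norm.logpdf(eps, loc=0.0, scale=sigma).sum()


# Simulated data with assumed true values (1.0, 2.0, -0.5) and sigma = 0.7
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.7, size=n)

# Minimize the negative log likelihood from a neutral starting point
res = opt.minimize(neg_log_lik, np.zeros(4), args=(y, x1, x2),
                   method='BFGS')
beta_mle = res.x[:3]
sigma_mle = np.exp(res.x[3])

# OLS benchmark for comparison
X = np.column_stack([np.ones(n), x1, x2])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
sigma_ols = np.sqrt(np.mean((y - X @ beta_ols) ** 2))

print('MLE betas:', beta_mle, 'sigma:', sigma_mle)
print('OLS betas:', beta_ols, 'sigma:', sigma_ols)
```

To use a different error distribution, only the `sts.norm.logpdf` call in `neg_log_lik` needs to change.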
228+
229+
230+
(SecMLE_GBfam)=
231+
## Generalized beta family of distributions
232+
233+
144234

145235

146236
(SecMLE_Exerc)=

docs/book/struct_est/SMM.md

Lines changed: 8 additions & 3 deletions
@@ -342,8 +342,7 @@ import matplotlib.pyplot as plt
342342
from mpl_toolkits.mplot3d import Axes3D
343343
344344
345-
# Define function that generates values of a normal pdf
346-
def trunc_norm_pdf(xvals, mu, sigma, cut_lb, cut_ub):
345+
def trunc_norm_pdf(xvals, mu, sigma, cut_lb=None, cut_ub=None):
347346
'''
348347
--------------------------------------------------------------------
349348
Generate pdf values from the normal pdf with mean mu and standard
@@ -397,7 +396,13 @@ def trunc_norm_pdf(xvals, mu, sigma, cut_lb, cut_ub):
397396
# Download and save the data file Econ381totpts.txt as NumPy array
398397
url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
399398
'main/data/smm/Econ381totpts.txt')
400-
data = np.loadtxt(url)
399+
data_file = requests.get(url, allow_redirects=True)
400+
open('../../../data/smm/Econ381totpts.txt', 'wb').write(data_file.content)
401+
if data_file.status_code == 200:
402+
# Load the downloaded data into a NumPy array
403+
data = np.loadtxt('../../../data/smm/Econ381totpts.txt')
404+
else:
405+
print('Error downloading the file')
401406
402407
num_bins = 30
403408
count, bins, ignored = plt.hist(
