Description
Hello,
I read your blog and hoped you might be able to help me. I am learning Bayesian stats, discovered emcee, and I want to simulate the famous German tank problem: you capture some tanks with serial numbers captured = [42, 63, 101] and you want to estimate the total size of the army, assuming the serial numbers run from 1 to Big_Number.
The prior on N, P(N), would be discrete uniform between max(captured) and Big_Number (an upper bound), and 0 otherwise.
The likelihood is also simple: P(x_i | N) = 1/N for 1 <= x_i <= N, and 0 otherwise.
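Concretely, since the captures are treated as i.i.d. draws, the joint likelihood of k observed serial numbers is the product of the 1/N terms, which is (1/N)^k whenever every x_i is in [1, N] and 0 otherwise. A minimal numpy sketch of that joint log-likelihood (function name is my own, just for illustration):

```python
import numpy as np

def joint_log_likelihood(x, n):
    """Log of prod_i P(x_i | N) for the German tank model.

    Each serial number contributes 1/N if it lies in [1, N], so the
    joint log-likelihood is -k*log(N) when max(x) <= N, else -inf.
    """
    x = np.asarray(x)
    if np.all((x >= 1) & (x <= n)):
        return -len(x) * np.log(n)
    return -np.inf

captured = [42, 63, 101]
print(joint_log_likelihood(captured, 150))  # -3 * log(150), about -15.03
print(joint_log_likelihood(captured, 100))  # -inf, since 101 > 100
```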
I managed to write ln_prior and ln_likelihood for the case of observing one tank, and it runs in emcee if I hardcode a single observed value x_1 in the likelihood, but I am clueless about how to extend it to a vector of captured tanks. I thought it would be simple, just calculating the likelihood as the product of P(x_i | N) over the data, but I don't know how to pass the data.
I always end up with:
ValueError: If you start sampling with a given log_prob, you also need to provide the current list of blobs at that position.
Would you show me how to define the parameters and discrete uniform distributions in emcee when the input x is a vector and the likelihood depends on the parameters, please?
import numpy as np
import emcee

big_number = 2000
x_obs = [42]
m = max(x_obs)

def log_prior(n):
    if m <= n <= big_number:
        return 0.0  # flat prior on [m, big_number]
    return -np.inf

def log_sampling(x, n):
    if 0 < x <= n:
        return -np.log(n)
    return -np.inf

def log_posterior(theta, D):
    # emcee passes the parameters as a length-ndim array; unwrap to a
    # plain float, otherwise the returned log-prob is a length-1 array
    # that emcee treats as log-prob plus blobs, which leads to the
    # ValueError above
    n = float(theta[0])
    lp = log_prior(n)
    if not np.isfinite(lp):
        return -np.inf
    s = 0.0
    for x in D:
        s += log_sampling(x, n)
    return lp + s

nwalkers, ndim = 5, 1
# start the walkers uniformly inside the prior support; randn can put
# them outside [m, big_number], where the log-prob is -inf
pos = np.random.uniform(m, big_number, size=(nwalkers, ndim))
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior, args=[x_obs])
sampler.run_mcmc(pos, 20000, progress=True)
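As a sanity check on whatever the sampler returns, the classical frequentist point estimate for this problem is the minimum-variance unbiased estimator N_hat = m(1 + 1/k) - 1, where k is the number of captured tanks and m their largest serial number. A quick numpy-free comparison using the captured = [42, 63, 101] list from the question:

```python
captured = [42, 63, 101]
k = len(captured)
m = max(captured)

# Minimum-variance unbiased estimator for the German tank problem:
# N_hat = m * (1 + 1/k) - 1
n_hat = m * (1 + 1 / k) - 1
print(n_hat)  # about 133.67
```

The posterior samples from emcee should concentrate in the same neighborhood, though the Bayesian answer also depends on the big_number cutoff of the uniform prior.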