Lab2.txt

Revision 1 - 5/31/07 at 3:43 pm by e2holmes

This file is part of the project Teaching code for State-space models
Lab 2

In Lab 2, ML parameter estimates are obtained by minimizing the negative log likelihood from the fit of a CDA to the data (fit is obtained using the Kalman filter).  The uncertainty in the estimates is illustrated with likelihood profiling.  One parameter is held constant while the remaining parameters are re-estimated by ML.  The x-axis shows the value of the parameter being held, while the y-axis shows the negative log likelihood (normed by subtracting the minimum negative log likelihood).  Twice this normed negative log likelihood is approximately chi-square distributed with 1 degree of freedom (from statistical theory).
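The profiling idea can be sketched in a few lines.  This is a Python illustration, not the lab's MATLAB code.  It assumes the CDA is a random walk with drift on the log scale, observed with error (process variance s2, non-process variance s2np), and it replaces the lab's optimizer with a crude grid search.  The function names, the grids, and the data are all made up for illustration.

```python
import math

def kalman_nll(y, mu, s2, s2np):
    """Negative log likelihood of log counts y under the assumed model
         x[t] = x[t-1] + mu + w[t],  w ~ N(0, s2)    (process error)
         y[t] = x[t] + v[t],         v ~ N(0, s2np)  (non-process error)
    Requires s2 + s2np > 0."""
    # Flat prior on the first state: after seeing y[0] it is N(y[0], s2np).
    x, p = y[0], s2np
    nll = 0.0
    for obs in y[1:]:
        x_pred, p_pred = x + mu, p + s2       # predict one step ahead
        f = p_pred + s2np                     # innovation variance
        v = obs - x_pred                      # innovation
        nll += 0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        k = p_pred / f                        # Kalman gain
        x, p = x_pred + k * v, (1.0 - k) * p_pred
    return nll

def profile_mu(y, mu_grid, var_grid):
    """Normed profile for mu: hold mu fixed, minimize over both variances
    (here by brute force over var_grid), then subtract the overall minimum."""
    prof = [min(kalman_nll(y, mu, s2, s2np)
                for s2 in var_grid for s2np in var_grid)
            for mu in mu_grid]
    best = min(prof)
    return [val - best for val in prof]

# Made-up log counts, for illustration only
y = [0.1 * t + 0.05 * (-1) ** t for t in range(21)]
mu_grid = [0.0, 0.05, 0.1, 0.15, 0.2]
var_grid = [0.001, 0.01, 0.1]
for mu, val in zip(mu_grid, profile_mu(y, mu_grid, var_grid)):
    print(mu, round(val, 3))
```

A real profile would use a finer grid for mu and a proper optimizer for the other parameters; the grid search is only there to keep the sketch short.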

Panel 1: This shows the data with the CDA fit using the ML parameters.

Panel 2: This shows the likelihood profile for mu.  mu is the parameter that defines the long-term trend.  In Leslie matrix modeling, the analogous quantity is lambda (more precisely, exp(mu) is the median of lambda).  The profile shows your uncertainty in mu when you allow for uncertainty in whether the variability in the data is due to process or non-process error.

Panels 3 & 4: These show the likelihood profiles for process and non-process error.

At the matlab prompt, type 'Lab2'.
You'll be asked to enter a data code.  Type 0 and see the available data.  Now, you'll pick an animal.  The ML parameters will be estimated (this takes a moment) and the likelihood profiles will be plotted (this takes a few moments).


Exercise 1:  Mute swans on the River Thames.  This is a 100-year time series.  It is an example where process and non-process error can be separated.

Type 'Lab2' at matlab prompt.
When asked for data code, type 6 (this is the Mute swan time series).
Be patient, the likelihood profiles are going to take a minute or two since this is such a long time series.

Things to note:  While this is a 100-year time series, the CIs on mu still include 0.  Thus, these data could easily have been produced by either a slightly declining or a slightly increasing CDA.
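Reading a CI off a profile can be sketched as follows (Python, for illustration; the lab itself is in MATLAB).  The sketch assumes the usual chi-square rule: twice the normed negative log likelihood is compared with 3.84, the 95% point of a chi-square with 1 degree of freedom, so the cutoff on the normed negative log likelihood is about 1.92.  The function name and the profile values are made up.

```python
def profile_interval(grid, normed_nll, cutoff=1.92):
    """Smallest and largest grid values whose normed negative log
    likelihood is at or below the cutoff (approximate 95% CI by default)."""
    inside = [g for g, v in zip(grid, normed_nll) if v <= cutoff]
    return min(inside), max(inside)

# Made-up profile for mu, for illustration only
mu_grid = [-0.04, -0.02, 0.0, 0.02, 0.04]
normed = [3.1, 1.0, 0.0, 1.2, 3.5]

lo, hi = profile_interval(mu_grid, normed)
includes_zero = lo <= 0.0 <= hi   # True means the trend could go either way
print(lo, hi, includes_zero)
```

The interval is only as precise as the grid spacing; interpolating between grid points would sharpen the end points.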



Exercise 2:  Separating process and non-process error is not easy

Type 'Lab2' at matlab prompt.
When asked for data code, type 2 (Trumpeter swans).

When the figure comes up, type Lab2 again, and type in data code 3 (White-capped albatross).

Do it one more time, and type in the data code 1 (Grey-headed albatross).

Questions for Exercise 2:

1. In these 3 examples, we see that the maximum likelihood fit assigns all the variability in the data to non-process error (e.g. measurement error).  Why is that?  Recall the slide from the lecture on CDAs with 9 panels showing different realizations of the same CDA.  If the CDA had high process error, how likely is a beeline exponential decline with relatively equal deviations on both sides of the decline?  Look at the data from the Grey-headed albatross.

2. That pattern is unlikely, which suggests that process error is low, but is it 0?  Why is the ML fit assigning 0 to the process error?  That's biologically unreasonable; process error cannot be zero.  By the way, this really is the maximum likelihood estimate.

3. However, we don't know if it matters if process error is around 1e-5 (for example) and the ML fit assigns 1e-18 to it.  How might you try to understand that?  It depends on what you're trying to do with the CDA (i.e. what PVA risk metric you're trying to estimate).  How might you try to understand risk metric sensitivity?


Exercise 3: Interior local minima.  

To get the maximum likelihood estimates, we find the global minimum of the negative log likelihood surface.  But as we've seen, that global minimum is at process error equal to zero (essentially), even though we can be sure that process error is not equal to zero.  However, the surface can have more than one minimum.  Sometimes (but not always) a local minimum with both process and non-process error non-zero is evident.
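Spotting an interior minimum on a profile is easy to do by eye, but it can also be sketched in code (Python, for illustration only).  The function name and the profile values below are made up; in the lab you would read the values off the process error panel.

```python
def local_minima(grid, nll):
    """Return (grid value, nll) pairs that are lower than both neighbours.
    End points count if they are lower than their single neighbour."""
    found = []
    n = len(nll)
    for i in range(n):
        left_ok = i == 0 or nll[i] < nll[i - 1]
        right_ok = i == n - 1 or nll[i] < nll[i + 1]
        if left_ok and right_ok:
            found.append((grid[i], nll[i]))
    return found

# Made-up normed profile for process error, for illustration only
s2_grid = [0.0, 0.01, 0.02, 0.03, 0.04]
normed = [0.0, 0.9, 0.6, 0.4, 0.8]

# One minimum at the boundary (s2 = 0) and one in the interior
print(local_minima(s2_grid, normed))
```

In this made-up profile the boundary minimum at zero is the global one, and the interior minimum is the candidate for an estimate with non-zero process error.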

Type 'Lab2' at the matlab prompt.
When asked for data code, type 5 (Sharp-tailed grouse).

Questions for Exercise 3:

1.  What is the ML estimate of process error, conditioned on process error being non-zero?  I'll refer to this as s2_new.

2.  To get the ML estimate of non-process error, we would then hold process error at this value while maximizing over the other parameters.  However, we can get a rough estimate using:  total variance = s2 + 2*s2np (process error + twice non-process error).  For the first fit, s2_old = 0, so we can calculate total variance as 2*s2np_old (the s2np_old estimate is in the title of the bottom panel).  Then s2np_new = (total variance - s2_new)/2.

3. Write down {mu, s2_old, s2np_old} and {mu, s2_new, s2np_new}.  At the matlab prompt, type Lab1, then type in data code 5, then at the prompt type in these two different parameter sets.  Now you can see how these different estimates affect the ML estimate of the true population size.
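The rough estimate in question 2 is just the identity total variance = s2 + 2*s2np solved for s2np.  A small Python sketch of the arithmetic follows (the function name and the numbers are made up; use the values from your own figure).

```python
def rough_s2np_new(s2np_old, s2_new, s2_old=0.0):
    """Rough non-process variance that keeps the total variance of the
    first fit, given a new (non-zero) process variance s2_new."""
    total = s2_old + 2.0 * s2np_old       # total variance = s2 + 2*s2np
    s2np_new = (total - s2_new) / 2.0     # solve the same identity for s2np
    if s2np_new < 0.0:
        raise ValueError("s2_new exceeds the total variance")
    return s2np_new

# Made-up values, for illustration only
print(rough_s2np_new(s2np_old=0.04, s2_new=0.02))
```

This keeps the total variance fixed and only reallocates it between the two error sources, so it is a starting point rather than an ML estimate.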


Exercise 4: High uncertainty in trend when process error estimate is high.

Type 'Lab2' at matlab prompt.
When asked for data code, type 7 (Wolves).

Questions for Exercise 4:

1. Why is the trend estimate (the mu parameter) so uncertain?  Hint: it has to do with all the variance being attributed to process error.

