Short version:
We know that logistic regression and probit regression can be interpreted as involving a continuous latent variable that is discretized according to some fixed threshold prior to observation. Is a similar latent variable interpretation available for, say, Poisson regression? How about for Binomial regression (like logit or probit) when there are more than two discrete outcomes? At the most general level, is there a way of interpreting any GLM in terms of latent variables?
Long version:
A standard way of motivating the probit model for binary outcomes (e.g., from Wikipedia) is the following. We have an unobserved/latent outcome variable $y^*$ that is normally distributed, conditional on the predictor $x$. This latent variable is subjected to a thresholding process, so that the discrete outcome we actually observe is $y = 1$ if $y^* > \gamma$, and $y = 0$ if $y^* \le \gamma$. This leads the probability of $y = 1$ given $x$ to take the form of a normal CDF, with mean and standard deviation that are functions of the threshold $\gamma$ and the slope of the regression of $y^*$ on $x$, respectively. So the probit model is motivated as a way of estimating the slope from this latent regression of $y^*$ on $x$.
This is illustrated in the plot below, from Thissen & Orlando (2001). These authors are technically discussing the normal ogive model from item response theory, which looks pretty much like probit regression for our purposes (note that these authors use $\theta$ in place of $x$, and probability is written with $T$ instead of the usual $P$).
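To make the thresholding story concrete, here is a minimal simulation sketch (assuming Python with numpy and statsmodels; the true values $a$, $b$, and the threshold $\gamma$ below are arbitrary illustrative choices): generate the latent $y^*$, discretize it at $\gamma$, and check that probit regression on the observed binary outcome recovers the latent slope.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Latent regression: y* = a + b*x + e, with e ~ Normal(0, 1).
# Observed outcome: y = 1 if y* > gamma, else y = 0.
n = 100_000
a, b, gamma = 0.5, 1.5, 1.0               # illustrative values
x = rng.normal(size=n)
y_star = a + b * x + rng.normal(size=n)   # latent continuous outcome
y = (y_star > gamma).astype(int)          # thresholding step

# Probit on the observed y: since P(y = 1 | x) = Phi(a - gamma + b*x),
# the fit should recover roughly [a - gamma, b] = [-0.5, 1.5]
# (identified here only because the residual SD is fixed at 1).
fit = sm.Probit(y, sm.add_constant(x)).fit(disp=0)
print(fit.params)
```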
We can interpret logistic regression in pretty much exactly the same way. The only difference is that now the unobserved continuous $y^*$ follows a logistic distribution, not a normal distribution, given $x$. A theoretical argument for why $y^*$ might follow a logistic distribution rather than a normal distribution is a bit less clear... but since the resulting logistic curve looks essentially the same as the normal CDF for practical purposes (after rescaling), arguably it won't tend to matter much in practice which model you use. The point is that both models have a pretty straightforward latent variable interpretation.
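As a quick numerical check of the "essentially the same after rescaling" claim, one can compare the normal CDF with a rescaled logistic CDF (a sketch using scipy; the 1.702 constant is the classic scaling factor from the IRT literature):

```python
import numpy as np
from scipy.special import expit   # the logistic CDF
from scipy.stats import norm

# Maximum gap between the normal CDF and the logistic CDF rescaled by 1.702.
z = np.linspace(-4, 4, 801)
gap = np.abs(norm.cdf(z) - expit(1.702 * z))
print(gap.max())   # roughly 0.01 -- about a percentage point at worst
```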
I want to know if we can apply similar-looking (or, hell, dissimilar-looking) latent variable interpretations to other GLMs -- or even to any GLM.
Even extending the models above to account for Binomial outcomes with $n > 1$ (i.e., not just Bernoulli outcomes) is not entirely clear to me. Presumably one could do this by imagining that instead of having a single threshold $\gamma$, we have multiple thresholds (one fewer than the number of observed discrete outcomes). But we would need to impose some constraint on the thresholds, like that they are evenly spaced. I'm pretty sure something like this could work, although I haven't worked out the details.
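Here is a rough sketch of what that multiple-threshold idea could look like (essentially the ordered-probit construction; the evenly spaced cutpoints and all numbers are illustrative assumptions, and whether the resulting category probabilities can actually be made to match a Binomial is exactly the detail left open above):

```python
import numpy as np

rng = np.random.default_rng(2)

# One latent y*, n evenly spaced thresholds, hence n + 1 ordered outcomes 0..n.
n_trials = 3
thresholds = np.linspace(-1.0, 1.0, n_trials)   # assumed evenly spaced

x = rng.normal(size=100_000)
y_star = 0.8 * x + rng.normal(size=x.size)      # latent continuous outcome
y = np.searchsorted(thresholds, y_star)         # number of thresholds crossed

# Frequencies of the n + 1 ordered outcomes; matching these to
# Binomial(n, p(x)) probabilities is the unworked-out detail.
print(np.bincount(y, minlength=n_trials + 1) / y.size)
```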
Moving to the case of Poisson regression seems even less clear to me. I'm not sure if the notion of thresholds is going to be the best way to think about the model in this case. I'm also not sure what kind of distribution we could conceive of the latent outcome as having.
The most desirable solution to this would be a general way of interpreting any GLM in terms of latent variables with some distribution or other -- even if this general solution were to imply a different latent variable interpretation than the usual one for logit/probit regression. Of course, it would be even cooler if the general method agreed with the usual interpretations of logit/probit, but also extended naturally to other GLMs.
But even if such latent variable interpretations are not available in the fully general GLM case, I would also like to hear about latent variable interpretations of special cases like the Binomial and Poisson cases that I mentioned above.
References
Thissen, D., & Orlando, M. (2001). Item response theory for items scored in two categories. In D. Thissen & H. Wainer (Eds.), Test Scoring (pp. 73-140). Mahwah, NJ: Lawrence Erlbaum Associates.
Edit 2016-09-23
There is one sort of trivial sense in which any GLM is a latent variable model, which is that we can arguably always view the parameter of the outcome distribution being estimated as a "latent variable" -- that is, we don't directly observe, say, the rate parameter $\lambda$ of the Poisson, we just infer it from data. I consider this to be a rather trivial interpretation, and not really what I'm looking for, because according to this interpretation any linear model (and of course many other models!) is a "latent variable model." For example, in normal regression we estimate a "latent" mean $\mu$ of normal $y$ given $x$. So this seems to conflate latent variable modeling with just parameter estimation. What I'm looking for, in the Poisson regression case for example, would look more like a theoretical model for why the observed outcome should have a Poisson distribution in the first place, given some assumptions (to be filled in by you!) about the distribution of the latent $y^*$, the selection process if there is one, etc. Then (perhaps crucially?) we should be able to interpret the estimated GLM coefficients in terms of the parameters of these latent distributions/processes, similar to how we can interpret coefficients from probit regression in terms of mean shifts in the latent normal variable and/or shifts in the threshold $\gamma$.
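For contrast, here is what that trivial interpretation amounts to in code (a sketch assuming numpy and statsmodels, with arbitrary true values $a$ and $b$): the rate $\lambda(x)$ is indeed unobserved and inferred, but nothing more "latent" is going on here than in any parameter estimation problem.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Poisson regression: y ~ Poisson(lambda(x)), with log lambda(x) = a + b*x.
n = 50_000
a, b = 0.2, 0.7                    # illustrative values
x = rng.normal(size=n)
y = rng.poisson(np.exp(a + b * x))

# We never observe lambda(x); we just infer a and b from the counts.
fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
print(fit.params)              # approximately [a, b]
print(np.exp(fit.params[0]))   # the inferred "latent" rate at x = 0
```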
Answers:
For models with more than one discrete outcome, there are several versions of logit models (e.g. conditional logit, multinomial logit, mixed logit, nested logit, ...). See Kenneth Train's book on the subject: http://eml.berkeley.edu/books/choice2.html
For example, in conditional logit, the outcome $y$ is the car chosen by an individual. There may be, say, $J$ cars to choose from, and car $j$ has attributes given by $x_j$. Then suppose that individual $i$ receives utility $u_{ij} = x_j\beta + \varepsilon_{ij}$ from choosing car $j$, where $\varepsilon_{ij}$ is distributed type I extreme value. Then the probability that car $j$ is chosen is given by

$$P(y_i = j) = \frac{\exp(x_j\beta)}{\sum_{k=1}^{J} \exp(x_k\beta)}.$$
In this model, the utilities $u_{ij}$ form a ranking of the alternatives. We are searching for parameters $\beta$ such that this ranking conforms with the observed choices we see people making. E.g., if more expensive cars have lower market shares, all else equal, then the coefficient on price must be negative.
Economists interpret $u$ as a latent "utility" of making each choice. In microeconomics, there is a considerable body of work on utility theory: see e.g. https://en.wikipedia.org/wiki/Utility.
Note that there is no "threshold" parameter here: instead, when one alternative's utility overtakes the previously greatest one, the consumer switches to choosing that alternative.
Therefore, there cannot be an intercept in $x_j\beta$: if there were, it would just shift the utility of all the available options by the same amount, leaving the ranking preserved and the choice unchanged.
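As a numerical check of the choice probability formula above (a sketch in Python with numpy; the attribute values and coefficient are made up), simulating type I extreme value (Gumbel) utility shocks and letting each individual pick the highest-utility car reproduces the conditional logit probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# J = 3 cars, each with a single attribute (say, price) and a negative
# price coefficient, so more expensive cars should be chosen less often.
J = 3
x = np.array([[1.0], [2.0], [3.0]])   # attribute x_j of each car
beta = np.array([-0.5])
v = x @ beta                          # systematic utility x_j * beta

# Closed-form conditional logit probabilities.
p_logit = np.exp(v) / np.exp(v).sum()

# Monte Carlo: u_ij = v_j + eps_ij with eps_ij iid Gumbel (type I extreme
# value); individual i chooses the car with the highest utility u_ij.
n = 200_000
choices = np.argmax(v + rng.gumbel(size=(n, J)), axis=1)
p_sim = np.bincount(choices, minlength=J) / n

print(p_logit)   # about [0.506, 0.307, 0.186]
print(p_sim)     # should closely match p_logit
```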