$\begingroup$

Consider the following equation for $Y>0$: $$ (1) \quad \log(Y)=\log(\gamma)+\log(\alpha+\beta X)+\epsilon. $$ Assume that $E(\epsilon| X)=c\neq 0$. What are the consequences of this assumption for the estimation of $(\gamma, \alpha, \beta)$? More precisely, does $c\neq 0$ only bias the estimated intercept $\gamma$ (as in linear least squares), or does the fact that this is a nonlinear least squares model imply that $c\neq 0$ may also bias the estimates of $(\alpha, \beta)$?

Update 1: Thanks to the comment below, we can answer the question as follows. Let $\zeta\equiv \epsilon -c$, so that $E(\zeta| X)=0$. Then (1) can be rewritten as $$ \begin{aligned} \log(Y)&=\log(\gamma)+\log(\alpha+\beta X)+\zeta+c\\ &=\log(\gamma\exp(c))+\log(\alpha+\beta X)+\zeta. \end{aligned} $$ Therefore, $c$ only causes a bias in the estimation of $\gamma$.

Update 2: To make sure I understand, let me complicate (1) as follows: $$ (2) \quad \log(Y)=\log(\gamma)+\sum_{k=1}^K D_k[\log(\alpha_k+\beta_k X)+\epsilon_k], $$ where $(D_1,\dots, D_K)$ is a vector of dummy variables in which exactly one element takes the value 1, and $E(\epsilon_k| D_1,\dots, D_K, X)=c\neq 0$. As before, I want to investigate which coefficients are biased by $c$. Let $\zeta_k\equiv \epsilon_k -c$. Then (2) can be rewritten as $$ \begin{aligned} \log(Y)&=\log(\gamma)+\sum_k D_k [\log(\alpha_k+\beta_k X)+\zeta_k+c]\\ &=\log(\gamma\exp(c\sum_k D_k ))+\sum_k D_k [\log(\alpha_k+\beta_k X) +\zeta_k]\\ &=\log(\gamma\exp(c))+\sum_k D_k [\log(\alpha_k+\beta_k X) +\zeta_k], \end{aligned} $$ where the last step uses $\sum_k D_k=1$. Hence, same answer as above.
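
As a sanity check on the last step, here is the same derivation written regime by regime. It is only a sketch: the index $j$ for the active regime is notation introduced here. Conditional on $D_j=1$ (so $D_k=0$ for $k\neq j$), and with $\zeta_j\equiv\epsilon_j-c$ as above, (2) reduces to $$ \begin{aligned} \log(Y)&=\log(\gamma)+\log(\alpha_j+\beta_j X)+\epsilon_j\\ &=\log(\gamma\exp(c))+\log(\alpha_j+\beta_j X)+\zeta_j, \end{aligned} $$ with $E(\zeta_j| D_1,\dots,D_K, X)=0$. So (2) is just (1) applied within each regime, with the same shift $c$ everywhere. This seems to rely on $c$ being common to all $k$. If the conditional mean were regime-specific, say $c_k$ (a hypothetical extension, not part of (2)), the shift could not be absorbed into the single constant $\gamma$ and would instead rescale $(\alpha_k,\beta_k)$ by $\exp(c_k)$.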

Update 3: The comment below also points out, more generally, that (1) is not identifiable. This is because (with $c=0$): $$ \begin{aligned} &\log(\gamma )+\log(\alpha+\beta X)+\epsilon\\ &=\log(\gamma (\alpha+\beta X))+\epsilon\\ &=\log(\gamma \alpha + \gamma \beta X)+\epsilon. \end{aligned} $$ Hence, we can only identify $\gamma \alpha$ and $\gamma \beta$. Does the same issue hold in (2)? I think so. Indeed, $$ \begin{aligned} &\log(\gamma )+\sum_k D_k[\log(\alpha_k+\beta_k X)+\epsilon_k]\\ &=\log(\gamma) + \sum_k \log[(\alpha_k+\beta_k X)^{D_k}] + \sum_k D_k \epsilon_k\\ &=\log(\gamma \Pi_{k} (\alpha_k+\beta_k X)^{D_k})+ \sum_k D_k \epsilon_k. \end{aligned} $$ Since exactly one $D_k$ equals 1, the right-hand side is $\log(\gamma\alpha_k+\gamma\beta_k X)+\epsilon_k$ for the active $k$, so I believe only the products $\gamma\alpha_k$ and $\gamma\beta_k$ can be identified.
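
For model (1), both points (the shift by $c$ and the fact that only products are identified) can be checked numerically. The following is a minimal sketch rather than a definitive implementation. It fits the identifiable form $\log(Y)=\log(a+bX)+\text{error}$, where $a=\gamma\alpha$ and $b=\gamma\beta$ when $c=0$. The function names `simulate` and `fit_log_affine`, the parameter values, the uniform design for $X$, the normal errors, and the use of a plain Gauss-Newton iteration as the NLS solver are all assumptions made for illustration.

```python
import numpy as np


def simulate(c, n=20000, gamma=2.0, alpha=1.0, beta=0.5, sigma=0.1, seed=0):
    """Draw (x, log_y) from model (1) with E(eps | X) = c (illustrative values)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 5.0, n)
    eps = c + sigma * rng.standard_normal(n)
    log_y = np.log(gamma) + np.log(alpha + beta * x) + eps
    return x, log_y


def fit_log_affine(x, log_y, n_iter=50):
    """Gauss-Newton least squares for log_y = log(a + b*x) + error; returns (a, b)."""
    design = np.column_stack([np.ones_like(x), x])
    # Starting values: OLS of exp(log_y) on (1, x).
    theta, *_ = np.linalg.lstsq(design, np.exp(log_y), rcond=None)
    for _ in range(n_iter):
        mean = design @ theta            # a + b*x, assumed to stay positive
        resid = log_y - np.log(mean)
        jac = design / mean[:, None]     # derivative of log(a + b*x) w.r.t. (a, b)
        step, *_ = np.linalg.lstsq(jac, resid, rcond=None)
        theta = theta + step
    return theta[0], theta[1]


for c in (0.0, 0.3):
    x, log_y = simulate(c)
    a_hat, b_hat = fit_log_affine(x, log_y)
    print(f"c={c}: a_hat={a_hat:.3f}, b_hat={b_hat:.3f}, "
          f"b_hat/a_hat={b_hat / a_hat:.3f}")
```

Under these assumptions, the fitted $a$ and $b$ should both be scaled by roughly $\exp(c)$ relative to the $c=0$ fit, while the ratio $b/a$ stays close to $\beta/\alpha$. That is consistent with $c$ being absorbed into $\gamma$, which itself only enters through the products $\gamma\alpha$ and $\gamma\beta$.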

$\endgroup$
  • $\begingroup$ You can answer this yourself. Write $\epsilon = c + \zeta$ where now $E[\zeta]=0$ and notice you can incorporate $c$ into the $\log\gamma = \log(\gamma\exp(c))$ term, thereby changing $\gamma$ -- but nothing else. Thus, if there are any consequences at all, they would lie in any distributional assumptions you might be making about $\epsilon:$ what are they? // Unfortunately, this is a terrible model because it's not identifiable: it's equivalent to $\log(Y)=\log(\alpha\gamma+(\beta\gamma)X)+\epsilon,$ demonstrating it really only has three parameters $c,$ $\alpha\gamma,$ and $\beta\gamma.$ $\endgroup$ Commented May 7, 2024 at 20:08
  • $\begingroup$ @whuber: Thanks. Could you please check the 3 updates added to my question, based on your comment? $\endgroup$ Commented May 8, 2024 at 9:45
