$\begingroup$

In a multiple regression, how do you interpret the slope coefficient of a variable if some other power of that variable, say its square, also appears as an explanatory variable? For instance, suppose I estimate the following regression function: $$y = \beta_0 + \beta_1x + \beta_2x^2 + \epsilon.$$ How do I interpret the respective slope coefficients, since changing $x$ while keeping $x^2$ constant (or vice versa) makes no sense?

$\endgroup$
  • $\begingroup$ This is very similar to, and possibly a duplicate of, this question: How to include $x$ and $x^2$ into regression, and whether to center them? $\endgroup$ Commented Sep 22 at 12:15
  • $\begingroup$ @Silverfish the linked question doesn't talk of the interpretation of the coefficients, so I don't think this is an exact duplicate. $\endgroup$ Commented Sep 22 at 17:45
  • $\begingroup$ Yes, the question posed is slightly different, and I wouldn't consider it a duplicate myself, which is why I didn't flag it; some people have wider interpretations of "duplication" than others. As whuber commented on the other question, "How to answer your questions about centering comes down to how you wish to interpret the coefficients", so the question implicitly concerned interpretation more than that OP realised! $\endgroup$ Commented Sep 22 at 17:56
  • $\begingroup$ Are you perhaps trying to estimate $\mathrm dy/\mathrm dx = \beta_1 + 2\beta_2x$? $\endgroup$ Commented Sep 23 at 15:16

4 Answers

$\begingroup$

$\beta_1$ by itself tells you whether $y$ is increasing or decreasing at $x=0$; this may or may not be interesting.

Combining it with the other estimates can give other pieces of information that may be helpful. In the quadratic case shown in the question, $-\frac{\beta_1}{2\beta_2}$ is the value of $x$ where the best-fit parabola reaches its minimum or maximum (depending on the sign of $\beta_2$) and $\beta_0 - \frac{\beta_1^2}{4\beta_2}$ is the height ($y$) at the minimum/maximum.
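As a quick numerical sketch of these vertex formulas (a hypothetical illustration with simulated data; the variable names and values are my own, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from a known quadratic: y = 2 - 3x + 0.5x^2 + noise,
# whose minimum is at x = 3, y = -2.5
x = rng.uniform(-2, 8, size=200)
y = 2 - 3 * x + 0.5 * x**2 + rng.normal(scale=0.5, size=200)

# Fit y = b0 + b1*x + b2*x^2 by ordinary least squares
X = np.column_stack([np.ones_like(x), x, x**2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Vertex of the fitted parabola: x* = -b1/(2*b2), y* = b0 - b1^2/(4*b2)
x_star = -b1 / (2 * b2)
y_star = b0 - b1**2 / (4 * b2)
print(x_star, y_star)  # should land close to the true minimum (3, -2.5)
```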

If there are cubic and/or higher terms, then you may be able to find some similar quantities by completing the cube, etc., but at that point it is probably easiest just to plot the curve over the range of interest, as @Peter Flom suggests.

$\endgroup$
$\begingroup$

I find that the best way to do this is graphically.

If these are the only regressors, it's relatively simple. Make a plot with $x$ on the x-axis and the dependent variable on the y-axis, then plot the predicted value of the DV at various levels of $x$. Software should make this fairly easy; I know R and SAS do, and I'd be surprised if it were hard in other good stats software.

If there are other covariates, you can make multiple plots, or trellis plots, or multiple lines on one plot (for different levels of the covariate). My choice would depend on how many covariates there are and whether they are continuous, ordinal, or nominal.
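A minimal sketch of such a plot in Python (assuming numpy and matplotlib are available; the data here are simulated purely for illustration):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripting
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Simulated data with a quadratic relationship
x = rng.uniform(0, 10, size=100)
y = 1 + 0.8 * x - 0.1 * x**2 + rng.normal(scale=0.3, size=100)

# Fit the quadratic, then predict over a fine grid of x values
coefs = np.polyfit(x, y, deg=2)          # highest-degree coefficient first
grid = np.linspace(x.min(), x.max(), 200)
yhat = np.polyval(coefs, grid)

# Observed points plus the fitted curve
plt.scatter(x, y, s=10, alpha=0.5, label="observed")
plt.plot(grid, yhat, color="red", label="fitted quadratic")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.savefig("quadratic_fit.png")
```

The same grid-of-predictions idea extends to models with covariates: hold each covariate at a chosen level and draw one curve per level.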

$\endgroup$
$\begingroup$

If your model were $y = \beta_0 + \beta_1x + \beta_2z +\epsilon$, you might say that increasing $x$ by $\delta x$ while keeping $z$ constant would be associated with an expected change $\beta_1 \delta x$ in $y$. This corresponds to the partial derivative of $y = \beta_0 + \beta_1x + \beta_2z$ being $\frac{\partial y}{\partial x} = \beta_1$, which is what you are calling the slope.

When your model is $y = \beta_0 + \beta_1x + \beta_2x^2+ \epsilon$, you might similarly say that increasing $x$ by $\delta x$ corresponds to the partial derivative of $y = \beta_0 + \beta_1x + \beta_2x^2$ being $\frac{\partial y}{\partial x} = \beta_1+2\beta_2 x$ and so an expected change in $y$ of $$(\beta_1+2\beta_2 x)\delta x.$$ Note that this now varies with $x$.
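To make the $x$-dependence concrete, here is a tiny numerical sketch (the coefficient values are hypothetical, not from the question):

```python
# Coefficients from a hypothetical fitted quadratic y = b0 + b1*x + b2*x^2
b1, b2 = 4.0, -0.5

# The marginal effect dy/dx = b1 + 2*b2*x depends on where it is evaluated
for x in [0.0, 2.0, 4.0, 6.0]:
    slope = b1 + 2 * b2 * x
    print(f"at x = {x}: expected change in y per unit increase in x is {slope}")
# At x = 0 the slope equals b1; at x = 4 it is zero (the vertex);
# beyond the vertex it turns negative.
```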

$\endgroup$
  • $\begingroup$ I get how to find the change; what I wish to know is whether $\beta_1$ has a straightforward interpretation in this case, as it does when it measures the ceteris paribus effect? $\endgroup$ Commented Sep 19 at 10:54
  • $\begingroup$ @Science_notfound In the polynomial model, $\beta_1$ roughly measures the expected effect on $y$ of a small change in $x$ when $x$ is close to $0$, though not elsewhere. That should not be a surprise, as introducing higher powers of $x$ into the model is deliberately designed to allow the relationship between $y$ and $x$ to change with different values of $x$. $\endgroup$ Commented Sep 19 at 11:47
  • $\begingroup$ Further to Henry's comment above: this is one of the reasons people sometimes recenter their predictor variables. Then the coefficient on the linear term represents the slope at the mean value of $x$ rather than at $x=0$, which is often more important and much more readily interpretable (especially for variables whose observed range doesn't go anywhere near zero). But this is a rather controversial choice; see When conducting multiple regression, when should you center your predictor variables & when should you standardize them? $\endgroup$ Commented Sep 22 at 12:09
$\begingroup$

You can reformulate the equation $$y=ax^2+bx+c$$ into the equivalent $$y=-4y_{max}\frac{(x-x_0)(x-x_1)}{(x_0-x_1)^2},$$ where $x_0$ and $x_1$ are the roots of the parabola and $y_{max}$ is its height at the vertex.

In some scenarios, the parameters $y_{max}$, $x_0$ and $x_1$ are easier to interpret.
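A small sketch of converting between the two parameterizations (hypothetical coefficient values of my own; $y_{max}$ here is the value at the vertex, a maximum when $a<0$):

```python
import numpy as np

# Original parameterization: y = a*x^2 + b*x + c (a < 0, so there is a maximum)
a, b, c = -1.0, 6.0, -5.0

# Roots and extremum of the parabola
x0, x1 = np.sort(np.roots([a, b, c]).real)  # roots at x = 1 and x = 5 here
y_max = c - b**2 / (4 * a)                  # height at the vertex

# Reparameterized form: y = -4*y_max*(x - x0)*(x - x1)/(x0 - x1)^2
def y_reparam(x):
    return -4 * y_max * (x - x0) * (x - x1) / (x0 - x1) ** 2

# The two forms agree everywhere
xs = np.linspace(-2, 8, 50)
print(np.allclose(a * xs**2 + b * xs + c, y_reparam(xs)))  # True
```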

$\endgroup$
