Newest 'linear-model' Questions

0 votes

0 answers

40 views

Maximum likelihood estimation for linear regression [duplicate]

When conducting maximum likelihood estimation for simple linear regression whilst considering the regressors as random, the joint distribution of $f_{X,Y}(x,y;\theta) = f_{Y|X}(y|x;\theta) * f_{X}(x;\...

froot

83

asked Nov 7 at 19:53

4 votes

3 answers

145 views

Homoscedasticity for a linear model

I have a linear model with two continuous variables and three categorical variables. Do I need to check homoscedasticity within each level of my categorical variables, or is it sufficient to check ...

Guasco

41

asked Oct 27 at 14:48

2 votes

1 answer

242 views

Calculating standard errors in least squares and the normality assumption

The question titled “How are the standard errors of coefficients calculated in a regression?” is asking how the standard errors of regression coefficient estimates are computed (for example, the ...

Laut567

83

asked Oct 25 at 8:11

1 vote

0 answers

22 views

What to do if your residuals over time are not independent?

Design: 2 groups (treatment vs control), 4 time points (baseline/time 0, time 1, time 2, and follow-up/time 3). Times 0 to 2 are equally spaced (2 weeks apart), while follow-up occurs 4 weeks after ...

Maya

11

asked Oct 24 at 14:28

0 votes

1 answer

85 views

Does endogeneity in a linear model imply a non-linear conditional mean?

Consider the model $y = a + bx + u$, where $x$ is endogenous. This implies that $E(u|x)\neq 0$. I believe this means that no values of $a$ and $b$ exist that can make $E(u|x)=0$? So ...

seekingknowledge111

1

asked Oct 24 at 6:31

3 votes

1 answer

68 views

Efficient minimization of minimax objective function involving piecewise linear functions

Given an empirical cdf $\hat{F}$ with support on $[0,1]$, I am interested in finding the histogram with $B$ (unequal) bins with cdf $F_B$ that minimizes the maximum absolute deviation between the cdfs....

Leland Stirner

243

asked Oct 21 at 23:36

1 vote

0 answers

38 views

Choosing a Reference Value when Releveling Factors to Calculate Change Over Time

Context: I have a data set based around 16 different locations. Each location has a contaminant value measured once per year, from 2012 to 2023. The data looks something like this: Location Type Year ...

User493461

11

asked Oct 21 at 16:16

6 votes

2 answers

262 views

Expectation and Kronecker product

Let $\mathbf{u} \sim \mathcal{U}(S_{\mathbb{R}^m})$ be a uniformly distributed random vector on the unit sphere $S_{\mathbb{R}^m} \triangleq \{\mathbf{u}\in \mathbb{R}^m\mid\|\mathbf{u}\|=1\}$ and let ...

User1002546

203

asked Oct 17 at 8:55

4 votes

1 answer

77 views

Standardized coefficients vs Permutation-based variable importance

I recently read a post detailing the issues with using standardized coefficients as a measure of variable importance, and while looking for alternatives, I found several posts here discussing the use ...

CorinthianHelm

41

asked Oct 10 at 16:30

0 votes

0 answers

77 views

Durbin-Watson test for weighted linear regression

My question concerns the use of the Durbin-Watson test for a weighted linear model in the context of calibration curves (a simple model y = ax + b in my case). I saw that there is a similar question ...

finattisaka

31

asked Sep 27 at 0:30

3 votes

0 answers

93 views

Running the Breusch-Pagan test manually in R assuming a weighted linear regression

I am trying to run the Breusch-Pagan test manually in RStudio from a weighted linear model (wi = 1/x^2). I need help verifying whether the following rationale is correct: What I did: WLS and residuals ...

finattisaka

31

asked Sep 26 at 1:30

6 votes

2 answers

301 views

Question on simple causal modeling

My causal graph looks like this: $A\to B$, $B \to C$ and $A \to C$. I want to model the direct influence of $B$ on $C$, i.e. changing $B$ by one unit, how much does $C$ change? I think the correct ...

Baron Yugovich

509

asked Sep 21 at 15:06

0 votes

0 answers

27 views

Partial Least Square Regression with Oracle on Variance matrix

I consider a centered random vector $(X_1,\cdots,X_d)$ and a real-valued random variable $Y$ such that the following model holds: \begin{align*} Y = \beta^{*}X^{\top} + \varepsilon \end{align*} with $...

arthur_elbrdn

83

asked Sep 3 at 16:32

0 votes

0 answers

61 views

Research method selection

I used a robust linear regression to evaluate the impact of some variables on a dependent variable, their linear correlation being tested and proven. Now, I want to compute an importance score of ...

Corina

1

asked Sep 1 at 23:29

3 votes

0 answers

120 views

Poles of rational basis functions as nonlinear features

Suppose I want to fit a linear model to non-linear rational features. Something like RationalTransformer instead of ...

Alex Shtoff

31

asked Aug 24 at 6:50

0 votes

0 answers

93 views

Meaning of zero autocorrelation when performing linear regression on unstructured data

I have a seemingly very simple question that I cannot find the answer to. When performing linear regression, we assume that the correlation between residuals is zero. This makes sense to me ...

Joshua Schroijen

235

asked Aug 18 at 14:40

2 votes

1 answer

75 views

Test for Pleiotropy vs close Linkage

This question is especially for experts in fitting linear models. I'm currently investigating pleiotropic associations in oats, and I found a paper by Schulthess et al. (2017) that proposes a method to distinguish ...

Francoise Dariva

21

asked Aug 1 at 13:56

6 votes

1 answer

168 views

Should I use lme4::lmer or nlme::lme for a repeated measures frog phonotaxis experiment with low between-subject variance?

I'm analyzing data from a frog phonotaxis experiment where I tested 17 females, each undergoing two trials. In each trial, a female was placed in a choice arena and exposed to two different acoustic ...

Lucero Luna Montilla lLunace

63

asked Jul 31 at 15:02

3 votes

2 answers

136 views

Impact of selection of features before Ridge regression: adaptation of regularization

I consider $X=(X_1,\cdots,X_d)$ a centered random vector such that its covariance matrix $\Sigma \in \mathbb{R}^{d \times d}$ is well defined. I suppose that for all $i= 1,\cdots,d$ we have $\text{Var}...

arthur_elbrdn

83

asked Jul 30 at 11:11

3 votes

1 answer

123 views

How to deal with unbalanced data in a within-subjects design using linear mixed effects model?

I conducted an experiment in which n=29 subjects participated. Each subject was measured under 5 different conditions, with 3-5 measurements per subject in conditions 1-4 and a maximum of 2 ...

M. Skillaz

61

asked Jul 29 at 18:14

8 votes

2 answers

308 views

Testing Hypotheses with Limited Data in an Ecological Experiment. How do I approach my data?

For my bachelor's thesis, I’m investigating the effect of voles and mulch on soil infiltration and saturated hydraulic conductivity (Ksat). I want to test the following three hypotheses: Vole ...

Faith

155

asked Jul 27 at 7:57

4 votes

1 answer

300 views

Intercept in design matrix

Consider the design matrix: 1 0 1 0 1 0 1 0 1 1 1 1 1 1 1 1 when fitted to a linear model as y ~ design,...

dariober

6,053

asked Jul 1 at 17:29

2 votes

0 answers

52 views

Is it problematic to use a covariate derived from the dependent variable in linear regression?

I'm performing a simple linear regression with one dependent and one independent variable. Dependent variable (y): nighttime lights raster; independent variable (x): population raster. The issue is ...

Dolby

225

asked Jun 27 at 11:01

0 votes

0 answers

68 views

Selecting number of PCs (principal components) to include in PCR (principal component regression)

How do you decide the number of principal components (PCs) to include in principal component regression (PCR)? I have seen these methods: choosing the lowest RMSEP with the pls() package; choosing PCs ...

Osuke Miyamaru

35

asked Jun 9 at 9:05

0 votes

0 answers

60 views

Singular fit warning for LMM: Removing random effects problematic for model comparisons?

I have 33 plots measured in 2020 and remeasured in 2025, with three response variables. I'm using linear mixed models with "stand" and "age" as random effects. However, for some ...

Conor

61

asked Jun 6 at 11:47

0 votes

0 answers

52 views

Using random effects in a Linear Mixed Model and I think I am doing something wrong

I am performing an analysis on the correlation between the density of predators and the density of prey on plants, with exposure as an additional environmental/explanatory variable. Sampled five ...

Ddiara

1

asked Jun 3 at 3:16

1 vote

0 answers

32 views

When dealing with correlated slopes and intercept, does it make sense to include only certain levels of the random slope variable (by subject)?

I am fitting a mixed effect model where some levels of the categorical variable are correlated with the intercept for the following formula, resulting in a singular fit: ...

MCH

21

asked May 21 at 15:09

0 votes

0 answers

50 views

Linear Mixed Model: Dealing with Predictors Collected Only During the Intervention (once)

We have conducted a study and are currently uncertain about the appropriate statistical analysis. We believe that a linear mixed model with random effects is required. In the pre-test (time = 0), we ...

JJ_muc

1

asked May 17 at 19:28

1 vote

0 answers

71 views

Linear classifier confusing two classes in both directions

I'm training a linear classifier (converging fine), i.e. multi-class logistic regression, on 169 data points using 13 features. It's doing only slightly better than chance, which is expected; it's a hard ...

ludog

61

asked May 2 at 3:07

5 votes

2 answers

277 views

Closed form for two-way ANOVA

Consider $Y_{ij} = \alpha_i + \beta_j + \varepsilon_{ij}$, where $\sum_i \alpha_i = 0$ for identifiability and $\varepsilon_{ij}$ is noise. The data is not balanced. What is the closed form for the ...

james hoffman

53

asked Apr 20 at 6:50

0 votes

0 answers

230 views

What does the Grenander condition imply about the data-generating process of $(y_i, x_i)$?

Consider a correctly specified linear model $$ y_i = x_i^\top \beta + \varepsilon_i,\quad i=1,\dots,n, $$ where the errors $\varepsilon_i$ are independent with zero mean and finite variance. ...

spie227

242

asked Apr 18 at 8:50

9 votes

2 answers

609 views

Good texts on Bayesian approach to ANOVA and beyond, specifically with replacement/comparison with frequentist methods in mind

I am pretty much at wit's end following a year of frequentist instruction on linear methods and models. I tend to "think Bayesian" and find, for whatever reason, that Bayesian methods feel ...

Chris

322

asked Apr 14 at 22:42

1 vote

2 answers

145 views

Linear Mixed Model on correlated deltas for repeated measurements

There are already numerous threads related to Linear Mixed models, but they always deal with the raw dataset. However, I would like to use LMM on the deltas between the raw measurements, as using ...

Belgium_Physics

111

asked Apr 11 at 7:51

6 votes

1 answer

326 views

Why isn't Frisch–Waugh–Lovell theorem (FWL) equivalent to fitting to residuals without orthogonalization?

In the context of regression by iteratively fitting each predictor, why isn't FWL equivalent to fitting each predictor to the residuals of the previous predictor without orthogonalizing the predictors ...

ron burgundy

325

asked Apr 9 at 20:03

0 votes

0 answers

35 views

Obtain centered Regressors from `lm` object in R or via transformation

This is to some degree a software and to some degree a purely stats question. I have a design matrix $X$ with categorical and continuous variables. The first column contains only ones. For a given ...

Quertiopler

324

asked Apr 8 at 9:09

0 votes

0 answers

48 views

Reporting unequal variance among groups in linear model

I have a linear model that predicts root mass as a function of root volume in 2 plant species. Code in R: ...

Jacob Weverka

384

asked Apr 3 at 20:43

4 votes

1 answer

122 views

Consequence of useless regressor: Proving $\operatorname{cov}(\hat{\beta}) \succeq \operatorname{cov}(\tilde{\beta_1})$

$\newcommand{\cov}{\operatorname{cov}}$I am reading the note Linear Model and Extensions by Peng Ding and came across the following problem on page 27 (Problem 4.4). Can someone help me figure out ...

melatonin15

423

asked Mar 26 at 18:47

1 vote

0 answers

69 views

Computation of R squared in weighted linear regression [duplicate]

This question is based on the formulas as presented by the documentation of the fitting software Origin. Particularly this page. I'm working from the conceptualisation of the weights as inverse ...

Sjoerd Smit

173

asked Mar 24 at 11:44

4 votes

2 answers

232 views

Scale at which circular data approach linearity

We have a data set for hue, which is a circular variable. However, the data range only over 10 degrees of the possible 360. Can we use a linear mixed model to analyze the data, or do we have to use ...

user469627

41

asked Mar 20 at 16:06

10 votes

3 answers

392 views

Covariance between $\hat{\beta}_1$ and $\hat{\beta}_0$ for a simple linear model with correlated errors

$\newcommand{\Var}{\operatorname{Var}} \newcommand{\Cov}{\operatorname{Cov}}$I've found this assignment, given to undergrad students in a university in Cyprus, in 2022, where a simple linear model is ...

Graham Crexwood

101

asked Mar 16 at 8:56

1 vote

0 answers

107 views

Sampling events to predict low-frequency events in linear regression

I am working on a project in which I am using two different data sources to predict a country's change in population as a percentage. The frequency at which I receive data from these different ...

Joey

143

asked Mar 15 at 20:54

0 votes

0 answers

81 views

Covariance of observed and fitted values

I am confused about several computations I've seen for the covariance between the response and fitted values in linear regression. For instance, it is a standard step to derive the bias-variance trade-...

Makas

1

asked Mar 15 at 19:28

0 votes

1 answer

57 views

Equivalence of two ways to obtain indirect effect in mediation analysis

In simple mediation analysis based on ordinary linear regression, we have 3 fitted regression models: Y = aX; Y = bX + cZ; Z = dX. Here Y is the outcome, X is the explanatory variable, and Z is a mediator. ...

Mark Nh

149

asked Mar 7 at 9:58

0 votes

1 answer

117 views

Finding Weights for WLS Regression Using OLS

I am using statsmodels to run linear regressions on heteroscedastic data stored in DataFrame df_temp. Currently, I am trying to find the variance of the model by ...

Annie J.

1

asked Mar 4 at 18:07

1 vote

0 answers

57 views

In time-series data with autocorrelation, how should I filter observations?

I'm looking at the simulation accuracy of a model that predicts forest carbon. I'm comparing these simulated values against measurements of forest carbon at specific sites. Each site has had forest ...

frandude

217

asked Feb 26 at 20:05

0 votes

1 answer

89 views

Contextualizing Mediation Analysis Results

Overview: I have no experience with mediation analysis, but I've run into a situation where it may be relevant. Since I lack experience, I'm not sure how much weight I should put into a significant ...

shridhar singh

21

asked Feb 19 at 0:09

0 votes

0 answers

59 views

Can I use Standard error of prediction to execute a t-test for a new observation?

I have a linear model fitted to literature data that relates beak size to beak length. I have a new observation and would like to test if, given its beak length, the beak size is inside the ...

Augusto Nunes

1

asked Feb 14 at 16:05

8 votes

1 answer

299 views

Regression when $y$ has been calculated from $x$

I'm reading this paper: Catalán, N., Marcé, R., Kothawala, D. N., & Tranvik, L. J. (2016). Organic carbon decomposition rates controlled by water retention time across inland waters. Nature ...

JamesS

570

asked Feb 5 at 12:43

1 vote

0 answers

82 views

Problem with Adjusted $R^2$ as Criterion for Variable Selection

I came across a problem while studying linear regression. In the book Plane Answers to Complex Questions (Christensen, 2020), the author mentions that if the $F$ statistic is greater than 1, then ...

stats_newbie

31

asked Jan 31 at 19:00

3 votes

3 answers

196 views

Why do highly correlated features cause the corresponding coefficients to be large and with opposite signs?

I've conducted the following experiment. Suppose we want to build a linear regression with 3 features: $$y = w_1 * x_1 + w_2 * x_2 + w_3 * x_3$$ and we have a dataset with a certain number of samples. ...

Nikita Tkachuk

33

asked Jan 27 at 18:59
