$\begingroup$

I have decided to use the likelihood ratio test to evaluate whether all the covariates in my model are strictly necessary, as explained on page 388 and later illustrated in Example 14.4 of Statistical methods for survival data analysis, by Lee and Wenyu Wang (3rd ed.).

Unfortunately for me, my model is trained in R using a modified log-likelihood function that also contains a regularisation term of the form

$$ C_1\sum_ j \|\vec{b}_j\|^2 . $$

The index $j$ runs over the measurement times (see the details below).

Here, an extra covariate translates to a non-zero component in any of the $\vec{b}_j$, and I want to find out how many covariates I can add, based on this likelihood ratio test.

Due to the setup of the problem, I would like to know:

1.- Does a likelihood ratio test still make sense when the likelihood function has been modified?

2.- In case it does, is it clear that

$$W = -2(\ell(\vec{b}_j) - \ell(\vec{b}_j'))$$

follows a chi-square distribution with one degree of freedom (as in the example referred to), or should I carry out my own analysis to determine the number of degrees of freedom (I do hope it will still follow a chi-square distribution)?
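For reference, the textbook version of the test can be sketched in a few lines of Python. This is only a minimal sketch: the function name and the log-likelihood values are hypothetical, and it assumes that the one-degree-of-freedom $\chi^2$ reference distribution really applies, which is exactly what is in doubt here.

```python
import math

def lrt_chi2_1df(loglik_reduced, loglik_full):
    """Return (W, p) with W = -2 * (loglik_reduced - loglik_full).

    The p-value uses the chi-square survival function with 1 degree of
    freedom, P(chi2_1 > w) = erfc(sqrt(w / 2)).
    """
    w = -2.0 * (loglik_reduced - loglik_full)
    w = max(w, 0.0)  # guard against tiny negative values from optimiser noise
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p

# Hypothetical maximised log-likelihood values, for illustration only.
w, p = lrt_chi2_1df(loglik_reduced=-105.3, loglik_full=-102.1)
```

With a regularised objective, the two inputs would be the maximised penalised log-likelihoods, and the resulting p-value is only as trustworthy as the $\chi^2_1$ assumption.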

Further details

Formally, I am dealing with a non-homogeneous finite Markov chain, where each $X_j$ corresponds to a “measurement” and the state space is $S=\{0,1\}$ (alive or dead). The model computes the probability of observing a particular sequence $x=(x_1,\dots,x_N)$, where each $x_j$ is $0$ or $1$ and, of course, $P(X_{j+1}=0 \mid X_j = 1) = 0$. The regularisation term above is apparently introduced to avoid overfitting and to make sure that the parameters $\vec{b}_j$ vary smoothly from time $j$ to $j+1$.
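To make the setup concrete, here is a rough Python sketch of a penalised log-likelihood for one observed sequence. It is not the actual model: it assumes a logistic form for the probability of dying at step $j$, a single covariate vector `z`, and that the subject is alive before the first measurement. The names `penalised_loglik`, `z` and `c1` are made up for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def penalised_loglik(b, z, x, c1):
    """Penalised log-likelihood of one 0/1 sequence x (0 = alive, 1 = dead).

    b  : list of N coefficient vectors, one per measurement time j
    z  : covariate vector of the subject (assumed constant over time)
    c1 : regularisation constant C_1
    """
    loglik = 0.0
    alive = True  # assumption: alive before the first measurement
    for b_j, x_j in zip(b, x):
        if not alive:
            if x_j == 0:
                return -math.inf  # P(X_{j+1}=0 | X_j=1) = 0
            continue  # dead stays dead with probability 1
        # Assumed logistic probability of dying at step j.
        h_j = sigmoid(sum(bk * zk for bk, zk in zip(b_j, z)))
        if x_j == 1:
            loglik += math.log(h_j)
            alive = False
        else:
            loglik += math.log(1.0 - h_j)
    # Penalty C_1 * sum_j ||b_j||^2, subtracted from the log-likelihood.
    penalty = c1 * sum(bk * bk for b_j in b for bk in b_j)
    return loglik - penalty
```

Dropping a covariate then corresponds to fixing the matching component of every `b_j` at zero and maximising this objective again.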

The final probability is a sort of logistic regression, but it is actually easier to understand from the Markov random fields perspective.

EDIT:

As it seems there was some confusion, please let me clarify. The model I use has a modified likelihood function that includes a regularisation term. Right now, changing the likelihood function is not a plausible option. The question should be understood as: Given this modified likelihood function, what...?

Looking for a derivation of the likelihood ratio test, I have found the book Likelihood Methods in Statistics, by Thomas Severini, which presents the Taylor series expansion of $W$ as well as the assumptions needed for it to follow a $\chi^2$ distribution with a certain number of degrees of freedom.

Now I realise that question 1 is trivial: of course a likelihood ratio test makes sense. The question should be:

What distribution does $W$ follow?

$\endgroup$

1 Answer

$\begingroup$
  1. A modified likelihood function is just a different likelihood function. Whether the regularization helps model your data better is a different question.
  2. According to Wilks' theorem, the log-likelihood ratio statistic is only guaranteed to be chi-squared distributed asymptotically (for large sample sizes) and under certain "regularity" conditions, such as the observations being i.i.d. Given that you are using a non-homogeneous Markov chain, I would guess that these conditions do not hold. See this answer for a good summary of what these conditions are. However, the LLR test would still be valid even if the statistic is not $\chi^2$-distributed; you just wouldn't be able to use $\chi^2$ as your sampling distribution, for example to compute p-values.
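If the $\chi^2$ reference distribution cannot be trusted, one common alternative is to approximate the null distribution of the statistic by simulation (a parametric bootstrap). The sketch below is only a toy illustration in Python: a Bernoulli model with a fixed null value `p0` stands in for the real model, and all names and numbers are hypothetical. In the actual problem, `simulate_w` would have to draw sequences from the fitted reduced model and refit both penalised models on each simulated data set.

```python
import math
import random

def bernoulli_loglik(p, k, n):
    """Log-likelihood of k successes in n Bernoulli(p) trials."""
    if p <= 0.0 or p >= 1.0:
        # Degenerate p: only all-zero or all-one data has positive probability.
        ok = (k == 0 and p <= 0.0) or (k == n and p >= 1.0)
        return 0.0 if ok else -math.inf
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def lr_statistic(k, n, p0):
    """W = -2 * (loglik under p0 - loglik at the MLE k/n)."""
    w = -2.0 * (bernoulli_loglik(p0, k, n) - bernoulli_loglik(k / n, k, n))
    return max(0.0, w)

def bootstrap_p_value(w_obs, simulate_w, n_boot, rng):
    """Share of null-simulated statistics >= w_obs, with a +1 correction."""
    exceed = sum(simulate_w(rng) >= w_obs for _ in range(n_boot))
    return (exceed + 1) / (n_boot + 1)

n, p0 = 40, 0.5
k_obs = 27  # hypothetical observed count
w_obs = lr_statistic(k_obs, n, p0)

def simulate_w(rng):
    # Simulate one data set under the null and recompute the statistic.
    k = sum(rng.random() < p0 for _ in range(n))
    return lr_statistic(k, n, p0)

p_boot = bootstrap_p_value(w_obs, simulate_w, n_boot=2000, rng=random.Random(0))
```

The simulated p-value makes no appeal to Wilks' theorem, at the cost of refitting the model `n_boot` times.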
$\endgroup$
  • $\begingroup$ Sorry for my late reply, I missed your answer. I will modify my question accordingly now, but two remarks: 1.- The model comes with this modified likelihood function, and changing it is not an option, regardless of whether it helps with the data or not. I'll make sure I state that clearly. 2.- Precisely because it's a modified likelihood function, the difference might not follow a $\chi^2$-distribution any more. But I've found this book by Thomas Severini, Likelihood Methods in Statistics, that explains why the ratio follows that distribution and under what hypotheses. $\endgroup$ Commented Mar 3 at 21:21
  • $\begingroup$ If you want, you can add more to your answer along these lines, and talk about the assumptions and why they are likely to be violated in the case of a non-homogeneous Markov chain. One of the hypotheses is that all the estimators $\hat{b}_i$ follow a normal distribution. You can also comment on this. $\endgroup$ Commented Mar 3 at 21:24
