
I have only one time series $(y_0, t_0), (y_1,t_1), \ldots, (y_n, t_n)$, with $y_i \in \mathbb{R}$ and $t_0 < \cdots < t_n$. The belief is that these are points on a function $f(t; \mu)$, with $\mu \in \mathbb{R}^d$ the learnable parameters, in the sense that, given $t_i$, each $y_i$ is a realization of $Y_i$ given by:

\begin{equation*} Y_i = f(t_i;\mu) + \epsilon_i \end{equation*}

Here the $\epsilon_i$ are independent and identically distributed error terms, say normally distributed with mean $0$ and some variance. Say we are frequentists: we use a method like least squares to estimate $\hat{\mu}$ and then make inferences.
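As a concrete sketch of the setup (all specifics here are my own illustration, not part of the question): take a unimodal Gaussian-bump $f$ with $d = 3$ parameters, simulate $Y_i = f(t_i;\mu) + \epsilon_i$, and fit $\hat{\mu}$ by nonlinear least squares via `scipy.optimize.curve_fit`.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical unimodal model f(t; mu), mu = (a, m, s): a Gaussian bump (d = 3).
def f(t, a, m, s):
    return a * np.exp(-((t - m) ** 2) / (2 * s ** 2))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)          # t_0 < ... < t_n, here n + 1 = 50
mu_true = (2.0, 5.0, 1.5)               # unknown in practice; used only to simulate
y = f(t, *mu_true) + rng.normal(0.0, 0.1, t.size)   # Y_i = f(t_i; mu) + eps_i

# Nonlinear least-squares estimate of mu; p0 is a rough starting guess.
mu_hat, cov = curve_fit(f, t, y, p0=[1.0, 4.0, 1.0])
```

Here `cov` is the estimated covariance of $\hat{\mu}$, which is what the subsequent frequentist inference would be based on.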

The question is whether there is any rule of thumb relating $d$ and $n$ that works for a large class of nonlinear functions $f$ and lets one say in practice whether there is overfitting or not. For example, in the linear regression case one usually asks for ten observations per learnable parameter, excluding the intercept. It is not that we refuse to do the regression if we have only nine observations per parameter; it is just a general rule of thumb. I am especially interested in the nonlinear regression case for smooth unimodal functions $f$ (i.e. with exactly one peak), since that is what my time series looks like.

