I have only one time series $(y_0, t_0), (y_1, t_1), \ldots, (y_n, t_n)$, with $y_i \in \mathbb{R}$ and $t_0 < \cdots < t_n$. The belief is that these are points on a function $f(t; \mu)$ with $\mu \in \mathbb{R}^d$ the learnable parameters, in the sense that, given $t_i$, the $y_i$ are realizations of $Y_i$ given by:
\begin{equation*} Y_i = f(t_i;\mu) + \epsilon_i \end{equation*}
Here the $\epsilon_i$ are independent, identically distributed error terms, say normally distributed with mean $0$ and some variance. Say we are frequentists: we use a method like least squares to compute an estimate $\hat{\mu}$ and then make inferences.
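For concreteness, here is a minimal sketch of the kind of fit I mean, assuming Python with NumPy/SciPy and a purely illustrative unimodal $f$ (a three-parameter Gaussian-shaped bump standing in for my actual model):

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative unimodal model: a Gaussian-shaped bump with d = 3 parameters (a, m, s).
def f(t, a, m, s):
    return a * np.exp(-(t - m) ** 2 / (2 * s ** 2))

# t_i, y_i would be my observed time series; simulated here only for illustration.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 40)
y = f(t, 2.0, 5.0, 1.5) + rng.normal(0.0, 0.2, size=t.size)

# Least-squares estimate of mu = (a, m, s) and its approximate covariance.
mu_hat, cov = curve_fit(f, t, y, p0=[1.0, 4.0, 1.0])
print(mu_hat)                  # point estimates
print(np.sqrt(np.diag(cov)))   # approximate standard errors
```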
The question is whether there is any rule of thumb, in terms of $d$ and $n$, that works for a large class of nonlinear functions $f$ and lets one say in practice whether there is overfitting or not. For example, in the linear regression case one usually asks for ten observations per learnable parameter, excluding the intercept. It is not that we refuse to do the regression if we only have nine observations per parameter; it is just a general rule of thumb. I am especially interested in the nonlinear regression case for smooth unimodal functions $f$ (i.e. having exactly one peak), since that is what my time series looks like.
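To make the rule concrete with a toy instance: if $f$ were, say, the three-parameter Gaussian bump $f(t; a, m, s) = a\, e^{-(t-m)^2/(2s^2)}$ (just an illustrative unimodal choice, not necessarily my actual $f$), then $d = 3$ and the linear-regression rule of thumb would ask for roughly $n \ge 30$ observations. What I would like to know is whether a ratio like that carries over to the nonlinear, smooth, unimodal setting, or whether a different rule is more appropriate.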