$\begingroup$

Suppose I want to fit a linear model on non-linear rational features: something like a RationalTransformer in place of scikit-learn's SplineTransformer, using a basis of rational functions. The raw features, before transformation, take (theoretically) unbounded non-negative values, such as "time since X happened", "total time spent on the website", or "bid in an auction".

Where would you put the poles? Why?

Note that I'm not aiming to fit a single rational curve. I'm aiming for a component I can use in a pipeline that transforms features before model fitting, like MinMaxScaler or SplineTransformer in scikit-learn.

Update

Comments asked for an explanation of sklearn. An sklearn pipeline consists of a sequence of "transformer" classes that manipulate the data in a (possibly) data-dependent way, followed by an "estimator", which in practice is the model being trained to provide predictions. During the "fit" stage, each component estimates some parameters from the data. For example, a scaling component may estimate the mean and standard deviation. The SplineTransformer class is one such transformer: it estimates the spline's knots from the data, for example as empirical quantiles, or as uniformly spaced knots between the minimum and maximum observed value of each feature. But since it doesn't know which model is going to be fit, it cannot depend on the actual model (Ridge / Linear Regression / Logistic Regression / etc.).
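
To make the intended interface concrete, here is a minimal sketch of what such a component could look like. `RationalTransformer` is hypothetical (it is not part of scikit-learn), and both the basis $x/(x + a_k)$ and the choice of the offsets $a_k$ as empirical quantiles are assumptions made purely for illustration, mirroring how SplineTransformer can place knots at quantiles. It does not settle where the poles should go, which is the actual question.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline


class RationalTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: expands each non-negative feature x into
    the rational basis x / (x + a_k), whose poles sit at -a_k < 0."""

    def __init__(self, n_poles=5):
        self.n_poles = n_poles

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Assumption: the offsets a_k are interior empirical quantiles of each feature.
        probs = np.linspace(0.0, 1.0, self.n_poles + 2)[1:-1]
        offsets = np.quantile(X, probs, axis=0)  # shape (n_poles, n_features)
        # Keep offsets strictly positive so no pole lands in the domain [0, inf).
        self.offsets_ = np.maximum(offsets, np.finfo(float).eps)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # Broadcast to (n_samples, n_poles, n_features), then flatten per sample.
        basis = X[:, None, :] / (X[:, None, :] + self.offsets_[None, :, :])
        return basis.reshape(X.shape[0], -1)


# Toy usage on heavy-tailed, non-negative data (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.pareto(1.5, size=(200, 2))
y = np.log1p(X[:, 0]) + 0.1 * rng.normal(size=200)

model = make_pipeline(RationalTransformer(n_poles=4), Ridge(alpha=1.0))
model.fit(X, y)
features = model[0].transform(X)  # 2 raw features x 4 poles = 8 columns
```

With positive offsets, every basis function stays bounded on $[0, \infty)$, so extreme but legitimate values (such as very high bids) cannot blow up the transformed features. Like SplineTransformer, the `fit` step looks only at the features and never at the downstream model.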

$\endgroup$
  • $\begingroup$ Can you guarantee that your features are strictly positive? Because in that case, an easy to interpret approach would be to compute $\text{log}\frac{x_i}{x_j}, \forall i < j$. (If there are zeros, you might want to add a small $\varepsilon$ to both the numerator and the denominator.) Fitting general rational functions has a lot of degrees of freedom – it would be helpful if you could perhaps tell us something about the domain and the response variable. $\endgroup$ Commented Aug 24 at 7:02
  • $\begingroup$ The point is that I'm not aiming to analyze a specific data-set, but to build something generic that works across datasets. That's why I'm deliberately not saying anything about the data. But I can say that it's heavy tailed. Maybe very heavy tailed. Consider something like bids - there can always be bidders with extremely high bids, and they aren't "outliers", it's not noise. This is real data. Real behavior. $\endgroup$ Commented Aug 24 at 7:10
  • $\begingroup$ I seriously doubt you can derive a single generic formula that is guaranteed to work across all datasets. An approach that I've seen work in the past is to create a large bunch of features with no particular prior assumption as to what they "should" be (it was polynomials/interactions in my case but it can be done with rational functions too), then use correlation-based heuristics, regularization, and/or dimensionality reduction to capture the interesting/predictive ones only. $\endgroup$ Commented Aug 24 at 7:22
  • $\begingroup$ So, 'SplineTransformer' transforms a feature $x$ into multiple features by using different basis spline functions $x_1 = f_1(x)$, $x_2 = f_2(x)$, ..., $x_n = f_n(x)$? What do you mean by a basis of 'rational functions'? E.g. $g_0(x) = 1/(a+x)$, $g_1(x) = x/(a+x)$, $g_2(x) = x^2/(a+x)$, such that different linear combinations form many different rational functions? What do you mean by the question about the poles? How are we supposed to know where you should put them? That depends on the problem and on your reasons for using rational functions. $\endgroup$ Commented Aug 28 at 7:54
  • $\begingroup$ +1 for the question. I do find this interesting, but it is a speciality and I have no time to dig into it. The question "Where would you put the poles? Why?" might sound very clear to some people, but to a general audience it sounds very vague. I have only read the relevant article very briefly, and to me the location of the poles doesn't seem to be much of an issue: the data shows a clear location for them. Probably, if I had this type of problem without knowing about these basis functions, I would just use some non-linear optimisation with a good theoretical background to aid the fitting. $\endgroup$ Commented Sep 2 at 20:34
