Introduction
You don't need interpolation for this problem. It's true that, in general, interpolation is well suited to approximating an unknown, smooth function of a few inputs ($\mathbf{x}=(x_1,\dots,x_n)$, with $n$ small) over a hypercube, starting from the function values at a set of training points $S=\{(\mathbf{x}_i,y_i)\}_{i=1}^N$ in the hypercube. This is especially true when function values at new training points can be generated cheaply, which is your case since your code runs very quickly. Also, interpolation has essentially zero (close to machine precision) approximation error at the points of $S$: this is a disadvantage (overfitting) when approximating noisy data, but it's exactly the right behavior when approximating data known without error, as in your case.
On the other hand, interpolation has some distinct disadvantages:
- the number of parameters you need to store to perform interpolation grows with the size of the training set, so memory requirements grow quickly as training points are added.
- making predictions with some interpolation methods (for example, vanilla Gaussian Processes) is much slower than making predictions with a simple parametric regression model, though probably still far faster than running your computer code.
- the approximation error (the difference between values $y_p$ predicted by the interpolation method and the unknown function values $y$) grows rapidly as soon as you move outside the region of input space "spanned" by the training set (extrapolation). Regression models often have the same issue, but not always; see the toy sketch right after this list, and the extrapolation check at the end.
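As a minimal illustration of the last point, here is a toy 1-D sketch (an invented function, not your data) comparing a spline interpolant with a correctly specified regression when extrapolating below the training range:

# toy 1-D function with a log-like shape (assumption, for illustration only)
f <- function(u) 5 - 1.7 * log(u)
u_train <- seq(0.1, 0.95, by = 0.05)
toy <- data.frame(u = u_train, y = f(u_train))
interp <- splinefun(u_train, f(u_train))   # cubic spline interpolant
reg <- lm(y ~ log(u), data = toy)          # parametric regression
u_test <- c(0.01, 0.05)                    # well below the training range
abs(interp(u_test) - f(u_test))            # spline extrapolation error: grows quickly
abs(predict(reg, data.frame(u = u_test)) - f(u_test))  # ~0, since the model form is exact here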
In your case it's evident that, for each fixed $u$, $\Delta T$ is very well approximated by a linear function of $T_{a,in}$, while for each fixed $T_{a,in}$, $\Delta T$ is very well approximated by a logarithmic function of $u$ (plus an intercept). It's important to verify that the model makes physical sense: based on your physical knowledge of the problem, is it plausible that for fixed $u$, $\Delta T$ is linear in $T_{a,in}$, and for fixed $T_{a,in}$, logarithmic in $u$? If so, we can try to fit the following model:
$$\Delta T \approx \beta_0+\beta_1 T_{a,in} + \beta_2 \log(u) + \beta_3 T_{a,in}\log(u) $$
and check whether its accuracy remains good on extrapolated data points not used for training. Note that this model has only 4 parameters.
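The interaction term is what makes the two marginal behaviors consistent with each other: for fixed $u$, the model is linear in $T_{a,in}$ with slope $\beta_1+\beta_3\log(u)$, while for fixed $T_{a,in}$, it is affine in $\log(u)$ with slope $\beta_2+\beta_3 T_{a,in}$.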
Model Fitting
I will use R to show the approach:
# R_a and R_u are the corresponding vectors in your code
R_a <- seq(15, 20.5, by = 0.5)
R_u <- seq(0.1, 0.95, by = 0.05)
# DeltaT_matrix is the output matrix which you called Ta in your Python code
DeltaT_matrix <- structure(c(9.16474143429817, 8.94118290271588, 8.71731921196482,
8.49315204987976, 8.26868309204161, 8.04391400188728, 7.81884643081838,
7.59348201830876, 7.36782239201091, 7.14186916786124, 6.9156239501842,
6.68908833179539, 8.77268383665777, 8.55914642773912, 8.3452924610834,
8.13112363233619, 7.91664162521941, 7.70184811163494, 7.48674475176713,
7.27133319418426, 7.05561507593893, 6.83959202266747, 6.62326564868829,
6.40663755709921, 8.48960063743449, 8.28327754024661, 8.07663085966664,
7.86966228789457, 7.66237350551092, 7.45476618157531, 7.24684197372405,
7.03860252826679, 6.8300494802822, 6.62118445371259, 6.4120090614577,
6.20252490546751, 8.26779964588656, 8.06711631671647, 7.86610461555747,
7.66476622681244, 7.46310282353722, 7.26111606753518, 7.05880760945069,
6.85617908886182, 6.65323213437205, 6.44996836370108, 6.24638938377474,
6.04249679081404, 8.08543143687201, 7.88937668440084, 7.69299009870359,
7.49627335453585, 7.29922811555063, 7.10185603438936, 6.90415875277201,
6.70613790158635, 6.50779510097636, 6.30913196042974, 6.11015007886457,
5.91085104471518, 7.93061256207576, 7.73848107017399, 7.54601514630728,
7.3532164548195, 7.16008664917265, 6.96662737203485, 6.7728402553675,
6.57872692051164, 6.38428897827341, 6.18952802900874, 5.99444566270719,
5.79904345907504, 7.79614628340926, 7.60741771126701, 7.41835270460851,
7.2289529171241, 7.03921999182307, 6.84915556111902, 6.65876124691438,
6.46803866068416, 6.27698940355893, 6.08561506640695, 5.89391722991556,
5.70189746467181, 7.67733775913916, 7.49161228525216, 7.30554880534835,
7.11914896248988, 6.93241438924182, 6.74534670775515, 6.55794752984902,
6.37021845709215, 6.18216108088354, 5.99377698253235, 5.80506773333712,
5.61603489466418, 7.5709520023041, 7.3879127571883, 7.20453425705099,
7.02081813449161, 6.83676601178218, 6.65237950094813, 6.46766020384846,
6.28260971225508, 6.09722960793146, 5.91152146271053, 5.72548683857184,
5.53912728771803, 7.47466418569886, 7.29405385400307, 7.11310326613761,
6.9318140444779, 6.75018780122923, 6.5682261385056, 6.38593064840804,
6.20330291310201, 6.02034450489423, 5.83705698630867, 5.65344191016203,
5.46950081963829, 7.3867468512066, 7.20835232899164, 7.02961674413013,
6.85054170904833, 6.67112882614881, 6.49137968788769, 6.31129587685102,
6.13087896583062, 5.95013051789909, 5.76905208648422, 5.58764521544269,
5.40591143913309, 7.30588106490289, 7.12952297600297, 6.95282317363471,
6.77578326056511, 6.59840482967455, 6.4206894640324, 6.24263873697182,
6.06425421216397, 5.88553744369151, 5.70648997612139, 5.52711334457709,
5.34740907481004, 7.23103683577564, 7.05656212263367, 6.88174517153485,
6.70658757587974, 6.53109091931105, 6.35525677578768, 6.17908670965812,
6.00258227573309, 5.82574501935767, 5.64857647648261, 5.47107817373513,
5.29325162848903, 7.16139436134207, 6.98867089917336, 6.81560477840227,
6.64219758335165, 6.46845088870781, 6.29436625959328, 6.1199452516386,
5.94518941105382, 5.77010027469909, 5.59467937015477, 5.41892821579085,
5.24284832083573, 7.09629042461093, 6.92520301178304, 6.75377260599608,
6.58200078277584, 6.40988910812628, 6.23743913860088, 6.06465242137331,
5.89153049430752, 5.71807488602711, 5.54428711598418, 5.37016869452754,
5.19572112297026, 7.03518087250459, 6.86562818364286, 6.69573223984451,
6.52549460810856, 6.35491684602097, 6.18400050182467, 6.01274711448908,
5.84115821377896, 5.66923532032263, 5.49697994567962, 5.32439359240773,
5.15147775412946, 6.97761370854148, 6.80950593760835, 6.64105471097168,
6.47226158736275, 6.30312811620306, 6.13365583767325, 5.96384628278145,
5.79370097343104, 5.62322142248771, 5.45240913384602, 5.28126560249541,
5.1097923145855, 6.92320938647168, 6.75646639488318, 6.58937979896262,
6.42195114942078, 6.25418198775705, 6.08607384632729, 5.91762824841104,
5.74884670827825, 5.57973073125531, 5.41028181379056, 5.24050144351922,
5.07039109932776), .Dim = c(12L, 18L), .Dimnames = list(NULL,
NULL))
We now reshape the data to streamline the estimation of the regression model:
# flatten the response (column-major) and build matching predictor columns
DeltaT <- as.vector(DeltaT_matrix)
Ta_in <- rep(R_a, times = length(R_u))  # rows (T_a,in values) vary fastest
u <- rep(R_u, each = length(R_a))       # columns (u values) vary slowest
# assemble data frame for modeling
df <- data.frame(Ta_in, u, DeltaT)
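As an optional sanity check, one can verify that the flattening lines up with the matrix layout (as.vector flattens column-major, so the first column of DeltaT_matrix should match the first length(R_a) rows of df):

# sanity checks on the reshaping
stopifnot(nrow(df) == length(R_a) * length(R_u))
stopifnot(all.equal(df$DeltaT[seq_along(R_a)], DeltaT_matrix[, 1]))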
Finally, we fit the linear model:
# fit the linear model with interaction
my_model <- lm(DeltaT ~ Ta_in * log(u), data = df)
summary(my_model)
#
# Call:
# lm(formula = DeltaT ~ Ta_in * log(u), data = df)
#
# Residuals:
# Min 1Q Median 3Q Max
# -0.0165326 -0.0020604 0.0003775 0.0032458 0.0070474
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 11.8946112 0.0051852 2294.0 <2e-16 ***
# Ta_in -0.3343834 0.0002908 -1150.1 <2e-16 ***
# log(u) -1.7572983 0.0050434 -348.4 <2e-16 ***
# Ta_in:log(u) 0.0504915 0.0002828 178.5 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.004541 on 212 degrees of freedom
# Multiple R-squared: 1, Adjusted R-squared: 1
# F-statistic: 2.513e+06 on 3 and 212 DF, p-value: < 2.2e-16
my_summary <- summary(my_model)
my_summary$adj.r.squared
# [1] 0.9999715
The adjusted $R^2$ is very high, which is meaningful here given that we have 216 training points and just 4 parameters. To get another measure of approximation accuracy, let's compute the root mean squared relative error (RMSRE); this is well-defined because DeltaT never gets close to 0.
RMSRE <- function(residuals, response){
  # root mean squared relative error: sqrt(mean((residual / true value)^2))
  sqrt(mean((residuals / response)^2))
}
RMSRE(my_summary$residuals, df$DeltaT)
# [1] 0.0006351842
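Because the model has a closed form, you can also extract the four fitted coefficients and evaluate the predictor directly, which is trivial to port back to your Python code. A minimal sketch (predict_DeltaT is just an illustrative name):

# build a standalone prediction function from the 4 fitted coefficients
b <- coef(my_model)  # order: (Intercept), Ta_in, log(u), Ta_in:log(u)
predict_DeltaT <- function(Ta_in, u) {
  b[1] + b[2] * Ta_in + b[3] * log(u) + b[4] * Ta_in * log(u)
}
predict_DeltaT(17, 0.5)  # example evaluation at an in-range point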
Model check on extrapolated data
Let's see how the model performs in extrapolation. I re-ran your Python code, changing only these lines:
R_a = np.arange(1, 15, 1)
R_u = np.arange(0.01, 0.1, 0.01)
and obtained a brand-new Ta array from your code. Repeating steps analogous to those above, I store the extrapolated points in the data frame df_extrapolated:
DeltaT <- as.vector(DeltaT_matrix_extrapolated)
R_a <- seq(1, 14)
R_u <- seq(0.01, 0.09, by = 0.01)
Ta_in <- rep(R_a, times = length(R_u))
u <- rep(R_u, each = length(R_a))
df_extrapolated <- data.frame(Ta_in, u, DeltaT)
To get predictions at the new points, we simply pass the new data frame to the generic function predict (since df_extrapolated contains the columns Ta_in and u, no model-matrix manipulation is needed):
my_predictions <- predict(my_model, newdata = df_extrapolated)
extrapolation_residuals <- df_extrapolated$DeltaT - my_predictions
RMSRE(extrapolation_residuals, df_extrapolated$DeltaT)
# [1] 0.0160831
The root-mean-square relative error is still quite small ($\sim 1.6\%$), even though we tested the model at values of $T_{a,in}$ and $u$ considerably lower than those used to fit it. Remember that $\log(u)\to-\infty$ as $u\to 0^+$, so the fact that the approximation remains good even as we approach the $u=0$ line is quite remarkable.
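If you want to see where the extrapolation error concentrates (one would expect the smallest $u$ values to be the hardest, given the $\log(u)$ term), a quick way is:

# locate the point with the largest relative extrapolation error
rel_err <- abs(extrapolation_residuals / df_extrapolated$DeltaT)
df_extrapolated[which.max(rel_err), ]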