Let me set the stage.
- We are dealing with two variables, $A$ and $B$.
- We can easily obtain $A(x)$ for a specific data point $x$.
- $B(x)$, on the other hand, is very difficult to know.
- We know Pearson's correlation coefficient for the pair $(A,B)$ (referred to as $r(A,B)$ from now on).
Given only $A(x)$ and $r(A,B)$, can we reliably obtain an interval or an approximation for $B(x)$?
Gathering data points for $B$ is very difficult, let alone data points that include both $A$ and $B$. $r(A,B)$ was obtained through extensive research, which I unfortunately cannot replicate. Fitting a linear regression to predict $B$ from $A$ is thus not an option.
I figured that since Pearson's correlation coefficient measures the strength of the linear relationship between two variables, there might be a way to interpret it as an inequality: given $A(x)$ and $r(A,B)$, $B(x)$ must lie within some range.
Statistics is far from my area of expertise. I tried going back to the mathematical definition of the coefficient, thinking that I could work from its variance and standard deviation terms to an inequality of some kind, but this didn't lead me anywhere.
I also thought that a geometric approach might help, in a similar vein to this, where the Cauchy-Schwarz inequality and the link between correlation and cosine are used to deduce an interval for $\operatorname{Corr}(A,C)$ knowing $\operatorname{Corr}(A,B)$ and $\operatorname{Corr}(B,C)$, but I am out of my depth on this one as well.
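For reference, the bound from that geometric argument can be sketched in a few lines. As far as I understand it, treating correlations as cosines of angles between centred variables gives $\operatorname{Corr}(A,C) \in \left[r_{AB}r_{BC} - \sqrt{(1-r_{AB}^2)(1-r_{BC}^2)},\; r_{AB}r_{BC} + \sqrt{(1-r_{AB}^2)(1-r_{BC}^2)}\right]$. The function name `corr_bounds` and the input values are made up for illustration only.

```python
import math
from typing import Tuple


def corr_bounds(r_ab: float, r_bc: float) -> Tuple[float, float]:
    """Range of Corr(A, C) compatible with given Corr(A, B) and Corr(B, C).

    Follows from viewing correlations as cosines of angles between
    centred variables (hypothetical helper, not from any library).
    """
    for r in (r_ab, r_bc):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    centre = r_ab * r_bc
    half_width = math.sqrt((1.0 - r_ab ** 2) * (1.0 - r_bc ** 2))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return max(-1.0, centre - half_width), min(1.0, centre + half_width)


# Illustrative values only; they are not taken from my data.
lo, hi = corr_bounds(0.9, 0.8)
print(f"{lo:.3f} <= Corr(A, C) <= {hi:.3f}")
```

Note that this bounds a correlation, not an individual value such as $B(x)$, which is why I am not sure it transfers to my problem.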
Any help would be greatly appreciated. Thank you for reading this far! :)
EDIT: Changed formatting. Here is some additional information:
- Obtaining a small number of observations $(A(x),B(x))$ is manageable.
- The standard deviation of $B$ is not known.
- Both $A$ and $B$ range from 0 to 100.
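To make the question concrete, here is a rough sketch of the kind of interval I have in mind. It rests on assumptions I cannot verify: that the relationship is roughly linear with roughly normal scatter, and that the means and standard deviations of $A$ and $B$ are available. Since the standard deviation of $B$ is not known, `mean_b` and `sd_b` would have to be estimated from the small sample, which adds uncertainty the sketch ignores. Under those assumptions the centre is $\mu_B + r\frac{\sigma_B}{\sigma_A}(A(x)-\mu_A)$ and the residual standard deviation is $\sigma_B\sqrt{1-r^2}$. The function name `b_interval` and every number below are hypothetical.

```python
import math


def b_interval(a, r, mean_a, sd_a, mean_b, sd_b, z=1.96, lower=0.0, upper=100.0):
    """Approximate interval for B(x) given A(x) = a and the correlation r.

    Assumes a linear relationship with normal scatter; mean_b and sd_b
    are placeholders that would have to be estimated from a small sample.
    Returns (centre, low, high), each clipped to [lower, upper].
    """
    if sd_a <= 0 or sd_b < 0 or not -1.0 <= r <= 1.0:
        raise ValueError("need sd_a > 0, sd_b >= 0 and r in [-1, 1]")
    # Regression of B on A expressed through r and the two standard deviations.
    centre = mean_b + r * (sd_b / sd_a) * (a - mean_a)
    # Residual spread shrinks as |r| grows; z = 1.96 gives roughly 95% coverage.
    half_width = z * sd_b * math.sqrt(1.0 - r ** 2)

    def clip(v):
        return min(upper, max(lower, v))

    return clip(centre), clip(centre - half_width), clip(centre + half_width)


# Hypothetical inputs, for illustration only.
centre, low, high = b_interval(a=70, r=0.8, mean_a=50, sd_a=10, mean_b=40, sd_b=20)
print(f"B(x) is roughly {centre:.1f}, plausible range [{low:.1f}, {high:.1f}]")
```

The interval width scales with `sd_b`, so without some handle on the spread of $B$ the correlation alone does not seem to pin down a range, apart from the trivial 0 to 100.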