$\begingroup$

Let me set the stage.

  • We are dealing with two variables: $A$ and $B$.
  • We can easily obtain $A(x)$ for a specific data point $x$.
  • $B(x)$, on the other hand, is very difficult to know.
  • We know Pearson's correlation coefficient for the pair $(A,B)$ (referred to as $r(A,B)$ from now on).

Given only $A(x)$ and $r(A,B)$, can we reliably obtain an interval or an approximation for $B(x)$?

Gathering data points for $B$ is very difficult, let alone data points that include both $A$ and $B$. $r(A,B)$ was obtained through extensive research that I unfortunately cannot replicate, so fitting a linear regression to predict $B$ from $A$ is not an option.

I figured that since Pearson's correlation coefficient measures how strong the linear relationship between two variables is, there might be a way to interpret it as an inequality: given $A(x)$ and $r(A,B)$, $B(x)$ must lie in some range.

Statistics is far from my domain of expertise. I tried to go back to the mathematical definition of the coefficient, thinking that I could work from the variance to an inequality of some kind via the standard deviation, but this didn't lead anywhere.

I also thought that maybe a geometrical approach could help, in a similar vein to this, where we use the Cauchy-Schwarz inequality and the link between correlation and cosine to deduce an interval for $Corr(A,C)$ knowing $Corr(A,B)$ and $Corr(B,C)$, but I am also out of my depth on this one.
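
For reference, the bound alluded to here is usually written as follows. This is the standard result, stated from memory and not derived in this post: treating centered variables as vectors, each correlation is the cosine of the angle between two of them, and the angle between $A$ and $C$ is constrained by the other two angles.

$$
r_{AB}\,r_{BC} - \sqrt{(1 - r_{AB}^2)(1 - r_{BC}^2)} \;\le\; r_{AC} \;\le\; r_{AB}\,r_{BC} + \sqrt{(1 - r_{AB}^2)(1 - r_{BC}^2)}
$$

Note that this constrains a third correlation, not an individual value such as $B(x)$, which is why it does not directly answer the question.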

Any help would be greatly appreciated. Thank you for reading this far! :)

EDIT: Changed formatting. Here is some additional information:

  • Obtaining a small number of observations $(A(x),B(x))$ is manageable.
  • The standard deviation of $B$ is not known.
  • Both $A$ and $B$ range from 0 to 100.
$\endgroup$
  • $\begingroup$ Your restrictions rule out being able to do anything, because you supply no information that could be used to estimate an unknown additive constant for $B.$ This problem would be overcome by having even one paired observation $(A(x),B(x)).$ $\endgroup$ Commented Apr 19, 2021 at 17:04
  • $\begingroup$ That sounds reasonable enough. What would this enable? $\endgroup$ Commented Apr 19, 2021 at 17:06
  • $\begingroup$ It would permit you to do regression, which is what you need. I'm curious how you know $r(A,B),$ though: is this estimated from other data? Predicted by theory? Established in some indirect way? BTW, you also need some estimate of the variance of $B,$ but that can be accomplished by having (at least) two paired observations. $\endgroup$ Commented Apr 19, 2021 at 17:08
  • $\begingroup$ Do you know the standard deviation of $B$? $\endgroup$ Commented Apr 19, 2021 at 17:11
  • $\begingroup$ The correlation coefficient was obtained by a third party which had the means to conduct a large-scale experiment. Only the result of said experiment is available to me. That being said, having a few observations is possible, but not many (I would say 5 maximum?). However, if I understand your suggestion correctly, performing a linear regression with only a single observation would be fairly imprecise. Would knowing the correlation coefficient help at all? $\endgroup$ Commented Apr 19, 2021 at 17:14
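
The approach suggested in the comments could be sketched roughly as below. This is a minimal illustration, not a vetted procedure: the function name `predict_b` and its parameters are hypothetical, the means and standard deviations are estimated from the handful of paired observations, and the externally supplied $r(A,B)$ replaces the sample correlation when computing the slope. The interval assumes approximately normal residuals and ignores the (large) uncertainty in estimates based on about five points, so it is likely too narrow in practice.

```python
import math
import statistics


def predict_b(a_new, pairs, r, z=1.96):
    """Rough point prediction and interval for B at A = a_new.

    pairs: a few observed (A, B) tuples; at least two are required,
           and the A values must not all be equal.
    r:     Pearson correlation of (A, B), taken from an external source.
    z:     normal quantile for the interval (1.96 is about 95%).
    """
    a_vals = [a for a, _ in pairs]
    b_vals = [b for _, b in pairs]

    mean_a = statistics.mean(a_vals)
    mean_b = statistics.mean(b_vals)
    sd_a = statistics.stdev(a_vals)  # sample standard deviations
    sd_b = statistics.stdev(b_vals)

    # Regression slope expressed through the correlation: r * s_B / s_A.
    slope = r * sd_b / sd_a
    b_hat = mean_b + slope * (a_new - mean_a)

    # Residual spread around the line: s_B * sqrt(1 - r^2).
    resid_sd = sd_b * math.sqrt(1 - r ** 2)
    return b_hat, (b_hat - z * resid_sd, b_hat + z * resid_sd)


# Made-up observations, purely for illustration.
observations = [(10, 20), (30, 40), (50, 60)]
estimate, interval = predict_b(40, observations, r=0.8)
```

Since both variables are stated to range from 0 to 100, the resulting interval could additionally be clipped to that range. With $r$ close to 0 the interval is about as wide as the spread of $B$ itself, which makes precise the point that a weak correlation says little about an individual $B(x)$.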
