I’m looking for standard methods (i.e., methods accepted and used by the community) for fitting one probability distribution to another, where each may be given as either a probability density function (PDF) or a cumulative distribution function (CDF). Specifically, I want to fit to the distribution itself, not to data sampled from it.
The situation is:
- I have a distribution $g(x)$, which I can calculate by some method at any value of $x$
- I have a second distribution $f(x;\boldsymbol{\alpha})$, where $\boldsymbol{\alpha}$ is a set of parameters defining the distribution’s shape
And I want to determine the values of $\boldsymbol{\alpha}$ that make $f(x;\boldsymbol{\alpha})$ best match $g(x)$.
I’ve looked around but not found anything particularly helpful online, which (given that this seems like a reasonable thing to want to do) might suggest I’m overthinking this. My ideas for potential approaches are:
- Simply fitting the functional form of $f(x;\boldsymbol{\alpha})$ to $g(x)$ using a standard method (e.g., least squares)
- Comparing the lowest $n$ moments of the two distributions (with $n$ set by the number of parameters in $\boldsymbol{\alpha}$)
- Minimising the Kullback–Leibler divergence (or some other measure of distance) between the two distributions
- Using the known form of $g(x)$ to build an idealised synthetic dataset (i.e., a very large number of draws with “perfectly” distributed counts) and fitting $f(x;\boldsymbol{\alpha})$ to that dataset with a standard method.
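To make the first three ideas concrete, here is a minimal sketch of what I have in mind. Everything specific in it is an assumption for illustration only: a gamma PDF stands in for the expensive $g(x)$, a log-normal with $\boldsymbol{\alpha} = (\mu, \log\sigma)$ stands in for $f(x;\boldsymbol{\alpha})$, and the grid limits and the names `g`, `f`, `alpha_mom`, `alpha_lsq` and `alpha_kl` are placeholders.

```python
import numpy as np
from scipy import integrate, optimize, special, stats

# Hypothetical stand-in for the expensive target density g(x): a gamma PDF.
def g(x):
    return stats.gamma.pdf(x, a=5.0)

# Hypothetical family f(x; alpha): log-normal with alpha = (mu, log_sigma).
# Optimising log(sigma) keeps sigma positive without explicit bounds.
def f(x, alpha):
    mu, log_sigma = alpha
    return stats.lognorm.pdf(x, s=np.exp(log_sigma), scale=np.exp(mu))

# Evaluate g once on a grid wide enough to cover essentially all of its mass.
x = np.linspace(1e-3, 40.0, 4001)
gx = g(x)

# Moment matching: equate mean and variance (closed form for a log-normal).
mean = integrate.trapezoid(x * gx, x)
var = integrate.trapezoid((x - mean) ** 2 * gx, x)
sigma2 = np.log1p(var / mean**2)
alpha_mom = np.array([np.log(mean) - 0.5 * sigma2, 0.5 * np.log(sigma2)])

# Least squares on the PDF values, started from the moment match.
alpha_lsq = optimize.least_squares(lambda a: f(x, a) - gx, alpha_mom).x

# KL divergence: minimise KL(g || f) = integral of g * log(g / f).
def kl_divergence(alpha):
    fx = np.maximum(f(x, alpha), 1e-300)  # guard against log(0) in the tails
    return integrate.trapezoid(special.rel_entr(gx, fx), x)

alpha_kl = optimize.minimize(kl_divergence, alpha_mom, method="Nelder-Mead").x

for name, a in [("moments", alpha_mom), ("least squares", alpha_lsq), ("KL", alpha_kl)]:
    print(f"{name:>13}: mu = {a[0]:.4f}, sigma = {np.exp(a[1]):.4f}")
```

As far as I understand, minimising $D_{\mathrm{KL}}(g\,\|\,f)$ over $\boldsymbol{\alpha}$ is the same as maximising $\int g(x)\log f(x;\boldsymbol{\alpha})\,\mathrm{d}x$, which is the infinite-sample limit of a maximum-likelihood fit, so the last two ideas in the list should coincide if the synthetic dataset is fitted by maximum likelihood.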
I expect that all of these options would give very similar results if $f(x;\boldsymbol{\alpha})$ can match $g(x)$ closely, but they may diverge noticeably when the agreement is poorer.
For context, if it’s useful, the reason I want to do this is that $g(x)$ is expensive to calculate, and my code needs to evaluate it many times (in different situations where it has different shapes). I’ve also noticed that it always looks very similar to a standard distribution. So, if I can do some work in advance and tabulate the parameters $\boldsymbol{\alpha}$ that approximate it well in each situation, that would speed up my code significantly.
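The tabulation step I have in mind would look roughly like the sketch below. It is only an illustration under assumptions: the "situation" is reduced to a single hypothetical parameter `c`, a gamma PDF with shape `c` stands in for the expensive $g(x)$, a log-normal stands in for $f(x;\boldsymbol{\alpha})$, the fit is done by minimising the KL divergence on a grid, and the helper names `fit_alpha` and `alpha_lookup` are made up.

```python
import numpy as np
from scipy import integrate, optimize, special, stats

# Hypothetical expensive density g(x; c), where c labels the "situation".
def g(x, c):
    return stats.gamma.pdf(x, a=c)

# Hypothetical cheap family f(x; alpha), alpha = (mu, log_sigma) of a log-normal.
def f(x, alpha):
    return stats.lognorm.pdf(x, s=np.exp(alpha[1]), scale=np.exp(alpha[0]))

x = np.linspace(1e-3, 80.0, 8001)

def fit_alpha(c):
    """Fit alpha for one situation by minimising KL(g || f) on the grid."""
    gx = g(x, c)

    def kl(alpha):
        fx = np.maximum(f(x, alpha), 1e-300)  # guard against log(0)
        return integrate.trapezoid(special.rel_entr(gx, fx), x)

    # Rough starting guess (uses the stand-in's mean and variance, both c).
    start = np.array([np.log(c), -0.5 * np.log(c)])
    return optimize.minimize(kl, start, method="Nelder-Mead").x

# Done once, in advance: tabulate alpha on a coarse grid of situations.
c_table = np.linspace(3.0, 12.0, 10)
alpha_table = np.array([fit_alpha(c) for c in c_table])

def alpha_lookup(c):
    """Cheap run-time replacement: linearly interpolate each parameter in c."""
    return np.array([np.interp(c, c_table, alpha_table[:, k]) for k in range(2)])

c_new = 7.3
approx = f(x, alpha_lookup(c_new))
print("max abs PDF error:", np.max(np.abs(approx - g(x, c_new))))
```

Whether linear interpolation in `c` is good enough presumably depends on how smoothly the fitted parameters vary between tabulated situations, so the table spacing would need checking against direct fits.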
Any help or ideas would be much appreciated, thanks!