You do the computations in Mathematica's arbitrary precision with 32 decimal digits. The problem is that this is software-emulated arithmetic and thus much slower than machine precision. Using machine precision has the nice advantage that one can simply change FoldList + Table to a Do loop and compile it. So I tried:
cf = Compile[{{s, _Real}, {x, _Real}, {mkni, _Real}, {k1i, _Real}, {n1i, _Real}, {iter, _Integer}},
  Module[{r, sum},
   r = s;
   sum = s;
   Do[
    (* multiply the running product by the next ratio factor and accumulate *)
    r = r (x - i) (mkni - i)/((k1i + i) (n1i + i));
    sum += r
    , {i, 0, iter}];
   sum
   ],
  CompilationTarget -> "C",
  RuntimeOptions -> "Speed"
  ];
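Just to spell out what is being summed here (the compiled loop and OP's FoldList construction below produce the same partial-product sum):

$$\text{sum} \;=\; s + \sum_{j=0}^{\text{iter}} s \prod_{i=0}^{j} \frac{(x-i)\,(\text{mkni}-i)}{(\text{k1i}+i)\,(\text{n1i}+i)}.$$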
Here are the results of my experiments:
OP's version:
mkni = 9.577576587094`32*^14;
k1i = 1.2937885137981`32*^13;
n1i = 2.8913878463172`32*^13;
s = 1.5316416966770`32*^-25;
iter = 1249924;
x = 3.90577689449`32.*^11;
AbsoluteTiming[
xvars = Range[x, x - iter, -1];
mknis = Range[mkni, mkni - iter, -1];
k1is = Range[k1i, k1i + iter];
n1is = Range[n1i, n1i + iter];
factors = xvars*mknis/(k1is*n1is);
sList = FoldList[Times, s, factors];
c = Total[sList]
]
{2.2683, 9.9999965762182896702400893*10^-21}
My compiled version:
AbsoluteTiming[
c2 = cf[s, x, mkni, k1i, n1i, iter]
]
{0.004983`, 9.999996576218254`*^-21}
The relative error is pretty low, so I guess that machine precision will do for your application:
Abs[c - c2]/Abs[c]
1.45949*10^-14
More robust implementation
Okay, the floating point analysis in the comments below revealed that we can quickly run into underflow or overflow problems for values of x only slightly smaller or larger than OP's value of x. We can use a 64-bit signed integer to represent the exponent of the binary representation, which should considerably extend the under- and overflow thresholds. First a tiny top-level sketch of this mantissa/exponent splitting (using s as defined above, just for illustration), then my crude compiled implementation of it.
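(* split a positive number into mantissa m and integer exponent e with m*2^e == s *)
e = Round[Log2[s]];    (* -82 for OP's s *)
m = N[s] 2.^-e;        (* ≈ 0.74, comfortably within machine range *)
m 2.^e                 (* recovers 1.5316...*10^-25 *)
Here is my crude implementation of this: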
cf2 = Compile[{{s, _Real}, {x, _Real}, {mkni, _Real}, {k1i, _Real}, {n1i, _Real}, {iter, _Integer}},
  Module[{z, mr, er, msum, esum, factor, der, de, desum, esum10, msum10},
   (* split the start value s into mantissa mr and binary exponent er *)
   er = Round[Log2[s]];
   mr = s (2.^-er);
   {msum, esum} = {mr, er};
   Do[
    factor = (x - i) (mkni - i)/((k1i + i) (n1i + i));
    (* update the running product and renormalize its mantissa/exponent *)
    z = mr factor;
    der = Round[Log2[z]];
    mr = z (2.^-der);
    er = er + der;
    de = er - esum;
    (* Mantissa has 53 bits.
       Adding some further bits of tolerance for safety. *)
    If[-60 < de < 60,
     (
      z = msum + mr (2.^de);
      desum = Round[Log2[z]];
      msum = z (2.^-desum);
      esum += desum;
      )
     ,
     If[de >= 60,
      (* Summand is too big; new sum equals the summand. *)
      msum = mr;
      esum = er;
      ,
      (* de <= -60: summand is too small; discard it. *)
      msum = msum;
      esum = esum;
      ]
     ];
    , {i, 0, iter}];
   (* convert the base-2 exponent into a base-10 mantissa/exponent pair *)
   esum10 = esum Log[2.]/Log[10.];
   msum10 = msum (10.^(esum10 - Round[esum10]));
   {msum10, Round[esum10]}
   ],
  CompilationTarget -> "C",
  RuntimeOptions -> "Speed"
  ];
Here is a usage example:
AbsoluteTiming[{mc2, ec2} = cf2[s, x, mkni, k1i, n1i, iter]]
{0.052009, {1.70182, -25.}}
Read this as: the result is mc2 * 10.^ec2. It is quite a bit slower than cf, but the main reason is the use of Log2 and 2.^#&, which are quite costly functions. I use them to obtain the mantissa and the exponent of the binary representation. In C++ I could use std::frexp to access those bits directly, which is substantially less expensive. But to my knowledge, Mathematica and Compile do not provide any interface to that. =/
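If you need the value as a single number again, you can recombine the two parts outside of the compiled function; lifting the mantissa to an arbitrary-precision number first keeps extreme exponents from under- or overflowing a machine real (a small suggestion of mine, not part of the timings above):
(* recombine: arbitrary-precision mantissa times an exact power of ten *)
SetPrecision[mc2, 16] 10^Round[ec2]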