Bayesian multinomial-Dirichlet regression (using user-specific information) model for purchase counts across several products

Question

Suppose user $i$ has a purchase history over $J$ products, $q_i^1,\ldots,q_i^J$. The values of $K$ user characteristics (gender, for example), $x_i^1,\ldots,x_i^K$, are also known. I want to build a Bayesian regression model and use it to infer a probability distribution across products for any user.

I would like it to behave as follows: if no purchase history is known (only user characteristics), the prediction would be some average distribution. With more and more purchase history, the distribution would become more and more user-specific.

I have started with a Multinomial-Dirichlet framework:

$$ \text{concentration}\sim\Gamma(2, 0.5)\\ \beta_j^k\sim N(0, 1)\\ \text{purchase}_i\sim\text{DirichletMultinomial}(q_i, p_i*\text{concentration}) $$

where the first two lines define the prior distributions and the last defines the likelihood, with $q_i$ the total purchase quantity and $p_i=(p_i^1,\ldots,p_i^J)$ the vector of sampling probabilities. The user-specific characteristics control these probabilities through $p_i^j=\text{softmax}(\sum_kx_i^k\beta_j^k)$, with the softmax taken over the products $j$.
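To make the link between characteristics and probabilities concrete, here is a minimal sketch in plain Python. The function names and the numeric values of `x_i` and `beta` are made up purely for illustration; in the actual model the coefficients would be estimated, not fixed.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def product_probabilities(x, beta):
    """x: list of K characteristics; beta: J rows of K coefficients (beta[j][k])."""
    scores = [sum(xk * bjk for xk, bjk in zip(x, beta_j)) for beta_j in beta]
    return softmax(scores)

# Hypothetical values: K = 2 characteristics, J = 3 products.
x_i = [1.0, 0.0]
beta = [[0.5, -0.2], [0.0, 0.3], [-0.5, 0.1]]
p_i = product_probabilities(x_i, beta)
print(p_i)
```

Two users with the same `x_i` get exactly the same `p_i` here, which is the behaviour described in the next paragraph.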

After estimating the model coefficients, I could do posterior sampling to get a predicted purchase distribution given some user information $x_i$. But that would give an average purchase distribution for a specific set of $x_i$. What I don't understand is how to use it to get a different distribution depending on the user-specific purchase history. I would expect different distributions for a customer with one purchase and a customer with one hundred purchases, even though they might have the same values of $x_i$.

Is this a conceptual misunderstanding on my side, or do I need a different model setup?

Bill Vander Lugt · Accepted Answer · 2024-11-19 17:55:50Z

If you already know a customer's actual purchase history, I gather you are trying to infer their future, additional purchases based on their personal characteristics K?

If so, your setup may not be accomplishing what you want because your concentration parameter is being randomly drawn instead of reflecting the actual, observed number of purchases already made by a given customer or the strength of your prior.

Instead of thinking about a DirichletMultinomial distribution, I myself would consider a Bayesian model with a Dirichlet prior distribution (derived, as you propose, from K) and a Multinomial likelihood captured by your previously observed purchase vector J. Combining those two would (because of these two distributions' conjugacy) yield a Dirichlet posterior distribution, whose attributes could then be analytically evaluated without any recourse to sampling (though it's easy enough to draw samples from a Dirichlet too if you really wanted). The parameters of the posterior would then represent the sum of the observed past purchases (likelihood) and the predicted future purchases (prior), if that is, in fact, what you are trying to capture.
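As a rough sketch of that conjugate update in plain Python: the prior parameters and the two purchase vectors below are invented for illustration, and the helper names are my own. Two customers share the same prior (same characteristics) but have different histories.

```python
def dirichlet_posterior(prior_alpha, counts):
    """Conjugate update: Dirichlet prior + Multinomial counts -> Dirichlet posterior."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior_alpha and counts must have the same length")
    return [a + n for a, n in zip(prior_alpha, counts)]

def dirichlet_mean(alpha):
    """Expected probability vector under Dir(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Hypothetical prior derived from the user's characteristics (4 products).
prior_alpha = [1.0, 3.0, 4.0, 2.0]

# Two customers with identical characteristics but different histories.
one_purchase = [1, 0, 0, 0]
hundred_purchases = [70, 10, 10, 10]

post_one = dirichlet_posterior(prior_alpha, one_purchase)
post_hundred = dirichlet_posterior(prior_alpha, hundred_purchases)

print(dirichlet_mean(prior_alpha))   # prior mean, no history
print(dirichlet_mean(post_one))      # still close to the prior mean
print(dirichlet_mean(post_hundred))  # dominated by the observed counts
```

With a single purchase the posterior mean barely moves from the prior; with a hundred purchases it is driven mostly by that customer's own counts.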

This setup at least has the benefit of capturing your expectation of a "different distribution for customer with one purchase and one hundred purchases, although they might have the same values of x" because the Dirichlet prior can be made as strong or weak as you choose and because the amount of variance in the posterior will reflect both the strength of your K-derived prior and the number of purchases already observed (the sum of the elements in J).
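The variance claim can be checked with the standard formula for the marginal variance of a Dirichlet component, $m_j(1-m_j)/(\alpha_0+1)$, where $m_j$ is the mean and $\alpha_0$ is the sum of the parameters. A small sketch, again with made-up numbers chosen so that the observed counts have the same proportions as the prior:

```python
def dirichlet_mean_and_variance(alpha):
    """Per-component mean and variance of Dir(alpha)."""
    alpha_0 = sum(alpha)
    means = [a / alpha_0 for a in alpha]
    variances = [m * (1.0 - m) / (alpha_0 + 1.0) for m in means]
    return means, variances

# Hypothetical prior and a purchase history of 100 items in the same proportions.
prior_alpha = [1.0, 3.0, 4.0, 2.0]
counts = [10, 30, 40, 20]
posterior_alpha = [a + n for a, n in zip(prior_alpha, counts)]

prior_means, prior_vars = dirichlet_mean_and_variance(prior_alpha)
post_means, post_vars = dirichlet_mean_and_variance(posterior_alpha)

print(prior_vars)
print(post_vars)
```

The means are unchanged in this example, but every posterior variance is roughly a tenth of the corresponding prior variance, because the total of the parameters grew from 10 to 110.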

Since your proposed setup already uses a concentration parameter, note that a Dirichlet distribution can also be parameterized in terms of an overall concentration parameter (which in the prior can be set to whatever "strength" you want) that gets multiplied by a normalized vector of probabilities. For example, if, based on some user's characteristics K you've generated a Dir(1, 3, 4, 2) prior, that prior can readily be reparameterized as Dir(.1 * concentration, .3 * concentration, .4 * concentration, .2 * concentration) with a concentration of 10.
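Under that parameterization, the posterior mean is a weighted average of the prior probabilities and the observed purchase frequencies, with the prior getting weight concentration / (concentration + number of purchases). A minimal sketch, where the helper names and the example counts are assumptions of mine:

```python
def split_alpha(alpha):
    """Rewrite Dir(alpha) as (concentration, normalized probability vector)."""
    concentration = sum(alpha)
    return concentration, [a / concentration for a in alpha]

def posterior_mean(prior_probs, concentration, counts):
    """Posterior mean of Dir(concentration * prior_probs) after observing counts."""
    n_total = sum(counts)
    return [
        (concentration * p + n) / (concentration + n_total)
        for p, n in zip(prior_probs, counts)
    ]

concentration, probs = split_alpha([1.0, 3.0, 4.0, 2.0])
print(concentration, probs)  # concentration 10, probabilities .1, .3, .4, .2

# Same hypothetical history (5 purchases of product 1), different prior strengths.
counts = [5, 0, 0, 0]
weak_prior = posterior_mean(probs, 1.0, counts)
strong_prior = posterior_mean(probs, 100.0, counts)
print(weak_prior[0], strong_prior[0])
```

A weak prior lets five purchases pull the first probability most of the way toward 1, while a strong prior keeps it close to the prior value of .1.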

Since I'm not entirely clear what you are trying to predict, I can't be sure this alternative setup fits the bill either. But I hope it at least addresses your puzzlement about the number of previously observed purchases failing to impact your posterior.
