Estimating categorical distributions under marginal constraints, so that a sample from a population is summarized in the most generalizable way, is key to many machine-learning and data-driven approaches. We provide a parameter-agnostic theoretical framework for this task, which ensures that a categorical distribution of Maximum Entropy under marginal constraints (i) always exists and (ii) is unique. Iterative proportional fitting (IPF) naturally estimates that distribution from any consistent set of marginal constraints, directly in the space of probabilities, thus deductively identifying a least-biased characterization of the population. Together, the theoretical framework and IPF lead to a holistic workflow that enables modeling any class of categorical distributions using solely the phenomenological information provided.
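
As an illustration of how IPF operates directly on probabilities, the following is a minimal sketch for the simplest case of two categorical variables constrained only by their one-way marginals. The function name `ipf_two_way`, its parameters, and the example marginals are hypothetical choices made for this sketch, not part of the framework described above. Starting from the uniform table, the procedure alternately rescales rows and columns until both sets of marginals are matched.

```python
def ipf_two_way(row_marginals, col_marginals, max_iter=1000, tol=1e-12):
    """Fit a joint table p[i][j] to target row and column marginals by IPF.

    Hypothetical helper for illustration; assumes both marginals are
    non-negative and each sums to 1 (i.e. the constraints are consistent).
    """
    n_rows, n_cols = len(row_marginals), len(col_marginals)
    # Start from the uniform table: the maximum-entropy distribution
    # when no constraints have been imposed yet.
    p = [[1.0 / (n_rows * n_cols)] * n_cols for _ in range(n_rows)]

    for _ in range(max_iter):
        # Row step: rescale each row so that it sums to its target marginal.
        for i in range(n_rows):
            s = sum(p[i])
            if s > 0:
                scale = row_marginals[i] / s
                p[i] = [v * scale for v in p[i]]

        # Column step: rescale each column so that it sums to its target.
        for j in range(n_cols):
            s = sum(p[i][j] for i in range(n_rows))
            if s > 0:
                scale = col_marginals[j] / s
                for i in range(n_rows):
                    p[i][j] *= scale

        # Columns match exactly after the column step, so convergence
        # is checked on the row marginals only.
        err = max(abs(sum(p[i]) - row_marginals[i]) for i in range(n_rows))
        if err < tol:
            break
    return p


# Example with made-up marginals for a 2 x 3 table.
joint = ipf_two_way([0.3, 0.7], [0.2, 0.5, 0.3])
```

With only one-way marginals as constraints, the fitted table is the product of the two marginals, which is the maximum-entropy joint distribution in this special case. Higher-order marginal constraints follow the same pattern of cyclic rescaling, but generally need more than one sweep to converge.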