This article introduces a non-parametric, information-theoretic approach to inference about the tail of a continuous or discrete distribution. The approach is built on a new concept, the tail profile: a set of information-theoretic quantities developed from results on domains of attraction on countable alphabets. Theoretical evidence supports using the tail profile to identify specific discrete distributional tail types through a sequence of plots. The approach discerns tail types by benchmarking against the exponential family and three thicker-than-exponential families: near-exponential, sub-exponential, and power-law (Zipf, Pareto). For thicker-than-exponential tails, the approach also provides point and interval estimates for some of the underlying distribution parameters. Although the method is primarily designed to streamline the selection of discrete parametric models for detailed statistical analysis, a supporting theorem extends its use to continuous data: under certain conditions, binning continuous data with a common width preserves the tail decay rate. Simulations demonstrate the method's performance across various scenarios.
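The binning claim can be illustrated with a small sketch in the simplest case. It is not the article's theorem or method, only a standard fact about the exponential distribution: binning an Exponential(rate) variable with a common width gives a geometric distribution, so the discrete tail still decays exponentially. The function name `binned_exponential_pmf` and the values of `rate` and `width` are assumptions chosen for illustration.

```python
import math

def binned_exponential_pmf(rate, width, n_bins):
    """Exact bin probabilities P(k*width <= X < (k+1)*width), k = 0..n_bins-1,
    for X ~ Exponential(rate). Hypothetical helper, for illustration only."""
    return [
        math.exp(-rate * width * k) - math.exp(-rate * width * (k + 1))
        for k in range(n_bins)
    ]

# Illustrative values, not taken from the article.
rate, width = 1.5, 0.4
pmf = binned_exponential_pmf(rate, width, 10)

# Ratio of successive bin probabilities: constant for a geometric tail.
ratios = [pmf[k + 1] / pmf[k] for k in range(len(pmf) - 1)]
expected_ratio = math.exp(-rate * width)

for k, r in enumerate(ratios):
    print(f"bin {k} -> {k + 1}: ratio = {r:.6f}")
print(f"exp(-rate * width) = {expected_ratio:.6f}")
```

Every ratio equals exp(-rate * width), so the binned distribution keeps the exponential decay rate of the continuous one, rescaled by the bin width. The article's theorem concerns more general tails under conditions not stated in the abstract.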