Minimum Description Length (MDL) estimators, using two-part codes for universal coding, are analyzed. For general parametric families under certain regularity conditions, we introduce a two-part code whose regret is close to the minimax regret, where regret of a code with respect to a target family M is the difference between the code length of the code and the ideal code length achieved by an element in M. This is a generalization of the result for exponential families by Gr\"unwald. Our code is constructed by using an augmented structure of M with a bundle of local exponential families for data description, which is not needed for exponential families. This result gives a tight upper bound on risk and loss of the MDL estimators based on the theory introduced by Barron and Cover in 1991. Further, we show that we can apply the result to mixture families, which are a typical example of non-exponential families.
翻译:暂无翻译