The German Tank Problem dates back to World War II, when the Allies used a statistical approach to estimate the number of enemy tanks produced or in the field from the serial numbers observed after battles. Assuming the tanks are labeled consecutively starting from 1, if we observe $k$ tanks out of a total of $N$ and the largest observed serial number is $m$, then the best estimate for $N$ is $m(1 + 1/k) - 1$. We explore many generalizations. We first examine the discrete and continuous one-dimensional cases. We consider different estimators, such as the $L$\textsuperscript{th} largest tank, and, motivated by portfolio theory, a weighted average; however, the original formula performs best. We then generalize the problem to two dimensions, with pairs instead of points, studying the discrete and continuous square and circle variants. Complications arise from curvature and from the fact that not every number is representable as a sum of two squares. We often concentrate on the large-$N$ limit. For the discrete and continuous square, we test various statistics and find that the largest observed component does best; the scaling factor in both cases is $(2k+1)/(2k)$. The discrete circle is especially involved because it requires approximation formulas for the number of lattice points inside the circle. Interestingly, the scaling factors differ between the cases. Lastly, we generalize the problem to $L$-dimensional squares and circles. The discrete and continuous squares behave much like the two-dimensional square problem. For the $L$-dimensional circle, however, we need formulas for the volume of the $L$-ball and must approximate the number of lattice points inside it. The formulas for the discrete circle are particularly interesting, as they have no $L$ dependence.
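As a rough illustration of the one-dimensional formula $m(1 + 1/k) - 1$, the following Python sketch computes the estimate from a sample of serial numbers and checks by simulation that its average is close to the true $N$. The function names `estimate_total` and `simulate`, the seed, and the sample values are assumptions made for this example only and do not come from the paper.

```python
import random


def estimate_total(serials):
    """Return the classical estimate m(1 + 1/k) - 1.

    serials: the k distinct serial numbers observed (labels start at 1).
    """
    k = len(serials)   # number of tanks observed
    m = max(serials)   # largest serial number observed
    return m * (1 + 1 / k) - 1


def simulate(n_true, k, trials, seed=0):
    """Average the estimate over repeated samples drawn without replacement.

    n_true, k, trials and seed are illustrative parameters, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(range(1, n_true + 1), k)
        total += estimate_total(sample)
    return total / trials


if __name__ == "__main__":
    # Four observed tanks with maximum serial 60: 60 * (1 + 1/4) - 1
    print(estimate_total([19, 40, 42, 60]))
    # The average over many samples should land near the true N = 1000
    print(simulate(n_true=1000, k=10, trials=20000))
```

Sampling without replacement mirrors the assumption that each tank carries a distinct serial number; any single estimate can still be far from $N$, and only the average over many samples is expected to be close.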