Missing values are common in multidimensional data warehouses. They reduce the usefulness of the data, because analysis in a multidimensional data warehouse is carried out along several dimensions, each organized into hierarchies whose levels can be rolled up or drilled down to reach different analysis parameters. Completing these missing values is therefore essential for sound analysis. Some existing data imputation methods are suited to numeric data, so they can be applied to fact tables but not to dimension tables. Other imputation methods incur extra time and effort costs. Consequently, we propose in this article an internal data imputation method for multidimensional data warehouses that relies on the existing data and takes into account intra-dimension and inter-dimension relationships.