With the growing focus on large-scale analytics, we are increasingly confronted with the need to integrate data from multiple sources. The problem is that these data are impossible to reuse as-is. The net result is high cost, with the further drawback that the resulting integrated data will, again, hardly be reusable as-is. iTelos is a general-purpose methodology that aims to minimize the effects of this process. The intuition is that data are treated differently based on their popularity: the more a given set of data has been reused, the more it will be reused and the less it will change across reuses, thus decreasing overall data preprocessing costs while increasing backward compatibility and future sharing.