Selecting a small set of representatives from a large database is important in many applications such as multi-criteria decision making, web search, and recommendation. The $k$-regret minimizing set ($k$-RMS) problem was recently proposed for representative tuple discovery. Specifically, for a large database $P$ of tuples with multiple numerical attributes, the $k$-RMS problem asks for a size-$r$ subset $Q$ of $P$ such that, for any possible ranking function, the score of the top-ranked tuple in $Q$ is not much worse than the score of the $k$\textsuperscript{th}-ranked tuple in $P$. Although the $k$-RMS problem has been extensively studied in the literature, existing methods are designed for the static setting and cannot maintain the result efficiently when the database is updated. To address this issue, we propose the first fully dynamic algorithm for the $k$-RMS problem, which efficiently maintains an up-to-date result under any insertion or deletion in the database, with a provable guarantee. Experimental results on several real-world and synthetic datasets demonstrate that our algorithm runs up to four orders of magnitude faster than existing $k$-RMS algorithms while returning results of nearly equal quality.
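
The phrase ``not much worse'' can be made precise through a regret ratio. The following is only a sketch of one common formalization, assuming that ranking functions are nonzero, nonnegative linear utilities $u \in \mathbb{R}^d_{\ge 0}$ over tuples with $d$ attributes and that the $k$\textsuperscript{th}-ranked score is positive. The symbols $\omega$, $\omega_k$, $\mathrm{rr}_k$, and $\mathrm{mrr}_k$ are notation introduced here for illustration and do not appear in the text above.
\[
\omega(u, Q) = \max_{q \in Q} \langle u, q \rangle, \qquad
\omega_k(u, P) = \text{the $k$\textsuperscript{th} largest value of } \langle u, p \rangle \text{ over } p \in P,
\]
\[
\mathrm{rr}_k(u, Q) = \max\Bigl(0,\; 1 - \frac{\omega(u, Q)}{\omega_k(u, P)}\Bigr), \qquad
\mathrm{mrr}_k(Q) = \max_{u} \mathrm{rr}_k(u, Q).
\]
Under this reading, the $k$-RMS problem seeks a size-$r$ subset $Q \subseteq P$ that minimizes $\mathrm{mrr}_k(Q)$. For instance, if for some $u$ the $k$\textsuperscript{th}-ranked tuple in $P$ scores $10$ and the best tuple in $Q$ scores $9$, then $\mathrm{rr}_k(u, Q) = 1 - 9/10 = 0.1$.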