Enhancing Cluster Quality of Numerical Datasets with Domain Ontology - 专知 Papers


Domain ontology · Ontology · Datasets · Clustering methods · Attributes

April 2, 2023

Enhancing Cluster Quality of Numerical Datasets with Domain Ontology


Sudath Rohitha Heiyanthuduwage, Md Anisur Rahman, Md Zahidul Islam

From arXiv, 6 pages, IEEE CSDE 2022 conference paper

Ontology-based clustering has gained attention in recent years due to the potential benefits of ontology. Current ontology-based clustering approaches have mainly been applied to reduce the dimensionality of attributes in text document clustering. Reducing the dimensionality of attributes using an ontology helps to produce high-quality clusters for a dataset. However, ontology-based approaches to clustering numerical datasets have not gained enough attention. Moreover, some literature mentions that ontology-based clustering can produce either high-quality or low-quality clusters from a dataset. Therefore, in this paper we present a clustering approach that uses a domain ontology to reduce the dimensionality of attributes in a numerical dataset and to produce high-quality clusters. For every dataset, we produce three datasets using the domain ontology. We then cluster these datasets using a genetic algorithm-based clustering technique called GenClust++. The clusters of each dataset are evaluated in terms of Sum of Squared Error (SSE). We use six numerical datasets to evaluate the performance of our ontology-based approach. The experimental results of our approach indicate that cluster quality gradually improves from the lower to the higher levels of a domain ontology.
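
As a rough illustration of the evaluation described in the abstract, the Python sketch below computes SSE as the sum of squared Euclidean distances from each record to the centroid of its cluster, first on a toy dataset and then on a reduced version of it. Everything beyond the SSE definition is an assumption made for illustration: the `aggregate_by_concept` helper, the column groups, and the toy data are hypothetical, and averaging the attributes under a shared parent concept is only a stand-in, since the abstract does not say how attributes are merged at each ontology level. Cluster labels are fixed by hand here and are not produced by GenClust++.

```python
import numpy as np


def sse(X, labels):
    """Sum of squared Euclidean distances from each record to its cluster centroid."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]          # records assigned to cluster k
        centroid = members.mean(axis=0)   # per-attribute mean of the cluster
        total += float(((members - centroid) ** 2).sum())
    return total


def aggregate_by_concept(X, groups):
    """Hypothetical reduction: replace each group of columns by its row-wise mean.

    `groups` lists the column indices that are assumed to share a parent
    concept in the domain ontology.
    """
    X = np.asarray(X, dtype=float)
    return np.column_stack([X[:, cols].mean(axis=1) for cols in groups])


# Toy data: 4 records, 4 numerical attributes (made up for this example).
X = np.array([
    [1.0, 2.0, 10.0, 12.0],
    [1.0, 4.0, 10.0, 14.0],
    [5.0, 6.0, 20.0, 22.0],
    [7.0, 6.0, 20.0, 24.0],
])
labels = np.array([0, 0, 1, 1])   # fixed cluster assignment, not GenClust++ output
groups = [[0, 1], [2, 3]]         # assumed parent concepts covering two columns each

X_reduced = aggregate_by_concept(X, groups)
print(sse(X, labels))          # SSE on the original attributes
print(sse(X_reduced, labels))  # SSE on the reduced attributes
```

SSE values computed on datasets with different numbers of attributes are on different scales, so the two numbers printed here only show how the measure is calculated and are not a comparison of cluster quality.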


