
In today's information and computing society, complex systems are frequently modeled as multi-modal networks associated with heterogeneous structural relations, unstructured attributes/content, temporal context, or their combinations. The rich information in multi-modal networks demands both domain understanding and a large exploratory search space when engineering features for customized intelligent solutions serving different purposes. Automated feature discovery through representation learning in multi-modal networks has therefore become essential for many applications. In this tutorial, we systematically review the area of multi-modal network representation learning, covering a series of recent methods and applications. These methods are categorized and introduced from the perspectives of unsupervised, semi-supervised, and supervised learning, each accompanied by corresponding real-world applications. Finally, we conclude the tutorial with an open discussion. The authors of this tutorial are active and productive researchers in this area.
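
To make the notion of a multi-modal network concrete, here is a minimal illustrative sketch (not part of the tutorial materials) of how such a network, with heterogeneous node/edge types, unstructured attributes, and temporal context, might be represented in Python using networkx. All node names and attribute fields are hypothetical.

```python
import networkx as nx

# A multi-modal network: heterogeneous node types, typed relations,
# unstructured attributes (text), and temporal context on edges.
G = nx.MultiDiGraph()

# Nodes of different modalities/types (names and fields are hypothetical).
G.add_node("u1", node_type="user", profile_text="ML researcher")
G.add_node("p1", node_type="paper", abstract="Representation learning on graphs ...")
G.add_node("v1", node_type="venue", name="KDD")

# Heterogeneous relations, each carrying a timestamp.
G.add_edge("u1", "p1", relation="writes", time="2020-08-23")
G.add_edge("p1", "v1", relation="published_at", time="2020-08-23")

# Manual feature engineering must handle every node/edge type separately,
# which is why automated representation learning is attractive here.
for node, attrs in G.nodes(data=True):
    print(node, attrs["node_type"])
```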

https://chuxuzhang.github.io/KDD20_Tutorial.html

  • Part 1: Introduction and Overview (Nitesh Chawla) (1:00-1:10pm) [slide] [video]
  • Part 2: Supervised Methods and Applications
    ◦ 2-1: User and behavior modeling (Meng Jiang) (1:10-1:50pm) [slide] [video]
    ◦ 2-2: Cybersecurity and health intelligence (Yanfang Ye) (1:50-2:20pm) [slide] [video]
    ◦ 2-3: Relation learning (Chuxu Zhang) (2:20-2:35pm) [slide] [video]
  • Coffee Break (2:35-3:00pm)
  • Part 3: Semi-supervised Methods and Applications
    ◦ 3-1: Attributed network embedding (Xiangliang Zhang) (3:00-3:25pm) [slide] [video]
    ◦ 3-2: Graph alignment (Xiangliang Zhang) (3:25-3:40pm) [slide] [video]
  • Part 4: Unsupervised Methods and Applications
    ◦ 4-1: Heterogeneous graph representation learning (Chuxu Zhang) (3:40-4:00pm) [slide] [video]
    ◦ 4-2: Graph neural network for dynamic graph and unsupervised anomaly detection (Meng Jiang) (4:00-4:20pm) [slide] [video]
  • Part 5: Conclusions (Chuxu Zhang) (4:20-5:00pm) [slide] [video]

Latest Content

Causality knowledge is vital to building robust AI systems. Deep learning models often perform poorly on tasks that require causal reasoning, which is often derived from commonsense knowledge that is not immediately available in the input but is implicitly inferred by humans. Prior work has unraveled spurious observational biases that models fall prey to in the absence of causality. While language representation models preserve contextual knowledge within learned embeddings, they do not factor in causal relationships during training. Blending causal relationships into the input features of an existing model that performs visual cognition tasks (such as scene understanding, video captioning, and video question-answering) can yield better performance owing to the insight that causal relationships bring. Recently, several models have been proposed that tackle the task of mining causal data from either the visual or the textual modality. However, little research mines causal relationships by juxtaposing the visual and language modalities. While images offer a rich and easy-to-process resource for mining causality knowledge, videos are denser and consist of naturally time-ordered events. Moreover, textual information offers details that may be implicit in videos. We propose iReason, a framework that infers visual-semantic commonsense knowledge using both videos and natural language captions. Furthermore, iReason's architecture integrates a causal rationalization module to aid interpretability, error analysis, and bias detection. We demonstrate the effectiveness of iReason through a two-pronged comparative analysis with language representation learning models (BERT, GPT-2) as well as current state-of-the-art multimodal causality models.
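
As a rough illustration of the feature-blending idea described in the abstract, the sketch below fuses causal-relationship features with visual features before a task head. It is a minimal assumption-laden example, not iReason's actual architecture; all dimensions, names, and the concatenation-based fusion scheme are invented for illustration.

```python
import torch
import torch.nn as nn

class CausalFusionHead(nn.Module):
    """Toy head that blends causal features with visual features.

    Illustrative only: dimensions, names, and the fusion scheme
    (simple concatenation) are assumptions, not iReason's design.
    """
    def __init__(self, visual_dim=512, causal_dim=128, num_classes=10):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(visual_dim + causal_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, visual_feats, causal_feats):
        # Blend causal knowledge into the input features by concatenation.
        fused = torch.cat([visual_feats, causal_feats], dim=-1)
        return self.classifier(fused)

# Usage with random stand-in features (batch of 4).
head = CausalFusionHead()
logits = head(torch.randn(4, 512), torch.randn(4, 128))
print(logits.shape)  # torch.Size([4, 10])
```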

