Computer-Assisted Creation of Boolean Search Rules for Text Classification in the Legal Domain - 专知 (Zhuanzhi) Papers


Text Classification · CASE · Machine Learning · INTERACT · INFORMS

December 10, 2021

Computer-Assisted Creation of Boolean Search Rules for Text Classification in the Legal Domain


Hannes Westermann, Jaromir Savelka, Vern R. Walker, Kevin D. Ashley, Karim Benyekhlef

In this paper, we present a method of building strong, explainable classifiers in the form of Boolean search rules. We developed an interactive environment called CASE (Computer Assisted Semantic Exploration) which exploits word co-occurrence to guide human annotators in selection of relevant search terms. The system seamlessly facilitates iterative evaluation and improvement of the classification rules. The process enables the human annotators to leverage the benefits of statistical information while incorporating their expert intuition into the creation of such rules. We evaluate classifiers created with our CASE system on 4 datasets, and compare the results to machine learning methods, including SKOPE rules, Random forest, Support Vector Machine, and fastText classifiers. The results drive the discussion on trade-offs between superior compactness, simplicity, and intuitiveness of the Boolean search rules versus the better performance of state-of-the-art machine learning models for text classification.
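To make the idea of a Boolean search rule acting as a classifier more concrete, here is a minimal sketch. It is not the CASE system: the nested-tuple rule format, the helper names (`matches`, `cooccurring_terms`, `precision_recall`), the stopword list, and the toy sentences and labels are all assumptions made for illustration. It shows a rule being evaluated against sentences, a simple sentence-level co-occurrence count that could suggest candidate terms to an annotator, and a precision/recall check that supports iterative refinement.

```python
import re
from collections import Counter

# Hypothetical stopword list, used only to keep suggestions readable.
STOPWORDS = {"the", "a", "that", "is", "in", "has", "we", "not"}


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def matches(rule, tokens):
    """Evaluate a nested Boolean rule against a set of tokens.

    A rule is either a plain term (str) or a tuple whose first element
    is "AND", "OR", or "NOT", followed by one or more sub-rules.
    """
    if isinstance(rule, str):
        return rule in tokens
    op, *args = rule
    if op == "AND":
        return all(matches(a, tokens) for a in args)
    if op == "OR":
        return any(matches(a, tokens) for a in args)
    if op == "NOT":
        return not matches(args[0], tokens)
    raise ValueError(f"unknown operator: {op}")


def cooccurring_terms(sentences, seed, top_k=3):
    """Rank non-stopword terms by how many sentences they share with `seed`."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        if seed in tokens:
            counts.update(tokens - {seed} - STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def precision_recall(rule, sentences, labels):
    """Score the rule as a binary classifier against gold labels."""
    preds = [matches(rule, tokenize(s)) for s in sentences]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented): label is True when the sentence states a finding.
sentences = [
    "The court finds that the veteran has a current disability.",
    "The Board finds that the veteran gave credible evidence.",
    "The veteran filed a claim in 2010.",
    "We find that the claim is not supported.",
]
labels = [True, True, False, True]

# A candidate rule: (finds OR find) AND NOT filed
rule = ("AND", ("OR", "finds", "find"), ("NOT", "filed"))

if __name__ == "__main__":
    print(cooccurring_terms(sentences, "finds", top_k=1))
    print(precision_recall(rule, sentences, labels))
```

The rule stays short enough to read and audit directly, which is the compactness and intuitiveness the abstract weighs against the stronger performance of learned models.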

