Registration Deadline: 2019/05/01
The CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC) is the annual conference of CCF TCCI (Technical Committee of Chinese Information, China Computer Federation). The NLPCC conferences have been successfully held in Beijing (2012), Chongqing (2013), Shenzhen (2014), Nanchang (2015), Kunming (2016), Dalian (2017) and Hohhot (2018). This year's NLPCC conference will be held in Dunhuang on October 9–14.
NLPCC 2019 will follow the NLPCC tradition of holding several shared tasks in natural language processing and Chinese computing. This year's shared tasks focus on both classic problems and newly emerging problems, including:
• Task 1: Cross-Domain Dependency Parsing
o Organizer: Soochow University and Alibaba Inc.
o Contact: Zhenghua Li (firstname.lastname@example.org) and Rui Wang (email@example.com)
• Task 2: Open Domain Semantic Parsing
o Organizer: Microsoft Research Asia
o Contact: Nan Duan (firstname.lastname@example.org)
• Task 3: Dialogue System
o Organizer: RSVP Technologies Inc.
o Contact: Ying Shan (email@example.com)
The top 3 participating teams of each task will be certified by NLPCC and the CCF Technical Committee on Chinese Information Technology. If a task has multiple sub-tasks, only the top participating team of each sub-task will be certified.
The detailed description of each task can be found in the task guidelines. Participants from both academia and industry are welcome. Each group may participate in one or more tasks, and members of each group can attend the NLPCC conference to present their techniques and results. Participants will be invited to submit papers to the main conference, and accepted papers will appear in the conference proceedings published by Springer LNCS.
1. Overview of the Shared Tasks
There are three shared tasks in this year's NLPCC conference; the detailed description of each task can be found in the released task guidelines. Here we give only a brief overview of each task.
◇ Task 1 – Cross-Domain Dependency Parsing
With the surge of web data (user-generated content), cross-domain parsing has become a major challenge for applying syntactic analysis in realistic NLP systems. To address the lack of labeled data, we have, with considerable effort, manually annotated large-scale, high-quality domain-aware datasets (http://hlt.suda.edu.cn/index.php/SUCDT) over the past few years. We provide a source-domain labeled dataset (~20K sentences from a balanced corpus), three target-domain labeled datasets (product blogs, product comments, and web fiction; ~25K sentences in total), and large-scale unlabeled texts (size to be determined). We set up four sub-tasks combining two cross-domain scenarios, i.e., semi-supervised (thousands of target-domain labeled sentences for training) and unsupervised (no target-domain labeled data for training), with two tracks, i.e., closed and open.
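The official metric is defined in the task guidelines, but dependency parsing shared tasks are conventionally scored with attachment scores over predicted head indices. As a minimal sketch (the toy sentence and labels below are purely illustrative):

```python
def attachment_scores(gold, pred):
    """Compute unlabeled (UAS) and labeled (LAS) attachment scores.

    gold, pred: per-token lists of (head_index, relation_label).
    """
    assert len(gold) == len(pred)
    n = len(gold)
    correct_heads = sum(1 for g, p in zip(gold, pred) if g[0] == p[0])
    correct_both = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct_heads / n, correct_both / n

# Toy 4-token sentence: each entry is (head, deprel); head 0 = root.
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "obj")]
uas, las = attachment_scores(gold, pred)  # 0.75, 0.75
```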
◇ Task 2 – Open Domain Semantic Parsing
The goal of this task is to predict the logical form (in lambda-calculus) of an input question based on a given knowledge graph. For example, for the question "when was Bill Gates born?", the system should predict its lambda-calculus logical form. Each question in our dataset is annotated with entities, the question type, and the corresponding logical form. We split the dataset into a training set, a development set and a test set. The training and development sets will be provided to participating teams, while the test set will NOT. After participating teams submit their output files, we will evaluate their performance.
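The exact annotation format and lambda-calculus notation are defined in the task guidelines; as a purely hypothetical sketch (field names and the logical-form notation below are illustrative assumptions, not the official format), an annotated record might resemble:

```python
# Hypothetical record layout -- field names and the lambda notation are
# illustrative assumptions, not the official NLPCC 2019 data format.
example = {
    "question": "when was Bill Gates born?",
    "entities": ["Bill_Gates"],
    "question_type": "single-relation",
    "logical_form": "( lambda ?x ( date_of_birth Bill_Gates ?x ) )",
}
```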
◇ Task 3 – Dialogue System
In NLPCC 2019, we set up an open-domain conversation task to evaluate human-computer conversations. All participating systems will talk with human annotators in a live, user-in-the-loop setting. In this task, understanding natural language inputs (which can be questions or statements) is crucial, as is producing smooth responses. Responses will be evaluated along five aspects. We will also provide human-annotated real data to researchers as a contribution to the community.
2. How to Participate
Please fill out the registration form and send it to the coordinators of the tasks by email.
If you have any question about the shared tasks, please do not hesitate to contact us (firstname.lastname@example.org and email@example.com).
3. Important Dates
2019/03/15: announcement of shared tasks and call for participation;
2019/04/01: release of detailed task guidelines and training data;
2019/05/15: test data release;
2019/05/20: participants' results submission deadline;
2019/05/30: release of evaluation results and call for system reports and conference papers;
2019/06/30: conference paper submission deadline (shared tasks only);
2019/07/30: conference paper accept/reject notification;
2019/08/10: camera-ready paper submission deadline;
2019/10/12–14: NLPCC 2019 main conference.
4. Shared Task Organizers (in alphabetical order)
Nan Duan, Microsoft Research Asia
Zhenghua Li, Soochow University
Ying Shan, RSVP Technologies Inc.
Rui Wang, Alibaba Inc.
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task. More recently, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum for the purpose of learning a problem that may otherwise be too difficult to learn from scratch. In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals. Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research.
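As an illustration of the task-sequencing idea described above (not any specific method from the survey), a curriculum can be as simple as ordering source tasks by difficulty and warm-starting each task from the parameters learned on the previous one:

```python
def train(task, init_params):
    """Placeholder learner: stands in for training that refines
    parameters on a task, starting from init_params."""
    return init_params + [task]  # stand-in for updated weights

def curriculum_learn(tasks, difficulty):
    """Train on tasks from easiest to hardest, transferring parameters
    from each task to the next (a minimal curriculum sketch)."""
    params = []  # learn the easiest task from scratch
    for task in sorted(tasks, key=difficulty):
        params = train(task, params)  # warm-start from previous task
    return params

order = curriculum_learn(
    ["hard", "easy", "medium"],
    {"easy": 0, "medium": 1, "hard": 2}.get,
)
# order == ["easy", "medium", "hard"]
```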
We survey research on self-driving cars published in the literature, focusing on autonomous cars developed since the DARPA challenges that are equipped with an autonomy system categorized as SAE level 3 or higher. The architecture of the autonomy system of self-driving cars is typically organized into a perception system and a decision-making system. The perception system is generally divided into many subsystems responsible for tasks such as self-driving-car localization, static-obstacle mapping, moving-obstacle detection and tracking, road mapping, and traffic-signalization detection and recognition, among others. The decision-making system is likewise commonly partitioned into many subsystems responsible for tasks such as route planning, path planning, behavior selection, motion planning, and control. In this survey, we present the typical architecture of the autonomy system of self-driving cars. We also review research on relevant methods for perception and decision making. Furthermore, we present a detailed description of the architecture of the autonomy system of UFES's car, IARA. Finally, we list prominent autonomous research cars developed by technology companies and reported in the media.
In this proposal we present the idea of a "macro recommender system", and "micro recommender system". Both systems can be considered as a recommender system for recommendation algorithms. A macro recommender system recommends the best performing recommendation algorithm to an organization that wants to build a recommender system. This way, an organization does not need to test many algorithms over long periods to find the best one for their particular platform. A micro recommender system recommends the best performing recommendation algorithm for each individual recommendation request. This proposal is based on the premise that there is no single-best algorithm for all users, items, and contexts. For instance, a micro recommender system might recommend one algorithm when recommendations for an elderly male user in the evening should be created. When recommendations for a young female user in the morning should be given, the micro recommender system might recommend a different algorithm.
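A micro recommender system of the kind proposed here could, as a minimal hypothetical sketch, dispatch each individual request to an algorithm based on contextual features (the thresholds, contexts, and algorithm names below are invented for illustration):

```python
def choose_algorithm(user_age, time_of_day):
    """Toy context-aware dispatcher: picks a recommendation algorithm
    per request. Thresholds and algorithm names are illustrative."""
    if user_age >= 60 and time_of_day == "evening":
        return "item_knn"               # e.g. a neighborhood method
    if user_age < 30 and time_of_day == "morning":
        return "matrix_factorization"   # e.g. a latent-factor method
    return "popularity_baseline"        # fallback for other contexts

# Different contexts route to different algorithms:
evening_choice = choose_algorithm(70, "evening")
morning_choice = choose_algorithm(25, "morning")
```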
Internet of Things (IoT) infrastructure within the physical library environment is the basis for an integrative, hybrid approach to digital resource recommenders. The IoT infrastructure provides mobile, dynamic wayfinding support for items in the collection, including features for location-based recommendations. The evaluation and analysis herein clarify the nature of users' requests for recommendations based on their location, and describe the subject areas of the library for which users request recommendations. The results indicate that users of IoT-based recommendations are interested in a broad distribution of subjects, with a short head concentrated in this collection's American and English Literature holdings. A long-tail finding showed a diversity of topics recommended to users in the library book stacks with IoT-powered recommendations.
The goal of the NER task is to classify proper nouns in a text into classes such as person, location, and organization. This is an important preprocessing step in many NLP tasks such as question answering and summarization. Although much research has been conducted in this area for English, and state-of-the-art NER systems have reached performance above 90 percent in terms of F1 measure, there are very few studies of this task for Persian. One important cause may be the lack of a standard Persian NER dataset for training and testing NER systems. In this research we create a standard, sufficiently large tagged Persian NER dataset, which will be distributed free of charge for research purposes. To construct such a dataset, we studied standard NER datasets built for English and found that almost all of them are constructed from news text. We therefore collected documents from ten news websites. Then, to provide annotators with guidelines for tagging these documents, after studying the guidelines used to construct the CoNLL and MUC standard English datasets, we devised our own guidelines taking Persian linguistic rules into account.
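CoNLL-style NER datasets of the kind referenced above are typically distributed as token/tag pairs in a BIO scheme; a hypothetical English-glossed example (the sentence and tags are illustrative, not from the Persian dataset):

```python
# Hypothetical BIO-tagged sentence (CoNLL-style token/tag pairs);
# the PER/LOC/ORG tag inventory mirrors the classes mentioned above.
sentence = [
    ("Bill", "B-PER"), ("Gates", "I-PER"),
    ("works", "O"), ("at", "O"),
    ("Microsoft", "B-ORG"), (".", "O"),
]

def extract_entities(tagged):
    """Group consecutive B-/I- tokens into (text, type) entities."""
    entities, current, etype = [], [], None
    for tok, tag in tagged:
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:  # "O" tag closes any open entity
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

ents = extract_entities(sentence)
# ents == [("Bill Gates", "PER"), ("Microsoft", "ORG")]
```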
In order to answer natural language questions over knowledge graphs, most processing pipelines involve entity and relation linking. Traditionally, entity linking and relation linking have been performed either as dependent sequential tasks or as independent parallel tasks. In this paper, we propose a framework called "EARL", which performs entity linking and relation linking as a single joint task. EARL uses a graph-connection-based solution to the problem. We model the linking task as an instance of the Generalised Travelling Salesman Problem (GTSP) and use approximate GTSP algorithms. We later develop a variant of EARL that uses a pairwise graph-distance-based solution to the problem. The system determines the best semantic connection between all keywords of the question by referring to a knowledge graph, exploiting the "connection density" between entity candidates and relation candidates. The "connection density" based solution performs on par with the approximate GTSP solution. We have empirically evaluated the framework on a dataset with 5000 questions. Our system surpasses the state of the art on the entity linking task, reporting an accuracy of 0.65 versus 0.40 for the next best entity linker.
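The "connection density" intuition described above (a toy stand-in, not EARL's actual implementation) can be sketched as choosing one candidate per question keyword so that the chosen nodes lie closest together in the knowledge graph:

```python
from itertools import product

def best_combination(candidate_lists, distance):
    """Pick one candidate per keyword so that the sum of pairwise
    graph distances is minimal (toy proxy for connection density)."""
    best, best_cost = None, float("inf")
    for combo in product(*candidate_lists):
        cost = sum(distance(a, b)
                   for i, a in enumerate(combo)
                   for b in combo[i + 1:])
        if cost < best_cost:
            best, best_cost = combo, cost
    return best

# Hypothetical graph distances between candidate nodes.
D = {frozenset(p): d for p, d in [
    (("BillGates", "founderOf"), 1),
    (("BillGates", "bornIn"), 2),
    (("GatesFoundation", "founderOf"), 3),
    (("GatesFoundation", "bornIn"), 4),
]}

combo = best_combination(
    [["BillGates", "GatesFoundation"], ["founderOf", "bornIn"]],
    lambda a, b: D[frozenset((a, b))],
)
# combo == ("BillGates", "founderOf")
```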
This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing ~0.25M images, ~0.76M questions, and ~10M answers (www.visualqa.org), and discuss the information it provides. Numerous baselines and methods for VQA are provided and compared with human performance. Our VQA demo is available on CloudCV (http://cloudcv.org/vqa).
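The automatic evaluation mentioned above relies on the ten human answers per question: the consensus accuracy credits a predicted answer by min(#matching humans / 3, 1), so an answer given by at least three annotators scores full credit. A minimal sketch (the example answers are illustrative):

```python
def vqa_accuracy(predicted, human_answers):
    """Open-ended VQA consensus accuracy: a predicted answer gets full
    credit when at least 3 of the human annotators gave it."""
    matches = sum(1 for a in human_answers if a == predicted)
    return min(matches / 3.0, 1.0)

# Hypothetical set of 10 human answers for one question.
humans = ["1955"] * 2 + ["october 1955"] * 8
acc_minority = vqa_accuracy("1955", humans)          # 2 matches -> 2/3
acc_majority = vqa_accuracy("october 1955", humans)  # 8 matches -> 1.0
```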