January 26, 2020

3024 words 15 mins read

Paper Group ANR 1481

Deep Short Text Classification with Knowledge Powered Attention. Optimal Decision Trees for the Algorithm Selection Problem: Integer Programming Based Approaches. Making Learners (More) Monotone. Non-autoregressive Transformer by Position Learning. Zero-Shot Paraphrase Generation with Multilingual Language Models. 3D Cardiac Shape Prediction with D …

Deep Short Text Classification with Knowledge Powered Attention

Title Deep Short Text Classification with Knowledge Powered Attention
Authors Jindong Chen, Yizhou Hu, Jingping Liu, Yanghua Xiao, Haiyun Jiang
Abstract Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous since they do not have enough contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We utilize Concept towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects, and we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being who has the intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
Tasks Text Classification
Published 2019-02-21
URL http://arxiv.org/abs/1902.08050v1
PDF http://arxiv.org/pdf/1902.08050v1.pdf
PWC https://paperswithcode.com/paper/deep-short-text-classification-with-knowledge
Repo
Framework
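
To make the two attention mechanisms above concrete, here is a minimal numpy sketch of concept weighting in the spirit of C-ST and C-CS attention. The scoring functions, the convex combination of the two weights, and all dimensions are illustrative assumptions, not the paper's published equations.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d = 64                                 # embedding size (assumed)
n_concepts = 5

q = rng.normal(size=d)                 # short-text representation (e.g. from a BiLSTM encoder)
C = rng.normal(size=(n_concepts, d))   # concept embeddings retrieved from a knowledge base

# C-ST attention: score each concept against the short-text vector.
W1 = rng.normal(size=(d, 2 * d)) * 0.1
w1 = rng.normal(size=d) * 0.1
alpha = softmax(np.array([w1 @ np.tanh(W1 @ np.concatenate([c, q])) for c in C]))

# C-CS attention: score each concept against the whole concept set (its mean here).
s = C.mean(axis=0)
W2 = rng.normal(size=(d, 2 * d)) * 0.1
w2 = rng.normal(size=d) * 0.1
beta = softmax(np.array([w2 @ np.tanh(W2 @ np.concatenate([c, s])) for c in C]))

# Combine both weights (simple convex combination; the paper may use a learned gate).
lam = 0.5
a = softmax(lam * alpha + (1 - lam) * beta)

p = a @ C                              # knowledge-enhanced concept vector
features = np.concatenate([q, p])      # fed to the final classifier layer
print(features.shape)
```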

Optimal Decision Trees for the Algorithm Selection Problem: Integer Programming Based Approaches

Title Optimal Decision Trees for the Algorithm Selection Problem: Integer Programming Based Approaches
Authors Matheus Guedes Vilas Boas, Haroldo Gambini Santos, Luiz Henrique de Campos Merschmann, Greet Vanden Berghe
Abstract Even though it is well known that for most relevant computational problems different algorithms may perform better on different classes of problem instances, most researchers still focus on determining a single best algorithmic configuration based on aggregate results such as the average. In this paper, we propose Integer Programming based approaches to build decision trees for the Algorithm Selection Problem. These techniques automate three crucial decisions: (i) discerning the most important problem features to determine problem classes; (ii) grouping the problems into classes; and (iii) selecting the best algorithm configuration for each class. To evaluate this new approach, extensive computational experiments were executed using the linear programming algorithms implemented in the COIN-OR Branch & Cut solver across a comprehensive set of instances, including all MIPLIB benchmark instances. The results exceeded our expectations. While selecting the single best parameter setting across all instances decreased the total running time by 22%, our approach decreased the total running time by 40% on average across 10-fold cross-validation experiments. These results indicate that our method generalizes quite well and does not overfit.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.02211v3
PDF https://arxiv.org/pdf/1907.02211v3.pdf
PWC https://paperswithcode.com/paper/optimal-decision-trees-for-the-algorithm
Repo
Framework
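
The paper formulates these three decisions as an integer program; the hedged sketch below instead brute-forces a depth-1 selection tree over a synthetic runtime matrix, only to illustrate how choosing a feature split and a per-class configuration can beat the single best configuration. The data and the greedy search are stand-ins, not the paper's IP models.

```python
import numpy as np

rng = np.random.default_rng(1)
n_inst, n_feat, n_conf = 200, 4, 6

X = rng.normal(size=(n_inst, n_feat))                   # instance features
runtime = rng.gamma(2.0, 10.0, size=(n_inst, n_conf))   # runtime of each configuration on each instance

# Baseline: single best configuration across all instances.
single_best = runtime.sum(axis=0).argmin()
baseline = runtime[:, single_best].sum()

# Depth-1 "algorithm selection tree": pick one feature, one threshold, and the best
# configuration for each of the two induced instance classes.
best = (None, None, None, None, np.inf)
for f in range(n_feat):
    for t in np.quantile(X[:, f], np.linspace(0.1, 0.9, 9)):
        left = X[:, f] <= t
        if left.all() or (~left).all():
            continue
        c_left = runtime[left].sum(axis=0).argmin()
        c_right = runtime[~left].sum(axis=0).argmin()
        total = runtime[left, c_left].sum() + runtime[~left, c_right].sum()
        if total < best[-1]:
            best = (f, t, c_left, c_right, total)

f, t, c_left, c_right, total = best
print(f"single best config {single_best}: total runtime {baseline:.1f}")
print(f"tree (feature {f} <= {t:.2f} -> config {c_left}, else {c_right}): {total:.1f}")
```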

Making Learners (More) Monotone

Title Making Learners (More) Monotone
Authors Tom J. Viering, Alexander Mey, Marco Loog
Abstract Learning performance can show non-monotonic behavior. That is, more data does not necessarily lead to better models, even on average. We propose three algorithms that take a supervised learning model and make it perform more monotone. We prove consistency and monotonicity with high probability, and evaluate the algorithms on scenarios where non-monotone behavior occurs. Our proposed algorithm $\text{MT}_{\text{HT}}$ makes fewer than $1\%$ non-monotone decisions on MNIST while staying competitive in terms of error rate compared to several baselines.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/1911.11030v1
PDF https://arxiv.org/pdf/1911.11030v1.pdf
PWC https://paperswithcode.com/paper/making-learners-more-monotone
Repo
Framework
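
As a rough illustration of the wrapper idea (not the paper's exact $\text{MT}_{\text{HT}}$ procedure), the sketch below only replaces the deployed model when a one-sided sign test on held-out data says the newly trained candidate is significantly better; the base learner, validation split, and significance level are assumptions.

```python
from math import comb

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def sign_test_p(wins, losses):
    """One-sided exact sign test: P(at least `wins` successes | p = 0.5)."""
    n = wins + losses
    if n == 0:
        return 1.0
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_stream, X_val, y_stream, y_val = train_test_split(X, y, test_size=1000, random_state=0)

current, alpha = None, 0.05
for n in range(100, len(X_stream) + 1, 300):                # growing training set
    candidate = LogisticRegression(max_iter=1000).fit(X_stream[:n], y_stream[:n])
    if current is None:
        current = candidate
        continue
    cur_ok = current.predict(X_val) == y_val
    cand_ok = candidate.predict(X_val) == y_val
    wins = int((cand_ok & ~cur_ok).sum())                   # candidate right, current wrong
    losses = int((~cand_ok & cur_ok).sum())
    if sign_test_p(wins, losses) < alpha:                   # switch only if significantly better
        current = candidate

print("final validation error:", 1 - (current.predict(X_val) == y_val).mean())
```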

Non-autoregressive Transformer by Position Learning

Title Non-autoregressive Transformer by Position Learning
Authors Yu Bao, Hao Zhou, Jiangtao Feng, Mingxuan Wang, Shujian Huang, Jiajun Chen, Lei LI
Abstract Non-autoregressive models are promising on various text generation tasks. Previous work rarely considers explicitly modeling the positions of generated words; however, position modeling is an essential problem in non-autoregressive text generation. In this study, we propose PNAT, which incorporates positions as a latent variable into the text generative process. Experimental results show that PNAT achieves top results on machine translation and paraphrase generation tasks, outperforming several strong baselines.
Tasks Machine Translation, Paraphrase Generation, Text Generation
Published 2019-11-25
URL https://arxiv.org/abs/1911.10677v1
PDF https://arxiv.org/pdf/1911.10677v1.pdf
PWC https://paperswithcode.com/paper/non-autoregressive-transformer-by-position
Repo
Framework
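
The following toy sketch illustrates the decoding idea of treating positions as something predicted alongside tokens: every slot emits a token and a position score in parallel, and the output order comes from ranking the scores. It is a hand-wired illustration with random logits, not the trained PNAT model.

```python
import torch

vocab = ["<pad>", "the", "cat", "sat", "on", "mat"]
L = 5
torch.manual_seed(0)
token_logits = torch.randn(L, len(vocab))    # per-slot token distributions, produced in parallel
position_scores = torch.randn(L)             # per-slot position score (the latent position variable)

tokens = token_logits.argmax(dim=-1)         # choose a token for every slot at once
order = position_scores.argsort()            # order[k] = slot whose token goes to position k
sentence = [vocab[int(tokens[slot])] for slot in order.tolist()]
print(sentence)
```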

Zero-Shot Paraphrase Generation with Multilingual Language Models

Title Zero-Shot Paraphrase Generation with Multilingual Language Models
Authors Yinpeng Guo, Yi Liao, Xin Jiang, Qing Zhang, Yibo Zhang, Qun Liu
Abstract Leveraging multilingual parallel texts to automatically generate paraphrases has drawn much attention, as the size of high-quality paraphrase corpora is limited. Round-trip translation, also known as the pivoting method, is a typical approach to this end. However, we notice that the pivoting process involves multiple machine translation models and is likely to incur semantic drift during the two-step translations. In this paper, inspired by Transformer-based language models, we propose a simple and unified paraphrasing model, which is purely trained on multilingual parallel data and can conduct zero-shot paraphrase generation in one step. Compared with the pivoting approach, paraphrases generated by our model are more semantically similar to the input sentence. Moreover, since our model shares the same architecture as GPT (Radford et al., 2018), we are able to pre-train the model on a large-scale non-parallel corpus, which further improves the fluency of the output sentences. In addition, we introduce a denoising auto-encoder (DAE) mechanism to improve the diversity and robustness of the model. Experimental results show that our model surpasses the pivoting method in terms of relevance, diversity, fluency and efficiency.
Tasks Denoising, Machine Translation, Paraphrase Generation
Published 2019-11-09
URL https://arxiv.org/abs/1911.03597v1
PDF https://arxiv.org/pdf/1911.03597v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-paraphrase-generation-with
Repo
Framework
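
For contrast with the one-step model proposed above, here is a sketch of the round-trip (pivoting) baseline the paper compares against, using two public Helsinki-NLP translation checkpoints via the transformers pipeline API; the language pair and example sentence are arbitrary choices.

```python
# Round-trip (pivoting) paraphrase baseline: English -> French -> English.
from transformers import pipeline

en_fr = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
fr_en = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")

sentence = "The quick brown fox jumps over the lazy dog."
pivot = en_fr(sentence)[0]["translation_text"]       # first translation step
paraphrase = fr_en(pivot)[0]["translation_text"]     # second step back to English
print(paraphrase)
```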

3D Cardiac Shape Prediction with Deep Neural Networks: Simultaneous Use of Images and Patient Metadata

Title 3D Cardiac Shape Prediction with Deep Neural Networks: Simultaneous Use of Images and Patient Metadata
Authors Rahman Attar, Marco Pereanez, Christopher Bowles, Stefan K. Piechnik, Stefan Neubauer, Steffen E. Petersen, Alejandro F. Frangi
Abstract Large prospective epidemiological studies acquire cardiovascular magnetic resonance (CMR) images for pre-symptomatic populations and follow these over time. To support this approach, fully automatic large-scale 3D analysis is essential. In this work, we propose a novel deep neural network using both CMR images and patient metadata to directly predict cardiac shape parameters. The proposed method uses the promising ability of statistical shape models to simplify shape complexity and variability together with the advantages of convolutional neural networks for the extraction of solid visual features. To the best of our knowledge, this is the first work that uses such an approach for 3D cardiac shape prediction. We validated our proposed CMR analytics method against a reference cohort containing 500 3D shapes of the cardiac ventricles. Our results show broadly significant agreement with the reference shapes in terms of the estimated volume of the cardiac ventricles, myocardial mass, 3D Dice, and mean and Hausdorff distance.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01913v1
PDF https://arxiv.org/pdf/1907.01913v1.pdf
PWC https://paperswithcode.com/paper/3d-cardiac-shape-prediction-with-deep-neural
Repo
Framework
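
A minimal PyTorch sketch of the general idea of regressing statistical-shape-model coefficients from an image branch fused with a patient-metadata branch. The 2D backbone, layer sizes, number of metadata fields, and number of shape coefficients are all assumptions for illustration, not the network described in the paper.

```python
import torch
import torch.nn as nn

class ShapeRegressor(nn.Module):
    def __init__(self, n_metadata=4, n_shape_coeffs=50):
        super().__init__()
        self.image_branch = nn.Sequential(                       # visual feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.metadata_branch = nn.Sequential(nn.Linear(n_metadata, 16), nn.ReLU())
        self.head = nn.Linear(32 + 16, n_shape_coeffs)            # predicted shape-model coefficients

    def forward(self, image, metadata):
        fused = torch.cat([self.image_branch(image), self.metadata_branch(metadata)], dim=1)
        return self.head(fused)

model = ShapeRegressor()
coeffs = model(torch.randn(2, 1, 128, 128), torch.randn(2, 4))
print(coeffs.shape)  # torch.Size([2, 50])
```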

Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs

Title Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs
Authors Rob Clark, Hanna Silen, Tom Kenter, Ralph Leith
Abstract Text-to-speech systems are typically evaluated on single sentences. When long-form content, such as data consisting of full paragraphs or dialogues, is considered, evaluating sentences in isolation is not always appropriate, as the context in which the sentences are synthesized is missing. In this paper, we investigate three different ways of evaluating the naturalness of long-form text-to-speech synthesis. We compare the results obtained from evaluating sentences in isolation, evaluating whole paragraphs of speech, and presenting a selection of speech or text as context and evaluating the subsequent speech. We find that, even though these three evaluations are based upon the same material, the outcomes differ per setting, and moreover that these outcomes do not necessarily correlate with each other. We show that our findings are consistent between a single-speaker setting of read paragraphs and a two-speaker dialogue scenario. We conclude that to evaluate the quality of long-form speech, the traditional way of evaluating sentences in isolation does not suffice, and that multiple evaluations are required.
Tasks Speech Synthesis, Text-To-Speech Synthesis
Published 2019-09-09
URL https://arxiv.org/abs/1909.03965v1
PDF https://arxiv.org/pdf/1909.03965v1.pdf
PWC https://paperswithcode.com/paper/evaluating-long-form-text-to-speech-comparing
Repo
Framework

Hebbian-Descent

Title Hebbian-Descent
Authors Jan Melchior, Laurenz Wiskott
Abstract In this work we propose Hebbian-descent as a biologically plausible learning rule for hetero-associative as well as auto-associative learning in single-layer artificial neural networks. It can be used as a replacement for gradient descent as well as Hebbian learning, in particular in online learning, as it inherits their advantages while not suffering from their disadvantages. We discuss the drawbacks of Hebbian learning as having problems with correlated input data and not profiting from seeing training patterns several times. For gradient descent we identify the derivative of the activation function as problematic, especially in online learning. Hebbian-descent addresses these problems by getting rid of the activation function's derivative and by centering, i.e., keeping the neural activities mean-free, leading to a biologically plausible update rule that is provably convergent, does not suffer from the vanishing error term problem, can deal with correlated data, profits from seeing patterns several times, and enables successful online learning when centering is used. We discuss its relationship to Hebbian learning, contrastive learning, and gradient descent and show that in the case of a strictly positive derivative of the activation function Hebbian-descent leads to the same update rule as gradient descent but for a different loss function. In this case Hebbian-descent inherits the convergence properties of gradient descent, but we also show empirically that it converges when the derivative of the activation function is only non-negative, such as for the step function. Furthermore, in the case of the mean squared error loss, Hebbian-descent can be understood as the difference between two Hebb-learning steps, which in the case of an invertible and integrable activation function actually optimizes a generalized linear model. …
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10585v1
PDF https://arxiv.org/pdf/1905.10585v1.pdf
PWC https://paperswithcode.com/paper/hebbian-descent
Repo
Framework
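
The sketch below shows a Hebbian-descent-style update for a single-layer hetero-associative network: a delta-rule error term without the activation function's derivative, applied to centered (mean-free) inputs. The learning rate, pattern set, and running-mean centering scheme are assumptions made for this illustration, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, eta = 20, 5, 0.05

X = rng.normal(size=(30, n_in))                         # input patterns to associate
T = (rng.normal(size=(30, n_out)) > 0).astype(float)    # target patterns

W = rng.normal(scale=0.1, size=(n_out, n_in))
b = np.zeros(n_out)
mu = np.zeros(n_in)                                     # running mean used for centering

def step(a):                                            # activation whose derivative is zero a.e.
    return (a > 0).astype(float)

for epoch in range(100):                                # online learning, one pattern at a time
    for x, t in zip(X, T):
        mu = 0.99 * mu + 0.01 * x                       # keep input activities mean-free
        xc = x - mu
        y = step(W @ xc + b)
        err = y - t                                     # error term without the activation derivative
        W -= eta * np.outer(err, xc)                    # Hebbian-descent-style update
        b -= eta * err

print("mean absolute output error:", np.abs(step((X - mu) @ W.T + b) - T).mean())
```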

Distributionally Robust and Multi-Objective Nonnegative Matrix Factorization

Title Distributionally Robust and Multi-Objective Nonnegative Matrix Factorization
Authors Nicolas Gillis, Le Thi Khanh Hien, Valentin Leplat, Vincent Y. F. Tan
Abstract Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for analyzing nonnegative data. A key aspect of NMF is the choice of the objective function, which depends on the noise model (or statistics of the noise) assumed on the data. In many applications, the noise model is unknown and difficult to estimate. In this paper, we define a multi-objective NMF (MO-NMF) problem, where several objectives are combined within the same NMF model. We propose to use Lagrange duality to judiciously optimize for a set of weights to be used within the framework of the weighted-sum approach, that is, we minimize a single objective function which is a weighted sum of all the objective functions. We design a simple algorithm using multiplicative updates to minimize this weighted sum. We show how this can be used to find distributionally robust NMF (DR-NMF) solutions, that is, solutions that minimize the largest error among all objectives. We illustrate the effectiveness of this approach on synthetic, document and audio datasets. The results show that DR-NMF is robust to our incognizance of the noise model of the NMF problem.
Tasks Dimensionality Reduction
Published 2019-01-30
URL http://arxiv.org/abs/1901.10757v2
PDF http://arxiv.org/pdf/1901.10757v2.pdf
PWC https://paperswithcode.com/paper/distributionally-robust-and-multi-objective
Repo
Framework
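
As background for the multiplicative updates mentioned above, here is a sketch of the standard Lee-Seung multiplicative updates for a single Frobenius-norm NMF objective. The paper's contribution, combining several beta-divergence objectives with Lagrange-dual weights into one weighted-sum update, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 40, 60, 5
V = rng.random((m, r)) @ rng.random((r, n))      # nonnegative data with an exact rank-r factorization

W = rng.random((m, r))
H = rng.random((r, n))
eps = 1e-9                                       # avoids division by zero
for _ in range(500):
    H *= (W.T @ V) / (W.T @ W @ H + eps)         # multiplicative update for H
    W *= (V @ H.T) / (W @ H @ H.T + eps)         # multiplicative update for W

print("relative Frobenius error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```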

Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective

Title Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective
Authors Danielle Bragg, Oscar Koller, Mary Bellard, Larwan Berke, Patrick Boudrealt, Annelies Braffort, Naomi Caselli, Matt Huenerfauth, Hernisa Kacorri, Tessa Verhoef, Christian Vogler, Meredith Ringel Morris
Abstract Developing successful sign language recognition, generation, and translation systems requires expertise in a wide range of fields, including computer vision, computer graphics, natural language processing, human-computer interaction, linguistics, and Deaf culture. Despite the need for deep interdisciplinary knowledge, existing research occurs in separate disciplinary silos, and tackles separate portions of the sign language processing pipeline. This leads to three key questions: 1) What does an interdisciplinary view of the current landscape reveal? 2) What are the biggest challenges facing the field? and 3) What are the calls to action for people working in the field? To help answer these questions, we brought together a diverse group of experts for a two-day workshop. This paper presents the results of that interdisciplinary workshop, providing key background that is often overlooked by computer scientists, a review of the state-of-the-art, a set of pressing challenges, and a call to action for the research community.
Tasks Sign Language Recognition
Published 2019-08-22
URL https://arxiv.org/abs/1908.08597v1
PDF https://arxiv.org/pdf/1908.08597v1.pdf
PWC https://paperswithcode.com/paper/sign-language-recognition-generation-and
Repo
Framework

Topological Bayesian Optimization with Persistence Diagrams

Title Topological Bayesian Optimization with Persistence Diagrams
Authors Tatsuya Shiraishi, Tam Le, Hisashi Kashima, Makoto Yamada
Abstract Finding an optimal parameter of a black-box function is important for searching for stable material structures and optimal neural network structures, and Bayesian optimization algorithms are widely used for this purpose. However, most existing Bayesian optimization algorithms can only handle vector data and cannot handle complex structured data. In this paper, we propose topological Bayesian optimization, which can efficiently find an optimal solution from structured data using \emph{topological information}. More specifically, in order to apply Bayesian optimization to structured data, we extract useful topological information from a structure and measure the proper similarity between structures. To this end, we utilize persistent homology, which is a topological data analysis method that was recently applied in machine learning. Moreover, we propose a Bayesian optimization algorithm that can handle multiple types of topological information by using a linear combination of kernels for persistence diagrams. Through experiments, we show that topological information extracted by persistent homology contributes to a more efficient search for optimal structures compared to the random search baseline and the graph Bayesian optimization algorithm.
Tasks Topological Data Analysis
Published 2019-02-26
URL http://arxiv.org/abs/1902.09722v1
PDF http://arxiv.org/pdf/1902.09722v1.pdf
PWC https://paperswithcode.com/paper/topological-bayesian-optimization-with
Repo
Framework
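
The sketch below illustrates the surrogate-model side of the approach: Bayesian optimization over a candidate pool with a GP whose kernel is a linear combination of two base kernels, standing in for the combination of persistence-diagram kernels. Representing structures as plain feature vectors (rather than computing actual persistence diagrams) and the kernel choices are simplifying assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def rbf(A, B, gamma):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def combined_kernel(A, B, w=(0.7, 0.3)):
    # Linear combination of two base kernels (placeholder for persistence-diagram kernels).
    return w[0] * rbf(A, B, 0.5) + w[1] * rbf(A, B, 5.0)

def objective(x):                              # unknown black-box score of a "structure"
    return -((x - 0.3) ** 2).sum()

candidates = rng.random((200, 3))              # pool of candidate structures (feature vectors)
idx = list(rng.choice(200, size=5, replace=False))
for _ in range(20):
    X = candidates[idx]
    y = np.array([objective(x) for x in X])
    K = combined_kernel(X, X) + 1e-6 * np.eye(len(X))
    K_inv = np.linalg.inv(K)
    Ks = combined_kernel(candidates, X)
    mu = Ks @ K_inv @ y                        # GP posterior mean
    var = np.clip(1.0 - np.einsum("ij,jk,ik->i", Ks, K_inv, Ks), 1e-12, None)
    # Expected-improvement acquisition over not-yet-evaluated candidates.
    best = y.max()
    z = (mu - best) / np.sqrt(var)
    ei = (mu - best) * norm.cdf(z) + np.sqrt(var) * norm.pdf(z)
    ei[idx] = -np.inf
    idx.append(int(ei.argmax()))

print("best value found:", max(objective(candidates[i]) for i in idx))
```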

Fully Parallel Architecture for Semi-global Stereo Matching with Refined Rank Method

Title Fully Parallel Architecture for Semi-global Stereo Matching with Refined Rank Method
Authors Yiwu Yao, Yuhua Cheng
Abstract A fully parallel architecture at the disparity level for efficient semi-global matching (SGM) with a refined rank method is presented. The improved SGM algorithm is implemented with a non-parametric unified rank model, which is the combination of Rank filter/AD and Rank SAD. Rank SAD is a new formulation that introduces the constraints of local image structure into the rank method. As a result, the unified rank model with Rank SAD can make up for the defects of Rank filter/AD. Experimental results show both excellent subjective quality and objective performance of the refined SGM algorithm. The fully parallel construction for hardware implementation of SGM is architected with reasonable strategies at the disparity level. The parallelism of the data stream allows proper throughput for specific applications with acceptable maximum frequency. The results of RTL emulation and synthesis ensure that the proposed parallel architecture is suitable for VLSI implementation.
Tasks Stereo Matching, Stereo Matching Hand
Published 2019-05-07
URL https://arxiv.org/abs/1905.03716v1
PDF https://arxiv.org/pdf/1905.03716v1.pdf
PWC https://paperswithcode.com/paper/190503716
Repo
Framework
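
Two of the building blocks named above, a rank-transform matching cost and the per-direction SGM aggregation recurrence, are sketched below for a single left-to-right path on a toy image pair. The window size, penalties P1/P2, disparity range, and the toy data are illustrative; the refined Rank SAD cost and the parallel hardware mapping are not reproduced.

```python
import numpy as np

def rank_transform(img, w=2):
    """Rank of the centre pixel within its (2w+1)x(2w+1) window."""
    H, W = img.shape
    out = np.zeros_like(img, dtype=np.int32)
    for y in range(w, H - w):
        for x in range(w, W - w):
            patch = img[y - w:y + w + 1, x - w:x + w + 1]
            out[y, x] = int((patch < img[y, x]).sum())
    return out

def aggregate_left_to_right(cost, P1=1.0, P2=8.0):
    """SGM cost aggregation along the horizontal left-to-right path only."""
    H, W, D = cost.shape
    L = np.zeros_like(cost)
    L[:, 0] = cost[:, 0]
    for x in range(1, W):
        prev = L[:, x - 1]                                   # (H, D) costs at the previous pixel
        prev_min = prev.min(axis=1, keepdims=True)
        plus = np.pad(prev[:, :-1], ((0, 0), (1, 0)), constant_values=np.inf) + P1
        minus = np.pad(prev[:, 1:], ((0, 0), (0, 1)), constant_values=np.inf) + P1
        best_prev = np.minimum(np.minimum(prev, prev_min + P2), np.minimum(plus, minus))
        L[:, x] = cost[:, x] + best_prev - prev_min
    return L

rng = np.random.default_rng(0)
left = rng.random((32, 48))
right = np.roll(left, -3, axis=1)                            # toy pair: constant 3-pixel disparity
rl, rr = rank_transform(left), rank_transform(right)
cost = np.stack([np.abs(rl - np.roll(rr, d, axis=1)) for d in range(8)], axis=-1).astype(float)
disparity = aggregate_left_to_right(cost).argmin(axis=-1)
print("most common disparity:", np.bincount(disparity.ravel()).argmax())
```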

Survey on Evaluation Methods for Dialogue Systems

Title Survey on Evaluation Methods for Dialogue Systems
Authors Jan Deriu, Alvaro Rodrigo, Arantxa Otegi, Guillermo Echegoyen, Sophie Rosset, Eneko Agirre, Mark Cieliebak
Abstract In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the need for human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods regarding this class.
Tasks Question Answering, Task-Oriented Dialogue Systems
Published 2019-05-10
URL https://arxiv.org/abs/1905.04071v1
PDF https://arxiv.org/pdf/1905.04071v1.pdf
PWC https://paperswithcode.com/paper/survey-on-evaluation-methods-for-dialogue
Repo
Framework

NaMemo: Enhancing Lecturers’ Interpersonal Competence of Remembering Students’ Names

Title NaMemo: Enhancing Lecturers’ Interpersonal Competence of Remembering Students’ Names
Authors Guang Jiang, Mengzhen Shi, Ying Su, Pengcheng An, Yunlong Wang, Brian Y. Lim
Abstract Addressing students by their names helps a teacher start building rapport with students and thus facilitates their classroom participation. However, this basic yet effective skill has become rather challenging for university lecturers, who have to handle large groups (sometimes exceeding 100 students) in their daily teaching. To enhance lecturers' competence in delivering interpersonal interaction, we developed NaMemo, a real-time name-indicating system based on a dedicated face-recognition pipeline. This paper presents the system design, the pilot feasibility test, and our plan for the following study, which aims to evaluate NaMemo's impacts on learning and teaching, as well as to probe design implications including privacy considerations.
Tasks Face Recognition
Published 2019-11-21
URL https://arxiv.org/abs/1911.09279v3
PDF https://arxiv.org/pdf/1911.09279v3.pdf
PWC https://paperswithcode.com/paper/namemo-enhancing-lecturers-interpersonal
Repo
Framework
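
As a rough idea of the face-recognition lookup such a name-indicating system needs, here is a sketch using the open-source face_recognition library; the image file names, the roster, and the distance threshold are placeholders, and this is not the dedicated pipeline described in the paper.

```python
import face_recognition

# Hypothetical roster images; in practice these would come from the course enrolment system.
known = {
    "Alice": face_recognition.face_encodings(face_recognition.load_image_file("alice.jpg"))[0],
    "Bob": face_recognition.face_encodings(face_recognition.load_image_file("bob.jpg"))[0],
}

frame = face_recognition.load_image_file("classroom_frame.jpg")   # one captured classroom frame
for encoding in face_recognition.face_encodings(frame):
    distances = face_recognition.face_distance(list(known.values()), encoding)
    best = distances.argmin()
    name = list(known.keys())[best] if distances[best] < 0.6 else "unknown"
    print(name)
```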

SF-Net: Structured Feature Network for Continuous Sign Language Recognition

Title SF-Net: Structured Feature Network for Continuous Sign Language Recognition
Authors Zhaoyang Yang, Zhenmei Shi, Xiaoyong Shen, Yu-Wing Tai
Abstract Continuous sign language recognition (SLR) aims to translate a signing sequence into a sentence. It is very challenging, as sign language is rich in vocabulary and many signs involve similar gestures and motions. Moreover, it is weakly supervised, as the alignment of signing glosses is not available. In this paper, we propose the Structured Feature Network (SF-Net) to address these challenges by effectively learning multiple levels of semantic information in the data. The proposed SF-Net extracts features in a structured manner and gradually encodes information at the frame level, the gloss level and the sentence level into the feature representation. The proposed SF-Net can be trained end-to-end without the help of other models or pre-training. We tested the proposed SF-Net on two large-scale public SLR datasets collected from different continuous SLR scenarios. Results show that the proposed SF-Net clearly outperforms previous methods based on sequence-level supervision in terms of both accuracy and adaptability.
Tasks Sign Language Recognition
Published 2019-08-04
URL https://arxiv.org/abs/1908.01341v1
PDF https://arxiv.org/pdf/1908.01341v1.pdf
PWC https://paperswithcode.com/paper/sf-net-structured-feature-network-for
Repo
Framework
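
To give a flavour of frame-to-gloss-to-sentence feature encoding, here is a small PyTorch sketch of a hierarchical encoder with a CTC-style gloss head. The layer choices, pooling window, and classification head are assumptions, not the published SF-Net architecture.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    def __init__(self, feat_dim=512, n_glosses=100, window=8):
        super().__init__()
        self.frame_enc = nn.Conv1d(feat_dim, 256, kernel_size=3, padding=1)     # frame level
        self.gloss_pool = nn.AvgPool1d(kernel_size=window, stride=window // 2)  # gloss level
        self.sentence_enc = nn.LSTM(256, 256, batch_first=True, bidirectional=True)
        self.head = nn.Linear(512, n_glosses + 1)      # +1 for the CTC blank label

    def forward(self, frames):                         # frames: (batch, time, feat_dim)
        x = self.frame_enc(frames.transpose(1, 2))     # (batch, 256, time)
        x = self.gloss_pool(x).transpose(1, 2)         # (batch, time', 256)
        x, _ = self.sentence_enc(x)                    # (batch, time', 512), sentence level
        return self.head(x)                            # per-step gloss logits for CTC training

model = HierarchicalEncoder()
logits = model(torch.randn(2, 64, 512))
print(logits.shape)  # torch.Size([2, 15, 101])
```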