January 27, 2020

2801 words 14 mins read

Paper Group ANR 1316

PHT-bot: Deep-Learning based system for automatic risk stratification of COPD patients based upon signs of Pulmonary Hypertension

Title PHT-bot: Deep-Learning based system for automatic risk stratification of COPD patients based upon signs of Pulmonary Hypertension
Authors David Chettrit, Orna Bregman Amitai, Itamar Tamir, Amir Bar, Eldad Elnekave
Abstract Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of morbidity and mortality worldwide. Identifying those at highest risk of deterioration would allow more effective distribution of preventative and surveillance resources. Secondary pulmonary hypertension is a manifestation of advanced COPD, which can be reliably diagnosed by the ratio of the main Pulmonary Artery (PA) diameter to the Ascending Aorta (Ao) diameter. In effect, a PA to Ao diameter ratio greater than 1 has been demonstrated to be a reliable marker of increased pulmonary arterial pressure. Although clinically valuable and readily visualized, the manual assessment of the PA and Ao diameters is time-consuming and under-reported. The present study describes a non-invasive method to measure the diameters of both the Ao and the PA from contrast-enhanced chest Computed Tomography (CT). The solution applies deep learning techniques to select the correct axial slice to measure and to segment both arteries. The system achieves test Pearson correlation coefficient scores of 93% for the Ao and 92% for the PA. To the best of our knowledge, it is the first such fully automated solution.
Tasks Computed Tomography (CT)
Published 2019-05-28
URL https://arxiv.org/abs/1905.11773v1
PDF https://arxiv.org/pdf/1905.11773v1.pdf
PWC https://paperswithcode.com/paper/pht-bot-deep-learning-based-system-for
Repo
Framework
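
As a rough illustration of the clinical rule in the abstract above, the sketch below derives a PA:Ao diameter ratio from two binary segmentation masks on a selected axial slice and flags ratios greater than 1. The equivalent-diameter computation and the function names are illustrative assumptions, not the paper's measurement procedure.

```python
import numpy as np

def equivalent_diameter_mm(mask: np.ndarray, pixel_spacing_mm: float) -> float:
    # Diameter of a circle with the same area as the segmented vessel cross-section.
    area_mm2 = mask.sum() * pixel_spacing_mm ** 2
    return 2.0 * np.sqrt(area_mm2 / np.pi)

def pa_ao_risk_flag(pa_mask: np.ndarray, ao_mask: np.ndarray, pixel_spacing_mm: float) -> dict:
    # PA:Ao ratio > 1 as a marker of increased pulmonary arterial pressure (per the abstract).
    d_pa = equivalent_diameter_mm(pa_mask, pixel_spacing_mm)
    d_ao = equivalent_diameter_mm(ao_mask, pixel_spacing_mm)
    ratio = d_pa / d_ao
    return {"pa_mm": d_pa, "ao_mm": d_ao, "ratio": ratio, "at_risk": bool(ratio > 1.0)}
```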

Attention-based Supply-Demand Prediction for Autonomous Vehicles

Title Attention-based Supply-Demand Prediction for Autonomous Vehicles
Authors Zikai Zhang, Yidong Li, Hairong Dong, Yizhe You, Fengping Zhao
Abstract As one of the important functions of the intelligent transportation system (ITS), supply-demand prediction for autonomous vehicles provides a decision basis for its control. In this paper, we present two prediction models (the ARLP model and the Advanced ARLP model) for two system settings: one in which only the current day's historical data is available, and one in which several days' historical data are available. The two models jointly consider spatial, temporal, and semantic relations. Spatial dependency is captured with a residual network and dimension reduction. Short-term temporal dependency is captured with an LSTM. Long-term temporal dependency and temporal shifting are captured with an LSTM and an attention mechanism. Semantic dependency is captured with a multi-attention mechanism and an autocorrelation coefficient method. Extensive experiments show that our frameworks provide more accurate and stable prediction results than existing methods.
Tasks Autonomous Vehicles, Dimensionality Reduction
Published 2019-05-27
URL https://arxiv.org/abs/1905.10983v1
PDF https://arxiv.org/pdf/1905.10983v1.pdf
PWC https://paperswithcode.com/paper/attention-based-supply-demand-prediction-for
Repo
Framework
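
The abstract combines an LSTM with an attention mechanism to capture temporal dependency and temporal shifting. Below is a minimal, generic sketch of that ingredient in PyTorch; the class name and layer sizes are illustrative assumptions, and this is not the ARLP architecture itself.

```python
import torch
import torch.nn as nn

class AttnLSTMForecaster(nn.Module):
    # Toy LSTM + additive attention over past time steps (not the paper's full model).
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)          # attention score per time step
        self.head = nn.Linear(hidden, 1)           # supply/demand prediction

    def forward(self, x):                          # x: (batch, time, n_features)
        h, _ = self.lstm(x)                        # (batch, time, hidden)
        w = torch.softmax(self.score(h), dim=1)    # attention weights over time
        context = (w * h).sum(dim=1)               # weighted sum of hidden states
        return self.head(context).squeeze(-1)
```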

A Stochastic Variance Reduced Nesterov’s Accelerated Quasi-Newton Method

Title A Stochastic Variance Reduced Nesterov’s Accelerated Quasi-Newton Method
Authors Sota Yasuda, Shahrzad Mahboubi, S. Indrapriyadarsini, Hiroshi Ninomiya, Hideki Asai
Abstract Recently, algorithms incorporating second-order curvature information have become popular for training neural networks. The Nesterov's Accelerated Quasi-Newton (NAQ) method has been shown to effectively accelerate the BFGS quasi-Newton method by incorporating a momentum term and Nesterov's accelerated gradient vector. A stochastic version of the NAQ method was proposed for training on large-scale problems; however, it incurs high stochastic variance noise. This paper proposes a stochastic variance reduced Nesterov's Accelerated Quasi-Newton method in full (SVR-NAQ) and limited (SVR-LNAQ) memory forms. The performance of the proposed method is evaluated in TensorFlow on four benchmark problems: two regression and two classification problems. The results show improved performance compared to conventional methods.
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.07939v1
PDF https://arxiv.org/pdf/1910.07939v1.pdf
PWC https://paperswithcode.com/paper/a-stochastic-variance-reduced-nesterovs
Repo
Framework
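
The key addition over stochastic NAQ is variance reduction. A minimal sketch of an SVRG-style variance-reduced gradient loop is shown below, with the momentum term and quasi-Newton scaling of SVR-NAQ omitted; `grad_i` and `full_grad` are user-supplied callables and the hyperparameters are placeholders.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n_samples, lr=0.05, epochs=10, seed=0):
    # SVRG outer/inner loop: the variance-reduction ingredient only.
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)                          # full gradient at the snapshot
        for _ in range(n_samples):
            i = rng.integers(n_samples)
            g = grad_i(w, i) - grad_i(w_snap, i) + mu   # variance-reduced gradient
            w = w - lr * g
    return w
```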

Anti-efficient encoding in emergent communication

Title Anti-efficient encoding in emergent communication
Authors Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, Marco Baroni
Abstract Despite renewed interest in emergent language simulations with neural networks, little is known about the basic properties of the induced code and how they compare to human language. One fundamental characteristic of the latter, known as Zipf’s Law of Abbreviation (ZLA), is that more frequent words are efficiently associated with shorter strings. We study whether the same pattern emerges when two neural networks, a “speaker” and a “listener”, are trained to play a signaling game. Surprisingly, we find that networks develop an anti-efficient encoding scheme, in which the most frequent inputs are associated with the longest messages, and messages in general are skewed towards the maximum length threshold. This anti-efficient code appears easier to discriminate for the listener, and, unlike in human communication, the speaker does not impose a contrasting least-effort pressure towards brevity. Indeed, when the cost function includes a penalty for longer messages, the resulting message distribution starts respecting ZLA. Our analysis stresses the importance of studying the basic features of emergent communication in a highly controlled setup, to ensure that the latter does not stray too far from human language. Moreover, we present a concrete illustration of how different functional pressures can lead to successful communication codes that lack basic properties of human language, thus highlighting the role such pressures play in the latter.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12561v4
PDF https://arxiv.org/pdf/1905.12561v4.pdf
PWC https://paperswithcode.com/paper/anti-efficient-encoding-in-emergent
Repo
Framework
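
A quick way to test the ZLA pattern described above on a set of emergent messages is to correlate message frequency with message length: efficient (ZLA-like) codes show a negative correlation, while the anti-efficient codes reported in the paper show the opposite sign. This is an illustrative check, not the paper's analysis script.

```python
from collections import Counter
from scipy.stats import spearmanr

def frequency_length_correlation(messages):
    # Spearman correlation between how often a message is used and how long it is.
    # Negative correlation = ZLA-like (efficient); positive = anti-efficient.
    freq = Counter(messages)
    counts = [freq[m] for m in freq]
    lengths = [len(m) for m in freq]
    return spearmanr(counts, lengths)
```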

Relaxed Softmax for learning from Positive and Unlabeled data

Title Relaxed Softmax for learning from Positive and Unlabeled data
Authors Ugo Tanielian, Flavian Vasile
Abstract In recent years, the softmax model and its fast approximations have become the de facto loss functions for deep neural networks when dealing with multi-class prediction. This loss has been extended to language modeling and recommendation, two fields that fall into the framework of learning from Positive and Unlabeled data. In this paper, we stress the different drawbacks of the current family of softmax losses and sampling schemes when applied in a Positive and Unlabeled learning setup. We propose both a Relaxed Softmax loss (RS) and a new negative sampling scheme based on a Boltzmann formulation. We show that the new training objective is better suited to the tasks of density estimation, item similarity, and next-event prediction, driving performance uplifts over the classical softmax on textual and recommendation datasets.
Tasks Density Estimation, Language Modelling
Published 2019-09-17
URL https://arxiv.org/abs/1909.08079v1
PDF https://arxiv.org/pdf/1909.08079v1.pdf
PWC https://paperswithcode.com/paper/relaxed-softmax-for-learning-from-positive
Repo
Framework
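
To make the sampling idea concrete, the sketch below draws negatives from a Boltzmann distribution over model scores and computes a sampled-softmax loss; the exact Relaxed Softmax loss and the paper's sampling scheme differ in the details, and the temperature and function names are assumptions.

```python
import numpy as np

def boltzmann_sampled_softmax_loss(scores, pos_idx, n_neg=10, temperature=1.0, rng=None):
    # Draw negatives with probability proportional to exp(score / temperature),
    # then compute a sampled softmax loss over the positive and the drawn negatives.
    rng = rng or np.random.default_rng(0)
    probs = np.exp((scores - scores.max()) / temperature)
    probs[pos_idx] = 0.0                          # never sample the positive as a negative
    probs /= probs.sum()
    negs = rng.choice(len(scores), size=n_neg, replace=False, p=probs)
    logits = np.concatenate(([scores[pos_idx]], scores[negs]))
    logits -= logits.max()                        # numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))   # NLL of the positive item
```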

Hierarchical Taxonomy-Aware and Attentional Graph Capsule RCNNs for Large-Scale Multi-Label Text Classification

Title Hierarchical Taxonomy-Aware and Attentional Graph Capsule RCNNs for Large-Scale Multi-Label Text Classification
Authors Hao Peng, Jianxin Li, Qiran Gong, Senzhang Wang, Lifang He, Bo Li, Lihong Wang, Philip S. Yu
Abstract CNNs, RNNs, GCNs, and CapsNets have shown significant insights in representation learning and are widely used in various text mining tasks such as large-scale multi-label text classification. However, most existing deep models for multi-label text classification consider either the non-consecutive and long-distance semantics or the sequential semantics, and how to consider both coherently is less studied. In addition, most existing methods treat output labels as independent, ignoring the hierarchical relations among them and thereby losing useful semantic information. In this paper, we propose a novel hierarchical taxonomy-aware and attentional graph capsule recurrent CNNs framework for large-scale multi-label text classification. Specifically, we first propose to model each document as a word order preserving graph-of-words and normalize it into a corresponding words-matrix representation that preserves the non-consecutive, long-distance, and local sequential semantics. The words-matrix is then input to the proposed attentional graph capsule recurrent CNNs to learn the semantic features more effectively. To leverage the hierarchical relations among the class labels, we propose a hierarchical taxonomy embedding method to learn their representations, and define a novel weighted margin loss by incorporating the label representation similarity. Extensive evaluations on three datasets show that our model significantly improves the performance of large-scale multi-label text classification compared with state-of-the-art approaches.
Tasks Multi-Label Text Classification, Representation Learning, Text Classification
Published 2019-06-09
URL https://arxiv.org/abs/1906.04898v1
PDF https://arxiv.org/pdf/1906.04898v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-taxonomy-aware-and-attentional
Repo
Framework
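
The abstract's first step is to model a document as a word order preserving graph-of-words. One simple way to realize such a graph is a sliding-window co-occurrence construction, sketched below; the window size and edge weighting are illustrative assumptions, not the paper's exact normalization into a words-matrix.

```python
from collections import defaultdict

def graph_of_words(tokens, window=3):
    # Directed edges from a word to the words that follow it within a sliding
    # window, weighted by co-occurrence counts, so word order is preserved.
    edges = defaultdict(int)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            edges[(w, tokens[j])] += 1
    return dict(edges)

# Example: graph_of_words("the model reads the document as a graph".split())
```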

Modernizing Historical Documents: a User Study

Title Modernizing Historical Documents: a User Study
Authors Miguel Domingo, Francisco Casacuberta
Abstract Accessibility to historical documents is mostly limited to scholars. This is due to the language barrier inherent in human language and the linguistic properties of these documents. Given a historical document, modernization aims to generate a new version of it written in the modern form of the document’s language. Its goal is to tackle the language barrier, decreasing comprehension difficulty and making historical documents accessible to a broader audience. In this work, we propose a new neural machine translation approach that profits from modern documents to enrich its systems. We test this approach with both automatic and human evaluation, and conduct a user study. Results show that modernization successfully reaches its goal, although there is still room for improvement.
Tasks Machine Translation
Published 2019-07-01
URL https://arxiv.org/abs/1907.00659v2
PDF https://arxiv.org/pdf/1907.00659v2.pdf
PWC https://paperswithcode.com/paper/modernizing-historical-documents-a-user-study
Repo
Framework

Machine Translation for Machines: the Sentiment Classification Use Case

Title Machine Translation for Machines: the Sentiment Classification Use Case
Authors Amirhossein Tebbifakhr, Luisa Bentivogli, Matteo Negri, Marco Turchi
Abstract We propose a neural machine translation (NMT) approach that, instead of pursuing adequacy and fluency (“human-oriented” quality criteria), aims to generate translations that are best suited as input to a natural language processing component designed for a specific downstream task (a “machine-oriented” criterion). Towards this objective, we present a reinforcement learning technique based on a new candidate sampling strategy, which exploits the results obtained on the downstream task as weak feedback. Experiments in sentiment classification of Twitter data in German and Italian show that feeding an English classifier with machine-oriented translations significantly improves its performance. Classification results outperform those obtained with translations produced by general-purpose NMT models as well as by an approach based on reinforcement learning. Moreover, our results on both languages approximate the classification accuracy computed on gold standard English tweets.
Tasks Machine Translation, Sentiment Analysis
Published 2019-10-01
URL https://arxiv.org/abs/1910.00478v1
PDF https://arxiv.org/pdf/1910.00478v1.pdf
PWC https://paperswithcode.com/paper/machine-translation-for-machines-the
Repo
Framework
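
The reinforcement learning idea in the abstract, sampling candidate translations and using the downstream classifier's result as weak feedback, can be sketched as a REINFORCE-style loss. `nmt_model.sample` and `classifier_reward` below are hypothetical callables and the mean baseline is an assumption; this is not the paper's exact candidate sampling strategy.

```python
import torch

def machine_oriented_rl_loss(nmt_model, classifier_reward, src, n_samples=5):
    # Sample candidate translations, score each with the downstream classifier
    # (weak feedback), and weight sequence log-probabilities by the
    # baseline-subtracted reward. `nmt_model.sample` is assumed to return
    # (tokens, log_probability_tensor) for one sampled translation.
    samples = [nmt_model.sample(src) for _ in range(n_samples)]
    rewards = torch.tensor([classifier_reward(tokens) for tokens, _ in samples])
    log_probs = torch.stack([log_p for _, log_p in samples])
    baseline = rewards.mean()
    return -((rewards - baseline) * log_probs).mean()
```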

Speculative Beam Search for Simultaneous Translation

Title Speculative Beam Search for Simultaneous Translation
Authors Renjie Zheng, Mingbo Ma, Baigong Zheng, Liang Huang
Abstract Beam search is universally used in full-sentence translation, but its application to simultaneous translation, where output words are committed on the fly, remains non-trivial. In particular, the recently proposed wait-k policy (Ma et al., 2019a) is a simple and effective method that (after an initial wait) commits one output word on receiving each input word, making beam search seemingly impossible. To address this challenge, we propose a speculative beam search algorithm that hallucinates several steps into the future in order to reach a more accurate decision, implicitly benefiting from a target language model. This makes beam search applicable for the first time to the generation of a single word in each step. Experiments over diverse language pairs show large improvements over previous work.
Tasks Language Modelling
Published 2019-09-12
URL https://arxiv.org/abs/1909.05421v1
PDF https://arxiv.org/pdf/1909.05421v1.pdf
PWC https://paperswithcode.com/paper/speculative-beam-search-for-simultaneous
Repo
Framework
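
To make the core idea concrete: at each wait-k step, search a few speculative future words but commit only the first one. The helper `model.beam_search` below is hypothetical (assumed to return full-hypothesis token lists with scores), so this is a sketch of the control flow rather than the paper's algorithm.

```python
def speculative_commit(model, prefix, visible_source, beam_size=5, spec_steps=3):
    # Run beam search for 1 + spec_steps new words beyond the committed prefix,
    # then commit only the first new word of the best-scoring hypothesis.
    # `model.beam_search` is a hypothetical helper returning (tokens, score)
    # pairs where `tokens` includes the prefix.
    hyps = model.beam_search(visible_source, prefix,
                             max_new_tokens=1 + spec_steps, beam_size=beam_size)
    best_tokens, _ = max(hyps, key=lambda h: h[1])
    return best_tokens[len(prefix)]
```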

Visualization tools for parameter selection in cluster analysis

Title Visualization tools for parameter selection in cluster analysis
Authors Alexander Rolle, Luis Scoccola
Abstract We propose an algorithm, HPREF (Hierarchical Partitioning by Repeated Features), that produces a hierarchical partition of a set of clusterings of a fixed dataset, such as the set of clusterings produced by running a clustering algorithm with a range of parameters. This gives geometric structure to such sets of clusterings, and can be used to visualize the set of results one obtains by running a clustering algorithm with a range of parameters.
Tasks
Published 2019-02-04
URL https://arxiv.org/abs/1902.01436v3
PDF https://arxiv.org/pdf/1902.01436v3.pdf
PWC https://paperswithcode.com/paper/spaces-of-clusterings
Repo
Framework

Multi-objective multi-generation Gaussian process optimizer for design optimization

Title Multi-objective multi-generation Gaussian process optimizer for design optimization
Authors Xiaobiao Huang
Abstract We present a multi-objective optimization algorithm that uses Gaussian process (GP) regression-based models to generate or select trial solutions in a multi-generation iterative procedure. In each generation, a surrogate model is constructed for each objective function from the sample data. The models are used to evaluate solutions and to select those with high potential before they are evaluated on the actual system. Since the trial solutions selected by the GP models tend to perform better than those produced by methods that rely only on random operations, the new algorithm explores the parameter space much more efficiently. Simulations on multiple test cases show that the new algorithm has a substantially higher convergence speed than the NSGA-II and PSO algorithms.
Tasks
Published 2019-06-29
URL https://arxiv.org/abs/1907.00250v1
PDF https://arxiv.org/pdf/1907.00250v1.pdf
PWC https://paperswithcode.com/paper/multi-objective-multi-generation-gaussian
Repo
Framework
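
A minimal sketch of the surrogate-based selection step described above: fit one GP regressor per objective on already-evaluated points, predict a candidate pool, and keep the predicted non-dominated candidates (minimization assumed). scikit-learn is used for illustration; the paper's generation and selection details differ.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def select_trial_solutions(X_eval, Y_eval, X_cand, n_select=10):
    # X_eval: evaluated points, Y_eval: their objective values (rows x objectives),
    # X_cand: candidate pool. Returns indices of predicted non-dominated candidates.
    preds = np.column_stack([
        GaussianProcessRegressor(normalize_y=True).fit(X_eval, Y_eval[:, j]).predict(X_cand)
        for j in range(Y_eval.shape[1])
    ])
    keep = []
    for i, p in enumerate(preds):
        dominated = np.any(np.all(preds <= p, axis=1) & np.any(preds < p, axis=1))
        if not dominated:
            keep.append(i)
    return np.array(keep[:n_select])
```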

Variational Predictive Information Bottleneck

Title Variational Predictive Information Bottleneck
Authors Alexander A. Alemi
Abstract In classic papers, Zellner demonstrated that Bayesian inference could be derived as the solution to an information theoretic functional. Below we derive a generalized form of this functional as a variational lower bound of a predictive information bottleneck objective. This generalized functional encompasses most modern inference procedures and suggests novel ones.
Tasks Bayesian Inference
Published 2019-10-23
URL https://arxiv.org/abs/1910.10831v1
PDF https://arxiv.org/pdf/1910.10831v1.pdf
PWC https://paperswithcode.com/paper/variational-predictive-information-bottleneck
Repo
Framework
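
For background, the information bottleneck objective that the abstract generalizes can be written in its standard Lagrangian form (an assumption of this note, not the paper's exact generalized functional) as:

```latex
% Standard information bottleneck: compress X into Z while keeping Z predictive of Y.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```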

Physical Adversarial Textures that Fool Visual Object Tracking

Title Physical Adversarial Textures that Fool Visual Object Tracking
Authors Rey Reza Wiyatno, Anqi Xu
Abstract We present a system for generating inconspicuous-looking textures that, when displayed in the physical world as digital or printed posters, cause visual object tracking systems to become confused. For instance, as a target being tracked by a robot’s camera moves in front of such a poster, our generated texture makes the tracker lock onto it and allows the target to evade. This work aims to fool seldom-targeted regression tasks, and in particular compares diverse optimization strategies: non-targeted, targeted, and a new family of guided adversarial losses. While we use the Expectation Over Transformation (EOT) algorithm to generate physical adversaries that fool tracking models when imaged under diverse conditions, we compare the impacts of different conditioning variables, including viewpoint, lighting, and appearances, to find practical attack setups with high resulting adversarial strength and convergence speed. We further show that textures optimized solely using simulated scenes can confuse real-world tracking systems.
Tasks Object Tracking, Visual Object Tracking
Published 2019-04-24
URL https://arxiv.org/abs/1904.11042v2
PDF https://arxiv.org/pdf/1904.11042v2.pdf
PWC https://paperswithcode.com/paper/physical-adversarial-textures-that-fool
Repo
Framework
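
The Expectation Over Transformation step mentioned above averages the adversarial loss over randomly sampled imaging conditions before each texture update. The sketch below assumes differentiable `render`, `tracker_loss`, and `sample_transform` callables, which are hypothetical stand-ins for the paper's pipeline.

```python
import torch

def eot_texture_step(texture, render, tracker_loss, optimizer, sample_transform, n_samples=8):
    # One Expectation-Over-Transformation update: average the adversarial loss over
    # randomly sampled viewpoint/lighting/appearance conditions, then step the texture.
    optimizer.zero_grad()
    loss = torch.stack([tracker_loss(render(texture, sample_transform()))
                        for _ in range(n_samples)]).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```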

Effect of Various Regularizers on Model Complexities of Neural Networks in Presence of Input Noise

Title Effect of Various Regularizers on Model Complexities of Neural Networks in Presence of Input Noise
Authors Mayank Sharma, Aayush Yadav, Sumit Soman, Jayadeva
Abstract Deep neural networks are over-parameterized, which means that the number of parameters is much larger than the number of samples used to train the network. Even in such a regime, deep architectures do not overfit. This phenomenon is an active area of research, and many theories have been proposed to explain this peculiar observation. These include Vapnik-Chervonenkis (VC) dimension bounds and Rademacher complexity bounds, which show that the capacity of the network is characterized by the norm of the weights rather than by the number of parameters. However, the effect of input noise on these measures for shallow and deep architectures has not been studied. In this paper, we analyze the effects of various regularization schemes on the complexity of a neural network, which we characterize with the loss, the $L_2$ norm of the weights, Rademacher complexity (Directly Approximately Regularizing Complexity, DARC1), and the VC dimension based Low Complexity Neural Network (LCNN), when subjected to varying degrees of Gaussian input noise. We show that $L_2$ regularization leads to a simpler hypothesis class and better generalization, followed by the DARC1 regularizer, for both shallow and deeper architectures. The Jacobian regularizer works well for shallow architectures with high levels of input noise. Spectral normalization attains the highest test set accuracies for both shallow and deeper architectures. We also show that Dropout alone does not perform well in the presence of input noise. Finally, we show that deeper architectures are robust to input noise, as opposed to their shallow counterparts.
Tasks
Published 2019-01-31
URL http://arxiv.org/abs/1901.11458v1
PDF http://arxiv.org/pdf/1901.11458v1.pdf
PWC https://paperswithcode.com/paper/effect-of-various-regularizers-on-model
Repo
Framework
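
A minimal example of the experimental ingredient studied above, Gaussian input noise combined with $L_2$ regularization, is sketched below in PyTorch; the noise level is a placeholder, and the optimizer's weight decay stands in for explicit $L_2$ regularization.

```python
import torch

def noisy_input_l2_step(model, loss_fn, optimizer, x, y, noise_std=0.1):
    # One training step with Gaussian input noise; L2 regularization is supplied by
    # the optimizer's weight_decay, e.g.
    # torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4).
    optimizer.zero_grad()
    loss = loss_fn(model(x + noise_std * torch.randn_like(x)), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```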

Characterization of the Handwriting Skills as a Biomarker for Parkinson Disease

Title Characterization of the Handwriting Skills as a Biomarker for Parkinson Disease
Authors R. Castrillon, A. Acien, J. R. Orozco-Arroyave, A. Morales, J. F. Vargas, R. Vera-Rodríguez, J. Fierrez, J. Ortega-Garcia, A. Villegas
Abstract In this paper we evaluate the suitability of handwriting patterns as potential biomarkers to model Parkinson disease (PD). Although the study of PD is attracting the interest of many researchers around the world, databases to evaluate handwriting patterns are scarce, and knowledge about patterns associated with PD is limited and biased to the existing datasets. This paper introduces a database with a total of 935 handwriting tasks collected from 55 PD patients and 94 healthy controls (45 young and 49 old). Three feature sets are extracted from the signals: neuromotor, kinematic, and nonlinear dynamic. Different classifiers are used to discriminate between PD and healthy subjects: support vector machines, k-nearest neighbors, and a multilayer perceptron. The proposed features and classifiers enable detection of PD with accuracies between 81% and 97%. Additionally, new insights are presented on the utility of the studied features for monitoring and detecting PD.
Tasks
Published 2019-03-19
URL http://arxiv.org/abs/1903.08226v1
PDF http://arxiv.org/pdf/1903.08226v1.pdf
PWC https://paperswithcode.com/paper/characterization-of-the-handwriting-skills-as
Repo
Framework
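
One of the classifiers mentioned above, an SVM on extracted handwriting features, can be evaluated with a standard cross-validation loop, sketched below with scikit-learn; the kernel, feature scaling, and fold count are assumptions, not the paper's exact protocol.

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_handwriting_classifier(X, y, folds=5):
    # X: subjects x handwriting features, y: 1 = PD, 0 = healthy control.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    scores = cross_val_score(clf, X, y, cv=folds)
    return scores.mean(), scores.std()
```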