January 27, 2020


Paper Group ANR 1216

Select and Attend: Towards Controllable Content Selection in Text Generation. Class Specific or Shared? A Hybrid Dictionary Learning Network for Image Classification. Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script. The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes. …

Select and Attend: Towards Controllable Content Selection in Text Generation


Title	Select and Attend: Towards Controllable Content Selection in Text Generation
Authors	Xiaoyu Shen, Jun Suzuki, Kentaro Inui, Hui Su, Dietrich Klakow, Satoshi Sekine
Abstract	Many text generation tasks naturally contain two steps: content selection and surface realization. Current neural encoder-decoder models conflate both steps into a black-box architecture. As a result, the content to be described in the text cannot be explicitly controlled. This paper tackles this problem by decoupling content selection from the decoder. The decoupled content selection is human interpretable, whose value can be manually manipulated to control the content of generated text. The model can be trained end-to-end without human annotations by maximizing a lower bound of the marginal likelihood. We further propose an effective way to trade-off between performance and controllability with a single adjustable hyperparameter. In both data-to-text and headline generation tasks, our model achieves promising results, paving the way for controllable content selection in text generation.
Tasks	Text Generation
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04453v1
PDF	https://arxiv.org/pdf/1909.04453v1.pdf
PWC	https://paperswithcode.com/paper/select-and-attend-towards-controllable
Repo
Framework

Class Specific or Shared? A Hybrid Dictionary Learning Network for Image Classification


Title	Class Specific or Shared? A Hybrid Dictionary Learning Network for Image Classification
Authors	Shuai Shao, Yan-Jiang Wang, Bao-Di Liu, Rui Xu, Ye Li
Abstract	Dictionary learning methods can be split into two categories: i) class specific dictionary learning and ii) class shared dictionary learning. The difference between the two categories is how they use the discriminative information. With the first category, samples of different classes are mapped to different subspaces, which leads to some redundancy in the base vectors. For the second category, the samples in each specific class cannot be described well. Moreover, most class shared dictionary learning methods use the L0-norm regularization term as the sparse constraint. In this paper, we first propose a novel class shared dictionary learning method named label embedded dictionary learning (LEDL) by introducing the L1-norm sparse constraint to replace the conventional L0-norm regularization term in the LC-KSVD method. Then we propose a novel network named hybrid dictionary learning network (HDLN) to combine class specific dictionary learning with class shared dictionary learning to fully describe the features and boost classification performance. Extensive experimental results on six benchmark datasets illustrate that our methods are capable of achieving superior performance compared to several conventional classification algorithms.
Tasks	Dictionary Learning, Image Classification
Published	2019-04-17
URL	http://arxiv.org/abs/1904.08928v1
PDF	http://arxiv.org/pdf/1904.08928v1.pdf
PWC	https://paperswithcode.com/paper/class-specific-or-shared-a-hybrid-dictionary
Repo
Framework
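
The abstract above contrasts an L0-norm sparsity term with an L1-norm one. As a rough illustration of why the L1 form is convenient, the sketch below applies element-wise soft-thresholding, the standard proximal operator of the L1 norm. This is not taken from the paper: the abstract does not say which solver LEDL uses, and the names `soft_threshold`, `l0_norm`, `l1_norm` and the sample values are assumptions made for this example.

```python
def soft_threshold(values, lam):
    """Element-wise proximal operator of lam * ||x||_1 (illustrative only)."""
    out = []
    for v in values:
        if v > lam:
            out.append(v - lam)   # shrink positive entries towards zero
        elif v < -lam:
            out.append(v + lam)   # shrink negative entries towards zero
        else:
            out.append(0.0)       # small entries become exactly zero
    return out


def l0_norm(values):
    """Number of non-zero entries (the quantity an L0 constraint counts)."""
    return sum(1 for v in values if v != 0)


def l1_norm(values):
    """Sum of absolute values (the convex surrogate used by an L1 constraint)."""
    return sum(abs(v) for v in values)


# Hypothetical sparse code for one sample.
codes = [0.9, -0.05, 0.3, -1.2, 0.02]
shrunk = soft_threshold(codes, 0.1)
print(l0_norm(codes), l0_norm(shrunk))
print(l1_norm(codes), l1_norm(shrunk))
```

Entries with magnitude at or below the threshold are set to zero, so an L1 penalty can still yield sparse codes while keeping the objective convex in the codes.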

Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script


Title	Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script
Authors	Shiva Kumar H R, Ramakrishnan A G
Abstract	A Kannada OCR, named Lipi Gnani, has been designed and developed from scratch, with the aim of converting printed text or poetry in Kannada script without any restriction on vocabulary. The training and test sets have been collected from over 35 books published between 1970 and 2002, and this includes books written in Halegannada and pages containing Sanskrit slokas written in Kannada script. The coverage of the OCR is nearly complete in the sense that it recognizes all the punctuation marks, special symbols, Indo-Arabic and Kannada numerals and also the interspersed English words. Several minor and major original contributions have been made in developing this OCR at the different processing stages such as binarization, line and character segmentation, recognition and Unicode mapping. This has created a Kannada OCR that performs as well as, and in some cases better than, Google’s Tesseract OCR, as shown by the results. To the knowledge of the authors, this is the maiden report of a complete Kannada OCR, handling all the issues involved. Currently, there is no dictionary based postprocessing, and the obtained results are due solely to the recognition process. Four benchmark test databases containing scanned pages from books in Kannada, Sanskrit, Konkani and Tulu languages, but all of them printed in Kannada script, have been created. The word level recognition accuracy of Lipi Gnani is 4% higher on the Kannada dataset than that of Google’s Tesseract OCR, 8% higher on the datasets of Tulu and Sanskrit, and 25% higher on the Konkani dataset.
Tasks	Optical Character Recognition
Published	2019-01-02
URL	http://arxiv.org/abs/1901.00413v1
PDF	http://arxiv.org/pdf/1901.00413v1.pdf
PWC	https://paperswithcode.com/paper/lipi-gnani-a-versatile-ocr-for-documents-in
Repo
Framework

The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes


Title	The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes
Authors	Abhishek Patil, Srikanth Malla, Haiming Gang, Yi-Ting Chen
Abstract	3D multi-object detection and tracking are crucial for traffic scene understanding. However, the community pays less attention to these areas due to the lack of a standardized benchmark dataset to advance the field. Moreover, existing datasets (e.g., KITTI) do not provide sufficient data and labels to tackle challenging scenes where highly interactive and occluded traffic participants are present. To address these issues, we present the Honda Research Institute 3D Dataset (H3D), a large-scale full-surround 3D multi-object detection and tracking dataset collected using a 3D LiDAR scanner. H3D comprises 160 crowded and highly interactive traffic scenes with a total of 1 million labeled instances in 27,721 frames. With unique dataset size, rich annotations, and complex scenes, H3D is gathered to stimulate research on full-surround 3D multi-object detection and tracking. To effectively and efficiently annotate a large-scale 3D point cloud dataset, we propose a labeling methodology to speed up the overall annotation cycle. A standardized benchmark is created to evaluate full-surround 3D multi-object detection and tracking algorithms. 3D object detection and tracking algorithms are trained and tested on H3D. Finally, sources of errors are discussed for the development of future algorithms.
Tasks	3D Object Detection, Object Detection, Scene Understanding
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01568v1
PDF	http://arxiv.org/pdf/1903.01568v1.pdf
PWC	https://paperswithcode.com/paper/the-h3d-dataset-for-full-surround-3d-multi
Repo
Framework

Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation


Title	Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Authors	Ziyang Tang, Yihao Feng, Lihong Li, Dengyong Zhou, Qiang Liu
Abstract	Infinite horizon off-policy policy evaluation is a highly challenging task due to the excessively large variance of typical importance sampling (IS) estimators. Recently, Liu et al. (2018a) proposed an approach that significantly reduces the variance of infinite-horizon off-policy evaluation by estimating the stationary density ratio, but at the cost of introducing potentially high biases due to the error in density ratio estimation. In this paper, we develop a bias-reduced augmentation of their method, which can take advantage of a learned value function to obtain higher accuracy. Our method is doubly robust in that the bias vanishes when either the density ratio or the value function estimation is perfect. In general, when either of them is accurate, the bias can also be reduced. Both theoretical and empirical results show that our method yields significant advantages over previous methods.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07186v1
PDF	https://arxiv.org/pdf/1910.07186v1.pdf
PWC	https://paperswithcode.com/paper/doubly-robust-bias-reduction-in-infinite
Repo
Framework

ISBNet: Instance-aware Selective Branching Network


Title	ISBNet: Instance-aware Selective Branching Network
Authors	Shaofeng Cai, Yao Shu, Wei Wang, Beng Chin Ooi
Abstract	Recent years have witnessed growing interest in designing efficient neural networks and neural architecture search (NAS). Although remarkable efficiency and accuracy have been achieved, existing expert designed and NAS models neglect the fact that input instances are of varying complexity and thus different amounts of computation are required. Inference with a fixed model that processes all instances through the same transformations would consume computational resources unnecessarily. Customizing the model capacity in an instance-aware manner is required to alleviate such a problem. In this paper, we propose a novel Instance-aware Selective Branching Network (ISBNet) to support efficient instance-level inference by selectively bypassing transformation branches of insignificant importance weight. These weights are dynamically determined by a lightweight hypernetwork, SelectionNet, and recalibrated by gumbel-softmax for sparse branch selection. Extensive experiments show that ISBNet achieves extremely efficient inference in terms of parameter size and FLOPs compared to existing networks. For example, ISBNet takes only 8.70% parameters and 31.01% FLOPs of the efficient network MobileNetV2 with comparable accuracy on CIFAR-10.
Tasks	Neural Architecture Search
Published	2019-05-13
URL	https://arxiv.org/abs/1905.04849v3
PDF	https://arxiv.org/pdf/1905.04849v3.pdf
PWC	https://paperswithcode.com/paper/isbnet-instance-aware-selective-branching
Repo
Framework
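
The ISBNet abstract mentions recalibrating branch weights with gumbel-softmax. A minimal sketch of a generic gumbel-softmax sample follows, assuming plain Python and a hypothetical thresholding rule for choosing branches. The function names, the temperature and the threshold are assumptions for illustration, not the authors' SelectionNet.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed categorical sample from `logits` (generic sketch)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = max(rng.random(), 1e-12)  # guard against log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    top = max(noisy)
    exps = [math.exp(x - top) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_branches(weights, threshold):
    """Hypothetical rule: keep branches whose weight reaches the threshold."""
    return [i for i, w in enumerate(weights) if w >= threshold]


rng = random.Random(0)
weights = gumbel_softmax([1.0, 0.5, -2.0], temperature=0.5, rng=rng)
print(weights, select_branches(weights, threshold=0.1))
```

Lower temperatures push the sample closer to a one-hot vector, which is what makes thresholding the weights produce a sparse set of active branches.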

Improving reinforcement learning algorithms: towards optimal learning rate policies


Title	Improving reinforcement learning algorithms: towards optimal learning rate policies
Authors	Othmane Mounjid, Charles-Albert Lehalle
Abstract	This paper investigates to what extent one can improve reinforcement learning algorithms. Our study is split into three parts. First, our analysis shows that the classical asymptotic convergence rate $O(1/\sqrt{N})$ is pessimistic and can be replaced by $O((\log(N)/N)^{\beta})$ with $\frac{1}{2}\leq \beta \leq 1$ and $N$ the number of iterations. Second, we propose a dynamic optimal policy for the choice of the learning rate $(\gamma_k)_{k\geq 0}$ used in stochastic approximation (SA). We decompose our policy into two interacting levels: the inner and the outer level. In the inner level, we present the PASS algorithm (for “PAst Sign Search”) which, based on a predefined sequence $(\gamma^o_k)_{k\geq 0}$, constructs a new sequence $(\gamma^i_k)_{k\geq 0}$ whose error decreases faster. In the outer level, we propose an optimal methodology for the selection of the predefined sequence $(\gamma^o_k)_{k\geq 0}$. Third, we show empirically that our selection methodology of the learning rate significantly outperforms standard algorithms used in reinforcement learning (RL) in the following three applications: the estimation of a drift, the optimal placement of limit orders and the optimal execution of a large number of shares.
Tasks
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02319v3
PDF	https://arxiv.org/pdf/1911.02319v3.pdf
PWC	https://paperswithcode.com/paper/improving-reinforcement-learning-algorithms
Repo
Framework
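
To get a feel for the two convergence rates quoted in the abstract above, the sketch below evaluates both expressions numerically. It ignores the constants hidden by the O-notation and assumes a natural logarithm, so the numbers only indicate how the bounds scale. The function names are made up for this example.

```python
import math


def classical_rate(n):
    """Value of 1 / sqrt(N), the classical bound without its constant."""
    return 1.0 / math.sqrt(n)


def refined_rate(n, beta):
    """Value of (log(N) / N) ** beta for 1/2 <= beta <= 1, without its constant."""
    if not 0.5 <= beta <= 1.0:
        raise ValueError("beta must lie in [1/2, 1]")
    return (math.log(n) / n) ** beta


for n in (10**3, 10**6):
    print(n, classical_rate(n), refined_rate(n, 0.5), refined_rate(n, 1.0))
```

With beta = 1 the refined expression decays much faster than 1/sqrt(N), while at beta = 1/2 it is larger by a factor of sqrt(log N), so the gain depends on which beta applies.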

Adversarial Cross-Domain Action Recognition with Co-Attention


Title	Adversarial Cross-Domain Action Recognition with Co-Attention
Authors	Boxiao Pan, Zhangjie Cao, Ehsan Adeli, Juan Carlos Niebles
Abstract	Action recognition has been a widely studied topic with a heavy focus on supervised learning involving sufficient labeled videos. However, the problem of cross-domain action recognition, where training and testing videos are drawn from different underlying distributions, remains largely under-explored. Previous methods directly employ techniques for cross-domain image recognition, which tend to suffer from the severe temporal misalignment problem. This paper proposes a Temporal Co-attention Network (TCoN), which matches the distributions of temporally aligned action features between source and target domains using a novel cross-domain co-attention mechanism. Experimental results on three cross-domain action recognition datasets demonstrate that TCoN improves both previous single-domain and cross-domain methods significantly under the cross-domain setting.
Tasks
Published	2019-12-22
URL	https://arxiv.org/abs/1912.10405v1
PDF	https://arxiv.org/pdf/1912.10405v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-cross-domain-action-recognition
Repo
Framework

A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata


Title	A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata
Authors	Tobia Tesan, Pasquale Coscia, Lamberto Ballan
Abstract	Images represent a commonly used form of visual communication among people. Nevertheless, image classification may be a challenging task when dealing with unclear or non-common images needing more context to be correctly annotated. Metadata accompanying images on social-media represent an ideal source of additional information for retrieving proper neighborhoods easing image annotation task. To this end, we blend visual features extracted from neighbors and their metadata to jointly leverage context and visual cues. Our models use multiple semantic embeddings to achieve the dual objective of being robust to vocabulary changes between train and test sets and decoupling the architecture from the low-level metadata representation. Convolutional and recurrent neural networks (CNNs-RNNs) are jointly adopted to infer similarity among neighbors and query images. We perform comprehensive experiments on the NUS-WIDE dataset showing that our models outperform state-of-the-art architectures based on images and metadata, and decrease both sensory and semantic gaps to better annotate images.
Tasks	Image Classification
Published	2019-10-13
URL	https://arxiv.org/abs/1910.05770v2
PDF	https://arxiv.org/pdf/1910.05770v2.pdf
PWC	https://paperswithcode.com/paper/a-cnn-rnn-framework-for-image-annotation-from
Repo
Framework

Effective reinforcement learning based local search for the maximum k-plex problem


Title	Effective reinforcement learning based local search for the maximum k-plex problem
Authors	Yan Jin, John H. Drake, Una Benlic, Kun He
Abstract	The maximum k-plex problem is a computationally complex problem, which emerged from graph-theoretic social network studies. This paper presents an effective hybrid local search for solving the maximum k-plex problem that combines the recently proposed breakout local search algorithm with a reinforcement learning strategy. The proposed approach includes distinguishing features such as: a unified neighborhood search based on the swapping operator, a distance-and-quality reward for actions and a new parameter control mechanism based on reinforcement learning. Extensive experiments for the maximum k-plex problem (k = 2, 3, 4, 5) on 80 benchmark instances from the second DIMACS Challenge demonstrate that the proposed approach can match the best-known results from the literature in all but four problem instances. In addition, the proposed algorithm is able to find 32 new best solutions.
Tasks
Published	2019-03-13
URL	http://arxiv.org/abs/1903.05537v1
PDF	http://arxiv.org/pdf/1903.05537v1.pdf
PWC	https://paperswithcode.com/paper/effective-reinforcement-learning-based-local
Repo
Framework
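
For readers unfamiliar with the problem above: a vertex set S is a k-plex if every vertex in S is adjacent to at least |S| - k other vertices of S, so a 1-plex is a clique. The sketch below checks that definition directly. It is only a feasibility test, not the paper's local search, and the adjacency-dictionary representation is an assumption of this example.

```python
def is_k_plex(adjacency, subset, k):
    """Return True if `subset` is a k-plex of the graph.

    `adjacency` maps each vertex to the set of its neighbours (no self-loops).
    """
    members = set(subset)
    for vertex in members:
        inside_degree = len(adjacency[vertex] & members)
        if inside_degree < len(members) - k:
            return False
    return True


# Path graph 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(is_k_plex(path, [0, 1, 2], k=1))  # not a clique
print(is_k_plex(path, [0, 1, 2], k=2))  # each vertex may miss one other vertex
```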

BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance


Title	BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
Authors	Timo Schick, Hinrich Schütze
Abstract	Pretraining deep contextualized representations using an unsupervised language modeling objective has led to large performance gains for a variety of NLP tasks. Despite this success, recent work by Schick and Schütze (2019) suggests that these architectures struggle to understand rare words. For context-independent word embeddings, this problem can be addressed by separately learning representations for infrequent words. In this work, we show that the same idea can also be applied to contextualized models and clearly improves their downstream task performance. Most approaches for inducing word embeddings into existing embedding spaces are based on simple bag-of-words models; hence they are not a suitable counterpart for deep neural network language models. To overcome this problem, we introduce BERTRAM, a powerful architecture based on a pretrained BERT language model and capable of inferring high-quality representations for rare words. In BERTRAM, surface form and contexts of a word directly interact with each other in a deep architecture. Both on a rare word probing task and on three downstream task datasets, BERTRAM considerably improves representations for rare and medium frequency words compared to both a standalone BERT model and previous work.
Tasks	Language Modelling, Word Embeddings
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07181v2
PDF	https://arxiv.org/pdf/1910.07181v2.pdf
PWC	https://paperswithcode.com/paper/bertram-improved-word-embeddings-have-big
Repo
Framework

Transferrable Operative Difficulty Assessment in Robot-assisted Teleoperation: A Domain Adaptation Approach


Title	Transferrable Operative Difficulty Assessment in Robot-assisted Teleoperation: A Domain Adaptation Approach
Authors	Ziheng Wang, Cong Feng, Jie Zhang, Ann Majewicz Fey
Abstract	Providing an accurate and efficient assessment of operative difficulty is important for designing robot-assisted teleoperation interfaces that are easy and natural for human operators to use. In this paper, we aim to develop a data-driven approach to numerically characterize the operative difficulty demand of complex teleoperation. In an effort to provide an entirely task-independent assessment, we consider using only data collected from the human user including: (1) physiological response, and (2) movement kinematics. By leveraging an unsupervised domain adaptation technique, our approach learns the user information that defines task difficulty in a well-known source, namely, a Fitts’ target reaching task, and generalizes that knowledge to a more complex human motor control scenario, namely, the teleoperation of a robotic system. Our approach consists of two main parts: (1) The first part accounts for the inherent variances of user physiological and kinematic response between these cross-domain motor control scenarios that are vastly different. (2) A stacked two-layer learner is designed to improve the overall modeling performance, yielding a 96.6% accuracy in predicting the known difficulty of a Fitts’ reaching task when using movement kinematic features. We then validate the effectiveness of our model by investigating teleoperated robotic needle steering as a case study. Compared with a standard NASA TLX user survey, our results indicate significant differences in the difficulty demand for various choices of needle steering control algorithms, p<0.05, as well as the difficulty of steering the needle to different targets, p<0.05. The results highlight the potential of our approach to be used as a design tool to create more intuitive and natural teleoperation interfaces in robot-assisted systems.
Tasks	Domain Adaptation, Steering Control, Unsupervised Domain Adaptation
Published	2019-06-12
URL	https://arxiv.org/abs/1906.04934v1
PDF	https://arxiv.org/pdf/1906.04934v1.pdf
PWC	https://paperswithcode.com/paper/transferrable-operative-difficulty-assessment
Repo
Framework
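
The "known difficulty" of a Fitts' reaching task is usually expressed as an index of difficulty computed from target distance and width. The sketch below uses the common Shannon formulation, ID = log2(D / W + 1). The abstract does not state which formulation the authors used, so treat this as background only. The intercept and slope in `predicted_movement_time` are hypothetical placeholders.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1.0)


def predicted_movement_time(distance, width, intercept=0.1, slope=0.15):
    """Fitts' law MT = a + b * ID; `intercept` and `slope` are made-up values."""
    return intercept + slope * index_of_difficulty(distance, width)


# A target 70 units away and 10 units wide.
print(index_of_difficulty(70.0, 10.0))
print(predicted_movement_time(70.0, 10.0))
```

Smaller or more distant targets give a higher index, which is why such a task can supply graded difficulty labels for a source domain.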

A Just and Comprehensive Strategy for Using NLP to Address Online Abuse


Title	A Just and Comprehensive Strategy for Using NLP to Address Online Abuse
Authors	David Jurgens, Eshwar Chandrasekharan, Libby Hemphill
Abstract	Online abusive behavior affects millions and the NLP community has attempted to mitigate this problem by developing technologies to detect abuse. However, current methods have largely focused on a narrow definition of abuse to the detriment of victims who seek both validation and solutions. In this position paper, we argue that the community needs to make three substantive changes: (1) expanding our scope of problems to tackle both more subtle and more serious forms of abuse, (2) developing proactive technologies that counter or inhibit abuse before it harms, and (3) reframing our effort within a framework of justice to promote healthy communities.
Tasks
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01738v2
PDF	https://arxiv.org/pdf/1906.01738v2.pdf
PWC	https://paperswithcode.com/paper/a-just-and-comprehensive-strategy-for-using
Repo
Framework

Gradient Descent based Weight Learning for Grouping Problems: Application on Graph Coloring and Equitable Graph Coloring


Title	Gradient Descent based Weight Learning for Grouping Problems: Application on Graph Coloring and Equitable Graph Coloring
Authors	Olivier Goudet, Béatrice Duval, Jin-Kao Hao
Abstract	A grouping problem involves partitioning a set of items into mutually disjoint groups or clusters according to some guiding decision criteria and imperative constraints. Grouping problems have many relevant applications and are computationally difficult. In this work, we present a general weight learning based optimization framework for solving grouping problems. The central idea of our approach is to formulate the task of seeking a solution as a real-valued weight matrix learning problem that is solved by first order gradient descent. A practical implementation of this framework is proposed with tensor calculus in order to benefit from parallel computing on GPU devices. To show its potential for tackling difficult problems, we apply the approach to two typical and well-known grouping problems (graph coloring and equitable graph coloring). We present large computational experiments and comparisons on popular benchmarks and report improved best-known results (new upper bounds) for several large graphs.
Tasks
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02261v1
PDF	https://arxiv.org/pdf/1909.02261v1.pdf
PWC	https://paperswithcode.com/paper/gradient-descent-based-weight-learning-for
Repo
Framework
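
The abstract above casts graph coloring as learning a real-valued weight matrix. One plausible way to read a coloring off such a matrix and score it is sketched below. The row-wise argmax projection and the conflict count are assumptions made for illustration, and the gradient descent update itself is not shown because the abstract does not specify it.

```python
def colors_from_weights(weights):
    """Assign each vertex the color with the largest weight in its row."""
    return [max(range(len(row)), key=row.__getitem__) for row in weights]


def count_conflicts(edges, colors):
    """Number of edges whose two endpoints share the same color."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


# Triangle graph with three candidate colors per vertex.
triangle = [(0, 1), (1, 2), (0, 2)]
weights = [
    [0.9, 0.1, 0.0],   # vertex 0 prefers color 0
    [0.2, 0.7, 0.1],   # vertex 1 prefers color 1
    [0.1, 0.3, 0.6],   # vertex 2 prefers color 2
]
coloring = colors_from_weights(weights)
print(coloring, count_conflicts(triangle, coloring))
```

A coloring is legal when the conflict count is zero. An equitable coloring would additionally require the color classes to differ in size by at most one.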

Banded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era


Title	Banded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era
Authors	Nicolas Durrande, Vincent Adam, Lucas Bordeaux, Stefanos Eleftheriadis, James Hensman
Abstract	Banded matrices can be used as precision matrices in several models including linear state-space models, some Gaussian processes, and Gaussian Markov random fields. The aim of the paper is to make modern inference methods (such as variational inference or gradient-based sampling) available for Gaussian models with banded precision. We show that this can efficiently be achieved by equipping an automatic differentiation framework, such as TensorFlow or PyTorch, with some linear algebra operators dedicated to banded matrices. This paper studies the algorithmic aspects of the required operators, details their reverse-mode derivatives, and shows that their complexity is linear in the number of observations.
Tasks	Gaussian Processes
Published	2019-02-26
URL	http://arxiv.org/abs/1902.10078v1
PDF	http://arxiv.org/pdf/1902.10078v1.pdf
PWC	https://paperswithcode.com/paper/banded-matrix-operators-for-gaussian-markov
Repo
Framework
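
To illustrate why banded structure gives linear cost, the sketch below solves a tridiagonal system (bandwidth one, the simplest banded case) with the Thomas algorithm in a single forward and backward sweep. It is a generic textbook routine in plain Python, not one of the paper's operators. It performs no pivoting, so it assumes a well-conditioned matrix such as a symmetric positive definite precision matrix.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for tridiagonal A in O(n) time.

    lower[i - 1] = A[i][i - 1], diag[i] = A[i][i], upper[i] = A[i][i + 1].
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    if n > 1:
        c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    # Backward sweep: substitute from the last unknown upwards.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Second-difference matrix, a typical banded precision structure.
solution = solve_tridiagonal([-1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0], [0.0, 0.0, 4.0])
print(solution)
```

Each sweep touches every row once, so the cost grows linearly with the number of unknowns instead of cubically as for a dense solve.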