Paper Group ANR 1216
Select and Attend: Towards Controllable Content Selection in Text Generation. Class Specific or Shared? A Hybrid Dictionary Learning Network for Image Classification. Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script. The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes. …
Select and Attend: Towards Controllable Content Selection in Text Generation
Title | Select and Attend: Towards Controllable Content Selection in Text Generation |
Authors | Xiaoyu Shen, Jun Suzuki, Kentaro Inui, Hui Su, Dietrich Klakow, Satoshi Sekine |
Abstract | Many text generation tasks naturally contain two steps: content selection and surface realization. Current neural encoder-decoder models conflate both steps into a black-box architecture. As a result, the content to be described in the text cannot be explicitly controlled. This paper tackles this problem by decoupling content selection from the decoder. The decoupled content selection is human-interpretable, and its value can be manually manipulated to control the content of the generated text. The model can be trained end-to-end without human annotations by maximizing a lower bound of the marginal likelihood. We further propose an effective way to trade off performance against controllability with a single adjustable hyperparameter. In both data-to-text and headline generation tasks, our model achieves promising results, paving the way for controllable content selection in text generation. |
Tasks | Text Generation |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04453v1 |
https://arxiv.org/pdf/1909.04453v1.pdf | |
PWC | https://paperswithcode.com/paper/select-and-attend-towards-controllable |
Repo | |
Framework | |
Class Specific or Shared? A Hybrid Dictionary Learning Network for Image Classification
Title | Class Specific or Shared? A Hybrid Dictionary Learning Network for Image Classification |
Authors | Shuai Shao, Yan-Jiang Wang, Bao-Di Liu, Rui Xu, Ye Li |
Abstract | Dictionary learning methods can be split into two categories: i) class specific dictionary learning and ii) class shared dictionary learning. The difference between the two categories is how they use the discriminative information. With the first category, samples of different classes are mapped to different subspaces, which leads to some redundancy in the base vectors. For the second category, the samples in each specific class cannot be described well. Moreover, most class shared dictionary learning methods use the L0-norm regularization term as the sparse constraint. In this paper, we first propose a novel class shared dictionary learning method named label embedded dictionary learning (LEDL) by introducing the L1-norm sparse constraint to replace the conventional L0-norm regularization term in the LC-KSVD method. Then we propose a novel network named hybrid dictionary learning network (HDLN) that combines class specific dictionary learning with class shared dictionary learning to fully describe the features and boost classification performance. Extensive experimental results on six benchmark datasets illustrate that our methods are capable of achieving superior performance compared to several conventional classification algorithms. |
Tasks | Dictionary Learning, Image Classification |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.08928v1 |
http://arxiv.org/pdf/1904.08928v1.pdf | |
PWC | https://paperswithcode.com/paper/class-specific-or-shared-a-hybrid-dictionary |
Repo | |
Framework | |
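The abstract above contrasts L0-norm and L1-norm sparse constraints. As a hedged illustration only (this is not the LEDL algorithm, and the function names and sample values are hypothetical), the sketch below shows soft-thresholding, the standard proximal operator of the L1 norm, and how it drives small coefficients to exactly zero.

```python
def soft_threshold(values, lam):
    """Elementwise proximal operator of lam * ||x||_1 (soft-thresholding)."""
    result = []
    for v in values:
        if v > lam:
            result.append(v - lam)   # shrink positive entries toward zero
        elif v < -lam:
            result.append(v + lam)   # shrink negative entries toward zero
        else:
            result.append(0.0)       # entries within [-lam, lam] become zero
    return result


def l0_norm(values):
    """Number of nonzero entries (the quantity an L0 constraint bounds)."""
    return sum(1 for v in values if v != 0)


def l1_norm(values):
    """Sum of absolute values (the convex surrogate for sparsity)."""
    return sum(abs(v) for v in values)


# Hypothetical sparse-code vector, for illustration only.
codes = [0.9, -0.05, 0.3, -1.2, 0.02]
shrunk = soft_threshold(codes, lam=0.1)
print("nonzeros before:", l0_norm(codes), "after:", l0_norm(shrunk))
print("L1 norm before:", l1_norm(codes), "after:", l1_norm(shrunk))
```

Because the L1 penalty is convex, each shrinkage step has this closed form, which is one common reason to prefer it over an L0 constraint.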
Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script
Title | Lipi Gnani - A Versatile OCR for Documents in any Language Printed in Kannada Script |
Authors | Shiva Kumar H R, Ramakrishnan A G |
Abstract | A Kannada OCR, named Lipi Gnani, has been designed and developed from scratch, with the aim of converting printed text or poetry in Kannada script without any restriction on vocabulary. The training and test sets have been collected from over 35 books published between 1970 and 2002, including books written in Halegannada and pages containing Sanskrit slokas written in Kannada script. The coverage of the OCR is nearly complete in the sense that it recognizes all the punctuation marks, special symbols, Indo-Arabic and Kannada numerals, and also the interspersed English words. Several minor and major original contributions have been made in developing this OCR at the different processing stages, such as binarization, line and character segmentation, recognition, and Unicode mapping. This has created a Kannada OCR that performs as well as, and in some cases better than, Google’s Tesseract OCR, as shown by the results. To the knowledge of the authors, this is the first report of a complete Kannada OCR handling all the issues involved. Currently, there is no dictionary-based postprocessing, and the obtained results are due solely to the recognition process. Four benchmark test databases containing scanned pages from books in the Kannada, Sanskrit, Konkani and Tulu languages, all of them printed in Kannada script, have been created. The word-level recognition accuracy of Lipi Gnani is 4% higher on the Kannada dataset than that of Google’s Tesseract OCR, 8% higher on the Tulu and Sanskrit datasets, and 25% higher on the Konkani dataset. |
Tasks | Optical Character Recognition |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00413v1 |
http://arxiv.org/pdf/1901.00413v1.pdf | |
PWC | https://paperswithcode.com/paper/lipi-gnani-a-versatile-ocr-for-documents-in |
Repo | |
Framework | |
The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes
Title | The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes |
Authors | Abhishek Patil, Srikanth Malla, Haiming Gang, Yi-Ting Chen |
Abstract | 3D multi-object detection and tracking are crucial for traffic scene understanding. However, the community pays less attention to these areas due to the lack of a standardized benchmark dataset to advance the field. Moreover, existing datasets (e.g., KITTI) do not provide sufficient data and labels to tackle challenging scenes where highly interactive and occluded traffic participants are present. To address these issues, we present the Honda Research Institute 3D Dataset (H3D), a large-scale full-surround 3D multi-object detection and tracking dataset collected using a 3D LiDAR scanner. H3D comprises 160 crowded and highly interactive traffic scenes with a total of 1 million labeled instances in 27,721 frames. With unique dataset size, rich annotations, and complex scenes, H3D is gathered to stimulate research on full-surround 3D multi-object detection and tracking. To effectively and efficiently annotate a large-scale 3D point cloud dataset, we propose a labeling methodology to speed up the overall annotation cycle. A standardized benchmark is created to evaluate full-surround 3D multi-object detection and tracking algorithms. 3D object detection and tracking algorithms are trained and tested on H3D. Finally, sources of errors are discussed for the development of future algorithms. |
Tasks | 3D Object Detection, Object Detection, Scene Understanding |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01568v1 |
http://arxiv.org/pdf/1903.01568v1.pdf | |
PWC | https://paperswithcode.com/paper/the-h3d-dataset-for-full-surround-3d-multi |
Repo | |
Framework | |
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Title | Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation |
Authors | Ziyang Tang, Yihao Feng, Lihong Li, Dengyong Zhou, Qiang Liu |
Abstract | Infinite horizon off-policy policy evaluation is a highly challenging task due to the excessively large variance of typical importance sampling (IS) estimators. Recently, Liu et al. (2018a) proposed an approach that significantly reduces the variance of infinite-horizon off-policy evaluation by estimating the stationary density ratio, but at the cost of introducing potentially high biases due to the error in density ratio estimation. In this paper, we develop a bias-reduced augmentation of their method, which can take advantage of a learned value function to obtain higher accuracy. Our method is doubly robust in that the bias vanishes when either the density ratio or the value function estimation is perfect. In general, when either of them is accurate, the bias can also be reduced. Both theoretical and empirical results show that our method yields significant advantages over previous methods. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07186v1 |
https://arxiv.org/pdf/1910.07186v1.pdf | |
PWC | https://paperswithcode.com/paper/doubly-robust-bias-reduction-in-infinite |
Repo | |
Framework | |
ISBNet: Instance-aware Selective Branching Network
Title | ISBNet: Instance-aware Selective Branching Network |
Authors | Shaofeng Cai, Yao Shu, Wei Wang, Beng Chin Ooi |
Abstract | Recent years have witnessed growing interest in designing efficient neural networks and in neural architecture search (NAS). Although remarkable efficiency and accuracy have been achieved, existing expert-designed and NAS models neglect the fact that input instances are of varying complexity and thus require different amounts of computation. Inference with a fixed model that processes all instances through the same transformations wastes computational resources. Customizing the model capacity in an instance-aware manner is required to alleviate this problem. In this paper, we propose a novel Instance-aware Selective Branching Network (ISBNet) to support efficient instance-level inference by selectively bypassing transformation branches with insignificant importance weights. These weights are dynamically determined by a lightweight hypernetwork, SelectionNet, and recalibrated by gumbel-softmax for sparse branch selection. Extensive experiments show that ISBNet achieves extremely efficient inference in terms of parameter size and FLOPs compared to existing networks. For example, ISBNet uses only 8.70% of the parameters and 31.01% of the FLOPs of the efficient network MobileNetV2 with comparable accuracy on CIFAR-10. |
Tasks | Neural Architecture Search |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04849v3 |
https://arxiv.org/pdf/1905.04849v3.pdf | |
PWC | https://paperswithcode.com/paper/isbnet-instance-aware-selective-branching |
Repo | |
Framework | |
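The ISBNet abstract above mentions recalibrating branch importance weights with gumbel-softmax and bypassing low-weight branches. The following is a minimal sketch of generic Gumbel-softmax sampling, not the authors' implementation: the logits, the temperature, and the threshold-based `select_branches` rule are assumptions made for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed (soft) categorical sample over branches."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw so neither logarithm sees zero.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_branches(weights, threshold):
    """Hypothetical rule: keep only branches whose weight reaches the threshold."""
    return [i for i, w in enumerate(weights) if w >= threshold]


rng = random.Random(0)
branch_logits = [2.0, 0.1, -1.0, 0.5]  # illustrative importance scores
weights = gumbel_softmax(branch_logits, temperature=0.5, rng=rng)
active = select_branches(weights, threshold=0.05)
print("weights:", weights)
print("active branches:", active)
```

Lower temperatures push the sampled weights closer to one-hot, so fewer branches pass a fixed threshold.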
Improving reinforcement learning algorithms: towards optimal learning rate policies
Title | Improving reinforcement learning algorithms: towards optimal learning rate policies |
Authors | Othmane Mounjid, Charles-Albert Lehalle |
Abstract | This paper investigates to what extent one can improve reinforcement learning algorithms. Our study is split into three parts. First, our analysis shows that the classical asymptotic convergence rate $O(1/\sqrt{N})$ is pessimistic and can be replaced by $O((\log(N)/N)^{\beta})$ with $\frac{1}{2}\leq \beta \leq 1$ and $N$ the number of iterations. Second, we propose a dynamic optimal policy for the choice of the learning rate $(\gamma_k)_{k\geq 0}$ used in stochastic approximation (SA). We decompose our policy into two interacting levels: the inner and the outer level. In the inner level, we present the PASS algorithm (for “PAst Sign Search”) which, based on a predefined sequence $(\gamma^o_k)_{k\geq 0}$, constructs a new sequence $(\gamma^i_k)_{k\geq 0}$ whose error decreases faster. In the outer level, we propose an optimal methodology for the selection of the predefined sequence $(\gamma^o_k)_{k\geq 0}$. Third, we show empirically that our selection methodology for the learning rate significantly outperforms standard algorithms used in reinforcement learning (RL) in the following three applications: the estimation of a drift, the optimal placement of limit orders, and the optimal execution of a large number of shares. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02319v3 |
https://arxiv.org/pdf/1911.02319v3.pdf | |
PWC | https://paperswithcode.com/paper/improving-reinforcement-learning-algorithms |
Repo | |
Framework | |
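To make the two convergence rates quoted in the abstract above concrete, the sketch below evaluates $1/\sqrt{N}$ and $(\log(N)/N)^{\beta}$ for a few values of $N$. It ignores the constants hidden by the $O(\cdot)$ notation, and the function names are illustrative rather than taken from the paper.

```python
import math


def classical_rate(n):
    """The classical rate 1 / sqrt(N), constants ignored."""
    return 1.0 / math.sqrt(n)


def improved_rate(n, beta):
    """The rate (log(N) / N) ** beta for 1/2 <= beta <= 1, constants ignored."""
    if not 0.5 <= beta <= 1.0:
        raise ValueError("beta must lie in [1/2, 1]")
    return (math.log(n) / n) ** beta


for n in (10**2, 10**4, 10**6):
    print(n, classical_rate(n), improved_rate(n, 0.5), improved_rate(n, 1.0))
```

Up to constants, the second expression decays faster than the first whenever $\beta > 1/2$; at $\beta = 1/2$ the two differ by a factor of $\sqrt{\log N}$.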
Adversarial Cross-Domain Action Recognition with Co-Attention
Title | Adversarial Cross-Domain Action Recognition with Co-Attention |
Authors | Boxiao Pan, Zhangjie Cao, Ehsan Adeli, Juan Carlos Niebles |
Abstract | Action recognition has been a widely studied topic with a heavy focus on supervised learning involving sufficient labeled videos. However, the problem of cross-domain action recognition, where training and testing videos are drawn from different underlying distributions, remains largely under-explored. Previous methods directly employ techniques for cross-domain image recognition, which tend to suffer from severe temporal misalignment. This paper proposes a Temporal Co-attention Network (TCoN), which matches the distributions of temporally aligned action features between source and target domains using a novel cross-domain co-attention mechanism. Experimental results on three cross-domain action recognition datasets demonstrate that TCoN significantly improves on both previous single-domain and cross-domain methods under the cross-domain setting. |
Tasks | |
Published | 2019-12-22 |
URL | https://arxiv.org/abs/1912.10405v1 |
https://arxiv.org/pdf/1912.10405v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-cross-domain-action-recognition |
Repo | |
Framework | |
A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata
Title | A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata |
Authors | Tobia Tesan, Pasquale Coscia, Lamberto Ballan |
Abstract | Images represent a commonly used form of visual communication among people. Nevertheless, image classification may be a challenging task when dealing with unclear or uncommon images that need more context to be correctly annotated. Metadata accompanying images on social media represent an ideal source of additional information for retrieving proper neighborhoods, which eases the image annotation task. To this end, we blend visual features extracted from neighbors and their metadata to jointly leverage context and visual cues. Our models use multiple semantic embeddings to achieve the dual objective of being robust to vocabulary changes between train and test sets and decoupling the architecture from the low-level metadata representation. Convolutional and recurrent neural networks (CNNs-RNNs) are jointly adopted to infer similarity among neighbors and query images. We perform comprehensive experiments on the NUS-WIDE dataset showing that our models outperform state-of-the-art architectures based on images and metadata, and decrease both sensory and semantic gaps to better annotate images. |
Tasks | Image Classification |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05770v2 |
https://arxiv.org/pdf/1910.05770v2.pdf | |
PWC | https://paperswithcode.com/paper/a-cnn-rnn-framework-for-image-annotation-from |
Repo | |
Framework | |
Effective reinforcement learning based local search for the maximum k-plex problem
Title | Effective reinforcement learning based local search for the maximum k-plex problem |
Authors | Yan Jin, John H. Drake, Una Benlic, Kun He |
Abstract | The maximum k-plex problem is a computationally complex problem, which emerged from graph-theoretic social network studies. This paper presents an effective hybrid local search for solving the maximum k-plex problem that combines the recently proposed breakout local search algorithm with a reinforcement learning strategy. The proposed approach includes distinguishing features such as: a unified neighborhood search based on the swapping operator, a distance-and-quality reward for actions and a new parameter control mechanism based on reinforcement learning. Extensive experiments for the maximum k-plex problem (k = 2, 3, 4, 5) on 80 benchmark instances from the second DIMACS Challenge demonstrate that the proposed approach can match the best-known results from the literature in all but four problem instances. In addition, the proposed algorithm is able to find 32 new best solutions. |
Tasks | |
Published | 2019-03-13 |
URL | http://arxiv.org/abs/1903.05537v1 |
http://arxiv.org/pdf/1903.05537v1.pdf | |
PWC | https://paperswithcode.com/paper/effective-reinforcement-learning-based-local |
Repo | |
Framework | |
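For readers unfamiliar with the structure named in the abstract above: a vertex subset S is a k-plex if every vertex in S is adjacent to at least |S| − k other vertices of S, so a 1-plex is a clique. The sketch below only checks this definition on a small hypothetical graph; it does not reproduce the paper's local search or its reinforcement learning component.

```python
def is_k_plex(adjacency, subset, k):
    """Return True if `subset` is a k-plex of the graph given by `adjacency`.

    `adjacency` maps each vertex to the set of its neighbours.
    """
    members = set(subset)
    required = len(members) - k  # minimum number of neighbours inside the subset
    return all(len(adjacency[v] & members) >= required for v in members)


# Hypothetical 4-vertex graph: a 4-clique with the edge 1-3 removed.
graph = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
}

print(is_k_plex(graph, [0, 1, 2], 1))     # triangle: a clique, hence a 1-plex
print(is_k_plex(graph, [0, 1, 2, 3], 1))  # not a clique (1 and 3 are not adjacent)
print(is_k_plex(graph, [0, 1, 2, 3], 2))  # each vertex misses at most one other
```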
BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
Title | BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance |
Authors | Timo Schick, Hinrich Schütze |
Abstract | Pretraining deep contextualized representations using an unsupervised language modeling objective has led to large performance gains for a variety of NLP tasks. Despite this success, recent work by Schick and Schütze (2019) suggests that these architectures struggle to understand rare words. For context-independent word embeddings, this problem can be addressed by separately learning representations for infrequent words. In this work, we show that the same idea can also be applied to contextualized models and clearly improves their downstream task performance. Most approaches for inducing word embeddings into existing embedding spaces are based on simple bag-of-words models; hence they are not a suitable counterpart for deep neural network language models. To overcome this problem, we introduce BERTRAM, a powerful architecture based on a pretrained BERT language model and capable of inferring high-quality representations for rare words. In BERTRAM, surface form and contexts of a word directly interact with each other in a deep architecture. Both on a rare word probing task and on three downstream task datasets, BERTRAM considerably improves representations for rare and medium frequency words compared to both a standalone BERT model and previous work. |
Tasks | Language Modelling, Word Embeddings |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07181v2 |
https://arxiv.org/pdf/1910.07181v2.pdf | |
PWC | https://paperswithcode.com/paper/bertram-improved-word-embeddings-have-big |
Repo | |
Framework | |
Transferrable Operative Difficulty Assessment in Robot-assisted Teleoperation: A Domain Adaptation Approach
Title | Transferrable Operative Difficulty Assessment in Robot-assisted Teleoperation: A Domain Adaptation Approach |
Authors | Ziheng Wang, Cong Feng, Jie Zhang, Ann Majewicz Fey |
Abstract | Providing an accurate and efficient assessment of operative difficulty is important for designing robot-assisted teleoperation interfaces that are easy and natural for human operators to use. In this paper, we aim to develop a data-driven approach to numerically characterize the operative difficulty demand of complex teleoperation. In an effort to provide an entirely task-independent assessment, we consider using only data collected from the human user, including: (1) physiological response, and (2) movement kinematics. By leveraging an unsupervised domain adaptation technique, our approach learns the user information that defines task difficulty in a well-known source, namely a Fitts’ target reaching task, and generalizes that knowledge to a more complex human motor control scenario, namely the teleoperation of a robotic system. Our approach consists of two main parts: (1) The first part accounts for the inherent variances of user physiological and kinematic response between these vastly different cross-domain motor control scenarios. (2) A stacked two-layer learner is designed to improve the overall modeling performance, yielding a 96.6% accuracy in predicting the known difficulty of a Fitts’ reaching task when using movement kinematic features. We then validate the effectiveness of our model by investigating teleoperated robotic needle steering as a case study. Compared with a standard NASA TLX user survey, our results indicate significant differences in the difficulty demand for various choices of needle steering control algorithms, p<0.05, as well as the difficulty of steering the needle to different targets, p<0.05. The results highlight the potential of our approach to be used as a design tool to create more intuitive and natural teleoperation interfaces in robot-assisted systems. |
Tasks | Domain Adaptation, Steering Control, Unsupervised Domain Adaptation |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.04934v1 |
https://arxiv.org/pdf/1906.04934v1.pdf | |
PWC | https://paperswithcode.com/paper/transferrable-operative-difficulty-assessment |
Repo | |
Framework | |
A Just and Comprehensive Strategy for Using NLP to Address Online Abuse
Title | A Just and Comprehensive Strategy for Using NLP to Address Online Abuse |
Authors | David Jurgens, Eshwar Chandrasekharan, Libby Hemphill |
Abstract | Online abusive behavior affects millions, and the NLP community has attempted to mitigate this problem by developing technologies to detect abuse. However, current methods have largely focused on a narrow definition of abuse, to the detriment of victims who seek both validation and solutions. In this position paper, we argue that the community needs to make three substantive changes: (1) expanding our scope of problems to tackle both more subtle and more serious forms of abuse, (2) developing proactive technologies that counter or inhibit abuse before it harms, and (3) reframing our effort within a framework of justice to promote healthy communities. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01738v2 |
https://arxiv.org/pdf/1906.01738v2.pdf | |
PWC | https://paperswithcode.com/paper/a-just-and-comprehensive-strategy-for-using |
Repo | |
Framework | |
Gradient Descent based Weight Learning for Grouping Problems: Application on Graph Coloring and Equitable Graph Coloring
Title | Gradient Descent based Weight Learning for Grouping Problems: Application on Graph Coloring and Equitable Graph Coloring |
Authors | Olivier Goudet, Béatrice Duval, Jin-Kao Hao |
Abstract | A grouping problem involves partitioning a set of items into mutually disjoint groups or clusters according to some guiding decision criteria and imperative constraints. Grouping problems have many relevant applications and are computationally difficult. In this work, we present a general weight learning based optimization framework for solving grouping problems. The central idea of our approach is to formulate the task of seeking a solution as a real-valued weight matrix learning problem that is solved by first order gradient descent. A practical implementation of this framework is proposed with tensor calculus in order to benefit from parallel computing on GPU devices. To show its potential for tackling difficult problems, we apply the approach to two typical and well-known grouping problems (graph coloring and equitable graph coloring). We present large computational experiments and comparisons on popular benchmarks and report improved best-known results (new upper bounds) for several large graphs. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02261v1 |
https://arxiv.org/pdf/1909.02261v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-descent-based-weight-learning-for |
Repo | |
Framework | |
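The abstract above casts a grouping problem as learning a real-valued weight matrix. The sketch below illustrates only the encoding and evaluation side under stated assumptions: a row-wise argmax decodes each vertex's color (an assumption, not confirmed by the abstract), conflicts are counted over edges, and equity is checked as group sizes differing by at most one. The gradient descent update itself is not reproduced, and all names and values are hypothetical.

```python
def decode_groups(weights):
    """Assign each item to the group with the largest weight in its row."""
    return [max(range(len(row)), key=row.__getitem__) for row in weights]


def count_conflicts(edges, colors):
    """Number of edges whose two endpoints share a color."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def is_equitable(colors, num_colors):
    """True if the color class sizes differ by at most one."""
    sizes = [colors.count(c) for c in range(num_colors)]
    return max(sizes) - min(sizes) <= 1


# Hypothetical instance: a 4-cycle colored with 2 colors.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
weights = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.3],
    [0.4, 0.6],
]

colors = decode_groups(weights)
print("colors:", colors)
print("conflicts:", count_conflicts(edges, colors))
print("equitable:", is_equitable(colors, num_colors=2))
```

In a weight-learning scheme, an optimizer would adjust `weights` so that the decoded coloring reaches zero conflicts.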
Banded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era
Title | Banded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era |
Authors | Nicolas Durrande, Vincent Adam, Lucas Bordeaux, Stefanos Eleftheriadis, James Hensman |
Abstract | Banded matrices can be used as precision matrices in several models including linear state-space models, some Gaussian processes, and Gaussian Markov random fields. The aim of the paper is to make modern inference methods (such as variational inference or gradient-based sampling) available for Gaussian models with banded precision. We show that this can efficiently be achieved by equipping an automatic differentiation framework, such as TensorFlow or PyTorch, with some linear algebra operators dedicated to banded matrices. This paper studies the algorithmic aspects of the required operators, details their reverse-mode derivatives, and shows that their complexity is linear in the number of observations. |
Tasks | Gaussian Processes |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.10078v1 |
http://arxiv.org/pdf/1902.10078v1.pdf | |
PWC | https://paperswithcode.com/paper/banded-matrix-operators-for-gaussian-markov |
Repo | |
Framework | |
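The linear-complexity claim in the abstract above can be illustrated with the simplest banded operation, a matrix-vector product that touches only the stored diagonals. This is a plain-Python sketch under assumed conventions (a dictionary mapping diagonal offsets to entries), not one of the paper's TensorFlow or PyTorch operators, and the tridiagonal matrix is a hypothetical example.

```python
def banded_matvec(diagonals, x):
    """Multiply a banded matrix by a vector using only its stored diagonals.

    `diagonals` maps an offset k to the entries A[i][i + k]; offset 0 is the
    main diagonal, positive offsets lie above it, negative offsets below.
    The cost is O(n * number_of_diagonals), linear in n for a fixed bandwidth.
    """
    y = [0.0] * len(x)
    for offset, entries in diagonals.items():
        for idx, value in enumerate(entries):
            row = idx if offset >= 0 else idx - offset
            col = row + offset
            y[row] += value * x[col]
    return y


# Hypothetical tridiagonal precision-like matrix of size 4:
# 2 on the main diagonal, -1 on the first super- and sub-diagonals.
precision = {
    0: [2, 2, 2, 2],
    1: [-1, -1, -1],
    -1: [-1, -1, -1],
}

print(banded_matvec(precision, [1, 2, 3, 4]))
```

A dense product would cost O(n²) for the same matrix, which is why dedicated banded operators matter once the number of observations grows.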