July 27, 2019

Paper Group ANR 724

Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models. Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation. Deep Learning for Computational Chemistry. Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data. Fine-grained Visual-textual Representation …

Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models


Title	Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models
Authors	Yi Luan, Chris Brockett, Bill Dolan, Jianfeng Gao, Michel Galley
Abstract	Building a persona-based conversation agent is challenging owing to the lack of large amounts of speaker-specific conversation data for model training. This paper addresses the problem by proposing a multi-task learning approach to training neural conversation models that leverages both conversation data across speakers and other types of data pertaining to the speaker and speaker roles to be modeled. Experiments show that our approach leads to significant improvements over baseline model quality, generating responses that capture more precisely speakers’ traits and speaking styles. The model offers the benefits of being algorithmically simple and easy to implement, and not relying on large quantities of data representing specific individual speakers.
Tasks	Multi-Task Learning
Published	2017-10-20
URL	http://arxiv.org/abs/1710.07388v1
PDF	http://arxiv.org/pdf/1710.07388v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-learning-for-speaker-role
Repo
Framework

Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation


Title	Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation
Authors	Chris Hokamp
Abstract	This work presents a novel approach to Automatic Post-Editing (APE) and Word-Level Quality Estimation (QE) using ensembles of specialized Neural Machine Translation (NMT) systems. Word-level features that have proven effective for QE are included as input factors, expanding the representation of the original source and the machine translation hypothesis, which are used to generate an automatically post-edited hypothesis. We train a suite of NMT models that use different input representations, but share the same output space. These models are then ensembled together, and tuned for both the APE and the QE task. We thus attempt to connect the state-of-the-art approaches to APE and QE within a single framework. Our models achieve state-of-the-art results in both tasks, with the only difference in the tuning step which learns weights for each component of the ensemble.
Tasks	Automatic Post-Editing, Machine Translation
Published	2017-06-15
URL	http://arxiv.org/abs/1706.05083v2
PDF	http://arxiv.org/pdf/1706.05083v2.pdf
PWC	https://paperswithcode.com/paper/ensembling-factored-neural-machine
Repo
Framework

Deep Learning for Computational Chemistry


Title	Deep Learning for Computational Chemistry
Authors	Garrett B. Goh, Nathan O. Hodas, Abhinav Vishnu
Abstract	The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many domains, particularly in speech recognition and computer vision, to the extent that the majority of expert practitioners in those fields are now regularly eschewing prior established models in favor of deep learning models. In this review, we provide an introductory overview of the theory of deep neural networks and the unique properties that distinguish them from traditional machine learning algorithms used in cheminformatics. By providing an overview of the variety of emerging applications of deep neural networks, we highlight their ubiquity and broad applicability to a wide range of challenges in the field, including QSAR, virtual screening, protein structure prediction, quantum chemistry, materials design and property prediction. In reviewing the performance of deep neural networks, we observed consistent outperformance of non-neural-network state-of-the-art models across disparate research topics, and deep neural network based models often exceeded the “glass ceiling” expectations of their respective tasks. Coupled with the maturity of GPU-accelerated computing for training deep neural networks and the exponential growth of chemical data on which to train these networks, we anticipate that deep learning algorithms will be a valuable tool for computational chemistry.
Tasks	Speech Recognition
Published	2017-01-17
URL	http://arxiv.org/abs/1701.04503v1
PDF	http://arxiv.org/pdf/1701.04503v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-computational-chemistry
Repo
Framework

Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data


Title	Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data
Authors	Kun Liu, Wu Liu, Huadong Ma, Wenbing Huang, Xiongxiong Dong
Abstract	Action recognition in surveillance video makes our life safer by detecting criminal events or predicting violent emergencies. However, efficient action recognition is not free of difficulty. First, there are so many action classes in daily life that we cannot pre-define all possible action classes beforehand. Moreover, it is very hard to collect real-world videos for certain actions, such as stealing and street fighting, due to legal restrictions and privacy protection. These challenges make existing data-driven recognition methods insufficient to attain the desired performance. Zero-shot learning has the potential to solve these issues since it can perform classification without positive examples. Nevertheless, current zero-shot learning algorithms have been studied under the unreasonable setting where seen classes are absent during the testing phase. Motivated by this, we study the task of action recognition in surveillance video under a more realistic generalized zero-shot setting, where testing data contains both seen and unseen classes. To the best of our knowledge, this is the first work to study video action recognition under the generalized zero-shot setting. We first perform extensive empirical studies of several existing zero-shot learning approaches under this new setting on web-scale video data. Our experimental results demonstrate that, under the generalized setting, typical zero-shot learning methods are no longer effective on the dataset we used. Then, we propose a method for action recognition based on generalized zero-shot learning, which transfers knowledge from web video to detect anomalous actions in surveillance videos. To verify the effectiveness of our proposed method, we further construct a new surveillance video dataset consisting of nine action classes related to public safety.
Tasks	Temporal Action Localization, Zero-Shot Learning
Published	2017-10-20
URL	http://arxiv.org/abs/1710.07455v1
PDF	http://arxiv.org/pdf/1710.07455v1.pdf
PWC	https://paperswithcode.com/paper/generalized-zero-shot-learning-for-action
Repo
Framework

Fine-grained Visual-textual Representation Learning


Title	Fine-grained Visual-textual Representation Learning
Authors	Xiangteng He, Yuxin Peng
Abstract	Fine-grained visual categorization aims to recognize hundreds of subcategories belonging to the same basic-level category, which is a highly challenging task due to the subtle and local visual distinctions among similar subcategories. Most existing methods learn part detectors to discover discriminative regions for better categorization performance. However, not all parts are beneficial and indispensable for visual categorization, and the choice of the number of part detectors relies heavily on prior knowledge as well as experimental validation. When we describe the object in an image through text, we mainly focus on its pivotal characteristics and rarely pay attention to common characteristics or background areas. This is an involuntary transfer from human visual attention to textual attention, which means that textual attention tells us how many and which parts are discriminative and significant for categorization. Textual attention can therefore help us discover visual attention in the image. Inspired by this, we propose a fine-grained visual-textual representation learning (VTRL) approach, whose main contributions are: (1) Fine-grained visual-textual pattern mining discovers discriminative visual-textual pairwise information for boosting categorization performance by jointly modeling vision and text with generative adversarial networks (GANs), which automatically and adaptively discovers discriminative parts. (2) Visual-textual representation learning combines visual and textual information, preserving the intra-modality and inter-modality information to generate a complementary fine-grained representation and further improve categorization performance.
Tasks	Fine-Grained Visual Categorization, Representation Learning
Published	2017-08-31
URL	http://arxiv.org/abs/1709.00340v4
PDF	http://arxiv.org/pdf/1709.00340v4.pdf
PWC	https://paperswithcode.com/paper/fine-grained-visual-textual-representation
Repo
Framework

Exploring the Imposition of Synaptic Precision Restrictions For Evolutionary Synthesis of Deep Neural Networks


Title	Exploring the Imposition of Synaptic Precision Restrictions For Evolutionary Synthesis of Deep Neural Networks
Authors	Mohammad Javad Shafiee, Francis Li, Alexander Wong
Abstract	A key contributing factor to the incredible success of deep neural networks has been the significant rise of massively parallel computing devices, allowing researchers to greatly increase the size and depth of deep neural networks, leading to significant improvements in modeling accuracy. Although deeper, larger, or more complex deep neural networks have shown considerable promise, the computational complexity of such networks is a major barrier to utilization in resource-starved scenarios. We explore the synaptogenesis of deep neural networks in the formation of efficient deep neural network architectures within an evolutionary deep intelligence framework, where a probabilistic generative modeling strategy is introduced to stochastically synthesize increasingly efficient yet effective offspring deep neural networks over generations, mimicking evolutionary processes such as heredity, random mutation, and natural selection in a probabilistic manner. In this study, we primarily explore the imposition of synaptic precision restrictions and its impact on the evolutionary synthesis of deep neural networks to synthesize more efficient network architectures tailored for resource-starved scenarios. Experimental results show significant improvements in synaptic efficiency (~10X decrease for GoogLeNet-based DetectNet) and inference speed (>5X increase for GoogLeNet-based DetectNet) while preserving modeling accuracy.
Tasks
Published	2017-07-01
URL	http://arxiv.org/abs/1707.00095v1
PDF	http://arxiv.org/pdf/1707.00095v1.pdf
PWC	https://paperswithcode.com/paper/exploring-the-imposition-of-synaptic
Repo
Framework

One-Shot Fine-Grained Instance Retrieval


Title	One-Shot Fine-Grained Instance Retrieval
Authors	Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
Abstract	Fine-Grained Visual Categorization (FGVC) has achieved significant progress recently. However, the number of fine-grained species could be huge and dynamically increasing in real scenarios, making it difficult to recognize unseen objects under the current FGVC framework. This raises the open issue of performing large-scale fine-grained identification without a complete training set. To address this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR). “One-Shot” denotes the ability to identify unseen objects through a fine-grained retrieval task assisted by an incomplete auxiliary training set. This paper first presents a detailed description of the OSFGIR task and our collected OSFGIR-378K dataset. Next, we propose the Convolutional and Normalization Networks (CN-Nets), learned on the auxiliary dataset, to generate a concise and discriminative representation. Finally, we present a coarse-to-fine retrieval framework consisting of three components: coarse retrieval, fine-grained retrieval, and query expansion. The framework progressively retrieves images with similar semantics and performs fine-grained identification. Experiments show that our OSFGIR framework achieves significantly better accuracy and efficiency than existing FGVC and image retrieval methods, and thus could be a better solution for large-scale fine-grained object identification.
Tasks	Fine-Grained Visual Categorization, Image Retrieval
Published	2017-07-04
URL	http://arxiv.org/abs/1707.00811v1
PDF	http://arxiv.org/pdf/1707.00811v1.pdf
PWC	https://paperswithcode.com/paper/one-shot-fine-grained-instance-retrieval
Repo
Framework

Exploiting Active Subspaces in Global Optimization: How Complex is your Problem?


Title	Exploiting Active Subspaces in Global Optimization: How Complex is your Problem?
Authors	Pramudita Satria Palar, Koji Shimoyama
Abstract	When applying an optimization method to a real-world problem, prior knowledge and preliminary analysis of the landscape of a global optimization problem can give us insight into the complexity of the problem. This knowledge can better inform us in deciding what optimization method should be used to tackle the problem. However, this analysis becomes problematic when the dimensionality of the problem is high. This paper presents a framework for taking a deeper look at the global optimization problem to be tackled by analyzing a low-dimensional representation of the problem obtained by discovering its active subspaces. The virtue of this is that the problem’s complexity can be visualized in a one- or two-dimensional plot, thus allowing one to get a better grip on the problem’s difficulty. One could then have a better idea of the complexity of the problem when choosing a global optimizer or the type of surrogate model to be used. Furthermore, we also demonstrate how the active subspaces can be used to perform design exploration and analysis.
Tasks
Published	2017-07-09
URL	http://arxiv.org/abs/1707.02533v1
PDF	http://arxiv.org/pdf/1707.02533v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-active-subspaces-in-global
Repo
Framework
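
The abstract above does not say how the active subspace is computed. A minimal sketch, assuming the usual construction (the dominant eigenvectors of the average outer product of sampled gradients), is given below. The function names, the power-iteration solver, and the toy objective are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def gradient_covariance(grad_fn, samples):
    """Average outer product of gradients, C ~ E[grad f(x) grad f(x)^T]."""
    d = len(samples[0])
    C = [[0.0] * d for _ in range(d)]
    for x in samples:
        g = grad_fn(x)
        for i in range(d):
            for j in range(d):
                C[i][j] += g[i] * g[j] / len(samples)
    return C

def dominant_direction(C, iters=200):
    """Power iteration: unit vector along the leading eigenvector of C."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # C is the zero matrix; no direction to report
        v = [x / norm for x in w]
    return v

# Toy objective (hypothetical): f(x) = (x0 + 2*x1)**2 varies only along (1, 2).
def toy_grad(x):
    s = x[0] + 2.0 * x[1]
    return [2.0 * s, 4.0 * s]

rng = random.Random(0)
points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
direction = dominant_direction(gradient_covariance(toy_grad, points))
```

Projecting each sample onto `direction` and plotting the objective against that single coordinate gives the kind of one-dimensional view the abstract describes.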

Context encoding enables machine learning-based quantitative photoacoustics


Title	Context encoding enables machine learning-based quantitative photoacoustics
Authors	Thomas Kirchner, Janek Gröhl, Lena Maier-Hein
Abstract	Real-time monitoring of functional tissue parameters, such as local blood oxygenation, based on optical imaging could provide groundbreaking advances in the diagnosis and interventional therapy of various diseases. While photoacoustic (PA) imaging is a novel modality with great potential to measure optical absorption deep inside tissue, quantification of the measurements remains a major challenge. In this paper, we introduce the first machine learning based approach to quantitative PA imaging (qPAI), which relies on learning the fluence in a voxel to deduce the corresponding optical absorption. The method encodes relevant information of the measured signal and the characteristics of the imaging system in voxel-based feature vectors, which allow the generation of thousands of training samples from a single simulated PA image. Comprehensive in silico experiments suggest that context encoding (CE)-qPAI enables highly accurate and robust quantification of the local fluence and thereby the optical absorption from PA images.
Tasks
Published	2017-06-12
URL	http://arxiv.org/abs/1706.03595v2
PDF	http://arxiv.org/pdf/1706.03595v2.pdf
PWC	https://paperswithcode.com/paper/context-encoding-enables-machine-learning
Repo
Framework

Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification


Title	Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification
Authors	Bo-Ru Lu, Frank Shyu, Yun-Nung Chen, Hung-Yi Lee, Lin-shan Lee
Abstract	Connectionist temporal classification (CTC) is a powerful approach for sequence-to-sequence learning, and has been widely used in speech recognition. The central ideas of CTC include adding a label “blank” during training. With this mechanism, CTC eliminates the need for segment alignment, and hence has been applied to various sequence-to-sequence learning problems. In this work, we applied CTC to abstractive summarization for spoken content. The “blank” in this case implies that the corresponding input data are less important or noisy and can thus be ignored. This approach was shown to outperform the existing methods in terms of ROUGE scores on the Chinese Gigaword and MATBN corpora. This approach also has the nice property that the ordering of words or characters in the input documents can be better preserved in the generated summaries.
Tasks	Abstractive Text Summarization, Speech Recognition
Published	2017-09-16
URL	http://arxiv.org/abs/1709.05475v2
PDF	http://arxiv.org/pdf/1709.05475v2.pdf
PWC	https://paperswithcode.com/paper/order-preserving-abstractive-summarization
Repo
Framework
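
To make the role of the “blank” label concrete, here is a minimal sketch of the standard CTC collapse rule: merge consecutive repeated labels, then drop blanks. The blank symbol `_` and the function name are assumptions for illustration; the paper's actual decoding procedure is not described in the abstract.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen for this example only

def ctc_collapse(path, blank=BLANK):
    """Map a per-position label path to an output sequence.

    Consecutive duplicates are merged first, then blanks are removed,
    so a blank between two equal labels keeps both of them.
    """
    merged = [label for label, _ in groupby(path)]
    return [label for label in merged if label != blank]

summary = ctc_collapse(list("aa_b_bb__c"))
```

Because the collapse step only merges and deletes labels and never reorders them, the output follows the order of the input positions, which is consistent with the order-preserving property the abstract mentions.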

Convergent Tree Backup and Retrace with Function Approximation


Title	Convergent Tree Backup and Retrace with Function Approximation
Authors	Ahmed Touati, Pierre-Luc Bacon, Doina Precup, Pascal Vincent
Abstract	Off-policy learning is key to scaling up reinforcement learning, as it allows learning about a target policy from the experience generated by a different behavior policy. Unfortunately, it has been challenging to combine off-policy learning with function approximation and multi-step bootstrapping in a way that leads to both stable and efficient algorithms. In this work, we show that the Tree Backup and Retrace algorithms are unstable with linear function approximation, both in theory and in practice with specific examples. Based on our analysis, we then derive stable and efficient gradient-based algorithms using a quadratic convex-concave saddle-point formulation. By exploiting the problem structure proper to these algorithms, we are able to provide convergence guarantees and finite-sample bounds. The applicability of our new analysis also goes beyond Tree Backup and Retrace and allows us to provide new convergence rates for the GTD and GTD2 algorithms without having recourse to projections or Polyak averaging.
Tasks
Published	2017-05-25
URL	http://arxiv.org/abs/1705.09322v4
PDF	http://arxiv.org/pdf/1705.09322v4.pdf
PWC	https://paperswithcode.com/paper/convergent-tree-backup-and-retrace-with
Repo
Framework

Proxy Non-Discrimination in Data-Driven Systems


Title	Proxy Non-Discrimination in Data-Driven Systems
Authors	Anupam Datta, Matt Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen
Abstract	Machine learnt systems inherit biases against protected classes, historically disparaged groups, from training data. Usually, these biases are not explicit: they rely on subtle correlations discovered by training algorithms and are therefore difficult to detect. We formalize proxy discrimination in data-driven systems, a class of properties indicative of bias, as the presence of protected class correlates that have causal influence on the system’s output. We evaluate an implementation on a corpus of social datasets, demonstrating how to validate systems against these properties and to repair violations where they occur.
Tasks
Published	2017-07-25
URL	http://arxiv.org/abs/1707.08120v1
PDF	http://arxiv.org/pdf/1707.08120v1.pdf
PWC	https://paperswithcode.com/paper/proxy-non-discrimination-in-data-driven
Repo
Framework

Effective scaling registration approach by imposing the emphasis on the scale factor


Title	Effective scaling registration approach by imposing the emphasis on the scale factor
Authors	Minmin Xu, Siyu Xu, Jihua Zhu, Yaochen Li, Jun Wang, Huimin Lu
Abstract	This paper proposes an effective approach for the scaling registration of $m$-D point sets. Unlike rigid registration, scaling registration cannot be formulated as the common least-squares function because of the ill-posed problem caused by the scale factor. Therefore, this paper designs a novel objective function for the scaling registration problem. This objective function takes the form of a rational fraction, where the numerator is the least-squares error and the denominator is the square of the scale factor. By placing this emphasis on the scale factor, the ill-posed problem can be avoided in scaling registration. The new objective function can then be solved by the proposed scaling iterative closest point (ICP) algorithm, which can obtain the optimal scaling transformation. For practical applications, the scaling ICP algorithm is further extended to align partially overlapping point sets. Finally, the proposed approach is tested on public data sets and applied to merging grid maps of different resolutions. Experimental results demonstrate its superiority over previous approaches in efficiency and robustness.
Tasks
Published	2017-04-28
URL	http://arxiv.org/abs/1705.00086v2
PDF	http://arxiv.org/pdf/1705.00086v2.pdf
PWC	https://paperswithcode.com/paper/effective-scaling-registration-approach-by
Repo
Framework
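
The objective described in the abstract (least-squares error divided by the squared scale factor) can be sketched for the 2-D case as follows. This is one reading of the abstract, not the paper's exact formulation: the parametrization by a rotation angle, the assumption that correspondences are already known and paired by index, and all names are illustrative.

```python
import math

def scaled_objective(src, dst, scale, angle, shift):
    """Least-squares error of a 2-D similarity transform, divided by scale**2.

    src and dst are equal-length lists of (x, y) points, assumed to be
    paired by index; shift is an (x, y) translation.
    """
    c, s = math.cos(angle), math.sin(angle)
    err = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        tx = scale * (c * px - s * py) + shift[0]
        ty = scale * (s * px + c * py) + shift[1]
        err += (tx - qx) ** 2 + (ty - qy) ** 2
    return err / scale ** 2

src = [(0.0, 0.0), (1.0, 0.0)]
dst = [(0.0, 0.0), (2.0, 0.0)]
exact = scaled_objective(src, dst, scale=2.0, angle=0.0, shift=(0.0, 0.0))
shrunk = scaled_objective(src, dst, scale=0.01, angle=0.0, shift=(1.0, 0.0))
```

In this sketch, shrinking the source set toward a single point leaves a finite numerator but a tiny denominator, so the objective grows instead of rewarding the degenerate solution.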

Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints


Title	Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
Authors	Wenlong Mou, Liwei Wang, Xiyu Zhai, Kai Zheng
Abstract	Algorithm-dependent generalization error bounds are central to statistical learning theory. A learning algorithm may use a large hypothesis space, but the limited number of iterations controls its model capacity and generalization error. The impacts of stochastic gradient methods on generalization error for non-convex learning problems not only have important theoretical consequences, but are also critical to the generalization errors of deep learning. In this paper, we study the generalization errors of Stochastic Gradient Langevin Dynamics (SGLD) with non-convex objectives. Two theories are proposed with non-asymptotic discrete-time analysis, using stability and PAC-Bayesian results respectively. The stability-based theory obtains a bound of $O\left(\frac{1}{n}L\sqrt{\beta T_k}\right)$, where $L$ is the uniform Lipschitz parameter, $\beta$ is the inverse temperature, and $T_k$ is the aggregated step sizes. For the PAC-Bayesian theory, though the bound has a slower $O(1/\sqrt{n})$ rate, the contribution of each step is shown to carry an exponentially decaying factor by imposing $\ell^2$ regularization, and the uniform Lipschitz constant is also replaced by the actual norms of gradients along the trajectory. Our bounds have no implicit dependence on dimensions, norms or other capacity measures of the parameters, which elegantly characterizes the phenomenon of “Fast Training Guarantees Generalization” in non-convex settings. This is the first algorithm-dependent result with reasonable dependence on aggregated step sizes for non-convex learning, and it has important implications for the statistical learning aspects of stochastic gradient methods in complicated models such as deep learning.
Tasks
Published	2017-07-19
URL	http://arxiv.org/abs/1707.05947v1
PDF	http://arxiv.org/pdf/1707.05947v1.pdf
PWC	https://paperswithcode.com/paper/generalization-bounds-of-sgld-for-non-convex
Repo
Framework
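
For readers unfamiliar with SGLD, a minimal sketch of one common form of the update is shown below: a gradient step plus Gaussian noise of scale $\sqrt{2\eta/\beta}$, where $\eta$ is the step size and $\beta$ the inverse temperature. The paper's exact parametrization may differ, and reading $T_k$ as the sum of the step sizes used so far is an assumption made here for illustration.

```python
import math
import random

def sgld_step(theta, grad, step, beta, rng):
    """One SGLD update: gradient descent plus Gaussian noise.

    theta and grad are equal-length lists of floats; beta is the inverse
    temperature, so a larger beta means less injected noise.
    """
    noise_scale = math.sqrt(2.0 * step / beta)
    return [t - step * g + noise_scale * rng.gauss(0.0, 1.0)
            for t, g in zip(theta, grad)]

def run_sgld(grad_fn, theta, steps, beta, seed=0):
    """Run SGLD and return (final theta, sum of step sizes)."""
    rng = random.Random(seed)
    aggregated = 0.0  # assumed reading of T_k: sum of step sizes so far
    for step in steps:
        theta = sgld_step(theta, grad_fn(theta), step, beta, rng)
        aggregated += step
    return theta, aggregated

# Toy quadratic objective f(theta) = 0.5 * ||theta||^2 (hypothetical example).
final_theta, total_step = run_sgld(lambda th: list(th), [1.0, -1.0],
                                   steps=[0.1] * 20, beta=100.0)
```

With these definitions, the stability bound quoted in the abstract grows with `total_step`, which is why a short or small-step run corresponds to a tighter bound.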

Residual Expansion Algorithm: Fast and Effective Optimization for Nonconvex Least Squares Problems


Title	Residual Expansion Algorithm: Fast and Effective Optimization for Nonconvex Least Squares Problems
Authors	Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa
Abstract	We propose the residual expansion (RE) algorithm: a global (or near-global) optimization method for nonconvex least squares problems. Unlike most existing nonconvex optimization techniques, the RE algorithm is not based on either stochastic or multi-point searches; therefore, it can achieve fast global optimization. Moreover, the RE algorithm is easy to implement and successful in high-dimensional optimization. The RE algorithm exhibits excellent empirical performance in terms of k-means clustering, point-set registration, optimized product quantization, and blind image deblurring.
Tasks	Blind Image Deblurring, Deblurring, Quantization
Published	2017-05-26
URL	http://arxiv.org/abs/1705.09549v1
PDF	http://arxiv.org/pdf/1705.09549v1.pdf
PWC	https://paperswithcode.com/paper/residual-expansion-algorithm-fast-and
Repo
Framework