January 27, 2020


Paper Group ANR 1236

The Similarity-Consensus Regularized Multi-view Learning for Dimension Reduction. Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering. Visualizing Point Cloud Classifiers by Curvature Smoothing. Computational Ceramicology. Improving N-gram Language Models with Pre-trained Deep Transformer. On-line Search History-assiste …

The Similarity-Consensus Regularized Multi-view Learning for Dimension Reduction


Title	The Similarity-Consensus Regularized Multi-view Learning for Dimension Reduction
Authors	Xiangzhu Meng, Huibing Wang, Lin Feng
Abstract	During the last decades, learning a low-dimensional space with discriminative information for dimension reduction (DR) has gained a surge of interest. However, these DR methods struggle to achieve satisfactory performance when facing features from multiple views. In multi-view learning problems, one instance can be represented by multiple heterogeneous features, which are highly related but sometimes look different from each other. In addition, correlations between features from multiple views always vary greatly, which challenges the capability of multi-view learning methods. Consequently, constructing a multi-view learning framework with generalization and scalability, which could take advantage of multi-view information as much as possible, is extremely necessary but challenging. To this end, this paper proposes a novel multi-view learning framework based on similarity consensus, which makes full use of correlations among multi-view features while considering the scalability and robustness of the framework. It aims to straightforwardly extend existing DR methods into the multi-view learning domain by preserving the similarity between different views to capture the low-dimensional embedding. Two schemes based on pairwise-consensus and centroid-consensus are separately proposed to force multiple views to learn from each other, and then an iterative alternating strategy is developed to obtain the optimal solution. The proposed method is evaluated on 5 benchmark datasets, and comprehensive experiments show that the proposed multi-view framework yields comparable and promising performance relative to previous approaches in the recent literature.
Tasks	Dimensionality Reduction, Multi-View Learning
Published	2019-11-15
URL	https://arxiv.org/abs/1911.07656v1
PDF	https://arxiv.org/pdf/1911.07656v1.pdf
PWC	https://paperswithcode.com/paper/the-similarity-consensus-regularized-multi
Repo
Framework

Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering


Title	Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering
Authors	Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum
Abstract	Multi-hop question answering (QA) requires an information retrieval (IR) system that can find multiple pieces of supporting evidence needed to answer the question, making the retrieval process very challenging. This paper introduces an IR technique that uses information about entities present in the initially retrieved evidence to learn to "hop" to other relevant evidence. In a setting with more than 5 million Wikipedia paragraphs, our approach leads to a significant boost in retrieval performance. The retrieved evidence also increased the performance of an existing QA model (without any training) on the HotpotQA benchmark by 10.59 F1.
Tasks	Information Retrieval, Question Answering
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07598v1
PDF	https://arxiv.org/pdf/1909.07598v1.pdf
PWC	https://paperswithcode.com/paper/multi-step-entity-centric-information
Repo
Framework

Visualizing Point Cloud Classifiers by Curvature Smoothing


Title	Visualizing Point Cloud Classifiers by Curvature Smoothing
Authors	Chen Ziwen, Wenxuan Wu, Zhongang Qi, Li Fuxin
Abstract	Recently, several networks that operate directly on point clouds have been proposed. There is significant utility in understanding them better, so that humans can understand more about the mechanisms by which those networks classify point clouds, potentially helping to diagnose them and to design better architectures and data augmentation pipelines. In this paper, we propose a novel approach to visualize important features used in classification decisions of point cloud networks. Following ideas in visualizing 2-D convolutional networks, our approach is based on gradually smoothing parts of the point cloud. However, different from the 2-D case, we smooth the curvature of the point cloud to remove sharp shape features. The resulting point cloud is then evaluated on the original point cloud network to see whether the performance has dropped or remained the same, from which parts that are important to the point cloud classification are identified. A technical contribution of the paper is an approximated curvature smoothing algorithm, which can smoothly transition from the original point cloud to one of constant curvature, such as a uniform sphere. With this smoothing algorithm, we propose PCI-GOS, a 3-D extension of the Integrated-Gradients Optimized Saliency (I-GOS) algorithm, as a perturbation-based visualization technique realized on 3-D shapes. Experimental results revealed insights into these classifiers.
Tasks	Data Augmentation
Published	2019-11-23
URL	https://arxiv.org/abs/1911.10415v2
PDF	https://arxiv.org/pdf/1911.10415v2.pdf
PWC	https://paperswithcode.com/paper/visualizing-point-cloud-classifiers-by
Repo
Framework

Computational Ceramicology


Title	Computational Ceramicology
Authors	Barak Itkin, Lior Wolf, Nachum Dershowitz
Abstract	Field archeologists are called upon to identify potsherds, for which purpose they rely on their experience and on reference works. We have developed two complementary machine-learning tools to propose identifications based on images captured on site. One method relies on the shape of the fracture outline of a sherd; the other is based on decorative features. For the outline-identification tool, a novel deep-learning architecture was employed, one that integrates shape information from points along the inner and outer surfaces. The decoration classifier is based on relatively standard architectures used in image recognition. In both cases, training the classifiers required tackling challenges that arise when working with real-world archeological data: paucity of labeled data; extreme imbalance between instances of the different categories; and the need to avoid neglecting rare classes and to take note of minute distinguishing features of some classes. The scarcity of training data was overcome by using synthetically-produced virtual potsherds and by employing multiple data-augmentation techniques. A novel form of training loss allowed us to overcome the problems caused by under-populated classes and non-homogeneous distribution of discriminative features.
Tasks	Data Augmentation
Published	2019-11-22
URL	https://arxiv.org/abs/1911.09960v1
PDF	https://arxiv.org/pdf/1911.09960v1.pdf
PWC	https://paperswithcode.com/paper/computational-ceramicology
Repo
Framework

Improving N-gram Language Models with Pre-trained Deep Transformer


Title	Improving N-gram Language Models with Pre-trained Deep Transformer
Authors	Yiren Wang, Hongzhao Huang, Zhe Liu, Yutong Pang, Yongqiang Wang, ChengXiang Zhai, Fuchun Peng
Abstract	Although n-gram language models (LMs) have been outperformed by state-of-the-art neural LMs, they are still widely used in speech recognition due to their high efficiency in inference. In this paper, we demonstrate that n-gram LMs can be improved by neural LMs through a text-generation-based data augmentation method. In contrast to previous approaches, we employ a strategy of large-scale general-domain pre-training followed by in-domain fine-tuning to construct deep Transformer-based neural LMs. A large amount of in-domain text data is generated with the well-trained deep Transformer to construct new n-gram LMs, which are then interpolated with baseline n-gram systems. Empirical studies on different speech recognition tasks show that the proposed approach can effectively improve recognition accuracy. In particular, our proposed approach brings a significant relative word error rate reduction of up to 6.0% for domains with limited in-domain data.
Tasks	Data Augmentation, Speech Recognition, Text Generation
Published	2019-11-22
URL	https://arxiv.org/abs/1911.10235v1
PDF	https://arxiv.org/pdf/1911.10235v1.pdf
PWC	https://paperswithcode.com/paper/improving-n-gram-language-models-with-pre
Repo
Framework
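
The interpolation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes plain linear interpolation with a single weight `lam`, a common way to combine n-gram models; the abstract does not confirm that this is the authors' exact scheme. The function names and the toy probabilities are hypothetical.

```python
def interpolate(p_baseline, p_augmented, lam=0.5):
    """Linearly interpolate two conditional word probabilities."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_baseline + (1.0 - lam) * p_augmented


def interpolate_distribution(dist_baseline, dist_augmented, lam=0.5):
    """Interpolate two next-word distributions given as dicts word -> prob."""
    vocab = set(dist_baseline) | set(dist_augmented)
    return {
        w: interpolate(dist_baseline.get(w, 0.0), dist_augmented.get(w, 0.0), lam)
        for w in vocab
    }


# Toy next-word distributions for one history (made-up numbers, not from the paper).
baseline = {"call": 0.6, "text": 0.4}
augmented = {"call": 0.3, "text": 0.5, "email": 0.2}
mixed = interpolate_distribution(baseline, augmented, lam=0.7)
```

Because both inputs sum to one and the weights sum to one, the mixed distribution also sums to one, and a word unseen by the baseline model ("email" here) still receives some probability mass.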

On-line Search History-assisted Restart Strategy for Covariance Matrix Adaptation Evolution Strategy


Title	On-line Search History-assisted Restart Strategy for Covariance Matrix Adaptation Evolution Strategy
Authors	Yang Lou, Shiu Yin Yuen, Guanrong Chen, Xin Zhang
Abstract	A restart strategy helps the covariance matrix adaptation evolution strategy (CMA-ES) to increase the probability of finding the global optimum in optimization, whereas a single-run CMA-ES is easily trapped in local optima. In this paper, the continuous non-revisiting genetic algorithm (cNrGA) is used to help CMA-ES to achieve multiple restarts from different sub-regions of the search space. The CMA-ES with on-line search history-assisted restart strategy (HR-CMA-ES) is proposed. The entire on-line search history of cNrGA is stored in a binary space partitioning (BSP) tree, which is effective for performing local search. A frequently sampled sub-region is reflected by a deep position in the BSP tree. When leaf nodes are located deeper than a threshold, the corresponding sub-region is considered a region of interest (ROI). In HR-CMA-ES, cNrGA is responsible for global exploration and for suggesting ROIs in which CMA-ES performs exploitation within or around the ROI. CMA-ES restarts independently in each suggested ROI. The non-revisiting mechanism of cNrGA avoids suggesting the same ROI a second time. Experimental results on the CEC 2013 and 2017 benchmark suites show that HR-CMA-ES performs better than both CMA-ES and cNrGA. A positive synergy is observed from the memetic cooperation of the two algorithms.
Tasks
Published	2019-03-16
URL	http://arxiv.org/abs/1903.09085v1
PDF	http://arxiv.org/pdf/1903.09085v1.pdf
PWC	https://paperswithcode.com/paper/on-line-search-history-assisted-restart
Repo
Framework

Generating Diverse Translation by Manipulating Multi-Head Attention


Title	Generating Diverse Translation by Manipulating Multi-Head Attention
Authors	Zewei Sun, Shujian Huang, Hao-Ran Wei, Xin-yu Dai, Jiajun Chen
Abstract	The Transformer model has been widely used on machine translation tasks and has obtained state-of-the-art results. In this paper, we report an interesting phenomenon in its encoder-decoder multi-head attention: different attention heads of the final decoder layer align to different word translation candidates. We empirically verify this discovery and propose a method to generate diverse translations by manipulating heads. Furthermore, we make use of these diverse translations with the back-translation technique for better data augmentation. Experimental results show that our method generates diverse translations without a severe drop in translation quality. Experiments also show that back-translation with these diverse translations could bring significant improvement in performance on translation tasks. An auxiliary experiment on a conversation response generation task proves the effect of diversity as well.
Tasks	Data Augmentation, Machine Translation
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09333v1
PDF	https://arxiv.org/pdf/1911.09333v1.pdf
PWC	https://paperswithcode.com/paper/generating-diverse-translation-by
Repo
Framework

Hierarchical Average Reward Policy Gradient Algorithms


Title	Hierarchical Average Reward Policy Gradient Algorithms
Authors	Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar
Abstract	Option-critic learning is a general-purpose reinforcement learning (RL) framework that aims to address the issue of long term credit assignment by leveraging temporal abstractions. However, when dealing with extended timescales, discounting future rewards can lead to incorrect credit assignments. In this work, we address this issue by extending the hierarchical option-critic policy gradient theorem for the average reward criterion. Our proposed framework aims to maximize the long-term reward obtained in the steady-state of the Markov chain defined by the agent’s policy. Furthermore, we use an ordinary differential equation based approach for our convergence analysis and prove that the parameters of the intra-option policies, termination functions, and value functions, converge to their corresponding optimal values, with probability one. Finally, we illustrate the competitive advantage of learning options, in the average reward setting, on a grid-world environment with sparse rewards.
Tasks
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08826v1
PDF	https://arxiv.org/pdf/1911.08826v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-average-reward-policy-gradient
Repo
Framework
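
For readers unfamiliar with the criterion named in the abstract above, the contrast between the discounted and the average reward objectives can be written out as follows. This is the standard textbook form, not a formula taken from the paper; the symbols $\pi$ (policy), $r_t$ (reward at step $t$), and $\gamma$ (discount factor) are assumed notation.

$$
J_\gamma(\pi) = \mathbb{E}_\pi\left[\sum_{t=0}^{\infty} \gamma^{t} r_t\right], \quad 0 \le \gamma < 1,
\qquad
\rho(\pi) = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_\pi\left[\sum_{t=0}^{T-1} r_t\right].
$$

The discounted return weights a reward $t$ steps away by $\gamma^{t}$, so distant rewards contribute little, whereas the average reward $\rho(\pi)$ weights all time steps equally.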

Efficient Privacy-Preserving Nonconvex Optimization


Title	Efficient Privacy-Preserving Nonconvex Optimization
Authors	Lingxiao Wang, Bargav Jayaraman, David Evans, Quanquan Gu
Abstract	While many solutions for privacy-preserving convex empirical risk minimization (ERM) have been developed, privacy-preserving nonconvex ERM remains challenging. In this paper, we study nonconvex ERM, which takes the form of minimizing a finite-sum of nonconvex loss functions over a training set. To achieve both efficiency and strong privacy guarantees, we propose a differentially-private stochastic gradient descent algorithm for nonconvex ERM, and provide a tight analysis of its privacy and utility guarantees, as well as its gradient complexity. We show that our proposed algorithm can substantially reduce gradient complexity while matching the best-known utility guarantee obtained by Wang et al. (2017). We extend our algorithm to the distributed setting using secure multi-party computation, and show that it is possible for a distributed algorithm to match the privacy and utility guarantees of a centralized algorithm in this setting. Our experiments on benchmark nonconvex ERM problems and real datasets demonstrate superior performance in terms of both training time and utility gains compared with previous differentially-private methods using the same privacy budgets.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.13659v1
PDF	https://arxiv.org/pdf/1910.13659v1.pdf
PWC	https://paperswithcode.com/paper/efficient-privacy-preserving-nonconvex
Repo
Framework
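
As a rough illustration of what a differentially-private stochastic gradient step can look like, the sketch below uses the widely known clip-then-add-Gaussian-noise recipe. This is an assumption made for illustration: the abstract above does not describe the authors' algorithm in this detail, and the function names, parameters, and noise scaling are hypothetical. Calibrating `noise_std` to a specific privacy budget is outside the scope of the sketch.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(w, per_example_grads, lr, max_norm, noise_std, rng):
    """One noisy step: clip each example's gradient, sum them,
    add Gaussian noise to the sum, then average and descend."""
    n = len(per_example_grads)
    clipped = [clip(g, max_norm) for g in per_example_grads]
    summed = [sum(col) for col in zip(*clipped)]
    # Noise scale is proportional to the clipping norm (assumed convention).
    noisy = [s + rng.gauss(0.0, noise_std * max_norm) for s in summed]
    return [wi - lr * gi / n for wi, gi in zip(w, noisy)]


rng = random.Random(0)
w_next = dp_sgd_step([0.0, 0.0], [[3.0, 4.0], [0.0, 0.0]],
                     lr=1.0, max_norm=1.0, noise_std=0.0, rng=rng)
```

With `noise_std=0.0` the step reduces to ordinary SGD on clipped gradients, which makes the clipping behavior easy to check in isolation.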

Setup of a Recurrent Neural Network as a Body Model for Solving Inverse and Forward Kinematics as well as Dynamics for a Redundant Manipulator


Title	Setup of a Recurrent Neural Network as a Body Model for Solving Inverse and Forward Kinematics as well as Dynamics for a Redundant Manipulator
Authors	Malte Schilling
Abstract	An internal model of one’s own body can be assumed to be a fundamental and evolutionarily early representation, as it is present throughout the animal kingdom. Such functional models are, on the one hand, required in motor control, for example for solving the inverse kinematic or dynamic task in goal-directed movements or a forward task in ballistic movements. On the other hand, such models are recruited in cognitive tasks such as planning ahead or observing the actions of a conspecific. Here, we present a functional internal body model that is based on the Mean of Multiple Computations principle. For the first time, such a model is completely realized in a recurrent neural network, as the necessary normalization steps are integrated into the neural model itself. Secondly, a dynamic extension is applied to the model. It is shown how the neural network solves a series of inverse tasks. Furthermore, emerging representations in transformational layers are analyzed that show a form of prototypical population coding as found in place or direction cells.
Tasks
Published	2019-04-12
URL	http://arxiv.org/abs/1904.10926v1
PDF	http://arxiv.org/pdf/1904.10926v1.pdf
PWC	https://paperswithcode.com/paper/190410926
Repo
Framework

FootAndBall: Integrated player and ball detector


Title	FootAndBall: Integrated player and ball detector
Authors	Jacek Komorowski, Grzegorz Kurzejamski, Grzegorz Sarwas
Abstract	The paper describes a deep neural network-based detector dedicated to ball and player detection in high-resolution, long-shot video recordings of soccer matches. The detector, dubbed FootAndBall, has an efficient fully convolutional architecture and can operate on an input video stream with an arbitrary resolution. It produces a ball confidence map encoding the position of the detected ball, a player confidence map, and a player bounding boxes tensor encoding players’ positions and bounding boxes. The network uses the Feature Pyramid Network design pattern, where lower-level features with higher spatial resolution are combined with higher-level features with a bigger receptive field. This improves discriminability of small objects (the ball), as larger visual context around the object of interest is taken into account for the classification. Due to its specialized design, the network has two orders of magnitude fewer parameters than a generic deep neural network-based object detector, such as SSD or YOLO. This allows real-time processing of a high-resolution input video stream.
Tasks
Published	2019-12-10
URL	https://arxiv.org/abs/1912.05445v1
PDF	https://arxiv.org/pdf/1912.05445v1.pdf
PWC	https://paperswithcode.com/paper/footandball-integrated-player-and-ball
Repo
Framework

Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances


Title	Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances
Authors	Bugra Can, Mert Gurbuzbalaban, Lingjiong Zhu
Abstract	Momentum methods such as Polyak’s heavy ball (HB) method, Nesterov’s accelerated gradient (AG) as well as accelerated projected gradient (APG) method have been commonly used in machine learning practice, but their performance is quite sensitive to noise in the gradients. We study these methods under a first-order stochastic oracle model where noisy estimates of the gradients are available. For strongly convex problems, we show that the distribution of the iterates of AG converges with the accelerated $O(\sqrt{\kappa}\log(1/\varepsilon))$ linear rate to a ball of radius $\varepsilon$ centered at a unique invariant distribution in the 1-Wasserstein metric where $\kappa$ is the condition number as long as the noise variance is smaller than an explicit upper bound we can provide. Our analysis also certifies linear convergence rates as a function of the stepsize, momentum parameter and the noise variance; recovering the accelerated rates in the noiseless case and quantifying the level of noise that can be tolerated to achieve a given performance. In the special case of strongly convex quadratic objectives, we can show accelerated linear rates in the $p$-Wasserstein metric for any $p\geq 1$ with improved sensitivity to noise for both AG and HB through a non-asymptotic analysis under some additional assumptions on the noise structure. Our analysis for HB and AG also leads to improved non-asymptotic convergence bounds in suboptimality for both deterministic and stochastic settings which is of independent interest. To the best of our knowledge, these are the first linear convergence results for stochastic momentum methods under the stochastic oracle model. We also extend our results to the APG method and weakly convex functions showing accelerated rates when the noise magnitude is sufficiently small.
Tasks
Published	2019-01-22
URL	https://arxiv.org/abs/1901.07445v2
PDF	https://arxiv.org/pdf/1901.07445v2.pdf
PWC	https://paperswithcode.com/paper/accelerated-linear-convergence-of-stochastic
Repo
Framework
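
To make the two momentum methods named in the abstract above concrete, here is a minimal sketch of heavy ball (HB) and Nesterov's accelerated gradient (AG) with a noisy gradient oracle. The toy quadratic objective, the constants `MU` and `L`, the additive Gaussian noise model, and the textbook step size and momentum choices are all assumptions for illustration, not parameters taken from the paper.

```python
import math
import random

MU, L = 1.0, 100.0  # strong-convexity and smoothness constants of the toy objective


def noisy_grad(x, sigma, rng):
    """Gradient of f(x) = 0.5 * (MU * x[0]**2 + L * x[1]**2) plus Gaussian noise."""
    return [MU * x[0] + rng.gauss(0.0, sigma), L * x[1] + rng.gauss(0.0, sigma)]


def heavy_ball(x0, steps, sigma, rng):
    """Polyak's heavy ball: gradient step at x plus a momentum term."""
    kappa = L / MU
    alpha = 4.0 / (math.sqrt(L) + math.sqrt(MU)) ** 2
    beta = ((math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)) ** 2
    x_prev, x = list(x0), list(x0)
    for _ in range(steps):
        g = noisy_grad(x, sigma, rng)
        x_next = [xi - alpha * gi + beta * (xi - pi) for xi, gi, pi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x


def nesterov_ag(x0, steps, sigma, rng):
    """Nesterov's AG: extrapolate to y first, then take the gradient step at y."""
    kappa = L / MU
    alpha = 1.0 / L
    beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    x_prev, x = list(x0), list(x0)
    for _ in range(steps):
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = noisy_grad(y, sigma, rng)
        x_next = [yi - alpha * gi for yi, gi in zip(y, g)]
        x_prev, x = x, x_next
    return x


rng = random.Random(0)
x_hb = heavy_ball([1.0, 1.0], 500, 0.0, rng)
x_ag = nesterov_ag([1.0, 1.0], 500, 0.0, rng)
```

With `sigma=0.0` both methods drive the iterate to the minimizer at the origin; with `sigma > 0` the iterates instead fluctuate around it, which is the regime the abstract analyzes in distribution.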

Many-to-Many Voice Conversion using Cycle-Consistent Variational Autoencoder with Multiple Decoders


Title	Many-to-Many Voice Conversion using Cycle-Consistent Variational Autoencoder with Multiple Decoders
Authors	Keonnyeong Lee, In-Chul Yoo, Dongsuk Yook
Abstract	One of the obstacles in many-to-many voice conversion is the requirement for parallel training data, which contain pairs of utterances with the same linguistic content spoken by different speakers. Since collecting such parallel data is a highly expensive task, many works have attempted to use non-parallel training data for many-to-many voice conversion. One such approach is using the variational autoencoder (VAE). Though it can handle many-to-many voice conversion without parallel training data, VAE-based voice conversion methods suffer from low sound quality of the converted speech. One major reason is that the VAE learns only the self-reconstruction path. The conversion path is not trained at all. In this paper, we propose a cycle consistency loss for VAE to explicitly learn the conversion path. In addition, we propose to use multiple decoders to further improve the sound quality of conventional VAE-based voice conversion methods. The effectiveness of the proposed method is validated using objective and subjective evaluations.
Tasks	Voice Conversion
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06805v4
PDF	https://arxiv.org/pdf/1909.06805v4.pdf
PWC	https://paperswithcode.com/paper/voice-conversion-using-cycle-consistent
Repo
Framework

Unsupervised Image Noise Modeling with Self-Consistent GAN


Title	Unsupervised Image Noise Modeling with Self-Consistent GAN
Authors	Hanshu Yan, Vincent Y. F. Tan, Wenhan Yang, Jiashi Feng
Abstract	Noise modeling lies at the heart of many image processing tasks. However, existing deep learning methods for noise modeling generally require clean and noisy image pairs for model training; these image pairs are difficult to obtain in many realistic scenarios. To ameliorate this problem, we propose a self-consistent GAN (SCGAN) that can directly extract noise maps from noisy images, thus enabling unsupervised noise modeling. In particular, the SCGAN introduces three novel self-consistent constraints that are complementary to one another, viz.: the noise model should produce a zero response over a clean input; the noise model should return the same output when fed with a specific pure noise input; and the noise model should also re-extract a pure noise map if the map is added to a clean image. These three constraints are simple yet effective. They jointly facilitate unsupervised learning of a noise model for various noise types. To demonstrate its wide applicability, we deploy the SCGAN on three image processing tasks including blind image denoising, rain streak removal, and noisy image super-resolution. The results demonstrate the effectiveness and superiority of our method over the state-of-the-art methods on a variety of benchmark datasets, even though the noise types vary significantly and paired clean images are not available.
Tasks	Denoising, Image Denoising, Image Super-Resolution, Super-Resolution
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05762v2
PDF	https://arxiv.org/pdf/1906.05762v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-image-noise-modeling-with-self
Repo
Framework

Community-preserving Graph Convolutions for Structural and Functional Joint Embedding of Brain Networks


Title	Community-preserving Graph Convolutions for Structural and Functional Joint Embedding of Brain Networks
Authors	Jiahao Liu, Guixiang Ma, Fei Jiang, Chun-Ta Lu, Philip S. Yu, Ann B. Ragin
Abstract	Brain networks have received considerable attention given their critical significance for understanding human brain organization, for investigating neurological disorders, and for clinical diagnostic applications. Structural brain networks (e.g., DTI) and functional brain networks (e.g., fMRI) are the primary networks of interest. Most existing works in brain network analysis focus on either structural or functional connectivity, and thus cannot leverage the complementary information from each other. Although multi-view learning methods have been proposed to learn from both networks (or views), these methods aim to reach a consensus among multiple views, and thus distinct intrinsic properties of each view may be ignored. How to jointly learn representations from structural and functional brain networks while preserving their inherent properties is a critical problem. In this paper, we propose a framework of Siamese community-preserving graph convolutional network (SCP-GCN) to learn the structural and functional joint embedding of brain networks. Specifically, we use graph convolutions to learn the structural and functional joint embedding, where the graph structure is defined by structural connectivity and node features are from the functional connectivity. Moreover, we propose to preserve the community structure of brain networks in the graph convolutions by considering the intra-community and inter-community properties in the learning process. Furthermore, we use a Siamese architecture, which models pair-wise similarity learning, to guide the learning process. To evaluate the proposed approach, we conduct extensive experiments on two real brain network datasets. The experimental results demonstrate the superior performance of the proposed approach in structural and functional joint embedding for neurological disorder analysis, indicating its promising value for clinical applications.
Tasks	Multi-View Learning
Published	2019-11-08
URL	https://arxiv.org/abs/1911.03583v1
PDF	https://arxiv.org/pdf/1911.03583v1.pdf
PWC	https://paperswithcode.com/paper/community-preserving-graph-convolutions-for
Repo
Framework