October 19, 2019


Paper Group ANR 211

Parallel-tempered Stochastic Gradient Hamiltonian Monte Carlo for Approximate Multimodal Posterior Sampling. Representing Sets as Summed Semantic Vectors. Performance Analysis of Robust Stable PID Controllers Using Dominant Pole Placement for SOPTD Process Models. On the overfly algorithm in deep learning of neural networks. The Importance of Being …

Parallel-tempered Stochastic Gradient Hamiltonian Monte Carlo for Approximate Multimodal Posterior Sampling


Title	Parallel-tempered Stochastic Gradient Hamiltonian Monte Carlo for Approximate Multimodal Posterior Sampling
Authors	Rui Luo, Qiang Zhang, Yuanyuan Liu
Abstract	We propose a new sampler that integrates the protocol of parallel tempering with the Nosé-Hoover (NH) dynamics. The proposed method can efficiently draw representative samples from complex posterior distributions with multiple isolated modes in the presence of noise arising from stochastic gradient. It potentially facilitates deep Bayesian learning on large datasets where complex multimodal posteriors and mini-batch gradient are encountered.
Tasks
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01181v2
PDF	http://arxiv.org/pdf/1812.01181v2.pdf
PWC	https://paperswithcode.com/paper/parallel-tempered-stochastic-gradient
Repo
Framework
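
As a rough illustration of the parallel tempering protocol mentioned in the abstract, the sketch below runs several replicas at different temperatures and occasionally swaps neighbouring replicas. It is not the authors' sampler: a plain random-walk Metropolis step stands in for the Nosé-Hoover dynamics, gradients are not stochastic, and the double-well `energy` function, the temperature ladder and the step size are all assumptions made for the example.

```python
import math
import random


def energy(x):
    # Hypothetical double-well potential with minima at x = -2 and x = +2.
    return (x * x - 4.0) ** 2 / 4.0


def swap_probability(e_i, e_j, t_i, t_j):
    # Replica-exchange acceptance: min(1, exp((1/t_i - 1/t_j) * (e_i - e_j))).
    log_ratio = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def metropolis_step(x, temperature, rng, step=0.5):
    # Random-walk Metropolis update (assumed stand-in for the NH dynamics).
    proposal = x + rng.gauss(0.0, step)
    log_accept = -(energy(proposal) - energy(x)) / temperature
    if log_accept >= 0 or rng.random() < math.exp(log_accept):
        return proposal
    return x


def parallel_tempering(temperatures, n_steps, seed=0):
    rng = random.Random(seed)
    states = [-2.0 for _ in temperatures]  # every replica starts in the left mode
    samples = []
    for _ in range(n_steps):
        # Local move for each replica at its own temperature.
        states = [metropolis_step(x, t, rng) for x, t in zip(states, temperatures)]
        # Propose one swap between a random pair of neighbouring temperatures.
        k = rng.randrange(len(temperatures) - 1)
        p = swap_probability(energy(states[k]), energy(states[k + 1]),
                             temperatures[k], temperatures[k + 1])
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[0])  # record only the coldest replica
    return samples
```

Calling `parallel_tempering([1.0, 2.0, 4.0], 5000)` returns the trajectory of the coldest replica. The hotter replicas cross the barrier between modes more easily, and the swaps pass those states down the temperature ladder.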

Representing Sets as Summed Semantic Vectors


Title	Representing Sets as Summed Semantic Vectors
Authors	Douglas Summers-Stay, Peter Sutor, Dandan Li
Abstract	Representing meaning in the form of high dimensional vectors is a common and powerful tool in biologically inspired architectures. While the meaning of a set of concepts can be summarized by taking a (possibly weighted) sum of their associated vectors, this has generally been treated as a one-way operation. In this paper we show how a technique built to aid sparse vector decomposition allows in many cases the exact recovery of the inputs and weights to such a sum, allowing a single vector to represent an entire set of vectors from a dictionary. We characterize the number of vectors that can be recovered under various conditions, and explore several ways such a tool can be used for vector-based reasoning.
Tasks
Published	2018-09-24
URL	http://arxiv.org/abs/1809.08823v1
PDF	http://arxiv.org/pdf/1809.08823v1.pdf
PWC	https://paperswithcode.com/paper/representing-sets-as-summed-semantic-vectors
Repo
Framework
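
The idea that a summed vector can be decoded back into its members can be illustrated with a toy example. The sketch below is not the sparse-decomposition technique the abstract refers to. It uses an unweighted sum and a simple correlation threshold, and it assumes a dictionary of random ±1 vectors whose dimension is large relative to the set size. All function names and sizes are hypothetical.

```python
import random


def make_dictionary(n_items, dim, seed=0):
    # Random bipolar (+1/-1) vectors; nearly orthogonal when dim is large.
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(n_items)]


def encode_set(dictionary, members):
    # Represent a set of dictionary indices by the sum of their vectors.
    total = [0] * len(dictionary[0])
    for idx in members:
        for d, value in enumerate(dictionary[idx]):
            total[d] += value
    return total


def decode_set(dictionary, summed):
    # A member correlates with the sum at roughly `dim`; a non-member near 0.
    dim = len(summed)
    recovered = set()
    for idx, vec in enumerate(dictionary):
        score = sum(a * b for a, b in zip(vec, summed))
        if score > dim / 2:
            recovered.add(idx)
    return recovered
```

For example, with `dictionary = make_dictionary(200, 2048)`, decoding `encode_set(dictionary, {3, 17, 42})` is expected to return the same three indices, because cross-talk between random vectors is small compared with the threshold. Recovery degrades as the set grows relative to the dimension, which is the kind of capacity question the paper characterizes.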

Performance Analysis of Robust Stable PID Controllers Using Dominant Pole Placement for SOPTD Process Models


Title	Performance Analysis of Robust Stable PID Controllers Using Dominant Pole Placement for SOPTD Process Models
Authors	Saptarshi Das, Kaushik Halder, Amitava Gupta
Abstract	This paper derives new formulations for designing dominant pole placement based proportional-integral-derivative (PID) controllers to handle second order processes with time delays (SOPTD). Previously, similar attempts have been made for pole placement in delay-free systems. Upon Padé approximation, the time delay term manifests itself as a higher order system with a variable number of interlaced poles and zeros, which makes it difficult to achieve precise pole placement control. We report the analytical expressions to constrain the closed loop dominant and non-dominant poles at the desired locations in the complex s-plane, using a third order Padé approximation for the delay term. The invariance of the closed loop performance under different time delay approximations has also been verified using increasing orders of the Padé approximation, which represent higher order delay dynamics closer to reality. The choice of the nature of the non-dominant poles (e.g. all complex, all real, or a combination of the two) modifies the characteristic equation and influences the achievable stability regions. The effects of different types of non-dominant poles and the corresponding stability regions are obtained for nine test-bench processes indicating different levels of open-loop damping and lag to delay ratio. Next, we investigate which expression yields a wider stability region in the design parameter space by using Monte Carlo simulations while uniformly sampling a chosen design parameter space. Various time and frequency domain control performance parameters are then investigated, as well as their deviations with uncertain process parameters, using thousands of Monte Carlo simulations around the robust stable solution for each of the nine test-bench processes.
Tasks
Published	2018-01-28
URL	http://arxiv.org/abs/1801.09238v1
PDF	http://arxiv.org/pdf/1801.09238v1.pdf
PWC	https://paperswithcode.com/paper/performance-analysis-of-robust-stable-pid
Repo
Framework
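
To make the Padé step concrete, the sketch below builds the standard diagonal Padé approximant of a pure delay term exp(-s * delay) and evaluates it at a complex frequency. It only illustrates the delay approximation. The controller design and pole placement expressions from the paper are not reproduced, and the function names are hypothetical.

```python
from math import factorial
import cmath


def pade_delay_coefficients(order):
    # Coefficients c_k of the diagonal Pade approximant
    # exp(-x) ~ sum((-1)^k c_k x^k) / sum(c_k x^k), for k = 0..order.
    n = order
    return [
        factorial(2 * n - k) * factorial(n)
        / (factorial(2 * n) * factorial(k) * factorial(n - k))
        for k in range(n + 1)
    ]


def pade_delay(s, delay, order=3):
    # Rational approximation of exp(-s * delay) evaluated at complex s.
    x = s * delay
    coeffs = pade_delay_coefficients(order)
    numerator = sum(((-1) ** k) * c * x ** k for k, c in enumerate(coeffs))
    denominator = sum(c * x ** k for k, c in enumerate(coeffs))
    return numerator / denominator


# Compare the third order approximant with the exact delay at s = 0.5j.
approx = pade_delay(0.5j, 1.0, order=3)
exact = cmath.exp(-0.5j)
error = abs(approx - exact)
```

For order 3 the coefficients are 1, 1/2, 1/10 and 1/120. Raising `order` shrinks `error` at a fixed frequency, which mirrors the abstract's check that performance is insensitive to the order of the approximation.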

On the overfly algorithm in deep learning of neural networks


Title	On the overfly algorithm in deep learning of neural networks
Authors	Alexei Tsygvintsev
Abstract	In this paper we investigate the supervised backpropagation training of multilayer neural networks from a dynamical systems point of view. We discuss some links with the qualitative theory of differential equations and introduce the overfly algorithm to tackle the local minima problem. Our approach is based on the existence of first integrals of the generalised gradient system with built-in dissipation.
Tasks
Published	2018-07-27
URL	http://arxiv.org/abs/1807.10668v6
PDF	http://arxiv.org/pdf/1807.10668v6.pdf
PWC	https://paperswithcode.com/paper/on-the-overfly-algorithm-in-deep-learning-of
Repo
Framework

The Importance of Being Earnest: Performance of Modulation Classification for Real RF Signals


Title	The Importance of Being Earnest: Performance of Modulation Classification for Real RF Signals
Authors	Colin de Vrieze, Ljiljana Simić, Petri Mähönen
Abstract	Digital modulation classification (DMC) can be highly valuable for equipping radios with increased spectrum awareness in complex emerging wireless networks. However, as the existing literature is overwhelmingly based on theoretical or simulation results, it is unclear how well DMC performs in practice. In this paper we study the performance of DMC in real-world wireless networks, using an extensive RF signal dataset of 250,000 over-the-air transmissions with heterogeneous transceiver hardware and co-channel interference. Our results show that DMC can achieve a high classification accuracy even under the challenging real-world conditions of modulated co-channel interference and low-grade hardware. However, this only holds if the training dataset fully captures the variety of interference and hardware types in the real radio environment; otherwise, the DMC performance deteriorates significantly. Our work has two important engineering implications. First, it shows that it is not straightforward to exchange learned classifier models among dissimilar radio environments and devices in practice. Second, our analysis suggests that the key missing link for real-world deployment of DMC is designing signal features that generalize well to diverse wireless network scenarios. We are making our RF signal dataset publicly available as a step towards a unified framework for realistic DMC evaluation.
Tasks
Published	2018-09-17
URL	http://arxiv.org/abs/1809.06176v2
PDF	http://arxiv.org/pdf/1809.06176v2.pdf
PWC	https://paperswithcode.com/paper/the-importance-of-being-earnest-performance
Repo
Framework

Statistical mechanical analysis of sparse linear regression as a variable selection problem


Title	Statistical mechanical analysis of sparse linear regression as a variable selection problem
Authors	Tomoyuki Obuchi, Yoshinori Nakanishi-Ohno, Masato Okada, Yoshiyuki Kabashima
Abstract	An algorithmic limit of compressed sensing or related variable-selection problems is analytically evaluated when a design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, an exponential rate of the number of combinations of variables giving a specific value of fit error to given data which is assumed to be generated from a linear process using the design matrix. This yields the typical achievable limit of the fit error when solving a representative $\ell_0$ problem and includes the presence of unfavourable phase transitions preventing local search algorithms from reaching the minimum-error configuration. The associated phase diagrams are presented. A noteworthy outcome of the phase diagrams is that there exists a wide parameter region where any phase transition is absent from the high temperature to the lowest temperature at which the minimum-error configuration or the ground state is reached. This implies that certain local search algorithms can find the ground state with moderate computational costs in that region. Another noteworthy result is the presence of the random first-order transition in the strong noise case. The theoretical evaluation of the entropy is confirmed by extensive numerical methods using the exchange Monte Carlo and the multi-histogram methods. Another numerical test based on a metaheuristic optimisation algorithm called simulated annealing is conducted, and it supports the theoretical predictions on the local search algorithms well. In the successful region with no phase transition, the computational cost of the simulated annealing to reach the ground state is estimated as a third-order polynomial of the model dimensionality.
Tasks
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11259v2
PDF	http://arxiv.org/pdf/1805.11259v2.pdf
PWC	https://paperswithcode.com/paper/statistical-mechanical-analysis-of-sparse
Repo
Framework

Automating Reading Comprehension by Generating Question and Answer Pairs


Title	Automating Reading Comprehension by Generating Question and Answer Pairs
Authors	Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, Yuan-Fang Li
Abstract	Neural network-based methods represent the state-of-the-art in question generation from text. Existing work focuses on generating only questions from text without concerning itself with answer generation. Moreover, our analysis shows that handling rare words and generating the most appropriate question given a candidate answer are still challenges facing existing approaches. We present a novel two-stage process to generate question-answer pairs from the text. For the first stage, we present alternatives for encoding the span of the pivotal answer in the sentence using Pointer Networks. In our second stage, we employ sequence to sequence models for question generation, enhanced with rich linguistic features. Finally, global attention and answer encoding are used for generating the question most relevant to the answer. We motivate and linguistically analyze the role of each component in our framework and consider compositions of these. This analysis is supported by extensive experimental evaluations. Using standard evaluation metrics as well as human evaluations, our experimental results validate the significant improvement in the quality of questions generated by our framework over the state-of-the-art. The technique presented here represents another step towards more automated reading comprehension assessment. We also present a live system (a demo is available at https://www.cse.iitb.ac.in/~vishwajeet/autoqg.html) to demonstrate the effectiveness of our approach.
Tasks	Question Generation, Reading Comprehension
Published	2018-03-07
URL	http://arxiv.org/abs/1803.03664v1
PDF	http://arxiv.org/pdf/1803.03664v1.pdf
PWC	https://paperswithcode.com/paper/automating-reading-comprehension-by
Repo
Framework

Distributed sequential method for analyzing massive data


Title	Distributed sequential method for analyzing massive data
Authors	Zhanfeng Wang, Yuan-chin Ivan Chang
Abstract	To analyse a very large data set containing lengthy variables, we adopt a sequential estimation idea and propose a parallel divide-and-conquer method. We conduct several conventional sequential estimation procedures separately, and properly integrate their results while maintaining the desired statistical properties. Additionally, using a criterion from the statistical experiment design, we adopt an adaptive sample selection, together with an adaptive shrinkage estimation method, to simultaneously accelerate the estimation procedure and identify the effective variables. We confirm the cogency of our methods through theoretical justifications and numerical results derived from synthesized data sets. We then apply the proposed method to three real data sets, including those pertaining to appliance energy use and particulate matter concentration.
Tasks
Published	2018-12-22
URL	http://arxiv.org/abs/1812.09424v1
PDF	http://arxiv.org/pdf/1812.09424v1.pdf
PWC	https://paperswithcode.com/paper/distributed-sequential-method-for-analyzing
Repo
Framework

Separators and Adjustment Sets in Causal Graphs: Complete Criteria and an Algorithmic Framework


Title	Separators and Adjustment Sets in Causal Graphs: Complete Criteria and an Algorithmic Framework
Authors	Benito van der Zander, Maciej Liśkiewicz, Johannes Textor
Abstract	Principled reasoning about the identifiability of causal effects from non-experimental data is an important application of graphical causal models. This paper focuses on effects that are identifiable by covariate adjustment, a commonly used estimation approach. We present an algorithmic framework for efficiently testing, constructing, and enumerating $m$-separators in ancestral graphs (AGs), a class of graphical causal models that can represent uncertainty about the presence of latent confounders. Furthermore, we prove a reduction from causal effect identification by covariate adjustment to $m$-separation in a subgraph for directed acyclic graphs (DAGs) and maximal ancestral graphs (MAGs). Jointly, these results yield constructive criteria that characterize all adjustment sets as well as all minimal and minimum adjustment sets for identification of a desired causal effect with multivariate exposures and outcomes in the presence of latent confounding. Our results extend several existing solutions for special cases of these problems. Our efficient algorithms allowed us to empirically quantify the identifiability gap between covariate adjustment and the do-calculus in random DAGs and MAGs, covering a wide range of scenarios. Implementations of our algorithms are provided in the R package dagitty.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00116v3
PDF	http://arxiv.org/pdf/1803.00116v3.pdf
PWC	https://paperswithcode.com/paper/separators-and-adjustment-sets-in-causal
Repo
Framework
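
The separation tests that the abstract builds on can be illustrated for the simplest case, d-separation in a DAG. The sketch below uses the classical moralization test: restrict the graph to the ancestors of the involved nodes, connect parents that share a child, drop edge directions, and check whether the conditioning set blocks every path. It does not implement the paper's m-separation algorithms for ancestral graphs and is unrelated to the dagitty API. The `parents` dictionary format and the function names are assumptions, and the three node sets are assumed to be disjoint.

```python
def ancestors(parents, nodes):
    # Return `nodes` together with all of their ancestors in the DAG.
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def d_separated(parents, xs, ys, zs):
    # `parents` maps each node to a list of its parents (assumed format).
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)

    # Build the moral graph of the ancestral subgraph.
    neighbours = {node: set() for node in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for i, p in enumerate(pa):
            for q in pa[i + 1:]:  # "marry" parents that share this child
                neighbours[p].add(q)
                neighbours[q].add(p)

    # Search from xs without stepping onto conditioned nodes.
    stack = list(xs)
    seen = set(stack)
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        for nb in neighbours[node]:
            if nb not in zs and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return True
```

For the collider `{"B": ["A", "C"]}`, `d_separated(graph, {"A"}, {"C"}, set())` holds, while conditioning on `"B"` opens the path and the call returns `False`.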

A Proximal Block Coordinate Descent Algorithm for Deep Neural Network Training


Title	A Proximal Block Coordinate Descent Algorithm for Deep Neural Network Training
Authors	Tim Tsz-Kit Lau, Jinshan Zeng, Baoyuan Wu, Yuan Yao
Abstract	Training deep neural networks (DNNs) efficiently is a challenge due to the associated highly nonconvex optimization. The backpropagation (backprop) algorithm has long been the most widely used algorithm for gradient computation of parameters of DNNs and is used along with gradient descent-type algorithms for this optimization task. Recent work has shown the efficiency of block coordinate descent (BCD) type methods empirically for training DNNs. In view of this, we propose a novel algorithm based on the BCD method for training DNNs and provide its global convergence results built upon the powerful framework of the Kurdyka-Łojasiewicz (KL) property. Numerical experiments on standard datasets demonstrate its competitive efficiency against standard optimizers with backprop.
Tasks
Published	2018-03-24
URL	http://arxiv.org/abs/1803.09082v1
PDF	http://arxiv.org/pdf/1803.09082v1.pdf
PWC	https://paperswithcode.com/paper/a-proximal-block-coordinate-descent-algorithm
Repo
Framework

Geometric and Physical Constraints for Drone-Based Head Plane Crowd Density Estimation


Title	Geometric and Physical Constraints for Drone-Based Head Plane Crowd Density Estimation
Authors	Weizhe Liu, Krzysztof Lis, Mathieu Salzmann, Pascal Fua
Abstract	State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density in the image plane. While useful for this purpose, this image-plane density has no immediate physical meaning because it is subject to perspective distortion. This is a concern in sequences acquired by drones because the viewpoint changes often. This distortion is usually handled implicitly by either learning scale-invariant features or estimating density in patches of different sizes, neither of which accounts for the fact that scale changes must be consistent over the whole scene. In this paper, we explicitly model the scale changes and reason in terms of people per square-meter. We show that feeding the perspective model to the network allows us to enforce global scale consistency and that this model can be obtained on the fly from the drone sensors. In addition, it also enables us to enforce physically-inspired temporal consistency constraints that do not have to be learned. This yields an algorithm that outperforms state-of-the-art methods in inferring crowd density from a moving drone camera especially when perspective effects are strong.
Tasks	Density Estimation
Published	2018-03-23
URL	https://arxiv.org/abs/1803.08805v3
PDF	https://arxiv.org/pdf/1803.08805v3.pdf
PWC	https://paperswithcode.com/paper/geometric-and-physical-constraints-for-head
Repo
Framework

Distance-Free Modeling of Multi-Predicate Interactions in End-to-End Japanese Predicate-Argument Structure Analysis


Title	Distance-Free Modeling of Multi-Predicate Interactions in End-to-End Japanese Predicate-Argument Structure Analysis
Authors	Yuichiroh Matsubayashi, Kentaro Inui
Abstract	Capturing interactions among multiple predicate-argument structures (PASs) is a crucial issue in the task of analyzing PAS in Japanese. In this paper, we propose new Japanese PAS analysis models that integrate the label prediction information of arguments in multiple PASs by extending the input and last layers of a standard deep bidirectional recurrent neural network (bi-RNN) model. In these models, using the mechanisms of pooling and attention, we aim to directly capture the potential interactions among multiple PASs, without being disturbed by the word order and distance. Our experiments show that the proposed models improve the prediction accuracy specifically for cases where the predicate and argument are in an indirect dependency relation and achieve a new state of the art in the overall $F_1$ on a standard benchmark corpus.
Tasks
Published	2018-06-11
URL	http://arxiv.org/abs/1806.03869v2
PDF	http://arxiv.org/pdf/1806.03869v2.pdf
PWC	https://paperswithcode.com/paper/distance-free-modeling-of-multi-predicate
Repo
Framework

Robust Adversarial Perturbation on Deep Proposal-based Models


Title	Robust Adversarial Perturbation on Deep Proposal-based Models
Authors	Yuezun Li, Daniel Tian, Ming-Ching Chang, Xiao Bian, Siwei Lyu
Abstract	Adversarial noises are useful tools to probe the weakness of deep learning based computer vision algorithms. In this paper, we describe a robust adversarial perturbation (R-AP) method to attack deep proposal-based object detectors and instance segmentation algorithms. Our method focuses on attacking the common component in these algorithms, namely the Region Proposal Network (RPN), to universally degrade their performance in a black-box fashion. To do so, we design a loss function that combines a label loss and a novel shape loss, and optimize it with respect to the image using a gradient based iterative algorithm. Evaluations are performed on the MS COCO 2014 dataset for the adversarial attacking of 6 state-of-the-art object detectors and 2 instance segmentation algorithms. Experimental results demonstrate the efficacy of the proposed method.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2018-09-16
URL	https://arxiv.org/abs/1809.05962v2
PDF	https://arxiv.org/pdf/1809.05962v2.pdf
PWC	https://paperswithcode.com/paper/robust-adversarial-perturbation-on-deep
Repo
Framework

Invertible Autoencoder for domain adaptation


Title	Invertible Autoencoder for domain adaptation
Authors	Yunfei Teng, Anna Choromanska, Mariusz Bojarski
Abstract	Unsupervised image-to-image translation aims at finding a mapping between the source ($A$) and target ($B$) image domains, where in many applications aligned image pairs are not available at training. This is an ill-posed learning problem since it requires inferring the joint probability distribution from marginals. Joint learning of coupled mappings $F_{AB}: A \rightarrow B$ and $F_{BA}: B \rightarrow A$ is commonly used by the state-of-the-art methods, like CycleGAN [Zhu et al., 2017], to learn this translation by introducing a cycle consistency requirement to the learning problem, i.e. $F_{AB}(F_{BA}(B)) \approx B$ and $F_{BA}(F_{AB}(A)) \approx A$. Cycle consistency enforces the preservation of the mutual information between input and translated images. However, it does not explicitly enforce $F_{BA}$ to be an inverse operation to $F_{AB}$. We propose a new deep architecture that we call invertible autoencoder (InvAuto) to explicitly enforce this relation. This is done by forcing an encoder to be an inverted version of the decoder, where corresponding layers perform opposite mappings and share parameters. The mappings are constrained to be orthonormal. The resulting architecture leads to the reduction of the number of trainable parameters (up to $2$ times). We present image translation results on benchmark data sets and demonstrate state-of-the-art performance of our approach. Finally, we test the proposed domain adaptation method on the task of road video conversion. We demonstrate that the videos converted with InvAuto have high quality and show that the NVIDIA neural-network-based end-to-end learning system for autonomous driving, known as PilotNet, trained on real road videos performs well when tested on the converted ones.
Tasks	Autonomous Driving, Domain Adaptation, Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published	2018-02-10
URL	http://arxiv.org/abs/1802.06869v1
PDF	http://arxiv.org/pdf/1802.06869v1.pdf
PWC	https://paperswithcode.com/paper/invertible-autoencoder-for-domain-adaptation
Repo
Framework
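
The statement that encoder and decoder layers "perform opposite mappings and share parameters" under an orthonormality constraint can be shown with a single linear layer. In the sketch below the decoder reuses the encoder matrix and applies its transpose, which is an exact inverse when the rows are orthonormal. This is only a linear toy case: the actual InvAuto layers, nonlinearities and training procedure are not shown here, and the Gram-Schmidt construction and function names are assumptions for the example.

```python
import random


def orthonormal_matrix(dim, seed=0):
    # Build a dim x dim matrix with orthonormal rows via Gram-Schmidt
    # applied to random Gaussian vectors.
    rng = random.Random(seed)
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:
            proj = sum(a * b for a, b in zip(v, r))
            v = [a - proj * b for a, b in zip(v, r)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-8:  # skip the (unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def encode(weights, x):
    # Encoder layer: y = W x.
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def decode(weights, y):
    # Decoder layer shares W and applies its transpose: x = W^T y.
    dim = len(weights[0])
    return [sum(weights[i][j] * y[i] for i in range(len(weights)))
            for j in range(dim)]


weights = orthonormal_matrix(4)
x = [1.0, -2.0, 0.5, 3.0]
reconstruction = decode(weights, encode(weights, x))
```

Because one matrix serves both directions, the layer pair stores half the parameters of an untied encoder and decoder, in line with the reduction the abstract mentions.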

Multi-kernel Regression For Graph Signal Processing


Title	Multi-kernel Regression For Graph Signal Processing
Authors	Arun Venkitaraman, Saikat Chatterjee, Peter Händel
Abstract	We develop a multi-kernel based regression method for graph signal processing where the target signal is assumed to be smooth over a graph. In multi-kernel regression, an effective kernel function is expressed as a linear combination of many basis kernel functions. We estimate the linear weights to learn the effective kernel function by appropriate regularization based on graph smoothness. We show that the resulting optimization problem is convex and propose an accelerated projected gradient descent based solution. Simulation results using real-world graph signals show efficiency of the multi-kernel based approach over a standard kernel based approach.
Tasks
Published	2018-03-12
URL	http://arxiv.org/abs/1803.04196v1
PDF	http://arxiv.org/pdf/1803.04196v1.pdf
PWC	https://paperswithcode.com/paper/multi-kernel-regression-for-graph-signal
Repo
Framework
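
The phrase "an effective kernel function is expressed as a linear combination of many basis kernel functions" can be sketched as follows. The example combines Gaussian kernels of different bandwidths with fixed nonnegative weights and plugs the result into ordinary kernel ridge regression. It deliberately leaves out the parts specific to the paper: the weights are not learned, there is no graph or smoothness regularizer, and no projected gradient descent. The kernel choice, the `ridge` value and all names are assumptions.

```python
import math


def rbf_kernel(x, y, bandwidth):
    # One basis kernel: a Gaussian with the given bandwidth.
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def effective_kernel(x, y, weights, bandwidths):
    # Effective kernel = weighted sum of basis kernels.
    return sum(w * rbf_kernel(x, y, b) for w, b in zip(weights, bandwidths))


def solve(a, b):
    # Solve a z = b by Gaussian elimination with partial pivoting.
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def fit_predict(train_x, train_y, test_x, weights, bandwidths, ridge=1e-3):
    # Kernel ridge regression with the combined kernel.
    gram = [[effective_kernel(a, b, weights, bandwidths)
             + (ridge if i == j else 0.0)
             for j, b in enumerate(train_x)]
            for i, a in enumerate(train_x)]
    alpha = solve(gram, train_y)
    return [sum(al * effective_kernel(x, t, weights, bandwidths)
                for al, t in zip(alpha, train_x))
            for x in test_x]
```

A call such as `fit_predict([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0], [1.5], [0.3, 0.7], [0.5, 2.0])` interpolates between the training targets. In the paper's setting the two weights would instead be estimated under a graph-smoothness regularizer.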