October 16, 2019


Paper Group ANR 983


Targeted Kernel Networks: Faster Convolutions with Attentive Regularization


Title	Targeted Kernel Networks: Faster Convolutions with Attentive Regularization
Authors	Kashyap Chitta
Abstract	We propose Attentive Regularization (AR), a method to constrain the activation maps of kernels in Convolutional Neural Networks (CNNs) to specific regions of interest (ROIs). Each kernel learns a location of specialization along with its weights through standard backpropagation. A differentiable attention mechanism requiring no additional supervision is used to optimize the ROIs. Traditional CNNs of different types and structures can be modified with this idea into equivalent Targeted Kernel Networks (TKNs), while keeping the network size nearly identical. By restricting kernel ROIs, we reduce the number of sliding convolutional operations performed throughout the network in its forward pass, speeding up both training and inference. We evaluate our proposed architecture on both synthetic and natural tasks across multiple domains. TKNs obtain significant improvements over baselines, requiring less computation (around an order of magnitude) while achieving superior performance.
Tasks
Published	2018-06-01
URL	http://arxiv.org/abs/1806.00523v2
PDF	http://arxiv.org/pdf/1806.00523v2.pdf
PWC	https://paperswithcode.com/paper/targeted-kernel-networks-faster-convolutions
Repo
Framework

Wide Compression: Tensor Ring Nets


Title	Wide Compression: Tensor Ring Nets
Authors	Wenqi Wang, Yifan Sun, Brian Eriksson, Wenlin Wang, Vaneet Aggarwal
Abstract	Deep neural networks have demonstrated state-of-the-art performance in a variety of real-world applications. In order to obtain performance gains, these networks have grown larger and deeper, containing millions or even billions of parameters and over a thousand layers. The trade-off is that these large architectures require an enormous amount of memory, storage, and computation, thus limiting their usability. Inspired by the recent tensor ring factorization, we introduce Tensor Ring Networks (TR-Nets), which significantly compress both the fully connected layers and the convolutional layers of deep neural networks. Our results show that our TR-Nets approach is able to compress LeNet-5 by $11\times$ without losing accuracy, and can compress the state-of-the-art Wide ResNet by $243\times$ with only 2.3% degradation in CIFAR-10 image classification. Overall, this compression scheme shows promise in scientific computing and deep learning, especially for emerging resource-constrained devices such as smartphones, wearables, and IoT devices.
Tasks	Image Classification
Published	2018-02-25
URL	http://arxiv.org/abs/1802.09052v1
PDF	http://arxiv.org/pdf/1802.09052v1.pdf
PWC	https://paperswithcode.com/paper/wide-compression-tensor-ring-nets
Repo
Framework
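
The tensor ring format mentioned in the abstract can be illustrated with a small sketch. It assumes the standard tensor ring definition, in which each entry of a d-way tensor is the trace of a product of one matrix slice per core; the shape, rank, and helper names below are hypothetical and are not taken from the paper's experiments.

```python
import random
from math import prod


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def tr_element(cores, index):
    """Entry T[i1, ..., id] = trace(G1[i1] @ G2[i2] @ ... @ Gd[id]).

    cores[k][i] is the rank x rank matrix slice of core k at mode index i.
    """
    acc = cores[0][index[0]]
    for core, i in zip(cores[1:], index[1:]):
        acc = matmul(acc, core[i])
    return sum(acc[j][j] for j in range(len(acc)))


def tr_param_count(dims, rank):
    """Number of values stored by a tensor ring with a uniform rank."""
    return sum(rank * n * rank for n in dims)


def random_cores(dims, rank, seed=0):
    """Random cores, used only to exercise the functions above."""
    rng = random.Random(seed)
    return [[[[rng.gauss(0, 1) for _ in range(rank)] for _ in range(rank)]
             for _ in range(n)] for n in dims]


dims, rank = (4, 8, 8, 4), 3           # hypothetical tensor shape and ring rank
cores = random_cores(dims, rank)
print(prod(dims))                      # dense entries: 1024
print(tr_param_count(dims, rank))      # ring parameters: 216
print(tr_element(cores, (1, 2, 3, 0)))
```

The parameter count grows with the sum of the mode sizes rather than their product, which is where the compression comes from. Because the trace is invariant under cyclic shifts, rotating the cores and the index together leaves every entry unchanged, hence the name "ring".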

Escaping Saddles with Stochastic Gradients


Title	Escaping Saddles with Stochastic Gradients
Authors	Hadi Daneshmand, Jonas Kohler, Aurelien Lucchi, Thomas Hofmann
Abstract	We analyze the variance of stochastic gradients along negative curvature directions in certain non-convex machine learning models and show that stochastic gradients exhibit a strong component along these directions. Furthermore, we show that - contrary to the case of isotropic noise - this variance is proportional to the magnitude of the corresponding eigenvalues and not decreasing in the dimensionality. Based upon this observation we propose a new assumption under which we show that the injection of explicit, isotropic noise usually applied to make gradient descent escape saddle points can successfully be replaced by a simple SGD step. Additionally - and under the same condition - we derive the first convergence rate for plain SGD to a second-order stationary point in a number of iterations that is independent of the problem dimension.
Tasks
Published	2018-03-15
URL	http://arxiv.org/abs/1803.05999v2
PDF	http://arxiv.org/pdf/1803.05999v2.pdf
PWC	https://paperswithcode.com/paper/escaping-saddles-with-stochastic-gradients
Repo
Framework

How should we (correctly) compare multiple graphs?


Title	How should we (correctly) compare multiple graphs?
Authors	Sam Safavi, Jose Bento
Abstract	Graphs are used in almost every scientific discipline to express relations among a set of objects. Algorithms that compare graphs, and output a closeness score, or a correspondence among their nodes, are thus extremely important. Despite the large amount of work done, many of the scalable algorithms to compare graphs do not produce closeness scores that satisfy the intuitive properties of metrics. This is problematic since non-metrics are known to degrade the performance of algorithms such as distance-based clustering of graphs (Stratis et al. 2018). On the other hand, the use of metrics increases the performance of several machine learning tasks (Indyk et al. 1999, Clarkson et al. 1999, Angiulli et al. 2002, and Ackermann et al. 2010). In this paper, we introduce a new family of multi-distances (a distance between more than two elements) that satisfies a generalization of the properties of metrics to multiple elements. In the context of comparing graphs, we are the first to show the existence of multi-distances that simultaneously incorporate the useful property of alignment consistency (Nguyen et al. 2011), and a generalized metric property, and that can be computed via convex optimization.
Tasks
Published	2018-07-09
URL	http://arxiv.org/abs/1807.03368v3
PDF	http://arxiv.org/pdf/1807.03368v3.pdf
PWC	https://paperswithcode.com/paper/how-should-we-correctly-compare-multiple
Repo
Framework

LiveCap: Real-time Human Performance Capture from Monocular Video


Title	LiveCap: Real-time Human Performance Capture from Monocular Video
Authors	Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt
Abstract	We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per-frame are solved with specially-tailored data-parallel Gauss-Newton solvers. In order to achieve real-time performance of over 25Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. Our method yields comparable accuracy with off-line performance capture techniques, while being orders of magnitude faster.
Tasks
Published	2018-10-05
URL	http://arxiv.org/abs/1810.02648v3
PDF	http://arxiv.org/pdf/1810.02648v3.pdf
PWC	https://paperswithcode.com/paper/livecap-real-time-human-performance-capture
Repo
Framework

On the Practical Computational Power of Finite Precision RNNs for Language Recognition


Title	On the Practical Computational Power of Finite Precision RNNs for Language Recognition
Authors	Gail Weiss, Yoav Goldberg, Eran Yahav
Abstract	While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. We consider the case of RNNs with finite precision whose computation time is linear in the input length. Under these limitations, we show that different RNN variants have different computational power. In particular, we show that the LSTM and the Elman-RNN with ReLU activation are strictly stronger than the RNN with a squashing activation and the GRU. This is achieved because LSTMs and ReLU-RNNs can easily implement counting behavior. We show empirically that the LSTM does indeed learn to effectively use the counting mechanism.
Tasks
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04908v1
PDF	http://arxiv.org/pdf/1805.04908v1.pdf
PWC	https://paperswithcode.com/paper/on-the-practical-computational-power-of
Repo
Framework
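
The counting behavior described in the abstract can be sketched with a one-unit recurrence. This is an illustration under stated assumptions, not the paper's construction: the weights are set by hand, the input encoding (+1 for `a`, -1 for `b`) is hypothetical, and the function names are invented for this example.

```python
import math


def relu(x):
    return max(0.0, x)


def relu_counter_states(string):
    """Run h_t = relu(h_{t-1} + u_t) with u_t = +1 for 'a' and -1 for 'b'."""
    h, states = 0.0, []
    for ch in string:
        u = {"a": 1.0, "b": -1.0}[ch]
        h = relu(h + u)
        states.append(h)
    return states


def tanh_states(string):
    """Same recurrence with a squashing activation: h_t = tanh(h_{t-1} + u_t)."""
    h, states = 0.0, []
    for ch in string:
        u = {"a": 1.0, "b": -1.0}[ch]
        h = math.tanh(h + u)
        states.append(h)
    return states


print(relu_counter_states("aaabbb"))      # [1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
print(relu_counter_states("a" * 50)[-1])  # 50.0
print(tanh_states("a" * 20)[-1], tanh_states("a" * 50)[-1])
```

The ReLU state grows without bound, so it records exactly how many more `a` symbols than `b` symbols have been read. The tanh state stays inside (-1, 1) and settles near a fixed point, so under finite precision a run of 20 and a run of 50 become indistinguishable.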

SEA: A Combined Model for Heat Demand Prediction


Title	SEA: A Combined Model for Heat Demand Prediction
Authors	Jiyang Xie, Jiaxin Guo, Zhanyu Ma, Jing-Hao Xue, Qie Sun, Hailong Li, Jun Guo
Abstract	Heat demand prediction is a prominent research topic in the area of intelligent energy networks. It has been well recognized that periodicity is one of the important characteristics of heat demand. The seasonal-trend decomposition based on LOESS (STL) algorithm can analyze the periodicity of a heat demand series and decompose the series into seasonal and trend components. One way to predict heat demand is therefore to predict the seasonal and trend components separately and combine the two predictions. In this paper, STL-ENN-ARIMA (SEA), a combined model, is proposed based on the combination of the Elman neural network (ENN) and the autoregressive integrated moving average (ARIMA) model, which are commonly applied to heat demand prediction. ENN and ARIMA are used to predict seasonal and trend components, respectively. Experimental results demonstrate that the proposed SEA model has a promising performance.
Tasks
Published	2018-07-28
URL	http://arxiv.org/abs/1808.00331v1
PDF	http://arxiv.org/pdf/1808.00331v1.pdf
PWC	https://paperswithcode.com/paper/sea-a-combined-model-for-heat-demand
Repo
Framework
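
The decompose, predict, and recombine pipeline from the abstract can be sketched as follows. Every component here is a deliberately simple stand-in and is labeled as such: a linear-trend-plus-periodic-profile split replaces STL, repeating the seasonal profile replaces the ENN, and extrapolating the fitted line replaces ARIMA. The function names and the synthetic series are hypothetical.

```python
def decompose(series, period):
    """Split a series into a linear trend and a periodic seasonal profile.

    Simplified additive stand-in for STL; assumes len(series) is a
    multiple of `period`.
    """
    n = len(series)
    # Lag-`period` differences cancel the seasonal part and expose the slope.
    diffs = [(series[t + period] - series[t]) / period for t in range(n - period)]
    slope = sum(diffs) / len(diffs)
    intercept = sum(y - slope * t for t, y in enumerate(series)) / n
    detrended = [y - (intercept + slope * t) for t, y in enumerate(series)]
    # Average the detrended values at each phase of the cycle.
    profile = [sum(detrended[k::period]) / len(detrended[k::period])
               for k in range(period)]
    return (intercept, slope), profile


def forecast(series, period, horizon):
    """Predict each component separately, then add the predictions."""
    (intercept, slope), profile = decompose(series, period)
    n = len(series)
    steps = range(n, n + horizon)
    trend_pred = [intercept + slope * t for t in steps]   # stand-in for ARIMA
    seasonal_pred = [profile[t % period] for t in steps]  # stand-in for ENN
    return [tr + s for tr, s in zip(trend_pred, seasonal_pred)]


# Synthetic demand: linear growth plus a four-step cycle (hypothetical data).
pattern = [3.0, -1.0, -2.0, 0.0]
history = [10 + 0.5 * t + pattern[t % 4] for t in range(24)]
print(forecast(history, period=4, horizon=4))
```

On this noise-free series the recombined forecast reproduces the generating formula. With real demand data the two predictors would be fitted models and the decomposition would also leave a residual component.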

Fuzzy Logic Interpretation of Quadratic Networks


Title	Fuzzy Logic Interpretation of Quadratic Networks
Authors	Fenglei Fan, Ge Wang
Abstract	Over the past several years, deep learning has achieved huge successes in various applications. However, such a data-driven approach is often criticized for its lack of interpretability. Recently, we proposed artificial quadratic neural networks consisting of second-order neurons in potentially many layers. In each second-order neuron, a quadratic function is used in place of the inner product in a traditional neuron, and then undergoes a nonlinear activation. With a single second-order neuron, any fuzzy logic operation, such as XOR, can be implemented. In this sense, any deep network constructed with quadratic neurons can be interpreted as a deep fuzzy logic system. Since traditional neural networks and second-order counterparts can represent each other and fuzzy logic operations are naturally implemented in second-order neural networks, it is plausible to explain how a deep neural network works with a second-order network as the system model. In this paper, we generalize and categorize fuzzy logic operations implementable with individual second-order neurons, and then perform statistical/information theoretic analyses of exemplary quadratic neural networks.
Tasks
Published	2018-07-04
URL	https://arxiv.org/abs/1807.03215v3
PDF	https://arxiv.org/pdf/1807.03215v3.pdf
PWC	https://paperswithcode.com/paper/fuzzy-logic-interpretation-of-artificial
Repo
Framework
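
The claim that a single second-order neuron can implement XOR can be checked with a short sketch. The parameterization below (a product of two affine forms plus a weighted sum of squared inputs) is an assumption about the neuron's form, and the weights and the 0.5 threshold are chosen by hand for illustration rather than taken from the paper.

```python
def quadratic_neuron(x, w_r, b_r, w_g, b_g, w_b, c):
    """Pre-activation (w_r.x + b_r) * (w_g.x + b_g) + w_b.(x*x) + c."""
    def dot(w, v):
        return sum(wi * vi for wi, vi in zip(w, v))

    squares = [xi * xi for xi in x]
    return (dot(w_r, x) + b_r) * (dot(w_g, x) + b_g) + dot(w_b, squares) + c


def xor_neuron(x1, x2, threshold=0.5):
    """Hand-set weights give the pre-activation (x1 - x2) ** 2."""
    pre = quadratic_neuron((x1, x2), (1, -1), 0, (1, -1), 0, (0, 0), 0)
    return 1 if pre > threshold else 0  # hypothetical step activation


truth_table = [xor_neuron(a, b) for a in (0, 1) for b in (0, 1)]
print(truth_table)  # [0, 1, 1, 0]
```

A conventional neuron with an inner product and a monotone activation cannot separate these four points, which is why XOR is the usual test case. Here the squared difference is zero when the inputs agree and one when they differ.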

SimGNN: A Neural Network Approach to Fast Graph Similarity Computation


Title	SimGNN: A Neural Network Approach to Fast Graph Similarity Computation
Authors	Yunsheng Bai, Hao Ding, Song Bian, Ting Chen, Yizhou Sun, Wei Wang
Abstract	Graph similarity search is among the most important graph-based applications, e.g. finding the chemical compounds that are most similar to a query compound. Graph similarity computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, but very costly to compute in practice. Inspired by the recent success of neural network approaches to several graph applications, such as node or graph classification, we propose a novel neural network based approach to address this classic yet challenging graph problem, aiming to alleviate the computational burden while preserving a good performance. The proposed approach, called SimGNN, combines two strategies. First, we design a learnable embedding function that maps every graph into a vector, which provides a global summary of a graph. A novel attention mechanism is proposed to emphasize the important nodes with respect to a specific similarity metric. Second, we design a pairwise node comparison method to supplement the graph-level embeddings with fine-grained node-level information. Our model achieves better generalization on unseen graphs, and in the worst case runs in quadratic time with respect to the number of nodes in two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves a smaller error rate and a large time reduction compared with a series of baselines, including several approximation algorithms on GED computation, and many existing graph neural network based models. To the best of our knowledge, we are among the first to adopt neural networks to explicitly model the similarity between two graphs, and provide a new direction for future research on graph similarity computation and graph similarity search.
Tasks	Graph Classification, Graph Similarity
Published	2018-08-16
URL	https://arxiv.org/abs/1808.05689v4
PDF	https://arxiv.org/pdf/1808.05689v4.pdf
PWC	https://paperswithcode.com/paper/graph-edit-distance-computation-via-graph
Repo
Framework

Hierarchical Spatial Transformer Network


Title	Hierarchical Spatial Transformer Network
Authors	Chang Shu, Xi Chen, Qiwei Xie, Hua Han
Abstract	Computer vision researchers have long hoped that neural networks could acquire a spatial transformation ability that eliminates the interference caused by geometric distortion. The emergence of the spatial transformer network made this possible. The spatial transformer network and its variants can handle global displacement well, but lack the ability to deal with local spatial variance. Hence, how to achieve better deformation within a neural network has become a pressing problem. To address this issue, we analyze the advantages and disadvantages of approximation theory and optical flow theory, then we combine them to propose a novel way to achieve image deformation and implement it with a hierarchical convolutional neural network. This new approach solves for a linear deformation along with an optical flow field to model image deformation. In the experiments of cluttered MNIST handwritten digits classification and image plane alignment, our method outperforms baseline methods by a large margin.
Tasks	Optical Flow Estimation
Published	2018-01-29
URL	http://arxiv.org/abs/1801.09467v2
PDF	http://arxiv.org/pdf/1801.09467v2.pdf
PWC	https://paperswithcode.com/paper/hierarchical-spatial-transformer-network
Repo
Framework

Automata Guided Reinforcement Learning With Demonstrations


Title	Automata Guided Reinforcement Learning With Demonstrations
Authors	Xiao Li, Yao Ma, Calin Belta
Abstract	Tasks with complex temporal structures and long horizons pose a challenge for reinforcement learning agents due to the difficulty in specifying the tasks in terms of reward functions as well as large variances in the learning signals. We propose to address these problems by combining temporal logic (TL) with reinforcement learning from demonstrations. Our method automatically generates intrinsic rewards that align with the overall task goal given a TL task specification. The policy resulting from our framework has an interpretable and hierarchical structure. We validate the proposed method experimentally on a set of robotic manipulation tasks.
Tasks
Published	2018-09-17
URL	http://arxiv.org/abs/1809.06305v2
PDF	http://arxiv.org/pdf/1809.06305v2.pdf
PWC	https://paperswithcode.com/paper/automata-guided-reinforcement-learning-with
Repo
Framework

Learning Robust Options


Title	Learning Robust Options
Authors	Daniel J. Mankowitz, Timothy A. Mann, Pierre-Luc Bacon, Doina Precup, Shie Mannor
Abstract	Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive action setting. In this paper, we propose robust methods for learning temporally abstract actions, in the framework of options. We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty. We utilize ROPI to learn robust options with the Robust Options Deep Q Network (RO-DQN) that solves multiple tasks and mitigates model misspecification due to model uncertainty. We present experimental results which suggest that policy iteration with linear features may have an inherent form of robustness when using coarse feature representations. In addition, we present experimental results which demonstrate that robustness helps policy iteration implemented on top of deep neural networks to generalize over a much broader range of dynamics than non-robust policy iteration.
Tasks
Published	2018-02-09
URL	http://arxiv.org/abs/1802.03236v1
PDF	http://arxiv.org/pdf/1802.03236v1.pdf
PWC	https://paperswithcode.com/paper/learning-robust-options
Repo
Framework

Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks


Title	Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
Authors	Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken
Abstract	The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks. Current approaches parallelize training onto multiple devices by applying a single parallelization strategy (e.g., data or model parallelism) to all layers in a network. Although easy to reason about, these approaches result in suboptimal runtime performance in large-scale distributed training, since different layers in a network may prefer different parallelization strategies. In this paper, we propose layer-wise parallelism that allows each layer in a network to use an individual parallelization strategy. We jointly optimize how each layer is parallelized by solving a graph search problem. Our evaluation shows that layer-wise parallelism outperforms state-of-the-art approaches by increasing training throughput, reducing communication costs, and achieving better scalability to multiple GPUs, while maintaining original network accuracy.
Tasks
Published	2018-02-14
URL	http://arxiv.org/abs/1802.04924v2
PDF	http://arxiv.org/pdf/1802.04924v2.pdf
PWC	https://paperswithcode.com/paper/exploring-hidden-dimensions-in-parallelizing
Repo
Framework

Emergence of grid-like representations by training recurrent neural networks to perform spatial localization


Title	Emergence of grid-like representations by training recurrent neural networks to perform spatial localization
Authors	Christopher J. Cueva, Xue-Xin Wei
Abstract	Decades of research on the neural code underlying spatial navigation have revealed a diverse set of neural response properties. The Entorhinal Cortex (EC) of the mammalian brain contains a rich set of spatial correlates, including grid cells which encode space using tessellating patterns. However, the mechanisms and functional significance of these spatial representations remain largely mysterious. As a new way to understand these neural representations, we trained recurrent neural networks (RNNs) to perform navigation tasks in 2D arenas based on velocity inputs. Surprisingly, we find that grid-like spatial response patterns emerge in trained networks, along with units that exhibit other spatial correlates, including border cells and band-like cells. All these different functional types of neurons have been observed experimentally. The order of the emergence of grid-like and border cells is also consistent with observations from developmental studies. Together, our results suggest that grid cells, border cells and others as observed in EC may be a natural solution for representing space efficiently given the predominant recurrent connections in the neural circuits.
Tasks
Published	2018-03-21
URL	http://arxiv.org/abs/1803.07770v1
PDF	http://arxiv.org/pdf/1803.07770v1.pdf
PWC	https://paperswithcode.com/paper/emergence-of-grid-like-representations-by
Repo
Framework

Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks


Title	Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks
Authors	Shikhar Vashishth, Manik Bhandari, Prateek Yadav, Piyush Rai, Chiranjib Bhattacharyya, Partha Talukdar
Abstract	Word embeddings have been widely adopted across several NLP applications. Most existing word embedding methods utilize sequential context of a word to learn its embedding. While there have been some attempts at utilizing syntactic context of a word, such methods result in an explosion of the vocabulary size. In this paper, we overcome this problem by proposing SynGCN, a flexible Graph Convolution based method for learning word embeddings. SynGCN utilizes the dependency context of a word without increasing the vocabulary size. Word embeddings learned by SynGCN outperform existing methods on various intrinsic and extrinsic tasks and provide an advantage when used with ELMo. We also propose SemGCN, an effective framework for incorporating diverse semantic knowledge for further enhancing learned word representations. We make the source code of both models available to encourage reproducible research.
Tasks	Learning Word Embeddings, Representation Learning, Word Embeddings
Published	2018-09-12
URL	https://arxiv.org/abs/1809.04283v4
PDF	https://arxiv.org/pdf/1809.04283v4.pdf
PWC	https://paperswithcode.com/paper/graph-convolutional-networks-based-word
Repo
Framework