May 6, 2019

Paper Group ANR 356

Tracking Time-Vertex Propagation using Dynamic Graph Wavelets. Load Disaggregation Based on Aided Linear Integer Programming. Practical sketching algorithms for low-rank matrix approximation. A Novel Architecture for Computing Approximate Radon Transform. Crowd Counting Considering Network Flow Constraints in Videos. Generative Deep Neural Networks …

Tracking Time-Vertex Propagation using Dynamic Graph Wavelets


Title	Tracking Time-Vertex Propagation using Dynamic Graph Wavelets
Authors	Francesco Grassi, Nathanael Perraudin, Benjamin Ricaud
Abstract	Graph Signal Processing generalizes classical signal processing to signals or data indexed by the vertices of a weighted graph. So far, research efforts have focused on static graph signals. However, numerous applications involve graph signals evolving in time, such as spreading or propagation of waves on a network. The analysis of this type of data requires a new set of methods that fully take into account the time and graph dimensions. We propose a novel class of wavelet frames named Dynamic Graph Wavelets, whose time-vertex evolution follows a dynamic process. We demonstrate that this set of functions can be combined with sparsity-based approaches such as compressive sensing to reveal information on the dynamic processes occurring on a graph. Experiments on real seismological data show the efficiency of the technique, which makes it possible to estimate the epicenter of earthquake events recorded by a seismic network.
Tasks	Compressive Sensing
Published	2016-06-21
URL	http://arxiv.org/abs/1606.06653v1
PDF	http://arxiv.org/pdf/1606.06653v1.pdf
PWC	https://paperswithcode.com/paper/tracking-time-vertex-propagation-using
Repo
Framework

Load Disaggregation Based on Aided Linear Integer Programming


Title	Load Disaggregation Based on Aided Linear Integer Programming
Authors	Md. Zulfiquar Ali Bhotto, Stephen Makonin, Ivan V. Bajic
Abstract	Load disaggregation based on aided linear integer programming (ALIP) is proposed. We start with a conventional linear integer programming (IP) based disaggregation and enhance it in several ways. The enhancements include additional constraints, correction based on a state diagram, median filtering, and linear programming-based refinement. With the aid of these enhancements, the performance of IP-based disaggregation is significantly improved. The proposed ALIP system relies only on instantaneous load samples instead of waveform signatures, and hence does not crucially depend on a high sampling frequency. Experimental results show that the proposed ALIP system performs better than the conventional IP-based load disaggregation system.
Tasks
Published	2016-03-24
URL	http://arxiv.org/abs/1603.07417v3
PDF	http://arxiv.org/pdf/1603.07417v3.pdf
PWC	https://paperswithcode.com/paper/load-disaggregation-based-on-aided-linear
Repo
Framework

Practical sketching algorithms for low-rank matrix approximation


Title	Practical sketching algorithms for low-rank matrix approximation
Authors	Joel A. Tropp, Alp Yurtsever, Madeleine Udell, Volkan Cevher
Abstract	This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
Tasks
Published	2016-08-31
URL	http://arxiv.org/abs/1609.00048v2
PDF	http://arxiv.org/pdf/1609.00048v2.pdf
PWC	https://paperswithcode.com/paper/practical-sketching-algorithms-for-low-rank
Repo
Framework
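
The abstract above describes reconstructing a low-rank approximation from random linear images (sketches) of the input matrix. The following is a minimal sketch of one such two-sided scheme, assuming Gaussian test matrices and a QR plus least-squares reconstruction; the function name `sketch_low_rank` and the parameters `k`, `l`, and `seed` are illustrative choices, not names taken from the paper, and NumPy is assumed to be available.

```python
import numpy as np

def sketch_low_rank(A, k, l, seed=0):
    """Approximate A as Q @ X using only two random sketches of A.

    k is the range-sketch size and l >= k the co-range-sketch size
    (both are illustrative parameters chosen by the caller).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k))  # right test matrix
    Psi = rng.standard_normal((l, m))    # left test matrix
    Y = A @ Omega                        # range sketch, shape (m, k)
    W = Psi @ A                          # co-range sketch, shape (l, n)
    # After this point A itself is no longer needed.
    Q, _ = np.linalg.qr(Y)               # orthonormal basis for the sketched range
    # Solve (Psi @ Q) X = W in the least-squares sense.
    X, *_ = np.linalg.lstsq(Psi @ Q, W, rcond=None)
    return Q, X

# Usage: an exactly rank-3 matrix is recovered up to rounding error.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 30))
Q, X = sketch_low_rank(A, k=5, l=11)
A_hat = Q @ X
```

Here the sketches `Y` and `W` are the only quantities that depend on `A`, which is what allows the approximation to be formed without revisiting the matrix.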

A Novel Architecture for Computing Approximate Radon Transform


Title	A Novel Architecture for Computing Approximate Radon Transform
Authors	M. A. Khorsandi, N. Karimi, S. Samavi
Abstract	The Radon transform is used in image processing to map an image into intercept-slope coordinates. Its diagonal properties make it appropriate for applications that need processing at different angles. Computing the Radon transform requires many arithmetic operations, which makes it a compute-intensive algorithm. An approximate algorithm for computing the Radon transform that reduces the computational complexity has been introduced in the literature, but this algorithm is complex and needs arbitrary accesses to memory. In this paper we propose an algorithm that accesses memory sequentially. We then introduce an architecture that uses pipelining to reduce the time complexity of the algorithm.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1701.05083v1
PDF	http://arxiv.org/pdf/1701.05083v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-architecture-for-computing
Repo
Framework

Crowd Counting Considering Network Flow Constraints in Videos


Title	Crowd Counting Considering Network Flow Constraints in Videos
Authors	Liqing Gao, Yanzhang Wang, Xin Ye, Jian Wang
Abstract	Growth in the number of people in a monitored scene may increase the probability of security threats, which makes crowd counting more and more important. Most existing approaches estimate the number of pedestrians within a single frame, which results in temporally inconsistent predictions. This paper, for the first time, introduces a quadratic programming model with network flow constraints to improve the accuracy of crowd counting. First, the foreground of each frame is segmented into groups, each of which contains several pedestrians. Then, a regression-based map is developed in accordance with the relationship between low-level features of each group and the number of people in it. Second, a directed graph is constructed to simulate constraints on people's flow, whose vertices represent the groups of each frame and whose arcs represent people moving from one group to another. The people flow can then be viewed as an integer flow in the constructed digraph. Finally, by solving a quadratic programming problem with network flow constraints in the directed graph, we obtain consistency in people counting. The experimental results show that the proposed method can reduce crowd counting errors and improve accuracy. Moreover, this method can also be applied to any state-of-the-art group-based regression counting approach to obtain improvements.
Tasks	Crowd Counting
Published	2016-05-12
URL	http://arxiv.org/abs/1605.03821v2
PDF	http://arxiv.org/pdf/1605.03821v2.pdf
PWC	https://paperswithcode.com/paper/crowd-counting-considering-network-flow
Repo
Framework

Generative Deep Neural Networks for Dialogue: A Short Review


Title	Generative Deep Neural Networks for Dialogue: A Short Review
Authors	Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
Abstract	Researchers have recently started investigating deep neural networks for dialogue applications. In particular, generative sequence-to-sequence (Seq2Seq) models have shown promising results for unstructured tasks, such as word-level dialogue response generation. The hope is that such models will be able to leverage massive amounts of data to learn meaningful natural language representations and response generation strategies, while requiring a minimum amount of domain knowledge and hand-crafting. An important challenge is to develop models that can effectively incorporate dialogue context and generate meaningful and diverse responses. In support of this goal, we review recently proposed models based on generative encoder-decoder neural network architectures, and show that these models have better ability to incorporate long-term dialogue history, to model uncertainty and ambiguity in dialogue, and to generate responses with high-level compositional structure.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1611.06216v1
PDF	http://arxiv.org/pdf/1611.06216v1.pdf
PWC	https://paperswithcode.com/paper/generative-deep-neural-networks-for-dialogue
Repo
Framework

Piece-wise quadratic approximations of arbitrary error functions for fast and robust machine learning


Title	Piece-wise quadratic approximations of arbitrary error functions for fast and robust machine learning
Authors	A. N. Gorban, E. M. Mirkes, A. Zinovyev
Abstract	Most machine learning approaches have stemmed from applying the principle of minimizing the mean squared distance, based on computationally efficient quadratic optimization methods. However, when faced with high-dimensional and noisy data, quadratic error functionals demonstrate many weaknesses, including high sensitivity to contaminating factors and the curse of dimensionality. Therefore, many recent applications in machine learning have exploited properties of non-quadratic error functionals based on the $L_1$ norm or even sub-linear potentials corresponding to quasinorms $L_p$ ($0<p<1$). The downside of these approaches is an increase in the computational cost of optimization. So far, no approaches have been suggested to deal with arbitrary error functionals in a flexible and computationally efficient framework. In this paper, we develop a theory and basic universal data approximation algorithms ($k$-means, principal components, principal manifolds and graphs, regularized and sparse regression), based on piece-wise quadratic error potentials of subquadratic growth (PQSQ potentials). We develop a new and universal framework to minimize arbitrary sub-quadratic error potentials using an algorithm with guaranteed fast convergence to the local or global error minimum. The theory of PQSQ potentials is based on the notion of the cone of minorant functions, and represents a natural approximation formalism based on the application of min-plus algebra. The approach can be applied in most existing machine learning methods, including methods of data approximation and regularized and sparse regression, leading to an improvement in the computational cost/accuracy trade-off. We demonstrate that on synthetic and real-life datasets PQSQ-based machine learning methods achieve orders of magnitude faster computational performance than the corresponding state-of-the-art methods.
Tasks
Published	2016-05-20
URL	http://arxiv.org/abs/1605.06276v2
PDF	http://arxiv.org/pdf/1605.06276v2.pdf
PWC	https://paperswithcode.com/paper/piece-wise-quadratic-approximations-of
Repo
Framework
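
To make the idea of a piece-wise quadratic potential concrete, the following is a minimal sketch of one way to build it, assuming the construction interpolates the target function f at a set of thresholds 0 = r_0 < r_1 < ... < r_p with pieces of the form b_k + a_k x^2 and stays constant beyond r_p. The function name `pqsq_potential` and the threshold values in the usage lines are hypothetical choices for illustration.

```python
import bisect

def pqsq_potential(f, thresholds):
    """Return u(x), a piece-wise quadratic approximation of f(|x|).

    thresholds must be distinct positive numbers; r_0 = 0 is added
    automatically. On [r_k, r_{k+1}) the potential is b_k + a_k * x**2,
    chosen so that u matches f at both ends of the interval.
    """
    r = [0.0] + sorted(thresholds)
    coeffs = []
    for k in range(len(r) - 1):
        denom = r[k] ** 2 - r[k + 1] ** 2
        a = (f(r[k]) - f(r[k + 1])) / denom
        b = (f(r[k + 1]) * r[k] ** 2 - f(r[k]) * r[k + 1] ** 2) / denom
        coeffs.append((a, b))
    cap = f(r[-1])  # constant value beyond the last threshold

    def u(x):
        ax = abs(x)
        if ax >= r[-1]:
            return cap
        k = bisect.bisect_right(r, ax) - 1  # index of the interval containing ax
        a, b = coeffs[k]
        return b + a * ax * ax

    return u

# Usage: approximate the L1 potential f(x) = |x| with three thresholds.
u = pqsq_potential(abs, [1.0, 2.0, 4.0])
values = [u(0.0), u(1.0), u(2.0), u(10.0)]
```

Because each piece is quadratic in x, minimizing a sum of such potentials reduces to a sequence of ordinary quadratic problems, and the flat tail limits the influence of large residuals.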

A study of the effect of JPG compression on adversarial images


Title	A study of the effect of JPG compression on adversarial images
Authors	Gintare Karolina Dziugaite, Zoubin Ghahramani, Daniel M. Roy
Abstract	Neural network image classifiers are known to be vulnerable to adversarial images, i.e., natural images which have been modified by an adversarial perturbation specifically designed to be imperceptible to humans yet fool the classifier. Not only can adversarial images be generated easily, but these images will often be adversarial for networks trained on disjoint subsets of data or with different architectures. Adversarial images represent a potential security risk as well as a serious machine learning challenge—it is clear that vulnerable neural networks perceive images very differently from humans. Noting that virtually every image classification data set is composed of JPG images, we evaluate the effect of JPG compression on the classification of adversarial images. For Fast-Gradient-Sign perturbations of small magnitude, we found that JPG compression often reverses the drop in classification accuracy to a large extent, but not always. As the magnitude of the perturbations increases, JPG recompression alone is insufficient to reverse the effect.
Tasks	Image Classification
Published	2016-08-02
URL	http://arxiv.org/abs/1608.00853v1
PDF	http://arxiv.org/pdf/1608.00853v1.pdf
PWC	https://paperswithcode.com/paper/a-study-of-the-effect-of-jpg-compression-on
Repo
Framework
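
The Fast-Gradient-Sign perturbation mentioned in the abstract moves each input coordinate by a fixed step in the direction that increases the classifier's loss. The following is a minimal sketch of that step, assuming a tiny logistic-regression model as a stand-in for the image classifier; the model, its weights, and the function name `fgsm_perturb` are hypothetical, and the JPG recompression stage studied in the paper is not reproduced here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def fgsm_perturb(x, y, w, b, eps):
    """Return x + eps * sign(grad_x loss) for a logistic model.

    x: input features, y: true label (0 or 1), w and b: model
    parameters, eps: perturbation magnitude. For cross-entropy loss
    on p = sigmoid(w.x + b), the input gradient is (p - y) * w.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Usage: the perturbation lowers the model's confidence in the true label.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, -0.5], 1
x_adv = fgsm_perturb(x, y, w, b, eps=0.1)
score_before = sum(wi * xi for wi, xi in zip(w, x)) + b
score_after = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
```

The parameter `eps` plays the role of the perturbation magnitude discussed in the abstract: larger values change the input more visibly and push the score further from the true label.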

Latent Sequence Decompositions


Title	Latent Sequence Decompositions
Authors	William Chan, Yu Zhang, Quoc Le, Navdeep Jaitly
Abstract	We present the Latent Sequence Decompositions (LSD) framework. LSD decomposes sequences with variable-length output units as a function of both the input sequence and the output sequence. We present a training algorithm which samples valid extensions and an approximate decoding algorithm. We experiment with the Wall Street Journal speech recognition task. Our LSD model achieves 12.9% WER compared to a character baseline of 14.8% WER. When combined with a convolutional network on the encoder, we achieve 9.6% WER.
Tasks	Speech Recognition
Published	2016-10-10
URL	http://arxiv.org/abs/1610.03035v6
PDF	http://arxiv.org/pdf/1610.03035v6.pdf
PWC	https://paperswithcode.com/paper/latent-sequence-decompositions
Repo
Framework
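
The results above are reported as word error rate (WER). As a reminder of how that metric is usually computed, the following is a minimal sketch that divides the word-level edit distance by the number of reference words; the function name `word_error_rate` and the example sentences are illustrative and are not taken from the paper's evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)

# Usage: one substitution and one deletion against six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, since the denominator counts only reference words.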

Geometric Learning and Topological Inference with Biobotic Networks: Convergence Analysis


Title	Geometric Learning and Topological Inference with Biobotic Networks: Convergence Analysis
Authors	Alireza Dirafzoon, Alper Bozkurt, Edgar Lobaton
Abstract	In this study, we present and analyze a framework for geometric and topological estimation for mapping of unknown environments. We consider agents mimicking motion behaviors of cyborg insects, known as biobots, and exploit coordinate-free local interactions among them to infer geometric and topological information about the environment, under minimal sensing and localization constraints. Local interactions are used to create a graphical representation referred to as the encounter graph. A metric is estimated over the encounter graph of the agents in order to construct a geometric point cloud using manifold learning techniques. Topological data analysis (TDA), in particular persistent homology, is used to extract topological features of the space, and a classification method is proposed to infer robust features of interest (e.g. existence of obstacles). We examine the asymptotic behavior of the proposed metric in terms of its convergence to the geodesic distances in the underlying manifold of the domain, and provide stability analysis results for the topological persistence. The proposed framework and its convergence and stability analyses are demonstrated through numerical simulations and experiments.
Tasks	Topological Data Analysis
Published	2016-06-30
URL	http://arxiv.org/abs/1607.00051v1
PDF	http://arxiv.org/pdf/1607.00051v1.pdf
PWC	https://paperswithcode.com/paper/geometric-learning-and-topological-inference
Repo
Framework

Robust Scene Text Recognition with Automatic Rectification


Title	Robust Scene Text Recognition with Automatic Rectification
Authors	Baoguang Shi, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai
Abstract	Recognizing text in natural images is a challenging task with many unsolved problems. Unlike words in documents, words in natural images often possess irregular shapes, which are caused by perspective distortion, curved character placement, etc. We propose RARE (Robust text recognizer with Automatic REctification), a recognition model that is robust to irregular text. RARE is a specially designed deep neural network, which consists of a Spatial Transformer Network (STN) and a Sequence Recognition Network (SRN). In testing, an image is first rectified via a predicted Thin-Plate-Spline (TPS) transformation into a more “readable” image for the following SRN, which recognizes text through a sequence recognition approach. We show that the model is able to recognize several types of irregular text, including perspective text and curved text. RARE is end-to-end trainable, requiring only images and associated text labels, making it convenient to train and deploy the model in practical systems. State-of-the-art or highly competitive performance on several benchmarks demonstrates the effectiveness of the proposed model.
Tasks	Scene Text Recognition
Published	2016-03-12
URL	http://arxiv.org/abs/1603.03915v2
PDF	http://arxiv.org/pdf/1603.03915v2.pdf
PWC	https://paperswithcode.com/paper/robust-scene-text-recognition-with-automatic
Repo
Framework

Learning Word Sense Embeddings from Word Sense Definitions


Title	Learning Word Sense Embeddings from Word Sense Definitions
Authors	Qi Li, Tianshi Li, Baobao Chang
Abstract	Word embeddings play a significant role in many modern NLP systems. Since learning one representation per word is problematic for polysemous words and homonymous words, researchers propose to use one embedding per word sense. Their approaches mainly train word sense embeddings on a corpus. In this paper, we propose to use word sense definitions to learn one embedding per word sense. Experimental results on word similarity tasks and a word sense disambiguation task show that word sense embeddings produced by our approach are of high quality.
Tasks	Word Embeddings, Word Sense Disambiguation
Published	2016-06-15
URL	http://arxiv.org/abs/1606.04835v4
PDF	http://arxiv.org/pdf/1606.04835v4.pdf
PWC	https://paperswithcode.com/paper/learning-word-sense-embeddings-from-word
Repo
Framework

Recurrent Convolutional Neural Network Regression for Continuous Pain Intensity Estimation in Video


Title	Recurrent Convolutional Neural Network Regression for Continuous Pain Intensity Estimation in Video
Authors	Jing Zhou, Xiaopeng Hong, Fei Su, Guoying Zhao
Abstract	Automatic pain intensity estimation plays a significant role in healthcare and medicine. Traditional static methods extract features from each frame of a video separately, which results in unstable changes and peaks among adjacent frames. To overcome this problem, we propose a real-time regression framework based on the recurrent convolutional neural network for automatic frame-level pain intensity estimation. Given vector sequences of AAM-warped facial images, we use a sliding-window strategy to obtain fixed-length input samples for the recurrent network. We then carefully design the architecture of the recurrent network to output continuous-valued pain intensity. The proposed end-to-end pain intensity regression framework can predict the pain intensity of each frame by considering a sufficiently large number of historical frames while limiting the scale of the parameters within the model. Our method achieves promising results regarding both accuracy and running speed on the published UNBC-McMaster Shoulder Pain Expression Archive Database.
Tasks	Pain Intensity Regression
Published	2016-05-03
URL	http://arxiv.org/abs/1605.00894v1
PDF	http://arxiv.org/pdf/1605.00894v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-convolutional-neural-network
Repo
Framework
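
The sliding-window strategy mentioned in the abstract turns a variable-length frame sequence into fixed-length samples for a recurrent network. The following is a minimal sketch of that step, assuming each frame is already represented by a feature vector (plain integers stand in for them here); the function name `sliding_windows` and the `window` and `stride` parameters are hypothetical and the paper's actual window length is not assumed.

```python
def sliding_windows(frames, window, stride=1):
    """Split a frame sequence into overlapping fixed-length samples.

    Each sample holds `window` consecutive frames; consecutive samples
    start `stride` frames apart. Sequences shorter than `window`
    produce no samples.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(frames) - window
    return [frames[s:s + window] for s in range(0, last_start + 1, stride)]

# Usage: five frames, three frames per sample.
samples = sliding_windows([0, 1, 2, 3, 4], window=3)
strided = sliding_windows([0, 1, 2, 3, 4], window=3, stride=2)
```

In a frame-level regression setting, the label for each sample would typically be the intensity of its last frame, so that only past frames inform the prediction; that pairing is an assumption here rather than a detail stated in the abstract.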

Face Alignment by Local Deep Descriptor Regression


Title	Face Alignment by Local Deep Descriptor Regression
Authors	Amit Kumar, Rajeev Ranjan, Vishal Patel, Rama Chellappa
Abstract	We present an algorithm for extracting key-point descriptors using deep convolutional neural networks (CNN). Unlike many existing deep CNNs, our model computes local features around a given point in an image. We also present a face alignment algorithm based on regression using these local descriptors. The proposed method called Local Deep Descriptor Regression (LDDR) is able to localize face landmarks of varying sizes, poses and occlusions with high accuracy. Deep Descriptors presented in this paper are able to uniquely and efficiently describe every pixel in the image and therefore can potentially replace traditional descriptors such as SIFT and HOG. Extensive evaluations on five publicly available unconstrained face alignment datasets show that our deep descriptor network is able to capture strong local features around a given landmark and performs significantly better than many competitive and state-of-the-art face alignment algorithms.
Tasks	Face Alignment
Published	2016-01-29
URL	http://arxiv.org/abs/1601.07950v1
PDF	http://arxiv.org/pdf/1601.07950v1.pdf
PWC	https://paperswithcode.com/paper/face-alignment-by-local-deep-descriptor
Repo
Framework

Video Pixel Networks


Title	Video Pixel Networks
Authors	Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
Abstract	We propose a probabilistic video model, the Video Pixel Network (VPN), that estimates the discrete joint distribution of the raw pixel values in a video. The model and the neural architecture reflect the time, space and color structure of video tensors and encode it as a four-dimensional dependency chain. The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth. The VPN also produces detailed samples on the action-conditional Robotic Pushing benchmark and generalizes to the motion of novel objects.
Tasks
Published	2016-10-03
URL	http://arxiv.org/abs/1610.00527v1
PDF	http://arxiv.org/pdf/1610.00527v1.pdf
PWC	https://paperswithcode.com/paper/video-pixel-networks
Repo
Framework