July 29, 2019

Paper Group ANR 104

A Convergence Analysis for A Class of Practical Variance-Reduction Stochastic Gradient MCMC


Title	A Convergence Analysis for A Class of Practical Variance-Reduction Stochastic Gradient MCMC
Authors	Changyou Chen, Wenlin Wang, Yizhe Zhang, Qinliang Su, Lawrence Carin
Abstract	Stochastic gradient Markov Chain Monte Carlo (SG-MCMC) has been developed as a flexible family of scalable Bayesian sampling algorithms. However, there has been little theoretical analysis of the impact of minibatch size on the algorithm’s convergence rate. In this paper, we prove that under a limited computational budget/time, a larger minibatch size leads to a faster decrease of the mean squared error bound (thus the fastest one corresponds to using full gradients), which motivates the necessity of variance reduction in SG-MCMC. Consequently, by borrowing ideas from stochastic optimization, we propose a practical variance-reduction technique for SG-MCMC that is efficient in both computation and storage. We develop theory to prove that our algorithm induces a faster convergence rate than standard SG-MCMC. A number of large-scale experiments, ranging from Bayesian learning of logistic regression to deep neural networks, validate the theory and demonstrate the superiority of the proposed variance-reduction SG-MCMC framework.
Tasks	Stochastic Optimization
Published	2017-09-04
URL	http://arxiv.org/abs/1709.01180v1
PDF	http://arxiv.org/pdf/1709.01180v1.pdf
PWC	https://paperswithcode.com/paper/a-convergence-analysis-for-a-class-of
Repo
Framework

Cycles in adversarial regularized learning


Title	Cycles in adversarial regularized learning
Authors	Panayotis Mertikopoulos, Christos Papadimitriou, Georgios Piliouras
Abstract	Regularized learning is a fundamental technique in online optimization, machine learning and many other fields of computer science. A natural question that arises in these settings is how regularized learning algorithms behave when faced against each other. We study a natural formulation of this problem by coupling regularized learning dynamics in zero-sum games. We show that the system’s behavior is Poincaré recurrent, implying that almost every trajectory revisits any (arbitrarily small) neighborhood of its starting point infinitely often. This cycling behavior is robust to the agents’ choice of regularization mechanism (each agent could be using a different regularizer), to positive-affine transformations of the agents’ utilities, and it also persists in the case of networked competition, i.e., for zero-sum polymatrix games.
Tasks
Published	2017-09-08
URL	http://arxiv.org/abs/1709.02738v1
PDF	http://arxiv.org/pdf/1709.02738v1.pdf
PWC	https://paperswithcode.com/paper/cycles-in-adversarial-regularized-learning
Repo
Framework

DAWT: Densely Annotated Wikipedia Texts across multiple languages


Title	DAWT: Densely Annotated Wikipedia Texts across multiple languages
Authors	Nemanja Spasojevic, Preeti Bhargava, Guoning Hu
Abstract	In this work, we open up the DAWT dataset - Densely Annotated Wikipedia Texts across multiple languages. The annotations include labeled text mentions mapping to entities (represented by their Freebase machine ids) as well as the type of the entity. The dataset contains a total of 13.6M articles, 5.0B tokens, and 13.8M mention entity co-occurrences. DAWT contains 4.8 times more anchor text to entity links than originally present in the Wikipedia markup. Moreover, it spans several languages including English, Spanish, Italian, German, French and Arabic. We also present the methodology used to generate the dataset, which enriches Wikipedia markup in order to increase the number of links. In addition to the main dataset, we open up several derived datasets including mention entity co-occurrence counts and entity embeddings, as well as mappings between Freebase ids and Wikidata item ids. We also discuss two applications of these datasets and hope that opening them up will prove useful for the Natural Language Processing and Information Retrieval communities, as well as facilitate multi-lingual research.
Tasks	Entity Embeddings, Information Retrieval
Published	2017-03-02
URL	http://arxiv.org/abs/1703.00948v1
PDF	http://arxiv.org/pdf/1703.00948v1.pdf
PWC	https://paperswithcode.com/paper/dawt-densely-annotated-wikipedia-texts-across
Repo
Framework

The Character Thinks Ahead: creative writing with deep learning nets and its stylistic assessment


Title	The Character Thinks Ahead: creative writing with deep learning nets and its stylistic assessment
Authors	Roger T. Dean, Hazel Smith
Abstract	We discuss how to control outputs from deep learning models of text corpora so as to create contemporary poetic works. We assess whether these controls are successful in the immediate sense of creating stylometric distinctiveness. The specific context is our piece The Character Thinks Ahead (2016/17); the potential applications are broad.
Tasks
Published	2017-12-21
URL	http://arxiv.org/abs/1712.07794v1
PDF	http://arxiv.org/pdf/1712.07794v1.pdf
PWC	https://paperswithcode.com/paper/the-character-thinks-ahead-creative-writing
Repo
Framework

Robust Multi-Image HDR Reconstruction for the Modulo Camera


Title	Robust Multi-Image HDR Reconstruction for the Modulo Camera
Authors	Florian Lang, Tobias Plötz, Stefan Roth
Abstract	Photographing scenes with high dynamic range (HDR) poses great challenges to consumer cameras with their limited sensor bit depth. To address this, Zhao et al. recently proposed a novel sensor concept - the modulo camera - which captures the least significant bits of the recorded scene instead of going into saturation. Similar to conventional pipelines, HDR images can be reconstructed from multiple exposures, but significantly fewer images are needed than with a typical saturating sensor. While the concept is appealing, we show that the original reconstruction approach assumes noise-free measurements and quickly breaks down otherwise. To address this, we propose a novel reconstruction algorithm that is robust to image noise and produces significantly fewer artifacts. We theoretically analyze correctness as well as limitations, and show that our approach significantly outperforms the baseline on real data.
Tasks
Published	2017-07-05
URL	http://arxiv.org/abs/1707.01317v1
PDF	http://arxiv.org/pdf/1707.01317v1.pdf
PWC	https://paperswithcode.com/paper/robust-multi-image-hdr-reconstruction-for-the
Repo
Framework
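
The sensor idea in this abstract can be illustrated with a minimal sketch. It assumes an ideal, noise-free sensor and integer photon counts; the function names, the 8-bit depth, and the short-exposure unwrapping step are illustrative assumptions, not the reconstruction algorithm proposed in the paper.

```python
def saturating_sensor(photon_count, bits=8):
    """Conventional sensor: clips at the largest representable value."""
    return min(photon_count, 2 ** bits - 1)


def modulo_sensor(photon_count, bits=8):
    """Modulo sensor: wraps around, keeping only the least significant bits."""
    return photon_count % (2 ** bits)


def unwrap_with_short_exposure(long_modulo, short_reading, exposure_ratio, bits=8):
    """Recover the wrapped long-exposure value using a shorter exposure.

    Hypothetical helper: the short exposure is assumed not to wrap, so
    scaling it by the exposure ratio gives a coarse estimate that fixes
    the number of wrap-arounds in the long exposure.
    """
    period = 2 ** bits
    coarse_estimate = short_reading * exposure_ratio
    wraps = round((coarse_estimate - long_modulo) / period)
    return long_modulo + wraps * period


true_count = 1003                        # photons collected in the long exposure
clipped = saturating_sensor(true_count)  # information above 255 is lost
wrapped = modulo_sensor(true_count)      # least significant bits survive
short = true_count // 8                  # exposure 8x shorter, quantized
recovered = unwrap_with_short_exposure(wrapped, short, exposure_ratio=8)
```

In this noise-free toy case the wrap count is recovered exactly; with measurement noise the rounding step can pick the wrong wrap count, which is the kind of failure the abstract attributes to noise-free assumptions.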

Deep learning for predicting refractive error from retinal fundus images


Title	Deep learning for predicting refractive error from retinal fundus images
Authors	Avinash V. Varadarajan, Ryan Poplin, Katy Blumer, Christof Angermueller, Joe Ledsam, Reena Chopra, Pearse A. Keane, Greg S. Corrado, Lily Peng, Dale R. Webster
Abstract	Refractive error, one of the leading causes of visual impairment, can be corrected by simple interventions like prescribing eyeglasses. We trained a deep learning algorithm to predict refractive error from fundus photographs from participants in the UK Biobank cohort, which were 45 degree field of view images, and the AREDS clinical trial, which contained 30 degree field of view images. Our model uses the “attention” method to identify features that are correlated with refractive error. We report the mean absolute error (MAE) of the algorithm’s prediction compared to the refractive error obtained in AREDS and UK Biobank. The resulting algorithm had an MAE of 0.56 diopters (95% CI: 0.55-0.56) for estimating spherical equivalent on the UK Biobank dataset and 0.91 diopters (95% CI: 0.89-0.92) for the AREDS dataset. The baseline expected MAE (obtained by simply predicting the mean of this population) was 1.81 diopters (95% CI: 1.79-1.84) for UK Biobank and 1.63 (95% CI: 1.60-1.67) for AREDS. Attention maps suggested that the foveal region was one of the most important areas used by the algorithm to make this prediction, though other regions also contribute to the prediction. The ability to estimate refractive error with high accuracy from retinal fundus photos has not been previously known and demonstrates that deep learning can be applied to make novel predictions from medical images. Given that several groups have recently shown that it is feasible to obtain retinal fundus photos using mobile phones and inexpensive attachments, this work may be particularly relevant in regions of the world where autorefractors may not be readily available.
Tasks
Published	2017-12-21
URL	http://arxiv.org/abs/1712.07798v1
PDF	http://arxiv.org/pdf/1712.07798v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-predicting-refractive-error
Repo
Framework
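
The comparison between the model's MAE and the "predict the population mean" baseline can be sketched as follows. The spherical-equivalent values below are made-up placeholders for illustration only, and the function names are assumptions; none of this reproduces the study's data or code.

```python
def mean_absolute_error(targets, predictions):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


def baseline_mae(targets):
    """MAE obtained by always predicting the mean of the targets."""
    mean = sum(targets) / len(targets)
    return mean_absolute_error(targets, [mean] * len(targets))


# Hypothetical spherical equivalents in diopters (not real study data).
true_se = [-2.0, 0.0, 1.0, -1.0, 2.0]
predicted_se = [-1.5, 0.5, 0.5, -1.0, 1.5]

model_error = mean_absolute_error(true_se, predicted_se)
mean_baseline_error = baseline_mae(true_se)
```

A model is only informative if `model_error` is clearly below `mean_baseline_error`, which is the comparison the abstract reports for both datasets.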

Exploring the bidimensional space: A dynamic logic point of view


Title	Exploring the bidimensional space: A dynamic logic point of view
Authors	Philippe Balbiani, David Fernández-Duque, Emiliano Lorini
Abstract	We present a family of logics for reasoning about agents’ positions and motion in the plane which have several potential applications in the area of multi-agent systems (MAS), such as multi-agent planning and robotics. The most general logic includes (i) atomic formulas for representing the truth of a given fact or the presence of a given agent at a certain position of the plane, (ii) atomic programs corresponding to the four basic orientations in the plane (up, down, left, right) as well as the four program constructs of propositional dynamic logic (sequential composition, nondeterministic composition, iteration and test). As this logic is not computably enumerable, we study some interesting decidable and axiomatizable fragments of it. We also present a decidable extension of the iteration-free fragment of the logic by special programs representing motion of agents in the plane.
Tasks
Published	2017-02-06
URL	http://arxiv.org/abs/1702.01601v1
PDF	http://arxiv.org/pdf/1702.01601v1.pdf
PWC	https://paperswithcode.com/paper/exploring-the-bidimensional-space-a-dynamic
Repo
Framework

Introduction to intelligent computing unit 1


Title	Introduction to intelligent computing unit 1
Authors	Isa Inuwa-Dutse
Abstract	This brief note highlights some basic concepts required for understanding the evolution of machine learning and deep learning models. The note starts with an overview of artificial intelligence and its relationship to the biological neuron, which ultimately led to the evolution of today’s intelligent models.
Tasks
Published	2017-11-15
URL	http://arxiv.org/abs/1711.06552v1
PDF	http://arxiv.org/pdf/1711.06552v1.pdf
PWC	https://paperswithcode.com/paper/introduction-to-intelligent-computing-unit-1
Repo
Framework

Linearly convergent stochastic heavy ball method for minimizing generalization error


Title	Linearly convergent stochastic heavy ball method for minimizing generalization error
Authors	Nicolas Loizou, Peter Richtárik
Abstract	In this work we establish the first linear convergence result for the stochastic heavy ball method. The method performs SGD steps with a fixed stepsize, amended by a heavy ball momentum term. In the analysis, we focus on minimizing the expected loss and not on finite-sum minimization, which is typically a much harder problem. While in the analysis we constrain ourselves to quadratic loss, the overall objective is not necessarily strongly convex.
Tasks
Published	2017-10-30
URL	http://arxiv.org/abs/1710.10737v2
PDF	http://arxiv.org/pdf/1710.10737v2.pdf
PWC	https://paperswithcode.com/paper/linearly-convergent-stochastic-heavy-ball
Repo
Framework
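
As a rough illustration of the method described in this abstract, the sketch below runs SGD with a fixed stepsize plus a heavy ball momentum term, using the standard update x_{k+1} = x_k - stepsize * g_k + momentum * (x_k - x_{k-1}). It assumes a one-dimensional quadratic loss with consistent data (b = a * x_star), so the stochastic gradient vanishes at the minimizer; the sampling distribution, stepsize, and momentum values are illustrative choices, not taken from the paper.

```python
import random


def stochastic_heavy_ball(x_star=3.0, stepsize=0.5, momentum=0.3, steps=200, seed=0):
    """Minimize E[(a*x - b)^2 / 2] with b = a*x_star via SGD with heavy ball momentum."""
    rng = random.Random(seed)
    x_prev = x = 0.0
    for _ in range(steps):
        a = rng.uniform(0.8, 1.2)        # random data sample (assumed distribution)
        b = a * x_star                   # consistent target, so the loss is 0 at x_star
        grad = a * (a * x - b)           # stochastic gradient of the sampled loss
        x_next = x - stepsize * grad + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x


estimate = stochastic_heavy_ball()
```

Because the data are consistent, the gradient noise shrinks as the iterate approaches `x_star`, which is what allows a fixed stepsize to reach the minimizer in this toy setting.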

Robust and Real-time Deep Tracking Via Multi-Scale Domain Adaptation


Title	Robust and Real-time Deep Tracking Via Multi-Scale Domain Adaptation
Authors	Xinyu Wang, Hanxi Li, Yi Li, Fumin Shen, Fatih Porikli
Abstract	Visual tracking is a fundamental problem in computer vision. Recently, some deep-learning-based tracking algorithms have been achieving record-breaking performances. However, due to the high complexity of deep learning, most deep trackers suffer from low tracking speed, and thus are impractical in many real-world applications. Some new deep trackers with smaller network structures achieve high efficiency, but at the cost of a significant decrease in precision. In this paper, we propose to transfer the features for image classification to the visual tracking domain via convolutional channel reductions. The channel reduction can be viewed simply as an additional convolutional layer with a specific task. It not only extracts useful information for object tracking but also significantly increases the tracking speed. To better accommodate the useful features of the target at different scales, the adaptation filters are designed with different sizes. The resulting visual tracker runs in real time and also achieves state-of-the-art accuracy in experiments involving two well-adopted benchmarks with more than 100 test videos.
Tasks	Domain Adaptation, Image Classification, Object Tracking, Visual Tracking
Published	2017-01-03
URL	http://arxiv.org/abs/1701.00561v1
PDF	http://arxiv.org/pdf/1701.00561v1.pdf
PWC	https://paperswithcode.com/paper/robust-and-real-time-deep-tracking-via-multi
Repo
Framework

Wolf in Sheep’s Clothing - The Downscaling Attack Against Deep Learning Applications


Title	Wolf in Sheep’s Clothing - The Downscaling Attack Against Deep Learning Applications
Authors	Qixue Xiao, Kang Li, Deyue Zhang, Yier Jin
Abstract	This paper considers security risks buried in the data processing pipeline in common deep learning applications. Deep learning models usually assume a fixed scale for their training and input data. To allow deep learning applications to handle a wide range of input data, popular frameworks, such as Caffe, TensorFlow, and Torch, all provide data scaling functions to resize input to the dimensions used by deep learning models. Image scaling algorithms are intended to preserve the visual features of an image after scaling. However, common image scaling algorithms are not designed to handle human-crafted images. Attackers can make the scaling outputs look dramatically different from the corresponding input images. This paper presents a downscaling attack that targets the data scaling process in deep learning applications. By carefully crafting input data whose dimensions do not match those used by deep learning models, attackers can create deceiving effects. A deep learning application effectively consumes data that are not the same as those presented to users. The visual inconsistency enables practical evasion and data poisoning attacks on deep learning applications. This paper presents proof-of-concept attack samples against popular deep-learning-based image classification applications. To address the downscaling attacks, the paper also suggests multiple potential mitigation strategies.
Tasks	data poisoning, Image Classification
Published	2017-12-21
URL	http://arxiv.org/abs/1712.07805v1
PDF	http://arxiv.org/pdf/1712.07805v1.pdf
PWC	https://paperswithcode.com/paper/wolf-in-sheeps-clothing-the-downscaling
Repo
Framework
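
The core observation, that a scaler may look at only a fraction of the input pixels, can be shown with a toy sketch. It assumes a simplified nearest-neighbor scaler that keeps every `factor`-th pixel; real framework scaling functions use other sampling grids and interpolation kernels, and the helper names here are hypothetical rather than the paper's attack code.

```python
def downscale_nearest(image, factor):
    """Simplified nearest-neighbor downscaling: keep every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def craft_attack_image(cover, payload, factor):
    """Overwrite only the pixels the scaler samples with the payload's pixels."""
    crafted = [row[:] for row in cover]  # copy so the cover image is unchanged
    for i, payload_row in enumerate(payload):
        for j, value in enumerate(payload_row):
            crafted[i * factor][j * factor] = value
    return crafted


cover = [[255] * 8 for _ in range(8)]    # all-white 8x8 image shown to the user
payload = [[0, 0], [0, 0]]               # all-black 2x2 image the model receives
crafted = craft_attack_image(cover, payload, factor=4)

# Count how many of the 64 pixels differ from the cover image.
changed = sum(c != o for cr, co in zip(crafted, cover) for c, o in zip(cr, co))
model_input = downscale_nearest(crafted, 4)
```

Only 4 of the 64 pixels are modified, so the crafted image still looks almost entirely white at full size, yet the downscaled `model_input` equals the black payload.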

Dimensionality reduction for acoustic vehicle classification with spectral embedding


Title	Dimensionality reduction for acoustic vehicle classification with spectral embedding
Authors	Justin Sunu, Allon G. Percus
Abstract	We propose a method for recognizing moving vehicles, using data from roadside audio sensors. This problem has applications ranging widely, from traffic analysis to surveillance. We extract a frequency signature from the audio signal using a short-time Fourier transform, and treat each time window as an individual data point to be classified. By applying a spectral embedding, we decrease the dimensionality of the data sufficiently for K-nearest neighbors to provide accurate vehicle identification.
Tasks	Dimensionality Reduction
Published	2017-05-27
URL	http://arxiv.org/abs/1705.09869v2
PDF	http://arxiv.org/pdf/1705.09869v2.pdf
PWC	https://paperswithcode.com/paper/dimensionality-reduction-for-acoustic-vehicle
Repo
Framework

SREFI: Synthesis of Realistic Example Face Images


Title	SREFI: Synthesis of Realistic Example Face Images
Authors	Sandipan Banerjee, John S. Bernhard, Walter J. Scheirer, Kevin W. Bowyer, Patrick J. Flynn
Abstract	In this paper, we propose a novel face synthesis approach that can generate an arbitrarily large number of synthetic images of both real and synthetic identities. Thus a face image dataset can be expanded in terms of the number of identities represented and the number of images per identity using this approach, without the identity-labeling and privacy complications that come from downloading images from the web. To measure the visual fidelity and uniqueness of the synthetic face images and identities, we conducted face matching experiments with both human participants and a CNN pre-trained on a dataset of 2.6M real face images. To evaluate the stability of these synthetic faces, we trained a CNN model with an augmented dataset containing close to 200,000 synthetic faces. We used a snapshot of this trained CNN to recognize extremely challenging frontal (real) face images. Experiments showed training with the augmented faces boosted the face recognition performance of the CNN.
Tasks	Face Generation, Face Recognition
Published	2017-04-21
URL	http://arxiv.org/abs/1704.06693v2
PDF	http://arxiv.org/pdf/1704.06693v2.pdf
PWC	https://paperswithcode.com/paper/srefi-synthesis-of-realistic-example-face
Repo
Framework

Efficient and accurate monitoring of the depth information in a Wireless Multimedia Sensor Network based surveillance


Title	Efficient and accurate monitoring of the depth information in a Wireless Multimedia Sensor Network based surveillance
Authors	Anthony Tannoury, Rony Darazi, Christophe Guyeux, Abdallah Makhoul
Abstract	Wireless Multimedia Sensor Network (WMSN) is a promising technology capturing rich multimedia data like audio and video, which can be useful to monitor an environment under surveillance. However, many scenarios in real-time monitoring require 3D depth information. In this research work, we propose to use the disparity map that is computed from two or more images, in order to monitor the depth information of an object or event under surveillance using WMSN. Our system is based on distributed wireless sensors, allowing us to notably reduce the computational time needed for 3D depth reconstruction, thus permitting the success of real-time solutions. Each pair of sensors will capture images of a targeted place/object and will perform Stereo Matching in order to create a Disparity Map. Disparity maps will give us the ability to decrease traffic on the bandwidth, because they are small in size. This will increase WMSN lifetime. Any event can be detected after computing the depth value for the target object in the scene, and 3D scene reconstruction can also be achieved with a disparity map and one or more reference images taken by the node(s).
Tasks	3D Scene Reconstruction, Stereo Matching, Stereo Matching Hand
Published	2017-06-25
URL	http://arxiv.org/abs/1706.08088v1
PDF	http://arxiv.org/pdf/1706.08088v1.pdf
PWC	https://paperswithcode.com/paper/efficient-and-accurate-monitoring-of-the
Repo
Framework
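
How a disparity value turns into a depth value can be sketched with the standard relation for a rectified pinhole stereo pair, depth = focal_length * baseline / disparity. The function name and the camera parameters below are illustrative assumptions; the abstract does not specify the authors' calibration or matching procedure.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for a rectified stereo pair (pinhole camera model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical setup: 700 px focal length, sensors 0.5 m apart.
near_object = depth_from_disparity(70.0, focal_length_px=700.0, baseline_m=0.5)
far_object = depth_from_disparity(35.0, focal_length_px=700.0, baseline_m=0.5)
```

Larger disparities correspond to closer objects, so a change in the disparity of a monitored region indicates that something has moved toward or away from the sensor pair.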

Interpretable Feature Recommendation for Signal Analytics


Title	Interpretable Feature Recommendation for Signal Analytics
Authors	Snehasis Banerjee, Tanushyam Chattopadhyay, Ayan Mukherjee
Abstract	This paper presents an automated approach for interpretable feature recommendation for solving signal data analytics problems. The method has been tested by performing experiments on datasets in the domain of prognostics, where interpretation of features is considered very important. The proposed approach is based on Wide Learning architecture and provides means for interpretation of the recommended features. It is to be noted that such an interpretation is not available with feature learning approaches like Deep Learning (such as Convolutional Neural Network) or feature transformation approaches like Principal Component Analysis. Results show that the feature recommendation and interpretation techniques are quite effective for the problems at hand in terms of performance and drastic reduction in time to develop a solution. An example further shows how this human-in-loop interpretation system can be used as a prescriptive system.
Tasks
Published	2017-11-06
URL	http://arxiv.org/abs/1711.01870v1
PDF	http://arxiv.org/pdf/1711.01870v1.pdf
PWC	https://paperswithcode.com/paper/interpretable-feature-recommendation-for
Repo
Framework