Paper Group ANR 104
A Convergence Analysis for A Class of Practical Variance-Reduction Stochastic Gradient MCMC
Title | A Convergence Analysis for A Class of Practical Variance-Reduction Stochastic Gradient MCMC |
Authors | Changyou Chen, Wenlin Wang, Yizhe Zhang, Qinliang Su, Lawrence Carin |
Abstract | Stochastic gradient Markov Chain Monte Carlo (SG-MCMC) has been developed as a flexible family of scalable Bayesian sampling algorithms. However, there has been little theoretical analysis of the impact of minibatch size on the algorithm’s convergence rate. In this paper, we prove that under a limited computational budget/time, a larger minibatch size leads to a faster decrease of the mean squared error bound (thus the fastest one corresponds to using full gradients), which motivates the necessity of variance reduction in SG-MCMC. Consequently, by borrowing ideas from stochastic optimization, we propose a practical variance-reduction technique for SG-MCMC that is efficient in both computation and storage. We develop theory to prove that our algorithm induces a faster convergence rate than standard SG-MCMC. A number of large-scale experiments, ranging from Bayesian learning of logistic regression to deep neural networks, validate the theory and demonstrate the superiority of the proposed variance-reduction SG-MCMC framework. |
Tasks | Stochastic Optimization |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.01180v1 |
http://arxiv.org/pdf/1709.01180v1.pdf | |
PWC | https://paperswithcode.com/paper/a-convergence-analysis-for-a-class-of |
Repo | |
Framework | |
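The abstract credits the variance-reduction idea to stochastic optimization (SVRG-style control variates) but does not spell out the update in this listing. The sketch below is a generic variance-reduced SGLD step on a toy Bayesian logistic regression, written from that general recipe rather than from the paper itself; the model, step size, and snapshot schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayesian logistic regression: theta ~ N(0, I), y_i ~ Bernoulli(sigmoid(x_i . theta)).
N, d = 1000, 5
X = rng.normal(size=(N, d))
true_theta = rng.normal(size=d)
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ true_theta))).astype(float)

def loglik_grad(theta, idx):
    """Summed gradient of the log-likelihood over the examples in `idx`."""
    p = 1.0 / (1.0 + np.exp(-X[idx] @ theta))
    return X[idx].T @ (y[idx] - p)

def svrg_sgld(n_steps=2000, snapshot_every=50, batch=32, eps=1e-4):
    """SVRG-style variance-reduced SGLD: a generic sketch, not the paper's exact scheme."""
    theta = np.zeros(d)
    samples = []
    for t in range(n_steps):
        if t % snapshot_every == 0:
            # Periodically recompute a full-data gradient at a snapshot point.
            snap = theta.copy()
            full_grad = loglik_grad(snap, np.arange(N))
        idx = rng.choice(N, size=batch, replace=False)
        # Control-variate estimate of the full-data log-likelihood gradient.
        g = (N / batch) * (loglik_grad(theta, idx) - loglik_grad(snap, idx)) + full_grad
        grad_log_post = -theta + g            # N(0, I) prior contributes -theta
        theta = theta + 0.5 * eps * grad_log_post + np.sqrt(eps) * rng.normal(size=d)
        samples.append(theta.copy())
    return np.array(samples)

posterior_samples = svrg_sgld()
print("posterior mean estimate:", posterior_samples[500:].mean(axis=0))
```

The control variate keeps the minibatch gradient close to a periodically refreshed full-data gradient, which is the general mechanism behind reducing gradient variance without paying the full-gradient cost at every step.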
Cycles in adversarial regularized learning
Title | Cycles in adversarial regularized learning |
Authors | Panayotis Mertikopoulos, Christos Papadimitriou, Georgios Piliouras |
Abstract | Regularized learning is a fundamental technique in online optimization, machine learning and many other fields of computer science. A natural question that arises in these settings is how regularized learning algorithms behave when faced against each other. We study a natural formulation of this problem by coupling regularized learning dynamics in zero-sum games. We show that the system’s behavior is Poincaré recurrent, implying that almost every trajectory revisits any (arbitrarily small) neighborhood of its starting point infinitely often. This cycling behavior is robust to the agents’ choice of regularization mechanism (each agent could be using a different regularizer), to positive-affine transformations of the agents’ utilities, and it also persists in the case of networked competition, i.e., for zero-sum polymatrix games. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02738v1 |
http://arxiv.org/pdf/1709.02738v1.pdf | |
PWC | https://paperswithcode.com/paper/cycles-in-adversarial-regularized-learning |
Repo | |
Framework | |
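The recurrence result concerns continuous-time follow-the-regularized-leader dynamics; a quick way to see the cycling is to simulate multiplicative weights (entropic-regularizer FTRL) for both players of Matching Pennies with a small step size. The game, step size, and horizon below are illustrative choices, not taken from the paper.

```python
import numpy as np

# Matching Pennies payoff for the row player (zero-sum: the column player gets -A).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def multiplicative_weights(T=5000, eta=0.01):
    """Both players run entropic-regularizer FTRL (multiplicative weights) against each other."""
    sx = np.zeros(2)   # cumulative per-action payoffs for the row player
    sy = np.zeros(2)   # cumulative per-action payoffs for the column player
    traj = []
    for _ in range(T):
        x = np.exp(eta * sx); x /= x.sum()
        y = np.exp(eta * sy); y /= y.sum()
        traj.append((x[0], y[0]))
        sx += A @ y            # row player's expected payoff per action
        sy += -(A.T @ x)       # column player's expected payoff per action
    return np.array(traj)

traj = multiplicative_weights()
# The strategies never settle at the (0.5, 0.5) equilibrium; with a small step size the
# discrete trajectory traces the near-cycles of the continuous-time recurrent dynamics.
print(traj[::1000])
```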
DAWT: Densely Annotated Wikipedia Texts across multiple languages
Title | DAWT: Densely Annotated Wikipedia Texts across multiple languages |
Authors | Nemanja Spasojevic, Preeti Bhargava, Guoning Hu |
Abstract | In this work, we open up the DAWT dataset - Densely Annotated Wikipedia Texts across multiple languages. The annotations include labeled text mentions mapping to entities (represented by their Freebase machine ids) as well as the type of the entity. The data set contains a total of 13.6M articles, 5.0B tokens, and 13.8M mention-entity co-occurrences. DAWT contains 4.8 times more anchor text to entity links than originally present in the Wikipedia markup. Moreover, it spans several languages including English, Spanish, Italian, German, French and Arabic. We also present the methodology used to generate the dataset, which enriches Wikipedia markup in order to increase the number of links. In addition to the main dataset, we open up several derived datasets including mention-entity co-occurrence counts and entity embeddings, as well as mappings between Freebase ids and Wikidata item ids. We also discuss two applications of these datasets and hope that opening them up would prove useful for the Natural Language Processing and Information Retrieval communities, as well as facilitate multi-lingual research. |
Tasks | Entity Embeddings, Information Retrieval |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00948v1 |
http://arxiv.org/pdf/1703.00948v1.pdf | |
PWC | https://paperswithcode.com/paper/dawt-densely-annotated-wikipedia-texts-across |
Repo | |
Framework | |
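The exact file format of the released annotations is not described in the abstract. Purely as an illustration of one of the derived datasets mentioned there (mention-entity co-occurrence counts), the sketch below tabulates counts from hypothetical (mention, Freebase mid) records; the field layout and the sample ids are invented for the example.

```python
from collections import Counter

# Hypothetical annotated records: (mention surface form, Freebase machine id) pairs per document.
documents = [
    [("Paris", "/m/05qtj"), ("France", "/m/0f8l9c")],
    [("Paris", "/m/05qtj"), ("Paris", "/m/0hypothetical")],  # an ambiguous mention
]

# Mention-entity co-occurrence counts, one of the derived datasets the abstract describes.
cooccurrence = Counter()
for doc in documents:
    for mention, mid in doc:
        cooccurrence[(mention.lower(), mid)] += 1

for (mention, mid), count in cooccurrence.most_common():
    print(f"{mention}\t{mid}\t{count}")
```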
The Character Thinks Ahead: creative writing with deep learning nets and its stylistic assessment
Title | The Character Thinks Ahead: creative writing with deep learning nets and its stylistic assessment |
Authors | Roger T. Dean, Hazel Smith |
Abstract | We discuss how to control outputs from deep learning models of text corpora so as to create contemporary poetic works. We assess whether these controls are successful in the immediate sense of creating stylometric distinctiveness. The specific context is our piece The Character Thinks Ahead (2016/17); the potential applications are broad. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07794v1 |
http://arxiv.org/pdf/1712.07794v1.pdf | |
PWC | https://paperswithcode.com/paper/the-character-thinks-ahead-creative-writing |
Repo | |
Framework | |
Robust Multi-Image HDR Reconstruction for the Modulo Camera
Title | Robust Multi-Image HDR Reconstruction for the Modulo Camera |
Authors | Florian Lang, Tobias Plötz, Stefan Roth |
Abstract | Photographing scenes with high dynamic range (HDR) poses great challenges to consumer cameras with their limited sensor bit depth. To address this, Zhao et al. recently proposed a novel sensor concept - the modulo camera - which captures the least significant bits of the recorded scene instead of going into saturation. Similar to conventional pipelines, HDR images can be reconstructed from multiple exposures, but significantly fewer images are needed than with a typical saturating sensor. While the concept is appealing, we show that the original reconstruction approach assumes noise-free measurements and quickly breaks down otherwise. To address this, we propose a novel reconstruction algorithm that is robust to image noise and produces significantly fewer artifacts. We theoretically analyze correctness as well as limitations, and show that our approach significantly outperforms the baseline on real data. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01317v1 |
http://arxiv.org/pdf/1707.01317v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-multi-image-hdr-reconstruction-for-the |
Repo | |
Framework | |
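A minimal 1-D sketch of the modulo-camera measurement model and of naive multi-exposure unwrapping, assuming an 8-bit rollover and two exposures; the paper's robust reconstruction algorithm is not reproduced here, and the final comment points at the failure mode it is designed to fix.

```python
import numpy as np

rng = np.random.default_rng(1)
MAXVAL = 256                       # rollover point of the modulo sensor (e.g., 8 bits)
SHORT_EXP, LONG_EXP = 0.05, 1.0    # two exposure times

def modulo_capture(radiance, exposure, noise_sigma=0.0):
    """Modulo sensor: the recorded intensity wraps around instead of saturating."""
    signal = radiance * exposure + rng.normal(0.0, noise_sigma, size=radiance.shape)
    return np.mod(np.round(signal), MAXVAL)

# A tiny 1-D HDR "scene" whose long-exposure intensities far exceed the sensor range.
radiance = np.array([10.0, 80.0, 300.0, 1200.0, 5000.0])
short = modulo_capture(radiance, SHORT_EXP)    # short exposure stays (mostly) unwrapped
long_ = modulo_capture(radiance, LONG_EXP)     # long exposure is heavily wrapped

# Naive unwrapping: choose the wrap count that best agrees with the scaled short exposure.
coarse = short / SHORT_EXP                          # rough radiance estimate
wraps = np.round((coarse * LONG_EXP - long_) / MAXVAL)
hdr = (long_ + wraps * MAXVAL) / LONG_EXP
print("recovered radiance:", hdr)

# With noise_sigma > 0 the wrap count is easily off by one near multiples of MAXVAL,
# producing large artifacts; that failure mode is what the paper's robust method addresses.
```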
Deep learning for predicting refractive error from retinal fundus images
Title | Deep learning for predicting refractive error from retinal fundus images |
Authors | Avinash V. Varadarajan, Ryan Poplin, Katy Blumer, Christof Angermueller, Joe Ledsam, Reena Chopra, Pearse A. Keane, Greg S. Corrado, Lily Peng, Dale R. Webster |
Abstract | Refractive error, one of the leading causes of visual impairment, can be corrected by simple interventions such as prescribing eyeglasses. We trained a deep learning algorithm to predict refractive error from fundus photographs of participants in the UK Biobank cohort (45 degree field of view images) and the AREDS clinical trial (30 degree field of view images). Our model uses the “attention” method to identify features that are correlated with refractive error. We report the mean absolute error (MAE) of the algorithm’s predictions relative to the refractive error measured in AREDS and UK Biobank. The resulting algorithm had a MAE of 0.56 diopters (95% CI: 0.55-0.56) for estimating spherical equivalent on the UK Biobank dataset and 0.91 diopters (95% CI: 0.89-0.92) for the AREDS dataset. The baseline expected MAE (obtained by simply predicting the mean of this population) was 1.81 diopters (95% CI: 1.79-1.84) for UK Biobank and 1.63 (95% CI: 1.60-1.67) for AREDS. Attention maps suggested that the foveal region was one of the most important areas used by the algorithm to make this prediction, though other regions also contribute to the prediction. The ability to estimate refractive error with high accuracy from retinal fundus photos was not previously known, and demonstrates that deep learning can be applied to make novel predictions from medical images. Given that several groups have recently shown that it is feasible to obtain retinal fundus photos using mobile phones and inexpensive attachments, this work may be particularly relevant in regions of the world where autorefractors may not be readily available. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07798v1 |
http://arxiv.org/pdf/1712.07798v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-predicting-refractive-error |
Repo | |
Framework | |
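The evaluation compares the model's mean absolute error against a baseline that always predicts the population mean. The snippet below just makes that comparison concrete on a handful of hypothetical spherical-equivalent values; the numbers are not from the paper.

```python
import numpy as np

# Hypothetical refractive errors (spherical equivalent, diopters) and model predictions.
true_se = np.array([-2.5, -0.75, 0.0, 1.25, 3.0])
pred_se = np.array([-2.1, -0.5, 0.3, 0.9, 2.6])

model_mae = np.mean(np.abs(pred_se - true_se))
# Baseline described in the abstract: always predict the population mean.
baseline_mae = np.mean(np.abs(true_se.mean() - true_se))
print(f"model MAE = {model_mae:.2f} D, baseline MAE = {baseline_mae:.2f} D")
```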
Exploring the bidimensional space: A dynamic logic point of view
Title | Exploring the bidimensional space: A dynamic logic point of view |
Authors | Philippe Balbiani, David Fernández-Duque, Emiliano Lorini |
Abstract | We present a family of logics for reasoning about agents’ positions and motion in the plane which have several potential applications in the area of multi-agent systems (MAS), such as multi-agent planning and robotics. The most general logic includes (i) atomic formulas for representing the truth of a given fact or the presence of a given agent at a certain position of the plane, (ii) atomic programs corresponding to the four basic orientations in the plane (up, down, left, right) as well as the four program constructs of propositional dynamic logic (sequential composition, nondeterministic composition, iteration and test). As this logic is not computably enumerable, we study some interesting decidable and axiomatizable fragments of it. We also present a decidable extension of the iteration-free fragment of the logic by special programs representing motion of agents in the plane. |
Tasks | |
Published | 2017-02-06 |
URL | http://arxiv.org/abs/1702.01601v1 |
http://arxiv.org/pdf/1702.01601v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-the-bidimensional-space-a-dynamic |
Repo | |
Framework | |
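A toy, finite interpretation of the logic's ingredients: the four atomic motion programs and the PDL constructs (sequential composition, nondeterministic composition, iteration, test) read as relations over a bounded grid. This only illustrates the relational semantics; the paper's decidability and axiomatization results concern the full (infinite) plane.

```python
from itertools import product

W = H = 4
STATES = set(product(range(W), range(H)))   # a bounded fragment of the plane

def move(dx, dy):
    """Atomic program: one step in a basic orientation, staying inside the grid."""
    return {((x, y), (x + dx, y + dy)) for (x, y) in STATES if (x + dx, y + dy) in STATES}

UP, DOWN, LEFT, RIGHT = move(0, 1), move(0, -1), move(-1, 0), move(1, 0)

def seq(r, s):                      # r ; s   (sequential composition)
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

def choice(r, s):                   # r U s   (nondeterministic composition)
    return r | s

def star(r):                        # r*      (iteration: reflexive-transitive closure)
    closure = {(a, a) for a in STATES} | set(r)
    while True:
        bigger = closure | seq(closure, r)
        if bigger == closure:
            return closure
        closure = bigger

def test(pred):                     # phi?    (test: stay put where phi holds)
    return {(a, a) for a in STATES if pred(a)}

# Example: positions reachable from (0, 0) by the program (right U up)*
reach = {b for (a, b) in star(choice(RIGHT, UP)) if a == (0, 0)}
print(sorted(reach))
```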
Introduction to intelligent computing unit 1
Title | Introduction to intelligent computing unit 1 |
Authors | Isa Inuwa-Dutse |
Abstract | This brief note highlights some basic concepts required toward understanding the evolution of machine learning and deep learning models. The note starts with an overview of artificial intelligence and its relationship to the biological neuron that ultimately led to the evolution of today’s intelligent models. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.06552v1 |
http://arxiv.org/pdf/1711.06552v1.pdf | |
PWC | https://paperswithcode.com/paper/introduction-to-intelligent-computing-unit-1 |
Repo | |
Framework | |
Linearly convergent stochastic heavy ball method for minimizing generalization error
Title | Linearly convergent stochastic heavy ball method for minimizing generalization error |
Authors | Nicolas Loizou, Peter Richtárik |
Abstract | In this work we establish the first linear convergence result for the stochastic heavy ball method. The method performs SGD steps with a fixed stepsize, amended by a heavy ball momentum term. In the analysis, we focus on minimizing the expected loss and not on finite-sum minimization, which is typically a much harder problem. While in the analysis we constrain ourselves to quadratic loss, the overall objective is not necessarily strongly convex. |
Tasks | |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.10737v2 |
http://arxiv.org/pdf/1710.10737v2.pdf | |
PWC | https://paperswithcode.com/paper/linearly-convergent-stochastic-heavy-ball |
Repo | |
Framework | |
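A minimal sketch of the stochastic heavy ball iteration x_{k+1} = x_k - gamma * g_k + beta * (x_k - x_{k-1}) on a consistent least-squares problem; the step size and momentum values are illustrative and not the constants from the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Least-squares problem with a consistent solution: minimize E (a_i . x - b_i)^2.
n, d = 500, 10
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star

def stochastic_heavy_ball(steps=5000, gamma=0.05, beta=0.5):
    x = np.zeros(d)
    x_prev = x.copy()
    for _ in range(steps):
        i = rng.integers(n)
        g = (A[i] @ x - b[i]) * A[i]    # stochastic gradient of 0.5 * (a_i . x - b_i)^2
        # x_{k+1} = x_k - gamma * g_k + beta * (x_k - x_{k-1})
        x, x_prev = x - gamma * g + beta * (x - x_prev), x
    return x

x_hat = stochastic_heavy_ball()
print("distance to minimizer:", np.linalg.norm(x_hat - x_star))
```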
Robust and Real-time Deep Tracking Via Multi-Scale Domain Adaptation
Title | Robust and Real-time Deep Tracking Via Multi-Scale Domain Adaptation |
Authors | Xinyu Wang, Hanxi Li, Yi Li, Fumin Shen, Fatih Porikli |
Abstract | Visual tracking is a fundamental problem in computer vision. Recently, some deep-learning-based tracking algorithms have been achieving record-breaking performances. However, due to the high complexity of deep learning, most deep trackers suffer from low tracking speed, and thus are impractical in many real-world applications. Some new deep trackers with smaller network structures achieve high efficiency, but at the cost of a significant decrease in precision. In this paper, we propose to transfer the features learned for image classification to the visual tracking domain via convolutional channel reductions. The channel reduction can simply be viewed as an additional convolutional layer with a specific task. It not only extracts useful information for object tracking but also significantly increases the tracking speed. To better accommodate the useful features of the target at different scales, the adaptation filters are designed with different sizes. The resulting visual tracker runs in real time and achieves state-of-the-art accuracy in experiments involving two well-adopted benchmarks with more than 100 test videos. |
Tasks | Domain Adaptation, Image Classification, Object Tracking, Visual Tracking |
Published | 2017-01-03 |
URL | http://arxiv.org/abs/1701.00561v1 |
http://arxiv.org/pdf/1701.00561v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-real-time-deep-tracking-via-multi |
Repo | |
Framework | |
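The channel reduction described in the abstract amounts to an extra convolutional layer that maps many backbone channels to a few task-specific ones. Below is a 1x1-convolution version of that idea in plain NumPy; the feature shapes and the (random) weights are placeholders, and the multi-scale adaptation filters are only noted in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are features from a classification backbone: (channels, height, width).
backbone_features = rng.normal(size=(512, 28, 28))

# Channel-reduction layer: a 1x1 convolution is just a linear map applied per spatial location.
reduction_weight = rng.normal(size=(64, 512)) * 0.01   # learned in the real tracker

reduced = np.einsum("oc,chw->ohw", reduction_weight, backbone_features)
print(backbone_features.shape, "->", reduced.shape)    # (512, 28, 28) -> (64, 28, 28)

# Multi-scale adaptation filters of different sizes would then operate on `reduced`;
# that part is only described at a high level in the abstract and is omitted here.
```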
Wolf in Sheep’s Clothing - The Downscaling Attack Against Deep Learning Applications
Title | Wolf in Sheep’s Clothing - The Downscaling Attack Against Deep Learning Applications |
Authors | Qixue Xiao, Kang Li, Deyue Zhang, Yier Jin |
Abstract | This paper considers security risks buried in the data processing pipeline in common deep learning applications. Deep learning models usually assume a fixed scale for their training and input data. To allow deep learning applications to handle a wide range of input data, popular frameworks, such as Caffe, TensorFlow, and Torch, all provide data scaling functions to resize input to the dimensions used by deep learning models. Image scaling algorithms are intended to preserve the visual features of an image after scaling. However, common image scaling algorithms are not designed to handle human-crafted images. Attackers can make the scaling outputs look dramatically different from the corresponding input images. This paper presents a downscaling attack that targets the data scaling process in deep learning applications. By carefully crafting input data whose dimensions mismatch those used by deep learning models, attackers can create deceiving effects: a deep learning application effectively consumes data that are not the same as those presented to users. The visual inconsistency enables practical evasion and data poisoning attacks on deep learning applications. This paper presents proof-of-concept attack samples against popular deep-learning-based image classification applications. To address the downscaling attacks, the paper also suggests multiple potential mitigation strategies. |
Tasks | data poisoning, Image Classification |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07805v1 |
http://arxiv.org/pdf/1712.07805v1.pdf | |
PWC | https://paperswithcode.com/paper/wolf-in-sheeps-clothing-the-downscaling |
Repo | |
Framework | |
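The attack exploits the fact that a downscaler looks at only a sparse subset of source pixels. The sketch below crafts an image that looks almost uniform at full resolution yet becomes an arbitrary target image after nearest-neighbor downscaling; the resizer is written out explicitly so the sampled positions are known, whereas real frameworks' resizing functions differ in the details.

```python
import numpy as np

def nearest_sample_indices(src, dst):
    """Source indices picked by a simple nearest-neighbor resize from `src` to `dst` pixels."""
    return np.floor((np.arange(dst) + 0.5) * src / dst).astype(int)

def nearest_resize(img, out_h, out_w):
    rows = nearest_sample_indices(img.shape[0], out_h)
    cols = nearest_sample_indices(img.shape[1], out_w)
    return img[np.ix_(rows, cols)]

rng = np.random.default_rng(0)
benign = np.full((224, 224), 200, dtype=np.uint8)                  # what the human sees (mostly)
target = rng.integers(0, 255, size=(32, 32), dtype=np.uint8)       # what the model will see

# Craft the attack image: overwrite only the pixels the resizer will sample.
attack = benign.copy()
rows = nearest_sample_indices(224, 32)
cols = nearest_sample_indices(224, 32)
attack[np.ix_(rows, cols)] = target

downscaled = nearest_resize(attack, 32, 32)
print("fraction of pixels changed in the full-size image:",
      (attack != benign).mean())                                   # roughly 2% of pixels
print("downscaled output equals the hidden target:",
      np.array_equal(downscaled, target))
```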
Dimensionality reduction for acoustic vehicle classification with spectral embedding
Title | Dimensionality reduction for acoustic vehicle classification with spectral embedding |
Authors | Justin Sunu, Allon G. Percus |
Abstract | We propose a method for recognizing moving vehicles, using data from roadside audio sensors. This problem has applications ranging widely, from traffic analysis to surveillance. We extract a frequency signature from the audio signal using a short-time Fourier transform, and treat each time window as an individual data point to be classified. By applying a spectral embedding, we decrease the dimensionality of the data sufficiently for K-nearest neighbors to provide accurate vehicle identification. |
Tasks | Dimensionality Reduction |
Published | 2017-05-27 |
URL | http://arxiv.org/abs/1705.09869v2 |
http://arxiv.org/pdf/1705.09869v2.pdf | |
PWC | https://paperswithcode.com/paper/dimensionality-reduction-for-acoustic-vehicle |
Repo | |
Framework | |
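A minimal pipeline following the recipe in the abstract: a short-time Fourier transform per time window, spectral embedding for dimensionality reduction, then K-nearest-neighbor classification. The synthetic "vehicle" signals and every parameter below are placeholders, not values from the paper.

```python
import numpy as np
from scipy.signal import stft
from sklearn.manifold import SpectralEmbedding
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
fs = 8000

def synthetic_vehicle(freq, seconds=2.0):
    """Stand-in for a roadside recording: a noisy tone at a vehicle-specific frequency."""
    t = np.arange(int(fs * seconds)) / fs
    return np.sin(2 * np.pi * freq * t) + 0.5 * rng.normal(size=t.size)

signals = [synthetic_vehicle(300), synthetic_vehicle(900)]   # two "vehicle classes"
windows, labels = [], []
for cls, sig in enumerate(signals):
    _, _, Z = stft(sig, fs=fs, nperseg=256)      # frequency signature per time window
    spec = np.abs(Z).T                           # rows = time windows, cols = frequency bins
    windows.append(spec)
    labels.append(np.full(len(spec), cls))
X = np.vstack(windows)
y = np.concatenate(labels)

# Spectral embedding reduces each window's frequency signature to a few coordinates,
# after which K-nearest neighbors separates the classes.
X_emb = SpectralEmbedding(n_components=3, random_state=0).fit_transform(X)

train = rng.random(len(y)) < 0.7                 # transductive split: embed all, label a subset
knn = KNeighborsClassifier(n_neighbors=5).fit(X_emb[train], y[train])
print("held-out window accuracy:", knn.score(X_emb[~train], y[~train]))
```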
SREFI: Synthesis of Realistic Example Face Images
Title | SREFI: Synthesis of Realistic Example Face Images |
Authors | Sandipan Banerjee, John S. Bernhard, Walter J. Scheirer, Kevin W. Bowyer, Patrick J. Flynn |
Abstract | In this paper, we propose a novel face synthesis approach that can generate an arbitrarily large number of synthetic images of both real and synthetic identities. Thus a face image dataset can be expanded in terms of the number of identities represented and the number of images per identity using this approach, without the identity-labeling and privacy complications that come from downloading images from the web. To measure the visual fidelity and uniqueness of the synthetic face images and identities, we conducted face matching experiments with both human participants and a CNN pre-trained on a dataset of 2.6M real face images. To evaluate the stability of these synthetic faces, we trained a CNN model with an augmented dataset containing close to 200,000 synthetic faces. We used a snapshot of this trained CNN to recognize extremely challenging frontal (real) face images. Experiments showed training with the augmented faces boosted the face recognition performance of the CNN. |
Tasks | Face Generation, Face Recognition |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06693v2 |
http://arxiv.org/pdf/1704.06693v2.pdf | |
PWC | https://paperswithcode.com/paper/srefi-synthesis-of-realistic-example-face |
Repo | |
Framework | |
Efficient and accurate monitoring of the depth information in a Wireless Multimedia Sensor Network based surveillance
Title | Efficient and accurate monitoring of the depth information in a Wireless Multimedia Sensor Network based surveillance |
Authors | Anthony Tannoury, Rony Darazi, Christophe Guyeux, Abdallah Makhoul |
Abstract | Wireless Multimedia Sensor Network (WMSN) is a promising technology for capturing rich multimedia data like audio and video, which can be useful to monitor an environment under surveillance. However, many real-time monitoring scenarios require 3D depth information. In this research work, we propose to use the disparity map that is computed from two or multiple images, in order to monitor the depth information of an object or event under surveillance using a WMSN. Our system is based on distributed wireless sensors, allowing us to notably reduce the computational time needed for 3D depth reconstruction, thus enabling real-time solutions. Each pair of sensors captures images of a targeted place/object and performs stereo matching in order to create a disparity map. Because disparity maps are small in size, they decrease the traffic on the bandwidth and thereby increase the WMSN lifetime. Any event can be detected after computing the depth value for the target object in the scene, and 3D scene reconstruction can also be achieved with a disparity map and some reference image(s) taken by the node(s). |
Tasks | 3D Scene Reconstruction, Stereo Matching, Stereo Matching Hand |
Published | 2017-06-25 |
URL | http://arxiv.org/abs/1706.08088v1 |
http://arxiv.org/pdf/1706.08088v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-and-accurate-monitoring-of-the |
Repo | |
Framework | |
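A compact sketch of the two building blocks the abstract relies on: block-matching stereo to obtain a disparity value, and the standard pinhole-stereo conversion from disparity to depth (depth = f * B / d). The camera parameters and the synthetic image pair are placeholders.

```python
import numpy as np

def block_match_disparity(left, right, x, y, patch=3, max_disp=16):
    """Disparity of pixel (y, x): shift of the best-matching patch along the same row."""
    ref = left[y - patch:y + patch + 1, x - patch:x + patch + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp):
        if x - d - patch < 0:
            break
        cand = right[y - patch:y + patch + 1, x - d - patch:x - d + patch + 1].astype(float)
        cost = np.sum((ref - cand) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def depth_from_disparity(disparity, focal_px=700.0, baseline_m=0.12):
    """Standard pinhole-stereo relation: depth = f * B / disparity."""
    return focal_px * baseline_m / max(disparity, 1)

# Tiny synthetic stereo pair: the right image is the left image shifted by 4 pixels.
rng = np.random.default_rng(0)
left = rng.integers(0, 255, size=(32, 32))
right = np.roll(left, -4, axis=1)

d = block_match_disparity(left, right, x=20, y=16)
print("disparity:", d, "-> depth (m):", round(depth_from_disparity(d), 2))
```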
Interpretable Feature Recommendation for Signal Analytics
Title | Interpretable Feature Recommendation for Signal Analytics |
Authors | Snehasis Banerjee, Tanushyam Chattopadhyay, Ayan Mukherjee |
Abstract | This paper presents an automated approach for interpretable feature recommendation for solving signal data analytics problems. The method has been tested by performing experiments on datasets in the domain of prognostics, where interpretation of features is considered very important. The proposed approach is based on a Wide Learning architecture and provides means for interpretation of the recommended features. It is to be noted that such an interpretation is not available with feature learning approaches like Deep Learning (such as Convolutional Neural Networks) or feature transformation approaches like Principal Component Analysis. Results show that the feature recommendation and interpretation techniques are quite effective for the problems at hand, both in terms of performance and a drastic reduction in the time to develop a solution. It is further shown, by an example, how this human-in-the-loop interpretation system can be used as a prescriptive system. |
Tasks | |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.01870v1 |
http://arxiv.org/pdf/1711.01870v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-feature-recommendation-for |
Repo | |
Framework | |