Paper Group ANR 1133
Mean Field Analysis of Neural Networks: A Central Limit Theorem
Title | Mean Field Analysis of Neural Networks: A Central Limit Theorem |
Authors | Justin Sirignano, Konstantinos Spiliopoulos |
Abstract | We rigorously prove a central limit theorem for neural network models with a single hidden layer. The central limit theorem is proven in the asymptotic regime of simultaneously (A) large numbers of hidden units and (B) large numbers of stochastic gradient descent training iterations. Our result describes the neural network’s fluctuations around its mean-field limit. The fluctuations have a Gaussian distribution and satisfy a stochastic partial differential equation. The proof relies upon weak convergence methods from stochastic analysis. In particular, we prove relative compactness for the sequence of processes and uniqueness of the limiting process in a suitable Sobolev space. |
Tasks | Speech Recognition |
Published | 2018-08-28 |
URL | https://arxiv.org/abs/1808.09372v2 |
PDF | https://arxiv.org/pdf/1808.09372v2.pdf |
PWC | https://paperswithcode.com/paper/mean-field-analysis-of-neural-networks-a |
Repo | |
Framework | |
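To make the scaling in the abstract concrete, here is a minimal sketch of the objects involved, in illustrative notation (the paper's precise definitions differ in detail):

```latex
% One-hidden-layer network with N hidden units (illustrative notation):
g^N(x) = \frac{1}{N} \sum_{i=1}^{N} c^i \, \sigma\!\left(w^i \cdot x\right)

% Empirical measure of the parameters after SGD training:
\mu^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{(c^i, w^i)}

% Law of large numbers (mean-field limit): \mu^N \to \bar{\mu}.
% The central limit theorem concerns the rescaled fluctuations
\eta^N = \sqrt{N} \left( \mu^N - \bar{\mu} \right),

% which converge to a Gaussian limit solving a stochastic
% partial differential equation.
```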
How the Softmax Output is Misleading for Evaluating the Strength of Adversarial Examples
Title | How the Softmax Output is Misleading for Evaluating the Strength of Adversarial Examples |
Authors | Utku Ozbulak, Wesley De Neve, Arnout Van Messem |
Abstract | Even before deep learning architectures became the de facto models for complex computer vision tasks, the softmax function was, given its elegant properties, already used to analyze the predictions of feedforward neural networks. Nowadays, the output of the softmax function is also commonly used to assess the strength of adversarial examples: malicious data points designed to fail machine learning models during the testing phase. However, in this paper, we show that it is possible to generate adversarial examples that take advantage of some properties of the softmax function, leading to undesired outcomes when interpreting the strength of the adversarial examples at hand. Specifically, we argue that the output of the softmax function is a poor indicator when the strength of an adversarial example is analyzed and that this indicator can be easily tricked by already existing methods for adversarial example generation. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08577v1 |
PDF | http://arxiv.org/pdf/1811.08577v1.pdf |
PWC | https://paperswithcode.com/paper/how-the-softmax-output-is-misleading-for |
Repo | |
Framework | |
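The argument rests on structural properties of the softmax function; one of them, shift invariance, is easy to demonstrate. The sketch below (illustrative logit values, not from the paper) shows two logit vectors with very different absolute magnitudes producing identical softmax outputs:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Shifting every logit by a constant leaves the softmax output unchanged,
# so the same "confidence" can hide very different logit magnitudes.
strong = np.array([10.0, 2.0, 1.0])
shifted = strong - 6.0            # [4.0, -4.0, -5.0]

print(softmax(strong))   # ~[0.9995, 0.0003, 0.0001]
print(softmax(shifted))  # identical output
```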
Image-Dependent Local Entropy Models for Learned Image Compression
Title | Image-Dependent Local Entropy Models for Learned Image Compression |
Authors | David Minnen, George Toderici, Saurabh Singh, Sung Jin Hwang, Michele Covell |
Abstract | The leading approach for image compression with artificial neural networks (ANNs) is to learn a nonlinear transform and a fixed entropy model that are optimized for rate-distortion performance. We show that this approach can be significantly improved by incorporating spatially local, image-dependent entropy models. The key insight is that existing ANN-based methods learn an entropy model that is shared between the encoder and decoder, but they do not transmit any side information that would allow the model to adapt to the structure of a specific image. We present a method for augmenting ANN-based image coders with image-dependent side information that leads to a 17.8% rate reduction over a state-of-the-art ANN-based baseline model on a standard evaluation set, and 70-98% reductions on images with low visual complexity that are poorly captured by a fixed, global entropy model. |
Tasks | Image Compression |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12295v1 |
PDF | http://arxiv.org/pdf/1805.12295v1.pdf |
PWC | https://paperswithcode.com/paper/image-dependent-local-entropy-models-for |
Repo | |
Framework | |
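A rough intuition for why an image-adapted entropy model saves bits: the ideal code length of a symbol is -log2 of its modeled probability, so a model matched to a low-complexity image assigns far fewer bits than a fixed global one. The sketch below uses hypothetical symbol statistics and probabilities, not the paper's models:

```python
import numpy as np

def rate_bits(symbols, probs):
    """Ideal code length under an entropy model: -sum(log2 p(symbol))."""
    return -np.log2(probs[symbols]).sum()

# Hypothetical quantized latents for a low-complexity patch: mostly zeros.
symbols = np.zeros(1000, dtype=int)
symbols[::50] = 1

global_model = np.array([0.6, 0.4])    # fixed model shared by all images
local_model = np.array([0.98, 0.02])   # side-info-adapted local model

print(rate_bits(symbols, global_model))  # ~749 bits
print(rate_bits(symbols, local_model))   # ~141 bits (plus the cost of side info)
```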
Knowing Where to Look? Analysis on Attention of Visual Question Answering System
Title | Knowing Where to Look? Analysis on Attention of Visual Question Answering System |
Authors | Wei Li, Zehuan Yuan, Xiangzhong Fang, Changhu Wang |
Abstract | Attention mechanisms have been widely used in Visual Question Answering (VQA) solutions due to their capacity to model deep cross-domain interactions. Analyzing attention maps offers us a perspective to find out limitations of current VQA systems and an opportunity to further improve them. In this paper, we select two state-of-the-art VQA approaches with attention mechanisms to study their robustness and disadvantages by visualizing and analyzing their estimated attention maps. We find that both methods are sensitive to features, and simultaneously, they perform badly for counting and multi-object related questions. We believe that the findings and analytical method will help researchers identify crucial challenges on the way to improve their own VQA systems. |
Tasks | Question Answering, Visual Question Answering |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.03821v1 |
PDF | http://arxiv.org/pdf/1810.03821v1.pdf |
PWC | https://paperswithcode.com/paper/knowing-where-to-look-analysis-on-attention |
Repo | |
Framework | |
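The analysis relies on visualizing estimated attention maps over the image. A minimal sketch of that kind of visualization, assuming a standard grid-based attention vector (all names and the 14x14 grid are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

def show_attention(image, attn, grid=14):
    """Overlay a softmax-normalized grid attention vector on the image.
    Assumes image height/width are divisible by `grid`."""
    h, w = image.shape[:2]
    attn_map = attn.reshape(grid, grid)
    # Nearest-neighbor upsampling of the attention grid to image resolution.
    attn_map = np.kron(attn_map, np.ones((h // grid, w // grid)))
    plt.imshow(image)
    plt.imshow(attn_map, cmap="jet", alpha=0.5)
    plt.axis("off")
    plt.show()
```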
Multi-Task Learning for Sequence Tagging: An Empirical Study
Title | Multi-Task Learning for Sequence Tagging: An Empirical Study |
Authors | Soravit Changpinyo, Hexiang Hu, Fei Sha |
Abstract | We study three general multi-task learning (MTL) approaches on 11 sequence tagging tasks. Our extensive empirical results show that in about 50% of the cases, jointly learning all 11 tasks improves upon either independent or pairwise learning of the tasks. We also show that pairwise MTL can inform us which tasks can benefit others and which tasks benefit from being learned jointly. In particular, we identify tasks that can always benefit others as well as tasks that are always harmed by others. Interestingly, one of our MTL approaches yields embeddings of the tasks that reveal the natural clustering of semantic and syntactic tasks. Our inquiries have opened the door to further utilization of MTL in NLP. |
Tasks | Multi-Task Learning |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04151v1 |
PDF | http://arxiv.org/pdf/1808.04151v1.pdf |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-sequence-tagging-an |
Repo | |
Framework | |
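For context, one common way to jointly learn all tasks is hard parameter sharing: a shared encoder with one tagging head per task. The PyTorch sketch below is a generic stand-in, not the paper's exact architectures:

```python
import torch.nn as nn

class SharedTagger(nn.Module):
    """Hard parameter sharing: shared BiLSTM encoder, one linear head per task."""
    def __init__(self, vocab_size, task_label_sizes, emb=100, hidden=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hidden // 2, bidirectional=True, batch_first=True)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_label_sizes.items()}
        )

    def forward(self, tokens, task):
        h, _ = self.encoder(self.embed(tokens))  # (batch, seq, hidden)
        return self.heads[task](h)               # per-token label scores

# Training alternates batches across tasks; gradients from every task
# update the shared encoder.
model = SharedTagger(vocab_size=10000, task_label_sizes={"pos": 45, "ner": 9, "chunk": 23})
```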
Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry
Title | Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry |
Authors | Nan Yang, Rui Wang, Jörg Stückler, Daniel Cremers |
Abstract | Monocular visual odometry approaches that purely rely on geometric cues are prone to scale drift and require sufficient motion parallax in successive frames for motion estimation and 3D reconstruction. In this paper, we propose to leverage deep monocular depth prediction to overcome limitations of geometry-based monocular visual odometry. To this end, we incorporate deep depth predictions into Direct Sparse Odometry (DSO) as direct virtual stereo measurements. For depth prediction, we design a novel deep network that refines predicted depth from a single image in a two-stage process. We train our network in a semi-supervised way on photoconsistency in stereo images and on consistency with accurate sparse depth reconstructions from Stereo DSO. Our depth predictions surpass state-of-the-art approaches for monocular depth on the KITTI benchmark. Moreover, our Deep Virtual Stereo Odometry clearly exceeds previous monocular and deep-learning-based methods in accuracy. It even achieves performance comparable to state-of-the-art stereo methods, while relying on only a single camera. |
Tasks | 3D Reconstruction, Depth Estimation, Monocular Visual Odometry, Motion Estimation, Visual Odometry |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02570v2 |
PDF | http://arxiv.org/pdf/1807.02570v2.pdf |
PWC | https://paperswithcode.com/paper/deep-virtual-stereo-odometry-leveraging-deep |
Repo | |
Framework | |
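The core coupling is that a predicted depth induces a "virtual" stereo disparity d = fx * b / depth, which can be penalized photometrically like a real stereo measurement. The sketch below is illustrative (names, the warping model, and the single-pixel residual are simplifications, not DSO's actual formulation):

```python
def virtual_stereo_residual(left, right_virtual, depth, fx, baseline, u, v):
    """Photometric residual between a left-image pixel and the pixel in a
    virtual right view displaced by the depth-induced disparity."""
    d = fx * baseline / depth[v, u]   # disparity from predicted depth
    u_r = int(round(u - d))           # corresponding column in the right view
    if not (0 <= u_r < left.shape[1]):
        return 0.0                    # out of view: no constraint
    return float(left[v, u]) - float(right_virtual[v, u_r])
```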
Essential Tensor Learning for Multi-view Spectral Clustering
Title | Essential Tensor Learning for Multi-view Spectral Clustering |
Authors | Jianlong Wu, Zhouchen Lin, Hongbin Zha |
Abstract | Multi-view clustering has attracted much attention recently, as it aims to exploit multi-view information to improve clustering performance. However, most recent work focuses on self-representation-based subspace clustering, which has high computational complexity. In this paper, we focus on the Markov chain based spectral clustering method and propose a novel essential tensor learning method to explore the high-order correlations for multi-view representation. We first construct a tensor based on multi-view transition probability matrices of the Markov chain. By incorporating the idea from robust principal component analysis, a tensor singular value decomposition (t-SVD) based tensor nuclear norm is imposed to preserve the low-rank property of the essential tensor, which can well capture the principal information from multiple views. We also employ the tensor rotation operator for this task to better investigate the relationship among views as well as reduce the computational complexity. The proposed method can be efficiently optimized by the alternating direction method of multipliers (ADMM). Extensive experiments on six real-world datasets corresponding to five different applications show that our method achieves superior performance over other state-of-the-art methods. |
Tasks | |
Published | 2018-07-10 |
URL | https://arxiv.org/abs/1807.03602v2 |
PDF | https://arxiv.org/pdf/1807.03602v2.pdf |
PWC | https://paperswithcode.com/paper/essential-tensor-learning-for-multi-view |
Repo | |
Framework | |
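The t-SVD based tensor nuclear norm used here has a standard computation: FFT along the third mode, then sum singular values over all frontal slices. A minimal NumPy sketch of that definition (the paper additionally rotates the tensor before applying it):

```python
import numpy as np

def tsvd_nuclear_norm(T):
    """t-SVD based tensor nuclear norm of an n1 x n2 x n3 tensor."""
    n3 = T.shape[2]
    Tf = np.fft.fft(T, axis=2)  # FFT along the third mode
    total = sum(
        np.linalg.svd(Tf[:, :, k], compute_uv=False).sum() for k in range(n3)
    )
    return total / n3
```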
Reblur2Deblur: Deblurring Videos via Self-Supervised Learning
Title | Reblur2Deblur: Deblurring Videos via Self-Supervised Learning |
Authors | Huaijin Chen, Jinwei Gu, Orazio Gallo, Ming-Yu Liu, Ashok Veeraraghavan, Jan Kautz |
Abstract | Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference. Traditional deblurring algorithms leverage the physics of the image formation model and use hand-crafted priors: they usually produce results that better reflect the underlying scene but exhibit artifacts. Recent learning-based methods implicitly extract the distribution of natural images directly from the data and use it to synthesize plausible images. Their results are impressive, but they are not always faithful to the content of the latent image. We present an approach that bridges the two. Our method fine-tunes existing deblurring neural networks in a self-supervised fashion by enforcing that the output, when blurred based on the optical flow between subsequent frames, matches the input blurry image. We show that our method significantly improves the performance of existing methods on several datasets both visually and in terms of image quality metrics. The supplementary material is available at https://goo.gl/nYPjEQ |
Tasks | Deblurring, Optical Flow Estimation |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05117v1 |
PDF | http://arxiv.org/pdf/1801.05117v1.pdf |
PWC | https://paperswithcode.com/paper/reblur2deblur-deblurring-videos-via-self |
Repo | |
Framework | |
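The self-supervision can be pictured as follows: average the estimated sharp frame warped along fractions of the optical flow over the exposure interval, and require the result to match the observed blurry input. The sketch below uses nearest-neighbor warping for brevity; all names and the 9-sample exposure discretization are illustrative:

```python
import numpy as np

def reblur_loss(sharp, flow, blurry, n_steps=9):
    """Photometric consistency between a synthetically reblurred estimate
    and the observed blurry frame."""
    h, w = sharp.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    acc = np.zeros_like(sharp, dtype=float)
    for t in np.linspace(-0.5, 0.5, n_steps):  # sample the exposure interval
        x_w = np.clip((xs + t * flow[..., 0]).round().astype(int), 0, w - 1)
        y_w = np.clip((ys + t * flow[..., 1]).round().astype(int), 0, h - 1)
        acc += sharp[y_w, x_w]                 # nearest-neighbor warp
    reblurred = acc / n_steps
    return np.mean((reblurred - blurry) ** 2)
```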
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead
Title | Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead |
Authors | Cynthia Rudin |
Abstract | Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, criminal justice, and in other domains. People have hoped that creating methods for explaining these black box models will alleviate some of these problems, but trying to *explain* black box models, rather than creating models that are *interpretable* in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society. There is a way forward: design models that are inherently interpretable. This manuscript clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare, and computer vision. |
Tasks | Decision Making, Interpretable Machine Learning |
Published | 2018-11-26 |
URL | https://arxiv.org/abs/1811.10154v3 |
PDF | https://arxiv.org/pdf/1811.10154v3.pdf |
PWC | https://paperswithcode.com/paper/please-stop-explaining-black-box-models-for |
Repo | |
Framework | |
The Automatic Identification of Butterfly Species
Title | The Automatic Identification of Butterfly Species |
Authors | Juanying Xie, Qi Hou, Yinghuan Shi, Lv Peng, Liping Jing, Fuzhen Zhuang, Junping Zhang, Xiaoyang Tang, Shengquan Xu |
Abstract | The available butterfly data sets comprise only a limited number of species, and their images are always standard patterns, lacking photographs of butterflies in their living environment. To overcome these limitations, we build a butterfly data set covering all butterfly species in China, with 4270 standard-pattern images of 1176 butterfly species and 1425 living-environment images of 111 species. We propose to use the deep learning technique Faster R-CNN to train an automatic butterfly identification system that performs both butterfly position detection and species recognition. We remove species with only one living-environment image from the data set, then partition the remaining living-environment images into two subsets: one used as the test subset, and the other as the training subset, combined either with all standard-pattern butterfly images or with only the standard-pattern images of the species present in the living-environment images. To construct the training subset for Faster R-CNN, nine augmentation methods were applied, including up-down and left-right flipping, rotation at different angles, adding noise, blurring, and contrast adjustment. Three prediction models were trained, and the mAP (mean average precision) criterion was used to evaluate their performance. The experimental results demonstrate that our Faster R-CNN based butterfly identification system performs well: its worst mAP reaches 60%, and it can simultaneously detect the positions of multiple butterflies in a single living-environment image and recognize their species. |
Tasks | |
Published | 2018-03-18 |
URL | http://arxiv.org/abs/1803.06626v1 |
PDF | http://arxiv.org/pdf/1803.06626v1.pdf |
PWC | https://paperswithcode.com/paper/the-automatic-identification-of-butterfly |
Repo | |
Framework | |
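A sketch of the augmentation pipeline described in the abstract, using Pillow; the parameter ranges (rotation angle, noise level, blur radius, contrast factor) and the filename are illustrative assumptions:

```python
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

def augment(img, rng):
    """Produce flipped, rotated, blurred, contrast-adjusted, and noisy variants."""
    out = [
        img.transpose(Image.Transpose.FLIP_LEFT_RIGHT),  # left-right flip
        img.transpose(Image.Transpose.FLIP_TOP_BOTTOM),  # up-down flip
        img.rotate(rng.uniform(-30, 30)),                # rotation
        img.filter(ImageFilter.GaussianBlur(2)),         # blurring
        ImageEnhance.Contrast(img).enhance(1.5),         # contrast adjustment
    ]
    arr = np.asarray(img).astype(float)
    noisy = np.clip(arr + rng.normal(0, 10, arr.shape), 0, 255)
    out.append(Image.fromarray(noisy.astype(np.uint8)))  # additive noise
    return out

variants = augment(Image.open("butterfly.jpg"), np.random.default_rng(0))
```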
Efficient Image Evidence Analysis of CNN Classification Results
Title | Efficient Image Evidence Analysis of CNN Classification Results |
Authors | Keyang Zhou, Bernhard Kainz |
Abstract | Convolutional neural networks (CNNs) define the current state-of-the-art for image recognition. With their emerging popularity, especially for critical applications like medical image analysis or self-driving cars, confirmability is becoming an issue. The black-box nature of trained predictors makes it difficult to trace failure cases or to understand the internal reasoning processes leading to results. In this paper we introduce a novel, efficient method to visualise the evidence that leads to decisions in CNNs. In contrast to network fixation or saliency map methods, our method is able to illustrate the evidence for or against a classifier’s decision in input pixel space approximately 10 times faster than previous methods. We also show that our approach is less prone to noise and can focus on the most relevant input regions, thus making it more accurate and interpretable. Moreover, by making simplifications we link our method with other visualisation methods, providing a general explanation for gradient-based visualisation techniques. We believe that our work makes network introspection more feasible for debugging and understanding deep convolutional networks. This will increase trust between humans and deep learning models. |
Tasks | Self-Driving Cars |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01693v1 |
PDF | http://arxiv.org/pdf/1801.01693v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-image-evidence-analysis-of-cnn |
Repo | |
Framework | |
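For orientation, the simplest gradient-based evidence map is gradient x input: backpropagate a class score to the pixels and weight the gradient by the input. The PyTorch sketch below is a generic baseline of this family, not the paper's exact method:

```python
import torch

def gradient_times_input(model, image, target_class):
    """Signed per-pixel evidence map: d(score)/d(pixel) * pixel value."""
    image = image.clone().requires_grad_(True)
    score = model(image.unsqueeze(0))[0, target_class]  # class logit
    score.backward()
    # Positive values support the class, negative values speak against it.
    return (image.grad * image).sum(dim=0)
```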
Accelerated Structure-Aware Reinforcement Learning for Delay-Sensitive Energy Harvesting Wireless Sensors
Title | Accelerated Structure-Aware Reinforcement Learning for Delay-Sensitive Energy Harvesting Wireless Sensors |
Authors | Nikhilesh Sharma, Nicholas Mastronarde, Jacob Chakareski |
Abstract | We investigate an energy-harvesting wireless sensor transmitting latency-sensitive data over a fading channel. The sensor injects captured data packets into its transmission queue and relies on ambient energy harvested from the environment to transmit them. We aim to find the optimal scheduling policy that decides whether or not to transmit the queue’s head-of-line packet at each transmission opportunity such that the expected packet queuing delay is minimized given the available harvested energy. No prior knowledge of the stochastic processes that govern the channel, captured data, or harvested energy dynamics are assumed, thereby necessitating the use of online learning to optimize the scheduling policy. We formulate this scheduling problem as a Markov decision process (MDP) and analyze the structural properties of its optimal value function. In particular, we show that it is non-decreasing and has increasing differences in the queue backlog and that it is non-increasing and has increasing differences in the battery state. We exploit this structure to formulate a novel accelerated reinforcement learning (RL) algorithm to solve the scheduling problem online at a much faster learning rate, while limiting the induced computational complexity. Our experiments demonstrate that the proposed algorithm closely approximates the performance of an optimal offline solution that requires a priori knowledge of the channel, captured data, and harvested energy dynamics. Simultaneously, by leveraging the value function’s structure, our approach achieves competitive performance relative to a state-of-the-art RL algorithm, at potentially orders of magnitude lower complexity. Finally, considerable performance gains are demonstrated over the well-known and widely used Q-learning algorithm. |
Tasks | Q-Learning |
Published | 2018-07-22 |
URL | https://arxiv.org/abs/1807.08315v2 |
PDF | https://arxiv.org/pdf/1807.08315v2.pdf |
PWC | https://paperswithcode.com/paper/accelerated-structure-aware-reinforcement |
Repo | |
Framework | |
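One way to exploit such structure during learning is to project the value estimates back onto the proven monotone shape after each update. The sketch below pairs a vanilla Q-learning step with a monotonicity correction in the backlog dimension; it is a simplified stand-in for the paper's algorithm (states are (backlog, battery) indices, costs are minimized):

```python
import numpy as np

def structured_q_update(Q, s, a, cost, s_next, alpha=0.1, gamma=0.95):
    """Q-learning step plus a projection enforcing that the value function
    is non-decreasing in the queue backlog."""
    b, e = s
    td_target = cost + gamma * Q[s_next].min()   # minimize expected cost
    Q[b, e, a] += alpha * (td_target - Q[b, e, a])

    V = Q.min(axis=-1)                           # state values
    if b + 1 < V.shape[0] and V[b + 1, e] < V[b, e]:
        Q[b + 1, e] += V[b, e] - V[b + 1, e]     # lift the violating state
    return Q
```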
Harmonic Adversarial Attack Method
Title | Harmonic Adversarial Attack Method |
Authors | Wen Heng, Shuchang Zhou, Tingting Jiang |
Abstract | Adversarial attacks find perturbations that can fool models into misclassifying images. Previous works succeeded in generating noisy/edge-rich adversarial perturbations, at the cost of degraded image quality. Such perturbations, even when they are small in scale, are usually easy for human vision to spot. In contrast, we propose the Harmonic Adversarial Attack Method (HAAM), which generates edge-free perturbations by using harmonic functions. The edge-free property guarantees that the generated adversarial images preserve visual quality, even when the perturbations are of large magnitude. Experiments also show that adversaries generated by HAAM often have higher rates of success when transferring between models. In addition, we find that harmonic perturbations can simulate natural phenomena like natural lighting and shadows, making it possible to find corner cases for given models as a first step toward improving them. |
Tasks | Adversarial Attack |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.10590v2 |
PDF | http://arxiv.org/pdf/1807.10590v2.pdf |
PWC | https://paperswithcode.com/paper/harmonic-adversarial-attack-method |
Repo | |
Framework | |
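Harmonic functions satisfy Laplace's equation, so perturbations built from them are smooth and introduce no edges. A toy construction using real parts of complex powers (the coefficients and normalization are illustrative; the paper's construction and optimization are more involved):

```python
import numpy as np

def harmonic_perturbation(h, w, coeffs=(0.3, -0.2, 0.1)):
    """Smooth, edge-free field as a combination of harmonic basis functions."""
    y, x = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]
    z = x + 1j * y
    # Re(z^n) is harmonic for every n, and sums of harmonics are harmonic.
    field = sum(c * (z ** (n + 1)).real for n, c in enumerate(coeffs))
    return field / np.abs(field).max()

# An adversarial image would then be np.clip(image + eps * field, 0, 1).
```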
Sentiment Composition of Words with Opposing Polarities
Title | Sentiment Composition of Words with Opposing Polarities |
Authors | Svetlana Kiritchenko, Saif M. Mohammad |
Abstract | In this paper, we explore sentiment composition in phrases that have at least one positive and at least one negative word—phrases like ‘happy accident’ and ‘best winter break’. We compiled a dataset of such opposing polarity phrases and manually annotated them with real-valued scores of sentiment association. Using this dataset, we analyze the linguistic patterns present in opposing polarity phrases. Finally, we apply several unsupervised and supervised techniques of sentiment composition to determine their efficacy on this dataset. Our best system, which incorporates information from the phrase’s constituents, their parts of speech, their sentiment association scores, and their embedding vectors, obtains an accuracy of over 80% on the opposing polarity phrases. |
Tasks | |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04542v1 |
PDF | http://arxiv.org/pdf/1805.04542v1.pdf |
PWC | https://paperswithcode.com/paper/sentiment-composition-of-words-with-opposing |
Repo | |
Framework | |
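The best system combines constituent-level signals into phrase features. A hypothetical sketch of that feature construction (the lookup tables, pooling choices, and omitted POS features are assumptions, not the paper's exact design):

```python
import numpy as np

def phrase_features(words, polarity, embed):
    """Concatenate constituent sentiment statistics with averaged embeddings."""
    scores = np.array([polarity[w] for w in words])
    vecs = np.mean([embed[w] for w in words], axis=0)
    return np.concatenate([[scores.min(), scores.max(), scores.mean()], vecs])

# These features would feed a regressor predicting the phrase's real-valued
# sentiment score, e.g. for the opposing-polarity phrase 'happy accident'.
```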
Towards Understanding End-of-trip Instructions in a Taxi Ride Scenario
Title | Towards Understanding End-of-trip Instructions in a Taxi Ride Scenario |
Authors | Deepthi Karkada, Ramesh Manuvinakurike, Kallirroi Georgila |
Abstract | We introduce a dataset containing human-authored descriptions of target locations in an “end-of-trip in a taxi ride” scenario. We describe our data collection method and a novel annotation scheme that supports understanding of such descriptions of target locations. Our dataset contains target location descriptions for both synthetic and real-world images as well as visual annotations (ground truth labels, dimensions of vehicles and objects, coordinates of the target location, distance and direction of the target location from vehicles and objects) that can be used in various visual and language tasks. We also perform a pilot experiment on how the corpus could be applied to visual reference resolution in this domain. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.03950v1 |
PDF | http://arxiv.org/pdf/1807.03950v1.pdf |
PWC | https://paperswithcode.com/paper/towards-understanding-end-of-trip |
Repo | |
Framework | |