October 20, 2019

Paper Group AWR 274

Perceptual Compressive Sensing. Minimizing Close-k Aggregate Loss Improves Classification. Cogni-Net: Cognitive Feature Learning through Deep Visual Perception. AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias. Are adversarial examples inevitable?. Rethinking Feature Distribution for Loss …

Perceptual Compressive Sensing


Title	Perceptual Compressive Sensing
Authors	Jiang Du, Xuemei Xie, Chenye Wang, Guangming Shi
Abstract	Compressive sensing (CS) works to acquire measurements at sub-Nyquist rate and recover the scene images. Existing CS methods always recover the scene images in pixel level. This causes the smoothness of recovered images and lack of structure information, especially at a low measurement rate. To overcome this drawback, in this paper, we propose perceptual CS to obtain high-level structured recovery. Our task no longer focuses on pixel level. Instead, we work to make a better visual effect. In detail, we employ perceptual loss, defined on feature level, to enhance the structure information of the recovered images. Experiments show that our method achieves better visual results with stronger structure information than existing CS methods at the same measurement rate.
Tasks	Compressive Sensing
Published	2018-02-01
URL	http://arxiv.org/abs/1802.00176v2
PDF	http://arxiv.org/pdf/1802.00176v2.pdf
PWC	https://paperswithcode.com/paper/perceptual-compressive-sensing
Repo	https://github.com/jiang-du/Perceptual-CS
Framework	none

Minimizing Close-k Aggregate Loss Improves Classification


Title	Minimizing Close-k Aggregate Loss Improves Classification
Authors	Bryan He, James Zou
Abstract	In classification, the de facto method for aggregating individual losses is the average loss. When the actual metric of interest is 0-1 loss, it is common to minimize the average surrogate loss for some well-behaved (e.g. convex) surrogate. Recently, several other aggregate losses such as the maximal loss and average top-$k$ loss were proposed as alternative objectives to address shortcomings of the average loss. However, we identify common classification settings, e.g. the data is imbalanced, has too many easy or ambiguous examples, etc., when average, maximal and average top-$k$ all suffer from suboptimal decision boundaries, even on an infinitely large training set. To address this problem, we propose a new classification objective called the close-$k$ aggregate loss, where we adaptively minimize the loss for points close to the decision boundary. We provide theoretical guarantees for the 0-1 accuracy when we optimize close-$k$ aggregate loss. We also conduct systematic experiments across the PMLB and OpenML benchmark datasets. Close-$k$ achieves significant gains in 0-1 test accuracy, improvements of $\geq 2$% and $p<0.05$, in over 25% of the datasets compared to average, maximal and average top-$k$. In contrast, the previous aggregate losses outperformed close-$k$ in less than 2% of the datasets.
Tasks
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00521v2
PDF	http://arxiv.org/pdf/1811.00521v2.pdf
PWC	https://paperswithcode.com/paper/minimizing-close-k-aggregate-loss-improves
Repo	https://github.com/bryan-he/closek
Framework	pytorch
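
The aggregate losses named in this abstract can be compared on a toy example. The sketch below is illustrative only: all function names are hypothetical, the hinge loss is an assumed surrogate, and `close_k_loss` uses one simple reading of "points close to the decision boundary" (the k examples with the smallest absolute margin), which may differ from the exact formulation in the paper or in the closek repository.

```python
def hinge(margin):
    """Hinge surrogate loss for a signed margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def average_loss(margins):
    """Mean surrogate loss over all examples."""
    return sum(hinge(m) for m in margins) / len(margins)


def maximal_loss(margins):
    """Largest individual surrogate loss."""
    return max(hinge(m) for m in margins)


def average_top_k_loss(margins, k):
    """Mean of the k largest individual losses."""
    losses = sorted((hinge(m) for m in margins), reverse=True)
    return sum(losses[:k]) / k


def close_k_loss(margins, k):
    """Mean loss over the k examples nearest the boundary (assumed reading:
    smallest absolute margin)."""
    closest = sorted(margins, key=abs)[:k]
    return sum(hinge(m) for m in closest) / k


# Signed margins: two easy points, two near the boundary, one far outlier.
margins = [3.0, 2.5, 0.2, -0.1, -4.0]
print(average_loss(margins))
print(maximal_loss(margins))
print(average_top_k_loss(margins, 2))
print(close_k_loss(margins, 2))
```

On this toy input the maximal and average top-k aggregates are dominated by the outlier with margin -4.0, while the close-k style aggregate only looks at the two points with margins 0.2 and -0.1.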

Cogni-Net: Cognitive Feature Learning through Deep Visual Perception


Title	Cogni-Net: Cognitive Feature Learning through Deep Visual Perception
Authors	Pranay Mukherjee, Abhirup Das, Ayan Kumar Bhunia, Partha Pratim Roy
Abstract	Can we ask computers to recognize what we see from brain signals alone? Our paper seeks to utilize the knowledge learnt in the visual domain by popular pre-trained vision models and use it to teach a recurrent model being trained on brain signals to learn a discriminative manifold of the human brain’s cognition of different visual object categories in response to perceived visual cues. For this we make use of brain EEG signals triggered from visual stimuli like images and leverage the natural synchronization between images and their corresponding brain signals to learn a novel representation of the cognitive feature space. The concept of knowledge distillation has been used here for training the deep cognition model, CogniNet (the source code of the proposed system is publicly available at https://www.github.com/53X/CogniNET), by employing a student-teacher learning technique in order to bridge the process of inter-modal knowledge transfer. The proposed novel architecture obtains state-of-the-art results, significantly surpassing other existing models. The experiments performed by us also suggest that if visual stimuli information like brain EEG signals can be gathered on a large scale, then that would help to obtain a better understanding of the largely unexplored domain of human brain cognition.
Tasks	EEG, Transfer Learning
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00201v2
PDF	http://arxiv.org/pdf/1811.00201v2.pdf
PWC	https://paperswithcode.com/paper/cogni-net-cognitive-feature-learning-through
Repo	https://github.com/53X/CogniNET
Framework	none

AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias


Title	AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias
Authors	Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, Yunfeng Zhang
Abstract	Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this toolkit are to help facilitate the transition of fairness research algorithms to use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms. The package includes a comprehensive set of fairness metrics for datasets and models, explanations for these metrics, and algorithms to mitigate bias in datasets and models. It also includes an interactive Web experience (https://aif360.mybluemix.net) that provides a gentle introduction to the concepts and capabilities for line-of-business users, as well as extensive documentation, usage guidance, and industry-specific tutorials to enable data scientists and practitioners to incorporate the most appropriate tool for their problem into their work products. The architecture of the package has been engineered to conform to a standard paradigm used in data science, thereby further improving usability for practitioners. Such architectural design and abstractions enable researchers and developers to extend the toolkit with their new algorithms and improvements, and to use it for performance benchmarking. A built-in testing infrastructure maintains code quality.
Tasks	Decision Making
Published	2018-10-03
URL	http://arxiv.org/abs/1810.01943v1
PDF	http://arxiv.org/pdf/1810.01943v1.pdf
PWC	https://paperswithcode.com/paper/ai-fairness-360-an-extensible-toolkit-for
Repo	https://github.com/aif360-learn/aif360-learn
Framework	tf
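
To make "fairness metrics for datasets and models" concrete, here is a minimal sketch of two widely used group fairness metrics, statistical parity difference and disparate impact, in plain Python. It deliberately does not call the AIF360 API; the function names, group labels, and toy data are assumptions made for illustration.

```python
def favorable_rate(labels, groups, group):
    """Fraction of favorable outcomes (label 1) within one group."""
    selected = [y for y, g in zip(labels, groups) if g == group]
    return sum(selected) / len(selected)


def statistical_parity_difference(labels, groups, unprivileged, privileged):
    """P(y = 1 | unprivileged) - P(y = 1 | privileged); 0 means parity."""
    return (favorable_rate(labels, groups, unprivileged)
            - favorable_rate(labels, groups, privileged))


def disparate_impact(labels, groups, unprivileged, privileged):
    """P(y = 1 | unprivileged) / P(y = 1 | privileged); 1 means parity."""
    return (favorable_rate(labels, groups, unprivileged)
            / favorable_rate(labels, groups, privileged))


# Toy predictions: group "a" receives the favorable label less often than "b".
labels = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(statistical_parity_difference(labels, groups, "a", "b"))
print(disparate_impact(labels, groups, "a", "b"))
```

Here group "a" has a favorable rate of 0.25 and group "b" a rate of 0.75, so the difference is negative and the ratio is below 1, both indicating a disadvantage for group "a".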

Are adversarial examples inevitable?


Title	Are adversarial examples inevitable?
Authors	Ali Shafahi, W. Ronny Huang, Christoph Studer, Soheil Feizi, Tom Goldstein
Abstract	A wide range of defenses have been proposed to harden neural networks against adversarial attacks. However, a pattern has emerged in which the majority of adversarial defenses are quickly broken by new attacks. Given the lack of success at generating robust defenses, we are led to ask a fundamental question: Are adversarial attacks inevitable? This paper analyzes adversarial examples from a theoretical perspective, and identifies fundamental bounds on the susceptibility of a classifier to adversarial attacks. We show that, for certain classes of problems, adversarial examples are inescapable. Using experiments, we explore the implications of theoretical guarantees for real-world problems and discuss how factors such as dimensionality and image complexity limit a classifier’s robustness against adversarial examples.
Tasks
Published	2018-09-06
URL	https://arxiv.org/abs/1809.02104v3
PDF	https://arxiv.org/pdf/1809.02104v3.pdf
PWC	https://paperswithcode.com/paper/are-adversarial-examples-inevitable
Repo	https://github.com/kc-ml2/ipam-2019-dgl
Framework	pytorch

Rethinking Feature Distribution for Loss Functions in Image Classification


Title	Rethinking Feature Distribution for Loss Functions in Image Classification
Authors	Weitao Wan, Yuanyi Zhong, Tianpeng Li, Jiansheng Chen
Abstract	We propose a large-margin Gaussian Mixture (L-GM) loss for deep neural networks in classification tasks. Different from the softmax cross-entropy loss, our proposal is established on the assumption that the deep features of the training set follow a Gaussian Mixture distribution. By involving a classification margin and a likelihood regularization, the L-GM loss facilitates both a high classification performance and an accurate modeling of the training feature distribution. As such, the L-GM loss is superior to the softmax loss and its major variants in the sense that besides classification, it can be readily used to distinguish abnormal inputs, such as the adversarial examples, based on their features’ likelihood to the training feature distribution. Extensive experiments on various recognition benchmarks like MNIST, CIFAR, ImageNet and LFW, as well as on adversarial examples demonstrate the effectiveness of our proposal.
Tasks	Image Classification
Published	2018-03-08
URL	http://arxiv.org/abs/1803.02988v1
PDF	http://arxiv.org/pdf/1803.02988v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-feature-distribution-for-loss
Repo	https://github.com/yuyijie1995/gluon_GMLoss
Framework	mxnet

Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation


Title	Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation
Authors	Alessandro Di Martino, Erik Bodin, Carl Henrik Ek, Neill D. F. Campbell
Abstract	The shape of an object is an important characteristic for many vision problems such as segmentation, detection and tracking. Being independent of appearance, it is possible to generalize to a large range of objects from only small amounts of data. However, shapes represented as silhouette images are challenging to model due to complicated likelihood functions leading to intractable posteriors. In this paper we present a generative model of shapes which provides a low dimensional latent encoding which importantly resides on a smooth manifold with respect to the silhouette images. The proposed model propagates uncertainty in a principled manner allowing it to learn from small amounts of data and providing predictions with associated uncertainty. We provide experiments that show how our proposed model provides favorable quantitative results compared with the state-of-the-art while simultaneously providing a representation that resides on a low-dimensional interpretable manifold.
Tasks
Published	2018-12-13
URL	http://arxiv.org/abs/1812.05477v1
PDF	http://arxiv.org/pdf/1812.05477v1.pdf
PWC	https://paperswithcode.com/paper/gaussian-process-deep-belief-networks-a
Repo	https://github.com/zeis/deepbelief
Framework	tf

A Probabilistic U-Net for Segmentation of Ambiguous Images


Title	A Probabilistic U-Net for Segmentation of Ambiguous Images
Authors	Simon A. A. Kohl, Bernardino Romera-Paredes, Clemens Meyer, Jeffrey De Fauw, Joseph R. Ledsam, Klaus H. Maier-Hein, S. M. Ali Eslami, Danilo Jimenez Rezende, Olaf Ronneberger
Abstract	Many real-world vision problems suffer from inherent ambiguities. In clinical applications for example, it might not be clear from a CT scan alone which particular region is cancer tissue. Therefore a group of graders typically produces a set of diverse but plausible segmentations. We consider the task of learning a distribution over segmentations given an input. To this end we propose a generative segmentation model based on a combination of a U-Net with a conditional variational autoencoder that is capable of efficiently producing an unlimited number of plausible hypotheses. We show on a lung abnormalities segmentation task and on a Cityscapes segmentation task that our model reproduces the possible segmentation variants as well as the frequencies with which they occur, doing so significantly better than published approaches. These models could have a high impact in real-world applications, such as being used as clinical decision-making algorithms accounting for multiple plausible semantic segmentation hypotheses to provide possible diagnoses and recommend further actions to resolve the present ambiguities.
Tasks	Decision Making, Semantic Segmentation
Published	2018-06-13
URL	http://arxiv.org/abs/1806.05034v4
PDF	http://arxiv.org/pdf/1806.05034v4.pdf
PWC	https://paperswithcode.com/paper/a-probabilistic-u-net-for-segmentation-of
Repo	https://github.com/stefanknegt/probabilistic_unet_pytorch
Framework	pytorch

Empirical fixed point bifurcation analysis


Title	Empirical fixed point bifurcation analysis
Authors	Gergo Bohner, Maneesh Sahani
Abstract	In a common experimental setting, the behaviour of a noisy dynamical system is monitored in response to manipulations of one or more control parameters. Here, we introduce a structured model to describe parametric changes in qualitative system behaviour via stochastic bifurcation analysis. In particular, we describe an extension of Gaussian Process models of transition maps, in which the learned map is directly parametrized by its fixed points and associated local linearisations. We show that the system recovers the behaviour of a well-studied one dimensional system from little data, then learn the behaviour of a more realistic two dimensional process of mutually inhibiting neural populations.
Tasks
Published	2018-07-04
URL	http://arxiv.org/abs/1807.01486v1
PDF	http://arxiv.org/pdf/1807.01486v1.pdf
PWC	https://paperswithcode.com/paper/empirical-fixed-point-bifurcation-analysis
Repo	https://github.com/gbohner/thesis-code-chapter4-fpgp
Framework	none

A Short Note about Kinetics-600


Title	A Short Note about Kinetics-600
Authors	Joao Carreira, Eric Noland, Andras Banki-Horvath, Chloe Hillier, Andrew Zisserman
Abstract	We describe an extension of the DeepMind Kinetics human action dataset from 400 classes, each with at least 400 video clips, to 600 classes, each with at least 600 video clips. In order to scale up the dataset we changed the data collection process so it uses multiple queries per class, with some of them in a language other than English (Portuguese). This paper details the changes between the two versions of the dataset and includes a comprehensive set of statistics of the new version as well as baseline results using the I3D neural network architecture. The paper is a companion to the release of the ground truth labels for the public test set.
Tasks
Published	2018-08-03
URL	http://arxiv.org/abs/1808.01340v1
PDF	http://arxiv.org/pdf/1808.01340v1.pdf
PWC	https://paperswithcode.com/paper/a-short-note-about-kinetics-600
Repo	https://github.com/rocksyne/kinetics-dataset-downloader
Framework	none

SimplE Embedding for Link Prediction in Knowledge Graphs


Title	SimplE Embedding for Link Prediction in Knowledge Graphs
Authors	Seyed Mehran Kazemi, David Poole
Abstract	Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE’s code is available on GitHub at https://github.com/Mehran-k/SimplE.
Tasks	Knowledge Graphs, Link Prediction
Published	2018-02-13
URL	http://arxiv.org/abs/1802.04868v2
PDF	http://arxiv.org/pdf/1802.04868v2.pdf
PWC	https://paperswithcode.com/paper/simple-embedding-for-link-prediction-in
Repo	https://github.com/Mehran-k/SimplE
Framework	tf
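
The difference between CP and SimplE scoring described in this abstract can be sketched in a few lines. Each entity has a head vector and a tail vector, and each relation has a forward and an inverse vector; SimplE averages two CP-style triple products so that both vectors of an entity take part in every score. The helper names and the toy embeddings below are made up for illustration and are not taken from the authors' repository.

```python
def triple_product(a, b, c):
    """Sum of element-wise products of three equal-length vectors."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


def cp_score(head_vecs, tail_vecs, rel_vecs, h, r, t):
    """CP: uses only the head vector of h and the tail vector of t."""
    return triple_product(head_vecs[h], rel_vecs[r], tail_vecs[t])


def simple_score(head_vecs, tail_vecs, rel_vecs, rel_inv_vecs, h, r, t):
    """SimplE: average of the forward score and the inverse-relation score."""
    forward = triple_product(head_vecs[h], rel_vecs[r], tail_vecs[t])
    backward = triple_product(head_vecs[t], rel_inv_vecs[r], tail_vecs[h])
    return 0.5 * (forward + backward)


# Hypothetical 2-dimensional embeddings.
head_vecs = {"alice": [1.0, 2.0], "bob": [0.5, -1.0]}
tail_vecs = {"alice": [2.0, 0.0], "bob": [1.0, 3.0]}
rel_vecs = {"likes": [1.0, 0.5]}
rel_inv_vecs = {"likes": [2.0, 1.0]}

print(cp_score(head_vecs, tail_vecs, rel_vecs, "alice", "likes", "bob"))
print(simple_score(head_vecs, tail_vecs, rel_vecs, rel_inv_vecs,
                   "alice", "likes", "bob"))
```

Both scores are sums of element-wise products, so the cost of scoring a triple grows linearly with the embedding size, consistent with the complexity claim in the abstract.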

A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures


Title	A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures
Authors	Jan Vanek, Josef Michalek, Jan Zelinka, Josef Psutka
Abstract	Recently, recurrent neural networks have become state-of-the-art in acoustic modeling for automatic speech recognition. The long short-term memory (LSTM) units are the most popular ones. However, alternative units like the gated recurrent unit (GRU) and its modifications outperformed LSTM in some publications. In this paper, we compared five neural network (NN) architectures with various adaptation and feature normalization techniques. We have evaluated feature-space maximum likelihood linear regression, five variants of i-vector adaptation and two variants of cepstral mean normalization. Most adaptation and normalization techniques were developed for feed-forward NNs and, according to the results in this paper, not all of them also worked with RNNs. For experiments, we have chosen the well-known and available TIMIT phone recognition task. Phone recognition is much more sensitive to the quality of the acoustic model (AM) than a large-vocabulary task with a complex language model. Also, we published the open-source scripts to easily replicate the results and to help continue the development.
Tasks	Language Modelling, Speech Recognition
Published	2018-07-12
URL	http://arxiv.org/abs/1807.06441v1
PDF	http://arxiv.org/pdf/1807.06441v1.pdf
PWC	https://paperswithcode.com/paper/a-comparison-of-adaptation-techniques-and
Repo	https://github.com/OrcusCZ/NNAcousticModeling
Framework	none

Prosody Modifications for Question-Answering in Voice-Only Settings


Title	Prosody Modifications for Question-Answering in Voice-Only Settings
Authors	Aleksandr Chuklin, Aliaksei Severyn, Johanne Trippas, Enrique Alfonseca, Hanna Silen, Damiano Spina
Abstract	Many popular form factors of digital assistants—such as Amazon Echo, Apple Homepod, or Google Home—enable the user to hold a conversation with these systems based only on the speech modality. The lack of a screen presents unique challenges. To satisfy the information need of a user, the presentation of the answer needs to be optimized for such voice-only interactions. In this paper, we propose a task of evaluating the usefulness of audio transformations (i.e., prosodic modifications) for voice-only question answering. We introduce a crowdsourcing setup where we evaluate the quality of our proposed modifications along multiple dimensions corresponding to the informativeness, naturalness, and ability of the user to identify key parts of the answer. We offer a set of prosodic modifications that highlight potentially important parts of the answer using various acoustic cues. Our experiments show that some of these prosodic modifications lead to better comprehension at the expense of only slightly degraded naturalness of the audio.
Tasks	Question Answering
Published	2018-06-11
URL	https://arxiv.org/abs/1806.03957v4
PDF	https://arxiv.org/pdf/1806.03957v4.pdf
PWC	https://paperswithcode.com/paper/prosody-modifications-for-question-answering
Repo	https://github.com/varepsilon/clef2019-prosody
Framework	none

Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization


Title	Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization
Authors	Jiong Zhang, Qi Lei, Inderjit S. Dhillon
Abstract	Vanishing and exploding gradients are two of the main obstacles in training deep neural networks, especially in capturing long range dependencies in recurrent neural networks (RNNs). In this paper, we present an efficient parametrization of the transition matrix of an RNN that allows us to stabilize the gradients that arise in its training. Specifically, we parameterize the transition matrix by its singular value decomposition (SVD), which allows us to explicitly track and control its singular values. We attain efficiency by using tools that are common in numerical linear algebra, namely Householder reflectors for representing the orthogonal matrices that arise in the SVD. By explicitly controlling the singular values, our proposed Spectral-RNN method allows us to easily solve the exploding gradient problem and we observe that it empirically solves the vanishing gradient issue to a large extent. We note that the SVD parameterization can be used for any rectangular weight matrix, hence it can be easily extended to any deep neural network, such as a multi-layer perceptron. Theoretically, we demonstrate that our parameterization does not lose any expressive power, and show how it controls generalization of RNN for the classification task. Our extensive experimental results also demonstrate that the proposed framework converges faster, and has good generalization, especially in capturing long range dependencies, as shown on the synthetic addition and copy tasks, as well as on MNIST and Penn Tree Bank data sets.
Tasks
Published	2018-03-25
URL	http://arxiv.org/abs/1803.09327v1
PDF	http://arxiv.org/pdf/1803.09327v1.pdf
PWC	https://paperswithcode.com/paper/stabilizing-gradients-for-deep-neural
Repo	https://github.com/zhangjiong724/spectral-RNN
Framework	none
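
The idea of parameterizing a transition matrix by its SVD, with the orthogonal factors built from Householder reflectors, can be sketched in pure Python. This is a simplified illustration, not the Spectral-RNN implementation: the helper names are hypothetical, the reflector vectors are arbitrary full-length vectors, and clipping the singular values to an interval around 1 is just one assumed way of "controlling" them.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def householder(u):
    """Reflector H = I - 2 u u^T / (u^T u); H is orthogonal and symmetric."""
    norm_sq = sum(x * x for x in u)
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / norm_sq
             for j in range(n)] for i in range(n)]


def orthogonal_from_reflectors(vectors, n):
    """A product of Householder reflectors is an orthogonal matrix."""
    q = identity(n)
    for u in vectors:
        q = matmul(q, householder(u))
    return q


def clip_singular_values(sigma, margin):
    """Keep every singular value inside [1 - margin, 1 + margin] (assumed rule)."""
    return [min(1.0 + margin, max(1.0 - margin, s)) for s in sigma]


def build_transition_matrix(u_vectors, sigma, v_vectors):
    """W = U diag(sigma) V^T with U and V given as reflector products."""
    n = len(sigma)
    U = orthogonal_from_reflectors(u_vectors, n)
    V = orthogonal_from_reflectors(v_vectors, n)
    S = [[sigma[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(matmul(U, S), transpose(V))


sigma = clip_singular_values([1.4, 1.0, 0.3], margin=0.05)
W = build_transition_matrix([[1.0, 2.0, 2.0], [0.0, 1.0, -1.0]],
                            sigma,
                            [[3.0, 0.0, 4.0]])
```

Because U and V are orthogonal, the singular values of `W` are exactly the entries of `sigma`, so repeated multiplication by `W` can stretch or shrink a vector by at most a factor in [0.95, 1.05] per step in this example.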

Local Temporal Bilinear Pooling for Fine-grained Action Parsing


Title	Local Temporal Bilinear Pooling for Fine-grained Action Parsing
Authors	Yan Zhang, Siyu Tang, Krikamol Muandet, Christian Jarvers, Heiko Neumann
Abstract	Fine-grained temporal action parsing is important in many applications, such as daily activity understanding, human motion analysis, surgical robotics and others requiring subtle and precise operations in a long-term period. In this paper we propose a novel bilinear pooling operation, which is used in intermediate layers of a temporal convolutional encoder-decoder net. In contrast to other work, our proposed bilinear pooling is learnable and hence can capture more complex local statistics than the conventional counterpart. In addition, we introduce exact lower-dimension representations of our bilinear forms, so that the dimensionality is reduced with neither information loss nor extra computation. We perform intensive experiments to quantitatively analyze our model and show the superior performances to other state-of-the-art work on various datasets.
Tasks	Action Parsing
Published	2018-12-05
URL	https://arxiv.org/abs/1812.01922v3
PDF	https://arxiv.org/pdf/1812.01922v3.pdf
PWC	https://paperswithcode.com/paper/local-temporal-bilinear-pooling-for-fine
Repo	https://github.com/yz-cnsdqz/TemporalActionParsing-FineGrained
Framework	tf