Paper Group AWR 274
Perceptual Compressive Sensing
Title | Perceptual Compressive Sensing |
Authors | Jiang Du, Xuemei Xie, Chenye Wang, Guangming Shi |
Abstract | Compressive sensing (CS) acquires measurements at a sub-Nyquist rate and recovers the scene images from them. Existing CS methods always recover the scene images at the pixel level, which makes the recovered images overly smooth and short on structure information, especially at a low measurement rate. To overcome this drawback, we propose perceptual CS to obtain high-level structured recovery. Our task no longer focuses on the pixel level; instead, we aim for a better visual effect. In detail, we employ a perceptual loss, defined at the feature level, to enhance the structure information of the recovered images. Experiments show that our method achieves better visual results with stronger structure information than existing CS methods at the same measurement rate. |
Tasks | Compressive Sensing |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00176v2 |
PDF | http://arxiv.org/pdf/1802.00176v2.pdf |
PWC | https://paperswithcode.com/paper/perceptual-compressive-sensing |
Repo | https://github.com/jiang-du/Perceptual-CS |
Framework | none |
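The contrast between a pixel-level loss and a feature-level (perceptual) loss in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the `edge_features` extractor is a hypothetical toy stand-in (simple image differences) for whatever feature network a real perceptual loss would use, and all function names are assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[float]]  # row-major grayscale image


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def flatten(img: Image) -> List[float]:
    """Concatenate image rows into a single vector."""
    return [v for row in img for v in row]


def edge_features(img: Image) -> List[float]:
    """Toy feature extractor (assumption): horizontal then vertical differences."""
    feats = []
    for r in range(len(img)):
        for c in range(len(img[0]) - 1):
            feats.append(img[r][c + 1] - img[r][c])
    for r in range(len(img) - 1):
        for c in range(len(img[0])):
            feats.append(img[r + 1][c] - img[r][c])
    return feats


def pixel_loss(recovered: Image, target: Image) -> float:
    """Loss measured directly on pixel values."""
    return mse(flatten(recovered), flatten(target))


def perceptual_loss(recovered: Image, target: Image,
                    features: Callable[[Image], List[float]] = edge_features) -> float:
    """Loss measured on extracted features instead of raw pixels."""
    return mse(features(recovered), features(target))


# A sharp vertical edge, a blurred copy, and a brightness-shifted copy.
target = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
blurred = [[0.0, 0.25, 0.75, 1.0] for _ in range(4)]
shifted = [[v + 0.25 for v in row] for row in target]

print(pixel_loss(blurred, target), pixel_loss(shifted, target))
print(perceptual_loss(blurred, target), perceptual_loss(shifted, target))
```

In this toy setup the blurred image has the lower pixel loss, while the feature-level loss penalizes it for losing the edge and ignores the uniform brightness shift, which is the kind of structural sensitivity the abstract attributes to a feature-level loss.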
Minimizing Close-k Aggregate Loss Improves Classification
Title | Minimizing Close-k Aggregate Loss Improves Classification |
Authors | Bryan He, James Zou |
Abstract | In classification, the de facto method for aggregating individual losses is the average loss. When the actual metric of interest is 0-1 loss, it is common to minimize the average surrogate loss for some well-behaved (e.g. convex) surrogate. Recently, several other aggregate losses such as the maximal loss and average top-$k$ loss were proposed as alternative objectives to address shortcomings of the average loss. However, we identify common classification settings, e.g. the data is imbalanced, has too many easy or ambiguous examples, etc., when average, maximal and average top-$k$ all suffer from suboptimal decision boundaries, even on an infinitely large training set. To address this problem, we propose a new classification objective called the close-$k$ aggregate loss, where we adaptively minimize the loss for points close to the decision boundary. We provide theoretical guarantees for the 0-1 accuracy when we optimize close-$k$ aggregate loss. We also conduct systematic experiments across the PMLB and OpenML benchmark datasets. Close-$k$ achieves significant gains in 0-1 test accuracy, improvements of $\geq 2$% and $p<0.05$, in over 25% of the datasets compared to average, maximal and average top-$k$. In contrast, the previous aggregate losses outperformed close-$k$ in less than 2% of the datasets. |
Tasks | |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00521v2 |
PDF | http://arxiv.org/pdf/1811.00521v2.pdf |
PWC | https://paperswithcode.com/paper/minimizing-close-k-aggregate-loss-improves |
Repo | https://github.com/bryan-he/closek |
Framework | pytorch |
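The aggregate losses named in the abstract above can be compared side by side in a short sketch. The average, maximal, and average top-k aggregates follow their standard definitions. The `close_k_loss` function is only an assumption about what "points close to the decision boundary" means (here, the k examples with the smallest absolute margin); the exact definition should be taken from the paper. The logistic surrogate and all names are illustrative choices.

```python
import math
from typing import List


def logistic_loss(margin: float) -> float:
    """Convex surrogate for 0-1 loss; margin = y * f(x) with y in {-1, +1}."""
    return math.log1p(math.exp(-margin))


def average_loss(margins: List[float]) -> float:
    """Mean surrogate loss over all examples."""
    return sum(logistic_loss(m) for m in margins) / len(margins)


def maximal_loss(margins: List[float]) -> float:
    """Surrogate loss of the single worst example."""
    return max(logistic_loss(m) for m in margins)


def average_top_k_loss(margins: List[float], k: int) -> float:
    """Mean of the k largest surrogate losses."""
    losses = sorted((logistic_loss(m) for m in margins), reverse=True)
    return sum(losses[:k]) / k


def close_k_loss(margins: List[float], k: int) -> float:
    """ASSUMPTION: mean surrogate loss over the k examples with smallest |margin|."""
    closest = sorted(margins, key=abs)[:k]
    return sum(logistic_loss(m) for m in closest) / k


# Two easy examples, two near the boundary, and one badly misclassified outlier.
margins = [5.0, 4.0, 0.3, -0.2, -6.0]
print(average_loss(margins))
print(maximal_loss(margins))
print(average_top_k_loss(margins, 2))  # dominated by the outlier at -6.0
print(close_k_loss(margins, 2))        # uses only the points at 0.3 and -0.2
```

Under this reading, the outlier dominates the maximal and top-k aggregates, while the close-k aggregate ignores both the outlier and the easy examples.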
Cogni-Net: Cognitive Feature Learning through Deep Visual Perception
Title | Cogni-Net: Cognitive Feature Learning through Deep Visual Perception |
Authors | Pranay Mukherjee, Abhirup Das, Ayan Kumar Bhunia, Partha Pratim Roy |
Abstract | Can we ask computers to recognize what we see from brain signals alone? Our paper seeks to utilize the knowledge learnt in the visual domain by popular pre-trained vision models and use it to teach a recurrent model being trained on brain signals to learn a discriminative manifold of the human brain’s cognition of different visual object categories in response to perceived visual cues. For this we make use of brain EEG signals triggered from visual stimuli like images and leverage the natural synchronization between images and their corresponding brain signals to learn a novel representation of the cognitive feature space. The concept of knowledge distillation has been used here for training the deep cognition model, CogniNet (the source code of the proposed system is publicly available at https://www.github.com/53X/CogniNET), by employing a student-teacher learning technique in order to bridge the process of inter-modal knowledge transfer. The proposed novel architecture obtains state-of-the-art results, significantly surpassing other existing models. The experiments performed by us also suggest that if visual stimuli information like brain EEG signals can be gathered on a large scale, then that would help to obtain a better understanding of the largely unexplored domain of human brain cognition. |
Tasks | EEG, Transfer Learning |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00201v2 |
PDF | http://arxiv.org/pdf/1811.00201v2.pdf |
PWC | https://paperswithcode.com/paper/cogni-net-cognitive-feature-learning-through |
Repo | https://github.com/53X/CogniNET |
Framework | none |
AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias
Title | AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias |
Authors | Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, Yunfeng Zhang |
Abstract | Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this toolkit are to help facilitate the transition of fairness research algorithms to use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms. The package includes a comprehensive set of fairness metrics for datasets and models, explanations for these metrics, and algorithms to mitigate bias in datasets and models. It also includes an interactive Web experience (https://aif360.mybluemix.net) that provides a gentle introduction to the concepts and capabilities for line-of-business users, as well as extensive documentation, usage guidance, and industry-specific tutorials to enable data scientists and practitioners to incorporate the most appropriate tool for their problem into their work products. The architecture of the package has been engineered to conform to a standard paradigm used in data science, thereby further improving usability for practitioners. Such architectural design and abstractions enable researchers and developers to extend the toolkit with their new algorithms and improvements, and to use it for performance benchmarking. A built-in testing infrastructure maintains code quality. |
Tasks | Decision Making |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01943v1 |
PDF | http://arxiv.org/pdf/1810.01943v1.pdf |
PWC | https://paperswithcode.com/paper/ai-fairness-360-an-extensible-toolkit-for |
Repo | https://github.com/aif360-learn/aif360-learn |
Framework | tf |
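To make "fairness metrics for datasets" from the abstract above concrete, the sketch below computes two widely used group-fairness metrics, statistical parity difference and disparate impact, in plain Python. It deliberately does not call the AIF360 API; the function names, argument order, and toy data are assumptions made for illustration only.

```python
from typing import List, Sequence


def favorable_rate(labels: Sequence[int], groups: Sequence[str], group: str) -> float:
    """Fraction of favorable outcomes (label 1) within one group."""
    selected = [y for y, g in zip(labels, groups) if g == group]
    return sum(selected) / len(selected)


def statistical_parity_difference(labels: Sequence[int], groups: Sequence[str],
                                  unprivileged: str, privileged: str) -> float:
    """P(favorable | unprivileged) - P(favorable | privileged); 0 means parity."""
    return (favorable_rate(labels, groups, unprivileged)
            - favorable_rate(labels, groups, privileged))


def disparate_impact(labels: Sequence[int], groups: Sequence[str],
                     unprivileged: str, privileged: str) -> float:
    """P(favorable | unprivileged) / P(favorable | privileged); 1 means parity."""
    return (favorable_rate(labels, groups, unprivileged)
            / favorable_rate(labels, groups, privileged))


# Toy data: group "a" receives the favorable outcome 3 times out of 4, group "b" once.
labels: List[int] = [1, 1, 1, 0, 1, 0, 0, 0]
groups: List[str] = ["a"] * 4 + ["b"] * 4

print(statistical_parity_difference(labels, groups, unprivileged="b", privileged="a"))
print(disparate_impact(labels, groups, unprivileged="b", privileged="a"))
```

A negative difference or a ratio well below 1 indicates that the unprivileged group receives the favorable outcome less often, which is the kind of signal a bias-mitigation algorithm would then try to reduce.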
Are adversarial examples inevitable?
Title | Are adversarial examples inevitable? |
Authors | Ali Shafahi, W. Ronny Huang, Christoph Studer, Soheil Feizi, Tom Goldstein |
Abstract | A wide range of defenses have been proposed to harden neural networks against adversarial attacks. However, a pattern has emerged in which the majority of adversarial defenses are quickly broken by new attacks. Given the lack of success at generating robust defenses, we are led to ask a fundamental question: Are adversarial attacks inevitable? This paper analyzes adversarial examples from a theoretical perspective, and identifies fundamental bounds on the susceptibility of a classifier to adversarial attacks. We show that, for certain classes of problems, adversarial examples are inescapable. Using experiments, we explore the implications of theoretical guarantees for real-world problems and discuss how factors such as dimensionality and image complexity limit a classifier’s robustness against adversarial examples. |
Tasks | |
Published | 2018-09-06 |
URL | https://arxiv.org/abs/1809.02104v3 |
PDF | https://arxiv.org/pdf/1809.02104v3.pdf |
PWC | https://paperswithcode.com/paper/are-adversarial-examples-inevitable |
Repo | https://github.com/kc-ml2/ipam-2019-dgl |
Framework | pytorch |
Rethinking Feature Distribution for Loss Functions in Image Classification
Title | Rethinking Feature Distribution for Loss Functions in Image Classification |
Authors | Weitao Wan, Yuanyi Zhong, Tianpeng Li, Jiansheng Chen |
Abstract | We propose a large-margin Gaussian Mixture (L-GM) loss for deep neural networks in classification tasks. Different from the softmax cross-entropy loss, our proposal is established on the assumption that the deep features of the training set follow a Gaussian Mixture distribution. By involving a classification margin and a likelihood regularization, the L-GM loss facilitates both a high classification performance and an accurate modeling of the training feature distribution. As such, the L-GM loss is superior to the softmax loss and its major variants in the sense that besides classification, it can be readily used to distinguish abnormal inputs, such as the adversarial examples, based on their features’ likelihood to the training feature distribution. Extensive experiments on various recognition benchmarks like MNIST, CIFAR, ImageNet and LFW, as well as on adversarial examples demonstrate the effectiveness of our proposal. |
Tasks | Image Classification |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.02988v1 |
PDF | http://arxiv.org/pdf/1803.02988v1.pdf |
PWC | https://paperswithcode.com/paper/rethinking-feature-distribution-for-loss |
Repo | https://github.com/yuyijie1995/gluon_GMLoss |
Framework | mxnet |
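The two ingredients the abstract above names, a classification margin and a likelihood regularization under a Gaussian mixture assumption, can be sketched as follows. This is a simplified illustration and not the authors' code: it assumes identity covariances and equal class priors, and the margin `alpha`, the weight `lam`, and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def half_sq_dist(x: Vector, mu: Vector) -> float:
    """Half squared Euclidean distance; equals the Gaussian exponent for identity covariance."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, mu))


def lgm_loss(x: Vector, label: int, means: List[Vector],
             alpha: float = 0.5, lam: float = 0.1) -> float:
    """Margin-based mixture classification loss plus a likelihood regularizer.

    ASSUMPTIONS: identity covariance and equal priors for every class.
    """
    d = [half_sq_dist(x, mu) for mu in means]
    # The true class distance is inflated by (1 + alpha), which enforces a margin.
    logits = [-(1.0 + alpha) * dk if k == label else -dk for k, dk in enumerate(d)]
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))  # stable log-sum-exp
    classification = log_norm - logits[label]
    likelihood_reg = d[label]  # pulls the feature toward its class mean
    return classification + lam * likelihood_reg


means = [[0.0, 0.0], [4.0, 0.0]]
x = [1.0, 0.0]
print(lgm_loss(x, 0, means, alpha=0.0))
print(lgm_loss(x, 0, means, alpha=1.0))
```

Because the regularizer is the distance of a feature to its own class mean, the same quantity could also serve as a rough abnormality score for inputs that lie far from every mean, in the spirit of the abstract's remark about adversarial examples.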
Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation
Title | Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation |
Authors | Alessandro Di Martino, Erik Bodin, Carl Henrik Ek, Neill D. F. Campbell |
Abstract | The shape of an object is an important characteristic for many vision problems such as segmentation, detection and tracking. Being independent of appearance, it is possible to generalize to a large range of objects from only small amounts of data. However, shapes represented as silhouette images are challenging to model due to complicated likelihood functions leading to intractable posteriors. In this paper we present a generative model of shapes which provides a low dimensional latent encoding which importantly resides on a smooth manifold with respect to the silhouette images. The proposed model propagates uncertainty in a principled manner allowing it to learn from small amounts of data and providing predictions with associated uncertainty. We provide experiments that show how our proposed model provides favorable quantitative results compared with the state-of-the-art while simultaneously providing a representation that resides on a low-dimensional interpretable manifold. |
Tasks | |
Published | 2018-12-13 |
URL | http://arxiv.org/abs/1812.05477v1 |
PDF | http://arxiv.org/pdf/1812.05477v1.pdf |
PWC | https://paperswithcode.com/paper/gaussian-process-deep-belief-networks-a |
Repo | https://github.com/zeis/deepbelief |
Framework | tf |
A Probabilistic U-Net for Segmentation of Ambiguous Images
Title | A Probabilistic U-Net for Segmentation of Ambiguous Images |
Authors | Simon A. A. Kohl, Bernardino Romera-Paredes, Clemens Meyer, Jeffrey De Fauw, Joseph R. Ledsam, Klaus H. Maier-Hein, S. M. Ali Eslami, Danilo Jimenez Rezende, Olaf Ronneberger |
Abstract | Many real-world vision problems suffer from inherent ambiguities. In clinical applications for example, it might not be clear from a CT scan alone which particular region is cancer tissue. Therefore a group of graders typically produces a set of diverse but plausible segmentations. We consider the task of learning a distribution over segmentations given an input. To this end we propose a generative segmentation model based on a combination of a U-Net with a conditional variational autoencoder that is capable of efficiently producing an unlimited number of plausible hypotheses. We show on a lung abnormalities segmentation task and on a Cityscapes segmentation task that our model reproduces the possible segmentation variants as well as the frequencies with which they occur, doing so significantly better than published approaches. These models could have a high impact in real-world applications, such as being used as clinical decision-making algorithms accounting for multiple plausible semantic segmentation hypotheses to provide possible diagnoses and recommend further actions to resolve the present ambiguities. |
Tasks | Decision Making, Semantic Segmentation |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.05034v4 |
PDF | http://arxiv.org/pdf/1806.05034v4.pdf |
PWC | https://paperswithcode.com/paper/a-probabilistic-u-net-for-segmentation-of |
Repo | https://github.com/stefanknegt/probabilistic_unet_pytorch |
Framework | pytorch |
Empirical fixed point bifurcation analysis
Title | Empirical fixed point bifurcation analysis |
Authors | Gergo Bohner, Maneesh Sahani |
Abstract | In a common experimental setting, the behaviour of a noisy dynamical system is monitored in response to manipulations of one or more control parameters. Here, we introduce a structured model to describe parametric changes in qualitative system behaviour via stochastic bifurcation analysis. In particular, we describe an extension of Gaussian Process models of transition maps, in which the learned map is directly parametrized by its fixed points and associated local linearisations. We show that the system recovers the behaviour of a well-studied one dimensional system from little data, then learn the behaviour of a more realistic two dimensional process of mutually inhibiting neural populations. |
Tasks | |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01486v1 |
PDF | http://arxiv.org/pdf/1807.01486v1.pdf |
PWC | https://paperswithcode.com/paper/empirical-fixed-point-bifurcation-analysis |
Repo | https://github.com/gbohner/thesis-code-chapter4-fpgp |
Framework | none |
A Short Note about Kinetics-600
Title | A Short Note about Kinetics-600 |
Authors | Joao Carreira, Eric Noland, Andras Banki-Horvath, Chloe Hillier, Andrew Zisserman |
Abstract | We describe an extension of the DeepMind Kinetics human action dataset from 400 classes, each with at least 400 video clips, to 600 classes, each with at least 600 video clips. In order to scale up the dataset we changed the data collection process so it uses multiple queries per class, with some of them in a language other than English (Portuguese). This paper details the changes between the two versions of the dataset and includes a comprehensive set of statistics of the new version as well as baseline results using the I3D neural network architecture. The paper is a companion to the release of the ground truth labels for the public test set. |
Tasks | |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01340v1 |
PDF | http://arxiv.org/pdf/1808.01340v1.pdf |
PWC | https://paperswithcode.com/paper/a-short-note-about-kinetics-600 |
Repo | https://github.com/rocksyne/kinetics-dataset-downloader |
Framework | none |
SimplE Embedding for Link Prediction in Knowledge Graphs
Title | SimplE Embedding for Link Prediction in Knowledge Graphs |
Authors | Seyed Mehran Kazemi, David Poole |
Abstract | Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE’s code is available on GitHub at https://github.com/Mehran-k/SimplE. |
Tasks | Knowledge Graphs, Link Prediction |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04868v2 |
PDF | http://arxiv.org/pdf/1802.04868v2.pdf |
PWC | https://paperswithcode.com/paper/simple-embedding-for-link-prediction-in |
Repo | https://github.com/Mehran-k/SimplE |
Framework | tf |
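The idea in the abstract above, tying the two embeddings of each entity by also scoring the inverse relation, can be sketched in a few lines. This is an illustrative sketch rather than the released code: the dictionary layout, the function names, and the toy vectors are assumptions, and real embeddings would be learned.

```python
from typing import Dict, List

Vector = List[float]


def triple_product(a: Vector, b: Vector, c: Vector) -> float:
    """Sum over dimensions of the element-wise product of three vectors."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


def cp_score(head: Dict[str, Vector], tail: Dict[str, Vector],
             rel: Dict[str, Vector], ei: str, r: str, ej: str) -> float:
    """CP-style score: head vector of e_i, relation vector, tail vector of e_j."""
    return triple_product(head[ei], rel[r], tail[ej])


def simple_score(head: Dict[str, Vector], tail: Dict[str, Vector],
                 rel: Dict[str, Vector], rel_inv: Dict[str, Vector],
                 ei: str, r: str, ej: str) -> float:
    """Average of the forward score and the score of the inverse relation.

    The inverse term uses the head vector of e_j and the tail vector of e_i,
    so both embeddings of each entity take part in scoring a single triple.
    """
    forward = triple_product(head[ei], rel[r], tail[ej])
    backward = triple_product(head[ej], rel_inv[r], tail[ei])
    return 0.5 * (forward + backward)


# Hypothetical two-dimensional embeddings for illustration.
head = {"alice": [1.0, 0.0], "bob": [0.0, 2.0]}
tail = {"alice": [0.5, 1.0], "bob": [3.0, 0.0]}
rel = {"likes": [1.0, 1.0]}
rel_inv = {"likes": [2.0, 1.0]}

print(cp_score(head, tail, rel, "alice", "likes", "bob"))               # 3.0
print(simple_score(head, tail, rel, rel_inv, "alice", "likes", "bob"))  # 2.5
```

Each score costs a constant number of passes over the embedding dimensions, consistent with the abstract's statement that complexity grows linearly with embedding size.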
A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures
Title | A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures |
Authors | Jan Vanek, Josef Michalek, Jan Zelinka, Josef Psutka |
Abstract | Recently, recurrent neural networks have become state-of-the-art in acoustic modeling for automatic speech recognition. Long short-term memory (LSTM) units are the most popular ones. However, alternative units such as the gated recurrent unit (GRU) and its modifications have outperformed LSTM in some publications. In this paper, we compare five neural network (NN) architectures with various adaptation and feature normalization techniques. We evaluate feature-space maximum likelihood linear regression, five variants of i-vector adaptation and two variants of cepstral mean normalization. Most adaptation and normalization techniques were developed for feed-forward NNs and, according to the results in this paper, not all of them also work with RNNs. For the experiments, we chose the well-known and available TIMIT phone recognition task. Phone recognition is much more sensitive to the quality of the acoustic model (AM) than a large-vocabulary task with a complex language model. We have also published open-source scripts to easily replicate the results and to help continue the development. |
Tasks | Language Modelling, Speech Recognition |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.06441v1 |
PDF | http://arxiv.org/pdf/1807.06441v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparison-of-adaptation-techniques-and |
Repo | https://github.com/OrcusCZ/NNAcousticModeling |
Framework | none |
Prosody Modifications for Question-Answering in Voice-Only Settings
Title | Prosody Modifications for Question-Answering in Voice-Only Settings |
Authors | Aleksandr Chuklin, Aliaksei Severyn, Johanne Trippas, Enrique Alfonseca, Hanna Silen, Damiano Spina |
Abstract | Many popular form factors of digital assistants—such as Amazon Echo, Apple Homepod, or Google Home—enable the user to hold a conversation with these systems based only on the speech modality. The lack of a screen presents unique challenges. To satisfy the information need of a user, the presentation of the answer needs to be optimized for such voice-only interactions. In this paper, we propose a task of evaluating the usefulness of audio transformations (i.e., prosodic modifications) for voice-only question answering. We introduce a crowdsourcing setup where we evaluate the quality of our proposed modifications along multiple dimensions corresponding to the informativeness, naturalness, and ability of the user to identify key parts of the answer. We offer a set of prosodic modifications that highlight potentially important parts of the answer using various acoustic cues. Our experiments show that some of these prosodic modifications lead to better comprehension at the expense of only slightly degraded naturalness of the audio. |
Tasks | Question Answering |
Published | 2018-06-11 |
URL | https://arxiv.org/abs/1806.03957v4 |
PDF | https://arxiv.org/pdf/1806.03957v4.pdf |
PWC | https://paperswithcode.com/paper/prosody-modifications-for-question-answering |
Repo | https://github.com/varepsilon/clef2019-prosody |
Framework | none |
Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization
Title | Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization |
Authors | Jiong Zhang, Qi Lei, Inderjit S. Dhillon |
Abstract | Vanishing and exploding gradients are two of the main obstacles in training deep neural networks, especially in capturing long range dependencies in recurrent neural networks (RNNs). In this paper, we present an efficient parametrization of the transition matrix of an RNN that allows us to stabilize the gradients that arise in its training. Specifically, we parameterize the transition matrix by its singular value decomposition (SVD), which allows us to explicitly track and control its singular values. We attain efficiency by using tools that are common in numerical linear algebra, namely Householder reflectors for representing the orthogonal matrices that arise in the SVD. By explicitly controlling the singular values, our proposed Spectral-RNN method allows us to easily solve the exploding gradient problem and we observe that it empirically solves the vanishing gradient issue to a large extent. We note that the SVD parameterization can be used for any rectangular weight matrix, hence it can be easily extended to any deep neural network, such as a multi-layer perceptron. Theoretically, we demonstrate that our parameterization does not lose any expressive power, and show how it controls generalization of RNN for the classification task. Our extensive experimental results also demonstrate that the proposed framework converges faster, and has good generalization, especially in capturing long range dependencies, as shown on the synthetic addition and copy tasks, as well as on MNIST and Penn Tree Bank data sets. |
Tasks | |
Published | 2018-03-25 |
URL | http://arxiv.org/abs/1803.09327v1 |
PDF | http://arxiv.org/pdf/1803.09327v1.pdf |
PWC | https://paperswithcode.com/paper/stabilizing-gradients-for-deep-neural |
Repo | https://github.com/zhangjiong724/spectral-RNN |
Framework | none |
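The parameterization described in the abstract above, a weight matrix written as U diag(sigma) V^T with U and V built from Householder reflectors, can be sketched without ever forming the matrices. This is a minimal illustration and not the authors' implementation: the square-matrix setting, the clipping rule in `clip_singular_values`, the tolerance `eps`, and all names are assumptions.

```python
import math
from typing import List

Vector = List[float]


def householder_apply(u: Vector, x: Vector) -> Vector:
    """Return (I - 2 u u^T / (u^T u)) x in O(n) time, without building the matrix."""
    uu = sum(a * a for a in u)
    scale = 2.0 * sum(a * b for a, b in zip(u, x)) / uu
    return [b - scale * a for a, b in zip(u, x)]


def orthogonal_apply(reflectors: List[Vector], x: Vector) -> Vector:
    """Apply Q = H_k ... H_1, where H_1 is built from reflectors[0]."""
    for u in reflectors:
        x = householder_apply(u, x)
    return x


def orthogonal_apply_transpose(reflectors: List[Vector], x: Vector) -> Vector:
    """Apply Q^T; each reflector is symmetric, so only the order is reversed."""
    for u in reversed(reflectors):
        x = householder_apply(u, x)
    return x


def clip_singular_values(raw: Vector, eps: float) -> Vector:
    """ASSUMPTION: keep singular values inside [1 - eps, 1 + eps] by clipping."""
    return [min(max(s, 1.0 - eps), 1.0 + eps) for s in raw]


def svd_param_apply(u_refl: List[Vector], sigma: Vector,
                    v_refl: List[Vector], x: Vector) -> Vector:
    """Compute W x for W = U diag(sigma) V^T."""
    y = orthogonal_apply_transpose(v_refl, x)
    y = [s * yi for s, yi in zip(sigma, y)]
    return orthogonal_apply(u_refl, y)


def norm(x: Vector) -> float:
    return math.sqrt(sum(v * v for v in x))


u_refl = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
v_refl = [[2.0, 0.0, 1.0]]
sigma = clip_singular_values([0.2, 1.0, 5.0], eps=0.1)
x = [1.0, -2.0, 0.5]
print(norm(x), norm(svd_param_apply(u_refl, sigma, v_refl, x)))
```

Since U and V preserve norms, the output norm stays within a factor of 1 ± eps of the input norm, which is the mechanism by which bounding the singular values keeps repeated multiplications from exploding or vanishing.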
Local Temporal Bilinear Pooling for Fine-grained Action Parsing
Title | Local Temporal Bilinear Pooling for Fine-grained Action Parsing |
Authors | Yan Zhang, Siyu Tang, Krikamol Muandet, Christian Jarvers, Heiko Neumann |
Abstract | Fine-grained temporal action parsing is important in many applications, such as daily activity understanding, human motion analysis, surgical robotics and others requiring subtle and precise operations in a long-term period. In this paper we propose a novel bilinear pooling operation, which is used in intermediate layers of a temporal convolutional encoder-decoder net. In contrast to other work, our proposed bilinear pooling is learnable and hence can capture more complex local statistics than the conventional counterpart. In addition, we introduce exact lower-dimension representations of our bilinear forms, so that the dimensionality is reduced with neither information loss nor extra computation. We perform intensive experiments to quantitatively analyze our model and show the superior performances to other state-of-the-art work on various datasets. |
Tasks | Action Parsing |
Published | 2018-12-05 |
URL | https://arxiv.org/abs/1812.01922v3 |
PDF | https://arxiv.org/pdf/1812.01922v3.pdf |
PWC | https://paperswithcode.com/paper/local-temporal-bilinear-pooling-for-fine |
Repo | https://github.com/yz-cnsdqz/TemporalActionParsing-FineGrained |
Framework | tf |
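For reference, the conventional (non-learnable) counterpart that the abstract above contrasts with can be sketched as an average of feature outer products over a local temporal window. The paper's learnable variant and its lower-dimensional representation are not reproduced here; the window convention (`radius` frames on each side) and the function names are assumptions.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def outer(x: Vector) -> Matrix:
    """Outer product x x^T, capturing pairwise feature interactions."""
    return [[a * b for b in x] for a in x]


def local_bilinear_pool(frames: List[Vector], t: int, radius: int) -> Matrix:
    """Average the outer products of the frame features within `radius` of time t."""
    window = frames[max(0, t - radius): t + radius + 1]
    d = len(frames[0])
    pooled = [[0.0] * d for _ in range(d)]
    for x in window:
        o = outer(x)
        for i in range(d):
            for j in range(d):
                pooled[i][j] += o[i][j] / len(window)
    return pooled


# Two frames with two-dimensional features; the window at t=0 covers both.
frames = [[1.0, 2.0], [3.0, 4.0]]
print(local_bilinear_pool(frames, t=0, radius=1))  # [[5.0, 7.0], [7.0, 10.0]]
```

The pooled output has d x d entries per time step, which is why a reduced-dimension representation such as the one the abstract mentions matters in practice.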