Paper Group ANR 733
Recovering True Classifier Performance in Positive-Unlabeled Learning
Title | Recovering True Classifier Performance in Positive-Unlabeled Learning |
Authors | Shantanu Jain, Martha White, Predrag Radivojac |
Abstract | A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased empirical estimates of the classifier performance. In this work, we show that the typically used performance measures such as the receiver operating characteristic curve, or the precision-recall curve obtained on such data can be corrected with the knowledge of class priors; i.e., the proportions of the positive and negative examples in the unlabeled data. We extend the results to a noisy setting where some of the examples labeled positive are in fact negative and show that the correction also requires the knowledge of the proportion of noisy examples in the labeled positives. Using state-of-the-art algorithms to estimate the positive class prior and the proportion of noise, we experimentally evaluate two correction approaches and demonstrate their efficacy on real-life data. |
Tasks | |
Published | 2017-02-02 |
URL | http://arxiv.org/abs/1702.00518v1 |
http://arxiv.org/pdf/1702.00518v1.pdf | |
PWC | https://paperswithcode.com/paper/recovering-true-classifier-performance-in |
Repo | |
Framework | |
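The class-prior correction described in the abstract can be sketched numerically. As a simplification of the paper's noise-free setting, suppose the unlabeled set hides a fraction alpha of positives: an FPR measured against unlabeled-as-negative data is then a mixture of the true TPR and true FPR and can be inverted (the function below is illustrative, not the authors' implementation):

```python
def corrected_fpr(fpr_unlabeled, tpr, alpha):
    """Recover the true false-positive rate from one measured on
    unlabeled data treated as negative.

    fpr_unlabeled: FPR observed when all unlabeled examples count as negatives
    tpr:           true-positive rate (unaffected by the contamination)
    alpha:         proportion of hidden positives in the unlabeled set
    """
    # the observed rate mixes true positives and true negatives:
    # fpr_unlabeled = alpha * tpr + (1 - alpha) * fpr_true
    return (fpr_unlabeled - alpha * tpr) / (1.0 - alpha)

# a perfect classifier (tpr = 1, true fpr = 0) scored against unlabeled data
# containing 30% hidden positives appears to have fpr = 0.3
print(corrected_fpr(0.3, 1.0, 0.3))  # recovers 0.0
```

Applying this pointwise along a ROC curve yields the corrected curve, given an estimate of alpha.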
Loss Max-Pooling for Semantic Image Segmentation
Title | Loss Max-Pooling for Semantic Image Segmentation |
Authors | Samuel Rota Bulò, Gerhard Neuhold, Peter Kontschieder |
Abstract | We introduce a novel loss max-pooling concept for handling imbalanced training data distributions, applicable as an alternative loss layer in the context of deep neural networks for semantic image segmentation. Most real-world semantic segmentation datasets exhibit long tail distributions with few object categories comprising the majority of data and consequently biasing the classifiers towards them. Our method adaptively re-weights the contributions of each pixel based on their observed losses, targeting under-performing classification results as often encountered for under-represented object classes. Our approach goes beyond conventional cost-sensitive learning attempts through adaptive considerations that allow us to indirectly address both inter- and intra-class imbalances. We provide a theoretical justification of our approach, complementary to experimental analyses on benchmark datasets. In our experiments on the Cityscapes and Pascal VOC 2012 segmentation datasets we find consistently improved results, demonstrating the efficacy of our approach. |
Tasks | Semantic Segmentation |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.02966v1 |
http://arxiv.org/pdf/1704.02966v1.pdf | |
PWC | https://paperswithcode.com/paper/loss-max-pooling-for-semantic-image |
Repo | |
Framework | |
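A minimal sketch of the re-weighting idea (not the paper's exact method, which optimizes the pixel weighting over a norm-constrained set): averaging only the largest per-pixel losses concentrates the gradient on the hardest, typically under-represented, pixels:

```python
def top_k_pixel_loss(pixel_losses, k):
    # keep only the k hardest pixels; the remaining pixels get zero weight,
    # so easy (usually majority-class) pixels no longer dominate the loss
    hardest = sorted(pixel_losses, reverse=True)[:k]
    return sum(hardest) / k

losses = [0.1, 0.2, 2.0, 4.0]        # two easy pixels, two hard ones
print(top_k_pixel_loss(losses, 2))   # 3.0 -- driven by the hard pixels only
```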
Structured Optimal Transport
Title | Structured Optimal Transport |
Authors | David Alvarez-Melis, Tommi S. Jaakkola, Stefanie Jegelka |
Abstract | Optimal Transport has recently gained interest in machine learning for applications ranging from domain adaptation and sentence similarity to deep learning. Yet, its ability to capture frequently occurring structure beyond the “ground metric” is limited. In this work, we develop a nonlinear generalization of (discrete) optimal transport that is able to reflect much additional structure. We demonstrate how to leverage the geometry of this new model for fast algorithms, and explore connections and properties. Illustrative experiments highlight the benefit of the induced structured couplings for tasks in domain adaptation and natural language processing. |
Tasks | Domain Adaptation |
Published | 2017-12-17 |
URL | http://arxiv.org/abs/1712.06199v1 |
http://arxiv.org/pdf/1712.06199v1.pdf | |
PWC | https://paperswithcode.com/paper/structured-optimal-transport |
Repo | |
Framework | |
Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations
Title | Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations |
Authors | Yiping Lu, Aoxiao Zhong, Quanzheng Li, Bin Dong |
Abstract | In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture) which is inspired by the linear multi-step method for solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like network. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture on ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. In particular, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress ($>50$%) the original networks while maintaining a similar performance. This can be explained mathematically using the concept of modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process which helps to improve generalization of the networks. Furthermore, by relating stochastic training strategy with stochastic dynamic system, we can easily apply stochastic training to the networks with the LM-architecture. As an example, we introduce stochastic depth to LM-ResNet and achieve significant improvement over the original LM-ResNet on CIFAR10. |
Tasks | |
Published | 2017-10-27 |
URL | https://arxiv.org/abs/1710.10121v3 |
https://arxiv.org/pdf/1710.10121v3.pdf | |
PWC | https://paperswithcode.com/paper/beyond-finite-layer-neural-networks-bridging |
Repo | |
Framework | |
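The linear multi-step idea can be illustrated with a scalar two-step update. Assuming, consistent with the abstract, a learnable mixing coefficient k per block, the LM-architecture combines the two previous states with the residual branch f:

```python
def lm_step(u_prev, u_curr, f, k):
    # LM-architecture update: u_{n+1} = (1 - k) * u_n + k * u_{n-1} + f(u_n)
    # a plain ResNet step is recovered at k = 0: u_{n+1} = u_n + f(u_n)
    return (1.0 - k) * u_curr + k * u_prev + f(u_curr)

# with k = 0 this is exactly a residual step
print(lm_step(0.0, 1.0, lambda u: 0.5 * u, 0.0))  # 1.5
```

In the networks themselves, u would be a feature tensor and f a convolutional block; the scalar version only shows the recurrence.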
SolarisNet: A Deep Regression Network for Solar Radiation Prediction
Title | SolarisNet: A Deep Regression Network for Solar Radiation Prediction |
Authors | Subhadip Dey, Sawon Pratiher, Saon Banerjee, Chanchal Kumar Mukherjee |
Abstract | Effective utilization of photovoltaic (PV) plants requires global solar radiation (GSR) forecasting models that are robust to weather variability. Random weather turbulence phenomena, coupled with the assumptions of the clear-sky model suggested by Hottel, pose significant challenges to parametric and non-parametric models in GSR conversion-rate estimation. Moreover, a decent GSR estimate requires a costly high-tech radiometer and expert-dependent instrument handling and measurements, which are subjective. As such, we develop a computer-aided monitoring (CAM) system that evaluates PV plant operation feasibility by employing smart-grid past data analytics and deep learning. Our algorithm, SolarisNet, is a 6-layer deep neural network trained on data collected at two weather stations located near the Kalyani meteorological site, West Bengal, India. The daily GSR prediction performance of SolarisNet outperforms the existing state of the art, and we discuss its efficacy in inferring insights from past GSR data to comprehend daily and seasonal GSR variability, along with its competence for short-term forecasting. |
Tasks | |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08413v2 |
http://arxiv.org/pdf/1711.08413v2.pdf | |
PWC | https://paperswithcode.com/paper/solarisnet-a-deep-regression-network-for |
Repo | |
Framework | |
Mastering Sketching: Adversarial Augmentation for Structured Prediction
Title | Mastering Sketching: Adversarial Augmentation for Structured Prediction |
Authors | Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa |
Abstract | We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly so that the discriminator network discerns whether a line drawing is real training data or the output of the simplification network, which in turn tries to fool it. This approach has two major advantages. First, because the discriminator network learns the structure in line drawings, it encourages the output sketches of the simplification network to be more similar in appearance to the training sketches. Second, we can also train the simplification network with additional unsupervised data, using the discriminator network as a substitute teacher. Thus, by adding only rough sketches without simplified line drawings, or only line drawings without the original rough sketches, we can improve the quality of the sketch simplification. We show how our framework can be used to train models that significantly outperform the state of the art in the sketch simplification task, despite using the same architecture for inference. We additionally present an approach to optimize for a single image, which improves accuracy at the cost of additional computation time. Finally, we show that, using the same framework, it is possible to train the network to perform the inverse problem, i.e., convert simple line sketches into pencil drawings, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, where our approach is preferred to the state of the art in sketch simplification 92.3% of the time and obtains 1.2 more points on a scale of 1 to 5. |
Tasks | Structured Prediction |
Published | 2017-03-27 |
URL | http://arxiv.org/abs/1703.08966v1 |
http://arxiv.org/pdf/1703.08966v1.pdf | |
PWC | https://paperswithcode.com/paper/mastering-sketching-adversarial-augmentation |
Repo | |
Framework | |
Broadcasting Convolutional Network for Visual Relational Reasoning
Title | Broadcasting Convolutional Network for Visual Relational Reasoning |
Authors | Simyung Chang, John Yang, Seonguk Park, Nojun Kwak |
Abstract | In this paper, we propose the Broadcasting Convolutional Network (BCN) that extracts key object features from the global field of an entire input image and recognizes their relationship with local features. BCN is a simple network module that collects effective spatial features, embeds location information and broadcasts them to the entire feature maps. We further introduce the Multi-Relational Network (multiRN) that improves the existing Relation Network (RN) by utilizing the BCN module. In pixel-based relation reasoning problems, with the help of BCN, multiRN extends the concept of ‘pairwise relations’ in conventional RNs to ‘multiwise relations’ by relating each object with multiple objects at once. This yields O(n) complexity for n objects, a vast computational gain over RNs, which take O(n^2). Through experiments, multiRN has achieved state-of-the-art performance on the CLEVR dataset, which proves the usability of BCN on relation reasoning problems. |
Tasks | Relational Reasoning |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02517v3 |
http://arxiv.org/pdf/1712.02517v3.pdf | |
PWC | https://paperswithcode.com/paper/broadcasting-convolutional-network-for-visual |
Repo | |
Framework | |
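The broadcasting step itself is easy to sketch: a single global descriptor (e.g. pooled key-object features with embedded location information) is appended to every spatial position of a feature map, so each location can be related to the global context in one pass. A toy version on nested lists, with shapes chosen purely for illustration:

```python
def broadcast_concat(feature_map, global_vec):
    # append the same global descriptor to every spatial location,
    # concatenating it along the channel dimension
    return [[cell + global_vec for cell in row] for row in feature_map]

fmap = [[[1.0], [2.0]],
        [[3.0], [4.0]]]              # a 2x2 map with one channel
out = broadcast_concat(fmap, [9.0, 8.0])
print(out[0][0])  # [1.0, 9.0, 8.0] -- local feature + broadcast global vector
```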
Large Margin Learning in Set to Set Similarity Comparison for Person Re-identification
Title | Large Margin Learning in Set to Set Similarity Comparison for Person Re-identification |
Authors | Sanping Zhou, Jinjun Wang, Rui Shi, Qiqi Hou, Yihong Gong, Nanning Zheng |
Abstract | Person re-identification (Re-ID) aims at matching images of the same person across disjoint camera views, which is a challenging problem in the multimedia analysis, multimedia editing and content-based media retrieval communities. The major challenge lies in how to preserve similarity of the same person across video footage with large appearance variations, while discriminating different individuals. To address this problem, conventional methods usually consider the pairwise similarity between persons by only measuring the point to point (P2P) distance. In this paper, we propose to use deep learning to model a novel set to set (S2S) distance, in which the underlying objective focuses on preserving the compactness of intra-class samples for each camera view, while maximizing the margin between the intra-class set and inter-class set. The S2S distance metric consists of three terms, namely the class-identity term, the relative distance term and the regularization term. The class-identity term keeps the intra-class samples within each camera view gathered together, the relative distance term maximizes the distance between the intra-class set and inter-class set across different camera views, and the regularization term smooths the parameters of the deep convolutional neural network (CNN). As a result, the final learned deep model can effectively find the matched target to the probe object among various candidates in the video gallery by learning discriminative and stable feature representations. Using the CUHK01, CUHK03, PRID2011 and Market1501 benchmark datasets, we conducted extensive comparative evaluations to demonstrate the advantages of our method over the state-of-the-art approaches. |
Tasks | Person Re-Identification |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.05512v1 |
http://arxiv.org/pdf/1708.05512v1.pdf | |
PWC | https://paperswithcode.com/paper/large-margin-learning-in-set-to-set |
Repo | |
Framework | |
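The margin structure of the relative distance term can be sketched as a hinge on set statistics. This is a simplification for intuition only, not the paper's full three-term objective (the class-identity and regularization terms are omitted):

```python
def s2s_margin_loss(intra_dists, inter_dists, margin):
    # penalize when the same-person set is not separated from the closest
    # different-person sample by at least the margin
    intra = max(intra_dists)   # spread of the same-person (intra-class) set
    inter = min(inter_dists)   # nearest different-person (inter-class) sample
    return max(0.0, margin + intra - inter)

# compact same-person set, distant impostors: zero loss
print(s2s_margin_loss([0.2, 0.3], [2.0, 3.0], 1.0))  # 0.0
```

When the intra-class spread approaches the inter-class gap, the hinge activates and pushes the sets apart, which is the set-level analogue of a P2P triplet margin.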
Deep driven fMRI decoding of visual categories
Title | Deep driven fMRI decoding of visual categories |
Authors | Michele Svanera, Sergio Benini, Gal Raz, Talma Hendler, Rainer Goebel, Giancarlo Valente |
Abstract | Deep neural networks have been developed drawing inspiration from the brain visual pathway, implementing an end-to-end approach: from image data to video object classes. However, building an fMRI decoder with the typical structure of a Convolutional Neural Network (CNN), i.e. learning multiple levels of representation, seems impractical due to the lack of brain data. As a possible solution, this work presents the first hybrid fMRI and deep features decoding approach: collected fMRI and deep learnt representations of video object classes are linked together by means of Kernel Canonical Correlation Analysis. In decoding, this allows exploiting the discriminatory power of the CNN by relating the fMRI representation to its last layer (fc7). We show the effectiveness of embedding fMRI data onto a subspace related to deep features in distinguishing semantic visual categories based solely on brain imaging data. |
Tasks | |
Published | 2017-01-09 |
URL | http://arxiv.org/abs/1701.02133v1 |
http://arxiv.org/pdf/1701.02133v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-driven-fmri-decoding-of-visual |
Repo | |
Framework | |
Rationally Biased Learning
Title | Rationally Biased Learning |
Authors | Michel De Lara |
Abstract | Are human perception and decision biases grounded in a form of rationality? You return to your camp after hunting or gathering. You see the grass moving. You do not know the probability that a snake is in the grass. Should you cross the grass - at the risk of being bitten by a snake - or make a long, hence costly, detour? Based on this storyline, we consider a rational decision maker maximizing expected discounted utility with learning. We show that his optimal behavior displays three biases: status quo, salience, overestimation of small probabilities. Biases can be the product of rational behavior. |
Tasks | |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.02256v1 |
http://arxiv.org/pdf/1709.02256v1.pdf | |
PWC | https://paperswithcode.com/paper/rationally-biased-learning |
Repo | |
Framework | |
A Review of Evaluation Techniques for Social Dialogue Systems
Title | A Review of Evaluation Techniques for Social Dialogue Systems |
Authors | Amanda Cercas Curry, Helen Hastie, Verena Rieser |
Abstract | In contrast with goal-oriented dialogue, social dialogue has no clear measure of task success. Consequently, evaluation of these systems is notoriously hard. In this paper, we review current evaluation methods, focusing on automatic metrics. We conclude that turn-based metrics often ignore the context and do not account for the fact that several replies are valid, while end-of-dialogue rewards are mainly hand-crafted. Both lack grounding in human perceptions. |
Tasks | |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.04409v1 |
http://arxiv.org/pdf/1709.04409v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-evaluation-techniques-for-social |
Repo | |
Framework | |
MR to X-Ray Projection Image Synthesis
Title | MR to X-Ray Projection Image Synthesis |
Authors | Bernhard Stimpel, Christopher Syben, Tobias Würfl, Katrin Mentl, Arnd Dörfler, Andreas Maier |
Abstract | Hybrid imaging promises large potential in medical imaging applications. To fully utilize the possibilities of corresponding information from different modalities, the information must be transferable between the domains. In radiation therapy planning, existing methods make use of reconstructed 3D magnetic resonance imaging data to synthesize corresponding X-ray attenuation maps. In contrast, for fluoroscopic procedures only line integral data, i.e., 2D projection images, are present. The question arises which approaches could potentially be used for this MR to X-ray projection image-to-image translation. We examine three network architectures and two loss functions regarding their suitability as generator networks for this task. All generators proved to yield suitable results. A cascaded refinement network paired with a perceptual loss function achieved the best qualitative results in our evaluation. The perceptual loss proved able to preserve most of the high-frequency details in the projection images and is therefore recommended for the underlying task and similar problems. |
Tasks | Image Generation, Image-to-Image Translation |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07498v2 |
http://arxiv.org/pdf/1710.07498v2.pdf | |
PWC | https://paperswithcode.com/paper/mr-to-x-ray-projection-image-synthesis |
Repo | |
Framework | |
Minimal Exploration in Structured Stochastic Bandits
Title | Minimal Exploration in Structured Stochastic Bandits |
Authors | Richard Combes, Stefan Magureanu, Alexandre Proutiere |
Abstract | This paper introduces and addresses a wide class of stochastic bandit problems where the function mapping the arm to the corresponding reward exhibits some known structural properties. Most existing structures (e.g. linear, Lipschitz, unimodal, combinatorial, dueling, …) are covered by our framework. We derive an asymptotic instance-specific regret lower bound for these problems, and develop OSSB, an algorithm whose regret matches this fundamental limit. OSSB is not based on the classical principle of “optimism in the face of uncertainty” or on Thompson sampling, and rather aims at matching the minimal exploration rates of sub-optimal arms as characterized in the derivation of the regret lower bound. We illustrate the efficiency of OSSB using numerical experiments in the case of the linear bandit problem and show that OSSB outperforms existing algorithms, including Thompson sampling. |
Tasks | |
Published | 2017-11-01 |
URL | http://arxiv.org/abs/1711.00400v1 |
http://arxiv.org/pdf/1711.00400v1.pdf | |
PWC | https://paperswithcode.com/paper/minimal-exploration-in-structured-stochastic |
Repo | |
Framework | |
IVE-GAN: Invariant Encoding Generative Adversarial Networks
Title | IVE-GAN: Invariant Encoding Generative Adversarial Networks |
Authors | Robin Winter, Djork-Arné Clevert |
Abstract | Generative adversarial networks (GANs) are a powerful framework for generative tasks. However, they are difficult to train and tend to miss modes of the true data generation process. Although GANs can learn a rich representation of the covered modes of the data in their latent space, the framework misses an inverse mapping from data to this latent space. We propose Invariant Encoding Generative Adversarial Networks (IVE-GANs), a novel GAN framework that introduces such a mapping for individual samples from the data by utilizing features in the data which are invariant to certain transformations. Since the model maps individual samples to the latent space, it naturally encourages the generator to cover all modes. We demonstrate the effectiveness of our approach in terms of generative performance and learning rich representations on several datasets including common benchmark image generation tasks. |
Tasks | Image Generation |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08646v1 |
http://arxiv.org/pdf/1711.08646v1.pdf | |
PWC | https://paperswithcode.com/paper/ive-gan-invariant-encoding-generative |
Repo | |
Framework | |
Embedded Binarized Neural Networks
Title | Embedded Binarized Neural Networks |
Authors | Bradley McDanel, Surat Teerapittayanon, H. T. Kung |
Abstract | We study embedded Binarized Neural Networks (eBNNs) with the aim of allowing current binarized neural networks (BNNs) in the literature to perform feedforward inference efficiently on small embedded devices. We focus on minimizing the required memory footprint, given that these devices often have memory as small as tens of kilobytes (KB). Beyond minimizing the memory required to store weights, as in a BNN, we show that it is essential to minimize the memory used for temporaries which hold intermediate results between layers in feedforward inference. To accomplish this, eBNN reorders the computation of inference while preserving the original BNN structure, and uses just a single floating-point temporary for the entire neural network. All intermediate results from a layer are stored as binary values, as opposed to floating-points used in current BNN implementations, leading to a 32x reduction in required temporary space. We provide empirical evidence that our proposed eBNN approach allows efficient inference (10s of ms) on devices with severely limited memory (10s of KB). For example, eBNN achieves 95% accuracy on the MNIST dataset running on an Intel Curie with only 15 KB of usable memory with an inference runtime of under 50 ms per sample. To ease the development of applications in embedded contexts, we make our source code available that allows users to train and discover eBNN models for a learning task at hand, which fit within the memory constraint of the target device. |
Tasks | |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.02260v1 |
http://arxiv.org/pdf/1709.02260v1.pdf | |
PWC | https://paperswithcode.com/paper/embedded-binarized-neural-networks |
Repo | |
Framework | |
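The memory savings in eBNN come from keeping activations binary between layers. The core operation of such inference, a binary dot product via XNOR and popcount, can be sketched with plain bit operations; this hypothetical minimal version packs {-1, +1} vectors into Python ints and is not the authors' released code:

```python
def binary_dot(a_bits, w_bits, n):
    """Dot product of two length-n {-1, +1} vectors, each packed into an
    int (bit 1 encodes +1, bit 0 encodes -1), via XNOR + popcount."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)   # bit set where signs agree
    matches = bin(xnor).count("1")
    return 2 * matches - n                        # agreements minus disagreements

# a = [+1, -1, +1] -> 0b101, w = [+1, +1, -1] -> 0b110
print(binary_dot(0b101, 0b110, 3))  # -1, same as 1*1 + (-1)*1 + 1*(-1)
```

Because each 32-element chunk of activations fits in one machine word, intermediate results need 1 bit per value instead of a 32-bit float, which is the 32x temporary-space reduction the abstract describes.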