May 7, 2019

Paper Group AWR 80

Deep Networks with Stochastic Depth

Title Deep Networks with Stochastic Depth
Authors Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger
Abstract Very deep convolutional networks with hundreds of layers have led to significant reductions in error on competitive benchmarks. Although the unmatched expressiveness of the many layers can be highly desirable at test time, training very deep networks comes with its own set of challenges. The gradients can vanish, the forward flow often diminishes, and the training time can be painfully slow. To address these problems, we propose stochastic depth, a training procedure that enables the seemingly contradictory setup to train short networks and use deep networks at test time. We start with very deep networks but during training, for each mini-batch, randomly drop a subset of layers and bypass them with the identity function. This simple approach complements the recent success of residual networks. It reduces training time substantially and improves the test error significantly on almost all data sets that we used for evaluation. With stochastic depth we can increase the depth of residual networks even beyond 1200 layers and still yield meaningful improvements in test error (4.91% on CIFAR-10).
Tasks Image Classification
Published 2016-03-30
URL http://arxiv.org/abs/1603.09382v3
PDF http://arxiv.org/pdf/1603.09382v3.pdf
PWC https://paperswithcode.com/paper/deep-networks-with-stochastic-depth
Repo https://github.com/dblN/stochastic_depth_keras
Framework none
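
The training rule is simple enough to capture in a few lines. Below is a minimal PyTorch sketch of a residual block with stochastic depth: during training, the residual branch is dropped (pure identity) with probability $1-p$, and at test time its output is scaled by the survival probability $p$. The conv-BN-ReLU branch and the fixed $p$ are illustrative; the paper sets per-block survival probabilities that decay linearly with depth.

```python
import torch
import torch.nn as nn

class StochasticResBlock(nn.Module):
    """Residual block trained with stochastic depth: the residual branch is
    skipped with probability 1 - p per mini-batch; at test time it is scaled
    by p so the output matches the training-time expectation."""
    def __init__(self, channels, survival_prob=0.8):
        super().__init__()
        self.p = survival_prob
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        if self.training:
            if torch.rand(()) < self.p:   # block survives this mini-batch
                return torch.relu(x + self.branch(x))
            return x                      # dropped: pure identity
        return torch.relu(x + self.p * self.branch(x))  # expectation at test time
```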

Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding

Title Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding
Authors Xiang Ren, Wenqi He, Meng Qu, Clare R. Voss, Heng Ji, Jiawei Han
Abstract Current systems of fine-grained entity typing use distant supervision in conjunction with existing knowledge bases to assign categories (type labels) to entity mentions. However, the type labels so obtained from knowledge bases are often noisy (i.e., incorrect for the entity mention’s local context). We define a new task, Label Noise Reduction in Entity Typing (LNR), to be the automatic identification of correct type labels (type-paths) for training examples, given the set of candidate type labels obtained by distant supervision with a given type hierarchy. The unknown type labels for individual entity mentions and the semantic similarity between entity types pose unique challenges for solving the LNR task. We propose a general framework, called PLE, to jointly embed entity mentions, text features and entity types into the same low-dimensional space, where objects whose types are semantically close have similar representations. Then we estimate the type-path for each training example in a top-down manner using the learned embeddings. We formulate a global objective for learning the embeddings from text corpora and knowledge bases, which adopts a novel margin-based loss that is robust to noisy labels and faithfully models type correlation derived from knowledge bases. Our experiments on three public typing datasets demonstrate the effectiveness and robustness of PLE, with an average of 25% improvement in accuracy compared to the next-best method.
Tasks Entity Typing, Semantic Similarity, Semantic Textual Similarity
Published 2016-02-17
URL http://arxiv.org/abs/1602.05307v1
PDF http://arxiv.org/pdf/1602.05307v1.pdf
PWC https://paperswithcode.com/paper/label-noise-reduction-in-entity-typing-by
Repo https://github.com/shanzhenren/AFET
Framework none
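
To make the margin-based idea concrete, one plausible form of a partial-label loss (schematic, not necessarily the paper's exact objective) asks that the best-scoring candidate type beat every non-candidate type by a margin:

$$\ell(m) \;=\; \max\Bigl(0,\; 1 - \max_{y \in \mathcal{Y}_m} \phi(m, y) + \max_{y' \in \bar{\mathcal{Y}}_m} \phi(m, y')\Bigr),$$

where $\mathcal{Y}_m$ are the candidate types of mention $m$ obtained by distant supervision, $\bar{\mathcal{Y}}_m$ the remaining types, and $\phi$ a similarity score between mention and type embeddings. A loss of this shape is robust to noisy candidates because only the best candidate, not every candidate, needs to score high.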

CSVideoNet: A Real-time End-to-end Learning Framework for High-frame-rate Video Compressive Sensing

Title CSVideoNet: A Real-time End-to-end Learning Framework for High-frame-rate Video Compressive Sensing
Authors Kai Xu, Fengbo Ren
Abstract This paper addresses the real-time encoding-decoding problem for high-frame-rate video compressive sensing (CS). Unlike prior works that perform reconstruction using iterative optimization-based approaches, we propose a non-iterative model, named “CSVideoNet”. CSVideoNet directly learns the inverse mapping of CS and reconstructs the original input in a single forward propagation. To overcome the limitations of existing CS cameras, we propose a multi-rate CNN and a synthesizing RNN to improve the trade-off between compression ratio (CR) and the spatial-temporal resolution of the reconstructed videos. The experimental results demonstrate that CSVideoNet significantly outperforms the state-of-the-art approaches. With no pre/post-processing, we achieve 25 dB PSNR recovery quality at 100x CR, with a frame rate of 125 fps on a Titan X GPU. Due to the feedforward, high-data-concurrency nature of CSVideoNet, it can take advantage of GPU acceleration to achieve a speed-up of three orders of magnitude over conventional iterative approaches. We share the source code at https://github.com/PSCLab-ASU/CSVideoNet.
Tasks Compressive Sensing, Video Compressive Sensing
Published 2016-12-15
URL http://arxiv.org/abs/1612.05203v5
PDF http://arxiv.org/pdf/1612.05203v5.pdf
PWC https://paperswithcode.com/paper/csvideonet-a-real-time-end-to-end-learning
Repo https://github.com/calmevtime/CSImageNet
Framework none
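
For context, the contrast can be written schematically (notation mine, not the paper's): a conventional CS decoder recovers each frame by iterative optimization, while CSVideoNet amortizes the inverse mapping into one forward pass of a learned network $f_W$:

$$\hat{x}_{\text{iter}} = \arg\min_{x}\ \tfrac{1}{2}\lVert y - \Phi x \rVert_2^2 + \lambda R(x), \qquad \hat{x}_{\text{CSVideoNet}} = f_W(y), \quad y = \Phi x,$$

with $\Phi$ the measurement matrix, $R$ a hand-crafted prior, and the compression ratio given by $\dim(x)/\dim(y)$.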

Convolutional Radio Modulation Recognition Networks

Title Convolutional Radio Modulation Recognition Networks
Authors Timothy J O’Shea, Johnathan Corgan, T. Charles Clancy
Abstract We study the adaptation of convolutional neural networks to the complex temporal radio signal domain. We compare the efficacy of radio modulation classification using naively learned features against expert features that are widely used in the field today, and we show significant performance improvements. We show that blind temporal learning on large and densely encoded time series using deep convolutional neural networks is viable and a strong candidate approach for this task, especially at low signal-to-noise ratios.
Tasks Time Series
Published 2016-02-12
URL http://arxiv.org/abs/1602.04105v3
PDF http://arxiv.org/pdf/1602.04105v3.pdf
PWC https://paperswithcode.com/paper/convolutional-radio-modulation-recognition
Repo https://github.com/randaller/cnn-rtlsdr
Framework tf
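
A hedged sketch of the kind of network involved: a small CNN over raw I/Q samples, treated as a 2-row "image" (in-phase and quadrature). The layer sizes, the 128-sample window, and the 11 output classes are illustrative choices in the spirit of the paper, not its exact architecture.

```python
import torch.nn as nn

# Input shape: (batch, 1, 2, 128) -- one "channel", two rows (I and Q),
# 128 complex samples per example.
model = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=(1, 3), padding=(0, 1)), nn.ReLU(),
    nn.Conv2d(64, 16, kernel_size=(2, 3), padding=(0, 1)), nn.ReLU(),
    nn.Flatten(),                    # 16 feature maps of size 1 x 128
    nn.Linear(16 * 128, 128), nn.ReLU(),
    nn.Linear(128, 11),              # one logit per modulation class
)
```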

Confidence driven TGV fusion

Title Confidence driven TGV fusion
Authors Valsamis Ntouskos, Fiora Pirri
Abstract We introduce a novel model for spatially varying variational data fusion, driven by point-wise confidence values. The proposed model allows for the joint estimation of the data and the confidence values based on the spatial coherence of the data. We discuss the main properties of the introduced model as well as suitable algorithms for estimating the solution of the corresponding biconvex minimization problem and their convergence. The performance of the proposed model is evaluated considering the problem of depth image fusion by using both synthetic and real data from publicly available datasets.
Tasks
Published 2016-03-30
URL http://arxiv.org/abs/1603.09302v2
PDF http://arxiv.org/pdf/1603.09302v2.pdf
PWC https://paperswithcode.com/paper/confidence-driven-tgv-fusion
Repo https://github.com/alcor-vision/confidence-fusion
Framework none
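
Schematically (my notation, not the paper's exact energy), confidence-driven fusion of several depth maps $f_i$ can be written as a joint minimization over the fused image $u$ and point-wise confidences $c_i$:

$$\min_{u,\,c}\ \sum_i \int_\Omega c_i(x)\,\lvert u(x) - f_i(x) \rvert\,dx \;+\; \mathrm{TGV}_\alpha^2(u) \;+\; \mathcal{R}(c),$$

where $\mathrm{TGV}_\alpha^2$ is the second-order total generalized variation regularizer and $\mathcal{R}$ ties the confidences to the spatial coherence of the data. A problem of this shape is biconvex, convex in $u$ for fixed $c$ and vice versa, which is why alternating minimization is the natural algorithmic choice.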

Deep Learning without Poor Local Minima

Title Deep Learning without Poor Local Minima
Authors Kenji Kawaguchi
Abstract In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. With no unrealistic assumption, we first prove the following statements for the squared loss function of deep linear neural networks with any depth and any widths: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) there exist “bad” saddle points (where the Hessian has no negative eigenvalue) for the deeper networks (with more than three layers), whereas there is no bad saddle point for the shallow networks (with three layers). Moreover, for deep nonlinear neural networks, we prove the same four statements via a reduction to a deep linear model under the independence assumption adopted from recent work. As a result, we present an instance, for which we can answer the following question: how difficult is it to directly train a deep model in theory? It is more difficult than the classical machine learning models (because of the non-convexity), but not too difficult (because of the nonexistence of poor local minima). Furthermore, the mathematically proven existence of bad saddle points for deeper models would suggest a possible open problem. We note that even though we have advanced the theoretical foundations of deep learning and non-convex optimization, there is still a gap between theory and practice.
Tasks
Published 2016-05-23
URL http://arxiv.org/abs/1605.07110v3
PDF http://arxiv.org/pdf/1605.07110v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-without-poor-local-minima
Repo https://github.com/yijiazh/DFER_Summer2019
Framework tf
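
For reference, the object of study in the linear case is the squared loss of a depth-$H$ linear network (notation mine):

$$\mathcal{L}(W_1, \dots, W_H) \;=\; \tfrac{1}{2}\,\bigl\lVert W_H W_{H-1} \cdots W_1 X - Y \bigr\rVert_F^2,$$

which is non-convex in the weights $W_1, \dots, W_H$ jointly even though the end-to-end map is linear; the paper's results say this landscape nevertheless contains no poor local minima.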

A Powerful Generative Model Using Random Weights for the Deep Image Representation

Title A Powerful Generative Model Using Random Weights for the Deep Image Representation
Authors Kun He, Yan Wang, John Hopcroft
Abstract To what extent is the success of deep visualization due to the training? Could we do deep visualization using untrained, random weight networks? To address this issue, we explore new and powerful generative models for three popular deep visualization tasks using untrained, random weight convolutional neural networks. First we invert representations in feature spaces and reconstruct images from white noise inputs. The reconstruction quality is statistically higher than that of the same method applied on well trained networks with the same architecture. Next we synthesize textures using scaled correlations of representations in multiple layers, and our results are almost indistinguishable from the original natural textures and from the textures synthesized using the trained network. Third, by recasting the content of an image in the style of various artworks, we create artistic images with high perceptual quality, highly competitive with the prior work of Gatys et al. on pretrained networks. To our knowledge this is the first demonstration of image representations using untrained deep neural networks. Our work provides a new and fascinating tool for studying the representations of deep network architectures and sheds new light on deep visualization.
Tasks
Published 2016-06-15
URL http://arxiv.org/abs/1606.04801v2
PDF http://arxiv.org/pdf/1606.04801v2.pdf
PWC https://paperswithcode.com/paper/a-powerful-generative-model-using-random
Repo https://github.com/inzouzouwetrust/IMA_RWCNN_project
Framework none
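
The inversion step can be sketched in a few lines: freeze a randomly initialized CNN, then optimize a white-noise input so that its features match those of a target image. The architecture, layer choice, and optimizer settings below are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(                      # untrained: weights stay random
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
).eval()
for p in net.parameters():
    p.requires_grad_(False)

target = torch.rand(1, 3, 64, 64)         # stand-in for a real image
feat_target = net(target)                 # representation to invert

x = torch.randn(1, 3, 64, 64, requires_grad=True)  # white-noise start
opt = torch.optim.Adam([x], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = (net(x) - feat_target).pow(2).mean()  # match features
    loss.backward()                              # gradients flow to x only
    opt.step()
```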

Partial Membership Latent Dirichlet Allocation

Title Partial Membership Latent Dirichlet Allocation
Authors Chao Chen, Alina Zare, Huy Trinh, Gbeng Omotara, J. Tory Cobb, Timotius Lagaunne
Abstract Topic models (e.g., pLSA, LDA, sLDA) have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership latent Dirichlet allocation (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations; a capability previous topic modeling methods do not have.
Tasks Topic Models
Published 2016-12-28
URL http://arxiv.org/abs/1612.08936v1
PDF http://arxiv.org/pdf/1612.08936v1.pdf
PWC https://paperswithcode.com/paper/partial-membership-latent-dirichlet-1
Repo https://github.com/TigerSense/PMLDA
Framework none
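
The contrast with standard LDA is easiest to see in the likelihood. LDA assigns each visual word $x_n$ a single crisp topic $z_n$; a partial-membership model instead gives each word a membership vector $\pi_n$ on the simplex, blending topics geometrically (a schematic form in the style of partial-membership models; the paper's exact generative story may differ):

$$p(x_n \mid \pi_n, \beta) \;\propto\; \prod_{k=1}^{K} p(x_n \mid \beta_k)^{\pi_{nk}}, \qquad \pi_{nk} \ge 0,\ \ \sum_{k} \pi_{nk} = 1,$$

so a patch in a fog-to-ground transition region can carry, say, 60% "sky" and 40% "ground" membership instead of being forced into a single topic.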

TristouNet: Triplet Loss for Speaker Turn Embedding

Title TristouNet: Triplet Loss for Speaker Turn Embedding
Authors Hervé Bredin
Abstract TristouNet is a neural network architecture based on Long Short-Term Memory recurrent networks, meant to project speech sequences into a fixed-dimensional Euclidean space. Thanks to the triplet loss paradigm used for training, the resulting sequence embeddings can be compared directly with the Euclidean distance, for speaker comparison purposes. Experiments on short (between 500 ms and 5 s) speech turn comparison and speaker change detection show that TristouNet brings significant improvements over the current state-of-the-art techniques for both tasks.
Tasks
Published 2016-09-14
URL http://arxiv.org/abs/1609.04301v3
PDF http://arxiv.org/pdf/1609.04301v3.pdf
PWC https://paperswithcode.com/paper/tristounet-triplet-loss-for-speaker-turn
Repo https://github.com/jsalt-coml/babytrain_multilabel
Framework tf
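
The loss itself is compact. A minimal PyTorch sketch (the margin value is an illustrative choice, not the paper's):

```python
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Euclidean triplet loss: pull same-speaker embeddings together and
    push different-speaker embeddings at least `margin` further apart."""
    d_pos = F.pairwise_distance(anchor, positive)   # same speaker
    d_neg = F.pairwise_distance(anchor, negative)   # different speaker
    return F.relu(d_pos - d_neg + margin).mean()
```

Because training optimizes Euclidean distances directly, the learned embeddings can be compared with plain `F.pairwise_distance` at test time, with no extra scoring model.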

Comparative evaluation of state-of-the-art algorithms for SSVEP-based BCIs

Title Comparative evaluation of state-of-the-art algorithms for SSVEP-based BCIs
Authors Vangelis P. Oikonomou, Georgios Liaros, Kostantinos Georgiadis, Elisavet Chatzilari, Katerina Adam, Spiros Nikolopoulos, Ioannis Kompatsiaris
Abstract Brain-computer interfaces (BCIs) have been gaining momentum in making human-computer interaction more natural, especially for people with neuro-muscular disabilities. Among the existing solutions the systems relying on electroencephalograms (EEG) occupy the most prominent place due to their non-invasiveness. However, the process of translating EEG signals into computer commands is far from trivial, since it requires the optimization of many different parameters that need to be tuned jointly. In this report, we focus on the category of EEG-based BCIs that rely on Steady-State-Visual-Evoked Potentials (SSVEPs) and perform a comparative evaluation of the most promising algorithms existing in the literature. More specifically, we define a set of algorithms for each of the different parameters composing a BCI system (i.e. filtering, artifact removal, feature extraction, feature selection and classification) and study each parameter independently by keeping all other parameters fixed. The results obtained from this evaluation process are provided together with a dataset consisting of the 256-channel EEG signals of 11 subjects, as well as a processing toolbox for reproducing the results and supporting further experimentation. In this way, we manage to make available for the community a state-of-the-art baseline for SSVEP-based BCIs that can be used as a basis for introducing novel methods and approaches.
Tasks EEG, Feature Selection
Published 2016-02-02
URL http://arxiv.org/abs/1602.00904v2
PDF http://arxiv.org/pdf/1602.00904v2.pdf
PWC https://paperswithcode.com/paper/comparative-evaluation-of-state-of-the-art
Repo https://github.com/akhilmurali013/project
Framework none
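
Among the classic SSVEP detectors such comparisons cover, canonical correlation analysis (CCA) against sinusoidal reference signals is the standard baseline. A hedged NumPy/scikit-learn sketch (not the toolbox's exact pipeline):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_ssvep_score(eeg, freq, fs, n_harmonics=2):
    """Canonical correlation between a multi-channel EEG segment
    (samples x channels) and sin/cos references at `freq` and its
    harmonics; a higher correlation means a stronger SSVEP response."""
    t = np.arange(eeg.shape[0]) / fs
    refs = np.column_stack(
        [f(2 * np.pi * (h + 1) * freq * t)
         for h in range(n_harmonics) for f in (np.sin, np.cos)]
    )
    u, v = CCA(n_components=1).fit(eeg, refs).transform(eeg, refs)
    return np.corrcoef(u[:, 0], v[:, 0])[0, 1]

# Classify a trial by the stimulus frequency with the highest correlation:
# pred = max([8.0, 10.0, 12.0], key=lambda f: cca_ssvep_score(x, f, fs=256))
```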

A Comprehensive Performance Evaluation of Deformable Face Tracking “In-the-Wild”

Title A Comprehensive Performance Evaluation of Deformable Face Tracking “In-the-Wild”
Authors Grigorios G. Chrysos, Epameinondas Antonakos, Patrick Snape, Akshay Asthana, Stefanos Zafeiriou
Abstract Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as “in-the-wild”). This is partially attributed to the fact that comprehensive “in-the-wild” benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking “in-the-wild”. Until now, the performance has mainly been assessed qualitatively, by visual inspection of the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.
Tasks Face Alignment, Face Detection, Face Recognition
Published 2016-03-18
URL http://arxiv.org/abs/1603.06015v2
PDF http://arxiv.org/pdf/1603.06015v2.pdf
PWC https://paperswithcode.com/paper/a-comprehensive-performance-evaluation-of
Repo https://github.com/zhusz/CVPR15-CFSS
Framework none
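
Hybrid pipeline (c) is easiest to convey as pseudocode: track frame-to-frame, fall back to detection when the tracker's confidence drops, and run landmark localisation on every frame. All names below are hypothetical stand-ins, not an API from the paper:

```python
def track_video(frames, detector, tracker, landmarker, conf_thresh=0.5):
    """Hypothetical hybrid face-tracking loop: model-free tracking with
    detection-based re-initialization, plus per-frame landmark fitting."""
    box = detector(frames[0])                 # bootstrap with a detection
    tracker.init(frames[0], box)
    shapes = [landmarker(frames[0], box)]
    for frame in frames[1:]:
        box, conf = tracker.update(frame)
        if conf < conf_thresh:                # tracker drifted: re-detect
            box = detector(frame)
            tracker.init(frame, box)
        shapes.append(landmarker(frame, box))
    return shapes
```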

A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation

Title A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation
Authors Lei Tai, Jingwei Zhang, Ming Liu, Joschka Boedecker, Wolfram Burgard
Abstract Deep learning techniques have been widely applied, achieving state-of-the-art results in various fields of study. This survey focuses on deep learning solutions that target learning control policies for robotics applications. We carry out our discussions on the two main paradigms for learning control with deep networks: deep reinforcement learning and imitation learning. For deep reinforcement learning (DRL), we begin from traditional reinforcement learning algorithms, showing how they are extended to the deep context and which mechanisms can be added on top of the DRL algorithms. We then introduce representative works that utilize DRL to solve navigation and manipulation tasks in robotics. We continue our discussion on methods addressing the challenge of the reality gap for transferring DRL policies trained in simulation to real-world scenarios, and summarize robotics simulation platforms for conducting DRL research. For imitation learning, we go through its three main categories, behavior cloning, inverse reinforcement learning and generative adversarial imitation learning, by introducing their formulations and their corresponding robotics applications. Finally, we discuss the open challenges and research frontiers.
Tasks Imitation Learning
Published 2016-12-21
URL http://arxiv.org/abs/1612.07139v4
PDF http://arxiv.org/pdf/1612.07139v4.pdf
PWC https://paperswithcode.com/paper/a-survey-of-deep-network-solutions-for
Repo https://github.com/tccnchsu/study
Framework none

Pixel-Level Domain Transfer

Title Pixel-Level Domain Transfer
Authors Donggeun Yoo, Namil Kim, Sunggyun Park, Anthony S. Paek, In So Kweon
Abstract We present an image-conditional image generation model. The model transfers an input domain to a target domain at the semantic level, and generates the target image at the pixel level. To generate realistic target images, we employ the real/fake discriminator as in Generative Adversarial Nets, but also introduce a novel domain discriminator to make the generated image relevant to the input image. We verify our model through a challenging task of generating a piece of clothing from an input image of a dressed person. We present a high-quality clothing dataset containing the two domains, and succeed in demonstrating decent results.
Tasks Conditional Image Generation, Image Generation
Published 2016-03-24
URL http://arxiv.org/abs/1603.07442v3
PDF http://arxiv.org/pdf/1603.07442v3.pdf
PWC https://paperswithcode.com/paper/pixel-level-domain-transfer
Repo https://github.com/eliceio/FashionRetrieval
Framework tf
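
Written out schematically (my notation), the two discriminators give a minimax objective of the familiar GAN shape, with an extra pair-association term:

$$\min_G \max_{D_R,\,D_A}\ \mathbb{E}_t[\log D_R(t)] + \mathbb{E}_s[\log(1 - D_R(G(s)))] + \mathbb{E}_{(s,t)}[\log D_A(s,t)] + \mathbb{E}_s[\log(1 - D_A(s, G(s)))],$$

where $s$ is a source image (a dressed person), $t$ an associated real target (the clothing item), $D_R$ judges the realism of a target image alone, and $D_A$ judges whether a (source, target) pair is plausibly associated. This is a schematic form; the paper's exact losses may differ in detail.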

On the (im)possibility of fairness

Title On the (im)possibility of fairness
Authors Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian
Abstract What does it mean for an algorithm to be fair? Different papers use different notions of algorithmic fairness, and although these appear internally consistent, they also seem mutually incompatible. We present a mathematical setting in which the distinctions in previous papers can be made formal. In addition to characterizing the spaces of inputs (the “observed” space) and outputs (the “decision” space), we introduce the notion of a construct space: a space that captures unobservable, but meaningful variables for the prediction. We show that in order to prove desirable properties of the entire decision-making process, different mechanisms for fairness require different assumptions about the nature of the mapping from construct space to decision space. The results in this paper imply that future treatments of algorithmic fairness should more explicitly state assumptions about the relationship between constructs and observations.
Tasks Decision Making
Published 2016-09-23
URL http://arxiv.org/abs/1609.07236v1
PDF http://arxiv.org/pdf/1609.07236v1.pdf
PWC https://paperswithcode.com/paper/on-the-impossibility-of-fairness
Repo https://github.com/cteicher-m/loanBiases
Framework none
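
The setting can be summarized as a pair of maps between three spaces (notation mine):

$$\mathcal{C}\ \xrightarrow{\ g\ }\ \mathcal{O}\ \xrightarrow{\ f\ }\ \mathcal{D},$$

where $\mathcal{C}$ is the construct space of unobservable but meaningful variables, $\mathcal{O}$ the observed space of measured features, and $\mathcal{D}$ the decision space. A fairness guarantee about the learned map $f$ alone says nothing about the composite $f \circ g$ unless one also assumes something about the observation process $g$, which is where the incompatibilities between fairness notions arise.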

Reparameterization Gradients through Acceptance-Rejection Sampling Algorithms

Title Reparameterization Gradients through Acceptance-Rejection Sampling Algorithms
Authors Christian A. Naesseth, Francisco J. R. Ruiz, Scott W. Linderman, David M. Blei
Abstract Variational inference using the reparameterization trick has enabled large-scale approximate Bayesian inference in complex probabilistic models, leveraging stochastic optimization to sidestep intractable expectations. The reparameterization trick is applicable when we can simulate a random variable by applying a differentiable deterministic function on an auxiliary random variable whose distribution is fixed. For many distributions of interest (such as the gamma or Dirichlet), simulation of random variables relies on acceptance-rejection sampling. The discontinuity introduced by the accept-reject step means that standard reparameterization tricks are not applicable. We propose a new method that lets us leverage reparameterization gradients even when variables are outputs of an acceptance-rejection sampling algorithm. Our approach enables reparameterization on a larger class of variational distributions. In several studies of real and synthetic data, we show that the variance of the estimator of the gradient is significantly lower than that of other state-of-the-art methods. This leads to faster convergence of stochastic gradient variational inference.
Tasks Bayesian Inference, Stochastic Optimization
Published 2016-10-18
URL https://arxiv.org/abs/1610.05683v3
PDF https://arxiv.org/pdf/1610.05683v3.pdf
PWC https://paperswithcode.com/paper/reparameterization-gradients-through
Repo https://github.com/blei-lab/ars-reparameterization
Framework none
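
For the gamma distribution, the idea can be sketched concretely with the Marsaglia-Tsang sampler, whose transformation $h(\varepsilon, \alpha) = (\alpha - \tfrac{1}{3})(1 + \varepsilon/\sqrt{9\alpha - 3})^3$ is differentiable in the shape $\alpha$, so keeping the accepted $\varepsilon$ makes the sample itself differentiable. A minimal PyTorch sketch (valid for $\alpha \ge 1$; the paper's full estimator adds a score-function correction term that this sketch omits):

```python
import torch

def gamma_rsvi_sample(alpha):
    """One Gamma(alpha, 1) sample via Marsaglia-Tsang rejection, returned
    as a function of `alpha` so autograd can differentiate through it."""
    d = alpha - 1.0 / 3.0
    c = 1.0 / torch.sqrt(9.0 * d)
    while True:
        eps = torch.randn(())
        v = (1.0 + c * eps) ** 3
        if v <= 0:                       # proposal invalid, try again
            continue
        u = torch.rand(())
        # standard Marsaglia-Tsang acceptance test
        if torch.log(u) < 0.5 * eps**2 + d - d * v + d * torch.log(v):
            return d * v                 # differentiable w.r.t. alpha

alpha = torch.tensor(2.0, requires_grad=True)
z = gamma_rsvi_sample(alpha)
z.backward()                             # alpha.grad holds dz/dalpha
```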