Paper Group ANR 71
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?. Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples. Advocacy Learning: Learning through Competition and Class-Conditional Representations. Deep Non-Rigid Structure from Motion with Missing Data. RadioUNet: Fast Radio Map Estimation with Convolution …
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Title | How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? |
Authors | Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu |
Abstract | A recent line of research on deep learning focuses on the extremely over-parameterized setting, and shows that when the network width is larger than a high-degree polynomial of the training sample size $n$ and the inverse of the target accuracy $\epsilon^{-1}$, deep neural networks learned by (stochastic) gradient descent enjoy nice optimization and generalization guarantees. Very recently, it was shown that under a certain margin assumption on the training data, a polylogarithmic width condition suffices for two-layer ReLU networks to converge and generalize (Ji and Telgarsky, 2019). However, how much over-parameterization is sufficient to guarantee optimization and generalization for deep neural networks still remains an open question. In this work, we establish sharp optimization and generalization guarantees for deep ReLU networks. Under various assumptions made in previous work, our optimization and generalization guarantees hold with network width polylogarithmic in $n$ and $\epsilon^{-1}$. Our results push the study of over-parameterized deep neural networks towards more practical settings. |
Tasks | |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12360v2 |
https://arxiv.org/pdf/1911.12360v2.pdf | |
PWC | https://paperswithcode.com/paper/how-much-over-parameterization-is-sufficient |
Repo | |
Framework | |
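The abstract's quantitative point can be stated schematically. The display below is an editorial summary of the claimed improvement; the exact exponents, constants, and assumptions are in the paper and omitted here:

```latex
% Schematic contrast implied by the abstract (exponents and constants omitted):
% prior work required width m polynomial in n and 1/epsilon;
% this paper requires width m only polylogarithmic in n and 1/epsilon.
\[
  m \;\ge\; \mathrm{poly}\!\left(n,\ \epsilon^{-1}\right)
  \qquad\longrightarrow\qquad
  m \;\ge\; \mathrm{polylog}\!\left(n,\ \epsilon^{-1}\right)
\]
```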
Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples
Title | Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples |
Authors | Yinpeng Dong, Fan Bao, Hang Su, Jun Zhu |
Abstract | Sometimes it is not enough for a DNN to produce an outcome. For example, in applications such as healthcare, users need to understand the rationale of the decisions. Therefore, it is imperative to develop algorithms to learn models with good interpretability (Doshi-Velez 2017). An important factor that leads to the lack of interpretability of DNNs is the ambiguity of neurons, where a neuron may fire for various unrelated concepts. This work aims to increase the interpretability of DNNs on the whole image space by reducing the ambiguity of neurons. In this paper, we make the following contributions: 1) We propose a metric to quantitatively evaluate the consistency level of neurons in a network. 2) By leveraging adversarial examples, we find that the learned features of neurons are ambiguous. 3) We propose to improve the consistency of neurons on the adversarial example subset by an adversarial training algorithm with a consistent loss. |
Tasks | |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.09035v1 |
http://arxiv.org/pdf/1901.09035v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-interpretable-deep-neural-networks-by-1 |
Repo | |
Framework | |
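A minimal PyTorch sketch, under assumptions, of the kind of training the abstract describes: standard PGD adversarial training plus a consistency penalty pulling a layer's activations on adversarial inputs toward those on clean inputs. `model`, `features`, `lam`, and the PGD settings are illustrative, not the authors' exact algorithm or loss:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD; returns an adversarial version of x."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project back into the eps-ball
    return x_adv.detach()

def consistent_adv_loss(model, features, x, y, lam=1.0):
    """Cross-entropy on adversarial inputs plus a feature-consistency penalty.

    `features` is any callable returning an intermediate activation map;
    the penalty discourages a neuron from firing for unrelated concepts.
    """
    x_adv = pgd_attack(model, x, y)
    ce = F.cross_entropy(model(x_adv), y)
    consistency = F.mse_loss(features(x_adv), features(x).detach())
    return ce + lam * consistency
```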
Advocacy Learning: Learning through Competition and Class-Conditional Representations
Title | Advocacy Learning: Learning through Competition and Class-Conditional Representations |
Authors | Ian Fox, Jenna Wiens |
Abstract | We introduce advocacy learning, a novel supervised training scheme for attention-based classification problems. Advocacy learning relies on a framework consisting of two connected networks: 1) $N$ Advocates (one for each class), each of which outputs an argument in the form of an attention map over the input, and 2) a Judge, which predicts the class label based on these arguments. Each Advocate produces a class-conditional representation with the goal of convincing the Judge that the input example belongs to their class, even when the input belongs to a different class. Applied to several different classification tasks, we show that advocacy learning can lead to small improvements in classification accuracy over an identical supervised baseline. Through a series of follow-up experiments, we analyze when and how such class-conditional representations improve discriminative performance. Though somewhat counter-intuitive, a framework in which subnetworks are trained to competitively provide evidence in support of their class shows promise, in many cases performing on par with standard learning approaches. This provides a foundation for further exploration into competition and class-conditional representations in supervised learning. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02723v1 |
https://arxiv.org/pdf/1908.02723v1.pdf | |
PWC | https://paperswithcode.com/paper/advocacy-learning-learning-through |
Repo | |
Framework | |
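A hedged PyTorch sketch of the Advocate/Judge structure described above: one Advocate per class emits an attention map over the input, and the Judge classifies from the attended copies. Layer sizes, the sigmoid attention, and the channel-wise concatenation are illustrative choices, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class Advocate(nn.Module):
    """Produces a non-negative attention map the same size as the input."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, in_ch, 3, padding=1), nn.Sigmoid(),
        )
    def forward(self, x):
        return self.net(x)

class AdvocacyNet(nn.Module):
    def __init__(self, n_classes=10, in_ch=1):
        super().__init__()
        self.advocates = nn.ModuleList(Advocate(in_ch) for _ in range(n_classes))
        # The Judge sees all attended copies of the input, concatenated channel-wise.
        self.judge = nn.Sequential(
            nn.Conv2d(n_classes * in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )
    def forward(self, x):
        arguments = [adv(x) * x for adv in self.advocates]   # class-conditional evidence
        return self.judge(torch.cat(arguments, dim=1))

logits = AdvocacyNet()(torch.randn(8, 1, 28, 28))   # e.g. an MNIST-sized batch
```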
Deep Non-Rigid Structure from Motion with Missing Data
Title | Deep Non-Rigid Structure from Motion with Missing Data |
Authors | Chen Kong, Simon Lucey |
Abstract | Non-Rigid Structure from Motion (NRSfM) refers to the problem of reconstructing cameras and the 3D point cloud of a non-rigid object from an ensemble of images with 2D correspondences. Current NRSfM algorithms are limited from two perspectives: (i) the number of images, and (ii) the type of shape variability they can handle. These difficulties stem from the inherent conflict between the condition of the system and the degrees of freedom needing to be modeled – which has hampered its practical utility for many applications within vision. In this paper we propose a novel hierarchical sparse coding model for NRSfM which can overcome (i) and (ii) to such an extent that NRSfM can be applied to problems in vision previously thought too ill-posed. Our approach is realized in practice as the training of an unsupervised deep neural network (DNN) auto-encoder with a unique architecture that is able to disentangle pose from 3D structure. Using modern deep learning computational platforms allows us to solve NRSfM problems at an unprecedented scale and shape complexity. Our approach has no 3D supervision, relying solely on 2D point correspondences. Further, our approach is also able to handle missing/occluded 2D points without the need for matrix completion. Extensive experiments demonstrate the impressive performance of our approach, where we exhibit superior precision and robustness against all available state-of-the-art works, in some instances by an order of magnitude. We further propose a new quality measure (based on the network weights) which circumvents the need for 3D ground-truth to ascertain the confidence we have in the reconstructability. We believe our work to be a significant advance over the state-of-the-art in NRSfM. |
Tasks | Matrix Completion |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13123v2 |
https://arxiv.org/pdf/1907.13123v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-non-rigid-structure-from-motion |
Repo | |
Framework | |
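A toy numpy sketch of the measurement model NRSfM starts from: 2D observations produced by unknown orthographic cameras viewing a deforming 3D shape, with a visibility mask marking missing/occluded points. The random data is purely illustrative; the paper's hierarchical sparse-coding auto-encoder is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_points = 50, 30
S = rng.normal(size=(n_frames, 3, n_points))      # per-frame non-rigid 3D shape

def orthographic_camera():
    """First two rows of a random rotation: an orthographic projection."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q[:2]

W = np.stack([orthographic_camera() @ S[f] for f in range(n_frames)])  # (F, 2, P)
mask = rng.random((n_frames, 1, n_points)) > 0.3  # ~30% of points unobserved
W_observed = np.where(mask, W, np.nan)            # the input an NRSfM solver sees
```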
RadioUNet: Fast Radio Map Estimation with Convolutional Neural Networks
Title | RadioUNet: Fast Radio Map Estimation with Convolutional Neural Networks |
Authors | Ron Levie, Çağkan Yapar, Gitta Kutyniok, Giuseppe Caire |
Abstract | In this paper we propose a highly efficient and very accurate method for estimating the propagation pathloss from a point x to all points y on the 2D plane. Our method, termed RadioUNet, is a deep neural network. For applications such as user-cell site association and device-to-device (D2D) link scheduling, an accurate knowledge of the pathloss function for all pairs of locations is very important. Commonly used statistical models approximate the pathloss as a decaying function of the distance between the points. However, in realistic propagation environments characterized by the presence of buildings, street canyons, and objects at different heights, such radial-symmetric functions yield very misleading results. In this paper we show that properly designed and trained deep neural networks are able to learn how to estimate the pathloss function, given an urban environment, very accurately and extremely quickly. Our proposed method generates pathloss estimations that are very close to estimations given by physical simulation, but much faster. Moreover, experimental results show that our method significantly outperforms previously proposed methods based on radial basis function interpolation and tensor completion. |
Tasks | |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.09002v1 |
https://arxiv.org/pdf/1911.09002v1.pdf | |
PWC | https://paperswithcode.com/paper/radiounet-fast-radio-map-estimation-with |
Repo | |
Framework | |
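A minimal UNet-style sketch for the map-to-map regression the abstract describes (urban layout and transmitter location in, pathloss map out). The depth, channel counts, and two-channel input encoding are assumptions for illustration, not RadioUNet's actual architecture:

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, in_ch=2):
        super().__init__()
        self.enc1, self.enc2 = block(in_ch, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)           # one pathloss value per pixel
    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d)

# Assumed input: channel 0 a building mask, channel 1 a transmitter-location map.
pathloss_map = TinyUNet()(torch.randn(1, 2, 256, 256))    # -> (1, 1, 256, 256)
```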
Controlled Forgetting: Targeted Stimulation and Dopaminergic Plasticity Modulation for Unsupervised Lifelong Learning in Spiking Neural Networks
Title | Controlled Forgetting: Targeted Stimulation and Dopaminergic Plasticity Modulation for Unsupervised Lifelong Learning in Spiking Neural Networks |
Authors | Jason M. Allred, Kaushik Roy |
Abstract | Stochastic gradient descent requires that training samples be drawn from a uniformly random distribution of the data. For a deployed system that must learn online from an uncontrolled and unknown environment, the ordering of input samples often fails to meet this criterion, making lifelong learning a difficult challenge. We exploit the locality of the unsupervised Spike Timing Dependent Plasticity (STDP) learning rule to target local representations in a Spiking Neural Network (SNN) to adapt to novel information while protecting essential information in the remainder of the SNN from catastrophic forgetting. In our Controlled Forgetting Networks (CFNs), novel information triggers stimulated firing and heterogeneously modulated plasticity, inspired by biological dopamine signals, to cause rapid and isolated adaptation in the synapses of neurons associated with outlier information. This targeting controls the forgetting process in a way that reduces the degradation of accuracy for older tasks while learning new tasks. Our experimental results on the MNIST dataset validate the capability of CFNs to learn successfully over time from an unknown, changing environment, achieving 95.36% accuracy, which we believe is the best unsupervised accuracy ever achieved by a fixed-size, single-layer SNN on a completely disjoint MNIST dataset. |
Tasks | |
Published | 2019-02-08 |
URL | https://arxiv.org/abs/1902.03187v2 |
https://arxiv.org/pdf/1902.03187v2.pdf | |
PWC | https://paperswithcode.com/paper/stimulating-stdp-to-exploit-locality-for |
Repo | |
Framework | |
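A toy numpy sketch of the plasticity mechanism the abstract describes: a pair-based STDP update whose magnitude is scaled per-neuron by a dopamine-like signal, so neurons stimulated by novel (outlier) inputs adapt quickly while the rest of the network is protected from forgetting. All constants and shapes are illustrative:

```python
import numpy as np

def stdp_update(w, pre_trace, post_trace, pre_spikes, post_spikes,
                dopamine, a_plus=0.01, a_minus=0.012):
    """dopamine: per-neuron modulation, near 1 for novelty-stimulated neurons
    and near 0 for neurons encoding consolidated (older) information."""
    potentiation = a_plus * np.outer(post_spikes, pre_trace)   # pre fired before post
    depression = a_minus * np.outer(post_trace, pre_spikes)    # post fired before pre
    return w + dopamine[:, None] * (potentiation - depression)

w = np.random.rand(10, 100) * 0.1                  # 100 inputs -> 10 neurons
pre_trace, post_trace = np.random.rand(100), np.random.rand(10)
pre_spikes = (np.random.rand(100) < 0.05).astype(float)
post_spikes = (np.random.rand(10) < 0.05).astype(float)
dopamine = np.zeros(10); dopamine[3] = 1.0         # only the outlier-driven neuron adapts
w = stdp_update(w, pre_trace, post_trace, pre_spikes, post_spikes, dopamine)
```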
Quantum Compressed Sensing with Unsupervised Tensor-Network Machine Learning
Title | Quantum Compressed Sensing with Unsupervised Tensor-Network Machine Learning |
Authors | Shi-Ju Ran, Zheng-Zhi Sun, Shao-Ming Fei, Gang Su, Maciej Lewenstein |
Abstract | We propose tensor-network compressed sensing (TNCS) by combining the ideas of compressed sensing, tensor networks (TN), and machine learning, which permits novel and efficient quantum communication of realistic data. The strategy is to use an unsupervised TN machine learning algorithm to obtain the entangled state $|\Psi\rangle$ that describes the probability distribution of a huge amount of classical information considered to be communicated. To transfer a specific piece of information with $|\Psi\rangle$, our proposal is to encode such information in the separable state with the minimal distance to the measured state $|\Phi\rangle$ that is obtained by partially measuring $|\Psi\rangle$ in a designed way. To this end, a measuring protocol analogous to compressed sensing with neural-network machine learning is suggested, where the measurements are designed to minimize the uncertainty of information from the probability distribution given by $|\Phi\rangle$. In this way, those who have $|\Phi\rangle$ can reliably access the information by simply measuring $|\Phi\rangle$. We propose q-sparsity to characterize the sparsity of quantum states and the efficiency of quantum communication by TNCS. The high q-sparsity is essentially due to the fact that the TN states that nicely describe the probability distribution obey the area law of entanglement entropy. Testing on realistic datasets (hand-written digits and fashion images), TNCS is shown to possess high efficiency and accuracy, where the security of communication is guaranteed by the fundamental quantum principles. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10290v2 |
https://arxiv.org/pdf/1907.10290v2.pdf | |
PWC | https://paperswithcode.com/paper/quantum-compressed-sensing-with-unsupervised |
Repo | |
Framework | |
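A minimal numpy sketch of the tensor-network ingredient TNCS builds on: a matrix product state (MPS) that assigns a Born-rule probability $p(s) = |\langle s|\Psi\rangle|^2$ to a classical bit string. The bond dimension and random tensors are illustrative; the paper additionally trains the MPS on data and designs the measurement protocol, neither of which is shown:

```python
import numpy as np

rng = np.random.default_rng(1)
n_sites, chi = 8, 4
# One rank-3 tensor per site: (left bond, physical index in {0, 1}, right bond).
mps = [rng.normal(size=(1 if i == 0 else chi, 2,
                        1 if i == n_sites - 1 else chi)) for i in range(n_sites)]

def amplitude(bits):
    """Contract the MPS along a fixed bit string to obtain <s|Psi>."""
    v = mps[0][:, bits[0], :]
    for i in range(1, n_sites):
        v = v @ mps[i][:, bits[i], :]
    return v.item()

# Born-rule weight of one configuration (unnormalized for random tensors).
p = amplitude([0, 1, 1, 0, 0, 1, 0, 1]) ** 2
```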
Solving zero-sum extensive-form games with arbitrary payoff uncertainty models
Title | Solving zero-sum extensive-form games with arbitrary payoff uncertainty models |
Authors | Juan Leni, John Levine, John Quigley |
Abstract | Modeling strategic conflict from a game-theoretical perspective involves dealing with epistemic uncertainty. Payoff uncertainty models are typically restricted to simple probability models due to computational restrictions. Recent breakthroughs in Artificial Intelligence (AI) research applied to Poker have resulted in novel approximation approaches, such as counterfactual regret minimization, that can successfully deal with large-scale imperfect-information games. Drawing from these ideas, this work addresses the problem of arbitrary continuous payoff distributions. We propose a method, Harsanyi-Counterfactual Regret Minimization, to solve two-player zero-sum extensive-form games with arbitrary payoff distribution models. Given a game $\Gamma$, using a Harsanyi transformation we generate a new game $\Gamma^\#$ to which we later apply Counterfactual Regret Minimization to obtain $\varepsilon$-Nash equilibria. We include numerical experiments showing how the method can be applied to a previously published problem. |
Tasks | |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1905.03850v1 |
http://arxiv.org/pdf/1905.03850v1.pdf | |
PWC | https://paperswithcode.com/paper/190503850 |
Repo | |
Framework | |
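A minimal sketch of the regret-matching rule at the core of Counterfactual Regret Minimization, shown for a single information set with stand-in payoffs. Full CFR applies this update recursively over the (here, Harsanyi-transformed) game tree, which is omitted:

```python
import numpy as np

def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(len(cum_regret), 1 / len(cum_regret))

cum_regret = np.zeros(3)                            # three actions at this infoset
for t in range(1000):
    sigma = regret_matching(cum_regret)
    action_values = np.random.default_rng(t).normal(size=3)   # stand-in counterfactual values
    cum_regret += action_values - sigma @ action_values       # instantaneous regret

# In two-player zero-sum games, the *average* strategy over iterations
# converges to an epsilon-Nash equilibrium.
```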
Recurrent Feedback Improves Feedforward Representations in Deep Neural Networks
Title | Recurrent Feedback Improves Feedforward Representations in Deep Neural Networks |
Authors | Siming Yan, Xuyang Fang, Bowen Xiao, Harold Rockwell, Yimeng Zhang, Tai Sing Lee |
Abstract | The abundant recurrent horizontal and feedback connections in the primate visual cortex are thought to play an important role in bringing global and semantic contextual information to early visual areas during perceptual inference, helping to resolve local ambiguity and fill in missing details. In this study, we find that introducing feedback loops and horizontal recurrent connections to a deep convolutional neural network (VGG16) allows the network to become more robust against noise and occlusion during inference, even in the initial feedforward pass. This suggests that recurrent feedback and contextual modulation transform the feedforward representations of the network in a meaningful and interesting way. We study the population codes of neurons in the network, before and after learning with feedback, and find that learning with feedback yields an increase in discriminability (measured by d-prime) between the different object classes in the population codes of the neurons in the feedforward path, even at the earliest layer that receives feedback. We find that recurrent feedback, by injecting top-down semantic meaning into the population activities, helps the network learn better feedforward paths that robustly map noisy image patches to the latent representations corresponding to important visual concepts of each object class, resulting in greater robustness of the network against noise and occlusion as well as better fine-grained recognition. |
Tasks | |
Published | 2019-12-22 |
URL | https://arxiv.org/abs/1912.10489v1 |
https://arxiv.org/pdf/1912.10489v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-feedback-improves-feedforward |
Repo | |
Framework | |
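A small numpy sketch of the discriminability measure the study reports: per-neuron d-prime between two classes' population codes, i.e. mean separation over pooled within-class spread. The Gaussian toy data is illustrative:

```python
import numpy as np

def d_prime(pop_a, pop_b):
    """pop_a, pop_b: (samples, neurons) activations for two object classes."""
    mu_a, mu_b = pop_a.mean(axis=0), pop_b.mean(axis=0)
    pooled_var = 0.5 * (pop_a.var(axis=0) + pop_b.var(axis=0))
    return np.abs(mu_a - mu_b) / np.sqrt(pooled_var + 1e-12)   # per-neuron d'

rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(500, 64))
class_b = rng.normal(0.5, 1.0, size=(500, 64))
print(d_prime(class_a, class_b).mean())   # ~0.5 for this toy separation
```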
Extracting UMLS Concepts from Medical Text Using General and Domain-Specific Deep Learning Models
Title | Extracting UMLS Concepts from Medical Text Using General and Domain-Specific Deep Learning Models |
Authors | Kathleen C. Fraser, Isar Nejadgholi, Berry De Bruijn, Muqun Li, Astha LaPlante, Khaldoun Zine El Abidine |
Abstract | Entity recognition is a critical first step to a number of clinical NLP applications, such as entity linking and relation extraction. We present the first attempt to apply state-of-the-art entity recognition approaches on a newly released dataset, MedMentions. This dataset contains over 4000 biomedical abstracts, annotated for UMLS semantic types. In comparison to existing datasets, MedMentions contains a far greater number of entity types, and thus represents a more challenging but realistic scenario in a real-world setting. We explore a number of relevant dimensions, including the use of contextual versus non-contextual word embeddings, general versus domain-specific unsupervised pre-training, and different deep learning architectures. We contrast our results against the well-known i2b2 2010 entity recognition dataset, and propose a new method to combine general and domain-specific information. While producing a state-of-the-art result for the i2b2 2010 task (F1 = 0.90), our results on MedMentions are significantly lower (F1 = 0.63), suggesting there is still plenty of opportunity for improvement on this new data. |
Tasks | Entity Linking, Relation Extraction, Word Embeddings |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01274v1 |
https://arxiv.org/pdf/1910.01274v1.pdf | |
PWC | https://paperswithcode.com/paper/extracting-umls-concepts-from-medical-text |
Repo | |
Framework | |
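One straightforward way to combine general and domain-specific information is sketched below: concatenating a general and a biomedical embedding per token before the tagger. The vocabularies, dimensions, and this particular combination are illustrative assumptions, not necessarily the method the paper proposes:

```python
import numpy as np

rng = np.random.default_rng(0)
general = {"aspirin": rng.random(300), "dose": rng.random(300)}   # e.g. general-corpus vectors
domain = {"aspirin": rng.random(200), "dose": rng.random(200)}    # e.g. biomedical vectors

def embed(token):
    g = general.get(token, np.zeros(300))
    d = domain.get(token, np.zeros(200))
    return np.concatenate([g, d])          # 500-d input for an entity tagger

sentence = np.stack([embed(t) for t in ["aspirin", "dose"]])      # (2, 500)
```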
Negative Sampling in Variational Autoencoders
Title | Negative Sampling in Variational Autoencoders |
Authors | Adrián Csiszárik, Beatrix Benkő, Dániel Varga |
Abstract | We propose negative sampling as an approach to improve the notoriously bad out-of-distribution likelihood estimates of Variational Autoencoder models. Our model pushes latent images of negative samples away from the prior. When the source of negative samples is an auxiliary dataset, such a model can vastly improve on baselines when evaluated on OOD detection tasks. Perhaps more surprisingly, we present a fully unsupervised version of employing negative sampling in VAEs: when the generator is trained in an adversarial manner, using the generator’s own outputs as negative samples can also significantly improve the robustness of OOD likelihood estimates. |
Tasks | |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02760v2 |
https://arxiv.org/pdf/1910.02760v2.pdf | |
PWC | https://paperswithcode.com/paper/negative-sampling-in-variational-autoencoders-1 |
Repo | |
Framework | |
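A hedged PyTorch sketch of the objective the abstract suggests: the usual ELBO terms on in-distribution data plus a hinge that pushes the approximate posterior of negative samples away from the prior. The margin, weighting, and diagonal-Gaussian posterior form are illustrative, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def kl_to_standard_normal(mu, logvar):
    """KL(q(z|x) || N(0, I)) per example, for a diagonal-Gaussian posterior."""
    return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1, dim=1)

def vae_negative_sampling_loss(recon, x, mu, logvar,
                               mu_neg, logvar_neg, margin=5.0, lam=1.0):
    elbo = F.mse_loss(recon, x, reduction="sum") / x.size(0) \
        + kl_to_standard_normal(mu, logvar).mean()
    # Hinge: negatives should sit at least `margin` nats of KL away from the prior.
    push_away = F.relu(margin - kl_to_standard_normal(mu_neg, logvar_neg)).mean()
    return elbo + lam * push_away
```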
Progressive Explanation Generation for Human-robot Teaming
Title | Progressive Explanation Generation for Human-robot Teaming |
Authors | Yu Zhang, Mehrdad Zakershahrak |
Abstract | Generating explanations for its behavior is an essential capability for a robotic teammate. Explanations help human partners better understand the situation and maintain trust in their teammates. Prior work on robots generating explanations focuses on providing the reasoning behind their decision making. These approaches, however, fail to heed the cognitive requirements of understanding an explanation. In other words, while they provide the right explanations from the explainer’s perspective, the explainee’s part of the equation is ignored. In this work, we address an important aspect along this direction that contributes to a better understanding of a given explanation, which we refer to as the progressiveness of explanations. A progressive explanation improves understanding by limiting the cognitive effort required at each step of making the explanation. As a result, such explanations are expected to be smoother and hence easier to understand. A general formulation of progressive explanation is presented. Algorithms are provided based on several alternative quantifications of cognitive effort as an explanation is being made, which are evaluated in a standard planning competition domain. |
Tasks | Decision Making |
Published | 2019-02-02 |
URL | http://arxiv.org/abs/1902.00604v1 |
http://arxiv.org/pdf/1902.00604v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-explanation-generation-for-human |
Repo | |
Framework | |
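A toy sketch of one way to operationalize progressiveness: greedily order explanation units so that the cognitive effort added at each step is minimal. The effort function here (number of newly introduced concepts) is a stand-in for the paper's alternative quantifications:

```python
def progressive_order(units, effort):
    """units: explanation pieces; effort(piece, given) -> cost of the next step."""
    remaining, given, order = list(units), set(), []
    while remaining:
        nxt = min(remaining, key=lambda u: effort(u, given))  # cheapest next step
        order.append(nxt)
        remaining.remove(nxt)
        given.update(nxt)
    return order

units = [{"goal", "cost"}, {"cost"}, {"goal", "cost", "precondition"}]
effort = lambda u, given: len(u - given)   # concepts newly introduced at this step
print(progressive_order(units, effort))    # smallest-increment ordering
```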
Hierarchical method for cataract grading based on retinal images using improved Haar wavelet
Title | Hierarchical method for cataract grading based on retinal images using improved Haar wavelet |
Authors | Lvchen Cao, Huiqi Li, Yanjun Zhang, Liang Xu, Li Zhang |
Abstract | Cataracts, which are lenticular opacities that may occur at different lens locations, are the leading cause of visual impairment worldwide. Accurate and timely diagnosis can improve the quality of life of cataract patients. In this paper, a feature extraction-based method for grading cataract severity using retinal images is proposed. To obtain more appropriate features for the automatic grading, the Haar wavelet is improved according to the characteristics of retinal images. Retinal images of non-cataract, as well as mild, moderate, and severe cataracts, are automatically recognized using the improved Haar wavelet. A hierarchical strategy is used to transform the four-class classification problem into three adjacent two-class classification problems. Three sets of two-class classifiers based on a neural network are trained individually and integrated together to establish a complete classification system. The accuracies of the two-class classification (cataract and non-cataract) and four-class classification are 94.83% and 85.98%, respectively. The performance analysis demonstrates that the improved Haar wavelet feature achieves higher accuracy than the original Haar wavelet feature, and the fusion of three sets of two-class classifiers is superior to a simple four-class classifier. The discussion indicates that the retinal image-based method offers significant potential for cataract detection. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01261v1 |
http://arxiv.org/pdf/1904.01261v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-method-for-cataract-grading |
Repo | |
Framework | |
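A minimal sketch of the hierarchical strategy: three adjacent two-class decisions (non-cataract vs. cataract, mild vs. worse, moderate vs. severe) chained into a four-class grade. The classifier callables are placeholders for the paper's trained wavelet-feature neural networks:

```python
def grade(features, clf_cataract, clf_mild, clf_moderate):
    """Each clf_* returns True for the 'more severe' side of its split."""
    if not clf_cataract(features):
        return "non-cataract"
    if not clf_mild(features):
        return "mild"
    return "severe" if clf_moderate(features) else "moderate"

# Stub classifiers standing in for the trained networks.
print(grade([0.7], lambda f: True, lambda f: True, lambda f: False))  # "moderate"
```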
vireoJD-MM at Activity Detection in Extended Videos
Title | vireoJD-MM at Activity Detection in Extended Videos |
Authors | Fuchen Long, Qi Cai, Zhaofan Qiu, Zhijian Hou, Yingwei Pan, Ting Yao, Chong-Wah Ngo |
Abstract | This notebook paper presents an overview and comparative analysis of our system designed for activity detection in extended videos (ActEV-PC) in the ActivityNet Challenge 2019. Specifically, we exploit person/vehicle detections at the spatial level and action localization at the temporal level for action detection in surveillance videos. The mechanisms of different tubelet generation and model decomposition methods are studied as well. The detection results are finally predicted by late fusing the results from each component. |
Tasks | Action Detection, Action Localization, Activity Detection |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08547v1 |
https://arxiv.org/pdf/1906.08547v1.pdf | |
PWC | https://paperswithcode.com/paper/vireojd-mm-at-activity-detection-in-extended |
Repo | |
Framework | |
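A minimal numpy sketch of the final late-fusion step: each component produces per-activity scores, and the prediction combines them with (here, assumed equal) weights:

```python
import numpy as np

component_scores = {
    "tubelet_a": np.array([0.8, 0.1, 0.3]),   # scores over three activity classes
    "tubelet_b": np.array([0.6, 0.3, 0.2]),
}
weights = {"tubelet_a": 0.5, "tubelet_b": 0.5}    # illustrative equal weighting

fused = sum(weights[k] * s for k, s in component_scores.items())
prediction = int(np.argmax(fused))                # fused class decision
```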
Enhancing Sound Texture in CNN-Based Acoustic Scene Classification
Title | Enhancing Sound Texture in CNN-Based Acoustic Scene Classification |
Authors | Yuzhong Wu, Tan Lee |
Abstract | Acoustic scene classification is the task of identifying the scene from which an audio signal is recorded. Convolutional neural network (CNN) models are widely adopted, with proven successes, in acoustic scene classification. However, there is little insight into how an audio scene is perceived by a CNN, in contrast to what has been demonstrated in image recognition research. In the present study, Class Activation Mapping (CAM) is utilized to analyze how the log-magnitude Mel-scale filter-bank (log-Mel) features of different acoustic scenes are learned by a CNN classifier. It is noted that distinct high-energy time-frequency components of audio signals generally do not correspond to strong activation on the CAM, while the background sound texture is well learned by the CNN. In order to make the sound texture more salient, we propose to apply the Difference of Gaussians (DoG) and the Sobel operator to process the log-Mel features and enhance edge information of the time-frequency image. Experimental results on the DCASE 2017 ASC challenge show that using edge-enhanced log-Mel images as the input feature of a CNN significantly improves the performance of audio scene classification. |
Tasks | Acoustic Scene Classification, Scene Classification |
Published | 2019-01-06 |
URL | http://arxiv.org/abs/1901.01502v1 |
http://arxiv.org/pdf/1901.01502v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-sound-texture-in-cnn-based-acoustic |
Repo | |
Framework | |
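A runnable sketch of the proposed edge enhancement applied to a log-Mel image: a Difference of Gaussians band-pass plus Sobel gradient magnitude, via scipy.ndimage. The sigmas, the channel stacking, and the random stand-in spectrogram are illustrative choices:

```python
import numpy as np
from scipy import ndimage

log_mel = np.random.rand(40, 500)          # (mel bands, frames) stand-in spectrogram

dog = ndimage.gaussian_filter(log_mel, sigma=1.0) \
    - ndimage.gaussian_filter(log_mel, sigma=2.0)       # band-pass: sound texture
sobel = np.hypot(ndimage.sobel(log_mel, axis=0),
                 ndimage.sobel(log_mel, axis=1))        # edge magnitude

cnn_input = np.stack([log_mel, dog, sobel])             # (3, 40, 500) CNN input
```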