Paper Group ANR 389
Language Modeling with Generative Adversarial Networks. Fast convergence rates of deep neural networks for classification. Context-Aware Policy Reuse. Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection. Discovering state-parameter mappings in subsurface models using generative adversarial networks. State of the Ar …
Language Modeling with Generative Adversarial Networks
Title | Language Modeling with Generative Adversarial Networks |
Authors | Mehrad Moradshahi, Utkarsh Contractor |
Abstract | Generative Adversarial Networks (GANs) have been promising in the field of image generation; however, they have been hard to train for language generation. GANs were originally designed to output differentiable values, so discrete language generation is challenging for them, which causes high levels of instability in training. Consequently, past work has resorted either to pre-training with maximum likelihood or to training GANs without pre-training using a WGAN objective with a gradient penalty. In this study, we present a comparison of those approaches. Furthermore, we present the results of experiments that indicate better training and convergence of Wasserstein GANs (WGANs) when a weaker regularization term is used to enforce the Lipschitz constraint. |
Tasks | Image Generation, Language Modelling, Text Generation |
Published | 2018-04-08 |
URL | http://arxiv.org/abs/1804.02617v1 |
http://arxiv.org/pdf/1804.02617v1.pdf | |
PWC | https://paperswithcode.com/paper/language-modeling-with-generative |
Repo | |
Framework | |
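The entry above compares maximum-likelihood pre-training against WGAN training with a gradient penalty. For reference, here is a minimal PyTorch sketch of the standard WGAN-GP penalty term; the weaker regularization variant the authors study is not reproduced, and `critic` and `lambda_gp` are illustrative placeholders.

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """Standard WGAN-GP term: push the critic's gradient norm toward 1
    on random interpolates between real and generated samples."""
    batch_size = real.size(0)
    # one interpolation coefficient per sample, broadcast over remaining dims
    eps = torch.rand(batch_size, *([1] * (real.dim() - 1)), device=real.device)
    interpolates = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(interpolates)
    grads = torch.autograd.grad(
        outputs=scores, inputs=interpolates,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grad_norm = grads.view(batch_size, -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```

In a typical WGAN training loop this term is added to the critic loss, e.g. `critic(fake).mean() - critic(real).mean() + gradient_penalty(critic, real, fake)`.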
Fast convergence rates of deep neural networks for classification
Title | Fast convergence rates of deep neural networks for classification |
Authors | Yongdai Kim, Ilsang Ohn, Dongha Kim |
Abstract | We derive fast convergence rates of a deep neural network (DNN) classifier with the rectified linear unit (ReLU) activation function learned using the hinge loss. We consider three cases for the true model: (1) a smooth decision boundary, (2) a smooth conditional class probability, and (3) the margin condition (i.e., the probability of inputs near the decision boundary is small). We show that the DNN classifier learned using the hinge loss achieves fast convergence rates for all three cases provided that the architecture (i.e., the number of layers, the number of nodes, and the sparsity) is carefully selected. An important implication is that DNN architectures are very flexible for use in various cases without much modification. In addition, we consider a DNN classifier learned by minimizing the cross-entropy, and show that it achieves a fast convergence rate under the condition that the conditional class probabilities of most data are sufficiently close to either one or zero. This assumption is not unusual for image recognition because human beings are extremely good at recognizing most images. To confirm our theoretical explanation, we present the results of a small numerical study conducted to compare the hinge loss and cross-entropy. |
Tasks | |
Published | 2018-12-10 |
URL | https://arxiv.org/abs/1812.03599v2 |
https://arxiv.org/pdf/1812.03599v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-convergence-rates-of-deep-neural |
Repo | |
Framework | |
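Since the abstract contrasts training with the hinge loss and with cross-entropy, a small, self-contained PyTorch sketch of the two objectives on a toy binary task may help; the architecture and hyperparameters below are illustrative only and do not follow the paper's theoretical prescriptions (layer counts, sparsity).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Small ReLU network with a scalar output f(x); labels y take values in {-1, +1}.
net = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

def hinge_loss(scores, y):
    # mean of max(0, 1 - y * f(x))
    return torch.clamp(1.0 - y * scores.squeeze(-1), min=0.0).mean()

def cross_entropy_loss(scores, y):
    # logistic (binary cross-entropy) loss on the same scores; labels mapped to {0, 1}
    return F.binary_cross_entropy_with_logits(scores.squeeze(-1), (y + 1.0) / 2.0)

# toy data with a smooth decision boundary
x = torch.randn(256, 20)
y = torch.where(x[:, 0] > 0, torch.tensor(1.0), torch.tensor(-1.0))

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = hinge_loss(net(x), y)   # swap in cross_entropy_loss to compare the two objectives
    loss.backward()
    opt.step()
```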
Context-Aware Policy Reuse
Title | Context-Aware Policy Reuse |
Authors | Siyuan Li, Fangda Gu, Guangxiang Zhu, Chongjie Zhang |
Abstract | Transfer learning can greatly speed up reinforcement learning for a new task by leveraging policies of relevant tasks. Existing work on policy reuse either focuses only on selecting a single best source policy for transfer, without considering contexts, or cannot guarantee learning an optimal policy for the target task. To improve transfer efficiency and guarantee optimality, we develop a novel policy reuse method, called Context-Aware Policy reuSe (CAPS), that enables multi-policy transfer. Our method learns when and which source policy is best for reuse, as well as when to terminate its reuse. CAPS provides theoretical guarantees of convergence and optimality for both source policy selection and target task learning. Empirical results on a grid-based navigation domain and the Pygame Learning Environment demonstrate that CAPS significantly outperforms other state-of-the-art policy reuse methods. |
Tasks | Transfer Learning |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.03793v4 |
http://arxiv.org/pdf/1806.03793v4.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-policy-reuse |
Repo | |
Framework | |
Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection
Title | Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection |
Authors | Hongyu Xu, Xutao Lv, Xiaoyu Wang, Zhou Ren, Navaneeth Bodla, Rama Chellappa |
Abstract | In this paper, we propose a novel object detection algorithm named “Deep Regionlets” by integrating deep neural networks and a conventional detection schema for accurate generic object detection. Motivated by the effectiveness of regionlets for modeling object deformations and multiple aspect ratios, we incorporate regionlets into an end-to-end trainable deep learning framework. The deep regionlets framework consists of a region selection network and a deep regionlet learning module. Specifically, given a detection bounding box proposal, the region selection network provides guidance on where to select sub-regions from which features can be learned. An object proposal typically contains 3-16 sub-regions. The regionlet learning module focuses on local feature selection and transformations to alleviate the effects of appearance variations. To this end, we first realize non-rectangular region selection within the detection framework to accommodate variations in object appearance. Moreover, we design a “gating network” within the regionlet learning module to enable instance-dependent soft feature selection and pooling. The Deep Regionlets framework is trained end-to-end without additional effort. We present ablation studies and extensive experiments on the PASCAL VOC dataset and the Microsoft COCO dataset. The proposed method yields competitive performance over state-of-the-art algorithms, such as RetinaNet and Mask R-CNN, even without additional segmentation labels. |
Tasks | Feature Selection, Object Detection |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1811.11318v2 |
https://arxiv.org/pdf/1811.11318v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-regionlets-blended-representation-and |
Repo | |
Framework | |
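The “gating network” mentioned in the abstract performs instance-dependent soft feature selection over pooled regionlet features. The module below is a hypothetical minimal version of such a gate (a per-sample sigmoid mask), not the authors' exact design; the feature dimension is a placeholder.

```python
import torch
import torch.nn as nn

class SoftFeatureGate(nn.Module):
    """Instance-dependent soft feature selection: each sample produces its own
    sigmoid mask, which is applied elementwise to its feature vector."""
    def __init__(self, feat_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),
            nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return feats * self.gate(feats)   # soft, fully differentiable selection

# toy usage: 8 region proposals, each with a 256-d pooled regionlet feature (placeholder size)
gate = SoftFeatureGate(feat_dim=256)
selected = gate(torch.randn(8, 256))
```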
Discovering state-parameter mappings in subsurface models using generative adversarial networks
Title | Discovering state-parameter mappings in subsurface models using generative adversarial networks |
Authors | Alexander Y. Sun |
Abstract | A fundamental problem in geophysical modeling is related to the identification and approximation of causal structures among physical processes. However, resolving the bidirectional mappings between physical parameters and model state variables (i.e., solving the forward and inverse problems) is challenging, especially when parameter dimensionality is high. Deep learning has opened a new door toward knowledge representation and complex pattern identification. In particular, the recently introduced generative adversarial networks (GANs) hold strong promise for learning cross-domain mappings for image translation. This study presents a state-parameter identification GAN (SPID-GAN) for simultaneously learning bidirectional mappings between a high-dimensional parameter space and the corresponding model state space. SPID-GAN is demonstrated using a series of representative problems from subsurface flow modeling. Results show that SPID-GAN achieves satisfactory performance in identifying the bidirectional state-parameter mappings, providing a new deep-learning-based knowledge representation paradigm for a wide array of complex geophysical problems. |
Tasks | |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12856v1 |
http://arxiv.org/pdf/1810.12856v1.pdf | |
PWC | https://paperswithcode.com/paper/discovering-state-parameter-mappings-in |
Repo | |
Framework | |
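SPID-GAN learns bidirectional mappings between a parameter space and a state space. As a rough illustration only, the sketch below pairs two generators with a CycleGAN-style cycle-consistency term; this is an assumption on my part, not the authors' SPID-GAN architecture, and all dimensions are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(in_dim, out_dim, hidden=256):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

P_DIM, S_DIM = 100, 50      # placeholder parameter / state dimensions
G_ps = mlp(P_DIM, S_DIM)    # forward map: parameters -> states
G_sp = mlp(S_DIM, P_DIM)    # inverse map: states -> parameters

def cycle_consistency(params, states):
    """Mapping forward and then back should recover the input in each domain.
    In a full GAN setup this term is added to adversarial losses (one critic per domain)."""
    return (F.l1_loss(G_sp(G_ps(params)), params) +
            F.l1_loss(G_ps(G_sp(states)), states))

# toy usage on random batches
loss = cycle_consistency(torch.randn(16, P_DIM), torch.randn(16, S_DIM))
```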
State of the Art in Fair ML: From Moral Philosophy and Legislation to Fair Classifiers
Title | State of the Art in Fair ML: From Moral Philosophy and Legislation to Fair Classifiers |
Authors | Elias Baumann, Josef Lorenz Rumberger |
Abstract | Machine learning is becoming an ever-present part of our lives, as many decisions, e.g. whether to grant credit, are no longer made by humans but by machine learning algorithms. However, those decisions are often unfair and discriminate against individuals belonging to protected groups based on race or gender. With the recent General Data Protection Regulation (GDPR) coming into effect, new awareness has been raised for such issues, and with computer scientists having such a large impact on people's lives, it is necessary that actions are taken to discover and prevent discrimination. This work aims to give an introduction to discrimination, the legislative foundations to counter it, and strategies to detect and prevent machine learning algorithms from showing such behavior. |
Tasks | |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.09539v1 |
http://arxiv.org/pdf/1811.09539v1.pdf | |
PWC | https://paperswithcode.com/paper/state-of-the-art-in-fair-ml-from-moral |
Repo | |
Framework | |
The emergent algebraic structure of RNNs and embeddings in NLP
Title | The emergent algebraic structure of RNNs and embeddings in NLP |
Authors | Sean A. Cantrell |
Abstract | We examine the algebraic and geometric properties of a uni-directional GRU and word embeddings trained end-to-end on a text classification task. A hyperparameter search over word embedding dimension, GRU hidden dimension, and a linear combination of the GRU outputs is performed. We conclude that words naturally embed themselves in a Lie group and that RNNs form a nonlinear representation of the group. Appealing to these results, we propose a novel class of recurrent-like neural networks and a word embedding scheme. |
Tasks | Text Classification, Word Embeddings |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02839v1 |
http://arxiv.org/pdf/1803.02839v1.pdf | |
PWC | https://paperswithcode.com/paper/the-emergent-algebraic-structure-of-rnns-and |
Repo | |
Framework | |
A Primer on Causal Analysis
Title | A Primer on Causal Analysis |
Authors | Finnian Lattimore, Cheng Soon Ong |
Abstract | We provide a conceptual map to navigate causal analysis problems. Focusing on the case of discrete random variables, we consider the problem of causal effect estimation from observational data. The presented approaches apply also to continuous variables, but the issue of estimation becomes more complex. We then introduce the four schools of thought for causal analysis. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01488v1 |
http://arxiv.org/pdf/1806.01488v1.pdf | |
PWC | https://paperswithcode.com/paper/a-primer-on-causal-analysis |
Repo | |
Framework | |
Machine Learning in Cyber-Security - Problems, Challenges and Data Sets
Title | Machine Learning in Cyber-Security - Problems, Challenges and Data Sets |
Authors | Idan Amit, John Matherly, William Hewlett, Zhi Xu, Yinnon Meshi, Yigal Weinberger |
Abstract | We present cyber-security problems of high importance. We show that in order to solve these cyber-security problems, one must cope with certain machine learning challenges. We provide novel data sets representing the problems, in order to enable the academic community to investigate them and suggest methods to cope with the challenges. We also present a method to generate labels via pivoting, providing a solution to the common problem of missing labels in cyber-security. |
Tasks | |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07858v3 |
http://arxiv.org/pdf/1812.07858v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-in-cyber-security-problems |
Repo | |
Framework | |
Kernel Flows: from learning kernels from data into the abyss
Title | Kernel Flows: from learning kernels from data into the abyss |
Authors | Houman Owhadi, Gene Ryan Yoo |
Abstract | Learning can be seen as approximating an unknown function by interpolating the training data. Kriging offers a solution to this problem based on the prior specification of a kernel. We explore a numerical approximation approach to kernel selection/construction based on the simple premise that a kernel must be good if the number of interpolation points can be halved without significant loss in accuracy (measured using the intrinsic RKHS norm $\|\cdot\|$ associated with the kernel). We first test and motivate this idea on a simple problem of recovering the Green’s function of an elliptic PDE (with inhomogeneous coefficients) from the sparse observation of one of its solutions. Next we consider the problem of learning non-parametric families of deep kernels of the form $K_1(F_n(x),F_n(x'))$ with $F_{n+1}=(I_d+\epsilon G_{n+1})\circ F_n$ and $G_{n+1} \in \operatorname{Span}\{K_1(F_n(x_i),\cdot)\}$. With the proposed approach, constructing the kernel becomes equivalent to integrating a stochastic, data-driven dynamical system, which allows for the training of very deep (bottomless) networks and the exploration of their properties. These networks learn by constructing flow maps in the kernel and input spaces via incremental data-dependent deformations/perturbations (appearing as the cooperative counterpart of adversarial examples) and, at profound depths, they (1) can achieve accurate classification from only one data point per class, (2) appear to learn archetypes of each class, and (3) expand distances between points that are in different classes and contract distances between points in the same class. For kernels parameterized by the weights of convolutional neural networks, minimizing the approximation errors incurred by halving random subsets of interpolation points appears to outperform training (the same CNN architecture) with relative entropy and dropout. |
Tasks | |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04475v2 |
http://arxiv.org/pdf/1808.04475v2.pdf | |
PWC | https://paperswithcode.com/paper/kernel-flows-from-learning-kernels-from-data |
Repo | |
Framework | |
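For kriging interpolants the halving criterion has a closed form: with $u$ interpolating all the data and $v$ interpolating a random half (subscript $c$), $v$ is the RKHS-orthogonal projection of $u$, so $\rho = \|u-v\|^2/\|u\|^2 = 1 - y_c^\top K_{cc}^{-1} y_c \,/\, (y^\top K^{-1} y)$. Below is a NumPy sketch of this loss under the assumption of a Gaussian kernel; the kernel choice, regularization, and toy data are placeholders, not the paper's setup.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_flow_rho(X, y, gamma, rng, reg=1e-8):
    """Relative RKHS-norm loss rho = ||u - v||^2 / ||u||^2, where u interpolates
    all the data and v interpolates a random half (kriging with the same kernel)."""
    n = len(y)
    half = rng.choice(n, size=n // 2, replace=False)
    K = gaussian_kernel(X, X, gamma) + reg * np.eye(n)
    Kc = K[np.ix_(half, half)]
    num = y[half] @ np.linalg.solve(Kc, y[half])   # ||v||^2
    den = y @ np.linalg.solve(K, y)                # ||u||^2
    return 1.0 - num / den

# toy usage: rho can be averaged over random halves and minimized over gamma
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] * X[:, 1])
print(kernel_flow_rho(X, y, gamma=1.0, rng=rng))
```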
Principal Component Analysis with Tensor Train Subspace
Title | Principal Component Analysis with Tensor Train Subspace |
Authors | Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron |
Abstract | The tensor train is a hierarchical tensor network structure that helps alleviate the curse of dimensionality by parameterizing large-scale multidimensional data via a network of low-rank tensors. Associated with such a construction is a notion of a Tensor Train subspace, and in this paper we propose a TT-PCA algorithm for estimating this structured subspace from the given data. By maintaining the low-rank tensor structure, TT-PCA is more robust to noise compared with PCA or Tucker-PCA. This is borne out numerically by testing the proposed approach on the Extended Yale Face Dataset B. |
Tasks | |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.05026v1 |
http://arxiv.org/pdf/1803.05026v1.pdf | |
PWC | https://paperswithcode.com/paper/principal-component-analysis-with-tensor |
Repo | |
Framework | |
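TT-PCA builds on the tensor train format referenced in the abstract. The NumPy sketch below implements generic TT-SVD (successive truncated SVDs) to show what a tensor-train factorization looks like; it is background material, not the authors' TT-PCA estimator, and the fixed maximum rank is a placeholder.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a d-way array into tensor-train cores of shape (r_{k-1}, n_k, r_k)
    via successive truncated SVDs (the standard TT-SVD scheme)."""
    dims = tensor.shape
    d = len(dims)
    cores, rank = [], 1
    C = np.asarray(tensor)
    for k in range(d - 1):
        C = C.reshape(rank * dims[k], -1)        # unfold: (r_{k-1} * n_k) x rest
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        new_rank = min(max_rank, len(S))
        cores.append(U[:, :new_rank].reshape(rank, dims[k], new_rank))
        C = S[:new_rank, None] * Vt[:new_rank, :]  # carry the remainder forward
        rank = new_rank
    cores.append(C.reshape(rank, dims[-1], 1))
    return cores

# toy usage: a 3-way array compressed to TT rank <= 5
cores = tt_svd(np.random.rand(10, 12, 8), max_rank=5)
print([c.shape for c in cores])   # [(1, 10, 5), (5, 12, 5), (5, 8, 1)]
```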
Robust Cross-View Gait Identification with Evidence: A Discriminant Gait GAN (DiGGAN) Approach on 10000 People
Title | Robust Cross-View Gait Identification with Evidence: A Discriminant Gait GAN (DiGGAN) Approach on 10000 People |
Authors | BingZhang Hu, Yan Gao, Yu Guan, Yang Long, Nicholas Lane, Thomas Ploetz |
Abstract | Gait is an important biometric trait for surveillance and forensic applications, as it can be used to identify individuals at a large distance through CCTV cameras. However, it is very difficult to develop robust automated gait recognition systems, since gait may be affected by many covariate factors such as clothing, walking surface, walking speed, and camera view angle. Of these, a large view angle is deemed the most challenging factor, since it may alter the overall gait appearance substantially. Recently, some deep learning approaches (such as CNNs) have been employed to extract view-invariant features and have achieved encouraging results on small datasets. However, they do not scale well to large datasets, and their performance decreases significantly with the number of subjects, which is impractical for large-scale surveillance applications. To address this issue, in this work we propose a Discriminant Gait Generative Adversarial Network (DiGGAN) framework, which not only can learn view-invariant gait features for cross-view gait recognition tasks, but also can be used to reconstruct the gait templates in all views, serving as important evidence for forensic applications. We evaluated our DiGGAN framework on the world’s largest multi-view OU-MVLP dataset (which includes more than 10000 subjects), and our method outperforms state-of-the-art algorithms significantly in various cross-view gait identification scenarios (e.g., cooperative/uncooperative mode). Our DiGGAN framework also achieves the best results on the popular CASIA-B dataset and shows great generalisation capability across different datasets. |
Tasks | Gait Identification, Gait Recognition |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10493v1 |
http://arxiv.org/pdf/1811.10493v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-cross-view-gait-identification-with |
Repo | |
Framework | |
CADDY Underwater Stereo-Vision Dataset for Human-Robot Interaction (HRI) in the Context of Diver Activities
Title | CADDY Underwater Stereo-Vision Dataset for Human-Robot Interaction (HRI) in the Context of Diver Activities |
Authors | Arturo Gomez Chavez, Andrea Ranieri, Davide Chiarella, Enrica Zereik, Anja Babić, Andreas Birk |
Abstract | In this article we present a novel underwater dataset collected from several field trials within the EU FP7 project “Cognitive autonomous diving buddy (CADDY)”, where an Autonomous Underwater Vehicle (AUV) was used to interact with divers and monitor their activities. To our knowledge, this is one of the first efforts to collect a large dataset in underwater environments targeting object classification, segmentation and human pose estimation tasks. The first part of the dataset contains stereo camera recordings (~10K) of divers performing hand gestures to communicate and interact with an AUV in different environmental conditions. These gesture samples serve to test the robustness of object detection and classification algorithms against underwater image distortions, i.e., color attenuation and light backscatter. The second part includes stereo footage (~12.7K) of divers free-swimming in front of the AUV, along with synchronized measurements from IMUs located throughout the diver’s suit (DiverNet), which serve as ground truth for human pose and tracking methods. In both cases, the rectified images allow investigation of 3D representation and reasoning pipelines on the low-texture targets commonly present in underwater scenarios. In this paper we describe our recording platform and sensor calibration procedure, as well as the data format and the utilities provided to use the dataset. |
Tasks | Calibration, Object Classification, Object Detection, Pose Estimation |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.04856v1 |
http://arxiv.org/pdf/1807.04856v1.pdf | |
PWC | https://paperswithcode.com/paper/caddy-underwater-stereo-vision-dataset-for |
Repo | |
Framework | |
Intrinsic Isometric Manifold Learning with Application to Localization
Title | Intrinsic Isometric Manifold Learning with Application to Localization |
Authors | Ariel Schwartz, Ronen Talmon |
Abstract | Data living on manifolds commonly appear in many applications. Often this results from an inherently latent low-dimensional system being observed through higher dimensional measurements. We show that under certain conditions, it is possible to construct an intrinsic and isometric data representation, which respects an underlying latent intrinsic geometry. Namely, we view the observed data only as a proxy and learn the structure of a latent unobserved intrinsic manifold, whereas common practice is to learn the manifold of the observed data. For this purpose, we build a new metric and propose a method for its robust estimation by assuming mild statistical priors and by using artificial neural networks as a mechanism for metric regularization and parametrization. We show successful application to unsupervised indoor localization in ad-hoc sensor networks. Specifically, we show that our proposed method facilitates accurate localization of a moving agent from imaging data it collects. Importantly, our method is applied in the same way to two different imaging modalities, thereby demonstrating its intrinsic and modality-invariant capabilities. |
Tasks | |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00556v2 |
http://arxiv.org/pdf/1806.00556v2.pdf | |
PWC | https://paperswithcode.com/paper/intrinsic-isometric-manifold-learning-with |
Repo | |
Framework | |
An Automatic Method for Complete Brain Matter Segmentation from Multislice CT scan
Title | An Automatic Method for Complete Brain Matter Segmentation from Multislice CT scan |
Authors | Soumi Ray, Vinod Kumar, Chirag Ahuja, Niranjan Khandelwal |
Abstract | Computed tomography (CT) imaging is well accepted for its imaging speed, image contrast and resolution, and cost, and it is therefore widely used in the detection and diagnosis of brain diseases. Unfortunately, however, relatively little work on CT segmentation has been reported. In this paper, a robust automatic segmentation system is presented that can segment complete brain matter from CT slices without any loss of information. The proposed method is simple, fast, accurate and completely automatic, and it can handle a multislice CT scan in a single run. From a given multislice CT dataset, one slice is selected automatically to form the masks for segmentation. Two types of masks are created to handle nasal slices in a better way. The masks are created from the selected reference slice using automatic seed-point selection and a region-growing technique. One mask is designed for brain matter and the other includes the skull of the reference slice. The second mask is used as a global reference mask for all slices, whereas the brain matter mask is applied only to adjacent slices and is continuously modified for better segmentation. The slices in a given dataset are divided into two batches, before and after the reference slice, and each batch is segmented separately. Successive propagation of the brain matter mask has demonstrated very high potential in the reported segmentation. The presented results show the highest sensitivity and more than 96% accuracy in all cases. The resulting segmented images can be used for brain disease diagnosis or further image analysis. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.06215v2 |
http://arxiv.org/pdf/1809.06215v2.pdf | |
PWC | https://paperswithcode.com/paper/an-automatic-method-for-complete-brain-matter |
Repo | |
Framework | |
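The segmentation pipeline above relies on automatic seed-point selection and region growing. A minimal intensity-based region-growing sketch in NumPy follows; the seed, tolerance, and synthetic test image are placeholders, and the authors' inter-slice mask-propagation scheme is not reproduced.

```python
import numpy as np
from collections import deque

def region_grow(slice_hu, seed, tol=100.0):
    """Grow a binary mask from `seed` (row, col), adding 4-connected pixels whose
    intensity stays within `tol` of the seed intensity. `slice_hu` is a 2-D CT slice."""
    h, w = slice_hu.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = float(slice_hu[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(slice_hu[nr, nc]) - seed_val) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask

# toy usage: grow from the centre of a synthetic "brain inside air" slice
img = np.pad(np.full((64, 64), 40.0), 32, constant_values=-1000.0)
brain_mask = region_grow(img, seed=(64, 64), tol=100.0)
```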