Paper Group ANR 320
Privacy-Preserving Adversarial Networks
Title | Privacy-Preserving Adversarial Networks |
Authors | Ardhendu Tripathy, Ye Wang, Prakash Ishwar |
Abstract | We propose a data-driven framework for optimizing privacy-preserving data release mechanisms to attain the information-theoretically optimal tradeoff between minimizing distortion of useful data and concealing specific sensitive information. Our approach employs adversarially-trained neural networks to implement randomized mechanisms and to perform a variational approximation of mutual information privacy. We validate our Privacy-Preserving Adversarial Networks (PPAN) framework via proof-of-concept experiments on discrete and continuous synthetic data, as well as the MNIST handwritten digits dataset. For synthetic data, our model-agnostic PPAN approach achieves tradeoff points very close to the optimal tradeoffs that are analytically-derived from model knowledge. In experiments with the MNIST data, we visually demonstrate a learned tradeoff between minimizing the pixel-level distortion versus concealing the written digit. |
Tasks | |
Published | 2017-12-19 |
URL | https://arxiv.org/abs/1712.07008v3 |
PDF | https://arxiv.org/pdf/1712.07008v3.pdf |
PWC | https://paperswithcode.com/paper/privacy-preserving-adversarial-networks |
Repo | |
Framework | |
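The optimal tradeoffs that PPAN is validated against can be computed exactly for simple models. Below is a minimal sketch of such a model-aware baseline, not of PPAN itself (which uses adversarially trained networks): a uniform binary secret S is observed through a bit X that flips with probability q, and the release mechanism flips X again with probability p, so distortion is P(Z ≠ X) = p and leakage is the mutual information I(S; Z). The values q = 0.1 and the grid of p are illustrative choices, not numbers from the paper.

```python
import math

def h2(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def leakage(q, p):
    """I(S; Z) in bits when S is uniform, X = S flipped w.p. q, and
    Z = X flipped w.p. p: the two flips compose into one flip w.p. r."""
    r = q * (1 - p) + p * (1 - q)
    return 1.0 - h2(r)

# Sweep the mechanism's flip probability to trace the privacy-distortion curve:
# distortion equals p, and leakage falls monotonically to zero at p = 0.5.
q = 0.1
curve = [(p / 20, leakage(q, p / 20)) for p in range(11)]
```

Plotting `curve` gives the shape of the analytical tradeoff that learned PPAN points are compared against in the synthetic experiments.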
Testing for Feature Relevance: The HARVEST Algorithm
Title | Testing for Feature Relevance: The HARVEST Algorithm |
Authors | Herbert Weisberg, Victor Pontes, Mathis Thoma |
Abstract | Feature selection with high-dimensional data and a very small proportion of relevant features poses a severe challenge to standard statistical methods. We have developed a new approach (HARVEST) that is straightforward to apply, albeit somewhat computer-intensive. This algorithm can be used to pre-screen a large number of features to identify those that are potentially useful. The basic idea is to evaluate each feature in the context of many random subsets of other features. HARVEST is predicated on the assumption that an irrelevant feature can add no real predictive value, regardless of which other features are included in the subset. Motivated by this idea, we have derived a simple statistical test for feature relevance. Empirical analyses and simulations produced so far indicate that the HARVEST algorithm is highly effective in predictive analytics, both in science and business. |
Tasks | Feature Selection |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.00210v2 |
PDF | http://arxiv.org/pdf/1710.00210v2.pdf |
PWC | https://paperswithcode.com/paper/testing-for-feature-relevance-the-harvest |
Repo | |
Framework | |
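The abstract's core idea — an irrelevant feature should add no real predictive value regardless of which other features accompany it — can be sketched directly. The pure-Python illustration below scores each feature by its average training-R² gain across many random subsets of the other features; the paper's actual test statistic is not given in the abstract, so the least-squares scoring and the toy data are illustrative assumptions.

```python
import random
from statistics import mean

def ols_r2(X_cols, y):
    """Training R^2 of least squares (with intercept) via normal equations."""
    n = len(y)
    cols = [[1.0] * n] + X_cols
    k = len(cols)
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    b = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for i in range(k):  # Gaussian elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    w = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, k))) / A[i][i]
    yhat = [sum(w[j] * cols[j][t] for j in range(k)) for t in range(n)]
    ybar = mean(y)
    ss_res = sum((yt - yh) ** 2 for yt, yh in zip(y, yhat))
    ss_tot = sum((yt - ybar) ** 2 for yt in y)
    return 1.0 - ss_res / ss_tot

def harvest_score(X, y, j, n_subsets=60, subset_size=3):
    """Average R^2 gain from adding feature j to many random subsets of the
    other features; an irrelevant feature should show only chance-level gain."""
    others = [c for c in range(len(X)) if c != j]
    gains = []
    for _ in range(n_subsets):
        S = random.sample(others, subset_size)
        base = ols_r2([X[c] for c in S], y)
        full = ols_r2([X[c] for c in S] + [X[j]], y)
        gains.append(full - base)
    return mean(gains)

# Toy data: only feature 0 is relevant.
random.seed(0)
n, d = 150, 8
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(d)]
y = [2 * X[0][t] + random.gauss(0, 1) for t in range(n)]
```

On this toy data the relevant feature's average gain dwarfs that of any noise feature, which is the separation HARVEST's statistical test formalizes.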
Lipschitz Properties for Deep Convolutional Networks
Title | Lipschitz Properties for Deep Convolutional Networks |
Authors | Radu Balan, Maneesh Singh, Dongmian Zou |
Abstract | In this paper we discuss the stability properties of convolutional neural networks, which are widely used in machine learning. In classification they are mainly used as feature extractors. Ideally, we expect similar features when the inputs are from the same class. That is, we hope to see a small change in the feature vector with respect to a deformation on the input signal. This can be established mathematically, and the key step is to derive the Lipschitz properties. Further, we establish that the stability results can be extended to more general networks. We give a formula for computing the Lipschitz bound, and compare it with other methods to show it is closer to the optimal value. |
Tasks | |
Published | 2017-01-18 |
URL | http://arxiv.org/abs/1701.05217v1 |
PDF | http://arxiv.org/pdf/1701.05217v1.pdf |
PWC | https://paperswithcode.com/paper/lipschitz-properties-for-deep-convolutional |
Repo | |
Framework | |
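A useful point of reference for the Lipschitz bounds discussed here: for a network of linear layers with 1-Lipschitz activations (e.g. ReLU), the product of the per-layer spectral norms is always a valid, and usually loose, Lipschitz upper bound. The sketch below computes that baseline product bound by power iteration; the paper's tighter formula is not reproduced here.

```python
import math
import random

def spectral_norm(W, iters=100):
    """Largest singular value of a matrix (list of rows) via power iteration."""
    rng = random.Random(0)
    n = len(W[0])
    v = [rng.random() + 0.1 for _ in range(n)]
    for _ in range(iters):
        u = [sum(W[i][j] * v[j] for j in range(n)) for i in range(len(W))]  # W v
        v = [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(n)]  # W^T u
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(W[i][j] * v[j] for j in range(n)) for i in range(len(W))]
    return math.sqrt(sum(x * x for x in u))

def lipschitz_upper_bound(layers):
    """Product of per-layer spectral norms: a Lipschitz bound for a network
    with 1-Lipschitz activations (e.g. ReLU) between the linear layers."""
    bound = 1.0
    for W in layers:
        bound *= spectral_norm(W)
    return bound
```

Bounds of this product form are what "other methods" typically yield; the paper's contribution is a formula closer to the optimal constant.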
HyperENTM: Evolving Scalable Neural Turing Machines through HyperNEAT
Title | HyperENTM: Evolving Scalable Neural Turing Machines through HyperNEAT |
Authors | Jakob Merrild, Mikkel Angaju Rasmussen, Sebastian Risi |
Abstract | Recent developments within memory-augmented neural networks have solved sequential problems requiring long-term memory, which are intractable for traditional neural networks. However, current approaches still struggle to scale to large memory sizes and sequence lengths. In this paper we show how access to memory can be encoded geometrically through a HyperNEAT-based Neural Turing Machine (HyperENTM). We demonstrate that using the indirect HyperNEAT encoding allows for training on small memory vectors in a bit-vector copy task and then applying the knowledge gained from such training to speed up training on larger memory vectors. Additionally, we demonstrate that in some instances, networks trained to copy bit-vectors of size 9 can be scaled to sizes of 1,000 without further training. While the task in this paper is simple, these results could extend the class of problems amenable to networks with external memories to problems with larger memory vectors and theoretically unbounded memory sizes. |
Tasks | |
Published | 2017-10-12 |
URL | http://arxiv.org/abs/1710.04748v1 |
PDF | http://arxiv.org/pdf/1710.04748v1.pdf |
PWC | https://paperswithcode.com/paper/hyperentm-evolving-scalable-neural-turing |
Repo | |
Framework | |
Quantifying Translation-Invariance in Convolutional Neural Networks
Title | Quantifying Translation-Invariance in Convolutional Neural Networks |
Authors | Eric Kauderer-Abrams |
Abstract | A fundamental problem in object recognition is the development of image representations that are invariant to common transformations such as translation, rotation, and small deformations. There are multiple hypotheses regarding the source of translation invariance in CNNs. One idea is that translation invariance is due to the increasing receptive field size of neurons in successive convolution layers. Another possibility is that invariance is due to the pooling operation. We develop a simple tool, the translation-sensitivity map, which we use to visualize and quantify the translation-invariance of various architectures. We obtain the surprising result that architectural choices such as the number of pooling layers and the convolution filter size have only a secondary effect on the translation-invariance of a network. Our analysis identifies training data augmentation as the most important factor in obtaining translation-invariant representations of images using convolutional neural networks. |
Tasks | Data Augmentation, Object Recognition |
Published | 2017-12-10 |
URL | http://arxiv.org/abs/1801.01450v1 |
PDF | http://arxiv.org/pdf/1801.01450v1.pdf |
PWC | https://paperswithcode.com/paper/quantifying-translation-invariance-in |
Repo | |
Framework | |
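The translation-sensitivity map described in the abstract is straightforward to sketch: shift the input by every offset in a small grid and record how far the feature vector moves. In the sketch below the `features` callable is a stand-in for a trained CNN's feature extractor (an assumption; the paper applies the tool to real networks).

```python
def shift(img, dx, dy, fill=0.0):
    """Translate a 2D image (list of rows) by (dx, dy), filling vacated pixels."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out

def sensitivity_map(img, features, max_shift=2):
    """Euclidean distance between the features of the original image and each
    shifted copy, indexed by (dx, dy); a flat map means invariant features."""
    f0 = features(img)
    result = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            f = features(shift(img, dx, dy))
            result[(dx, dy)] = sum((a - b) ** 2 for a, b in zip(f0, f)) ** 0.5
    return result
```

With raw pixels as "features" the map is sharply peaked at (0, 0), while a global sum-pool is flat away from the border, illustrating the two extremes the maps distinguish.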
Learning to Schedule Deadline- and Operator-Sensitive Tasks
Title | Learning to Schedule Deadline- and Operator-Sensitive Tasks |
Authors | Hanan Rosemarin, John P. Dickerson, Sarit Kraus |
Abstract | The use of semi-autonomous and autonomous robotic assistants to aid in care of the elderly is expected to ease the burden on human caretakers, with small-scale testing already occurring in a variety of countries. Yet, it is likely that these robots will need to request human assistance via teleoperation when domain expertise is needed for a specific task. As deployment of robotic assistants moves to scale, mapping these requests for human aid to the teleoperators themselves will be a difficult online optimization problem. In this paper, we design a system that allocates requests to a limited number of teleoperators, each with different specialities, in an online fashion. We generalize a recent model of online job scheduling with a worst-case competitive-ratio bound to our setting. Next, we design a scalable machine-learning-based teleoperator-aware task scheduling algorithm and show, experimentally, that it performs well when compared to an omniscient optimal scheduling algorithm. |
Tasks | |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.06051v1 |
PDF | http://arxiv.org/pdf/1706.06051v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-schedule-deadline-and-operator |
Repo | |
Framework | |
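The online allocation problem can be made concrete with a simple greedy rule: send each incoming request to the qualified teleoperator who frees up earliest. This is an illustrative baseline only, not the paper's learned, deadline-aware scheduler or its competitive-ratio algorithm; the request and operator formats are assumptions for the sketch.

```python
def schedule(requests, operators):
    """Greedy online baseline. Each request is (arrival, deadline, specialty,
    duration); each operator is (op_id, set_of_specialties). Requests are
    served in arrival order by the qualified operator who frees up earliest.
    Returns a list of (request_index, operator_id, start, met_deadline)."""
    free_at = {op_id: 0 for op_id, _ in operators}
    skills = {op_id: specs for op_id, specs in operators}
    out = []
    for i, (arrival, deadline, spec, dur) in enumerate(requests):
        qualified = [o for o in skills if spec in skills[o]]
        if not qualified:
            out.append((i, None, None, False))
            continue
        o = min(qualified, key=lambda q: free_at[q])
        start = max(arrival, free_at[o])
        free_at[o] = start + dur
        out.append((i, o, start, start + dur <= deadline))
    return out
```

A learned scheduler, as in the paper, would replace the earliest-free rule with a policy that also weighs deadlines and operator specialties.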
Shift Aggregate Extract Networks
Title | Shift Aggregate Extract Networks |
Authors | Francesco Orsini, Daniele Baracchi, Paolo Frasconi |
Abstract | We introduce an architecture based on deep hierarchical decompositions to learn effective representations of large graphs. Our framework extends classic R-decompositions used in kernel methods, enabling nested “part-of-part” relations. Unlike recursive neural networks, which unroll a template on input graphs directly, we unroll a neural network template over the decomposition hierarchy, allowing us to deal with the high variability in node degree that typically characterizes social network graphs. Deep hierarchical decompositions are also amenable to domain compression, a technique that reduces both space and time complexity by exploiting symmetries. We show empirically that our approach is competitive with current state-of-the-art graph classification methods, particularly when dealing with social network datasets. |
Tasks | Graph Classification |
Published | 2017-03-16 |
URL | http://arxiv.org/abs/1703.05537v1 |
PDF | http://arxiv.org/pdf/1703.05537v1.pdf |
PWC | https://paperswithcode.com/paper/shift-aggregate-extract-networks |
Repo | |
Framework | |
An EM Based Probabilistic Two-Dimensional CCA with Application to Face Recognition
Title | An EM Based Probabilistic Two-Dimensional CCA with Application to Face Recognition |
Authors | Mehran Safayani, Seyed Hashem Ahmadi, Homayun Afrabandpey, Abdolreza Mirzaei |
Abstract | Recently, two-dimensional canonical correlation analysis (2DCCA) has been successfully applied for image feature extraction. Instead of concatenating the columns of the images into one-dimensional vectors, the method works directly with two-dimensional image matrices. Although 2DCCA works well in different recognition tasks, it lacks a probabilistic interpretation. In this paper, we present a probabilistic framework for 2DCCA called probabilistic 2DCCA (P2DCCA) and an iterative EM-based algorithm for optimizing the parameters. Experimental results on synthetic and real data demonstrate superior performance in loading factor estimation for P2DCCA compared to 2DCCA. For real data, three subsets of the AR face database and the UMIST face database confirm the robustness of the proposed algorithm in face recognition tasks with different illumination conditions, facial expressions, poses and occlusions. |
Tasks | Face Recognition |
Published | 2017-02-25 |
URL | http://arxiv.org/abs/1702.07884v1 |
PDF | http://arxiv.org/pdf/1702.07884v1.pdf |
PWC | https://paperswithcode.com/paper/an-em-based-probabilistic-two-dimensional-cca |
Repo | |
Framework | |
Interpretability via Model Extraction
Title | Interpretability via Model Extraction |
Authors | Osbert Bastani, Carolyn Kim, Hamsa Bastani |
Abstract | The ability to interpret machine learning models has become increasingly important now that machine learning is used to inform consequential decisions. We propose an approach called model extraction for interpreting complex, blackbox models. Our approach approximates the complex model using a much more interpretable model; as long as the approximation quality is good, then statistical properties of the complex model are reflected in the interpretable model. We show how model extraction can be used to understand and debug random forests and neural nets trained on several datasets from the UCI Machine Learning Repository, as well as control policies learned for several classical reinforcement learning problems. |
Tasks | |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1706.09773v4 |
PDF | http://arxiv.org/pdf/1706.09773v4.pdf |
PWC | https://paperswithcode.com/paper/interpretability-via-model-extraction |
Repo | |
Framework | |
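Model extraction in its simplest form: query the black box on a sample of inputs and fit an interpretable surrogate to its answers, so the surrogate's fidelity to the black box can be measured directly. The sketch below fits a depth-1 decision tree (stump); the paper extracts richer interpretable models, so the stump and the synthetic black box here are illustrative.

```python
import random

def extract_stump(blackbox, sample, n_features):
    """Approximate a black-box classifier with a depth-1 decision tree by
    fitting to the black box's own predictions on a sample of inputs.
    Returns (fidelity, feature, threshold, left_label, right_label)."""
    labels = [blackbox(x) for x in sample]
    best = None
    for j in range(n_features):
        values = sorted(set(x[j] for x in sample))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate thresholds: midpoints of observed values
            left = [l for x, l in zip(sample, labels) if x[j] <= t]
            right = [l for x, l in zip(sample, labels) if x[j] > t]
            ll = max(set(left), key=left.count)    # majority label on each side
            rl = max(set(right), key=right.count)
            acc = (sum(l == ll for l in left) + sum(l == rl for l in right)) / len(sample)
            if best is None or acc > best[0]:
                best = (acc, j, t, ll, rl)
    return best
```

When the approximation quality (fidelity) is high, inspecting the surrogate — here a single feature and threshold — tells you something reliable about the black box.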
Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates
Title | Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates |
Authors | Sivaraman Balakrishnan, Larry Wasserman |
Abstract | We consider the goodness-of-fit testing problem of distinguishing whether the data are drawn from a specified distribution, versus a composite alternative separated from the null in the total variation metric. In the discrete case, we consider goodness-of-fit testing when the null distribution has a possibly growing or unbounded number of categories. In the continuous case, we consider testing a Lipschitz density, with possibly unbounded support, in the low-smoothness regime where the Lipschitz parameter is not assumed to be constant. In contrast to existing results, we show that the minimax rate and critical testing radius in these settings depend strongly, and in a precise way, on the null distribution being tested and this motivates the study of the (local) minimax rate as a function of the null distribution. For multinomials the local minimax rate was recently studied in the work of Valiant and Valiant. We revisit and extend their results and develop two modifications to the chi-squared test whose performance we characterize. For testing Lipschitz densities, we show that the usual binning tests are inadequate in the low-smoothness regime and we design a spatially adaptive partitioning scheme that forms the basis for our locally minimax optimal tests. Furthermore, we provide the first local minimax lower bounds for this problem which yield a sharp characterization of the dependence of the critical radius on the null hypothesis being tested. In the low-smoothness regime we also provide adaptive tests that adapt to the unknown smoothness parameter. We illustrate our results with a variety of simulations that demonstrate the practical utility of our proposed tests. |
Tasks | |
Published | 2017-06-30 |
URL | http://arxiv.org/abs/1706.10003v1 |
PDF | http://arxiv.org/pdf/1706.10003v1.pdf |
PWC | https://paperswithcode.com/paper/hypothesis-testing-for-densities-and-high |
Repo | |
Framework | |
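For the multinomial case, the baseline statistic and the flavor of modification the abstract discusses can be sketched as follows. The second statistic, which centers each term and reweights cells by p^(2/3), is written in the spirit of the Valiant-Valiant locally minimax test that the abstract cites; it is illustrative and not the paper's exact modified statistic.

```python
def chi_squared_stat(counts, null_probs):
    """Pearson chi-squared statistic for testing a fixed null multinomial."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p)
               for c, p in zip(counts, null_probs) if p > 0)

def vv_style_stat(counts, null_probs):
    """Centered, 2/3-norm-weighted variant: subtracting the count makes each
    term mean-zero under the null, and the p^(2/3) weighting tempers the
    influence of small-probability cells (in the spirit of Valiant & Valiant's
    locally minimax test; not the paper's exact statistic)."""
    n = sum(counts)
    return sum(((c - n * p) ** 2 - c) / p ** (2.0 / 3.0)
               for c, p in zip(counts, null_probs) if p > 0)
```

The local minimax perspective is that the right critical value, and even the right weighting, depend on the null distribution being tested rather than on worst-case constants.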
Hierarchical Recurrent Attention Network for Response Generation
Title | Hierarchical Recurrent Attention Network for Response Generation |
Authors | Chen Xing, Wei Wu, Yu Wu, Ming Zhou, Yalou Huang, Wei-Ying Ma |
Abstract | We study multi-turn response generation in chatbots where a response is generated according to a conversation context. Existing work has modeled the hierarchy of the context, but does not pay enough attention to the fact that words and utterances in the context are differentially important. As a result, they may lose important information in context and generate irrelevant responses. We propose a hierarchical recurrent attention network (HRAN) to model both aspects in a unified framework. In HRAN, a hierarchical attention mechanism attends to important parts within and among utterances with word-level attention and utterance-level attention respectively. With the word-level attention, hidden vectors of a word-level encoder are synthesized as utterance vectors and fed to an utterance-level encoder to construct hidden representations of the context. The hidden vectors of the context are then weighted by the utterance-level attention and combined into context vectors for decoding the response. Empirical studies on both automatic evaluation and human judgment show that HRAN can significantly outperform state-of-the-art models for multi-turn response generation. |
Tasks | |
Published | 2017-01-25 |
URL | http://arxiv.org/abs/1701.07149v1 |
PDF | http://arxiv.org/pdf/1701.07149v1.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-recurrent-attention-network-for |
Repo | |
Framework | |
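The two-level structure of HRAN can be sketched with plain dot-product attention: word-level attention pools each utterance into a vector, then utterance-level attention pools those vectors into a context vector. The real model uses recurrent encoders and learned attention parameters conditioned on the decoder state; everything below is a minimal structural sketch under those simplifying assumptions.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attend(vectors, query):
    """Dot-product attention: weighted sum of vectors by similarity to query."""
    scores = [sum(a * b for a, b in zip(v, query)) for v in vectors]
    w = softmax(scores)
    dim = len(vectors[0])
    return [sum(w[i] * vectors[i][d] for i in range(len(vectors))) for d in range(dim)]

def hierarchical_context(utterances, query):
    """Word-level attention pools each utterance; utterance-level attention
    then pools the utterance vectors into a single context vector."""
    utt_vecs = [attend(words, query) for words in utterances]
    return attend(utt_vecs, query)
```

The point of the hierarchy is that an unimportant word inside an important utterance, or vice versa, gets down-weighted at the appropriate level rather than averaged in.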
From Photo Streams to Evolving Situations
Title | From Photo Streams to Evolving Situations |
Authors | Mengfan Tang, Feiping Nie, Siripen Pongpaichet, Ramesh Jain |
Abstract | Photos are becoming spontaneous, objective, and universal sources of information. This paper develops evolving situation recognition using photo streams coming from disparate sources combined with advances in deep learning. Using visual concepts in photos together with space and time information, we formulate situation detection as a semi-supervised learning problem and propose new graph-based models to solve it. To extend the method to unknown situations, we introduce a soft label method which enables the traditional semi-supervised learning framework to accurately predict predefined labels as well as effectively form new clusters. To overcome noisy data, which degrades graph quality and leads to poor recognition results, we take advantage of two kinds of noise-robust norms which can eliminate the adverse effects of outliers in visual concepts and improve the accuracy of situation recognition. Finally, we demonstrate the idea and the effectiveness of the proposed model on the Yahoo Flickr Creative Commons 100 Million dataset. |
Tasks | |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.05878v1 |
PDF | http://arxiv.org/pdf/1702.05878v1.pdf |
PWC | https://paperswithcode.com/paper/from-photo-streams-to-evolving-situations |
Repo | |
Framework | |
Efficient Decentralized Visual Place Recognition From Full-Image Descriptors
Title | Efficient Decentralized Visual Place Recognition From Full-Image Descriptors |
Authors | Titus Cieslewski, Davide Scaramuzza |
Abstract | In this paper, we discuss the adaptation of our decentralized place recognition method described in [1] to full image descriptors. As we had shown, the key to making decentralized visual place recognition scalable lies in exploiting deterministic key assignment in a distributed key-value map. Through this, it is possible to reduce bandwidth by up to a factor of n, the robot count, by casting visual place recognition as a key-value lookup problem. In [1], we exploited this for the bag-of-words method [3], [4]. Our method of casting bag-of-words, however, results in a complex decentralized system, which has inherently worse recall than its centralized counterpart. In this paper, we instead start from the recent full-image description method NetVLAD [5]. As we show, casting this as a key-value lookup problem can be achieved with k-means clustering, and results in a much simpler system than [1]. The resulting system still has some flaws, albeit of a completely different nature: it suffers when the environment seen during deployment lies in a different distribution in feature space than the environment seen during training. |
Tasks | Visual Place Recognition |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10739v1 |
PDF | http://arxiv.org/pdf/1705.10739v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-decentralized-visual-place |
Repo | |
Framework | |
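The key-value casting described in the abstract can be sketched in a few lines: cluster the descriptor space with k-means, use the nearest cluster id as the deterministic key, and map each key to one of the n robots. The toy 2-D descriptors, cluster count, and modulo assignment below are illustrative assumptions, not the paper's configuration (NetVLAD descriptors are high-dimensional).

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on a list of equal-length vectors; returns the centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        for c in range(k):
            if clusters[c]:
                dim = len(points[0])
                centers[c] = [sum(p[d] for p in clusters[c]) / len(clusters[c])
                              for d in range(dim)]
    return centers

def responsible_robot(descriptor, centers, n_robots):
    """Deterministic key assignment: nearest cluster id, mapped to a robot, so
    every robot sends a query about this descriptor to the same peer."""
    key = min(range(len(centers)),
              key=lambda c: sum((a - b) ** 2 for a, b in zip(descriptor, centers[c])))
    return key % n_robots
```

Because the cluster centers are shared ahead of time, each robot can compute the responsible peer locally, which is what turns place recognition into a single key-value lookup.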
Co-Morbidity Exploration on Wearables Activity Data Using Unsupervised Pre-training and Multi-Task Learning
Title | Co-Morbidity Exploration on Wearables Activity Data Using Unsupervised Pre-training and Multi-Task Learning |
Authors | Karan Aggarwal, Shafiq Joty, Luis F. Luque, Jaideep Srivastava |
Abstract | Physical activity and sleep play a major role in the prevention and management of many chronic conditions. It is not a trivial task to understand their impact on chronic conditions. Currently, data from electronic health records (EHRs), sleep lab studies, and activity/sleep logs are used. The rapid increase in the popularity of wearable health devices provides a significant new data source, making it possible to track the user’s lifestyle in real time through web interfaces, potentially for both consumers and their healthcare providers. However, at present there is a gap between lifestyle data (e.g., sleep, physical activity) and clinical outcomes normally captured in EHRs. This is a critical barrier for the use of this new source of signal for healthcare decision making. Applying deep learning to wearables data provides a new opportunity to overcome this barrier. To address the problem of the unavailability of clinical data from a major fraction of subjects and unrepresentative subject populations, we propose a novel unsupervised (task-agnostic) time-series representation learning technique called act2vec. act2vec learns useful features by taking into account the co-occurrence of activity levels along with periodicity of human activity patterns. The learned representations are then exploited to boost the performance of disorder-specific supervised learning models. Furthermore, since many disorders are often related to each other, a phenomenon referred to as co-morbidity, we use a multi-task learning framework for exploiting the shared structure of disorder-inducing lifestyle choices partially captured in the wearables data. Empirical evaluation using actigraphy data from 4,124 subjects shows that our proposed method performs and generalizes substantially better than the conventional time-series symbolic representational methods and task-specific deep learning models. |
Tasks | Decision Making, Multi-Task Learning, Representation Learning, Time Series |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09527v1 |
PDF | http://arxiv.org/pdf/1712.09527v1.pdf |
PWC | https://paperswithcode.com/paper/co-morbidity-exploration-on-wearables |
Repo | |
Framework | |
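The abstract says act2vec learns from the co-occurrence of activity levels; a generic count-based version of that idea is sketched below, turning sliding-window co-occurrence counts into positive-PMI vectors. This is not the paper's model (which also exploits the periodicity of activity patterns); it is only meant to show how co-occurrence alone yields usable level embeddings.

```python
import math
from collections import Counter

def cooccurrence(seq, vocab, window=2):
    """Count how often activity levels co-occur within a sliding window."""
    counts = {a: Counter() for a in vocab}
    for i, a in enumerate(seq):
        for j in range(max(0, i - window), min(len(seq), i + window + 1)):
            if j != i:
                counts[a][seq[j]] += 1
    return counts

def ppmi_vectors(counts, vocab):
    """Rows of the positive-PMI matrix serve as simple per-level embeddings:
    levels that appear in similar temporal contexts get similar vectors."""
    total = sum(sum(c.values()) for c in counts.values())
    row = {a: sum(counts[a].values()) for a in vocab}
    vecs = {}
    for a in vocab:
        vecs[a] = [max(0.0, math.log(counts[a][b] * total / (row[a] * row[b])))
                   if counts[a][b] else 0.0
                   for b in vocab]
    return vecs
```

On a sequence with two regimes (e.g. a sleep block followed by an active block), each level's vector puts its weight on its own regime, which is the structure a downstream disorder classifier could exploit.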
Formalising Type-Logical Grammars in Agda
Title | Formalising Type-Logical Grammars in Agda |
Authors | Wen Kokke |
Abstract | In recent years, the interest in using proof assistants to formalise and reason about mathematics and programming languages has grown. Type-logical grammars, being closely related to the type theories and systems used in functional programming, are a natural next candidate for this effort. The advantages of using proof assistants are that they allow one to write formally verified proofs about one’s type-logical systems, and that any theory, once implemented, can immediately be computed with. The downside is that in many cases the formal proofs are written as an afterthought, are incomplete, or use obtuse syntax. As a result, the verified proofs are often much more difficult to read than the pen-and-paper proofs, and almost never directly published. In this paper, we will try to remedy that by example. Concretely, we use Agda to model the Lambek-Grishin calculus, a grammar logic with a rich vocabulary of type-forming operations. We then present a verified procedure for cut elimination in this system. Then we briefly outline a CPS translation from proofs in the Lambek-Grishin calculus to programs in Agda. Finally, we put our system to use in the analysis of a simple example sentence. |
Tasks | |
Published | 2017-09-03 |
URL | http://arxiv.org/abs/1709.00728v1 |
PDF | http://arxiv.org/pdf/1709.00728v1.pdf |
PWC | https://paperswithcode.com/paper/formalising-type-logical-grammars-in-agda |
Repo | |
Framework | |