Paper Group ANR 1352
An Information-Theoretic Explanation for the Adversarial Fragility of AI Classifiers
Title | An Information-Theoretic Explanation for the Adversarial Fragility of AI Classifiers |
Authors | Hui Xie, Jirong Yi, Weiyu Xu, Raghu Mudumbai |
Abstract | We present a simple hypothesis about a compression property of artificial intelligence (AI) classifiers and present theoretical arguments to show that this hypothesis successfully accounts for the observed fragility of AI classifiers to small adversarial perturbations. We also propose a new method for detecting when small input perturbations cause classifier errors, and show theoretical guarantees for the performance of this detection method. We present experimental results with a voice recognition system to demonstrate this method. The ideas in this paper are motivated by a simple analogy between AI classifiers and the standard Shannon model of a communication system. |
Tasks | |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.09413v1 |
http://arxiv.org/pdf/1901.09413v1.pdf | |
PWC | https://paperswithcode.com/paper/an-information-theoretic-explanation-for-the |
Repo | |
Framework | |
Actively Seeking and Learning from Live Data
Title | Actively Seeking and Learning from Live Data |
Authors | Damien Teney, Anton van den Hengel |
Abstract | One of the key limitations of traditional machine learning methods is their requirement for training data that exemplifies all the information to be learned. This is a particular problem for visual question answering methods, which may be asked questions about virtually anything. The approach we propose is a step toward overcoming this limitation by searching for the information required at test time. The resulting method dynamically utilizes data from an external source, such as a large set of questions/answers or images/captions. Concretely, we learn a set of base weights for a simple VQA model, that are specifically adapted to a given question with the information specifically retrieved for this question. The adaptation process leverages recent advances in gradient-based meta learning and contributions for efficient retrieval and cross-domain adaptation. We surpass the state-of-the-art on the VQA-CP v2 benchmark and demonstrate our approach to be intrinsically more robust to out-of-distribution test data. We demonstrate the use of external non-VQA data using the MS COCO captioning dataset to support the answering process. This approach opens a new avenue for open-domain VQA systems that interface with diverse sources of data. |
Tasks | Domain Adaptation, Meta-Learning, Question Answering, Visual Question Answering |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.02865v1 |
http://arxiv.org/pdf/1904.02865v1.pdf | |
PWC | https://paperswithcode.com/paper/actively-seeking-and-learning-from-live-data |
Repo | |
Framework | |
Graph Domain Adaptation with Localized Graph Signal Representations
Title | Graph Domain Adaptation with Localized Graph Signal Representations |
Authors | Yusuf Yigit Pilavci, Eylem Tugce Guneyi, Cemil Cengiz, Elif Vural |
Abstract | In this paper we propose a domain adaptation algorithm designed for graph domains. Given a source graph with many labeled nodes and a target graph with few or no labeled nodes, we aim to estimate the target labels by making use of the similarity between the characteristics of the variation of the label functions on the two graphs. Our assumption about the source and the target domains is that the local behaviour of the label function, such as its spread and speed of variation on the graph, bears resemblance between the two graphs. We estimate the unknown target labels by solving an optimization problem where the label information is transferred from the source graph to the target graph based on the prior that the projections of the label functions onto localized graph bases be similar between the source and the target graphs. In order to efficiently capture the local variation of the label functions on the graphs, spectral graph wavelets are used as the graph bases. Experimentation on various data sets shows that the proposed method yields quite satisfactory classification accuracy compared to reference domain adaptation methods. |
Tasks | Domain Adaptation |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.02883v1 |
https://arxiv.org/pdf/1911.02883v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-domain-adaptation-with-localized-graph |
Repo | |
Framework | |
Correlations between Word Vector Sets
Title | Correlations between Word Vector Sets |
Authors | Vitalii Zhelezniak, April Shen, Daniel Busbridge, Aleksandar Savkov, Nils Hammerla |
Abstract | Similarity measures based purely on word embeddings are comfortably competing with much more sophisticated deep learning and expert-engineered systems on unsupervised semantic textual similarity (STS) tasks. In contrast to commonly used geometric approaches, we treat a single word embedding as e.g. 300 observations from a scalar random variable. Using this paradigm, we first illustrate that similarities derived from elementary pooling operations and classic correlation coefficients yield excellent results on standard STS benchmarks, outperforming many recently proposed methods while being much faster and trivial to implement. Next, we demonstrate how to avoid pooling operations altogether and compare sets of word embeddings directly via correlation operators between reproducing kernel Hilbert spaces. Just like cosine similarity is used to compare individual word vectors, we introduce a novel application of the centered kernel alignment (CKA) as a natural generalisation of squared cosine similarity for sets of word vectors. Likewise, CKA is very easy to implement and enjoys very strong empirical results. |
Tasks | Semantic Textual Similarity, Word Embeddings |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02902v1 |
https://arxiv.org/pdf/1910.02902v1.pdf | |
PWC | https://paperswithcode.com/paper/correlations-between-word-vector-sets |
Repo | |
Framework | |
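The abstract above describes comparing sets of word vectors with centered kernel alignment (CKA). The following is a minimal sketch of that idea, not the authors' implementation. It assumes a linear kernel and, following the abstract, treats each embedding dimension as one observation of a scalar random variable per word. The function names `centered_gram` and `linear_cka` are illustrative.

```python
import numpy as np


def centered_gram(observations):
    """Centered linear Gram matrix over observations.

    observations: array of shape (d, n), i.e. d observations
    (embedding dimensions) of n word-variables.
    """
    d = observations.shape[0]
    gram = observations @ observations.T          # (d, d) linear kernel
    centering = np.eye(d) - np.ones((d, d)) / d   # H = I - (1/d) 11^T
    return centering @ gram @ centering


def linear_cka(words_a, words_b):
    """Linear-kernel CKA between two sets of word vectors.

    words_a: (n_a, d) and words_b: (n_b, d); the sets may differ in size.
    Assumes neither set consists only of constant vectors.
    """
    ka = centered_gram(words_a.T)
    kb = centered_gram(words_b.T)
    # Frobenius inner product normalised by the Frobenius norms.
    return float(np.sum(ka * kb) / (np.linalg.norm(ka) * np.linalg.norm(kb)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sentence_a = rng.normal(size=(4, 300))  # 4 words, 300-d embeddings (toy data)
    sentence_b = rng.normal(size=(6, 300))  # 6 words
    print(linear_cka(sentence_a, sentence_b))
```

Under these assumptions, applying `linear_cka` to two single-word sets gives the squared Pearson correlation between the two vectors, that is, the squared cosine similarity after mean-centering.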
Exact asymptotics for phase retrieval and compressed sensing with random generative priors
Title | Exact asymptotics for phase retrieval and compressed sensing with random generative priors |
Authors | Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová |
Abstract | We consider the problem of compressed sensing and of (real-valued) phase retrieval with a random measurement matrix. We derive sharp asymptotics for the information-theoretically optimal performance and for the best known polynomial algorithm for an ensemble of generative priors consisting of fully connected deep neural networks with random weight matrices and arbitrary activations. We compare the performance to sparse separable priors and conclude that generative priors might be advantageous in terms of algorithmic performance. In particular, while sparsity does not allow compressive phase retrieval to be performed efficiently close to its information-theoretic limit, it is found that under the random generative prior compressed phase retrieval becomes tractable. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02008v1 |
https://arxiv.org/pdf/1912.02008v1.pdf | |
PWC | https://paperswithcode.com/paper/exact-asymptotics-for-phase-retrieval-and |
Repo | |
Framework | |
A large-scale crowdsourced analysis of abuse against women journalists and politicians on Twitter
Title | A large-scale crowdsourced analysis of abuse against women journalists and politicians on Twitter |
Authors | Laure Delisle, Alfredo Kalaitzis, Krzysztof Majewski, Archy de Berker, Milena Marin, Julien Cornebise |
Abstract | We report the first, to the best of our knowledge, hand-in-hand collaboration between human rights activists and machine learners, leveraging crowd-sourcing to study online abuse against women on Twitter. On a technical front, we carefully curate an unbiased yet low-variance dataset of labeled tweets, analyze it to account for the variability of abuse perception, and establish baselines, preparing it for release to community research efforts. On a social impact front, this study provides the technical backbone for a media campaign aimed at raising public and deciders’ awareness and elevating the standards expected from social media companies. |
Tasks | |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1902.03093v1 |
http://arxiv.org/pdf/1902.03093v1.pdf | |
PWC | https://paperswithcode.com/paper/a-large-scale-crowdsourced-analysis-of-abuse |
Repo | |
Framework | |
Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks
Title | Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks |
Authors | Karan Sikka, Lucas Van Bramer, Ajay Divakaran |
Abstract | There has been an explosion of multimodal content generated on social media networks in the last few years, which has necessitated a deeper understanding of social media content and user behavior. We present a novel content-independent content-user-reaction model for social multimedia content analysis. Compared to prior works that generally tackle semantic content understanding and user behavior modeling in isolation, we propose a generalized solution to these problems within a unified framework. We embed users, images and text drawn from open social media in a common multimodal geometric space, using a novel loss function designed to cope with distant and disparate modalities, and thereby enable seamless three-way retrieval. Our model not only outperforms unimodal embedding based methods on cross-modal retrieval tasks but also shows improvements stemming from jointly solving the two tasks on Twitter data. We also show that the user embeddings learned within our joint multimodal embedding model are better at predicting user interests compared to those learned with unimodal content on Instagram data. Our framework thus goes beyond the prior practice of using explicit leader-follower link information to establish affiliations by extracting implicit content-centric affiliations from isolated users. We provide qualitative results to show that the user clusters emerging from learned embeddings have consistent semantics and the ability of our model to discover fine-grained semantics from noisy and unstructured data. Our work reveals that social multimodal content is inherently multimodal and possesses a consistent structure because in social networks meaning is created through interactions between users and content. |
Tasks | Cross-Modal Retrieval |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07075v3 |
https://arxiv.org/pdf/1905.07075v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-unified-multimodal-embeddings-for |
Repo | |
Framework | |
The Landscape of Non-convex Empirical Risk with Degenerate Population Risk
Title | The Landscape of Non-convex Empirical Risk with Degenerate Population Risk |
Authors | Shuang Li, Gongguo Tang, Michael B. Wakin |
Abstract | The landscape of empirical risk has been widely studied in a series of machine learning problems, including low-rank matrix factorization, matrix sensing, matrix completion, and phase retrieval. In this work, we focus on the situation where the corresponding population risk is a degenerate non-convex loss function, namely, the Hessian of the population risk can have zero eigenvalues. Instead of analyzing the non-convex empirical risk directly, we first study the landscape of the corresponding population risk, which is usually easier to characterize, and then build a connection between the landscape of the empirical risk and its population risk. In particular, we establish a correspondence between the critical points of the empirical risk and its population risk without the strongly Morse assumption, which is required in existing literature but not satisfied in degenerate scenarios. We also apply the theory to matrix sensing and phase retrieval to demonstrate how to infer the landscape of empirical risk from that of the corresponding population risk. |
Tasks | Matrix Completion |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05520v4 |
https://arxiv.org/pdf/1907.05520v4.pdf | |
PWC | https://paperswithcode.com/paper/the-landscape-of-non-convex-empirical-risk |
Repo | |
Framework | |
Learning Good Representation via Continuous Attention
Title | Learning Good Representation via Continuous Attention |
Authors | Liang Zhao, Wei Xu |
Abstract | In this paper we present our scientific discovery that good representations can be learned via continuous attention during the interaction between Unsupervised Learning (UL) and Reinforcement Learning (RL) modules driven by intrinsic motivation. Specifically, we designed intrinsic rewards generated from UL modules to drive the RL agent to focus on objects for a period of time and to learn good representations of objects for a later object recognition task. We evaluate our proposed algorithm in settings both with and without extrinsic rewards. Experiments with end-to-end training in simulated environments, with applications to few-shot object recognition, demonstrate the effectiveness of the proposed algorithm. |
Tasks | Object Recognition |
Published | 2019-03-29 |
URL | http://arxiv.org/abs/1903.12344v2 |
http://arxiv.org/pdf/1903.12344v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-good-representation-via-continuous |
Repo | |
Framework | |
Counting the learnable functions of structured data
Title | Counting the learnable functions of structured data |
Authors | Pietro Rotondo, Marco Cosentino Lagomarsino, Marco Gherardi |
Abstract | Cover’s function counting theorem is a milestone in the theory of artificial neural networks. It provides an answer to the fundamental question of determining how many binary assignments (dichotomies) of $p$ points in $n$ dimensions can be linearly realized. Regrettably, it has proved hard to extend the same approach to more advanced problems than the classification of points. In particular, an emerging necessity is to find methods to deal with structured data, and specifically with non-pointlike patterns. A prominent case is that of invariant recognition, whereby identification of a stimulus is insensitive to irrelevant transformations on the inputs (such as rotations or changes in perspective in an image). An object is therefore represented by an extended perceptual manifold, consisting of inputs that are classified similarly. Here, we develop a function counting theory for structured data of this kind, by extending Cover’s combinatorial technique, and we derive analytical expressions for the average number of dichotomies of generically correlated sets of patterns. As an application, we obtain a closed formula for the capacity of a binary classifier trained to distinguish general polytopes of any dimension. These results may help extend our theoretical understanding of generalization, feature extraction, and invariant object recognition by neural networks. |
Tasks | Object Recognition |
Published | 2019-03-28 |
URL | http://arxiv.org/abs/1903.12021v1 |
http://arxiv.org/pdf/1903.12021v1.pdf | |
PWC | https://paperswithcode.com/paper/counting-the-learnable-functions-of |
Repo | |
Framework | |
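For reference, the classical result that the abstract above builds on can be computed directly. This sketch covers only Cover's original theorem for points, not the paper's extension to structured data. It assumes the standard setting of p points in general position in n dimensions, separated by hyperplanes through the origin. The function name `cover_count` is illustrative.

```python
from math import comb


def cover_count(p, n):
    """Number of linearly realizable dichotomies of p points in n dimensions.

    Cover's function counting theorem (points in general position,
    hyperplanes through the origin):
        C(p, n) = 2 * sum_{k=0}^{n-1} binom(p - 1, k)
    """
    if p < 1 or n < 1:
        raise ValueError("p and n must be positive integers")
    # math.comb returns 0 when k > p - 1, so the sum is safe for n > p.
    return 2 * sum(comb(p - 1, k) for k in range(n))


if __name__ == "__main__":
    n = 10
    for p in (5, 10, 20, 40):
        fraction = cover_count(p, n) / 2 ** p
        print(f"p={p:3d}  realizable fraction={fraction:.4f}")
```

Two properties of the formula are easy to check with this function: every dichotomy is realizable when p is at most n, and exactly half of them are realizable at p = 2n.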
Simpler and Faster Learning of Adaptive Policies for Simultaneous Translation
Title | Simpler and Faster Learning of Adaptive Policies for Simultaneous Translation |
Authors | Baigong Zheng, Renjie Zheng, Mingbo Ma, Liang Huang |
Abstract | Simultaneous translation is widely useful but remains challenging. Previous work falls into two main categories: (a) fixed-latency policies such as Ma et al. (2019) and (b) adaptive policies such as Gu et al. (2017). The former are simple and effective, but have to aggressively predict future content due to diverging source-target word order; the latter do not anticipate, but suffer from unstable and inefficient training. To combine the merits of both approaches, we propose a simple supervised-learning framework to learn an adaptive policy from oracle READ/WRITE sequences generated from parallel text. At each step, such an oracle sequence chooses to WRITE the next target word if the available source sentence context provides enough information to do so, otherwise READ the next source word. Experiments on German<->English show that our method, without retraining the underlying NMT model, can learn flexible policies with better BLEU scores and similar latencies compared to previous work. |
Tasks | |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01559v2 |
https://arxiv.org/pdf/1909.01559v2.pdf | |
PWC | https://paperswithcode.com/paper/simpler-and-faster-learning-of-adaptive |
Repo | |
Framework | |
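The READ/WRITE action sequences mentioned in the abstract above can be illustrated with the fixed-latency baseline it cites. The sketch below generates the action sequence of a wait-k policy (category (a) in the abstract); it is not the adaptive oracle that the paper proposes. The function name `wait_k_actions` and the string labels are illustrative assumptions.

```python
def wait_k_actions(src_len, tgt_len, k):
    """READ/WRITE sequence of a fixed-latency wait-k policy.

    Before writing target word t (0-based), the policy must have read
    min(t + k, src_len) source words: it starts k words behind the
    source and then alternates WRITE and READ until the source ends.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    actions = []
    read = written = 0
    while written < tgt_len:
        needed = min(written + k, src_len)
        if read < needed:
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions


if __name__ == "__main__":
    print(wait_k_actions(src_len=5, tgt_len=5, k=2))
```

An adaptive policy, by contrast, would choose between the two actions at each step based on the available source context rather than on a fixed schedule.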
Multi-Attribute Selectivity Estimation Using Deep Learning
Title | Multi-Attribute Selectivity Estimation Using Deep Learning |
Authors | Shohedul Hasan, Saravanan Thirumuruganathan, Jees Augustine, Nick Koudas, Gautam Das |
Abstract | Selectivity estimation - the problem of estimating the result size of queries - is a fundamental problem in databases. Accurate estimation of query selectivity involving multiple correlated attributes is especially challenging. Poor cardinality estimates could result in the selection of bad plans by the query optimizer. We investigate the feasibility of using deep learning based approaches for both point and range queries and propose two complementary approaches. Our first approach considers selectivity as an unsupervised deep density estimation problem. We successfully introduce techniques from neural density estimation for this purpose. The key idea is to decompose the joint distribution into a set of tractable conditional probability distributions such that they satisfy the autoregressive property. Our second approach formulates selectivity estimation as a supervised deep learning problem that predicts the selectivity of a given query. We also introduce and address a number of practical challenges arising when adapting deep learning for relational data. These include query/data featurization, incorporating query workload information in a deep learning framework, and the dynamic scenario where both data and workload queries could be updated. Our extensive experiments, with a special emphasis on queries with a large number of predicates and/or small result sizes, demonstrate that our proposed techniques provide fast and accurate selectivity estimates with minimal space overhead. |
Tasks | Density Estimation |
Published | 2019-03-24 |
URL | https://arxiv.org/abs/1903.09999v2 |
https://arxiv.org/pdf/1903.09999v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-attribute-selectivity-estimation-using |
Repo | |
Framework | |
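The autoregressive decomposition described in the abstract above factors the joint distribution as P(A1, ..., Am) = P(A1) · P(A2 | A1) · ... · P(Am | A1, ..., Am-1). The toy sketch below illustrates that factorization for point queries only. It replaces the paper's neural density estimator with exact empirical counts, which is an assumption made purely for illustration; `fit_prefix_counts` and `point_selectivity` are hypothetical names.

```python
from collections import Counter


def fit_prefix_counts(rows):
    """Count how often each attribute prefix occurs in the table.

    rows: iterable of equal-length tuples, one per record. The counts
    stand in for the learned conditional distributions of a neural model.
    """
    counts = Counter()
    for row in rows:
        for i in range(len(row) + 1):
            counts[tuple(row[:i])] += 1
    return counts


def point_selectivity(counts, query):
    """Selectivity of a point query as a product of conditionals.

    Multiplies P(A_i = q_i | A_1 = q_1, ..., A_{i-1} = q_{i-1}) for
    each attribute in order (the autoregressive chain rule).
    """
    selectivity = 1.0
    for i in range(len(query)):
        prefix_count = counts[tuple(query[:i])]
        if prefix_count == 0:
            return 0.0
        selectivity *= counts[tuple(query[:i + 1])] / prefix_count
    return selectivity


if __name__ == "__main__":
    table = [("a", 1), ("a", 1), ("a", 2), ("b", 1)]
    counts = fit_prefix_counts(table)
    print(point_selectivity(counts, ("a", 1)))
```

With exact counts the product simply recovers the fraction of matching rows; the point of a learned model is to approximate these conditionals compactly instead of storing every prefix.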
The functional role of cue-driven feature-based feedback in object recognition
Title | The functional role of cue-driven feature-based feedback in object recognition |
Authors | Sushrut Thorat, Marcel van Gerven, Marius Peelen |
Abstract | Visual object recognition is not a trivial task, especially when the objects are degraded or surrounded by clutter or presented briefly. External cues (such as verbal cues or visual context) can boost recognition performance in such conditions. In this work, we build an artificial neural network to model the interaction between the object processing stream (OPS) and the cue. We study the effects of varying neural and representational capacities of the OPS on the performance boost provided by cue-driven feature-based feedback in the OPS. We observe that the feedback provides performance boosts only if the category-specific features about the objects cannot be fully represented in the OPS. This representational limit is more dependent on task demands than neural capacity. We also observe that the feedback scheme trained to maximise recognition performance boost is not the same as tuning-based feedback, and actually performs better than tuning-based feedback. |
Tasks | Object Recognition |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.10446v1 |
http://arxiv.org/pdf/1903.10446v1.pdf | |
PWC | https://paperswithcode.com/paper/the-functional-role-of-cue-driven-feature |
Repo | |
Framework | |
Semantic Adversarial Network for Zero-Shot Sketch-Based Image Retrieval
Title | Semantic Adversarial Network for Zero-Shot Sketch-Based Image Retrieval |
Authors | Xinxun Xu, Hao Wang, Leida Li, Cheng Deng |
Abstract | Zero-shot sketch-based image retrieval (ZS-SBIR) is a specific cross-modal retrieval task for retrieving natural images with free-hand sketches under the zero-shot scenario. Previous works mostly focus on modeling the correspondence between images and sketches or synthesizing image features with sketch features. However, both of them ignore the large intra-class variance of sketches, thus resulting in unsatisfactory retrieval performance. In this paper, we propose a novel end-to-end semantic adversarial approach for ZS-SBIR. Specifically, we devise a semantic adversarial module to maximize the consistency between learned semantic features and category-level word vectors. Moreover, to preserve the discriminability of synthesized features within each training category, a triplet loss is employed for the generative module. Additionally, the proposed model is trained end-to-end to exploit better semantic features suitable for ZS-SBIR. Extensive experiments conducted on two large-scale popular datasets demonstrate that our proposed approach remarkably outperforms state-of-the-art approaches in retrieval, by more than 12% on the Sketchy dataset and about 3% on the TU-Berlin dataset. |
Tasks | Cross-Modal Retrieval, Image Retrieval, Sketch-Based Image Retrieval |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02327v2 |
https://arxiv.org/pdf/1905.02327v2.pdf | |
PWC | https://paperswithcode.com/paper/semantic-adversarial-network-for-zero-shot |
Repo | |
Framework | |
Deep Multicameral Decoding for Localizing Unoccluded Object Instances from a Single RGB Image
Title | Deep Multicameral Decoding for Localizing Unoccluded Object Instances from a Single RGB Image |
Authors | Matthieu Grard, Emmanuel Dellandréa, Liming Chen |
Abstract | Occlusion-aware instance-sensitive segmentation is a complex task generally split into region-based segmentations, by approximating instances as their bounding box. We address the showcase scenario of dense homogeneous layouts in which this approximation does not hold. In this scenario, outlining unoccluded instances by decoding a deep encoder becomes difficult, due to the translation invariance of convolutional layers and the lack of complexity in the decoder. We therefore propose a multicameral design composed of subtask-specific lightweight decoder and encoder-decoder units, coupled in cascade to encourage subtask-specific feature reuse and enforce a learning path within the decoding process. Furthermore, the state-of-the-art datasets for occlusion-aware instance segmentation contain real images with few instances and occlusions mostly due to objects occluding the background, unlike dense object layouts. We thus also introduce a synthetic dataset of dense homogeneous object layouts, namely Mikado, which contains considerably more instances and inter-instance occlusions per image than these public datasets. Our extensive experiments on Mikado and public datasets show that ordinal multiscale units within the decoding process prove more effective than state-of-the-art design patterns for capturing position-sensitive representations. We also show that Mikado is plausible with respect to real-world problems, in the sense that it enables the learning of performance-enhancing representations transferable to real images, while drastically reducing the need for hand-made annotations for finetuning. The proposed dataset will be made publicly available. |
Tasks | Boundary Detection, Instance Segmentation, Semantic Segmentation |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07480v2 |
https://arxiv.org/pdf/1906.07480v2.pdf | |
PWC | https://paperswithcode.com/paper/bicameral-structuring-and-synthetic-imagery |
Repo | |
Framework | |