January 30, 2020

2942 words 14 mins read

Paper Group ANR 463


Adversarial Generation of Handwritten Text Images Conditioned on Sequences

Title Adversarial Generation of Handwritten Text Images Conditioned on Sequences
Authors Eloi Alonso, Bastien Moysset, Ronaldo Messina
Abstract State-of-the-art offline handwriting text recognition systems tend to use neural networks and therefore require a large amount of annotated data to be trained. In order to partially satisfy this requirement, we propose a system based on Generative Adversarial Networks (GAN) to produce synthetic images of handwritten words. We use bidirectional LSTM recurrent layers to get an embedding of the word to be rendered, and we feed it to the generator network. We also modify the standard GAN by adding an auxiliary network for text recognition. The system is then trained with a balanced combination of an adversarial loss and a CTC loss. Together, these extensions to GAN make it possible to control the textual content of the generated word images. We obtain realistic images on both French and Arabic datasets, and we show that integrating these synthetic images into the existing training data of a text recognition system can slightly enhance its performance.
Tasks
Published 2019-03-01
URL http://arxiv.org/abs/1903.00277v1
PDF http://arxiv.org/pdf/1903.00277v1.pdf
PWC https://paperswithcode.com/paper/adversarial-generation-of-handwritten-text
Repo
Framework
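
The training objective described above (an adversarial loss balanced against a CTC loss from an auxiliary recognizer, conditioned on a BiLSTM embedding of the target word) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the generator `G`, discriminator `D`, recognizer `R`, the pooling of the character embedding, and the weight `lambda_ctc` are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of the training objective: adversarial loss + CTC loss.
# CharEmbedder and the modules G, D, R passed to generator_step are
# hypothetical placeholders, as is the weight lambda_ctc.

class CharEmbedder(nn.Module):
    """Bidirectional LSTM over character embeddings of the target word."""
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)

    def forward(self, chars):                  # chars: (B, T) character indices
        h, _ = self.lstm(self.emb(chars))      # (B, T, 2*hidden)
        return h.mean(dim=1)                   # pooled word embedding (B, 2*hidden)

def generator_step(G, D, R, embedder, chars, char_lens, z, lambda_ctc=1.0):
    """One generator update: fool D while keeping the text readable by R (CTC)."""
    cond = embedder(chars)
    fake = G(z, cond)                          # synthetic word image
    adv_loss = F.binary_cross_entropy_with_logits(
        D(fake, cond), torch.ones(fake.size(0), 1))
    log_probs = R(fake).log_softmax(-1)        # (T_out, B, n_classes), blank index 0 assumed
    out_lens = torch.full((fake.size(0),), log_probs.size(0), dtype=torch.long)
    ctc_loss = F.ctc_loss(log_probs, chars, out_lens, char_lens)
    return adv_loss + lambda_ctc * ctc_loss    # balanced combination of the two losses
```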

Why Blocking Targeted Adversarial Perturbations Impairs the Ability to Learn

Title Why Blocking Targeted Adversarial Perturbations Impairs the Ability to Learn
Authors Ziv Katzir, Yuval Elovici
Abstract Despite their accuracy, neural network-based classifiers are still prone to manipulation through adversarial perturbations. Those perturbations are designed to be misclassified by the neural network, while being perceptually identical to some valid input. The vast majority of attack methods rely on white-box conditions, where the attacker has full knowledge of the attacked network’s parameters. This allows the attacker to calculate the network’s loss gradient with respect to some valid input and use this gradient in order to create an adversarial example. The task of blocking white-box attacks has proven difficult to solve. While a large number of defense methods have been suggested, they have had limited success. In this work we examine this difficulty and try to understand it. We systematically explore the abilities and limitations of defensive distillation, one of the most promising defense mechanisms suggested so far against adversarial perturbations, in order to understand the defense challenge. We show that contrary to commonly held belief, the ability to bypass defensive distillation is not dependent on an attack’s level of sophistication. In fact, simple approaches, such as the Targeted Gradient Sign Method, are capable of effectively bypassing defensive distillation. We prove that defensive distillation is highly effective against non-targeted attacks but is unsuitable for targeted attacks. This discovery leads us to realize that targeted attacks leverage the same input gradient that allows a network to be trained. This implies that blocking them will require losing the network’s ability to learn, presenting an impossible tradeoff to the research community.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05718v1
PDF https://arxiv.org/pdf/1907.05718v1.pdf
PWC https://paperswithcode.com/paper/why-blocking-targeted-adversarial
Repo
Framework
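
The Targeted Gradient Sign Method referred to in the abstract is a one-step attack that moves the input toward a chosen target class by descending the loss of that class. A minimal sketch, assuming a differentiable classifier and inputs scaled to [0, 1]:

```python
import torch
import torch.nn.functional as F

def targeted_gradient_sign(model, x, target_class, eps=0.03):
    """One-step targeted attack: descend the loss of the *target* class.

    x: input batch (B, C, H, W); target_class: desired labels (B,).
    The step subtracts eps * sign(grad) because we minimize the loss of the
    target class, rather than maximizing the loss of the true class as in FGSM.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), target_class)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv - eps * x_adv.grad.sign()
        x_adv = x_adv.clamp(0.0, 1.0)          # keep a valid image range
    return x_adv.detach()
```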

Stochastic Prediction of Multi-Agent Interactions from Partial Observations

Title Stochastic Prediction of Multi-Agent Interactions from Partial Observations
Authors Chen Sun, Per Karlsson, Jiajun Wu, Joshua B. Tenenbaum, Kevin Murphy
Abstract We present a method that learns to integrate temporal information, from a learned dynamics model, with ambiguous visual information, from a learned vision model, in the context of interacting agents. Our method is based on a graph-structured variational recurrent neural network (Graph-VRNN), which is trained end-to-end to infer the current state of the (partially observed) world, as well as to forecast future states. We show that our method outperforms various baselines on two sports datasets, one based on real basketball trajectories, and one generated by a soccer game engine.
Tasks
Published 2019-02-25
URL http://arxiv.org/abs/1902.09641v1
PDF http://arxiv.org/pdf/1902.09641v1.pdf
PWC https://paperswithcode.com/paper/stochastic-prediction-of-multi-agent
Repo
Framework
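
A very rough sketch of one step of a graph-structured variational RNN, just to make concrete the idea of combining a per-agent recurrent state, a latent variable, and message passing over the agent graph. The fully connected interaction graph, the mean-pooled messages, and all layer sizes are assumptions and almost certainly differ from the paper's Graph-VRNN.

```python
import torch
import torch.nn as nn

class GraphVRNNStep(nn.Module):
    """One illustrative Graph-VRNN step for N interacting agents."""
    def __init__(self, obs_dim, h_dim=64, z_dim=16):
        super().__init__()
        self.msg = nn.Linear(2 * h_dim, h_dim)                   # edge message
        self.prior = nn.Linear(2 * h_dim, 2 * z_dim)             # p(z_t | h, context)
        self.post = nn.Linear(2 * h_dim + obs_dim, 2 * z_dim)    # q(z_t | h, context, x_t)
        self.dec = nn.Linear(z_dim + h_dim, obs_dim)             # state estimate / forecast
        self.rnn = nn.GRUCell(obs_dim + z_dim, h_dim)            # per-agent recurrence

    def forward(self, h, x_t=None):
        # h: (N, h_dim) per-agent hidden states; x_t: (N, obs_dim), or None if unobserved
        N = h.size(0)
        # message passing over a fully connected agent graph, mean-pooled per agent
        ctx = torch.stack([self.msg(torch.cat([h[i].expand(N, -1), h], -1)).mean(0)
                           for i in range(N)])
        stats = self.post(torch.cat([h, ctx, x_t], -1)) if x_t is not None \
                else self.prior(torch.cat([h, ctx], -1))
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterized sample
        x_hat = self.dec(torch.cat([z, h], -1))                  # inferred / forecast state
        h_next = self.rnn(torch.cat([x_t if x_t is not None else x_hat, z], -1), h)
        return x_hat, h_next
```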

Workflow-Driven Distributed Machine Learning in CHASE-CI: A Cognitive Hardware and Software Ecosystem Community Infrastructure

Title Workflow-Driven Distributed Machine Learning in CHASE-CI: A Cognitive Hardware and Software Ecosystem Community Infrastructure
Authors Ilkay Altintas, Kyle Marcus, Isaac Nealey, Scott L. Sellars, John Graham, Dima Mishin, Joel Polizzi, Daniel Crawl, Thomas DeFanti, Larry Smarr
Abstract The advances in data, computing and networking over the last two decades led to a shift in many application domains that includes machine learning on big data as a part of the scientific process, requiring new capabilities for integrated and distributed hardware and software infrastructure. This paper contributes a workflow-driven approach for dynamic data-driven application development on top of a new kind of networked Cyberinfrastructure called CHASE-CI. In particular, we present: 1) The architecture for CHASE-CI, a network of distributed fast GPU appliances for machine learning and storage managed through Kubernetes on the high-speed (10-100Gbps) Pacific Research Platform (PRP); 2) A machine learning software containerization approach and libraries required for turning such a network into a distributed computer for big data analysis; 3) An atmospheric science case study that can only be made scalable with an infrastructure like CHASE-CI; 4) Capabilities for virtual cluster management for data communication and analysis in a dynamically scalable fashion, and visualization across the network in specialized visualization facilities in near real-time; and, 5) A step-by-step workflow and performance measurement approach that enables taking advantage of the dynamic architecture of the CHASE-CI network and container management infrastructure.
Tasks
Published 2019-02-26
URL http://arxiv.org/abs/1903.06802v1
PDF http://arxiv.org/pdf/1903.06802v1.pdf
PWC https://paperswithcode.com/paper/workflow-driven-distributed-machine-learning
Repo
Framework

Self-attention based BiLSTM-CNN classifier for the prediction of ischemic and non-ischemic cardiomyopathy

Title Self-attention based BiLSTM-CNN classifier for the prediction of ischemic and non-ischemic cardiomyopathy
Authors Kavita Dubey, Anant Agarwal, Astitwa Sarthak Lathe, Ranjeet Kumar, Vishal Srivastava
Abstract Heart Failure is a major component of healthcare expenditure and a leading cause of mortality worldwide. Despite higher inter-rater variability, endomyocardial biopsy (EMB) is still regarded as the standard technique, used to identify the cause (e.g. ischemic or non-ischemic cardiomyopathy, coronary artery disease, myocardial infarction etc.) of unexplained heart failure. In this paper, we focus on identifying cardiomyopathy as ischemic or non-ischemic. For this, we propose and implement a new unified architecture comprising a CNN (Inception-V3 model) and a bidirectional LSTM (BiLSTM) with a self-attention mechanism to classify cardiomyopathy as ischemic or non-ischemic from histopathological images. The proposed model is based on self-attention that implicitly focuses on the information output by the hidden layers of the BiLSTM. Our results demonstrate that this framework carries a high learning capacity and is able to improve the classification performance.
Tasks
Published 2019-07-24
URL https://arxiv.org/abs/1907.10370v3
PDF https://arxiv.org/pdf/1907.10370v3.pdf
PWC https://paperswithcode.com/paper/self-attention-based-bilstm-cnn-classifier
Repo
Framework
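
An illustrative sketch of the pipeline named in the abstract: Inception-V3 features, a BiLSTM, and a self-attention layer feeding a binary classifier. Treating each slide as a sequence of image patches, the additive attention form, and all dimensions are assumptions made here for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

class AttnBiLSTMClassifier(nn.Module):
    """Sketch: Inception-V3 patch features -> BiLSTM -> self-attention -> classifier."""
    def __init__(self, n_classes=2, hidden=256):
        super().__init__()
        backbone = models.inception_v3(weights=None)
        backbone.fc = nn.Identity()                       # expose the 2048-d pooled features
        self.cnn = backbone
        self.bilstm = nn.LSTM(2048, hidden, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden, 1)              # additive self-attention score
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, patches):                           # (B, S, 3, 299, 299)
        B, S = patches.shape[:2]
        feats = self.cnn(patches.flatten(0, 1))
        if isinstance(feats, tuple):                      # inception returns aux logits in train mode
            feats = feats[0]
        h, _ = self.bilstm(feats.view(B, S, -1))          # (B, S, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)            # attention weights over the sequence
        return self.fc((w * h).sum(dim=1))                # ischemic vs non-ischemic logits
```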

Fine-Tuned Neural Models for Propaganda Detection at the Sentence and Fragment levels

Title Fine-Tuned Neural Models for Propaganda Detection at the Sentence and Fragment levels
Authors Tariq Alhindi, Jonas Pfeiffer, Smaranda Muresan
Abstract This paper presents the CUNLP submission for the NLP4IF 2019 shared-task on Fine-Grained Propaganda Detection. Our system finished 5th out of 26 teams on the sentence-level classification task and 5th out of 11 teams on the fragment-level classification task based on our scores on the blind test set. We present our models, a discussion of our ablation studies and experiments, and an analysis of our performance on all eighteen propaganda techniques present in the corpus of the shared task.
Tasks
Published 2019-10-22
URL https://arxiv.org/abs/1910.09702v1
PDF https://arxiv.org/pdf/1910.09702v1.pdf
PWC https://paperswithcode.com/paper/fine-tuned-neural-models-for-propaganda
Repo
Framework

Compressibility Loss for Neural Network Weights

Title Compressibility Loss for Neural Network Weights
Authors Caglar Aytekin, Francesco Cricri, Emre Aksu
Abstract In this paper we apply a compressibility loss that enables learning highly compressible neural network weights. The loss was previously proposed as a measure of negated sparsity of a signal, yet in this paper we show that minimizing this loss also enforces the non-zero parts of the signal to have very low entropy, thus making the entire signal more compressible. For an optimization problem where the goal is to minimize the compressibility loss (the objective), we prove that at any critical point of the objective, the weight vector is a ternary signal and the corresponding value of the objective is the square root of the number of non-zero elements in the signal, thus directly related to sparsity. In the experiments, we train neural networks with the compressibility loss and we show that the proposed method achieves weight sparsity and compression ratios comparable with the state-of-the-art.
Tasks
Published 2019-05-03
URL https://arxiv.org/abs/1905.01044v1
PDF https://arxiv.org/pdf/1905.01044v1.pdf
PWC https://paperswithcode.com/paper/compressibility-loss-for-neural-network
Repo
Framework
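
The abstract's claim that the objective equals the square root of the number of non-zero elements at a ternary critical point is consistent with an l1-over-l2 ratio: for a vector with k non-zeros of equal magnitude c, ||w||_1 / ||w||_2 = kc / (c·sqrt(k)) = sqrt(k). Assuming that form (a sketch, not necessarily the paper's exact definition):

```python
import torch

def compressibility_loss(params):
    """Assumed l1/l2 form of the compressibility loss over all weights.
    Evaluates to sqrt(k) for a ternary vector with k non-zeros, consistent
    with the abstract; minimizing it pushes weights toward sparse,
    low-entropy (ternary-like) configurations."""
    flat = torch.cat([p.reshape(-1) for p in params])
    return flat.abs().sum() / (flat.norm(p=2) + 1e-12)

# Typical use (alpha is a hypothetical trade-off hyperparameter):
# total_loss = task_loss + alpha * compressibility_loss(model.parameters())
```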

Bias of Homotopic Gradient Descent for the Hinge Loss

Title Bias of Homotopic Gradient Descent for the Hinge Loss
Authors Denali Molitor, Deanna Needell, Rachel Ward
Abstract Gradient descent is a simple and widely used optimization method for machine learning. For homogeneous linear classifiers applied to separable data, gradient descent has been shown to converge to the maximal margin (or equivalently, the minimal norm) solution for various smooth loss functions. The previous theory does not, however, apply to non-smooth functions such as the hinge loss which is widely used in practice. Here, we study the convergence of a homotopic variant of gradient descent applied to the hinge loss and provide explicit convergence rates to the max-margin solution for linearly separable data.
Tasks
Published 2019-07-26
URL https://arxiv.org/abs/1907.11746v1
PDF https://arxiv.org/pdf/1907.11746v1.pdf
PWC https://paperswithcode.com/paper/bias-of-homotopic-gradient-descent-for-the
Repo
Framework
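
For reference, the standard hinge loss and the max-margin (equivalently, minimum-norm) solution the abstract refers to, for a homogeneous linear classifier on linearly separable data; the paper's homotopic variant of gradient descent is not reproduced here:

$$ \ell(w; x_i, y_i) = \max\bigl(0,\; 1 - y_i\, w^\top x_i\bigr), \qquad y_i \in \{-1, +1\}, $$

$$ w^\star = \arg\min_{w} \ \|w\|_2 \quad \text{s.t.} \quad y_i\, w^\top x_i \ge 1 \ \ \text{for all } i. $$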

Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments

Title Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments
Authors Kai Niu, Yan Huang, Wanli Ouyang, Liang Wang
Abstract Description-based person re-identification (Re-id) is an important task in video surveillance that requires discriminative cross-modal representations to distinguish different people. It is difficult to directly measure the similarity between images and descriptions due to the modality heterogeneity (the cross-modal problem). Moreover, the fact that all samples belong to a single category (the fine-grained problem) makes this task even harder than the conventional image-description matching task. In this paper, we propose a Multi-granularity Image-text Alignments (MIA) model to alleviate the cross-modal fine-grained problem for better similarity evaluation in description-based person Re-id. Specifically, three different granularities, i.e., global-global, global-local and local-local alignments are carried out hierarchically. Firstly, the global-global alignment in the Global Contrast (GC) module is for matching the global contexts of images and descriptions. Secondly, the global-local alignment employs the potential relations between local components and global contexts to highlight the distinguishable components while eliminating the uninvolved ones adaptively in the Relation-guided Global-local Alignment (RGA) module. Thirdly, as for the local-local alignment, we match visual human parts with noun phrases in the Bi-directional Fine-grained Matching (BFM) module. The whole network combining multiple granularities can be trained end-to-end without complex pre-processing. To address the difficulties in training the combination of multiple granularities, an effective step training strategy is proposed to train these granularities step-by-step. Extensive experiments and analysis have shown that our method obtains the state-of-the-art performance on the CUHK-PEDES dataset and outperforms the previous methods by a significant margin.
Tasks Person Re-Identification
Published 2019-06-23
URL https://arxiv.org/abs/1906.09610v1
PDF https://arxiv.org/pdf/1906.09610v1.pdf
PWC https://paperswithcode.com/paper/improving-description-based-person-re
Repo
Framework
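
A minimal sketch of the coarsest of the three granularities, the global-global alignment: cosine similarity between whole-image and whole-description embeddings trained with a bidirectional ranking loss. The triplet-style loss and the margin are assumptions; the RGA and BFM modules are not sketched here.

```python
import torch
import torch.nn.functional as F

def global_global_alignment(img_feats, txt_feats, margin=0.2):
    """Illustrative global-global alignment: cosine similarity between image
    and description embeddings with a bidirectional ranking loss."""
    img = F.normalize(img_feats, dim=-1)           # (B, D)
    txt = F.normalize(txt_feats, dim=-1)           # (B, D)
    sim = img @ txt.t()                            # (B, B) similarity matrix
    pos = sim.diag().unsqueeze(1)                  # similarity of matched pairs
    cost_i2t = (margin + sim - pos).clamp(min=0)   # image -> text negatives
    cost_t2i = (margin + sim - pos.t()).clamp(min=0)   # text -> image negatives
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    return cost_i2t.masked_fill(mask, 0).mean() + cost_t2i.masked_fill(mask, 0).mean()
```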

Language learning using Speech to Image retrieval

Title Language learning using Speech to Image retrieval
Authors Danny Merkx, Stefan L. Frank, Mirjam Ernestus
Abstract Humans learn language by interaction with their environment and listening to other humans. It should also be possible for computational models to learn language directly from speech but so far most approaches require text. We improve on existing neural network approaches to create visually grounded embeddings for spoken utterances. Using a combination of a multi-layer GRU, importance sampling, cyclic learning rates, ensembling and vectorial self-attention our results show a remarkable increase in image-caption retrieval performance over previous work. Furthermore, we investigate which layers in the model learn to recognise words in the input. We find that deeper network layers are better at encoding word presence, although the final layer has slightly lower performance. This shows that our visually grounded sentence encoder learns to recognise words from the input even though it is not explicitly trained for word recognition.
Tasks Image Retrieval
Published 2019-09-09
URL https://arxiv.org/abs/1909.03795v1
PDF https://arxiv.org/pdf/1909.03795v1.pdf
PWC https://paperswithcode.com/paper/language-learning-using-speech-to-image
Repo
Framework
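
A sketch of the speech side of such a system: a multi-layer bidirectional GRU over acoustic features followed by vectorial self-attention pooling, producing an utterance embedding that can be matched against image embeddings with a ranking loss. Feature dimensions, depth, and the exact attention form are assumptions.

```python
import torch
import torch.nn as nn

class SpeechEncoder(nn.Module):
    """Sketch of a visually grounded speech encoder: GRU + vectorial self-attention."""
    def __init__(self, feat_dim=39, hidden=512, layers=4, emb_dim=2048):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, num_layers=layers,
                          bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden, 2 * hidden)    # per-dimension (vectorial) attention
        self.proj = nn.Linear(2 * hidden, emb_dim)

    def forward(self, mfcc):                             # (B, T, feat_dim) acoustic features
        h, _ = self.gru(mfcc)                            # (B, T, 2*hidden)
        alpha = torch.softmax(self.attn(h), dim=1)       # weights over time, per dimension
        pooled = (alpha * h).sum(dim=1)                  # (B, 2*hidden) utterance embedding
        return nn.functional.normalize(self.proj(pooled), dim=-1)
```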

Propagation complete encodings of smooth DNNF theories

Title Propagation complete encodings of smooth DNNF theories
Authors Petr Kučera, Petr Savický
Abstract We investigate conjunctive normal form (CNF) encodings of a function represented with a smooth decomposable negation normal form (DNNF). Several encodings of DNNFs and decision diagrams were considered by (Abio et al. 2016). The authors distinguish encodings which implement consistency or domain consistency from encodings which implement unit refutation completeness or propagation completeness (in both cases, “implement” means by unit propagation). The difference is that in the former case we do not care about properties of the encoding with respect to the auxiliary variables while in the latter case we treat all variables (the input ones and the auxiliary ones) in the same way. The latter case is useful if a DNNF is a part of a problem containing also other constraints and a SAT solver is used to test satisfiability. The currently known encodings of smooth DNNF theories implement domain consistency. Building on this and the result of (Abio et al. 2016) on an encoding of decision diagrams which implements propagation completeness, we present a new encoding of a smooth DNNF which implements propagation completeness. This closes the gap left open in the literature on encodings of DNNFs.
Tasks
Published 2019-09-14
URL https://arxiv.org/abs/1909.06673v2
PDF https://arxiv.org/pdf/1909.06673v2.pdf
PWC https://paperswithcode.com/paper/propagation-complete-encodings-of-smooth-dnnf
Repo
Framework
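
Unit propagation is the inference rule with respect to which consistency, unit refutation completeness, and propagation completeness are all defined. A minimal implementation over DIMACS-style clauses, for illustration:

```python
# Clauses are lists of non-zero integers (DIMACS-style literals); a negative
# integer -v denotes the negation of variable v.

def unit_propagate(clauses, assignment):
    """Repeatedly assert literals from unit clauses; returns the extended
    assignment, or None if a conflict (falsified clause) is derived."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                                  # clause already satisfied
            unassigned = [lit for lit in clause if -lit not in assignment]
            if not unassigned:
                return None                               # conflict: clause falsified
            if len(unassigned) == 1:                      # unit clause -> propagate
                assignment.add(unassigned[0])
                changed = True
    return assignment

# Propagation completeness means: whenever the encoding plus a partial assignment
# entails a literal, unit propagation alone derives it (or derives a conflict).
print(unit_propagate([[1, 2], [-1, 2], [-2, 3]], {-3}))   # derives a conflict -> None
```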

An Exponential Lower Bound for the Runtime of the cGA on Jump Functions

Title An Exponential Lower Bound for the Runtime of the cGA on Jump Functions
Authors Benjamin Doerr
Abstract In the first runtime analysis of an estimation-of-distribution algorithm (EDA) on the multi-modal jump function class, Hasenöhrl and Sutton (GECCO 2018) proved that the runtime of the compact genetic algorithm with suitable parameter choice on jump functions with high probability is at most polynomial (in the dimension) if the jump size is at most logarithmic (in the dimension), and is at most exponential in the jump size if the jump size is super-logarithmic. The exponential runtime guarantee was achieved with a hypothetical population size that is also exponential in the jump size. Consequently, this setting cannot lead to a better runtime. In this work, we show that any choice of the hypothetical population size leads to a runtime that, with high probability, is at least exponential in the jump size. This result might be the first non-trivial exponential lower bound for EDAs that holds for arbitrary parameter settings.
Tasks
Published 2019-04-17
URL https://arxiv.org/abs/1904.08415v2
PDF https://arxiv.org/pdf/1904.08415v2.pdf
PWC https://paperswithcode.com/paper/an-exponential-lower-bound-for-the-runtime-of
Repo
Framework
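
For concreteness, a sketch of the Jump_k benchmark and the compact genetic algorithm in their standard textbook forms, with K playing the role of the hypothetical population size discussed in the abstract (this is illustrative, not the exact setup analysed in the paper):

```python
import random

def jump(x, k):
    """Jump_k fitness: OneMax plus k on the easy region and at the optimum,
    with a low-fitness valley of width k just below the all-ones string."""
    n, ones = len(x), sum(x)
    return k + ones if (ones <= n - k or ones == n) else n - ones

def cga(n, k, K, max_iters=10**6):
    """Compact GA with hypothetical population size K: sample two offspring,
    shift the frequency vector toward the better one by 1/K, with the usual
    borders 1/n and 1 - 1/n on the frequencies."""
    p = [0.5] * n
    for _ in range(max_iters):
        x = [int(random.random() < pi) for pi in p]
        y = [int(random.random() < pi) for pi in p]
        if jump(x, k) < jump(y, k):
            x, y = y, x                                  # x is now the better offspring
        for i in range(n):
            if x[i] != y[i]:
                step = 1.0 / K if x[i] == 1 else -1.0 / K
                p[i] = min(1 - 1 / n, max(1 / n, p[i] + step))
        if jump(x, k) == n + k:                          # global optimum sampled
            return x
    return None
```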

Conditional Learning of Fair Representations

Title Conditional Learning of Fair Representations
Authors Han Zhao, Amanda Coston, Tameem Adel, Geoffrey J. Gordon
Abstract We propose a novel algorithm for learning fair representations that can simultaneously mitigate two notions of disparity among different demographic subgroups in the classification setting. Two key components underpinning the design of our algorithm are balanced error rate and conditional alignment of representations. We show how these two components contribute to ensuring accuracy parity and equalized false-positive and false-negative rates across groups without impacting demographic parity. Furthermore, we also demonstrate both in theory and on two real-world experiments that the proposed algorithm leads to a better utility-fairness trade-off on balanced datasets compared with existing algorithms on learning fair representations for classification.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.07162v3
PDF https://arxiv.org/pdf/1910.07162v3.pdf
PWC https://paperswithcode.com/paper/conditional-learning-of-fair-representations
Repo
Framework
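
Of the two components named in the abstract, the balanced error rate is easy to state directly; below is a small sketch of it together with the per-group false positive / false negative rate gaps that the equalized-rates notion requires to vanish (binary labels and two demographic groups assumed):

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """Balanced error rate: mean of the false positive and false negative rates."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fpr = np.mean(y_pred[y_true == 0] == 1)
    fnr = np.mean(y_pred[y_true == 1] == 0)
    return 0.5 * (fpr + fnr)

def error_rate_gaps(y_true, y_pred, group):
    """Absolute FPR and FNR gaps between two subgroups (group labels 0/1);
    equalized false positive / false negative rates require both gaps ~ 0."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    gaps = {}
    for name, positive in (("fpr_gap", 0), ("fnr_gap", 1)):
        rates = [np.mean(y_pred[(group == g) & (y_true == positive)] != positive)
                 for g in (0, 1)]
        gaps[name] = abs(rates[0] - rates[1])
    return gaps
```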

RecSys-DAN: Discriminative Adversarial Networks for Cross-Domain Recommender Systems

Title RecSys-DAN: Discriminative Adversarial Networks for Cross-Domain Recommender Systems
Authors Cheng Wang, Mathias Niepert, Hui Li
Abstract Data sparsity and data imbalance are practical and challenging issues in cross-domain recommender systems. This paper addresses those problems by leveraging the concepts which derive from representation learning, adversarial learning and transfer learning (particularly, domain adaptation). Although various transfer learning methods have shown promising performance in this context, our proposed novel method RecSys-DAN focuses on alleviating the cross-domain and within-domain data sparsity and data imbalance and learns transferable latent representations for users, items and their interactions. Different from existing approaches, the proposed method transfers the latent representations from a source domain to a target domain in an adversarial way. The mapping functions in the target domain are learned by playing a min-max game with an adversarial loss, aiming to generate domain indistinguishable representations for a discriminator. Four neural architectural instances of RecSys-DAN are proposed and explored. Empirical results on real-world Amazon data show that, even without using labeled data (i.e., ratings) in the target domain, RecSys-DAN achieves competitive performance as compared to the state-of-the-art supervised methods. More importantly, RecSys-DAN is highly flexible to both unimodal and multimodal scenarios, and thus it is more robust in the cold-start recommendation setting, which is difficult for previous methods.
Tasks Domain Adaptation, Recommendation Systems, Representation Learning, Transfer Learning
Published 2019-03-26
URL http://arxiv.org/abs/1903.10794v2
PDF http://arxiv.org/pdf/1903.10794v2.pdf
PWC https://paperswithcode.com/paper/recsys-dan-discriminative-adversarial
Repo
Framework
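
A rough sketch of the adversarial transfer described in the abstract: a target-domain mapper is trained against a discriminator so that target representations become indistinguishable from (fixed) source-domain representations. The module names, the BCE losses, and the update order are assumptions, not the paper's exact instantiation.

```python
import torch
import torch.nn.functional as F

def adversarial_alignment_step(G_s, G_t, D, opt_d, opt_g, x_src, x_tgt):
    """One min-max step: G_s (fixed source mapper), G_t (target mapper),
    D (domain discriminator), opt_d/opt_g their optimizers."""
    with torch.no_grad():
        h_src = G_s(x_src)                      # source latent representations (fixed)
    # 1) discriminator: source representations = real (1), target = fake (0)
    h_tgt = G_t(x_tgt).detach()
    d_loss = F.binary_cross_entropy_with_logits(D(h_src), torch.ones(h_src.size(0), 1)) + \
             F.binary_cross_entropy_with_logits(D(h_tgt), torch.zeros(h_tgt.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) target mapper: fool the discriminator (domain-indistinguishable representations)
    g_loss = F.binary_cross_entropy_with_logits(D(G_t(x_tgt)), torch.ones(x_tgt.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```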

Cascade Decoder: A Universal Decoding Method for Biomedical Image Segmentation

Title Cascade Decoder: A Universal Decoding Method for Biomedical Image Segmentation
Authors Peixian Liang, Jianxu Chen, Hao Zheng, Lin Yang, Yizhe Zhang, Danny Z. Chen
Abstract The Encoder-Decoder architecture is a mainstream deep learning model for biomedical image segmentation. The encoder fully compresses the input and generates encoded features, and the decoder then produces dense predictions using encoded features. However, decoders are still under-explored in such architectures. In this paper, we comprehensively study the state-of-the-art Encoder-Decoder architectures, and propose a new universal decoder, called cascade decoder, to improve semantic segmentation accuracy. Our cascade decoder can be embedded into existing networks and trained altogether in an end-to-end fashion. The cascade decoder structure aims to conduct more effective decoding of hierarchically encoded features and is more compatible with common encoders than the known decoders. We replace the decoders of state-of-the-art models with our cascade decoder for several challenging biomedical image segmentation tasks, and the considerable improvements achieved demonstrate the efficacy of our new decoding method.
Tasks Semantic Segmentation
Published 2019-01-15
URL http://arxiv.org/abs/1901.04949v1
PDF http://arxiv.org/pdf/1901.04949v1.pdf
PWC https://paperswithcode.com/paper/cascade-decoder-a-universal-decoding-method
Repo
Framework