April 2, 2020

Paper Group ANR 210

Fair Principal Component Analysis and Filter Design. Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology. ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning. Search Space of Adversarial Perturbations against Image Filters. Estimating Training Data Influence by Tracking Gradient Descent. …

Fair Principal Component Analysis and Filter Design

Title Fair Principal Component Analysis and Filter Design
Authors Gad Zalcberg, Ami Wiesel
Abstract We consider Fair Principal Component Analysis (FPCA) and search for a low dimensional subspace that spans multiple target vectors in a fair manner. FPCA is defined as a non-concave maximization of the worst projected target norm within a given set. The problem arises in filter design in signal processing, and when incorporating fairness into dimensionality reduction schemes. The state-of-the-art approach to FPCA is via semidefinite relaxation (SDR) and involves a polynomial-time yet computationally expensive optimization. To allow scalability, we propose to address FPCA using naive sub-gradient descent. We analyze the landscape of the underlying optimization in the case of orthogonal targets. We prove that the landscape is benign and that all local minima are globally optimal. Interestingly, the SDR approach leads to sub-optimal solutions in this simple case. Finally, we discuss the equivalence between orthogonal FPCA and the design of normalized tight frames.
Tasks Dimensionality Reduction
Published 2020-02-16
URL https://arxiv.org/abs/2002.06557v1
PDF https://arxiv.org/pdf/2002.06557v1.pdf
PWC https://paperswithcode.com/paper/fair-principal-component-analysis-and-filter
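
The sub-gradient idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fpca_subgradient`, the QR re-orthonormalization step, and the decaying step size are all assumptions made for the example. It maximizes the smallest squared projected norm over orthonormal bases by repeatedly stepping along a sub-gradient taken at the currently worst-served target.

```python
import numpy as np

def fpca_subgradient(targets, rank, steps=500, lr=0.1, seed=0):
    """Maximize min_i ||U^T x_i||^2 over orthonormal U of shape (d, rank).

    targets: array of shape (n, d), one target vector per row.
    Returns the best basis seen and its worst-case squared projected norm.
    """
    rng = np.random.default_rng(seed)
    d = targets.shape[1]
    # Random orthonormal starting basis (hypothetical initialization).
    U, _ = np.linalg.qr(rng.standard_normal((d, rank)))
    best_U, best_val = U, -np.inf
    for t in range(steps):
        norms = np.sum((targets @ U) ** 2, axis=1)
        worst = int(np.argmin(norms))
        if norms[worst] > best_val:
            best_U, best_val = U, norms[worst]
        x = targets[worst]
        # Sub-gradient of ||U^T x||^2 with respect to U is 2 x x^T U.
        grad = 2.0 * np.outer(x, x @ U)
        # Ascent step, then map back to an orthonormal basis via QR.
        U, _ = np.linalg.qr(U + lr / np.sqrt(t + 1) * grad)
    return best_U, best_val
```

For n orthonormal targets the squared projected norms sum to the rank r, so the worst one can never exceed r/n; calling `fpca_subgradient(np.eye(4), rank=2)` should therefore return a value of at most 0.5.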

Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology

Title Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology
Authors Quynh Nguyen, Marco Mondelli
Abstract A recent line of research has provided convergence guarantees for gradient descent algorithms in the excessive over-parameterization regime where the widths of all the hidden layers are required to be polynomially large in the number of training samples. However, the widths of practical deep networks are often only large in the first layer(s) and then start to decrease towards the output layer. This raises an interesting open question whether similar results also hold under this empirically relevant setting. Existing theoretical insights suggest that the loss surface of this class of networks is well-behaved, but these results usually do not provide direct algorithmic guarantees for optimization. In this paper, we close the gap by showing that one wide layer followed by pyramidal deep network topology suffices for gradient descent to find a global minimum with a geometric rate. Our proof is based on a weak form of Polyak-Lojasiewicz inequality which holds for deep pyramidal networks in the manifold of full-rank weight matrices.
Published 2020-02-18
URL https://arxiv.org/abs/2002.07867v1
PDF https://arxiv.org/pdf/2002.07867v1.pdf
PWC https://paperswithcode.com/paper/global-convergence-of-deep-networks-with-one

ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning

Title ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning
Authors Martijn Oldenhof, Adam Arany, Yves Moreau, Jaak Simm
Abstract In drug discovery, knowledge of the graph structure of chemical compounds is essential. Many thousands of scientific articles in chemistry and pharmaceutical sciences have investigated chemical compounds, but in some cases the details of the structure of these chemical compounds are published only as images. A tool to analyze these images automatically and convert them into a chemical graph structure would be useful for many applications, such as drug discovery. A few such tools are available and they are mostly derived from optical character recognition. However, our evaluation of the performance of those tools reveals that they often make mistakes in detecting the correct bond multiplicity and stereochemical information. In addition, errors sometimes even lead to missing atoms in the resulting graph. In our work, we address these issues by developing a compound recognition method based on machine learning. More specifically, we develop a deep neural network model for optical compound recognition. The deep learning solution presented here consists of a segmentation model, followed by three classification models that predict atom locations, bonds and charges. Furthermore, this model not only predicts the graph structure of the molecule but also produces all information necessary to relate each component of the resulting graph to the source image. This solution is scalable and could rapidly process thousands of images. Finally, we empirically compare the proposed method to a well-established tool and observe significant error reductions.
Tasks Drug Discovery, Optical Character Recognition
Published 2020-02-23
URL https://arxiv.org/abs/2002.09914v1
PDF https://arxiv.org/pdf/2002.09914v1.pdf
PWC https://paperswithcode.com/paper/chemgrapher-optical-graph-recognition-of

Search Space of Adversarial Perturbations against Image Filters

Title Search Space of Adversarial Perturbations against Image Filters
Authors Dang Duy Thang, Toshihiro Matsui
Abstract The superior performance of deep learning is threatened by safety issues of its own. Recent findings have shown that deep learning systems are highly vulnerable to adversarial examples, inputs altered by an attacker with the intent to deceive the deep learning system. Many defensive methods have been proposed to protect deep learning systems against adversarial examples. However, there is still a lack of principled strategies to deceive those defensive methods. Any time a particular countermeasure is proposed, a new powerful adversarial attack will be invented to deceive that countermeasure. In this study, we focus on investigating the ability to create adversarial patterns in search space against defensive methods that use image filters. Experimental results conducted on the ImageNet dataset with image classification tasks showed a correlation between the search space of adversarial perturbation and filters. These findings open a new direction for building stronger offensive methods towards deep learning systems.
Tasks Adversarial Attack, Image Classification
Published 2020-03-05
URL https://arxiv.org/abs/2003.02750v1
PDF https://arxiv.org/pdf/2003.02750v1.pdf
PWC https://paperswithcode.com/paper/search-space-of-adversarial-perturbations

Estimating Training Data Influence by Tracking Gradient Descent

Title Estimating Training Data Influence by Tracking Gradient Descent
Authors Garima Pruthi, Frederick Liu, Mukund Sundararajan, Satyen Kale
Abstract We introduce a method called TrackIn that computes the influence of a training example on a prediction made by the model, by tracking how the loss on the test point changes during the training process whenever the training example of interest was utilized. We provide a scalable implementation of TrackIn via a combination of a few key ideas: (a) a first-order approximation to the exact computation, (b) using random projections to speed up the computation of the first-order approximation for large models, (c) using saved checkpoints of standard training procedures, and (d) cherry-picking layers of a deep neural network. An experimental evaluation shows that TrackIn is more effective in identifying mislabelled training examples than other related methods such as influence functions and representer points. We also discuss insights from applying the method on vision, regression and natural language tasks.
Published 2020-02-19
URL https://arxiv.org/abs/2002.08484v1
PDF https://arxiv.org/pdf/2002.08484v1.pdf
PWC https://paperswithcode.com/paper/estimating-training-data-influence-by
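
Ideas (a) and (c) from the abstract above, a first-order approximation evaluated at saved checkpoints, can be sketched as a sum of gradient dot products. The example below is a hedged illustration, not the authors' code: it assumes a plain logistic-regression model, and the names `logistic_grad` and `checkpoint_influence` are hypothetical. At each checkpoint it multiplies the learning rate by the dot product of the loss gradient at the training example and the loss gradient at the test example.

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of the logistic loss for one example (x, y), with y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return (p - y) * x

def checkpoint_influence(checkpoints, lrs, train_example, test_example):
    """First-order influence estimate summed over saved checkpoints.

    checkpoints: list of weight vectors saved during training.
    lrs: learning rate in effect at each checkpoint.
    """
    x_tr, y_tr = train_example
    x_te, y_te = test_example
    total = 0.0
    for w, lr in zip(checkpoints, lrs):
        g_train = logistic_grad(w, x_tr, y_tr)
        g_test = logistic_grad(w, x_te, y_te)
        # A gradient step on the training example changes the test loss
        # by roughly -lr * <g_train, g_test>; we accumulate the reduction.
        total += lr * (g_train @ g_test)
    return float(total)
```

A positive score means that training on the example tended to reduce the test loss, and a negative score means it tended to increase it. Under this reading, a training example whose label disagrees with a similar test point receives a negative score.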

Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift

Title Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift
Authors Ryuhei Takahashi, Atsushi Hashimoto, Motoharu Sonogashira, Masaaki Iiyama
Abstract This paper proposes a novel approach for unsupervised domain adaptation (UDA) with target shift. Target shift is a problem of mismatch in label distribution between source and target domains. Typically it appears as class imbalance in the target domain. In practice, this is an important problem in UDA; as we do not know the labels in target domain datasets, we do not know whether or not their distribution is identical to that in the source domain dataset. Many traditional approaches achieve UDA with distribution matching by minimizing maximum mean discrepancy or by adversarial training; however, these approaches implicitly assume that the distributions coincide and do not work under target shift. Some recent UDA approaches focus on class boundaries and some of them are robust to target shift, but they are only applicable to classification and not to regression. To overcome the target shift problem in UDA, the proposed method, partially shared variational autoencoders (PS-VAEs), uses pair-wise feature alignment instead of feature distribution matching. PS-VAEs inter-convert the domain of each sample with a CycleGAN-based architecture while preserving its label-related content. To evaluate the performance of PS-VAEs, we carried out two experiments: UDA with class-unbalanced digits datasets (classification), and UDA from synthesized data to real observation in human-pose-estimation (regression). The proposed method demonstrated robustness against class imbalance in the classification task, and outperformed the other methods in the regression task by a large margin.
Tasks Domain Adaptation, Pose Estimation, Unsupervised Domain Adaptation
Published 2020-01-22
URL https://arxiv.org/abs/2001.07895v3
PDF https://arxiv.org/pdf/2001.07895v3.pdf
PWC https://paperswithcode.com/paper/partially-shared-variational-auto-encoders

Robotic Grasp Manipulation Using Evolutionary Computing and Deep Reinforcement Learning

Title Robotic Grasp Manipulation Using Evolutionary Computing and Deep Reinforcement Learning
Authors Priya Shukla, Hitesh Kumar, G. C. Nandi
Abstract Intelligent object manipulation for grasping is a challenging problem for robots. Unlike robots, humans almost immediately know how to manipulate objects for grasping due to learning over the years. A grown woman can grasp objects more skilfully than a child because of learning skills developed over years; the absence of such learning in present-day robotic grasping leaves it performing well below human object grasping benchmarks. In this paper we have taken up the challenge of developing learning based pose estimation by decomposing the problem into both position and orientation learning. More specifically, for grasp position estimation, we explore three different methods: a Genetic Algorithm (GA) based optimization method to minimize error between calculated image points and predicted end-effector (EE) position; a regression based method (RM) where collected data points of robot EE and image points have been regressed with a linear model; and a PseudoInverse (PI) model which has been formulated in the form of a mapping matrix with robot EE position and image points for several observations. Further, for grasp orientation learning, we develop a deep reinforcement learning (DRL) model which we name Grasp Deep Q-Network (GDQN) and benchmark our results against Modified VGG16 (MVGG16). Rigorous experiments show that, due to its inherent capability of producing very high-quality solutions for optimization and search problems, the GA based predictor performs much better than the other two models for position estimation. For orientation learning, results indicate that off-policy learning through GDQN outperforms MVGG16, since the GDQN architecture is specifically designed for reinforcement learning. Based on our proposed architectures and algorithms, the robot is capable of grasping all rigid body objects having regular shapes.
Tasks Pose Estimation, Robotic Grasping
Published 2020-01-15
URL https://arxiv.org/abs/2001.05443v1
PDF https://arxiv.org/pdf/2001.05443v1.pdf
PWC https://paperswithcode.com/paper/robotic-grasp-manipulation-using-evolutionary

Semiparametric Bayesian Forecasting of Spatial Earthquake Occurrences

Title Semiparametric Bayesian Forecasting of Spatial Earthquake Occurrences
Authors Aleksandar A. Kolev, Gordon J. Ross
Abstract Self-exciting Hawkes processes are used to model events which cluster in time and space, and have been widely studied in seismology under the name of the Epidemic Type Aftershock Sequence (ETAS) model. In the ETAS framework, the occurrence of the mainshock earthquakes in a geographical region is assumed to follow an inhomogeneous spatial point process, and aftershock events are then modelled via a separate triggering kernel. Most previous studies of the ETAS model have relied on point estimates of the model parameters due to the complexity of the likelihood function, and the difficulty in estimating an appropriate mainshock distribution. In order to take estimation uncertainty into account, we instead propose a fully Bayesian formulation of the ETAS model which uses a nonparametric Dirichlet process mixture prior to capture the spatial mainshock process. Direct inference for the resulting model is problematic due to the strong correlation of the parameters for the mainshock and triggering processes, so we instead use an auxiliary latent variable routine to perform efficient inference.
Published 2020-02-05
URL https://arxiv.org/abs/2002.01706v1
PDF https://arxiv.org/pdf/2002.01706v1.pdf
PWC https://paperswithcode.com/paper/semiparametric-bayesian-forecasting-of

DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection

Title DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection
Authors Ruben Tolosana, Ruben Vera-Rodriguez, Julian Fierrez, Aythami Morales, Javier Ortega-Garcia
Abstract The free access to large-scale public databases, together with the fast progress of deep learning techniques, in particular Generative Adversarial Networks, has led to the generation of very realistic fake content, with corresponding implications for society in this era of fake news. This survey provides a thorough review of techniques for manipulating face images including DeepFake methods, and methods to detect such manipulations. In particular, four types of facial manipulation are reviewed: i) entire face synthesis, ii) face identity swap (DeepFakes), iii) facial attributes manipulation, and iv) facial expression manipulation. For each manipulation type, we provide details regarding manipulation techniques, existing public databases, and key benchmarks for technology evaluation of fake detection methods, including a summary of results from those evaluations. Among the different databases available and discussed in the survey, FaceForensics++, for example, is one of the most widely used for detecting both face identity swap and facial expression manipulations, with results in the literature in the range of 90-100% of manipulation detection accuracy. In addition to the survey information, we also discuss trends and provide an outlook of the ongoing work in this field, e.g., the recently announced DeepFake Detection Challenge (DFDC).
Tasks DeepFake Detection, Face Generation, Face Swapping
Published 2020-01-01
URL https://arxiv.org/abs/2001.00179v1
PDF https://arxiv.org/pdf/2001.00179v1.pdf
PWC https://paperswithcode.com/paper/deepfakes-and-beyond-a-survey-of-face

BaitWatcher: A lightweight web interface for the detection of incongruent news headlines

Title BaitWatcher: A lightweight web interface for the detection of incongruent news headlines
Authors Kunwoo Park, Taegyun Kim, Seunghyun Yoon, Meeyoung Cha, Kyomin Jung
Abstract In digital environments where substantial amounts of information are shared online, news headlines play essential roles in the selection and diffusion of news articles. Some news articles attract audience attention by showing exaggerated or misleading headlines. This study addresses the headline incongruity problem, in which a news headline makes claims that are either unrelated or opposite to the contents of the corresponding article. We present BaitWatcher, which is a lightweight web interface that guides readers in estimating the likelihood of incongruence in news articles before clicking on the headlines. BaitWatcher utilizes a hierarchical recurrent encoder that efficiently learns complex textual representations of a news headline and its associated body text. For training the model, we construct a million scale dataset of news articles, which we also release for broader research use. Based on the results of a focus group interview, we discuss the importance of developing an interpretable AI agent for the design of a better interface for mitigating the effects of online misinformation.
Published 2020-03-23
URL https://arxiv.org/abs/2003.11459v1
PDF https://arxiv.org/pdf/2003.11459v1.pdf
PWC https://paperswithcode.com/paper/baitwatcher-a-lightweight-web-interface-for

Revisiting Fixed Support Wasserstein Barycenter: Computational Hardness and Efficient Algorithms

Title Revisiting Fixed Support Wasserstein Barycenter: Computational Hardness and Efficient Algorithms
Authors Tianyi Lin, Nhat Ho, Xi Chen, Marco Cuturi, Michael I. Jordan
Abstract We study the fixed-support Wasserstein barycenter problem (FS-WBP), which consists in computing the Wasserstein barycenter of $m$ discrete probability measures supported on a finite metric space of size $n$. We show first that the constraint matrix arising from the linear programming (LP) representation of the FS-WBP is totally unimodular when $m \geq 3$ and $n = 2$, but not totally unimodular when $m \geq 3$ and $n \geq 3$. This result answers an open problem, since it shows that the FS-WBP is not a minimum-cost flow problem and therefore cannot be solved efficiently using linear programming. Building on this negative result, we propose and analyze a simple and efficient variant of the iterative Bregman projection (IBP) algorithm, currently the most widely adopted algorithm to solve the FS-WBP. The algorithm is an accelerated IBP algorithm which achieves the complexity bound of $\widetilde{\mathcal{O}}(mn^{7/3}/\varepsilon)$. This bound is better than that obtained for the standard IBP algorithm—$\widetilde{\mathcal{O}}(mn^{2}/\varepsilon^2)$—in terms of $\varepsilon$, and that of accelerated primal-dual gradient algorithm—$\widetilde{\mathcal{O}}(mn^{5/2}/\varepsilon)$—in terms of $n$. Empirical study demonstrates that the acceleration promised by the theory is real in practice.
Published 2020-02-12
URL https://arxiv.org/abs/2002.04783v2
PDF https://arxiv.org/pdf/2002.04783v2.pdf
PWC https://paperswithcode.com/paper/revisiting-fixed-support-wasserstein
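
For orientation, the standard iterative Bregman projection (IBP) scheme that the abstract above uses as its baseline can be sketched in a few lines. This is the plain entropic-regularized IBP, not the accelerated variant the paper proposes; the function name `ibp_barycenter` and the default regularization strength are assumptions chosen for the example.

```python
import numpy as np

def ibp_barycenter(hists, cost, weights, reg=0.05, iters=500):
    """Entropic fixed-support Wasserstein barycenter via standard IBP.

    hists:   (m, n) array, each row a probability vector on the same n points.
    cost:    (n, n) ground cost matrix.
    weights: (m,) barycenter weights summing to 1.
    """
    K = np.exp(-cost / reg)          # Gibbs kernel
    v = np.ones_like(hists)          # row k holds the scaling vector v_k
    for _ in range(iters):
        # Projection 1: match each coupling's first marginal to hists[k].
        u = hists / (v @ K.T)        # row k: p_k / (K v_k)
        Ktu = u @ K                  # row k: K^T u_k
        # Projection 2: force a common second marginal q, the weighted
        # geometric mean of the current second marginals.
        q = np.exp(weights @ np.log(Ktu))
        v = q / Ktu
    return q
```

As a usage example, take five equally spaced points on [0, 1] with squared-distance cost, one histogram concentrated on the first point and one on the last, and equal weights. The returned barycenter is symmetric and peaks at the middle point.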

Exponential discretization of weights of neural network connections in pre-trained neural networks

Title Exponential discretization of weights of neural network connections in pre-trained neural networks
Authors Magomed Yu. Malsagov, Emil M. Khayrov, Maria M. Pushkareva, Iakov M. Karandashev
Abstract To reduce random access memory (RAM) requirements and to increase the speed of recognition algorithms, we consider a weight discretization problem for trained neural networks. We show that an exponential discretization is preferable to a linear discretization since it achieves the same accuracy with 1 or 2 fewer bits. The quality of the neural network VGG-16 is already satisfactory (top5 accuracy 69%) in the case of 3 bit exponential discretization. The ResNet50 neural network shows top5 accuracy 84% at 4 bits. Other neural networks perform fairly well at 5 bits (top5 accuracies of Xception, Inception-v3, and MobileNet-v2 were 87%, 90%, and 77%, respectively). With fewer bits, the accuracy decreases rapidly.
Published 2020-02-03
URL https://arxiv.org/abs/2002.00623v1
PDF https://arxiv.org/pdf/2002.00623v1.pdf
PWC https://paperswithcode.com/paper/exponential-discretization-of-weights-of
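
To make the idea of exponential discretization concrete, here is one possible scheme, sketched under stated assumptions. The abstract does not specify the exact level placement, so the choices below (one sign bit, magnitude levels spaced by powers of a fixed `base` below the largest absolute weight, nearest-level rounding) and the name `exp_quantize` are hypothetical.

```python
import numpy as np

def exp_quantize(w, bits, base=2.0):
    """Round each weight to the nearest of 2**(bits-1) geometric magnitude levels.

    One bit encodes the sign; the remaining bits index a magnitude level
    max|w| * base**(-k) for k = 0, 1, ..., 2**(bits-1) - 1.
    """
    n_levels = 2 ** (bits - 1)
    scale = np.max(np.abs(w))
    levels = scale * base ** (-np.arange(n_levels, dtype=float))
    # Index of the closest magnitude level for every weight.
    idx = np.argmin(np.abs(np.abs(w)[..., None] - levels), axis=-1)
    return np.sign(w) * levels[idx]
```

With 3 bits and base 2, the weights [1.0, 0.5, -0.26, 0.01] map to [1.0, 0.5, -0.25, 0.125]. Note that this particular scheme has no zero level, so very small weights are rounded up to the smallest magnitude.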
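
Accelerating Cooperative Planning for Automated Vehicles with Learned Heuristics and Monte Carlo Tree Search

Title Accelerating Cooperative Planning for Automated Vehicles with Learned Heuristics and Monte Carlo Tree Search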
Authors Karl Kurzer, Marcus Fechner, J. Marius Zöllner
Abstract Efficient driving in urban traffic scenarios requires foresight. Observing other traffic participants and inferring their possible next actions depending on one's own action is considered cooperative prediction and planning. Humans are well equipped with the capability to predict the actions of multiple interacting traffic participants and plan accordingly, without the need to directly communicate with others. Prior work has shown that it is possible to achieve effective cooperative planning without the need for explicit communication. However, the search space for cooperative plans is so large that the vast majority of the computational budget is spent on exploring unpromising regions of the search space that are far away from the solution. To accelerate the planning process, we combined learned heuristics with a cooperative planning method in order to guide the search towards regions with promising actions, yielding better results at lower computational costs.
Published 2020-02-02
URL https://arxiv.org/abs/2002.00497v1
PDF https://arxiv.org/pdf/2002.00497v1.pdf
PWC https://paperswithcode.com/paper/accelerating-cooperative-planning-for

Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space

Title Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space
Authors Mohammad Saeed Abrishami, Amir Erfan Eshratifar, David Eigen, Yanzhi Wang, Shahin Nazarian, Massoud Pedram
Abstract Recent advances in the field of artificial intelligence have been made possible by deep neural networks. In applications where data are scarce, transfer learning and data augmentation techniques are commonly used to improve the generalization of deep learning models. However, fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost, because the full network must be run for every augmented input. This is particularly critical when large models are implemented on embedded devices with limited computational and energy resources. In this work, we propose a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space. Our experimental results show that the proposed method drastically reduces the computation, while the accuracy of models is negligibly compromised.
Tasks Data Augmentation, Transfer Learning
Published 2020-02-12
URL https://arxiv.org/abs/2002.04776v1
PDF https://arxiv.org/pdf/2002.04776v1.pdf
PWC https://paperswithcode.com/paper/efficient-training-of-deep-convolutional

A Machine Learning Application for Raising WASH Awareness in the Times of Covid-19 Pandemic

Title A Machine Learning Application for Raising WASH Awareness in the Times of Covid-19 Pandemic
Authors Rohan Pandey, Vaibhav Gautam, Kanav Bhagat, Tavpritesh Sethi
Abstract A proactive approach to raise awareness while preventing misinformation is a modern-day challenge in all domains including healthcare. Such awareness and sensitization approaches to prevention and containment are important components of a strong healthcare system, especially in times of outbreaks such as the ongoing Covid-19 pandemic. However, there is a fine balance between continuous awareness-raising by providing new information and the risk of misinformation. In this work, we address this gap by creating a life-long learning application that delivers authentic information to users in Hindi, the most widely used local language in India. It does this by matching sources of verified and authentic information such as the WHO reports against daily news by using machine learning and natural language processing. It delivers the narrated content in Hindi by using state-of-the-art text to speech engines. Finally, the approach allows user input for continuous improvement of news feed relevance on a daily basis. We demonstrate a focused application of this approach for Water, Sanitation, and Hygiene (WASH), as it is critical in the containment of the currently raging Covid-19 pandemic, through the WashKaro Android application. Thirteen combinations of pre-processing strategies, word-embeddings, and similarity metrics were evaluated by eight human users via calculation of agreement statistics. The best performing combination achieved a Cohen’s Kappa of 0.54 and was deployed in the WashKaro application back-end. Interventional studies for evaluating the effectiveness of the WashKaro application for preventing WASH-related diseases are planned to be carried out in the Mohalla clinics that provided 3.5 million consultations in 2019 in Delhi, India. Additionally, the application also features human-curated and vetted information to reach out to the community as audio-visual content in local languages.
Tasks Word Embeddings
Published 2020-03-16
URL https://arxiv.org/abs/2003.07074v1
PDF https://arxiv.org/pdf/2003.07074v1.pdf
PWC https://paperswithcode.com/paper/a-machine-learning-application-for-raising
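
The agreement statistic quoted in the abstract above, Cohen's kappa, compares the observed agreement between two label sequences with the agreement expected by chance. The sketch below shows the textbook two-rater formula only; how the authors aggregated judgments from eight users is not stated in the abstract, and the function name `cohens_kappa` is chosen for this example.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(rater_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    # Undefined (division by zero) when both raters always use one same label.
    return (observed - expected) / (1.0 - expected)
```

For example, `cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])` gives 0.5: the raters agree on 3 of 4 items (0.75) against a chance agreement of 0.5.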