April 2, 2020

3530 words 17 mins read

Paper Group ANR 92

Targeted free energy estimation via learned mappings

Title Targeted free energy estimation via learned mappings
Authors Peter Wirnsberger, Andrew J. Ballard, George Papamakarios, Stuart Abercrombie, Sébastien Racanière, Alexander Pritzel, Danilo Jimenez Rezende, Charles Blundell
Abstract Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences, and has since inspired a huge body of related methods that use it as an integral building block. Being an importance sampling based estimator, however, FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions. One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap of the underlying distributions. Despite its potential, this method has attracted only limited attention due to the formidable challenge of formulating a tractable mapping. Here, we cast Targeted FEP as a machine learning (ML) problem in which the mapping is parameterized as a neural network that is optimized so as to increase overlap. We test our method on a fully-periodic solvation system, with a model that respects the inherent permutational and periodic symmetries of the problem. We demonstrate that our method leads to a substantial variance reduction in free energy estimates when compared against baselines.
Published 2020-02-12
URL https://arxiv.org/abs/2002.04913v1
PDF https://arxiv.org/pdf/2002.04913v1.pdf
PWC https://paperswithcode.com/paper/targeted-free-energy-estimation-via-learned
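The Zwanzig identity behind FEP, dF = -(1/beta) * ln⟨exp(-beta * (U_B - U_A))⟩_A, is easy to sketch. The two 1-D harmonic potentials and all numbers below are hypothetical stand-ins for the paper's solvation system, chosen so the exact answer, (1/2) ln 2, is known in closed form:

```python
import math
import random

def fep_estimate(samples_a, u_a, u_b, beta=1.0):
    # Zwanzig identity: dF = -(1/beta) * ln < exp(-beta * (U_B - U_A)) >_A
    mean_exp = sum(math.exp(-beta * (u_b(x) - u_a(x))) for x in samples_a) / len(samples_a)
    return -math.log(mean_exp) / beta

# Hypothetical toy system: two 1-D harmonic wells, U(x) = k/2 * x^2, with k = 1 and 2.
u_a = lambda x: 0.5 * x * x
u_b = lambda x: 1.0 * x * x

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]  # Boltzmann samples of state A (beta = 1)
df = fep_estimate(samples, u_a, u_b)
# Analytic answer for Gaussians: dF = 0.5 * ln(k_B / k_A) = 0.5 * ln 2
```

The toy wells overlap well, so plain FEP converges; the paper's point is that when the two distributions barely overlap, this estimator's variance explodes, and a learned mapping is used to restore overlap before applying the same identity.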

Global Convergence of Frank Wolfe on One Hidden Layer Networks

Title Global Convergence of Frank Wolfe on One Hidden Layer Networks
Authors Alexandre d’Aspremont, Mert Pilanci
Abstract We derive global convergence bounds for the Frank Wolfe algorithm when training one hidden layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second order cone program. The classical Frank Wolfe algorithm then converges with rate $O(1/T)$ where $T$ is both the number of neurons and the number of calls to the oracle.
Published 2020-02-06
URL https://arxiv.org/abs/2002.02208v1
PDF https://arxiv.org/pdf/2002.02208v1.pdf
PWC https://paperswithcode.com/paper/global-convergence-of-frank-wolfe-on-one
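The classical Frank-Wolfe iteration the abstract analyzes is compact enough to sketch. The quadratic objective and l1-ball feasible set below are hypothetical stand-ins (not the paper's one-hidden-layer setting, where the oracle is a second order cone program); the step size 2/(t+2) is what gives the O(1/T) rate:

```python
def frank_wolfe(grad, lmo, x0, steps=1000):
    # Classical Frank-Wolfe: move toward the linear minimization oracle's vertex
    # with step size 2/(t+2), which yields the O(1/T) convergence rate.
    x = list(x0)
    for t in range(steps):
        g = grad(x)
        s = lmo(g)                       # vertex minimizing <g, s> over the feasible set
        gamma = 2.0 / (t + 2.0)
        x = [(1 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x

# Hypothetical instance: minimize ||x - b||^2 over the l1 ball of radius 1.
b = [0.6, 0.3]
grad = lambda x: [2.0 * (xi - bi) for xi, bi in zip(x, b)]

def lmo_l1(g, radius=1.0):
    # Over the l1 ball the oracle returns a signed vertex at the largest |g_i|.
    i = max(range(len(g)), key=lambda j: abs(g[j]))
    s = [0.0] * len(g)
    s[i] = -radius if g[i] > 0 else radius
    return s

x = frank_wolfe(grad, lmo_l1, [0.0, 0.0])
```

Each iterate is a convex combination of at most T oracle vertices, which is why, in the paper's setting, T is simultaneously the number of oracle calls and the number of neurons in the network being formed.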

Monocular Depth Estimation Based On Deep Learning: An Overview

Title Monocular Depth Estimation Based On Deep Learning: An Overview
Authors Chaoqiang Zhao, Qiyu Sun, Chongzhen Zhang, Yang Tang, Feng Qian
Abstract Depth information is important for autonomous systems to perceive environments and estimate their own state. Traditional depth estimation methods, like structure from motion and stereo vision matching, are built on feature correspondences across multiple viewpoints, and the depth maps they predict are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem. With the rapid development of deep neural networks, monocular depth estimation based on deep learning has been widely studied recently and has achieved promising accuracy; moreover, dense depth maps can be estimated from single images by deep neural networks in an end-to-end manner. To improve the accuracy of depth estimation, different kinds of network frameworks, loss functions and training strategies have subsequently been proposed. This review therefore surveys current deep learning-based monocular depth estimation methods. First, we summarize several widely used datasets and evaluation indicators in deep learning-based depth estimation. Furthermore, we review representative existing methods according to their training manner: supervised, unsupervised and semi-supervised. Finally, we discuss the challenges and provide some ideas for future research in monocular depth estimation.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2020-03-14
URL https://arxiv.org/abs/2003.06620v1
PDF https://arxiv.org/pdf/2003.06620v1.pdf
PWC https://paperswithcode.com/paper/monocular-depth-estimation-based-on-deep

A Survey on Dragonfly Algorithm and its Applications in Engineering

Title A Survey on Dragonfly Algorithm and its Applications in Engineering
Authors Chnoor M. Rahman, Tarik A. Rashid
Abstract The dragonfly algorithm (DA), developed by Mirjalili in 2016, is one of the most recently proposed heuristic optimization algorithms. It is now widely used and in some cases outperforms the most popular algorithms; however, it faces obstacles on complex optimization problems. In this work, along with the strengths of the algorithm in solving real-world optimization problems, its weaknesses on complex optimization problems are addressed. This survey presents a comprehensive investigation of DA in the engineering area. First, an overview of the algorithm is given and its different variants are addressed. Combined versions of DA with other techniques, and the modifications made to improve its performance, are shown. In addition, a survey of engineering applications that use DA is offered. The algorithm is compared with several other metaheuristic algorithms to demonstrate its optimization ability relative to them. The results from works in the literature that utilized DA, together with results on benchmark functions, show that DA performs excellently in comparison with some other algorithms, especially on small to medium problems. Moreover, the bottlenecks of the algorithm and some future trends are discussed. The authors conduct this research in the hope of offering useful information about DA to researchers who want to study the algorithm and use it to optimize engineering problems.
Published 2020-02-19
URL https://arxiv.org/abs/2002.12126v1
PDF https://arxiv.org/pdf/2002.12126v1.pdf
PWC https://paperswithcode.com/paper/a-survey-on-dragonfly-algorithm-and-its
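The DA update blends separation, alignment, cohesion, and attraction-to-food terms against an inertia term. The sketch below is a deliberately simplified, hypothetical variant (the whole swarm is treated as each individual's neighbourhood, the enemy-avoidance term is omitted, and the coefficients are hand-picked), run on the sphere benchmark; it is not Mirjalili's reference implementation:

```python
import random

def sphere(x):
    # Classic benchmark: f(x) = sum(x_d^2), minimized at the origin.
    return sum(v * v for v in x)

def dragonfly(obj, dim=2, n=20, iters=300, bound=5.0, seed=0):
    # Simplified Dragonfly Algorithm sketch: "food" is the best position seen so far.
    rng = random.Random(seed)
    X = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]
    best = min(X, key=obj)[:]
    init_f = obj(best)                          # fitness of the best initial dragonfly
    s_w, a_w, c_w, f_w = 0.01, 0.05, 0.7, 1.0   # separation/alignment/cohesion/food weights
    for t in range(iters):
        w = 0.9 - 0.5 * t / iters               # inertia decays over time
        centre = [sum(X[i][d] for i in range(n)) / n for d in range(dim)]
        mean_v = [sum(V[i][d] for i in range(n)) / n for d in range(dim)]
        for i in range(n):
            for d in range(dim):
                sep = n * (centre[d] - X[i][d])     # -sum_j (X_i - X_j)
                coh = centre[d] - X[i][d]
                food = best[d] - X[i][d]
                V[i][d] = w * V[i][d] + s_w * sep + a_w * mean_v[d] + c_w * coh + f_w * food
                V[i][d] = max(-bound / 2, min(bound / 2, V[i][d]))   # velocity clamp
                X[i][d] = max(-bound, min(bound, X[i][d] + V[i][d]))
            if obj(X[i]) < obj(best):
                best = X[i][:]
    return best, init_f

best, init_f = dragonfly(sphere)
```

Even this stripped-down variant improves on the initial swarm on a smooth unimodal function, which matches the survey's observation that DA does well on small to medium problems.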

Machine Learning for Intelligent Optical Networks: A Comprehensive Survey

Title Machine Learning for Intelligent Optical Networks: A Comprehensive Survey
Authors Rentao Gu, Zeyuan Yang, Yuefeng Ji
Abstract With the rapid development of the Internet and communication systems, in both services and technologies, communication networks have become increasingly complex. It is imperative to improve the intelligence of communication networks, and several aspects are being combined with Artificial Intelligence (AI) and Machine Learning (ML). Optical networks, which play an important role in both the core and access portions of communication networks, likewise face great challenges from system complexity and the need for manual operations. To overcome current limitations and address the issues of future optical networks, it is essential to deploy more intelligence capability to enable autonomous and flexible network operations. ML techniques have proven effective at solving complex problems, and thus ML has recently been applied to many optical network tasks. In this paper, a detailed survey of existing applications of ML for intelligent optical networks is presented. The applications of ML are classified in terms of their use cases, which are categorized into optical network control and resource management, and optical network monitoring and survivability. The use cases are analyzed and compared according to the ML techniques used. In addition, a tutorial on ML applications is provided, covering common ML algorithms, ML paradigms, and the motivations for applying ML. Lastly, challenges and possible solutions for applying ML in optical networks are discussed, with the intent of inspiring future innovations in leveraging ML to build intelligent optical networks.
Published 2020-03-11
URL https://arxiv.org/abs/2003.05290v1
PDF https://arxiv.org/pdf/2003.05290v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-intelligent-optical

Teaching Temporal Logics to Neural Networks

Title Teaching Temporal Logics to Neural Networks
Authors Bernd Finkbeiner, Christopher Hahn, Markus N. Rabe, Frederik Schmitt
Abstract We show that a deep neural network can learn the semantics of linear-time temporal logic (LTL). As a challenging task that requires a deep understanding of LTL semantics, we show that our network can solve the trace generation problem for LTL: given a satisfiable LTL formula, find a trace that satisfies the formula. We frame the trace generation problem for LTL as a translation task, i.e., translating from formulas to satisfying traces, and train an off-the-shelf implementation of the Transformer, a recently introduced deep learning architecture proposed for solving natural language processing tasks. We provide a detailed analysis of our experimental results, comparing multiple hyperparameter settings and formula representations. After training for several hours on a single GPU, the results were surprising: the Transformer returns the syntactically equivalent trace in 89% of the cases on a held-out test set. Most of the “mispredictions,” however (and overall more than 99% of the predicted traces), still satisfy the given LTL formula. In other words, the Transformer generalized from imperfect training data to the semantics of LTL.
Published 2020-03-06
URL https://arxiv.org/abs/2003.04218v1
PDF https://arxiv.org/pdf/2003.04218v1.pdf
PWC https://paperswithcode.com/paper/teaching-temporal-logics-to-neural-networks
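The semantics the Transformer is said to learn can be made concrete with a small checker. The sketch below evaluates LTL formulas over finite traces (a simplification: the paper works with infinite, ultimately-periodic traces), with formulas encoded as nested tuples; the encoding is hypothetical, not the paper's representation:

```python
# Formulas as nested tuples: ('ap', name), ('not', f), ('and', f, g), ('or', f, g),
# ('X', f) next, ('F', f) eventually, ('G', f) always, ('U', f, g) until.

def holds(phi, trace, i=0):
    # Does formula phi hold at position i of the finite trace (a list of sets of atoms)?
    op = phi[0]
    if op == 'ap':
        return i < len(trace) and phi[1] in trace[i]
    if op == 'not':
        return not holds(phi[1], trace, i)
    if op == 'and':
        return holds(phi[1], trace, i) and holds(phi[2], trace, i)
    if op == 'or':
        return holds(phi[1], trace, i) or holds(phi[2], trace, i)
    if op == 'X':
        return i + 1 < len(trace) and holds(phi[1], trace, i + 1)
    if op == 'F':
        return any(holds(phi[1], trace, k) for k in range(i, len(trace)))
    if op == 'G':
        return all(holds(phi[1], trace, k) for k in range(i, len(trace)))
    if op == 'U':
        return any(holds(phi[2], trace, k) and
                   all(holds(phi[1], trace, j) for j in range(i, k))
                   for k in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")

trace = [{'a'}, {'a'}, {'b'}]
phi = ('U', ('ap', 'a'), ('ap', 'b'))   # "a until b" holds on this trace
```

A checker like this is exactly what lets the authors count not just exact-match traces but the more than 99% of predicted traces that still satisfy the formula.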

NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay

Title NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay
Authors Jianbin Lin, Zhiqiang Zhang, Jun Zhou, Xiaolong Li, Jingli Fang, Yanming Fang, Quan Yu, Yuan Qi
Abstract Ant Credit Pay is a consumer credit service in Ant Financial Service Group. Similar to a credit card, loan default is one of the major risks of this credit product. Hence, an effective algorithm for default prediction is key to reducing losses and increasing profits for the company. However, the challenges in our scenario differ from those in conventional credit card services. The first is scalability: the huge volume of users and their behaviors in Ant Financial requires the ability to process industrial-scale data and train models efficiently. The second is the cold-start problem: unlike the manual review for credit card applications in conventional banks, the credit limit of Ant Credit Pay is automatically offered to users based on knowledge learned from big data, yet default prediction for new users suffers from a lack of credit behaviors, so the proposal should leverage other new data sources to alleviate the cold-start problem. Considering these challenges and the special scenario in Ant Financial, we incorporate network information into default prediction to alleviate the cold-start problem. In this paper, we propose an industrial-scale distributed network representation framework, termed NetDP, for default prediction in Ant Credit Pay. The proposal explores network information generated by various interactions between users, and blends unsupervised and supervised network representation in a unified framework for the default prediction problem. Moreover, we present a parameter-server-based distributed implementation of our proposal to handle the scalability challenge. Experimental results demonstrate the effectiveness of our proposal, especially on the cold-start problem, as well as its efficiency on an industrial-scale dataset.
Published 2020-04-01
URL https://arxiv.org/abs/2004.00201v1
PDF https://arxiv.org/pdf/2004.00201v1.pdf
PWC https://paperswithcode.com/paper/netdp-an-industrial-scale-distributed-network

Deep Multi-View Enhancement Hashing for Image Retrieval

Title Deep Multi-View Enhancement Hashing for Image Retrieval
Authors Chenggang Yan, Biao Gong, Yuxuan Wei, Yue Gao
Abstract Hashing is an efficient method for nearest neighbor search in large-scale data spaces: it embeds high-dimensional feature descriptors into a similarity-preserving, low-dimensional Hamming space. However, large-scale high-speed retrieval through binary codes sacrifices some retrieval accuracy compared to traditional retrieval methods. We have noticed that multi-view methods can well preserve the diverse characteristics of data. Therefore, we introduce multi-view deep neural networks into the hash learning field and design an efficient and innovative retrieval model, which achieves a significant improvement in retrieval performance. In this paper, we propose a supervised multi-view hash model which can enhance multi-view information through neural networks. This is a new hash learning method that combines multi-view and deep learning methods. The proposed method utilizes an effective view stability evaluation method to actively explore the relationships among views, which affect the optimization direction of the entire network. We have also designed a variety of multi-data fusion methods in the Hamming space to preserve the advantages of both convolution and multi-view. To avoid excessive computing resources for the enhancement procedure during retrieval, we set up a separate structure, called a memory network, which participates in training together. The proposed method is systematically evaluated on the CIFAR-10, NUS-WIDE and MS-COCO datasets, and the results show that our method significantly outperforms state-of-the-art single-view and multi-view hashing methods.
Tasks Image Retrieval
Published 2020-02-01
URL https://arxiv.org/abs/2002.00169v1
PDF https://arxiv.org/pdf/2002.00169v1.pdf
PWC https://paperswithcode.com/paper/deep-multi-view-enhancement-hashing-for-image
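Retrieval in a Hamming space reduces to XOR-and-popcount, which is what makes binary codes fast in the first place. A minimal sketch with hypothetical 6-bit codes (in the paper, the codes would come from the learned multi-view network):

```python
def hamming(a, b):
    # Hamming distance between two binary codes stored as integers:
    # XOR leaves a 1 bit exactly where the codes differ, then count the 1s.
    return bin(a ^ b).count('1')

def retrieve(query, db, k=3):
    # Rank database items by Hamming distance to the query's binary code.
    return sorted(range(len(db)), key=lambda i: hamming(query, db[i]))[:k]

db = [0b101010, 0b111000, 0b000111, 0b101011]   # hypothetical 6-bit codes
top2 = retrieve(0b101010, db, k=2)               # indices of the two nearest codes
```

In practice the codes are packed into machine words, so each distance costs one XOR and one popcount instruction regardless of database size per comparison.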

Robot Navigation in Unseen Spaces using an Abstract Map

Title Robot Navigation in Unseen Spaces using an Abstract Map
Authors Ben Talbot, Feras Dayoub, Peter Corke, Gordon Wyeth
Abstract Human navigation in built environments depends on symbolic spatial information which has unrealised potential to enhance robot navigation capabilities. Information sources such as labels, signs, maps, planners, spoken directions, and navigational gestures communicate a wealth of spatial information to the navigators of built environments; a wealth of information that robots typically ignore. We present a robot navigation system that uses the same symbolic spatial information employed by humans to purposefully navigate in unseen built environments with a level of performance comparable to humans. The navigation system uses a novel data structure called the abstract map to imagine malleable spatial models for unseen spaces from spatial symbols. Sensorimotor perceptions from a robot are then employed to provide purposeful navigation to symbolic goal locations in the unseen environment. We show how a dynamic system can be used to create malleable spatial models for the abstract map, and provide an open source implementation to encourage future work in the area of symbolic navigation. Symbolic navigation performance of humans and a robot is evaluated in a real-world built environment. The paper concludes with a qualitative analysis of human navigation strategies, providing further insights into how the symbolic navigation capabilities of robots in unseen built environments can be improved in the future.
Tasks Robot Navigation
Published 2020-01-31
URL https://arxiv.org/abs/2001.11684v1
PDF https://arxiv.org/pdf/2001.11684v1.pdf
PWC https://paperswithcode.com/paper/robot-navigation-in-unseen-spaces-using-an

The Second Worldwide Wave of Interest in Coronavirus since the COVID-19 Outbreaks in South Korea, Italy and Iran: A Google Trends Study

Title The Second Worldwide Wave of Interest in Coronavirus since the COVID-19 Outbreaks in South Korea, Italy and Iran: A Google Trends Study
Authors Artur Strzelecki
Abstract The recent emergence of a new coronavirus, COVID-19, has gained extensive coverage in public media and global news. As of 24 March 2020, the virus had caused viral pneumonia in tens of thousands of people in Wuhan, China, and thousands of cases in 184 other countries and territories. This study explores the potential use of Google Trends (GT) to monitor worldwide interest in the COVID-19 epidemic. GT was chosen as a source of reverse-engineering data, given the interest in the topic. Current data on COVID-19 is retrieved from GT using one main search topic: Coronavirus. Geographical settings for GT are worldwide, China, South Korea, Italy and Iran. The reported period is 15 January 2020 to 24 March 2020. The results show that the highest worldwide peak in the first wave of demand for information was on 31 January 2020. After the first peak, the number of new cases reported daily rose for 6 days. A second wave started on 21 February 2020 after the outbreaks were reported in Italy, with the highest peak on 16 March 2020. The second wave is six times as big as the first wave. The number of new cases reported daily is rising day by day. This short communication gives a brief introduction to how the demand for information on the coronavirus epidemic is reported through GT.
Published 2020-03-24
URL https://arxiv.org/abs/2003.10998v1
PDF https://arxiv.org/pdf/2003.10998v1.pdf
PWC https://paperswithcode.com/paper/the-second-worldwide-wave-of-interest-in

Dual Convolutional LSTM Network for Referring Image Segmentation

Title Dual Convolutional LSTM Network for Referring Image Segmentation
Authors Linwei Ye, Zhi Liu, Yang Wang
Abstract We consider referring image segmentation. It is a problem at the intersection of computer vision and natural language understanding. Given an input image and a referring expression in the form of a natural language sentence, the goal is to segment the object of interest in the image referred by the linguistic query. To this end, we propose a dual convolutional LSTM (ConvLSTM) network to tackle this problem. Our model consists of an encoder network and a decoder network, where ConvLSTM is used in both encoder and decoder networks to capture spatial and sequential information. The encoder network extracts visual and linguistic features for each word in the expression sentence, and adopts an attention mechanism to focus on words that are more informative in the multimodal interaction. The decoder network integrates the features generated by the encoder network at multiple levels as its input and produces the final precise segmentation mask. Experimental results on four challenging datasets demonstrate that the proposed network achieves superior segmentation performance compared with other state-of-the-art methods.
Tasks Semantic Segmentation
Published 2020-01-30
URL https://arxiv.org/abs/2001.11561v1
PDF https://arxiv.org/pdf/2001.11561v1.pdf
PWC https://paperswithcode.com/paper/dual-convolutional-lstm-network-for-referring

Adaptive Prediction Timing for Electronic Health Records

Title Adaptive Prediction Timing for Electronic Health Records
Authors Jacob Deasy, Ari Ercole, Pietro Liò
Abstract In realistic scenarios, multivariate timeseries evolve over case-by-case time-scales. This is particularly clear in medicine, where the rate of clinical events varies by ward, patient, and application. Increasingly complex models have been shown to effectively predict patient outcomes, but have failed to adapt granularity to these inherent temporal resolutions. As such, we introduce a novel, more realistic, approach to generating patient outcome predictions at an adaptive rate based on uncertainty accumulation in Bayesian recurrent models. We use a Recurrent Neural Network (RNN) and a Bayesian embedding layer with a new aggregation method to demonstrate adaptive prediction timing. Our model predicts more frequently when events are dense or the model is certain of event latent representations, and less frequently when readings are sparse or the model is uncertain. At 48 hours after patient admission, our model achieves equal performance compared to its static-windowed counterparts, while generating patient- and event-specific prediction timings that lead to improved predictive performance over the crucial first 12 hours of the patient stay.
Published 2020-03-05
URL https://arxiv.org/abs/2003.02554v1
PDF https://arxiv.org/pdf/2003.02554v1.pdf
PWC https://paperswithcode.com/paper/adaptive-prediction-timing-for-electronic
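The accumulate-and-fire idea behind adaptive prediction timing can be sketched generically: integrate a per-step signal and emit a prediction whenever the accumulator crosses a threshold, so that some periods trigger predictions densely and others sparsely. The threshold, the stream, and the trigger rule below are all hypothetical; the paper derives its timings from uncertainty in a Bayesian embedding layer of an RNN:

```python
def adaptive_schedule(signal, threshold=1.0):
    # Emit a prediction whenever the accumulated per-step signal crosses the
    # threshold, then reset the accumulator. The emission rate thus adapts to
    # the stream instead of following a fixed window.
    times, acc = [], 0.0
    for t, u in enumerate(signal):
        acc += u
        if acc >= threshold:
            times.append(t)
            acc = 0.0
    return times

# Hypothetical per-hour stream: low values early, high values later,
# so predictions are emitted sparsely at first and densely afterwards.
stream = [0.2, 0.2, 0.2, 0.2, 0.2, 0.9, 0.9, 0.9]
times = adaptive_schedule(stream)
```

Compared with a fixed-window schedule, the number of emissions here is controlled by the stream itself, which is the mechanism that lets prediction timing track event density.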

Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks

Title Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks
Authors Muhammad Umer, Glenn Dawson, Robi Polikar
Abstract Artificial neural networks are well-known to be susceptible to catastrophic forgetting when continually learning from sequences of tasks. Various continual (or “incremental”) learning approaches have been proposed to avoid catastrophic forgetting, but they are typically adversary agnostic, i.e., they do not consider the possibility of a malicious attack. In this effort, we explore the vulnerability of Elastic Weight Consolidation (EWC), a popular continual learning algorithm for avoiding catastrophic forgetting. We show that an intelligent adversary can bypass the EWC’s defenses, and instead cause gradual and deliberate forgetting by introducing small amounts of misinformation to the model during training. We demonstrate such an adversary’s ability to assume control of the model via injection of “backdoor” attack samples on both permuted and split benchmark variants of the MNIST dataset. Importantly, once the model has learned the adversarial misinformation, the adversary can then control the amount of forgetting of any task. Equivalently, the malicious actor can create a “false memory” about any task by inserting carefully-designed backdoor samples to any fraction of the test instances of that task. Perhaps most damaging, we show this vulnerability to be very acute; neural network memory can be easily compromised with the addition of backdoor samples into as little as 1% of the training data of even a single task.
Tasks Continual Learning
Published 2020-02-17
URL https://arxiv.org/abs/2002.07111v1
PDF https://arxiv.org/pdf/2002.07111v1.pdf
PWC https://paperswithcode.com/paper/targeted-forgetting-and-false-memory
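The data-poisoning step the abstract describes (stamping a backdoor pattern plus an attacker-chosen label onto a small fraction of training samples) can be sketched directly. The feature indices, pattern value, target label, and toy data below are hypothetical, not the paper's MNIST setup:

```python
import random

def inject_backdoor(X, y, pattern_idx, pattern_val, target_label, frac=0.01, seed=0):
    # Copy the training data and stamp a fixed feature pattern plus the
    # attacker's label onto a small fraction of samples (the abstract reports
    # that as little as 1% suffices to compromise EWC's memory).
    rng = random.Random(seed)
    Xp = [list(x) for x in X]
    yp = list(y)
    poisoned = rng.sample(range(len(Xp)), max(1, int(frac * len(Xp))))
    for i in poisoned:
        for j in pattern_idx:
            Xp[i][j] = pattern_val
        yp[i] = target_label
    return Xp, yp, poisoned

# Hypothetical toy task: 200 all-zero 16-feature samples, all labeled 0.
X = [[0.0] * 16 for _ in range(200)]
y = [0] * 200
Xp, yp, idx = inject_backdoor(X, y, pattern_idx=[0, 1, 15], pattern_val=1.0, target_label=7)
```

At test time the same pattern stamped onto clean inputs triggers the attacker's label, which is how the "false memory" about a task is created.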

A Novel Kuhnian Ontology for Epistemic Classification of STM Scholarly Articles

Title A Novel Kuhnian Ontology for Epistemic Classification of STM Scholarly Articles
Authors Khalid M. Saqr, Abdelrahman Elsharawy
Abstract Thomas Kuhn proposed his paradigmatic view of scientific discovery five decades ago. The concept of a paradigm has not only explained the progress of science, but has also become the central epistemic concept among STM scientists. Here, we adopt the principles of Kuhnian philosophy to construct a novel ontology that aims to classify and evaluate the impact of STM scholarly articles. First, we explain how the Kuhnian cycle of science describes research at different epistemic stages. Second, we show how the Kuhnian cycle can be reconstructed into modular ontologies which classify scholarly articles according to their contribution to paradigm-centred knowledge. The proposed ontology and its scenarios are discussed. To the best of the authors’ knowledge, this is the first attempt at creating an ontology for describing scholarly articles based on the Kuhnian paradigmatic view of science.
Published 2020-02-10
URL https://arxiv.org/abs/2002.03531v1
PDF https://arxiv.org/pdf/2002.03531v1.pdf
PWC https://paperswithcode.com/paper/a-novel-kuhnian-ontology-for-epistemic

Balanced Alignment for Face Recognition: A Joint Learning Approach

Title Balanced Alignment for Face Recognition: A Joint Learning Approach
Authors Huawei Wei, Peng Lu, Yichen Wei
Abstract Face alignment is crucial for face recognition and has been widely adopted. However, current practice is too simple and under-explored. There lacks an understanding of how important face alignment is and how it should be performed, for recognition. This work studies these problems and makes two contributions. First, it provides an in-depth and quantitative study of how alignment strength affects recognition accuracy. Our results show that excessive alignment is harmful and an optimal balanced point of alignment is in need. To strike the balance, our second contribution is a novel joint learning approach where alignment learning is controllable with respect to its strength and driven by recognition. Our proposed method is validated by comprehensive experiments on several benchmarks, especially the challenging ones with large pose.
Tasks Face Alignment, Face Recognition
Published 2020-03-23
URL https://arxiv.org/abs/2003.10168v1
PDF https://arxiv.org/pdf/2003.10168v1.pdf
PWC https://paperswithcode.com/paper/balanced-alignment-for-face-recognition-a