January 29, 2020

Paper Group ANR 689

Differential equations as models of deep neural networks. Variational Autoencoder Trajectory Primitives with Continuous and Discrete Latent Codes. Physics-Informed Echo State Networks for Chaotic Systems Forecasting. Knowledge extraction from the learning of sequences in a long short term memory (LSTM) architecture. New Approach for Solving The Clu …

Differential equations as models of deep neural networks

Title Differential equations as models of deep neural networks
Authors Julius Ruseckas
Abstract In this work we systematically analyze general properties of differential equations used as machine learning models. We demonstrate that the gradient of the loss function with respect to the hidden state can be considered as a generalized momentum conjugate to the hidden state, allowing application of the tools of classical mechanics. In addition, we show that not only residual networks, but also feedforward neural networks with small nonlinearities and weight matrices deviating only slightly from identity matrices can be related to differential equations. We propose a differential equation describing such networks and investigate its properties.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.03767v2
PDF https://arxiv.org/pdf/1909.03767v2.pdf
PWC https://paperswithcode.com/paper/differential-equations-as-models-of-deep
Repo
Framework
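
To make the link between residual networks and differential equations in the abstract above concrete, here is a minimal sketch of the commonly cited view that a residual update h_{k+1} = h_k + dt * f(h_k) is an explicit Euler step of dh/dt = f(h). This is not the specific equation proposed in the paper, which the abstract does not state. The function names and the toy layer f(h) = -h are assumptions chosen only for illustration.

```python
import math


def residual_forward(h0, f, num_layers, dt):
    """Apply the residual update h <- h + dt * f(h) num_layers times.

    Each update is one explicit Euler step of the ODE dh/dt = f(h).
    """
    h = h0
    for _ in range(num_layers):
        h = h + dt * f(h)
    return h


def decay(h):
    # Hypothetical "layer" f(h) = -h; the exact ODE solution is h0 * exp(-t).
    return -h


# Integrate from t = 0 to t = 1 with many thin residual layers.
num_layers = 1000
h_final = residual_forward(1.0, decay, num_layers, dt=1.0 / num_layers)
exact = math.exp(-1.0)
print(h_final, exact)
```

With more layers and a proportionally smaller step, the stacked residual updates approach the continuous solution, which is the sense in which a deep residual network can be read as a discretized differential equation.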

Variational Autoencoder Trajectory Primitives with Continuous and Discrete Latent Codes

Title Variational Autoencoder Trajectory Primitives with Continuous and Discrete Latent Codes
Authors Takayuki Osa, Shuhei Ikemoto
Abstract Imitation learning is an intuitive approach for teaching motion to robotic systems. Although previous studies have proposed various methods to model demonstrated movement primitives, one of the limitations of existing methods is that it is not trivial to modify their planned trajectory once the model is learned. The trajectory of a robotic manipulator is often high-dimensional, and it is not easy to tune the shape of the planned trajectory in an intuitive manner. We address this problem by learning the latent space of the robot trajectory. If the latent variable of the trajectories can be learned, it can be used to tune the trajectory in an intuitive manner even when the user is not an expert. We propose a framework for modeling demonstrated trajectories with a neural network that learns the low-dimensional latent space. Our neural network structure is built on the variational autoencoder (VAE) with discrete and continuous latent variables. We extend the structure of the existing VAE to obtain a decoder that is conditioned on the goal position of the trajectory for generalization to different goal positions. To cope with the requirement for massive training data, we use a trajectory augmentation technique inspired by the data augmentation commonly used in the computer vision community. In the proposed framework, the latent variables that encode the multiple types of trajectories are learned in an unsupervised manner. The learned decoder can be used as a motion planner in which the user can specify the goal position and the trajectory types by setting the latent variables. The experimental results show that our neural network can be trained using a limited number of demonstrated trajectories and that the interpretable latent representations can be learned.
Tasks Data Augmentation, Imitation Learning
Published 2019-12-09
URL https://arxiv.org/abs/1912.04063v1
PDF https://arxiv.org/pdf/1912.04063v1.pdf
PWC https://paperswithcode.com/paper/variational-autoencoder-trajectory-primitives
Repo
Framework

Physics-Informed Echo State Networks for Chaotic Systems Forecasting

Title Physics-Informed Echo State Networks for Chaotic Systems Forecasting
Authors Nguyen Anh Khoa Doan, Wolfgang Polifke, Luca Magri
Abstract We propose a physics-informed Echo State Network (ESN) to predict the evolution of chaotic systems. Compared to conventional ESNs, the physics-informed ESNs are trained to solve supervised learning tasks while ensuring that their predictions do not violate physical laws. This is achieved by introducing an additional loss function during the training of the ESNs, which penalizes non-physical predictions without the need for any additional training data. This approach is demonstrated on a chaotic Lorenz system, where the physics-informed ESNs improve the predictability horizon by about two Lyapunov times compared to conventional ESNs. The proposed framework shows the potential of using machine learning combined with prior physical knowledge to improve the time-accurate prediction of chaotic dynamical systems.
Tasks
Published 2019-04-09
URL https://arxiv.org/abs/1906.11122v1
PDF https://arxiv.org/pdf/1906.11122v1.pdf
PWC https://paperswithcode.com/paper/physics-informed-echo-state-networks-for
Repo
Framework
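
The abstract above describes adding a loss term that penalizes non-physical predictions. The sketch below illustrates that idea for the Lorenz system under stated assumptions: the physics term is taken to be the mean squared residual of the Lorenz equations evaluated with a forward difference, and `physics_weight` is a hypothetical weighting parameter. The exact loss, discretization, and ESN training procedure used by the authors are not given in the abstract, and no ESN is implemented here.

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations with the classic parameters."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def physics_residual(trajectory, dt):
    """Mean squared mismatch between the forward-difference time derivative
    of a predicted trajectory and the Lorenz right-hand side."""
    total, count = 0.0, 0
    for current, following in zip(trajectory, trajectory[1:]):
        rhs = lorenz_rhs(current)
        for c, n, r in zip(current, following, rhs):
            total += ((n - c) / dt - r) ** 2
            count += 1
    return total / count


def data_loss(predicted, target):
    """Ordinary mean squared error against the training data."""
    total, count = 0.0, 0
    for p_state, t_state in zip(predicted, target):
        for p, t in zip(p_state, t_state):
            total += (p - t) ** 2
            count += 1
    return total / count


def total_loss(predicted, target, dt, physics_weight=1.0):
    # Assumed combination: supervised error plus a weighted physics penalty.
    return data_loss(predicted, target) + physics_weight * physics_residual(predicted, dt)


def euler_trajectory(start, dt, steps):
    """Reference trajectory that satisfies the discretized equations exactly."""
    trajectory = [start]
    for _ in range(steps):
        state = trajectory[-1]
        rhs = lorenz_rhs(state)
        trajectory.append(tuple(v + dt * r for v, r in zip(state, rhs)))
    return trajectory


consistent = euler_trajectory((1.0, 1.0, 1.0), dt=0.01, steps=50)
perturbed = [(x + 0.1 * (i % 2), y, z) for i, (x, y, z) in enumerate(consistent)]
print(physics_residual(consistent, 0.01), physics_residual(perturbed, 0.01))
```

Because the physics term only needs the predicted states, it can be evaluated on predictions for which no target data exist, which is consistent with the abstract's remark that no additional training data are required.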

Knowledge extraction from the learning of sequences in a long short term memory (LSTM) architecture

Title Knowledge extraction from the learning of sequences in a long short term memory (LSTM) architecture
Authors Ikram Chraibi Kaadoud, Nicolas P. Rougier, Frédéric Alexandre
Abstract We introduce a general method to extract knowledge from a recurrent neural network (Long Short Term Memory) that has learnt to detect whether a given input sequence is valid or not, according to an unknown generative automaton. Based on the clustering of the hidden states, we explain how to build and validate an automaton that corresponds to the underlying (unknown) automaton and makes it possible to predict whether a given sequence is valid or not. The method is illustrated on artificial grammars (Reber’s grammar variations) as well as on a real use-case whose underlying grammar is unknown.
Tasks
Published 2019-12-06
URL https://arxiv.org/abs/1912.03126v1
PDF https://arxiv.org/pdf/1912.03126v1.pdf
PWC https://paperswithcode.com/paper/knowledge-extraction-from-the-learning-of
Repo
Framework

New Approach for Solving The Clustered Shortest-Path Tree Problem Based on Reducing The Search Space of Evolutionary Algorithm

Title New Approach for Solving The Clustered Shortest-Path Tree Problem Based on Reducing The Search Space of Evolutionary Algorithm
Authors Huynh Thi Thanh Binh, Pham Dinh Thanh, Ta Bao Thang
Abstract Along with the development of manufacturing and services, the problem of distribution network optimization has been growing in importance, thus receiving much attention from the research community. One of the most recently introduced network optimization problems is the Clustered Shortest-Path Tree Problem (CluSTP). Since the problem is NP-Hard, recent approaches often prefer to use approximation algorithms to solve it, several of which used Evolutionary Algorithms (EAs) and have been proven to be effective. However, most of the prior studies directly applied EAs to the whole CluSTP problem, which leads to a great amount of resource consumption, especially when the problem size is large. To overcome these limitations, this paper suggests a method for reducing the search space of the EAs applied to CluSTP by decomposing the original problem into two sub-problems, the solution to one of which is found by an EA and that to the other is found by another method. The goal of the first sub-problem is to determine a spanning tree which connects the clusters, while the goal of the second sub-problem is to determine the best spanning tree for each cluster. In addition, this paper proposes a new EA, which can be applied to solve the first sub-problem, and suggests using Dijkstra’s algorithm to solve the second sub-problem. The proposed approach is comprehensively evaluated and compared with existing methods. Experimental results show that our method is more efficient and, more importantly, that it can obtain results which are close to the optimal results.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1908.07060v1
PDF https://arxiv.org/pdf/1908.07060v1.pdf
PWC https://paperswithcode.com/paper/new-approach-for-solving-the-clustered
Repo
Framework
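
The abstract above suggests Dijkstra's algorithm for the second sub-problem, finding the best spanning tree inside each cluster. The following is a minimal sketch of Dijkstra's algorithm returning a shortest-path tree for a single cluster. The graph representation (a dictionary of neighbor weights), the function name, and the toy cluster are assumptions for illustration; how the authors pick each cluster's root and combine the per-cluster trees is not described in the abstract.

```python
import heapq


def shortest_path_tree(graph, source):
    """Dijkstra's algorithm on a graph with non-negative edge weights.

    graph maps each node to a dict {neighbor: weight}.
    Returns (dist, parent); the parent links form the shortest-path tree.
    """
    dist = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, parent


# Hypothetical three-node cluster with symmetric edge weights.
cluster = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2},
    "c": {"a": 4, "b": 2},
}
dist, parent = shortest_path_tree(cluster, "a")
print(dist, parent)
```

Solving the within-cluster part exactly in this way leaves only the inter-cluster connections to the evolutionary search, which is the search-space reduction the abstract describes.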

Spectral inference for large Stochastic Blockmodels with nodal covariates

Title Spectral inference for large Stochastic Blockmodels with nodal covariates
Authors Angelo Mele, Lingxin Hao, Joshua Cape, Carey E. Priebe
Abstract In many applications of network analysis, it is important to distinguish between observed and unobserved factors affecting network structure. To this end, we develop spectral estimators for both unobserved blocks and the effect of covariates in stochastic blockmodels. Our main strategy is to reformulate the stochastic blockmodel estimation problem as recovery of latent positions in a generalized random dot product graph. On the theoretical side, we establish asymptotic normality of our estimators for the subsequent purpose of performing inference. On the applied side, we show that computing our estimator is much faster than standard variational expectation–maximization algorithms and scales well for large networks. The results in this paper provide a foundation to estimate the effect of observed covariates as well as unobserved latent community structure on the probability of link formation in networks.
Tasks
Published 2019-08-18
URL https://arxiv.org/abs/1908.06438v1
PDF https://arxiv.org/pdf/1908.06438v1.pdf
PWC https://paperswithcode.com/paper/spectral-inference-for-large-stochastic
Repo
Framework

On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval

Title On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval
Authors Ankita Pasad, Bowen Shi, Herman Kamper, Karen Livescu
Abstract Recent work has shown that speech paired with images can be used to learn semantically meaningful speech representations even without any textual supervision. In real-world low-resource settings, however, we often have access to some transcribed speech. We study whether and how visual grounding is useful in the presence of varying amounts of textual supervision. In particular, we consider the task of semantic speech retrieval in a low-resource setting. We use a previously studied data set and task, where models are trained on images with spoken captions and evaluated on human judgments of semantic relevance. We propose a multitask learning approach to leverage both visual and textual modalities, with visual supervision in the form of keyword probabilities from an external tagger. We find that visual grounding is helpful even in the presence of textual supervision, and we analyze this effect over a range of sizes of transcribed data sets. With ~5 hours of transcribed speech, we obtain 23% higher average precision when also using visual supervision.
Tasks
Published 2019-04-24
URL https://arxiv.org/abs/1904.10947v2
PDF https://arxiv.org/pdf/1904.10947v2.pdf
PWC https://paperswithcode.com/paper/on-the-contributions-of-visual-and-textual
Repo
Framework

Geometry-constrained Car Recognition Using a 3D Perspective Network

Title Geometry-constrained Car Recognition Using a 3D Perspective Network
Authors Rui Zeng, Zongyuan Ge, Simon Denman, Sridha Sridharan, Clinton Fookes
Abstract We present a novel learning framework for vehicle recognition from a single RGB image. Unlike existing methods which only use attention mechanisms to locate 2D discriminative information, our work learns a novel 3D perspective feature representation of a vehicle, which is then fused with 2D appearance features to predict the category. The framework is composed of a global network (GN), a 3D perspective network (3DPN), and a fusion network. The GN is used to locate the region of interest (RoI) and generate the 2D global feature. With the assistance of the RoI, the 3DPN estimates the 3D bounding box under the guidance of the proposed vanishing point loss, which provides a perspective geometry constraint. Then the proposed 3D representation is generated by eliminating the viewpoint variance of the 3D bounding box using perspective transformation. Finally, the 3D and 2D features are fused to predict the category of the vehicle. We present qualitative and quantitative results on the vehicle classification and verification tasks in the BoxCars dataset. The results demonstrate that, by learning such a concise 3D representation, we can achieve superior performance to methods that only use 2D information while retaining meaningful 3D information without the challenge of requiring a 3D CAD model.
Tasks
Published 2019-03-19
URL https://arxiv.org/abs/1903.07916v3
PDF https://arxiv.org/pdf/1903.07916v3.pdf
PWC https://paperswithcode.com/paper/3dcarrecog-car-recognition-using-3d-bounding
Repo
Framework

Poincaré Wasserstein Autoencoder

Title Poincaré Wasserstein Autoencoder
Authors Ivan Ovinnikov
Abstract This work presents a reformulation of the recently proposed Wasserstein autoencoder framework on a non-Euclidean manifold, the Poincaré ball model of the hyperbolic space. By assuming the latent space to be hyperbolic, we can use its intrinsic hierarchy to impose structure on the learned latent space representations. We demonstrate the model in the visual domain to analyze some of its properties and show competitive results on a graph link prediction task.
Tasks Link Prediction
Published 2019-01-05
URL https://arxiv.org/abs/1901.01427v2
PDF https://arxiv.org/pdf/1901.01427v2.pdf
PWC https://paperswithcode.com/paper/poincare-wasserstein-autoencoder
Repo
Framework
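
As background for the Poincaré ball model mentioned in the abstract above, here is a minimal sketch of the standard geodesic distance on the unit Poincaré ball with curvature -1, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It only illustrates the geometry of the latent space and is not the autoencoder or its training objective. The function name and the sample points are assumptions.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincaré ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# The same Euclidean gap of 0.05 is a much longer hyperbolic distance
# near the boundary than near the origin.
near_origin = poincare_distance((0.0, 0.0), (0.05, 0.0))
near_boundary = poincare_distance((0.9, 0.0), (0.95, 0.0))
print(near_origin, near_boundary)
```

Distances grow rapidly toward the boundary of the ball, so there is far more room there than near the origin. This property is what makes hyperbolic space a natural fit for tree-like, hierarchical structure.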

Detecting Deception in Political Debates Using Acoustic and Textual Features

Title Detecting Deception in Political Debates Using Acoustic and Textual Features
Authors Daniel Kopev, Ahmed Ali, Ivan Koychev, Preslav Nakov
Abstract We present work on deception detection, where, given a spoken claim, we aim to predict its factuality. While previous work in the speech community has relied on recordings from staged setups where people were asked to tell the truth or to lie and their statements were recorded, here we use real-world political debates. Thanks to the efforts of fact-checking organizations, it is possible to obtain annotations for statements in the context of a political discourse as true, half-true, or false. Starting with such data from the CLEF-2018 CheckThat! Lab, which was limited to text, we performed alignment to the corresponding videos, thus producing a multimodal dataset. We further developed a multimodal deep-learning architecture for the task of deception detection, which yielded sizable improvements over the state of the art for the CLEF-2018 Lab task 2. Our experiments show that the use of the acoustic signal consistently helped to improve the performance compared to using textual and metadata features only, based on several different evaluation measures. We release the new dataset to the research community, hoping to help advance the overall field of multimodal deception detection.
Tasks Deception Detection
Published 2019-10-04
URL https://arxiv.org/abs/1910.01990v1
PDF https://arxiv.org/pdf/1910.01990v1.pdf
PWC https://paperswithcode.com/paper/detecting-deception-in-political-debates
Repo
Framework

Mesh-based Camera Pairs Selection and Occlusion-Aware Masking for Mesh Refinement

Title Mesh-based Camera Pairs Selection and Occlusion-Aware Masking for Mesh Refinement
Authors Andrea Romanoni, Matteo Matteucci
Abstract Many Multi-View-Stereo algorithms extract a 3D mesh model of a scene, after fusing depth maps into a volumetric representation of the space. Due to the limited scalability of such representations, the estimated model does not capture fine details of the scene. Therefore a mesh refinement algorithm is usually applied; it improves the mesh resolution and accuracy by minimizing the photometric error induced by the 3D model into pairs of cameras. The choice of these pairs significantly affects the quality of the refinement and usually relies on sparse 3D points belonging to the surface. Instead, in this paper, to increase the quality of pairs selection, we exploit the 3D model (before the refinement) to compute five metrics: scene coverage, mutual image overlap, image resolution, camera parallax, and a new symmetry term. To improve the refinement robustness, we also propose an explicit method to manage occlusions, which may negatively affect the computation of the photometric error. The proposed method takes into account the depth of the model while computing the similarity measure and its gradient. We quantitatively and qualitatively validated our approach on publicly available datasets against state of the art reconstruction methods.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1905.08502v1
PDF https://arxiv.org/pdf/1905.08502v1.pdf
PWC https://paperswithcode.com/paper/mesh-based-camera-pairs-selection-and
Repo
Framework

BERT-Based Arabic Social Media Author Profiling

Title BERT-Based Arabic Social Media Author Profiling
Authors Chiyu Zhang, Muhammad Abdul-Mageed
Abstract We report our models for detecting age, language variety, and gender from social media data in the context of the Arabic author profiling and deception detection shared task (APDA). We build simple models based on pre-trained bidirectional encoders from transformers (BERT). We first fine-tune the pre-trained BERT model on each of the three datasets with shared task released data. Then we augment shared task data with in-house data for gender and dialect, showing the utility of augmenting training data. Our best models on the shared task test data are acquired with a majority voting of various BERT models trained under different data conditions. We acquire 54.72% accuracy for age, 93.75% for dialect, 81.67% for gender, and 40.97% joint accuracy across the three tasks.
Tasks Deception Detection
Published 2019-09-09
URL https://arxiv.org/abs/1909.04181v3
PDF https://arxiv.org/pdf/1909.04181v3.pdf
PWC https://paperswithcode.com/paper/bert-based-arabic-social-media
Repo
Framework
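
The abstract above reports that the best results come from majority voting over several BERT models. Below is a minimal sketch of per-example majority voting over label predictions. The function name and the toy labels are assumptions, and the tie-breaking rule (the label seen first among tied labels wins) is a choice made here, since the abstract does not say how ties were resolved.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine predictions from several models by per-example majority vote.

    predictions_per_model is a list with one list of labels per model,
    all of the same length. Ties go to the label encountered first.
    """
    voted = []
    for votes in zip(*predictions_per_model):
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical gender predictions from three models on three posts.
model_outputs = [
    ["male", "female", "male"],
    ["male", "male", "female"],
    ["female", "male", "female"],
]
ensemble = majority_vote(model_outputs)
print(ensemble)
```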

Cross-Lingual Contextual Word Embeddings Mapping With Multi-Sense Words In Mind

Title Cross-Lingual Contextual Word Embeddings Mapping With Multi-Sense Words In Mind
Authors Zheng Zhang, Ruiqing Yin, Jun Zhu, Pierre Zweigenbaum
Abstract Recent work in cross-lingual contextual word embedding learning cannot handle multi-sense words well. In this work, we explore the characteristics of contextual word embeddings and show the link between contextual word embeddings and word senses. We propose two solutions: treating contextual multi-sense word embeddings as noise (removal), and generating cluster-level average anchor embeddings for contextual multi-sense word embeddings (replacement). Experiments show that our solutions can improve the supervised contextual word embeddings alignment for multi-sense words from a microscopic perspective without hurting the macroscopic performance on the bilingual lexicon induction task. For unsupervised alignment, our methods significantly improve the performance on the bilingual lexicon induction task by more than 10 points.
Tasks Word Embeddings
Published 2019-09-18
URL https://arxiv.org/abs/1909.08681v1
PDF https://arxiv.org/pdf/1909.08681v1.pdf
PWC https://paperswithcode.com/paper/cross-lingual-contextual-word-embeddings
Repo
Framework

Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls

Title Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls
Authors Jiacheng Zhuo, Qi Lei, Alexandros G. Dimakis, Constantine Caramanis
Abstract Large-scale machine learning training suffers from two prior challenges, specifically for nuclear-norm constrained problems with distributed systems: the synchronization slowdown due to straggling workers, and high communication costs. In this work, we propose an asynchronous Stochastic Frank Wolfe (SFW-asyn) method, which, for the first time, solves the two problems simultaneously, while successfully maintaining the same convergence rate as the vanilla SFW. We implement our algorithm in Python (with MPI) to run on Amazon EC2, and demonstrate that SFW-asyn yields speed-ups almost linear in the number of machines compared to the vanilla SFW.
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.07703v1
PDF https://arxiv.org/pdf/1910.07703v1.pdf
PWC https://paperswithcode.com/paper/communication-efficient-asynchronous
Repo
Framework
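
For context on the Frank-Wolfe method named in the abstract above, here is a minimal sketch of one plain, synchronous Frank-Wolfe step over a nuclear-norm ball of radius `theta`. The linear minimization oracle over that ball returns the rank-one matrix -theta * u v^T built from the top singular vectors of the gradient. The sketch approximates those vectors with power iteration in pure Python. It does not implement the asynchronous SFW-asyn scheme, stochastic gradients, or MPI communication. All function names and the classic step size 2 / (k + 2) are assumptions.

```python
import math


def top_singular_pair(G, iters=200):
    """Approximate the top singular vectors (u, v) of G by power iteration."""
    rows, cols = len(G), len(G[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(G[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(G[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(G[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return u, v


def nuclear_ball_lmo(G, theta):
    """Minimizer of <G, S> over {S : ||S||_* <= theta}: S = -theta * u v^T."""
    u, v = top_singular_pair(G)
    return [[-theta * ui * vj for vj in v] for ui in u]


def frank_wolfe_step(X, G, theta, k):
    """One Frank-Wolfe update X <- (1 - gamma) X + gamma S, gamma = 2 / (k + 2)."""
    S = nuclear_ball_lmo(G, theta)
    gamma = 2.0 / (k + 2.0)
    return [
        [(1.0 - gamma) * x + gamma * s for x, s in zip(x_row, s_row)]
        for x_row, s_row in zip(X, S)
    ]


# Toy gradient whose top singular direction is the first coordinate axis.
gradient = [[3.0, 0.0], [0.0, 1.0]]
S = nuclear_ball_lmo(gradient, theta=2.0)
X_next = frank_wolfe_step([[0.0, 0.0], [0.0, 0.0]], gradient, theta=2.0, k=0)
print(S, X_next)
```

The oracle's answer is always rank one, so each update can be described by a pair of vectors rather than a full matrix. This is one reason Frank-Wolfe methods are attractive for nuclear-norm constrained problems where communication is costly.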

Automatic Hip Fracture Identification and Functional Subclassification with Deep Learning

Title Automatic Hip Fracture Identification and Functional Subclassification with Deep Learning
Authors Justin D Krogue, Kaiyang V Cheng, Kevin M Hwang, Paul Toogood, Eric G Meinberg, Erik J Geiger, Musa Zaid, Kevin C McGill, Rina Patel, Jae Ho Sohn, Alexandra Wright, Bryan F Darger, Kevin A Padrez, Eugene Ozhinsky, Sharmila Majumdar, Valentina Pedoia
Abstract Purpose: Hip fractures are a common cause of morbidity and mortality. Automatic identification and classification of hip fractures using deep learning may improve outcomes by reducing diagnostic errors and decreasing time to operation. Methods: Hip and pelvic radiographs from 1118 studies were reviewed and 3034 hips were labeled via bounding boxes and classified as normal, displaced femoral neck fracture, nondisplaced femoral neck fracture, intertrochanteric fracture, previous ORIF, or previous arthroplasty. A deep learning-based object detection model was trained to automate the placement of the bounding boxes. A Densely Connected Convolutional Neural Network (DenseNet) was trained on a subset of the bounding box images, and its performance evaluated on a held out test set and by comparison on a 100-image subset to two groups of human observers: fellowship-trained radiologists and orthopaedists, and senior residents in emergency medicine, radiology, and orthopaedics. Results: The binary accuracy for fracture of our model was 93.8% (95% CI, 91.3-95.8%), with sensitivity of 92.7% (95% CI, 88.7-95.6%), and specificity 95.0% (95% CI, 91.5-97.3%). Multiclass classification accuracy was 90.4% (95% CI, 87.4-92.9%). When compared to human observers, our model achieved at least expert-level classification under all conditions. Additionally, when the model was used as an aid, human performance improved, with aided resident performance approximating unaided fellowship-trained expert performance. Conclusions: Our deep learning model identified and classified hip fractures with at least expert-level accuracy, and when used as an aid improved human performance, with aided resident performance approximating that of unaided fellowship-trained attendings.
Tasks Object Detection
Published 2019-09-10
URL https://arxiv.org/abs/1909.06326v1
PDF https://arxiv.org/pdf/1909.06326v1.pdf
PWC https://paperswithcode.com/paper/automatic-hip-fracture-identification-and
Repo
Framework