October 19, 2019

2929 words 14 mins read

Paper Group ANR 116


Chittron: An Automatic Bangla Image Captioning System

Title Chittron: An Automatic Bangla Image Captioning System
Authors Motiur Rahman, Nabeel Mohammed, Nafees Mansoor, Sifat Momen
Abstract Automatic image caption generation aims to produce an accurate description of an image in natural language automatically. However, Bangla, the fifth most widely spoken language in the world, lags considerably behind in research and development in this domain. Moreover, while there are many established data sets related to image annotation in English, no such resource exists for Bangla yet. Hence, this paper outlines the development of “Chittron”, an automatic image captioning system in Bangla. To address the data set availability issue, a collection of 16,000 Bangladeshi contextual images has been accumulated and manually annotated in Bangla. This data set is then used to train a model which integrates a pre-trained VGG16 image embedding model with stacked LSTM layers. The model is trained to predict the caption one word at a time, given an image as input. The results show that the model has successfully learned a working language model and can generate captions of images quite accurately in many cases. The results are evaluated mainly qualitatively, although BLEU scores are also reported. A better result is expected with a bigger and more varied data set.
Tasks Image Captioning, Language Modelling
Published 2018-09-02
URL http://arxiv.org/abs/1809.00339v1
PDF http://arxiv.org/pdf/1809.00339v1.pdf
PWC https://paperswithcode.com/paper/chittron-an-automatic-bangla-image-captioning
Repo
Framework
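
As a rough illustration of the architecture this abstract describes, the sketch below wires a frozen VGG16 image embedding into stacked LSTM layers that predict the next caption word. It is a minimal sketch in Keras; the vocabulary size, embedding width, caption length, and the exact way the image code is fused with the text stream are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a VGG16 + stacked-LSTM captioner, assuming Keras.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import (LSTM, Concatenate, Dense, Embedding,
                                     Input, RepeatVector)
from tensorflow.keras.models import Model

VOCAB, MAXLEN, DIM = 8000, 20, 256  # illustrative sizes, not from the paper

# Frozen VGG16 provides the 4096-d fc2 feature used as the image embedding.
backbone = VGG16(weights="imagenet", include_top=True)
feature_extractor = Model(backbone.input, backbone.layers[-2].output)
feature_extractor.trainable = False

img_feat = Input(shape=(4096,))               # precomputed VGG16 fc2 feature
img_vec = Dense(DIM, activation="relu")(img_feat)
img_seq = RepeatVector(MAXLEN)(img_vec)       # repeat image code per time step

words_in = Input(shape=(MAXLEN,))             # the caption generated so far
word_emb = Embedding(VOCAB, DIM)(words_in)

x = Concatenate()([img_seq, word_emb])        # fuse image and text streams
x = LSTM(512, return_sequences=True)(x)       # stacked LSTM layers
x = LSTM(512)(x)
next_word = Dense(VOCAB, activation="softmax")(x)  # one word at a time

model = Model([img_feat, words_in], next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```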

Rank Minimization on Tensor Ring: A New Paradigm in Scalable Tensor Decomposition and Completion

Title Rank Minimization on Tensor Ring: A New Paradigm in Scalable Tensor Decomposition and Completion
Authors Longhao Yuan, Chao Li, Danilo Mandic, Jianting Cao, Qibin Zhao
Abstract In low-rank tensor completion tasks, traditional methods suffer from high computational cost and high sensitivity to model complexity, owing to their multiple large-scale singular value decomposition (SVD) operations and the rank selection problem. In this paper, taking advantage of the high compressibility of the recently proposed tensor ring (TR) decomposition, we propose a new model for the tensor completion problem. This is achieved by introducing convex surrogates of the tensor low-rank assumption on the latent tensor ring factors, which allows Schatten-norm-regularized models to be solved at a much smaller scale. We propose two algorithms that apply different structured Schatten norms to the tensor ring factors. Using an alternating direction method of multipliers (ADMM) scheme, the tensor ring factors and the predicted tensor are optimized simultaneously. Experiments on synthetic and real-world data show the high performance and efficiency of the proposed approach.
Tasks
Published 2018-05-22
URL http://arxiv.org/abs/1805.08468v1
PDF http://arxiv.org/pdf/1805.08468v1.pdf
PWC https://paperswithcode.com/paper/rank-minimization-on-tensor-ring-a-new
Repo
Framework
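
The scalability argument rests on applying nuclear/Schatten-norm regularization to the small TR factors rather than to the full tensor. Below is a minimal numpy sketch of the singular-value-thresholding proximal step that such ADMM solvers repeat, applied to one unfolding of a TR core; the shapes and threshold are illustrative, not the paper's.

```python
import numpy as np

def svt(mat, tau):
    """Proximal operator of tau * (nuclear norm): shrink singular values."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    return u @ np.diag(np.maximum(s - tau, 0.0)) @ vt

# A TR core G_k has shape (r, n_k, r); thresholding its mode-2 unfolding
# needs an SVD of an n_k x r^2 matrix instead of an SVD of the full tensor.
r, n_k = 5, 50
G = np.random.randn(r, n_k, r)
G_unf = G.transpose(1, 0, 2).reshape(n_k, r * r)   # mode-2 unfolding
G = svt(G_unf, tau=0.1).reshape(n_k, r, r).transpose(1, 0, 2)
```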

RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks

Title RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks
Authors Jinsung Yoon, James Jordon, Mihaela van der Schaar
Abstract Training complex machine learning models for prediction often requires a large amount of data that is not always readily available. Leveraging external datasets from related but different sources is therefore an important task if good predictive models are to be built for deployment in settings where data can be rare. In this paper we propose a novel approach to this problem, using multiple GAN architectures to learn to translate from one dataset to another, thereby allowing us to effectively enlarge the target dataset and learn better predictive models than we could from the target dataset alone. We demonstrate that our method improves prediction performance on the target domain over using just the target dataset, and that our framework outperforms several other benchmarks on a collection of real-world medical datasets.
Tasks
Published 2018-02-18
URL http://arxiv.org/abs/1802.06403v2
PDF http://arxiv.org/pdf/1802.06403v2.pdf
PWC https://paperswithcode.com/paper/radialgan-leveraging-multiple-datasets-to
Repo
Framework
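
Conceptually, once per-domain encoders into a shared latent space and a target-domain decoder are trained, the augmentation amounts to translate-and-pool. The sketch below shows only that pooling step under stated assumptions; the affine stand-ins for the encoders and decoder are hypothetical placeholders, not the paper's adversarially trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for trained translators: E_i maps domain i into the
# shared latent space, decode_target maps latents into the target domain.
encoders = {i: (lambda x, s=i: x + 0.1 * s) for i in range(3)}
decode_target = lambda z: 0.9 * z

def augment_target(source_datasets, target_data):
    """Translate every source record into the target domain and pool."""
    translated = [decode_target(encoders[i](x))
                  for i, x in source_datasets.items()]
    return np.vstack([target_data] + translated)

sources = {i: rng.normal(size=(100, 8)) for i in range(3)}
target = rng.normal(size=(50, 8))
enlarged = augment_target(sources, target)  # 350 x 8 augmented training set
```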

On the Latent Space of Wasserstein Auto-Encoders

Title On the Latent Space of Wasserstein Auto-Encoders
Authors Paul K. Rubenstein, Bernhard Schoelkopf, Ilya Tolstikhin
Abstract We study the role of latent space dimensionality in Wasserstein auto-encoders (WAEs). Through experimentation on synthetic and real datasets, we argue that random encoders should be preferred over deterministic encoders. We highlight the potential of WAEs for representation learning with promising results on a benchmark disentanglement task.
Tasks Representation Learning
Published 2018-02-11
URL http://arxiv.org/abs/1802.03761v1
PDF http://arxiv.org/pdf/1802.03761v1.pdf
PWC https://paperswithcode.com/paper/on-the-latent-space-of-wasserstein-auto
Repo
Framework
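
For context, a WAE regularizes the aggregate posterior toward the prior, e.g. with an MMD penalty; the numpy sketch below computes such a penalty for samples from a random (Gaussian) encoder of the kind the paper argues for. The kernel, bandwidth, and dimensions are illustrative assumptions.

```python
import numpy as np

def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets, RBF kernel."""
    def gram(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()

rng = np.random.default_rng(0)
mu = rng.normal(size=(64, 8))                  # encoder means (illustrative)
std = 0.5 * np.ones_like(mu)                   # nonzero std: random encoder
z_post = mu + std * rng.normal(size=mu.shape)  # samples from q(z|x)
z_prior = rng.normal(size=(64, 8))             # N(0, I) prior samples
penalty = rbf_mmd2(z_post, z_prior)            # added to reconstruction loss
```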

Efficient collective swimming by harnessing vortices through deep reinforcement learning

Title Efficient collective swimming by harnessing vortices through deep reinforcement learning
Authors Siddhartha Verma, Guido Novati, Petros Koumoutsakos
Abstract Fish in schooling formations navigate complex flow-fields replete with mechanical energy in the vortex wakes of their companions. Their schooling behaviour has been associated with evolutionary advantages including collective energy savings. How fish harvest energy from their complex fluid environment, and the underlying physical mechanisms governing energy extraction during collective swimming, remain unknown. Here we show that fish can improve their sustained propulsive efficiency by actively following, and judiciously intercepting, vortices in the wake of other swimmers. This swimming strategy leads to collective energy savings and is revealed through the first combination of deep reinforcement learning with high-fidelity flow simulations. We find that a ‘smart swimmer’ can adapt its position and body deformation to synchronise with the momentum of the oncoming vortices, improving its average swimming efficiency at no cost to the leader. The results show that fish may harvest energy deposited in vortices produced by their peers, and support the conjecture that swimming in formation is energetically advantageous. Moreover, this study demonstrates that deep reinforcement learning can produce navigation algorithms for complex flow-fields, with promising implications for energy savings in autonomous robotic swarms.
Tasks
Published 2018-02-07
URL http://arxiv.org/abs/1802.02674v1
PDF http://arxiv.org/pdf/1802.02674v1.pdf
PWC https://paperswithcode.com/paper/efficient-collective-swimming-by-harnessing
Repo
Framework

Deep Learning for Energy Markets

Title Deep Learning for Energy Markets
Authors Michael Polson, Vadim Sokolov
Abstract Deep Learning is applied to energy markets to predict extreme loads observed in energy grids. Forecasting energy loads and prices is challenging due to sharp peaks and troughs that arise from supply and demand fluctuations under intraday system constraints. We propose deep spatio-temporal models and extreme value theory (EVT) to capture these effects, in particular the tail behavior of load spikes. Deep LSTM architectures with ReLU and $\tanh$ activation functions can model trends and temporal dependencies, while EVT captures highly volatile load spikes above a pre-specified threshold. To illustrate our methodology, we use hourly price and demand data from 4719 nodes of the PJM interconnection to construct a deep predictor. We show that DL-EVT outperforms traditional Fourier time series methods, both in- and out-of-sample, by capturing the observed nonlinearities in prices. Finally, we conclude with directions for future research.
Tasks Time Series
Published 2018-08-16
URL http://arxiv.org/abs/1808.05527v3
PDF http://arxiv.org/pdf/1808.05527v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-energy-markets
Repo
Framework
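
The EVT component follows the standard peaks-over-threshold recipe: exceedances above a pre-specified threshold are fit with a generalized Pareto distribution. A small sketch with scipy on synthetic load data follows; the threshold quantile and the gamma-distributed stand-in for hourly load are our assumptions, not the paper's data.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
load = rng.gamma(shape=2.0, scale=50.0, size=10_000)  # synthetic hourly load
u = np.quantile(load, 0.95)          # pre-specified threshold (our choice)
excess = load[load > u] - u          # peaks over the threshold

# Fit the generalized Pareto tail; shape c and scale describe spike behavior.
c, _, scale = genpareto.fit(excess, floc=0.0)

# Tail probability of a spike above a high level x:
# P(load > x) = P(load > u) * P(excess > x - u).
x = 400.0
p_spike = 0.05 * genpareto.sf(x - u, c, loc=0.0, scale=scale)
print(f"estimated P(load > {x}) = {p_spike:.2e}")
```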

Digital holographic particle volume reconstruction using a deep neural network

Title Digital holographic particle volume reconstruction using a deep neural network
Authors Tomoyoshi Shimobaba, Takayuki Takahashi, Yota Yamamoto, Yutaka Endo, Atsushi Shiraki, Takashi Nishitsuji, Naoto Hoshikawa, Takashi Kakue, Tomoyosh Ito
Abstract This paper proposes particle volume reconstruction directly from an in-line hologram using a deep neural network. Digital holographic volume reconstruction conventionally uses multiple diffraction calculations to obtain sectional reconstructed images from an in-line hologram, followed by detection of the lateral and axial positions and the sizes of particles using focus metrics. However, the axial resolution is limited by the numerical aperture of the optical system, and the process is time-consuming. The method proposed here can simultaneously detect the lateral and axial positions and the particle sizes via a deep neural network (DNN). We numerically investigated the performance of the DNN in terms of the errors in the detected positions and sizes. The calculation time is faster than conventional diffraction-based approaches.
Tasks
Published 2018-10-21
URL http://arxiv.org/abs/1810.09444v1
PDF http://arxiv.org/pdf/1810.09444v1.pdf
PWC https://paperswithcode.com/paper/digital-holographic-particle-volume
Repo
Framework
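
The abstract does not specify the network, so the sketch below is only a generic Keras CNN regressor of the kind that could map a single-particle hologram crop to lateral/axial position and size; every shape here is an illustrative assumption rather than the paper's architecture.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(256, 256, 1)),      # in-line hologram (illustrative)
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(4),                        # (x, y, z, size) of one particle
])
model.compile(optimizer="adam", loss="mse")  # regress positions and size
```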

Kernel Regression for Graph Signal Prediction in Presence of Sparse Noise

Title Kernel Regression for Graph Signal Prediction in Presence of Sparse Noise
Authors Arun Venkitaraman, Pascal Frossard, Saikat Chatterjee
Abstract In the presence of sparse noise, we propose kernel regression for predicting output vectors that are smooth over a given graph. Sparse noise models training outputs corrupted by either missing samples or large perturbations. Sparse noise is handled through appropriate use of the $\ell_1$-norm alongside the $\ell_2$-norm in a convex cost function. To optimize the cost function, we propose an iteratively reweighted least-squares (IRLS) approach that is suitable for kernel substitution (the kernel trick) owing to the availability of a closed-form solution. Simulations using real-world temperature data show the efficacy of our proposed method, mainly for limited-size training datasets.
Tasks
Published 2018-11-06
URL http://arxiv.org/abs/1811.02314v1
PDF http://arxiv.org/pdf/1811.02314v1.pdf
PWC https://paperswithcode.com/paper/kernel-regression-for-graph-signal-prediction
Repo
Framework
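
A minimal numpy sketch of the IRLS idea: the $\ell_1$ data-fit term is majorized by a reweighted least-squares problem whose closed-form solution involves only the kernel matrix, which is what makes the kernel trick applicable. This simplified version predicts a scalar output rather than graph-smooth vectors, and all hyperparameters are illustrative.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def irls_kernel_l1(K, y, lam=0.1, iters=30, eps=1e-6):
    """Minimize ||y - K a||_1 + lam * a^T K a via reweighted least squares."""
    n = len(y)
    alpha = np.linalg.solve(K + lam * np.eye(n), y)    # plain l2 warm start
    for _ in range(iters):
        r = y - K @ alpha
        W = np.diag(1.0 / np.maximum(np.abs(r), eps))  # |r| ~= w * r^2
        alpha = np.linalg.solve(W @ K + lam * np.eye(n), W @ y)
    return alpha

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(80, 1))
y = np.sin(X[:, 0])
y[rng.choice(80, size=8, replace=False)] += 5.0        # sparse corruption
alpha = irls_kernel_l1(rbf_kernel(X, X), y)
y_hat = rbf_kernel(X, X) @ alpha                       # robust fit
```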

Topological Constraints on Homeomorphic Auto-Encoding

Title Topological Constraints on Homeomorphic Auto-Encoding
Authors Pim de Haan, Luca Falorsi
Abstract When doing representation learning on data that lives on a known non-trivial manifold embedded in high dimensional space, it is natural to desire the encoder to be homeomorphic when restricted to the manifold, so that it is bijective and continuous with a continuous inverse. Using topological arguments, we show that when the manifold is non-trivial, the encoder must be globally discontinuous and propose a universal, albeit impractical, construction. In addition, we derive necessary constraints which need to be satisfied when designing manifold-specific practical encoders. These are used to analyse candidates for a homeomorphic encoder for the manifold of 3D rotations $SO(3)$.
Tasks Representation Learning
Published 2018-12-27
URL http://arxiv.org/abs/1812.10783v1
PDF http://arxiv.org/pdf/1812.10783v1.pdf
PWC https://paperswithcode.com/paper/topological-constraints-on-homeomorphic-auto
Repo
Framework

Learning Parametric Closed-Loop Policies for Markov Potential Games

Title Learning Parametric Closed-Loop Policies for Markov Potential Games
Authors Sergio Valcarcel Macua, Javier Zazo, Santiago Zazo
Abstract Multiagent systems where agents interact among themselves and with a stochastic environment can be formalized as stochastic games. We study a subclass named Markov potential games (MPGs) that appear often in economic and engineering applications when the agents share a common resource. We consider MPGs with continuous state-action variables, coupled constraints and nonconvex rewards. Previous analysis followed a variational approach that is only valid for very simple cases (convex rewards, invertible dynamics, and no coupled constraints); or considered deterministic dynamics and provided open-loop (OL) analysis, studying strategies that consist of predefined action sequences, which are not optimal for stochastic environments. We present a closed-loop (CL) analysis for MPGs and consider parametric policies that depend on the current state. We provide easily verifiable, sufficient and necessary conditions for a stochastic game to be an MPG, even for complex parametric functions (e.g., deep neural networks); and show that a closed-loop Nash equilibrium (NE) can be found (or at least approximated) by solving a related optimal control problem (OCP). This is useful since solving an OCP–which is a single-objective problem–is usually much simpler than solving the original set of coupled OCPs that form the game–which is a multiobjective control problem. This is a considerable improvement over the previously standard approach for the CL analysis of MPGs, which gives no approximate solution if no NE belongs to the chosen parametric family, and which is practical only for simple parametric forms. We illustrate the theoretical contributions by applying our approach to a noncooperative communications engineering game. We then solve the game with a deep reinforcement learning algorithm that learns policies that closely approximate an exact variational NE of the game.
Tasks
Published 2018-02-03
URL http://arxiv.org/abs/1802.00899v2
PDF http://arxiv.org/pdf/1802.00899v2.pdf
PWC https://paperswithcode.com/paper/learning-parametric-closed-loop-policies-for
Repo
Framework

Distill-Net: Application-Specific Distillation of Deep Convolutional Neural Networks for Resource-Constrained IoT Platforms

Title Distill-Net: Application-Specific Distillation of Deep Convolutional Neural Networks for Resource-Constrained IoT Platforms
Authors Mohammad Motamedi, Felix Portillo, Daniel Fong, Soheil Ghiasi
Abstract Many Internet-of-Things (IoT) applications demand fast and accurate understanding of a few key events in their surrounding environment. Deep Convolutional Neural Networks (CNNs) have emerged as an effective approach to understand speech, images, and similar high dimensional data types. Algorithmic performance of modern CNNs, however, fundamentally relies on learning class-agnostic hierarchical features that only exist in comprehensive training datasets with many classes. As a result, fast inference using CNNs trained on such datasets is prohibitive for most resource-constrained IoT platforms. To bridge this gap, we present a principled and practical methodology for distilling a complex modern CNN that is trained to effectively recognize many different classes of input data into an application-dependent essential core that not only recognizes the few classes of interest to the application accurately, but also runs efficiently on platforms with limited resources. Experimental results confirm that our approach strikes a favorable balance between classification accuracy (application constraint), inference efficiency (platform constraint), and productive development of new applications (business constraint).
Tasks
Published 2018-12-16
URL http://arxiv.org/abs/1812.07390v1
PDF http://arxiv.org/pdf/1812.07390v1.pdf
PWC https://paperswithcode.com/paper/distill-net-application-specific-distillation
Repo
Framework
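
One plausible reading of application-specific distillation is to renormalize the teacher's soft predictions over the few classes of interest (plus a catch-all "other" bucket, which is our assumption) and train a small student against them with the usual temperature-scaled soft cross-entropy. The numpy sketch below implements that loss; it is not claimed to be the paper's exact objective.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def app_specific_kd_loss(teacher_logits, student_logits, keep, T=4.0):
    """Soft cross-entropy against teacher probabilities renormalized onto the
    application's classes of interest plus one catch-all 'other' bucket."""
    p_full = softmax(teacher_logits, T)                # teacher, all classes
    p_keep = p_full[:, keep]                           # classes the app needs
    p_other = 1.0 - p_keep.sum(axis=1, keepdims=True)  # mass of the rest
    targets = np.hstack([p_keep, p_other])             # (batch, k + 1)
    q = softmax(student_logits, T)                     # student: k + 1 outputs
    return -(targets * np.log(q + 1e-12)).sum(axis=1).mean()

rng = np.random.default_rng(0)
loss = app_specific_kd_loss(rng.normal(size=(32, 1000)),  # 1000-class teacher
                            rng.normal(size=(32, 6)),     # 5 classes + other
                            keep=[3, 17, 42, 99, 500])
```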

Object Tracking in Satellite Videos Based on a Multi-Frame Optical Flow Tracker

Title Object Tracking in Satellite Videos Based on a Multi-Frame Optical Flow Tracker
Authors Bo Du, Shihan Cai, Chen Wu, Liangpei Zhang, Dacheng Tao
Abstract Object tracking is a hot topic in computer vision. Thanks to the rapid development of very high resolution (VHR) remote sensing techniques, it is now possible to track targets of interest in satellite videos. However, since targets in satellite videos are usually very small relative to the entire image and very similar to the background, most state-of-the-art algorithms fail to track them with satisfactory accuracy. Because optical flow shows great potential for detecting even slight movements of targets, we propose a multi-frame optical flow tracker (MOFT) for object tracking in satellite videos. The Lucas-Kanade optical flow method is fused with the HSV color system and integral images to track targets, while a multi-frame difference method is utilized in the optical flow tracker for better interpretation. Experiments with three VHR remote sensing satellite video datasets indicate that, compared with state-of-the-art object tracking algorithms, the proposed method tracks targets more accurately.
Tasks Object Tracking, Optical Flow Estimation
Published 2018-04-25
URL http://arxiv.org/abs/1804.09323v1
PDF http://arxiv.org/pdf/1804.09323v1.pdf
PWC https://paperswithcode.com/paper/object-tracking-in-satellite-videos-based-on
Repo
Framework
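
The Lucas-Kanade core named in the abstract is available off the shelf in OpenCV; a minimal tracking loop is sketched below. The HSV fusion, integral images, and multi-frame differencing that make up the rest of MOFT are omitted, and the input filename is hypothetical.

```python
import cv2

cap = cv2.VideoCapture("satellite_video.mp4")   # hypothetical input file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
                              qualityLevel=0.01, minDistance=5)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Pyramidal Lucas-Kanade flow from the previous frame's keypoints.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None,
                                              winSize=(15, 15), maxLevel=3)
    pts = nxt[status.ravel() == 1].reshape(-1, 1, 2)  # keep tracked points
    prev_gray = gray
```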

Towards security defect prediction with AI

Title Towards security defect prediction with AI
Authors Carson D. Sestili, William S. Snavely, Nathan M. VanHoudnos
Abstract In this study, we investigate the limits of the current state-of-the-art AI system for detecting buffer overflows and compare it with current static analysis tools. To do so, we developed a code generator, s-bAbI, capable of producing an arbitrarily large number of code samples of controlled complexity. We found that the static analysis engines we examined have good precision but poor recall on this dataset, except for a sound static analyzer that has good precision and recall. We found that the state-of-the-art AI system, a memory network modeled after Choi et al. [1], can achieve similar performance to the static analysis engines, but requires an exhaustive amount of training data to do so. Our work points towards future approaches that may solve these problems, namely using representations of code that can capture appropriate scope information and using deep learning methods that are able to perform arithmetic operations.
Tasks
Published 2018-08-29
URL http://arxiv.org/abs/1808.09897v2
PDF http://arxiv.org/pdf/1808.09897v2.pdf
PWC https://paperswithcode.com/paper/towards-security-defect-prediction-with-ai
Repo
Framework

Leveraging Knowledge Graph Embedding Techniques for Industry 4.0 Use Cases

Title Leveraging Knowledge Graph Embedding Techniques for Industry 4.0 Use Cases
Authors Martina Garofalo, Maria Angela Pellegrino, Abdulrahman Altabba, Michael Cochez
Abstract Industry is evolving towards Industry 4.0, which holds the promise of increased flexibility in manufacturing, better quality, and improved productivity. A core driver of this growth is the use of sensors, which must capture data that can be used in unforeseen ways to achieve a performance not achievable without them. However, the complexity of this improved setting is much greater than what is currently found in practice. Hence, management can no longer be performed by the human labor force alone; part of it must be done by automated algorithms instead. A natural way to represent the data generated by this large number of sensors, which do not measure independent variables, and the interactions of the different devices is a graph data model. Machine learning could then be used to aid the Industry 4.0 system to, for example, perform predictive maintenance. However, machine learning directly on graphs requires feature engineering and has scalability issues. In this paper we discuss methods to convert (embed) the graph into a vector space, such that it becomes feasible to use traditional machine learning methods in Industry 4.0 settings.
Tasks Feature Engineering, Graph Embedding, Knowledge Graph Embedding
Published 2018-07-31
URL http://arxiv.org/abs/1808.00434v1
PDF http://arxiv.org/pdf/1808.00434v1.pdf
PWC https://paperswithcode.com/paper/leveraging-knowledge-graph-embedding
Repo
Framework
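
As a concrete instance of the knowledge-graph embedding techniques this survey covers, the sketch below performs one margin-ranking SGD step of TransE, which scores a (head, relation, tail) triple by ||h + r - t|| and pushes true triples below corrupted ones. All sizes and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ent, n_rel, dim, margin, lr = 1000, 20, 50, 1.0, 0.01
E = rng.normal(scale=0.1, size=(n_ent, dim))  # entity embeddings
R = rng.normal(scale=0.1, size=(n_rel, dim))  # relation embeddings

def transe_step(h, r, t, t_neg):
    """One margin-ranking SGD step on a true triple and a corrupted tail."""
    pos = E[h] + R[r] - E[t]
    neg = E[h] + R[r] - E[t_neg]
    if margin + np.linalg.norm(pos) - np.linalg.norm(neg) > 0:  # active hinge
        g_pos = pos / (np.linalg.norm(pos) + 1e-9)  # gradient of ||pos||
        g_neg = neg / (np.linalg.norm(neg) + 1e-9)
        E[h] -= lr * (g_pos - g_neg)
        R[r] -= lr * (g_pos - g_neg)
        E[t] += lr * g_pos
        E[t_neg] -= lr * g_neg

transe_step(h=0, r=3, t=42, t_neg=rng.integers(n_ent))
```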

Divergence Prior and Vessel-tree Reconstruction

Title Divergence Prior and Vessel-tree Reconstruction
Authors Zhongwen Zhang, Egor Chesakov, Dmitrii Marin, Yuri Boykov
Abstract We propose a new geometric regularization principle for reconstructing vector fields based on prior knowledge about their divergence. As one important example of this general idea, we focus on vector fields modelling blood flow patterns, which should be divergent in arteries and convergent in veins. We show that this previously ignored regularization constraint can significantly improve the quality of vessel tree reconstruction, particularly around bifurcations where non-zero divergence is concentrated. Our divergence prior is critical for resolving (binary) sign ambiguity in flow orientations produced by standard vessel filters, e.g. Frangi. Our vessel tree centerline reconstruction combines divergence constraints with robust curvature regularization. Our unsupervised method can reconstruct complete vessel trees with near-capillary details on synthetic and real 3D volumes.
Tasks
Published 2018-11-24
URL http://arxiv.org/abs/1811.09745v1
PDF http://arxiv.org/pdf/1811.09745v1.pdf
PWC https://paperswithcode.com/paper/divergence-prior-and-vessel-tree
Repo
Framework
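
To make the prior concrete, the sketch below computes the discrete divergence of a 3D vector field with central differences; a radially expanding field (artery-like, positive divergence) serves as a synthetic example. The field and grid size are illustrative, not the paper's data.

```python
import numpy as np

def divergence(v):
    """v: (3, Z, Y, X) vector field; central-difference discrete divergence."""
    return sum(np.gradient(v[i], axis=i) for i in range(3))

z, y, x = np.meshgrid(np.arange(16.0), np.arange(16.0), np.arange(16.0),
                      indexing="ij")
source = np.stack([z - 8.0, y - 8.0, x - 8.0])  # radially expanding flow
print(divergence(source).mean())                # ~3.0: divergent, artery-like
```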