January 26, 2020


Paper Group ANR 1497


Performing Arithmetic Using a Neural Network Trained on Digit Permutation Pairs


Title	Performing Arithmetic Using a Neural Network Trained on Digit Permutation Pairs
Authors	Marcus D. Bloice, Peter M. Roth, Andreas Holzinger
Abstract	In this paper a neural network is trained to perform simple arithmetic using images of concatenated handwritten digit pairs. A convolutional neural network was trained with images consisting of two side-by-side handwritten digits, where the image’s label is the summation of the two digits contained in the combined image. Crucially, the network was tested on permutation pairs that were not present during training in an effort to see if the network could learn the task of addition, as opposed to simply mapping images to labels. A dataset was generated for all possible permutation pairs of length 2 for the digits 0-9 using MNIST as a basis for the images, with one thousand samples generated for each permutation pair. For testing the network, samples generated from previously unseen permutation pairs were fed into the trained network, and its predictions were measured. Results were encouraging, with the network achieving an accuracy of over 90% on some permutation train/test splits. This suggests that the network first learned digit recognition, and subsequently the further task of addition based on the two recognised digits. As far as the authors are aware, no previous work has concentrated on learning a mathematical operation in this way.
Tasks
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03035v1
PDF	https://arxiv.org/pdf/1912.03035v1.pdf
PWC	https://paperswithcode.com/paper/performing-arithmetic-using-a-neural-network
Repo
Framework
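
The train/test protocol described in the abstract above can be sketched as follows. This is only an illustration, not the authors' code: it assumes that "permutation pairs" means all 100 ordered pairs of digits (repeats included), and the helper names split_digit_pairs, pair_label and concat_images are hypothetical. Images are represented as plain lists of pixel rows so the sketch needs no MNIST loader.

```python
import itertools
import random


def split_digit_pairs(test_fraction=0.1, seed=0):
    """Split all ordered digit pairs into disjoint train and test sets.

    Assumption: pairs include repeated digits, giving 10 * 10 = 100 pairs.
    """
    pairs = list(itertools.product(range(10), repeat=2))
    rng = random.Random(seed)
    rng.shuffle(pairs)
    n_test = int(len(pairs) * test_fraction)
    # Test pairs are never shown during training, so the network
    # cannot solve them by memorising an image-to-label mapping.
    return pairs[n_test:], pairs[:n_test]


def pair_label(pair):
    """The label of a combined image is the sum of its two digits."""
    return pair[0] + pair[1]


def concat_images(left, right):
    """Place two equally tall images (lists of pixel rows) side by side."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]
```

Holding out whole pairs, rather than individual images, is what separates "learned addition" from "memorised labels": a test pair such as (3, 4) may share its label 7 with training pairs like (2, 5), but its image content is new.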

A Closer Look at Disentangling in $\beta$-VAE


Title	A Closer Look at Disentangling in $\beta$-VAE
Authors	Harshvardhan Sikka, Weishun Zhong, Jun Yin, Cengiz Pehlevan
Abstract	In many data analysis tasks, it is beneficial to learn representations where each dimension is statistically independent and thus disentangled from the others. If data generating factors are also statistically independent, disentangled representations can be formed by Bayesian inference of latent variables. We examine a generalization of the Variational Autoencoder (VAE), $\beta$-VAE, for learning such representations using variational inference. $\beta$-VAE enforces conditional independence of its bottleneck neurons controlled by its hyperparameter $\beta$. This condition is in general not compatible with the statistical independence of latents. By providing analytical and numerical arguments, we show that this incompatibility leads to a non-monotonic inference performance in $\beta$-VAE with a finite optimal $\beta$.
Tasks	Bayesian Inference
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05127v1
PDF	https://arxiv.org/pdf/1912.05127v1.pdf
PWC	https://paperswithcode.com/paper/a-closer-look-at-disentangling-in-vae
Repo
Framework
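
For readers unfamiliar with the role of $\beta$, here is a minimal sketch of the standard $\beta$-VAE training loss. It assumes the common setup of a diagonal Gaussian posterior and a standard normal prior, which the abstract does not spell out; the function names gaussian_kl and beta_vae_loss are hypothetical and this is not the authors' implementation.

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def beta_vae_loss(recon_error, mu, log_var, beta):
    """Loss to minimise: reconstruction error plus beta times the KL term.

    beta = 1 recovers the ordinary VAE objective; larger beta pushes the
    posterior harder towards the factorised prior.
    """
    return recon_error + beta * gaussian_kl(mu, log_var)
```

With a unit-variance posterior whose mean is 1 in a single latent dimension, the KL term is 0.5, so beta scales how much that deviation from the prior costs relative to reconstruction error.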

Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity


Title	Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
Authors	Deepak Pathak, Chris Lu, Trevor Darrell, Phillip Isola, Alexei A. Efros
Abstract	Contemporary sensorimotor learning approaches typically start with an existing complex agent (e.g., a robotic arm), which they learn to control. In contrast, this paper investigates a modular co-evolution strategy: a collection of primitive agents learns to dynamically self-assemble into composite bodies while also learning to coordinate their behavior to control these bodies. Each primitive agent consists of a limb with a motor attached at one end. Limbs may choose to link up to form collectives. When a limb initiates a link-up action, and there is another limb nearby, the latter is magnetically connected to the ‘parent’ limb’s motor. This forms a new single agent, which may further link with other agents. In this way, complex morphologies can emerge, controlled by a policy whose architecture is in explicit correspondence with the morphology. We evaluate the performance of these dynamic and modular agents in simulated environments. We demonstrate better generalization to test-time changes both in the environment, as well as in the structure of the agent, compared to static and monolithic baselines. Project video and code are available at https://pathak22.github.io/modular-assemblies/
Tasks
Published	2019-02-14
URL	https://arxiv.org/abs/1902.05546v2
PDF	https://arxiv.org/pdf/1902.05546v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-control-self-assembling
Repo
Framework

SelectNet: Learning to Sample from the Wild for Imbalanced Data Training


Title	SelectNet: Learning to Sample from the Wild for Imbalanced Data Training
Authors	Yunru Liu, Tingran Gao, Haizhao Yang
Abstract	Supervised learning from training data with imbalanced class sizes, a commonly encountered scenario in real applications such as anomaly/fraud detection, has long been considered a significant challenge in machine learning. Motivated by recent progress in curriculum and self-paced learning, we propose to adopt a semi-supervised learning paradigm by training a deep neural network, referred to as SelectNet, to selectively add unlabelled data together with their predicted labels to the training dataset. Unlike existing techniques designed to tackle the difficulty in dealing with class imbalanced training data such as resampling, cost-sensitive learning, and margin-based learning, SelectNet provides an end-to-end approach for learning from important unlabelled data “in the wild” that most likely belong to the under-sampled classes in the training data, thus gradually mitigating the imbalance in the data used for training the classifier. We demonstrate the efficacy of SelectNet through extensive numerical experiments on standard datasets in computer vision.
Tasks	Fraud Detection
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09872v1
PDF	https://arxiv.org/pdf/1905.09872v1.pdf
PWC	https://paperswithcode.com/paper/selectnet-learning-to-sample-from-the-wild
Repo
Framework

Tracking the Consumption Junction: Temporal Dependencies between Articles and Advertisements in Dutch Newspapers


Title	Tracking the Consumption Junction: Temporal Dependencies between Articles and Advertisements in Dutch Newspapers
Authors	Melvin Wevers, Jianbo Gao, Kristoffer L. Nielbo
Abstract	Historians have regularly debated whether advertisements can be used as a viable source to study the past. Their main concern centered on the question of agency. Were advertisements a reflection of historical events and societal debates, or were ad makers instrumental in shaping society and the ways people interacted with consumer goods? Using techniques from econometrics (Granger causality test) and complexity science (Adaptive Fractal Analysis), this paper analyzes to what extent advertisements shaped or reflected society. We found evidence that indicates a fundamental difference between the dynamic behavior of word use in articles and advertisements published in a century of Dutch newspapers. Articles exhibit persistent trends that are likely to be reflective of communicative memory. Contrary to this, advertisements have a more irregular behavior characterized by short bursts and fast decay, which, in part, mirrors the dynamic through which advertisers introduced terms into public discourse. On the issue of whether advertisements shaped or reflected society, we found particular product types that seemed to be collectively driven by a causality going from advertisements to articles. Generally, we found support for a complex interaction pattern dubbed the consumption junction. Finally, we discovered noteworthy patterns in terms of causality and long-range dependencies for specific product groups. All in all, this study shows how methods from econometrics and complexity science can be applied to humanities data to improve our understanding of complex cultural-historical phenomena such as the role of advertising in society.
Tasks
Published	2019-03-27
URL	http://arxiv.org/abs/1903.11461v1
PDF	http://arxiv.org/pdf/1903.11461v1.pdf
PWC	https://paperswithcode.com/paper/tracking-the-consumption-junction-temporal
Repo
Framework

Exploring Properties of Icosoku by Constraint Satisfaction Approach


Title	Exploring Properties of Icosoku by Constraint Satisfaction Approach
Authors	Ke Liu, Sven Löffler, Petra Hofstedt
Abstract	Icosoku is a challenging and interesting puzzle that exhibits a highly symmetrical and combinatorial nature. In this paper, we pose questions derived from the puzzle, but with greater difficulty and generality. We also present a constraint programming model for the proposed questions, which can provide the answers to our first two questions. The purpose of this paper is to share our preliminary results and problems to encourage researchers in both the group theory and constraint communities to consider this topic further.
Tasks
Published	2019-08-16
URL	https://arxiv.org/abs/1908.06003v1
PDF	https://arxiv.org/pdf/1908.06003v1.pdf
PWC	https://paperswithcode.com/paper/exploring-properties-of-icosoku-by-constraint
Repo
Framework

Multi-objects Generation with Amortized Structural Regularization


Title	Multi-objects Generation with Amortized Structural Regularization
Authors	Kun Xu, Chongxuan Li, Jun Zhu, Bo Zhang
Abstract	Deep generative models (DGMs) have shown promise in image generation. However, most existing work learns the model by simply optimizing a divergence between the marginal distributions of the model and the data, and often fails to capture the rich structures and relations in multi-object images. Human knowledge is a critical element in the success of DGMs at inferring these structures. In this paper, we propose the amortized structural regularization (ASR) framework, which adopts posterior regularization (PR) to embed human knowledge into DGMs via a set of structural constraints. We derive a lower bound of the regularized log-likelihood, which can be jointly optimized with respect to the generative model and recognition model efficiently. Empirical results show that ASR significantly outperforms the DGM baselines in terms of inference accuracy and sample quality.
Tasks	Image Generation
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03923v1
PDF	https://arxiv.org/pdf/1906.03923v1.pdf
PWC	https://paperswithcode.com/paper/multi-objects-generation-with-amortized
Repo
Framework

Pre-training of Context-aware Item Representation for Next Basket Recommendation


Title	Pre-training of Context-aware Item Representation for Next Basket Recommendation
Authors	Jingxuan Yang, Jun Xu, Jianzhuo Tong, Sheng Gao, Jun Guo, Jirong Wen
Abstract	Next basket recommendation, which aims to predict the next few items that a user will most probably purchase given their historical transactions, plays a vital role in market basket analysis. From the viewpoint of an item, an item could be purchased by different users together with different items, for different reasons. Therefore, an ideal recommender system should represent an item considering its transaction contexts. Existing state-of-the-art deep learning methods usually adopt static item representations, which are invariant among all of the transactions and thus cannot achieve the full potential of deep learning. Inspired by the pre-trained representations of BERT in natural language processing, we propose to conduct context-aware item representation for next basket recommendation, called Item Encoder Representations from Transformers (IERT). In the offline phase, IERT pre-trains deep item representations conditioning on their transaction contexts. In the online recommendation phase, the pre-trained model is further fine-tuned with an additional output layer. The output contextualized item embeddings are used to capture users’ sequential behaviors and general tastes to conduct recommendation. Experimental results on the Ta-Feng data set show that IERT outperforms the state-of-the-art baseline methods, which demonstrates the effectiveness of IERT in next basket representation.
Tasks	Recommendation Systems
Published	2019-04-14
URL	http://arxiv.org/abs/1904.12604v1
PDF	http://arxiv.org/pdf/1904.12604v1.pdf
PWC	https://paperswithcode.com/paper/190412604
Repo
Framework

From Knowledge Map to Mind Map: Artificial Imagination


Title	From Knowledge Map to Mind Map: Artificial Imagination
Authors	Ruixue Liu, Baoyang Chen, Xiaoyu Guo, Yan Dai, Meng Chen, Zhijie Qiu, Xiaodong He
Abstract	Imagination is one of the most important factors that make an artistic painting unique and impressive. With the rapid development of Artificial Intelligence, more and more researchers try to create paintings automatically with AI technology. However, lack of imagination is still a main problem for AI painting. In this paper, we propose a novel approach to inject rich imagination into a special painting art, Mind Map creation. We first consider lexical and phonological similarities of the seed word, then learn and inherit the original painting style of the author, and finally apply Dadaism and impossibility of improvisation principles to the painting process. We also design several metrics for imagination evaluation. Experimental results show that our proposed method can increase the imagination of a painting and also improve its overall quality.
Tasks
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01080v2
PDF	http://arxiv.org/pdf/1903.01080v2.pdf
PWC	https://paperswithcode.com/paper/from-knowledge-map-to-mind-map-artificial
Repo
Framework

Sparse Regularization for Mixture Problems


Title	Sparse Regularization for Mixture Problems
Authors	Yohann de Castro, Sébastien Gadat, Clément Marteau, Cathy Maugis-Rabusseau
Abstract	This paper investigates the statistical estimation of a discrete mixing measure $\mu^0$ involved in a kernel mixture model. Using some recent advances in $\ell_1$-regularization over the space of measures, we introduce a “data fitting + regularization” convex program for estimating $\mu^0$ in a grid-less manner; this method is referred to as Beurling-LASSO. Our contribution is two-fold: we derive a lower bound on the bandwidth of our data fitting term depending only on the support of $\mu^0$ and its so-called “minimum separation” to ensure quantitative support localization error bounds; and under a so-called “non-degenerate source condition” we derive a non-asymptotic support stability property. The latter shows that for sufficiently large sample size $n$, our estimator has exactly as many weighted Dirac masses as the target $\mu^0$, converging in amplitude and localization towards the true ones. The statistical performances of this estimator are investigated by designing a so-called “dual certificate”, which will be appropriate to our setting. Some classical situations, such as Gaussian or ordinary smooth mixtures (e.g., Laplace distributions), are discussed at the end of the paper. We stress in particular that our method is completely adaptive w.r.t. the number of components involved in the mixture.
Tasks
Published	2019-07-23
URL	https://arxiv.org/abs/1907.10592v1
PDF	https://arxiv.org/pdf/1907.10592v1.pdf
PWC	https://paperswithcode.com/paper/sparse-regularization-for-mixture-problems
Repo
Framework

The inD Dataset: A Drone Dataset of Naturalistic Road User Trajectories at German Intersections


Title	The inD Dataset: A Drone Dataset of Naturalistic Road User Trajectories at German Intersections
Authors	Julian Bock, Robert Krajewski, Tobias Moers, Steffen Runde, Lennart Vater, Lutz Eckstein
Abstract	Automated vehicles rely heavily on data-driven methods, especially for complex urban environments. Large datasets of real world measurement data in the form of road user trajectories are crucial for several tasks like road user prediction models or scenario-based safety validation. So far, though, this demand is unmet as no public dataset of urban road user trajectories is available in an appropriate size, quality and variety. By contrast, the highway drone dataset (highD) has recently shown that drones are an efficient method for acquiring naturalistic road user trajectories. Compared to driving studies or ground-level infrastructure sensors, one major advantage of using a drone is the possibility to record naturalistic behavior, as road users do not notice measurements taking place. Due to the ideal viewing angle, an entire intersection scenario can be measured with significantly less occlusion than with sensors at ground level. Both the class and the trajectory of each road user can be extracted from the video recordings with high precision using state-of-the-art deep neural networks. Therefore, we propose the creation of a comprehensive, large-scale urban intersection dataset with naturalistic road user behavior using camera-equipped drones as successor of the highD dataset. The resulting dataset contains more than 11500 road users including vehicles, bicyclists and pedestrians at intersections in Germany and is called inD. The dataset consists of 10 hours of measurement data from four intersections and is available online for non-commercial research at: http://www.inD-dataset.com
Tasks
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07602v1
PDF	https://arxiv.org/pdf/1911.07602v1.pdf
PWC	https://paperswithcode.com/paper/the-ind-dataset-a-drone-dataset-of
Repo
Framework

A Theory of Regularized Markov Decision Processes


Title	A Theory of Regularized Markov Decision Processes
Authors	Matthieu Geist, Bruno Scherrer, Olivier Pietquin
Abstract	Many recent successful (deep) reinforcement learning algorithms make use of regularization, generally based on entropy or Kullback-Leibler divergence. We propose a general theory of regularized Markov Decision Processes that generalizes these approaches in two directions: we consider a larger class of regularizers, and we consider the general modified policy iteration approach, encompassing both policy iteration and value iteration. The core building blocks of this theory are a notion of regularized Bellman operator and the Legendre-Fenchel transform, a classical tool of convex optimization. This approach allows for error propagation analyses of general algorithmic schemes of which (possibly variants of) classical algorithms such as Trust Region Policy Optimization, Soft Q-learning, Stochastic Actor Critic or Dynamic Policy Programming are special cases. This also draws connections to proximal convex optimization, especially to Mirror Descent.
Tasks	Q-Learning
Published	2019-01-31
URL	https://arxiv.org/abs/1901.11275v2
PDF	https://arxiv.org/pdf/1901.11275v2.pdf
PWC	https://paperswithcode.com/paper/a-theory-of-regularized-markov-decision
Repo
Framework
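
As a concrete instance of the regularized Bellman operator mentioned in the abstract above: when the regularizer is the negative entropy scaled by a temperature tau, its Legendre-Fenchel transform is a scaled log-sum-exp, which replaces the hard max of ordinary value iteration. The sketch below covers only this entropy special case on a small tabular MDP; the data layout (P, R) and the function names are assumptions made for illustration, not taken from the paper.

```python
import math


def soft_max(q_values, tau):
    """Conjugate of tau-scaled negative entropy: tau * log(sum(exp(q / tau))).

    The maximum is subtracted first for numerical stability.
    """
    m = max(q_values)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def regularized_value_iteration(P, R, gamma, tau, n_iters=200):
    """Entropy-regularized value iteration on a tabular MDP.

    P[s][a] is a list of (probability, next_state) pairs and R[s][a] is
    the reward for taking action a in state s (hypothetical layout).
    """
    V = [0.0] * len(P)
    for _ in range(n_iters):
        V = [
            soft_max(
                [
                    R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in range(len(P[s]))
                ],
                tau,
            )
            for s in range(len(P))
        ]
    return V
```

As tau approaches zero the log-sum-exp approaches the plain maximum, so the sketch reduces to unregularized value iteration; other regularizers would swap in a different conjugate function.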


Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task


Title	Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task
Authors	Alireza Mohammadshahi, Remi Lebret, Karl Aberer
Abstract	In this paper, we propose a new approach to learn multimodal multilingual embeddings for matching images and their relevant captions in two languages. We combine two existing objective functions to make images and captions close in a joint embedding space while adapting the alignment of word embeddings between existing languages in our model. We show that our approach enables better generalization, achieving state-of-the-art performance in text-to-image and image-to-text retrieval task, and caption-caption similarity task. Two multimodal multilingual datasets are used for evaluation: Multi30k with German and English captions and Microsoft-COCO with English and Japanese captions.
Tasks	Cross-Modal Retrieval, Multilingual Word Embeddings, Word Embeddings
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03291v1
PDF	https://arxiv.org/pdf/1910.03291v1.pdf
PWC	https://paperswithcode.com/paper/aligning-multilingual-word-embeddings-for
Repo
Framework

Efficient, Lexicon-Free OCR using Deep Learning


Title	Efficient, Lexicon-Free OCR using Deep Learning
Authors	Marcin Namysl, Iuliu Konya
Abstract	Contrary to popular belief, Optical Character Recognition (OCR) remains a challenging problem when text occurs in unconstrained environments, like natural scenes, due to geometrical distortions, complex backgrounds, and diverse fonts. In this paper, we present a segmentation-free OCR system that combines deep learning methods, synthetic training data generation, and data augmentation techniques. We render synthetic training data using large text corpora and over 2000 fonts. To simulate text occurring in complex natural scenes, we augment extracted samples with geometric distortions and with a proposed data augmentation technique - alpha-compositing with background textures. Our models employ a convolutional neural network encoder to extract features from text images. Inspired by the recent progress in neural machine translation and language modeling, we examine the capabilities of both recurrent and convolutional neural networks in modeling the interactions between input elements.
Tasks	Data Augmentation, Language Modelling, Machine Translation, Optical Character Recognition
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01969v1
PDF	https://arxiv.org/pdf/1906.01969v1.pdf
PWC	https://paperswithcode.com/paper/efficient-lexicon-free-ocr-using-deep
Repo
Framework
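
The alpha-compositing augmentation named in the abstract above can be illustrated with the textbook blending formula. The sketch assumes grayscale images stored as lists of rows of floats in [0, 1] and a single constant alpha; the paper's actual procedure may differ, and the function name alpha_composite is hypothetical.

```python
def alpha_composite(text_pixels, background_pixels, alpha):
    """Blend a rendered text image over a background texture.

    Each output pixel is alpha * text + (1 - alpha) * background, so
    alpha = 1 keeps the clean text and smaller values let more of the
    texture show through.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * t + (1.0 - alpha) * b for t, b in zip(t_row, b_row)]
        for t_row, b_row in zip(text_pixels, background_pixels)
    ]
```

Sampling alpha and the background texture at random for each training sample would yield a different degraded variant of the same rendered text line each time.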

Annals of Library and Information Studies. A bibliometric analysis of the journal and a comparison with the top library and information studies journals in Asia and worldwide (2011-2017)


Title	Annals of Library and Information Studies. A bibliometric analysis of the journal and a comparison with the top library and information studies journals in Asia and worldwide (2011-2017)
Authors	Juan Jose Prieto-Gutierrez, Francisco Segado-Boj
Abstract	This paper presents a thorough bibliometric analysis of research published in Annals of Library and Information Studies (ALIS), an India-based journal, for the period 2011-2017. Specifically, it compares this journal’s trends with those of other library and information science (LIS) journals from the same geographical area (India, and Asia as a whole) and with the 10 highest-rated LIS journals worldwide. The source of the data used was the multidisciplinary database Scopus. To perform this comparison, ALIS’ production was analyzed in order to identify authorship patterns; for example, authors’ countries of residence, co-authorship trends, and collaboration networks. Research topics were identified through keyword analysis, while performance was measured by examining the number of citations articles received. This study provides substantial information. The research lines detected through examining the keywords in ALIS articles were determined to be similar to those for the top LIS journals in both Asia and worldwide. Specifically, ALIS authors are focusing on metrics, bibliometrics, and social networking, which follows global trends. Notably, however, collaboration among Asia-based journals was found to be lower than that in the top-indexed journals in the LIS field. The results obtained present a roadmap for expanding the research in this field.
Tasks
Published	2019-08-26
URL	https://arxiv.org/abs/1908.09541v1
PDF	https://arxiv.org/pdf/1908.09541v1.pdf
PWC	https://paperswithcode.com/paper/annals-of-library-and-information-studies-a
Repo
Framework