January 30, 2020

Paper Group ANR 207

Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval. A Training-free, One-shot Detection Framework For Geospatial Objects In Remote Sensing Images. Manufacturing Dispatching using Reinforcement and Transfer Learning. Causally Denoise Word Embeddings Using Half-Sibling Regression. Error bound of local minima and KL …

Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval


Title	Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval
Authors	Jianjun Lei, Yuxin Song, Bo Peng, Zhanyu Ma, Ling Shao, Yi-Zhe Song
Abstract	Sketch-based image retrieval (SBIR) is a challenging task due to the large cross-domain gap between sketches and natural images. How to align abstract sketches and natural images into a common high-level semantic space remains a key problem in SBIR. In this paper, we propose a novel semi-heterogeneous three-way joint embedding network (Semi3-Net), which integrates three branches (a sketch branch, a natural image branch, and an edgemap branch) to learn more discriminative cross-domain feature representations for the SBIR task. The key insight lies with how we cultivate the mutual and subtle relationships amongst the sketches, natural images, and edgemaps. A semi-heterogeneous feature mapping is designed to extract bottom features from each domain, where the sketch and edgemap branches are shared while the natural image branch is heterogeneous to the other branches. In addition, a joint semantic embedding is introduced to embed the features from different domains into a common high-level semantic space, where all of the three branches are shared. To further capture informative features common to both natural images and the corresponding edgemaps, a co-attention model is introduced to conduct common channel-wise feature recalibration between different domains. A hybrid-loss mechanism is designed to align the three branches, where an alignment loss and a sketch-edgemap contrastive loss are presented to encourage the network to learn invariant cross-domain representations. Experimental results on two widely used category-level datasets (Sketchy and TU-Berlin Extension) demonstrate that the proposed method outperforms state-of-the-art methods.
Tasks	Image Retrieval, Sketch-Based Image Retrieval
Published	2019-11-10
URL	https://arxiv.org/abs/1911.04470v1
PDF	https://arxiv.org/pdf/1911.04470v1.pdf
PWC	https://paperswithcode.com/paper/191104470
Repo
Framework

A Training-free, One-shot Detection Framework For Geospatial Objects In Remote Sensing Images


Title	A Training-free, One-shot Detection Framework For Geospatial Objects In Remote Sensing Images
Authors	Tengfei Zhang, Yue Zhang, Xian Sun, Menglong Yan, Yaoling Wang, Kun Fu
Abstract	Deep learning based object detection has achieved great success. However, these supervised learning methods are data-hungry and time-consuming. This restriction makes them unsuitable for limited data and urgent tasks, especially in the applications of remote sensing. Inspired by the ability of humans to quickly learn new visual concepts from very few examples, we propose a training-free, one-shot geospatial object detection framework for remote sensing images. It consists of (1) a feature extractor with remote sensing domain knowledge, (2) a multi-level feature fusion method, (3) a novel similarity metric method, and (4) a 2-stage object detection pipeline. Experiments on sewage treatment plant and airport detection show that the proposed method has achieved a certain effect. Our method can serve as a baseline for training-free, one-shot geospatial object detection.
Tasks	Object Detection
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02302v1
PDF	http://arxiv.org/pdf/1904.02302v1.pdf
PWC	https://paperswithcode.com/paper/a-training-free-one-shot-detection-framework
Repo
Framework

Manufacturing Dispatching using Reinforcement and Transfer Learning


Title	Manufacturing Dispatching using Reinforcement and Transfer Learning
Authors	Shuai Zheng, Chetan Gupta, Susumu Serita
Abstract	An efficient dispatching rule in the manufacturing industry is key to ensuring on-time product delivery and minimum past-due and inventory cost. Manufacturing, especially in the developed world, is moving towards on-demand manufacturing, meaning a high-mix, low-volume product mix. This requires efficient dispatching that can work in dynamic and stochastic environments, meaning it allows for quick response to new orders received and can work over a disparate set of shop floor settings. In this paper we address this problem of dispatching in manufacturing. Using reinforcement learning (RL), we propose a new design that formulates the shop floor state as a 2-D matrix, incorporates job slack time into the state representation, and designs lateness and tardiness reward functions for dispatching. However, maintaining a separate RL model for each production line on a manufacturing shop floor is costly and often infeasible. To address this, we enhance our deep RL model with an approach for dispatching policy transfer. This increases policy generalization and saves time and cost for model training and data collection. Experiments show that: (1) our approach performs the best in terms of total discounted reward and average lateness and tardiness, (2) the proposed policy transfer approach reduces training time and increases policy generalization.
Tasks	Transfer Learning
Published	2019-10-04
URL	https://arxiv.org/abs/1910.02035v1
PDF	https://arxiv.org/pdf/1910.02035v1.pdf
PWC	https://paperswithcode.com/paper/manufacturing-dispatching-using-reinforcement
Repo
Framework
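
The abstract mentions job slack time, lateness, and tardiness without defining them. The sketch below uses the standard scheduling definitions, which are an assumption here and may differ from the exact state features and reward functions in the paper; the function and parameter names are illustrative.

```python
def lateness(completion_time, due_time):
    # Signed quantity: negative when the job finishes early.
    return completion_time - due_time


def tardiness(completion_time, due_time):
    # Only counts time past the due date; early completion gives 0.
    return max(0.0, completion_time - due_time)


def slack_time(now, due_time, remaining_processing_time):
    # Time left before the job must start to still meet its due date.
    return due_time - now - remaining_processing_time
```

A job with little or negative slack is at risk of being tardy, which is presumably why slack is a useful input to a dispatching policy.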

Causally Denoise Word Embeddings Using Half-Sibling Regression


Title	Causally Denoise Word Embeddings Using Half-Sibling Regression
Authors	Zekun Yang, Tianlin Liu
Abstract	Distributional representations of words, also known as word vectors, have become crucial for modern natural language processing tasks due to their wide applications. Recently, a growing body of word vector postprocessing algorithms has emerged, aiming to render off-the-shelf word vectors even stronger. In line with these investigations, we introduce a novel word vector postprocessing scheme under a causal inference framework. Concretely, the postprocessing pipeline is realized by Half-Sibling Regression (HSR), which allows us to identify and remove confounding noise contained in word vectors. Compared to previous work, our proposed method has the advantages of interpretability and transparency due to its causal inference grounding. Evaluated on a battery of standard lexical-level evaluation tasks and downstream sentiment analysis tasks, our method reaches state-of-the-art performance.
Tasks	Causal Inference, Sentiment Analysis, Word Embeddings
Published	2019-11-24
URL	https://arxiv.org/abs/1911.10524v1
PDF	https://arxiv.org/pdf/1911.10524v1.pdf
PWC	https://paperswithcode.com/paper/causally-denoise-word-embeddings-using-half
Repo
Framework
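
The abstract does not spell out the regression itself. As a minimal sketch of the general half-sibling regression idea, assume the denoised estimate is the target minus its prediction from predictors that share only the noise, and assume a ridge regression as the predictor. The function name, the `ridge` parameter, and the array layout are illustrative; which word vectors play the roles of `Y` and `X` is the paper's choice and is not reproduced here.

```python
import numpy as np


def half_sibling_denoise(Y, X, ridge=1e-3):
    """Subtract from Y the part that a ridge regression on X can explain.

    Y: array of shape (d, n_targets), the quantities to denoise.
    X: array of shape (d, n_predictors), assumed to share noise with Y
       but to carry no information about the signal in Y.
    """
    A = X.T @ X + ridge * np.eye(X.shape[1])
    W = np.linalg.solve(A, X.T @ Y)  # ridge regression weights
    return Y - X @ W                 # residual = denoised estimate
```

If the predictors explain only the shared noise, the residual keeps the signal and drops the noise component.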

Error bound of local minima and KL property of exponent 1/2 for squared F-norm regularized factorization


Title	Error bound of local minima and KL property of exponent 1/2 for squared F-norm regularized factorization
Authors	Ting Tao, Shaohua Pan, Shujun Bi
Abstract	This paper is concerned with the squared F(robenius)-norm regularized factorization form for noisy low-rank matrix recovery problems. Under a suitable assumption on the restricted condition number of the Hessian matrix of the loss function, we establish an error bound to the true matrix for those local minima whose ranks are not more than the rank of the true matrix. Then, for the least squares loss function, we achieve the KL property of exponent 1/2 for the F-norm regularized factorization function over its global minimum set under a restricted strong convexity assumption. These theoretical findings are also confirmed by applying an accelerated alternating minimization method to the F-norm regularized factorization problem.
Tasks
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04293v1
PDF	https://arxiv.org/pdf/1911.04293v1.pdf
PWC	https://paperswithcode.com/paper/error-bound-of-local-minima-and-kl-property
Repo
Framework
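
To make the problem concrete, the sketch below assumes a fully observed matrix `M` with a least squares loss, so the objective is 0.5 * ||U V^T - M||_F^2 + (lam / 2) * (||U||_F^2 + ||V||_F^2). It runs plain alternating minimization with closed-form block updates, not the accelerated variant the abstract refers to. All names and default values are illustrative.

```python
import numpy as np


def objective(U, V, M, lam):
    fit = 0.5 * np.linalg.norm(U @ V.T - M, "fro") ** 2
    reg = 0.5 * lam * (np.linalg.norm(U, "fro") ** 2 + np.linalg.norm(V, "fro") ** 2)
    return fit + reg


def alternating_minimization(M, rank, lam=0.1, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    I = np.eye(rank)
    history = [objective(U, V, M, lam)]
    for _ in range(iters):
        # Exact minimizer in U for fixed V: U = M V (V^T V + lam I)^{-1}
        U = np.linalg.solve(V.T @ V + lam * I, (M @ V).T).T
        # Exact minimizer in V for fixed U: V = M^T U (U^T U + lam I)^{-1}
        V = np.linalg.solve(U.T @ U + lam * I, (M.T @ U).T).T
        history.append(objective(U, V, M, lam))
    return U, V, history
```

Because each update exactly minimizes the objective over one block, the recorded objective values should not increase from one iteration to the next.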

AITuning: Machine Learning-based Tuning Tool for Run-Time Communication Libraries


Title	AITuning: Machine Learning-based Tuning Tool for Run-Time Communication Libraries
Authors	Alessandro Fanfarillo, Davide Del Vento
Abstract	In this work, we address the problem of tuning communication libraries by using a deep reinforcement learning approach. Reinforcement learning is a machine learning technique incredibly effective in solving game-like situations. In fact, tuning a set of parameters in a communication library in order to get better performance in a parallel application can be expressed as a game: Find the right combination/path that provides the best reward. Even though AITuning has been designed to be utilized with different run-time libraries, we focused this work on applying it to the OpenCoarrays run-time communication library, built on top of MPI-3. This work not only shows the potential of using a reinforcement learning algorithm for tuning communication libraries, but also demonstrates how the MPI Tool Information Interface, introduced by the MPI-3 standard, can be used effectively by run-time libraries to improve the performance without human intervention.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06301v1
PDF	https://arxiv.org/pdf/1909.06301v1.pdf
PWC	https://paperswithcode.com/paper/aituning-machine-learning-based-tuning-tool
Repo
Framework

Scaling up Psychology via Scientific Regret Minimization: A Case Study in Moral Decisions


Title	Scaling up Psychology via Scientific Regret Minimization: A Case Study in Moral Decisions
Authors	Mayank Agrawal, Joshua C. Peterson, Thomas L. Griffiths
Abstract	Do large datasets provide value to psychologists? Without a systematic methodology for working with such datasets, there is a valid concern that analyses will produce noise artifacts rather than true effects. In this paper, we offer a way to enable researchers to systematically build models and identify novel phenomena in large datasets. One traditional approach is to analyze the residuals of models—the biggest errors they make in predicting the data—to discover what might be missing from those models. However, once a dataset is sufficiently large, machine learning algorithms approximate the true underlying function better than the data, suggesting instead that the predictions of these data-driven models should be used to guide model-building. We call this approach “Scientific Regret Minimization” (SRM) as it focuses on minimizing errors for cases that we know should have been predictable. We demonstrate this methodology on a subset of the Moral Machine dataset, a public collection of roughly forty million moral decisions. Using SRM, we found that incorporating a set of deontological principles that capture dimensions along which groups of agents can vary (e.g. sex and age) improves a computational model of human moral judgment. Furthermore, we were able to identify and independently validate three interesting moral phenomena: criminal dehumanization, age of responsibility, and asymmetric notions of responsibility.
Tasks	Decision Making
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07581v2
PDF	https://arxiv.org/pdf/1910.07581v2.pdf
PWC	https://paperswithcode.com/paper/scaling-up-psychology-via-scientific-regret
Repo
Framework

Text Guided Person Image Synthesis


Title	Text Guided Person Image Synthesis
Authors	Xingran Zhou, Siyu Huang, Bin Li, Yingming Li, Jiachen Li, Zhongfei Zhang
Abstract	This paper presents a novel method to manipulate the visual appearance (pose and attribute) of a person image according to natural language descriptions. Our method can be boiled down to two stages: 1) text guided pose generation and 2) visual appearance transferred image synthesis. In the first stage, our method infers a reasonable target human pose based on the text. In the second stage, our method synthesizes a realistic and appearance transferred person image according to the text in conjunction with the target pose. Our method extracts sufficient information from the text and establishes a mapping between the image space and the language space, making generating and editing images corresponding to the description possible. We conduct extensive experiments to reveal the effectiveness of our method, and use the VQA Perceptual Score as a metric for evaluating it. It shows for the first time that we can automatically edit the person image from the natural language descriptions.
Tasks	Image Generation, Visual Question Answering
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05118v1
PDF	http://arxiv.org/pdf/1904.05118v1.pdf
PWC	https://paperswithcode.com/paper/text-guided-person-image-synthesis
Repo
Framework

The future of urban models in the Big Data and AI era: a bibliometric analysis (2000-2019)


Title	The future of urban models in the Big Data and AI era: a bibliometric analysis (2000-2019)
Authors	Marion Maisonobe
Abstract	This article questions the effects on urban research dynamics of the Big Data and AI turn in urban management. To identify these effects, we use two complementary materials: bibliometric data and interviews. We consider two areas in urban research: one, covering the academic research dealing with transportation systems and the other, with water systems. First, we measure the evolution of AI and Big Data keywords in these two areas. Second, we measure the evolution of the share of publications published in computer science journals about urban traffic and water quality. To guide these bibliometric analyses, we rely on the content of interviews conducted with academics and higher education officials in Paris and Edinburgh at the beginning of 2018.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1912.00532v1
PDF	https://arxiv.org/pdf/1912.00532v1.pdf
PWC	https://paperswithcode.com/paper/the-future-of-urban-models-in-the-big-data
Repo
Framework

Adversarial Robustness Curves


Title	Adversarial Robustness Curves
Authors	Christina Göpfert, Jan Philip Göpfert, Barbara Hammer
Abstract	The existence of adversarial examples has led to considerable uncertainty regarding the trust one can justifiably put in predictions produced by automated systems. This uncertainty has, in turn, led to considerable research effort in understanding adversarial robustness. In this work, we take first steps towards separating robustness analysis from the choice of robustness threshold and norm. We propose robustness curves as a more general view of the robustness behavior of a model and investigate under which circumstances they can qualitatively depend on the chosen norm.
Tasks
Published	2019-07-31
URL	https://arxiv.org/abs/1908.00096v1
PDF	https://arxiv.org/pdf/1908.00096v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-robustness-curves
Repo
Framework
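
The abstract does not define a robustness curve. One plausible reading, assumed here, is the fraction of data points that are misclassified or can be pushed to the decision boundary by a perturbation of norm at most eps, viewed as a function of eps. The sketch computes this for a linear classifier under the L2 norm, where the distance to the boundary has a closed form; the function name and the `<=` convention at the boundary are illustrative choices.

```python
import numpy as np


def robustness_curve(w, b, X, y, eps_grid):
    """Linear classifier sign(w.x + b), labels y in {-1, +1}, L2 norm."""
    w = np.asarray(w, dtype=float)
    # Signed L2 distance to the hyperplane, positive when correctly classified.
    margins = y * (X @ w + b) / np.linalg.norm(w)
    # Misclassified points need no perturbation at all.
    dist = np.maximum(margins, 0.0)
    return np.array([(dist <= eps).mean() for eps in eps_grid])
```

Reading the curve at a single eps gives the usual threshold-based robustness number, so the curve summarizes all thresholds at once for the chosen norm.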

Modulated Self-attention Convolutional Network for VQA


Title	Modulated Self-attention Convolutional Network for VQA
Authors	Jean-Benoit Delbrouck, Antoine Maiorca, Nathan Hubens, Stéphane Dupont
Abstract	As new data-sets for real-world visual reasoning and compositional question answering are emerging, it might be necessary to treat visual feature extraction as an end-to-end process during training. This small contribution aims to suggest new ideas to improve the visual processing of traditional convolutional networks for visual question answering (VQA). In this paper, we propose to modulate a CNN augmented with self-attention by a linguistic input. We show encouraging relative improvements for future research in this direction.
Tasks	Question Answering, Visual Question Answering, Visual Reasoning
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03343v2
PDF	https://arxiv.org/pdf/1910.03343v2.pdf
PWC	https://paperswithcode.com/paper/modulated-self-attention-convolutional
Repo
Framework

Combining 3D Morphable Models: A Large scale Face-and-Head Model


Title	Combining 3D Morphable Models: A Large scale Face-and-Head Model
Authors	Stylianos Ploumpis, Haoyang Wang, Nick Pears, William A. P. Smith, Stefanos Zafeiriou
Abstract	Three-dimensional Morphable Models (3DMMs) are powerful statistical tools for representing the 3D surfaces of an object class. In this context, we identify an interesting question that has previously not received research attention: is it possible to combine two or more 3DMMs that (a) are built using different templates that perhaps only partly overlap, (b) have different representation capabilities and (c) are built from different datasets that may not be publicly-available? In answering this question, we make two contributions. First, we propose two methods for solving this problem: i. use a regressor to complete missing parts of one model using the other, ii. use the Gaussian Process framework to blend covariance matrices from multiple models. Second, as an example application of our approach, we build a new face-and-head shape model that combines the variability and facial detail of the LSFM with the full head modelling of the LYHM. The resulting combined shape model achieves state-of-the-art performance and outperforms existing head models by a large margin. Finally, as an application experiment, we reconstruct full head representations from single, unconstrained images by utilizing our proposed large-scale model in conjunction with the FaceWarehouse blendshapes for handling expressions.
Tasks
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03785v1
PDF	http://arxiv.org/pdf/1903.03785v1.pdf
PWC	https://paperswithcode.com/paper/combining-3d-morphable-models-a-large-scale
Repo
Framework

The relational processing limits of classic and contemporary neural network models of language processing


Title	The relational processing limits of classic and contemporary neural network models of language processing
Authors	Guillermo Puebla, Andrea E. Martin, Leonidas A. A. Doumas
Abstract	The ability of neural networks to capture relational knowledge is a matter of long-standing controversy. Recently, some researchers on the PDP side of the debate have argued that (1) classic PDP models can handle relational structure (Rogers & McClelland, 2008, 2014) and (2) the success of deep learning approaches to text processing suggests that structured representations are unnecessary to capture the gist of human language (Rabovsky et al., 2018). In the present study we tested the Story Gestalt model (St. John, 1992), a classic PDP model of text comprehension, and a Sequence-to-Sequence with Attention model (Bahdanau et al., 2015), a contemporary deep learning architecture for text processing. Both models were trained to answer questions about stories based on the thematic roles that several concepts played in the stories. In three critical tests we varied the statistical structure of new stories while keeping their relational structure constant with respect to the training data. Each model was susceptible to each statistical structure manipulation to a different degree, with their performance falling below chance under at least one manipulation. We argue that the failures of both models are due to the fact that they cannot perform dynamic binding of independent roles and fillers. Ultimately, these results cast doubt on the suitability of traditional neural network models for explaining phenomena based on relational reasoning, including language processing.
Tasks	Reading Comprehension, Relational Reasoning
Published	2019-05-12
URL	https://arxiv.org/abs/1905.05708v1
PDF	https://arxiv.org/pdf/1905.05708v1.pdf
PWC	https://paperswithcode.com/paper/the-relational-processing-limits-of-classic
Repo
Framework

Efficient estimation of AUC in a sliding window


Title	Efficient estimation of AUC in a sliding window
Authors	Nikolaj Tatti
Abstract	In many applications, monitoring area under the ROC curve (AUC) in a sliding window over a data stream is a natural way of detecting changes in the system. The drawback is that computing AUC in a sliding window is expensive, especially if the window size is large and the data flow is significant. In this paper we propose a scheme for maintaining an approximate AUC in a sliding window of length $k$. More specifically, we propose an algorithm that, given $\epsilon$, estimates AUC within $\epsilon / 2$, and can maintain this estimate in $O((\log k) / \epsilon)$ time, per update, as the window slides. This provides a speed-up over the exact computation of AUC, which requires $O(k)$ time, per update. The speed-up becomes more significant as the size of the window increases. Our estimate is based on grouping the data points together, and using these groups to calculate AUC. The grouping is designed carefully such that ($i$) the groups are small enough, so that the error stays small, ($ii$) the number of groups is small, so that enumerating them is not expensive, and ($iii$) the definition is flexible enough so that we can maintain the groups efficiently. Our experimental evaluation demonstrates that the average approximation error in practice is much smaller than the approximation guarantee $\epsilon / 2$, and that we can achieve significant speed-ups with only a modest sacrifice in accuracy.
Tasks
Published	2019-02-02
URL	http://arxiv.org/abs/1902.00632v1
PDF	http://arxiv.org/pdf/1902.00632v1.pdf
PWC	https://paperswithcode.com/paper/efficient-estimation-of-auc-in-a-sliding
Repo
Framework
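
For reference, the sketch below is a naive exact baseline, not the approximation scheme from the paper: it recomputes AUC from scratch for every full window by comparing all positive and negative pairs (the Mann-Whitney statistic), which is even more expensive than the $O(k)$ per-update exact computation the abstract mentions. The function names and the `(score, label)` stream layout are assumptions made for illustration.

```python
from collections import deque


def exact_auc(pairs):
    """AUC of (score, label) pairs with label in {0, 1}; ties count as 0.5."""
    pos = [s for s, y in pairs if y == 1]
    neg = [s for s, y in pairs if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sliding_auc(stream, k):
    """Exact AUC for every full window of length k over the stream."""
    window = deque(maxlen=k)  # the oldest item drops out automatically
    out = []
    for item in stream:
        window.append(item)
        if len(window) == k:
            out.append(exact_auc(window))
    return out
```

A baseline like this is mainly useful for checking an approximate estimator against exact values on small windows.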

SNN under Attack: are Spiking Deep Belief Networks vulnerable to Adversarial Examples?


Title	SNN under Attack: are Spiking Deep Belief Networks vulnerable to Adversarial Examples?
Authors	Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique
Abstract	Recently, many adversarial examples have emerged for Deep Neural Networks (DNNs) causing misclassifications. However, in-depth work still needs to be performed to demonstrate such attacks and security vulnerabilities for spiking neural networks (SNNs), i.e. the 3rd generation NNs. This paper aims at addressing the fundamental questions: “Are SNNs vulnerable to the adversarial attacks as well?” and “if yes, to what extent?” Using a Spiking Deep Belief Network (SDBN) for the MNIST database classification, we show that the SNN accuracy decreases according to the noise magnitude in data poisoning random attacks applied to the test images. Moreover, SDBNs' generalization capabilities increase by applying noise to the training images. We develop a novel black box attack methodology to automatically generate imperceptible and robust adversarial examples through a greedy algorithm, which is the first of its kind for SNNs.
Tasks	Data Poisoning
Published	2019-02-04
URL	http://arxiv.org/abs/1902.01147v1
PDF	http://arxiv.org/pdf/1902.01147v1.pdf
PWC	https://paperswithcode.com/paper/snn-under-attack-are-spiking-deep-belief
Repo
Framework