October 19, 2019


Paper Group ANR 178


Sum-Product Networks for Sequence Labeling


Title	Sum-Product Networks for Sequence Labeling
Authors	Martin Ratajczak, Sebastian Tschiatschek, Franz Pernkopf
Abstract	We consider higher-order linear-chain conditional random fields (HO-LC-CRFs) for sequence modelling, and use sum-product networks (SPNs) for representing higher-order input- and output-dependent factors. SPNs are a recently introduced class of deep models for which exact and efficient inference can be performed. By combining HO-LC-CRFs with SPNs, expressive models over both the output labels and the hidden variables are instantiated while still enabling efficient exact inference. Furthermore, the use of higher-order factors allows us to capture relations of multiple input segments and multiple output labels, as are often present in real-world data. These relations cannot be modelled by the commonly used first-order models or by higher-order models with local factors that include only a single output label. We demonstrate the effectiveness of our proposed models for sequence labeling. In extensive experiments, we outperform other state-of-the-art methods in optical character recognition and achieve competitive results in phone classification.
Tasks	Optical Character Recognition
Published	2018-07-06
URL	http://arxiv.org/abs/1807.02324v1
PDF	http://arxiv.org/pdf/1807.02324v1.pdf
PWC	https://paperswithcode.com/paper/sum-product-networks-for-sequence-labeling
Repo
Framework

Lifted Marginal MAP Inference


Title	Lifted Marginal MAP Inference
Authors	Vishal Sharma, Noman Ahmed Sheikh, Happy Mittal, Vibhav Gogate, Parag Singla
Abstract	Lifted inference reduces the complexity of inference in relational probabilistic models by identifying groups of constants (or atoms) which behave symmetrically to each other. A number of techniques have been proposed in the literature for lifting marginal as well as MAP inference. We present the first application of lifting rules for marginal-MAP (MMAP), an important inference problem in models having latent (random) variables. Our main contribution is twofold: (1) we define a new equivalence class of (logical) variables, called Single Occurrence for MAX (SOM), and show that the solution lies at an extreme with respect to the SOM variables, i.e., predicate groundings differing only in the instantiation of the SOM variables take the same truth value; (2) we define a sub-class SOM-R (SOM Reduce) and exploit properties of extreme assignments to show that MMAP inference can be performed by reducing the domain of SOM-R variables to a single constant. We refer to our lifting technique as the SOM-R rule for lifted MMAP. Combined with existing rules such as decomposer and binomial, this results in a powerful framework for lifted MMAP. Experiments on three benchmark domains show significant gains in both time and memory compared to ground inference as well as lifted approaches not using SOM-R.
Tasks
Published	2018-07-02
URL	http://arxiv.org/abs/1807.00589v2
PDF	http://arxiv.org/pdf/1807.00589v2.pdf
PWC	https://paperswithcode.com/paper/lifted-marginal-map-inference
Repo
Framework

Attention, Please! Adversarial Defense via Attention Rectification and Preservation


Title	Attention, Please! Adversarial Defense via Attention Rectification and Preservation
Authors	Shangxi Wu, Jitao Sang, Kaiyuan Xu, Jiaming Zhang, Yanfeng Sun, Liping Jing, Jian Yu
Abstract	This study provides a new understanding of the adversarial attack problem by examining the correlation between adversarial attacks and changes in visual attention. In particular, we observed that: (1) images with incomplete attention regions are more vulnerable to adversarial attacks; and (2) successful adversarial attacks lead to deviated and scattered attention maps. Accordingly, an attention-based adversarial defense framework is designed to simultaneously rectify the attention map for prediction and preserve the attention area between adversarial and original images. The problem of adding iteratively attacked samples is also discussed in the context of visual attention change. We hope the attention-related data analysis and defense solution in this study will shed some light on the mechanism behind the adversarial attack and also facilitate future adversarial defense/attack model design.
Tasks	Adversarial Attack, Adversarial Defense
Published	2018-11-24
URL	https://arxiv.org/abs/1811.09831v2
PDF	https://arxiv.org/pdf/1811.09831v2.pdf
PWC	https://paperswithcode.com/paper/attention-please-adversarial-defense-via
Repo
Framework

Multimodal Machine Translation with Reinforcement Learning


Title	Multimodal Machine Translation with Reinforcement Learning
Authors	Xin Qian, Ziyi Zhong, Jieli Zhou
Abstract	Multimodal machine translation is one of the applications that integrates computer vision and language processing. It is a unique task given that in the field of machine translation, many state-of-the-art algorithms still only employ textual information. In this work, we explore the effectiveness of reinforcement learning in multimodal machine translation. We present a novel algorithm based on the Advantage Actor-Critic (A2C) algorithm that specifically caters to the multimodal machine translation task of the EMNLP 2018 Third Conference on Machine Translation (WMT18). We evaluate our proposed algorithm on the Multi30K multilingual English-German image description dataset and the Flickr30K image entity dataset. Our model takes two channels of inputs, image and text, uses translation evaluation metrics as training rewards, and achieves better results than supervised learning MLE baseline models. Furthermore, we discuss the prospects and limitations of using reinforcement learning for machine translation. Our experimental results suggest a promising reinforcement learning solution to the general task of multimodal sequence-to-sequence learning.
Tasks	Machine Translation, Multimodal Machine Translation
Published	2018-05-07
URL	http://arxiv.org/abs/1805.02356v1
PDF	http://arxiv.org/pdf/1805.02356v1.pdf
PWC	https://paperswithcode.com/paper/multimodal-machine-translation-with
Repo
Framework
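
The abstract above says that translation evaluation metrics serve as training rewards for an actor-critic learner. The following is only a minimal sketch of that idea, not the authors' algorithm: `unigram_precision` is a toy stand-in for a real translation metric, and `a2c_losses`, its arguments, and the single terminal reward are all assumptions made for illustration.

```python
import math


def unigram_precision(hypothesis, reference):
    """Toy reward (assumption): share of hypothesis tokens found in the reference.

    Counts are clipped, so a token is matched at most as often as it
    occurs in the reference.
    """
    if not hypothesis:
        return 0.0
    remaining = {}
    for token in reference:
        remaining[token] = remaining.get(token, 0) + 1
    matched = 0
    for token in hypothesis:
        if remaining.get(token, 0) > 0:
            matched += 1
            remaining[token] -= 1
    return matched / len(hypothesis)


def a2c_losses(log_probs, value_estimates, reward):
    """Average actor and critic losses for one sampled translation.

    log_probs: log-probability the policy gave to each sampled token.
    value_estimates: the critic's predicted reward at each step.
    reward: one sentence-level score applied to every step (assumption).
    """
    actor_loss = 0.0
    critic_loss = 0.0
    for log_p, value in zip(log_probs, value_estimates):
        advantage = reward - value          # how much better than expected
        actor_loss += -advantage * log_p    # raise probability if advantage > 0
        critic_loss += advantage ** 2       # regress the critic toward the reward
    steps = len(log_probs)
    return actor_loss / steps, critic_loss / steps


reward = unigram_precision("ein hund läuft".split(), "ein hund rennt".split())
actor, critic = a2c_losses([math.log(0.5)], [0.25], 0.75)
```

In an actual training loop the advantage would be treated as a constant when differentiating the actor loss, and the reward would come from a standard metric computed on full translations.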

Density estimation for shift-invariant multidimensional distributions


Title	Density estimation for shift-invariant multidimensional distributions
Authors	Anindya De, Philip M. Long, Rocco A. Servedio
Abstract	We study density estimation for classes of shift-invariant distributions over $\mathbb{R}^d$. A multidimensional distribution is “shift-invariant” if, roughly speaking, it is close in total variation distance to a small shift of it in any direction. Shift-invariance relaxes smoothness assumptions commonly used in non-parametric density estimation to allow jump discontinuities. The different classes of distributions that we consider correspond to different rates of tail decay. For each such class we give an efficient algorithm that learns any distribution in the class from independent samples with respect to total variation distance. As a special case of our general result, we show that $d$-dimensional shift-invariant distributions which satisfy an exponential tail bound can be learned to total variation distance error $\epsilon$ using $\tilde{O}_d(1/\epsilon^{d+2})$ examples and $\tilde{O}_d(1/\epsilon^{2d+2})$ time. This implies that, for constant $d$, multivariate log-concave distributions can be learned in $\tilde{O}_d(1/\epsilon^{2d+2})$ time using $\tilde{O}_d(1/\epsilon^{d+2})$ samples, answering a question of [Diakonikolas, Kane and Stewart, 2016]. All of our results extend to a model of noise-tolerant density estimation using Huber’s contamination model, in which the target distribution to be learned is a $(1-\epsilon,\epsilon)$ mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error $O(\epsilon)$ from the target distribution. We show that our general results are close to best possible by proving a simple $\Omega\left(1/\epsilon^d\right)$ information-theoretic lower bound on sample complexity even for learning bounded distributions that are shift-invariant.
Tasks	Density Estimation
Published	2018-11-09
URL	http://arxiv.org/abs/1811.03744v1
PDF	http://arxiv.org/pdf/1811.03744v1.pdf
PWC	https://paperswithcode.com/paper/density-estimation-for-shift-invariant
Repo
Framework
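
The contamination model described in the abstract above can be written out as a short formula. The symbols $\mathcal{C}$, $q$, $r$, $p$ and $h$ are labels chosen here for illustration and are not taken from the paper:

$$
p = (1-\epsilon)\, q + \epsilon\, r, \qquad q \in \mathcal{C}, \quad r \text{ arbitrary}, \qquad d_{\mathrm{TV}}(h, p) = O(\epsilon).
$$

Here $q$ is the unknown distribution from the shift-invariant class $\mathcal{C}$, $r$ is the arbitrary contaminating distribution, $p$ is the target that the samples are drawn from, and $h$ is the hypothesis that the learning algorithm outputs.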

Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos


Title	Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos
Authors	Mingze Xu, Chenyou Fan, Yuchen Wang, Michael S Ryoo, David J Crandall
Abstract	In a world of pervasive cameras, public spaces are often captured from multiple perspectives by cameras of different types, both fixed and mobile. An important problem is to organize these heterogeneous collections of videos by finding connections between them, such as identifying correspondences between the people appearing in the videos and the people holding or wearing the cameras. In this paper, we wish to solve two specific problems: (1) given two or more synchronized third-person videos of a scene, produce a pixel-level segmentation of each visible person and identify corresponding people across different views (i.e., determine who in camera A corresponds with whom in camera B), and (2) given one or more synchronized third-person videos as well as a first-person video taken by a mobile or wearable camera, segment and identify the camera wearer in the third-person videos. Unlike previous work which requires ground truth bounding boxes to estimate the correspondences, we perform person segmentation and identification jointly. We find that solving these two problems simultaneously is mutually beneficial, because better fine-grained segmentation allows us to better perform matching across views, and information from multiple views helps us perform more accurate segmentation. We evaluate our approach on two challenging datasets of interacting people captured from multiple wearable cameras, and show that our proposed method performs significantly better than the state-of-the-art on both person segmentation and identification.
Tasks
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11217v2
PDF	http://arxiv.org/pdf/1803.11217v2.pdf
PWC	https://paperswithcode.com/paper/joint-person-segmentation-and-identification
Repo
Framework

A Rational Distributed Process-level Account of Independence Judgment


Title	A Rational Distributed Process-level Account of Independence Judgment
Authors	Ardavan S. Nobandegani, Ioannis N. Psaromiligkos
Abstract	It is inconceivable how chaotic the world would look to humans, faced with innumerable decisions a day to be made under uncertainty, had they been lacking the capacity to distinguish the relevant from the irrelevant—a capacity which computationally amounts to handling probabilistic independence relations. The highly parallel and distributed computational machinery of the brain suggests that a satisfying process-level account of human independence judgment should also mimic these features. In this work, we present the first rational, distributed, message-passing, process-level account of independence judgment, called $\mathcal{D}^\ast$. Interestingly, $\mathcal{D}^\ast$ shows a curious, but normatively-justified tendency for quick detection of dependencies, whenever they hold. Furthermore, $\mathcal{D}^\ast$ outperforms all the previously proposed algorithms in the AI literature in terms of worst-case running time, and a salient aspect of it is supported by recent work in neuroscience investigating possible implementations of Bayes nets at the neural level. $\mathcal{D}^\ast$ nicely exemplifies how the pursuit of cognitive plausibility can lead to the discovery of state-of-the-art algorithms with appealing properties, and its simplicity makes $\mathcal{D}^\ast$ potentially a good candidate for pedagogical purposes.
Tasks
Published	2018-01-30
URL	http://arxiv.org/abs/1801.10186v1
PDF	http://arxiv.org/pdf/1801.10186v1.pdf
PWC	https://paperswithcode.com/paper/a-rational-distributed-process-level-account
Repo
Framework

SIMCom: Statistical Sniffing of Inter-Module Communications for Run-time Hardware Trojan Detection


Title	SIMCom: Statistical Sniffing of Inter-Module Communications for Run-time Hardware Trojan Detection
Authors	Faiq Khalid, Syed Rafay Hasan, Osman Hasan, Falah Awwad, Muhammad Shafique
Abstract	Timely detection of Hardware Trojans (HT) has become a major challenge for secure integrated circuits. We present a run-time methodology for HT detection that employs multi-parameter statistical traffic modeling of the communication channel in a given System-on-Chip (SoC). To this end, it jointly leverages the Hurst exponent, the standard deviation of the injection distribution, and the hop distribution to accurately identify HT-based online anomalies. At design time, our methodology employs a property specification language to define and embed assertions in the RTL, specifying the correct communication behavior of a given SoC. At runtime, it monitors the anomalies in the communication behavior by checking the execution patterns against these assertions. We evaluate our methodology for detecting HTs in MC8051 microcontrollers. The experimental results show that with the combined analysis of multiple statistical parameters, our methodology is able to detect all the benchmark Trojans (available on trust-hub) inserted in MC8051, which directly or indirectly affect the communication channels in the SoC.
Tasks
Published	2018-11-04
URL	http://arxiv.org/abs/1901.07299v1
PDF	http://arxiv.org/pdf/1901.07299v1.pdf
PWC	https://paperswithcode.com/paper/simcom-statistical-sniffing-of-inter-module
Repo
Framework

A Framework for Complementary Companion Character Behavior in Video Games


Title	A Framework for Complementary Companion Character Behavior in Video Games
Authors	Gavin Scott, Foaad Khosmood
Abstract	We propose a game development framework capable of governing the behavior of complementary companions in a video game. A “complementary” action is contrasted with a mimicking action and is defined as any action by a friendly non-player character that furthers the player’s strategy. This is determined through a combination of both player action and game state prediction processes while allowing the AI companion to experiment. We determine the location of interest for companion actions based on a dynamic set of regions customized to the individual player. A user study shows promising results; a majority of participants familiar with game design react positively to the companion behavior, stating that they would consider using the framework in future games themselves.
Tasks
Published	2018-08-28
URL	http://arxiv.org/abs/1808.09079v1
PDF	http://arxiv.org/pdf/1808.09079v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-complementary-companion
Repo
Framework

On-Demand Video Dispatch Networks: A Scalable End-to-End Learning Approach


Title	On-Demand Video Dispatch Networks: A Scalable End-to-End Learning Approach
Authors	Damao Yang, Sihan Peng, He Huang, Hongliang Xue
Abstract	We design a dispatch system to improve the peak service quality of video on demand (VOD). Our system predicts the hot videos during the peak hours of the next day based on the historical requests, and dispatches them to the content delivery networks (CDNs) during the preceding off-peak time. In order to scale to billions of videos, we build the system with two neural networks, one for video clustering and the other for developing the dispatch policy. The clustering network employs autoencoder layers and reduces the number of videos to a fixed value. The policy network employs fully connected layers and ranks the clustered videos with dispatch probabilities. The two networks are coupled with weight-sharing temporal layers, which analyze the video request sequences with convolutional and recurrent modules. Therefore, the clustering and dispatch tasks are trained end to end. The real-world results show that our approach achieves an average prediction accuracy of 17%, compared with 3% from the present baseline method, for the same amount of dispatches.
Tasks
Published	2018-12-25
URL	http://arxiv.org/abs/1901.04295v1
PDF	http://arxiv.org/pdf/1901.04295v1.pdf
PWC	https://paperswithcode.com/paper/on-demand-video-dispatch-networks-a-scalable
Repo
Framework

Maximum Causal Tsallis Entropy Imitation Learning


Title	Maximum Causal Tsallis Entropy Imitation Learning
Authors	Kyungjae Lee, Sungjoon Choi, Songhwai Oh
Abstract	In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) framework for imitation learning which can efficiently learn a sparse multi-modal policy distribution from demonstrations. We provide the full mathematical analysis of the proposed framework. First, the optimal solution of an MCTE problem is shown to be a sparsemax distribution, whose supporting set can be adjusted. The proposed method has advantages over a softmax distribution in that it can exclude unnecessary actions by assigning zero probability. Second, we prove that an MCTE problem is equivalent to robust Bayes estimation in the sense of the Brier score. Third, we propose a maximum causal Tsallis entropy imitation learning (MCTEIL) algorithm with a sparse mixture density network (sparse MDN) by modeling mixture weights using a sparsemax distribution. In particular, we show that the causal Tsallis entropy of an MDN encourages exploration and efficient mixture utilization while Boltzmann-Gibbs entropy is less effective. We validate the proposed method in two simulation studies, in which MCTEIL outperforms existing imitation learning methods in terms of average returns and learning multi-modal policies.
Tasks	Imitation Learning
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08336v2
PDF	http://arxiv.org/pdf/1805.08336v2.pdf
PWC	https://paperswithcode.com/paper/maximum-causal-tsallis-entropy-imitation
Repo
Framework
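
The contrast the abstract above draws between sparsemax and softmax can be seen on a small example. This is a minimal sketch using the commonly described sort-and-threshold form of sparsemax (a Euclidean projection of the scores onto the probability simplex); it is not the MCTEIL algorithm, and the function names and input scores are illustrative assumptions.

```python
import math


def sparsemax(scores):
    """Project scores onto the probability simplex; low scores get exactly 0."""
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, z_k in enumerate(ordered, start=1):
        cumulative += z_k
        # The k largest scores stay in the support while this holds;
        # the threshold from the last k that passes is the one we keep.
        if 1.0 + k * z_k > cumulative:
            tau = (cumulative - 1.0) / k
    return [max(z - tau, 0.0) for z in scores]


def softmax(scores):
    """Standard softmax; every entry is strictly positive."""
    peak = max(scores)
    exps = [math.exp(z - peak) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, -1.0]
sparse = sparsemax(scores)   # [1.0, 0.0, 0.0]
dense = softmax(scores)      # all three entries are positive
```

With these scores sparsemax puts all probability on the first action and assigns exactly zero to the other two, whereas softmax leaves some probability on every action.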

Web-Scale Responsive Visual Search at Bing


Title	Web-Scale Responsive Visual Search at Bing
Authors	Houdong Hu, Yan Wang, Linjun Yang, Pavel Komlev, Li Huang, Xi Chen, Jiapei Huang, Ye Wu, Meenaz Merchant, Arun Sacheti
Abstract	In this paper, we introduce a web-scale general visual search system deployed in Microsoft Bing. The system accommodates tens of billions of images in the index, with thousands of features for each image, and can respond in less than 200 ms. In order to overcome the challenges in relevance, latency, and scalability at such a large scale of data, we employ a cascaded learning-to-rank framework based on a variety of recent deep learning visual features, and deploy it on a distributed heterogeneous computing platform. Quantitative and qualitative experiments show that our system is able to support various applications on the Bing website and apps.
Tasks	Learning-To-Rank
Published	2018-02-14
URL	http://arxiv.org/abs/1802.04914v2
PDF	http://arxiv.org/pdf/1802.04914v2.pdf
PWC	https://paperswithcode.com/paper/web-scale-responsive-visual-search-at-bing
Repo
Framework
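
The abstract above mentions a cascaded learning-to-rank framework without giving details. As a generic illustration of cascading only, and not a description of the Bing system, the sketch below lets a cheap scorer shortlist candidates before a costlier scorer reorders the shortlist. The function `cascade_rank`, the tuple layout, and the scores are all hypothetical.

```python
def cascade_rank(candidates, cheap_score, costly_score, keep):
    """Two-stage ranking: shortlist with a cheap scorer, rerank with a costly one."""
    # Stage 1: score everything cheaply and keep only the top `keep` items.
    shortlist = sorted(candidates, key=cheap_score, reverse=True)[:keep]
    # Stage 2: spend the expensive scorer on the shortlist only.
    return sorted(shortlist, key=costly_score, reverse=True)


# Each candidate: (name, cheap feature score, costly feature score).
images = [
    ("a", 0.9, 0.2),
    ("b", 0.8, 0.9),
    ("c", 0.1, 1.0),
    ("d", 0.7, 0.5),
]

ranked = cascade_rank(
    images,
    cheap_score=lambda item: item[1],
    costly_score=lambda item: item[2],
    keep=3,
)
names = [item[0] for item in ranked]   # ["b", "d", "a"]
```

Candidate "c" has the best costly score but is dropped at the first stage, which shows the trade-off a cascade makes: the expensive scorer runs on fewer items, at the risk of discarding good candidates early.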

A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification


Title	A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification
Authors	Zeyang Lei, Yujiu Yang, Min Yang, Yi Liu
Abstract	Deep learning approaches for sentiment classification do not fully exploit sentiment linguistic knowledge. In this paper, we propose a Multi-sentiment-resource Enhanced Attention Network (MEAN) to alleviate the problem by integrating three kinds of sentiment linguistic knowledge (e.g., sentiment lexicon, negation words, intensity words) into the deep neural network via attention mechanisms. By using various types of sentiment resources, MEAN utilizes sentiment-relevant information from different representation subspaces, which makes it more effective at capturing the overall semantics of the sentiment, negation and intensity words for sentiment prediction. The experimental results demonstrate that MEAN has robust superiority over strong competitors.
Tasks	Sentiment Analysis
Published	2018-07-13
URL	http://arxiv.org/abs/1807.04990v1
PDF	http://arxiv.org/pdf/1807.04990v1.pdf
PWC	https://paperswithcode.com/paper/a-multi-sentiment-resource-enhanced-attention
Repo
Framework

Molecular Dynamics with Neural-Network Potentials


Title	Molecular Dynamics with Neural-Network Potentials
Authors	Michael Gastegger, Philipp Marquetand
Abstract	Molecular dynamics simulations are an important tool for describing the evolution of a chemical system with time. However, these simulations are inherently held back either by the prohibitive cost of accurate electronic structure theory computations or the limited accuracy of classical empirical force fields. Machine learning techniques can help to overcome these limitations by providing access to potential energies, forces and other molecular properties modeled directly after an electronic structure reference at only a fraction of the original computational cost. The present text discusses several practical aspects of conducting machine learning driven molecular dynamics simulations. First, we study the efficient selection of reference data points on the basis of an active learning inspired adaptive sampling scheme. This is followed by the analysis of a machine-learning based model for simulating molecular dipole moments in the framework of predicting infrared spectra via molecular dynamics simulations. Finally, we show that machine learning models can offer valuable aid in understanding chemical systems beyond a simple prediction of quantities.
Tasks	Active Learning
Published	2018-12-18
URL	http://arxiv.org/abs/1812.07676v1
PDF	http://arxiv.org/pdf/1812.07676v1.pdf
PWC	https://paperswithcode.com/paper/molecular-dynamics-with-neural-network
Repo
Framework

Leveraging Implicit Spatial Information in Global Features for Image Retrieval


Title	Leveraging Implicit Spatial Information in Global Features for Image Retrieval
Authors	Pierre Jacob, David Picard, Aymeric Histace, Edouard Klein
Abstract	Most image retrieval methods use global features that aggregate local distinctive patterns into a single representation. However, the aggregation process destroys the relative spatial information by considering orderless sets of local descriptors. We propose to integrate relative spatial information into the aggregation process by taking into account co-occurrences of local patterns in a tensor framework. The resulting signature, called Improved Spatial Tensor Aggregation (ISTA), reaches state-of-the-art performance on well-known datasets such as Holidays, Oxford5k and Paris6k.
Tasks	Image Retrieval
Published	2018-06-23
URL	http://arxiv.org/abs/1806.08991v1
PDF	http://arxiv.org/pdf/1806.08991v1.pdf
PWC	https://paperswithcode.com/paper/leveraging-implicit-spatial-information-in
Repo
Framework