Paper Group ANR 178
Sum-Product Networks for Sequence Labeling
Title | Sum-Product Networks for Sequence Labeling |
Authors | Martin Ratajczak, Sebastian Tschiatschek, Franz Pernkopf |
Abstract | We consider higher-order linear-chain conditional random fields (HO-LC-CRFs) for sequence modelling, and use sum-product networks (SPNs) for representing higher-order input- and output-dependent factors. SPNs are a recently introduced class of deep models for which exact and efficient inference can be performed. By combining HO-LC-CRFs with SPNs, expressive models over both the output labels and the hidden variables are instantiated while still enabling efficient exact inference. Furthermore, the use of higher-order factors allows us to capture relations of multiple input segments and multiple output labels as often present in real-world data. These relations cannot be modelled by the commonly used first-order models and higher-order models with local factors including only a single output label. We demonstrate the effectiveness of our proposed models for sequence labeling. In extensive experiments, we outperform other state-of-the-art methods in optical character recognition and achieve competitive results in phone classification. |
Tasks | Optical Character Recognition |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02324v1 |
http://arxiv.org/pdf/1807.02324v1.pdf | |
PWC | https://paperswithcode.com/paper/sum-product-networks-for-sequence-labeling |
Repo | |
Framework | |
Lifted Marginal MAP Inference
Title | Lifted Marginal MAP Inference |
Authors | Vishal Sharma, Noman Ahmed Sheikh, Happy Mittal, Vibhav Gogate, Parag Singla |
Abstract | Lifted inference reduces the complexity of inference in relational probabilistic models by identifying groups of constants (or atoms) which behave symmetrically to each other. A number of techniques have been proposed in the literature for lifting marginal as well as MAP inference. We present the first application of lifting rules for marginal-MAP (MMAP), an important inference problem in models having latent (random) variables. Our main contribution is twofold: (1) we define a new equivalence class of (logical) variables, called Single Occurrence for MAX (SOM), and show that the solution lies at an extreme with respect to the SOM variables, i.e., predicate groundings differing only in the instantiation of the SOM variables take the same truth value; (2) we define a sub-class SOM-R (SOM Reduce) and exploit properties of extreme assignments to show that MMAP inference can be performed by reducing the domain of SOM-R variables to a single constant. We refer to our lifting technique as the SOM-R rule for lifted MMAP. Combined with existing rules such as decomposer and binomial, this results in a powerful framework for lifted MMAP. Experiments on three benchmark domains show significant gains in both time and memory compared to ground inference as well as lifted approaches not using SOM-R. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00589v2 |
http://arxiv.org/pdf/1807.00589v2.pdf | |
PWC | https://paperswithcode.com/paper/lifted-marginal-map-inference |
Repo | |
Framework | |
Attention, Please! Adversarial Defense via Attention Rectification and Preservation
Title | Attention, Please! Adversarial Defense via Attention Rectification and Preservation |
Authors | Shangxi Wu, Jitao Sang, Kaiyuan Xu, Jiaming Zhang, Yanfeng Sun, Liping Jing, Jian Yu |
Abstract | This study provides a new understanding of the adversarial attack problem by examining the correlation between adversarial attack and visual attention change. In particular, we observed that: (1) images with incomplete attention regions are more vulnerable to adversarial attacks; and (2) successful adversarial attacks lead to deviated and scattered attention maps. Accordingly, an attention-based adversarial defense framework is designed to simultaneously rectify the attention map for prediction and preserve the attention area between adversarial and original images. The problem of adding iteratively attacked samples is also discussed in the context of visual attention change. We hope the attention-related data analysis and defense solution in this study will shed some light on the mechanism behind the adversarial attack and also facilitate future adversarial defense/attack model design. |
Tasks | Adversarial Attack, Adversarial Defense |
Published | 2018-11-24 |
URL | https://arxiv.org/abs/1811.09831v2 |
https://arxiv.org/pdf/1811.09831v2.pdf | |
PWC | https://paperswithcode.com/paper/attention-please-adversarial-defense-via |
Repo | |
Framework | |
Multimodal Machine Translation with Reinforcement Learning
Title | Multimodal Machine Translation with Reinforcement Learning |
Authors | Xin Qian, Ziyi Zhong, Jieli Zhou |
Abstract | Multimodal machine translation is one of the applications that integrates computer vision and language processing. It is a unique task given that in the field of machine translation, many state-of-the-art algorithms still only employ textual information. In this work, we explore the effectiveness of reinforcement learning in multimodal machine translation. We present a novel algorithm based on the Advantage Actor-Critic (A2C) algorithm that specifically caters to the multimodal machine translation task of the EMNLP 2018 Third Conference on Machine Translation (WMT18). We evaluate our proposed algorithm on the Multi30K multilingual English-German image description dataset and the Flickr30K image entity dataset. Our model takes two channels of inputs, image and text, uses translation evaluation metrics as training rewards, and achieves better results than supervised learning MLE baseline models. Furthermore, we discuss the prospects and limitations of using reinforcement learning for machine translation. Our experimental results suggest a promising reinforcement learning solution to the general task of multimodal sequence-to-sequence learning. |
Tasks | Machine Translation, Multimodal Machine Translation |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02356v1 |
http://arxiv.org/pdf/1805.02356v1.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-machine-translation-with |
Repo | |
Framework | |
Density estimation for shift-invariant multidimensional distributions
Title | Density estimation for shift-invariant multidimensional distributions |
Authors | Anindya De, Philip M. Long, Rocco A. Servedio |
Abstract | We study density estimation for classes of shift-invariant distributions over $\mathbb{R}^d$. A multidimensional distribution is “shift-invariant” if, roughly speaking, it is close in total variation distance to a small shift of it in any direction. Shift-invariance relaxes smoothness assumptions commonly used in non-parametric density estimation to allow jump discontinuities. The different classes of distributions that we consider correspond to different rates of tail decay. For each such class we give an efficient algorithm that learns any distribution in the class from independent samples with respect to total variation distance. As a special case of our general result, we show that $d$-dimensional shift-invariant distributions which satisfy an exponential tail bound can be learned to total variation distance error $\epsilon$ using $\tilde{O}_d(1/\epsilon^{d+2})$ examples and $\tilde{O}_d(1/\epsilon^{2d+2})$ time. This implies that, for constant $d$, multivariate log-concave distributions can be learned in $\tilde{O}_d(1/\epsilon^{2d+2})$ time using $\tilde{O}_d(1/\epsilon^{d+2})$ samples, answering a question of [Diakonikolas, Kane and Stewart, 2016]. All of our results extend to a model of noise-tolerant density estimation using Huber’s contamination model, in which the target distribution to be learned is a $(1-\epsilon,\epsilon)$ mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error $O(\epsilon)$ from the target distribution. We show that our general results are close to best possible by proving a simple $\Omega\left(1/\epsilon^d\right)$ information-theoretic lower bound on sample complexity even for learning bounded distributions that are shift-invariant. |
Tasks | Density Estimation |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03744v1 |
http://arxiv.org/pdf/1811.03744v1.pdf | |
PWC | https://paperswithcode.com/paper/density-estimation-for-shift-invariant |
Repo | |
Framework | |
Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos
Title | Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos |
Authors | Mingze Xu, Chenyou Fan, Yuchen Wang, Michael S Ryoo, David J Crandall |
Abstract | In a world of pervasive cameras, public spaces are often captured from multiple perspectives by cameras of different types, both fixed and mobile. An important problem is to organize these heterogeneous collections of videos by finding connections between them, such as identifying correspondences between the people appearing in the videos and the people holding or wearing the cameras. In this paper, we wish to solve two specific problems: (1) given two or more synchronized third-person videos of a scene, produce a pixel-level segmentation of each visible person and identify corresponding people across different views (i.e., determine who in camera A corresponds with whom in camera B), and (2) given one or more synchronized third-person videos as well as a first-person video taken by a mobile or wearable camera, segment and identify the camera wearer in the third-person videos. Unlike previous work which requires ground truth bounding boxes to estimate the correspondences, we perform person segmentation and identification jointly. We find that solving these two problems simultaneously is mutually beneficial, because better fine-grained segmentation allows us to better perform matching across views, and information from multiple views helps us perform more accurate segmentation. We evaluate our approach on two challenging datasets of interacting people captured from multiple wearable cameras, and show that our proposed method performs significantly better than the state-of-the-art on both person segmentation and identification. |
Tasks | |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.11217v2 |
http://arxiv.org/pdf/1803.11217v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-person-segmentation-and-identification |
Repo | |
Framework | |
A Rational Distributed Process-level Account of Independence Judgment
Title | A Rational Distributed Process-level Account of Independence Judgment |
Authors | Ardavan S. Nobandegani, Ioannis N. Psaromiligkos |
Abstract | It is inconceivable how chaotic the world would look to humans, faced with innumerable decisions a day to be made under uncertainty, had they been lacking the capacity to distinguish the relevant from the irrelevant—a capacity which computationally amounts to handling probabilistic independence relations. The highly parallel and distributed computational machinery of the brain suggests that a satisfying process-level account of human independence judgment should also mimic these features. In this work, we present the first rational, distributed, message-passing, process-level account of independence judgment, called $\mathcal{D}^\ast$. Interestingly, $\mathcal{D}^\ast$ shows a curious, but normatively-justified tendency for quick detection of dependencies, whenever they hold. Furthermore, $\mathcal{D}^\ast$ outperforms all the previously proposed algorithms in the AI literature in terms of worst-case running time, and a salient aspect of it is supported by recent work in neuroscience investigating possible implementations of Bayes nets at the neural level. $\mathcal{D}^\ast$ nicely exemplifies how the pursuit of cognitive plausibility can lead to the discovery of state-of-the-art algorithms with appealing properties, and its simplicity makes $\mathcal{D}^\ast$ potentially a good candidate for pedagogical purposes. |
Tasks | |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.10186v1 |
http://arxiv.org/pdf/1801.10186v1.pdf | |
PWC | https://paperswithcode.com/paper/a-rational-distributed-process-level-account |
Repo | |
Framework | |
SIMCom: Statistical Sniffing of Inter-Module Communications for Run-time Hardware Trojan Detection
Title | SIMCom: Statistical Sniffing of Inter-Module Communications for Run-time Hardware Trojan Detection |
Authors | Faiq Khalid, Syed Rafay Hasan, Osman Hasan, Falah Awwad, Muhammad Shafique |
Abstract | Timely detection of Hardware Trojans (HT) has become a major challenge for secure integrated circuits. We present a run-time methodology for HT detection that employs multi-parameter statistical traffic modeling of the communication channel in a given System-on-Chip (SoC). To this end, it jointly leverages the Hurst exponent, the standard deviation of the injection distribution, and the hop distribution to accurately identify HT-based online anomalies. At design time, our methodology employs a property specification language to define and embed assertions in the RTL, specifying the correct communication behavior of a given SoC. At run time, it monitors the anomalies in the communication behavior by checking the execution patterns against these assertions. We evaluate our methodology for detecting HTs in MC8051 microcontrollers. The experimental results show that with the combined analysis of multiple statistical parameters, our methodology is able to detect all the benchmark Trojans (available on trust-hub) inserted in MC8051, which directly or indirectly affect the communication channels in the SoC. |
Tasks | |
Published | 2018-11-04 |
URL | http://arxiv.org/abs/1901.07299v1 |
http://arxiv.org/pdf/1901.07299v1.pdf | |
PWC | https://paperswithcode.com/paper/simcom-statistical-sniffing-of-inter-module |
Repo | |
Framework | |
A Framework for Complementary Companion Character Behavior in Video Games
Title | A Framework for Complementary Companion Character Behavior in Video Games |
Authors | Gavin Scott, Foaad Khosmood |
Abstract | We propose a game development framework capable of governing the behavior of complementary companions in a video game. A “complementary” action is contrasted with a mimicking action and is defined as any action by a friendly non-player character that furthers the player’s strategy. This is determined through a combination of both player action and game state prediction processes while allowing the AI companion to experiment. We determine the location of interest for companion actions based on a dynamic set of regions customized to the individual player. A user study shows promising results; a majority of participants familiar with game design react positively to the companion behavior, stating that they would consider using the framework in future games themselves. |
Tasks | |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09079v1 |
http://arxiv.org/pdf/1808.09079v1.pdf | |
PWC | https://paperswithcode.com/paper/a-framework-for-complementary-companion |
Repo | |
Framework | |
On-Demand Video Dispatch Networks: A Scalable End-to-End Learning Approach
Title | On-Demand Video Dispatch Networks: A Scalable End-to-End Learning Approach |
Authors | Damao Yang, Sihan Peng, He Huang, Hongliang Xue |
Abstract | We design a dispatch system to improve the peak service quality of video on demand (VOD). Our system predicts the hot videos during the peak hours of the next day based on the historical requests, and dispatches them to the content delivery networks (CDNs) during the preceding off-peak time. In order to scale to billions of videos, we build the system with two neural networks, one for video clustering and the other for developing the dispatch policy. The clustering network employs autoencoder layers and reduces the number of videos to a fixed value. The policy network employs fully connected layers and ranks the clustered videos with dispatch probabilities. The two networks are coupled with weight-sharing temporal layers, which analyze the video request sequences with convolutional and recurrent modules. Therefore, the clustering and dispatch tasks are trained end to end. The real-world results show that our approach achieves an average prediction accuracy of 17%, compared with 3% from the present baseline method, for the same amount of dispatches. |
Tasks | |
Published | 2018-12-25 |
URL | http://arxiv.org/abs/1901.04295v1 |
http://arxiv.org/pdf/1901.04295v1.pdf | |
PWC | https://paperswithcode.com/paper/on-demand-video-dispatch-networks-a-scalable |
Repo | |
Framework | |
Maximum Causal Tsallis Entropy Imitation Learning
Title | Maximum Causal Tsallis Entropy Imitation Learning |
Authors | Kyungjae Lee, Sungjoon Choi, Songhwai Oh |
Abstract | In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) framework for imitation learning which can efficiently learn a sparse multi-modal policy distribution from demonstrations. We provide the full mathematical analysis of the proposed framework. First, the optimal solution of an MCTE problem is shown to be a sparsemax distribution, whose supporting set can be adjusted. The proposed method has advantages over a softmax distribution in that it can exclude unnecessary actions by assigning zero probability. Second, we prove that an MCTE problem is equivalent to robust Bayes estimation in the sense of the Brier score. Third, we propose a maximum causal Tsallis entropy imitation learning (MCTEIL) algorithm with a sparse mixture density network (sparse MDN) by modeling mixture weights using a sparsemax distribution. In particular, we show that the causal Tsallis entropy of an MDN encourages exploration and efficient mixture utilization while Boltzmann-Gibbs entropy is less effective. We validate the proposed method in two simulation studies, in which MCTEIL outperforms existing imitation learning methods in terms of average returns and learning multi-modal policies. |
Tasks | Imitation Learning |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08336v2 |
http://arxiv.org/pdf/1805.08336v2.pdf | |
PWC | https://paperswithcode.com/paper/maximum-causal-tsallis-entropy-imitation |
Repo | |
Framework | |
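The abstract above contrasts a sparsemax distribution with a softmax distribution: sparsemax can assign exactly zero probability to some actions, while softmax cannot. The following is a minimal sketch of that contrast only, not the authors' MCTEIL implementation. It uses the standard sparsemax definition (Euclidean projection of a score vector onto the probability simplex); the function names and the example scores are illustrative assumptions.

```python
import math


def softmax(scores):
    """Standard softmax; every entry of the result is strictly positive."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of `scores` onto the probability simplex.

    Entries that fall below the threshold tau receive exactly zero mass.
    """
    z_sorted = sorted(scores, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumulative += z
        # The k-th largest score belongs to the support while this holds;
        # the condition holds for a prefix of the sorted scores, so the
        # last update of tau corresponds to the full support.
        if 1.0 + k * z > cumulative:
            tau = (cumulative - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    action_scores = [1.0, 0.8, -1.0]  # hypothetical action preferences
    print("softmax  :", softmax(action_scores))
    print("sparsemax:", sparsemax(action_scores))
```

With these illustrative scores, the third action keeps a small positive probability under softmax but is excluded entirely under sparsemax, which is the property the abstract describes as excluding unnecessary actions.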
Web-Scale Responsive Visual Search at Bing
Title | Web-Scale Responsive Visual Search at Bing |
Authors | Houdong Hu, Yan Wang, Linjun Yang, Pavel Komlev, Li Huang, Xi Chen, Jiapei Huang, Ye Wu, Meenaz Merchant, Arun Sacheti |
Abstract | In this paper, we introduce a web-scale general visual search system deployed in Microsoft Bing. The system accommodates tens of billions of images in the index, with thousands of features for each image, and can respond in less than 200 ms. In order to overcome the challenges in relevance, latency, and scalability at such a large scale of data, we employ a cascaded learning-to-rank framework based on a variety of recent deep learning visual features, and deploy it on a distributed heterogeneous computing platform. Quantitative and qualitative experiments show that our system is able to support various applications on the Bing website and apps. |
Tasks | Learning-To-Rank |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04914v2 |
http://arxiv.org/pdf/1802.04914v2.pdf | |
PWC | https://paperswithcode.com/paper/web-scale-responsive-visual-search-at-bing |
Repo | |
Framework | |
A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification
Title | A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification |
Authors | Zeyang Lei, Yujiu Yang, Min Yang, Yi Liu |
Abstract | Deep learning approaches for sentiment classification do not fully exploit sentiment linguistic knowledge. In this paper, we propose a Multi-sentiment-resource Enhanced Attention Network (MEAN) to alleviate the problem by integrating three kinds of sentiment linguistic knowledge (sentiment lexicon, negation words, and intensity words) into the deep neural network via attention mechanisms. By using various types of sentiment resources, MEAN utilizes sentiment-relevant information from different representation subspaces, which makes it more effective at capturing the overall semantics of the sentiment, negation and intensity words for sentiment prediction. The experimental results demonstrate that MEAN consistently outperforms strong competitors. |
Tasks | Sentiment Analysis |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.04990v1 |
http://arxiv.org/pdf/1807.04990v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-sentiment-resource-enhanced-attention |
Repo | |
Framework | |
Molecular Dynamics with Neural-Network Potentials
Title | Molecular Dynamics with Neural-Network Potentials |
Authors | Michael Gastegger, Philipp Marquetand |
Abstract | Molecular dynamics simulations are an important tool for describing the evolution of a chemical system with time. However, these simulations are inherently held back either by the prohibitive cost of accurate electronic structure theory computations or the limited accuracy of classical empirical force fields. Machine learning techniques can help to overcome these limitations by providing access to potential energies, forces and other molecular properties modeled directly after an electronic structure reference at only a fraction of the original computational cost. The present text discusses several practical aspects of conducting machine learning driven molecular dynamics simulations. First, we study the efficient selection of reference data points on the basis of an active learning inspired adaptive sampling scheme. This is followed by the analysis of a machine-learning based model for simulating molecular dipole moments in the framework of predicting infrared spectra via molecular dynamics simulations. Finally, we show that machine learning models can offer valuable aid in understanding chemical systems beyond a simple prediction of quantities. |
Tasks | Active Learning |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07676v1 |
http://arxiv.org/pdf/1812.07676v1.pdf | |
PWC | https://paperswithcode.com/paper/molecular-dynamics-with-neural-network |
Repo | |
Framework | |
Leveraging Implicit Spatial Information in Global Features for Image Retrieval
Title | Leveraging Implicit Spatial Information in Global Features for Image Retrieval |
Authors | Pierre Jacob, David Picard, Aymeric Histace, Edouard Klein |
Abstract | Most image retrieval methods use global features that aggregate local distinctive patterns into a single representation. However, the aggregation process destroys the relative spatial information by considering orderless sets of local descriptors. We propose to integrate relative spatial information into the aggregation process by taking into account co-occurrences of local patterns in a tensor framework. The resulting signature, called Improved Spatial Tensor Aggregation (ISTA), reaches state-of-the-art performance on well-known datasets such as Holidays, Oxford5k and Paris6k. |
Tasks | Image Retrieval |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.08991v1 |
http://arxiv.org/pdf/1806.08991v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-implicit-spatial-information-in |
Repo | |
Framework | |