July 28, 2019

Paper Group ANR 209

Learning to Remember Translation History with a Continuous Cache. Bandit Models of Human Behavior: Reward Processing in Mental Disorders. Contrast Enhancement Estimation for Digital Image Forensics. Semi-Automatic Terminology Ontology Learning Based on Topic Modeling. MNL-Bandit: A Dynamic Learning Approach to Assortment Selection. The YouTube-8M K …

Learning to Remember Translation History with a Continuous Cache


Title	Learning to Remember Translation History with a Continuous Cache
Authors	Zhaopeng Tu, Yang Liu, Shuming Shi, Tong Zhang
Abstract	Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information. In this work, we propose to augment NMT models with a very light-weight cache-like memory network, which stores recent hidden representations as translation history. The probability distribution over generated words is updated online depending on the translation history retrieved from the memory, endowing NMT models with the capability to dynamically adapt over time. Experiments on multiple domains with different topics and styles show the effectiveness of the proposed approach with negligible impact on the computational cost.
Tasks	Machine Translation
Published	2017-11-26
URL	http://arxiv.org/abs/1711.09367v1
PDF	http://arxiv.org/pdf/1711.09367v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-remember-translation-history-with
Repo
Framework
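
The cache idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' architecture: the class name `ContinuousCache`, the dot-product matching, and the fixed interpolation weight `mix` are all assumptions made for illustration. The sketch stores recent hidden states with the words they produced, reads the cache with the current state as a query, and blends the result into the model's word distribution.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ContinuousCache:
    """Fixed-size memory of (hidden state, emitted word id) pairs (hypothetical)."""

    def __init__(self, capacity, vocab_size):
        self.entries = deque(maxlen=capacity)  # oldest entries are evicted first
        self.vocab_size = vocab_size

    def write(self, hidden, word_id):
        self.entries.append((list(hidden), word_id))

    def read(self, query):
        """Return a distribution over the vocabulary built from cache matches."""
        if not self.entries:
            return [0.0] * self.vocab_size
        scores = [sum(q * k for q, k in zip(query, key)) for key, _ in self.entries]
        weights = softmax(scores)
        dist = [0.0] * self.vocab_size
        for weight, (_, word_id) in zip(weights, self.entries):
            dist[word_id] += weight
        return dist


def adapt(p_model, cache, query, mix=0.3):
    """Interpolate the model distribution with the cache distribution."""
    p_cache = cache.read(query)
    if sum(p_cache) == 0.0:  # empty cache: fall back to the model alone
        return list(p_model)
    return [(1.0 - mix) * pm + mix * pc for pm, pc in zip(p_model, p_cache)]
```

Because the cache is only read and appended to at decoding time, a lookup costs one dot product per stored entry, which is in the spirit of the "negligible impact on the computational cost" claim.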

Bandit Models of Human Behavior: Reward Processing in Mental Disorders


Title	Bandit Models of Human Behavior: Reward Processing in Mental Disorders
Authors	Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi
Abstract	Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson’s and Alzheimer’s diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
Tasks	Decision Making
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02897v1
PDF	http://arxiv.org/pdf/1706.02897v1.pdf
PWC	https://paperswithcode.com/paper/bandit-models-of-human-behavior-reward
Repo
Framework
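
For readers unfamiliar with the baseline, here is a minimal sketch of Bernoulli Thompson Sampling with Beta posteriors. The two parameters `pos_weight` and `neg_weight` are hypothetical knobs, introduced here only to suggest how a reward-processing bias could be expressed; they are not taken from the paper, and setting both to 1.0 gives the standard algorithm.

```python
import random


def thompson_sampling(arm_probs, rounds, pos_weight=1.0, neg_weight=1.0, seed=0):
    """Bernoulli Thompson Sampling with Beta(1, 1) priors.

    pos_weight and neg_weight (assumptions, not from the paper) scale how
    strongly successes and failures update the posterior.
    """
    rng = random.Random(seed)
    successes = [1.0] * len(arm_probs)  # Beta alpha parameters
    failures = [1.0] * len(arm_probs)   # Beta beta parameters
    pulls = [0] * len(arm_probs)
    for _ in range(rounds):
        # Draw one plausible success rate per arm and play the best draw.
        samples = [rng.betavariate(a, b) for a, b in zip(successes, failures)]
        arm = max(range(len(arm_probs)), key=samples.__getitem__)
        reward = 1 if rng.random() < arm_probs[arm] else 0
        if reward:
            successes[arm] += pos_weight
        else:
            failures[arm] += neg_weight
        pulls[arm] += 1
    return pulls
```

Calling `thompson_sampling([0.1, 0.9], 2000)` returns the number of pulls per arm; with such a large gap between the arms, the second arm should receive most of them.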

Contrast Enhancement Estimation for Digital Image Forensics


Title	Contrast Enhancement Estimation for Digital Image Forensics
Authors	Longyin Wen, Honggang Qi, Siwei Lyu
Abstract	Inconsistency in contrast enhancement can be used to expose image forgeries. In this work, we describe a new method to estimate contrast enhancement from a single image. Our method takes advantage of the nature of contrast enhancement as a mapping between pixel values, and the distinct characteristics it introduces to the image pixel histogram. Our method recovers the original pixel histogram and the contrast enhancement simultaneously from a single image with an iterative algorithm. Unlike previous methods, our method is robust in the presence of additive noise perturbations that are used to hide the traces of contrast enhancement. Furthermore, we also develop an effective method to detect image regions that have undergone contrast enhancement transformations different from the rest of the image, and use this method to detect composite images. We perform extensive experimental evaluations to demonstrate the efficacy and efficiency of our method.
Tasks
Published	2017-06-13
URL	http://arxiv.org/abs/1706.03875v1
PDF	http://arxiv.org/pdf/1706.03875v1.pdf
PWC	https://paperswithcode.com/paper/contrast-enhancement-estimation-for-digital
Repo
Framework
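
The "distinct characteristics" that a pixel-value mapping leaves in the histogram can be seen in a small sketch. It assumes gamma correction on 8-bit intensities as the contrast enhancement, which is one common choice and not necessarily the one studied in the paper; the function names are hypothetical. It shows only the histogram artifact (empty bins), not the authors' iterative estimation algorithm.

```python
def gamma_map(value, gamma):
    """Map an 8-bit intensity through a gamma curve and re-quantize it."""
    return round(255 * (value / 255) ** gamma)


def histogram_after_enhancement(pixels, gamma):
    """256-bin histogram of the pixels after the contrast mapping."""
    hist = [0] * 256
    for p in pixels:
        hist[gamma_map(p, gamma)] += 1
    return hist


def empty_bins(hist):
    """Count intensity levels that no pixel occupies."""
    return sum(1 for count in hist if count == 0)
```

Starting from a flat histogram such as `list(range(256)) * 4`, a gamma other than 1 stretches some intensity ranges and compresses others, so some output levels are never hit while others collect several input levels.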

Semi-Automatic Terminology Ontology Learning Based on Topic Modeling


Title	Semi-Automatic Terminology Ontology Learning Based on Topic Modeling
Authors	Monika Rani, Amit Kumar Dhar, O. P. Vyas
Abstract	Ontologies provide features such as a common vocabulary, reusability, and machine-readable content, and they also allow for semantic search, facilitate agent interaction, and support the ordering and structuring of knowledge for Semantic Web (Web 3.0) applications. However, the challenge in ontology engineering is automatic learning, i.e., there is still no fully automatic approach that uses machine learning techniques to form an ontology from a text corpus or dataset of various topics. In this paper, two topic modeling algorithms, namely LSI & SVD and Mr.LDA, are explored for learning a topic ontology. The objective is to determine the statistical relationship between documents and terms to build a topic ontology and ontology graph with minimum human intervention. Experimental analysis on building a topic ontology and semantically retrieving the corresponding topic ontology for the user’s query demonstrates the effectiveness of the proposed approach.
Tasks
Published	2017-08-05
URL	http://arxiv.org/abs/1709.01991v1
PDF	http://arxiv.org/pdf/1709.01991v1.pdf
PWC	https://paperswithcode.com/paper/semi-automatic-terminology-ontology-learning
Repo
Framework

MNL-Bandit: A Dynamic Learning Approach to Assortment Selection


Title	MNL-Bandit: A Dynamic Learning Approach to Assortment Selection
Authors	Shipra Agrawal, Vashist Avadhanula, Vineet Goyal, Assaf Zeevi
Abstract	We consider a dynamic assortment selection problem, where in every round the retailer offers a subset (assortment) of $N$ substitutable products to a consumer, who selects one of these products according to a multinomial logit (MNL) choice model. The retailer observes this choice and the objective is to dynamically learn the model parameters, while optimizing cumulative revenues over a selling horizon of length $T$. We refer to this exploration-exploitation formulation as the MNL-Bandit problem. Existing methods for this problem follow an “explore-then-exploit” approach, which estimates parameters to a desired accuracy and then, treating these estimates as if they were the correct parameter values, offers the optimal assortment based on these estimates. These approaches require certain a priori knowledge of “separability”, determined by the true parameters of the underlying MNL model, and this in turn is critical in determining the length of the exploration period. (Separability refers to the distinguishability of the true optimal assortment from the other sub-optimal alternatives.) In this paper, we give an efficient algorithm that simultaneously explores and exploits, achieving performance independent of the underlying parameters. The algorithm can be implemented in a fully online manner, without knowledge of the horizon length $T$. Furthermore, the algorithm is adaptive in the sense that its performance is near-optimal in both the “well separated” case and the general parameter setting where this separation need not hold.
Tasks
Published	2017-06-13
URL	http://arxiv.org/abs/1706.03880v2
PDF	http://arxiv.org/pdf/1706.03880v2.pdf
PWC	https://paperswithcode.com/paper/mnl-bandit-a-dynamic-learning-approach-to
Repo
Framework
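
The MNL choice model at the heart of the problem can be written down in a few lines. The sketch below assumes the usual normalization in which the no-purchase option has attraction 1, and it treats the parameters `v` as known. The cardinality limit `max_size` and the brute-force search are assumptions for illustration only; the sketch does not implement the paper's learning algorithm.

```python
from itertools import combinations


def choice_probabilities(assortment, v):
    """MNL purchase probabilities for the offered products.

    v[i] is the attraction parameter of product i; the no-purchase option
    has attraction 1 by the usual normalization.
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    return {i: v[i] / denom for i in assortment}


def expected_revenue(assortment, v, r):
    """Expected revenue of one round when `assortment` is offered."""
    probs = choice_probabilities(assortment, v)
    return sum(r[i] * p for i, p in probs.items())


def best_assortment(v, r, max_size):
    """Brute-force the revenue-maximizing assortment of at most max_size items."""
    products = range(len(v))
    candidates = (
        subset
        for size in range(1, max_size + 1)
        for subset in combinations(products, size)
    )
    return max(candidates, key=lambda s: expected_revenue(s, v, r))
```

In the bandit setting `v` is unknown, so an algorithm has to estimate it from observed choices while still offering assortments that earn revenue; the functions above only describe what is being optimized.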

The YouTube-8M Kaggle Competition: Challenges and Methods


Title	The YouTube-8M Kaggle Competition: Challenges and Methods
Authors	Haosheng Zou, Kun Xu, Jialian Li, Jun Zhu
Abstract	We took part in the YouTube-8M Video Understanding Challenge hosted on Kaggle, and achieved 10th place within less than one month’s time. In this paper, we present an extensive analysis and solution to the underlying machine-learning problem based on frame-level data, where major challenges are identified and corresponding preliminary methods are proposed. Notably, the proposed strategies combined with a uniformly averaged multi-crop ensemble were sufficient for us to reach our ranking. We also report the methods we believe to be promising but didn’t have enough time to train to convergence. We hope this paper could serve, to some extent, as a review and guideline of the YouTube-8M multi-label video classification benchmark, inspiring future attempts and research.
Tasks	Video Classification, Video Understanding
Published	2017-06-28
URL	http://arxiv.org/abs/1706.09274v2
PDF	http://arxiv.org/pdf/1706.09274v2.pdf
PWC	https://paperswithcode.com/paper/the-youtube-8m-kaggle-competition-challenges
Repo
Framework
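
Uniform averaging, the ensembling strategy mentioned in the abstract, amounts to a plain mean of per-class scores. The sketch below is a generic illustration with hypothetical function names; it says nothing about how the individual models or crops were produced.

```python
def average_ensemble(predictions):
    """Uniformly average per-class scores from several models.

    predictions[m][c] is model m's score for class c on one video.
    """
    n_models = len(predictions)
    n_classes = len(predictions[0])
    return [
        sum(model[c] for model in predictions) / n_models
        for c in range(n_classes)
    ]


def top_k(scores, k):
    """Indices of the k highest-scoring classes, best first."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]
```

Since the benchmark is multi-label, the averaged scores would typically be ranked and the top entries kept per video, which is what `top_k` does.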

Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data


Title	Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data
Authors	Anurag Kumar, Bhiksha Raj
Abstract	The development of audio event recognition models requires labeled training data, which are generally hard to obtain. One promising source of recordings of audio events is the large amount of multimedia data on the web. In particular, if the audio content analysis must itself be performed on web audio, it is important to train the recognizers themselves from such data. Training from these web data, however, poses several challenges, the most important being the availability of labels: labels, if any, that may be obtained for the data are generally weak, and not of the kind conventionally required for training detectors or classifiers. We propose that learning algorithms that can exploit weak labels offer an effective method to learn from web data. We then propose a robust and efficient deep convolutional neural network (CNN) based framework to learn audio event recognizers from weakly labeled data. The proposed method can train from and analyze recordings of variable length in an efficient manner and outperforms a network trained with strongly labeled web data by a considerable margin.
Tasks
Published	2017-07-09
URL	http://arxiv.org/abs/1707.02530v2
PDF	http://arxiv.org/pdf/1707.02530v2.pdf
PWC	https://paperswithcode.com/paper/deep-cnn-framework-for-audio-event
Repo
Framework

An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform


Title	An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform
Authors	Zhenzhen Zhong, Shujiao Huang, Cheng Zhan, Licheng Zhang, Zhiwei Xiao, Chang-Chun Wang, Pei Yang
Abstract	Large-scale datasets have played a significant role in the progress of neural networks and deep learning. YouTube-8M is such a benchmark dataset for general multi-label video classification. It was created from over 7 million YouTube videos (450,000 hours of video) and includes video labels from a vocabulary of 4716 classes (3.4 labels/video on average). It also comes with pre-extracted audio & visual features from every second of video (3.2 billion feature vectors in total). Google Cloud recently released the datasets and organized the ‘Google Cloud & YouTube-8M Video Understanding Challenge’ on Kaggle. Competitors are challenged to develop classification algorithms that assign video-level labels using the new and improved YouTube-8M V2 dataset. Inspired by the competition, we began exploring audio understanding and classification using deep learning algorithms and ensemble methods. We built several baseline predictions according to the benchmark paper and the public GitHub TensorFlow code. Furthermore, we improved global prediction accuracy (GAP) from the base level of 77% to 80.7% through ensemble approaches.
Tasks	Video Classification, Video Understanding
Published	2017-06-26
URL	http://arxiv.org/abs/1706.08217v1
PDF	http://arxiv.org/pdf/1706.08217v1.pdf
PWC	https://paperswithcode.com/paper/an-effective-way-to-improve-youtube-8m
Repo
Framework

Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition


Title	Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition
Authors	Christian Rupprecht, Ansh Kapil, Nan Liu, Lamberto Ballan, Federico Tombari
Abstract	Webly-supervised learning has recently emerged as an alternative paradigm to traditional supervised learning based on large-scale datasets with manual annotations. The key idea is that models such as CNNs can be learned from the noisy visual data available on the web. In this work we aim to exploit web data for video understanding tasks such as action recognition and detection. One of the main problems in webly-supervised learning is cleaning the noisy labeled data from the web. The state-of-the-art paradigm relies on training a first classifier on noisy data that is then used to clean the remaining dataset. Our key insight is that this procedure biases the second classifier towards samples that the first one understands. Here we train two independent CNNs, an RGB network on web images and video frames and a second network using temporal information from optical flow. We show that training the networks independently is vastly superior to selecting the frames for the flow classifier by using our RGB network. Moreover, we show benefits in enriching the training set with different data sources from heterogeneous public web databases. We demonstrate that our framework outperforms all other webly-supervised methods on two public benchmarks, UCF-101 and Thumos’14.
Tasks	Optical Flow Estimation, Temporal Action Localization, Video Understanding
Published	2017-06-14
URL	http://arxiv.org/abs/1706.04589v1
PDF	http://arxiv.org/pdf/1706.04589v1.pdf
PWC	https://paperswithcode.com/paper/learning-without-prejudice-avoiding-bias-in
Repo
Framework

Action Understanding with Multiple Classes of Actors


Title	Action Understanding with Multiple Classes of Actors
Authors	Chenliang Xu, Caiming Xiong, Jason J. Corso
Abstract	Despite the rapid progress, existing works on action understanding focus strictly on one type of action agent, which we call the actor (a human adult), ignoring the diversity of actions performed by other actors. To overcome this narrow viewpoint, our paper marks the first effort in the computer vision community to jointly consider algorithmic understanding of various types of actors undergoing various actions. To begin with, we collect a large annotated Actor-Action Dataset (A2D) that consists of 3782 short videos and 31 temporally untrimmed long videos. We formulate the general actor-action understanding problem and instantiate it at various granularities: video-level single- and multiple-label actor-action recognition, and pixel-level actor-action segmentation. We propose and examine a comprehensive set of graphical models that consider the various types of interplay among actors and actions. Our findings have led us to conclusive evidence that the joint modeling of actor and action improves performance over modeling each of them independently, and further improvement can be obtained by considering the multi-scale nature of video understanding. Hence, our paper concludes the argument of the value of explicit consideration of various actors in comprehensive action understanding and provides a dataset and a benchmark for later works exploring this new problem.
Tasks	Action Segmentation, Temporal Action Localization, Video Understanding
Published	2017-04-27
URL	http://arxiv.org/abs/1704.08723v1
PDF	http://arxiv.org/pdf/1704.08723v1.pdf
PWC	https://paperswithcode.com/paper/action-understanding-with-multiple-classes-of
Repo
Framework

A Reproducible Study on Remote Heart Rate Measurement


Title	A Reproducible Study on Remote Heart Rate Measurement
Authors	Guillaume Heusch, André Anjos, Sébastien Marcel
Abstract	This paper studies the problem of reproducible research in remote photoplethysmography (rPPG). Most of the work published in this domain is assessed on privately-owned databases, making it difficult to evaluate proposed algorithms in a standard and principled manner. As a consequence, we present a new, publicly available database containing a relatively large number of subjects recorded under two different lighting conditions. Also, three state-of-the-art rPPG algorithms from the literature were selected, implemented and released as open source free software. After a thorough, unbiased experimental evaluation in various settings, it is shown that none of the selected algorithms is precise enough to be used in a real-world scenario.
Tasks
Published	2017-09-04
URL	http://arxiv.org/abs/1709.00962v1
PDF	http://arxiv.org/pdf/1709.00962v1.pdf
PWC	https://paperswithcode.com/paper/a-reproducible-study-on-remote-heart-rate
Repo
Framework

How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?


Title	How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?
Authors	Georg Heigold, Günter Neumann, Josef van Genabith
Abstract	This paper investigates the robustness of NLP against perturbed word forms. While neural approaches can achieve (almost) human-like accuracy for certain tasks and conditions, they are often sensitive to small changes in the input such as non-canonical input (e.g., typos). Yet both stability and robustness are desired properties in applications involving user-generated content, all the more so as humans cope easily with such noisy or adversarial conditions. In this paper, we study the impact of noisy input. We consider different noise distributions (one type of noise, combination of noise types) and mismatched noise distributions for training and testing. Moreover, we empirically evaluate the robustness of different models (convolutional neural networks, recurrent neural networks, non-neural models), different basic units (characters, byte pair encoding units), and different NLP tasks (morphological tagging, machine translation).
Tasks	Machine Translation, Morphological Tagging, Word Embeddings
Published	2017-04-14
URL	http://arxiv.org/abs/1704.04441v1
PDF	http://arxiv.org/pdf/1704.04441v1.pdf
PWC	https://paperswithcode.com/paper/how-robust-are-character-based-word
Repo
Framework
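
The kinds of perturbation named in the title, word scrambling and random character noise, can be sketched as follows. The exact noise definitions and rates used in the paper are not given in the abstract, so the choices here (keep the first and last character when scrambling, replace characters with a fixed probability `rate`, a lowercase Latin alphabet) are assumptions, and the function names are hypothetical.

```python
import random


def scramble_word(word, rng):
    """Shuffle the inner characters of a word, keeping the first and last."""
    if len(word) <= 3:
        return word  # nothing to shuffle without moving the ends
    inner = list(word[1:-1])
    rng.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]


def replace_random_chars(word, rng, rate, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Replace each character with a random letter with probability `rate`."""
    return "".join(rng.choice(alphabet) if rng.random() < rate else ch for ch in word)


def perturb(sentence, rng, rate=0.1):
    """Scramble, then add replacement noise to, every whitespace-separated token."""
    return " ".join(
        replace_random_chars(scramble_word(tok, rng), rng, rate)
        for tok in sentence.split()
    )
```

Passing a seeded `random.Random` instance as `rng` makes the perturbation reproducible, which matters when the same noise distribution has to be applied to both training and test data.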

Well-Founded Operators for Normal Hybrid MKNF Knowledge Bases


Title	Well-Founded Operators for Normal Hybrid MKNF Knowledge Bases
Authors	Jianmin Ji, Fangfang Liu, Jia-Huai You
Abstract	Hybrid MKNF knowledge bases have been considered one of the dominant approaches to combining open world ontology languages with closed world rule-based languages. Currently, the only known inference methods are based on the approach of guess-and-verify, while most modern SAT/ASP solvers are built under the DPLL architecture. The central impediment here is that it is not clear what constitutes a constraint propagator, a key component employed in any DPLL-based solver. In this paper, we address this problem by formulating the notion of unfounded sets for nondisjunctive hybrid MKNF knowledge bases, based on which we propose and study two new well-founded operators. We show that by employing a well-founded operator as a constraint propagator, a sound and complete DPLL search engine can be readily defined. We compare our approach with the operator based on the alternating fixpoint construction by Knorr et al. [2011] and show that, when applied to arbitrary partial partitions, the new well-founded operators not only propagate more truth values but also circumvent the non-converging behavior of the latter. In addition, we study the possibility of simplifying a given hybrid MKNF knowledge base by employing a well-founded operator, and show that, out of the two operators proposed in this paper, the weaker one can be applied for this purpose and the stronger one cannot. These observations are useful in implementing a grounder for hybrid MKNF knowledge bases, which can be applied before the computation of MKNF models. The paper is under consideration for acceptance in TPLP.
Tasks
Published	2017-07-06
URL	http://arxiv.org/abs/1707.01959v2
PDF	http://arxiv.org/pdf/1707.01959v2.pdf
PWC	https://paperswithcode.com/paper/well-founded-operators-for-normal-hybrid-mknf
Repo
Framework

Robust Covariate Shift Prediction with General Losses and Feature Views


Title	Robust Covariate Shift Prediction with General Losses and Feature Views
Authors	Anqi Liu, Brian D. Ziebart
Abstract	Covariate shift relaxes the widely-employed independent and identically distributed (IID) assumption by allowing different training and testing input distributions. Unfortunately, common methods for addressing covariate shift by trying to remove the bias between training and testing distributions using importance weighting often provide poor performance guarantees in theory and unreliable predictions with high variance in practice. Recently developed methods that construct a predictor that is inherently robust to the difficulties of learning under covariate shift are restricted to minimizing logloss and can be too conservative when faced with high-dimensional learning tasks. We address these limitations in two ways: by robustly minimizing various loss functions, including non-convex ones, under the testing distribution; and by separately shaping the influence of covariate shift according to different feature-based views of the relationship between input variables and example labels. These generalizations make robust covariate shift prediction applicable to more task scenarios. We demonstrate the benefits on classification under covariate shift tasks.
Tasks
Published	2017-12-28
URL	http://arxiv.org/abs/1712.10043v1
PDF	http://arxiv.org/pdf/1712.10043v1.pdf
PWC	https://paperswithcode.com/paper/robust-covariate-shift-prediction-with
Repo
Framework
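
Importance weighting, the common approach that the abstract contrasts itself with, reweights each training example by the density ratio between test and training inputs. The sketch below assumes both input distributions are known univariate Gaussians, which is rarely true in practice and is chosen only to keep the example short. It illustrates the baseline and not the robust method proposed in the paper; the function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, train, test):
    """Density ratio p_test(x) / p_train(x) for (mean, std) pairs."""
    return gaussian_pdf(x, *test) / gaussian_pdf(x, *train)


def weighted_loss(xs, losses, train, test):
    """Importance-weighted estimate of the expected loss under the test inputs."""
    weights = [importance_weight(x, train, test) for x in xs]
    return sum(w * l for w, l in zip(weights, losses)) / len(xs)
```

With training inputs drawn from N(0, 1) and test inputs from N(1, 1), the weight at x is exp(x - 0.5), so a single training point far in the right tail can dominate the estimate. This is the high-variance behavior the abstract refers to.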

Scene-centric Joint Parsing of Cross-view Videos


Title	Scene-centric Joint Parsing of Cross-view Videos
Authors	Hang Qi, Yuanlu Xu, Tao Yuan, Tianfu Wu, Song-Chun Zhu
Abstract	Cross-view video understanding is an important yet under-explored area in computer vision. In this paper, we introduce a joint parsing framework that integrates view-centric proposals into scene-centric parse graphs that represent a coherent scene-centric understanding of cross-view scenes. Our key observations are that overlapping fields of view embed rich appearance and geometry correlations and that knowledge fragments corresponding to individual vision tasks are governed by consistency constraints available in commonsense knowledge. The proposed joint parsing framework represents such correlations and constraints explicitly and generates semantic scene-centric parse graphs. Quantitative experiments show that scene-centric predictions in the parse graph outperform view-centric predictions.
Tasks	Video Understanding
Published	2017-09-16
URL	http://arxiv.org/abs/1709.05436v3
PDF	http://arxiv.org/pdf/1709.05436v3.pdf
PWC	https://paperswithcode.com/paper/scene-centric-joint-parsing-of-cross-view
Repo
Framework