Paper Group ANR 209
Learning to Remember Translation History with a Continuous Cache
Title | Learning to Remember Translation History with a Continuous Cache |
Authors | Zhaopeng Tu, Yang Liu, Shuming Shi, Tong Zhang |
Abstract | Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information. In this work, we propose to augment NMT models with a very light-weight cache-like memory network, which stores recent hidden representations as translation history. The probability distribution over generated words is updated online depending on the translation history retrieved from the memory, endowing NMT models with the capability to dynamically adapt over time. Experiments on multiple domains with different topics and styles show the effectiveness of the proposed approach with negligible impact on the computational cost. |
Tasks | Machine Translation |
Published | 2017-11-26 |
URL | http://arxiv.org/abs/1711.09367v1 |
http://arxiv.org/pdf/1711.09367v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-remember-translation-history-with |
Repo | |
Framework | |
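The cache idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the class name `ContinuousCache`, the dot-product scoring, the softmax read, and the fixed interpolation weight `lam` are all assumptions made for the example.

```python
import math
from collections import deque


class ContinuousCache:
    """Fixed-size key-value memory; the oldest entries are evicted first."""

    def __init__(self, capacity):
        # Each entry is a (key_vector, word) pair (hypothetical layout).
        self.entries = deque(maxlen=capacity)

    def write(self, key, word):
        """Store a hidden representation together with the word it produced."""
        self.entries.append((list(key), word))

    def read(self, query):
        """Return a distribution over cached words (softmax over dot products)."""
        if not self.entries:
            return {}
        scores = [sum(q * k for q, k in zip(query, key)) for key, _ in self.entries]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dist = {}
        for weight, (_, word) in zip(weights, self.entries):
            dist[word] = dist.get(word, 0.0) + weight / total
        return dist


def interpolate(model_probs, cache_probs, lam):
    """Mix the model's word distribution with the cache distribution."""
    if not cache_probs:
        return dict(model_probs)
    words = set(model_probs) | set(cache_probs)
    return {
        w: (1.0 - lam) * model_probs.get(w, 0.0) + lam * cache_probs.get(w, 0.0)
        for w in words
    }
```

In use, the decoder state at each step would serve as the query, and each generated word would be written back with its state, so that words seen recently in the document gain probability mass.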
Bandit Models of Human Behavior: Reward Processing in Mental Disorders
Title | Bandit Models of Human Behavior: Reward Processing in Mental Disorders |
Authors | Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi |
Abstract | Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson’s and Alzheimer’s diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions. |
Tasks | Decision Making |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02897v1 |
http://arxiv.org/pdf/1706.02897v1.pdf | |
PWC | https://paperswithcode.com/paper/bandit-models-of-human-behavior-reward |
Repo | |
Framework | |
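For readers unfamiliar with the baseline named in the abstract above, here is a minimal sketch of Bernoulli Thompson Sampling with Beta posteriors. The `pos_weight` and `neg_weight` parameters are hypothetical knobs added only to suggest how a reward-processing bias might enter the update; they are an assumption of this example and not the paper's parameterization.

```python
import random


class BernoulliThompsonSampling:
    """Thompson Sampling for 0/1 rewards with a Beta(1, 1) prior per arm."""

    def __init__(self, n_arms, pos_weight=1.0, neg_weight=1.0, seed=None):
        self.alpha = [1.0] * n_arms  # pseudo-counts of successes
        self.beta = [1.0] * n_arms   # pseudo-counts of failures
        # Hypothetical bias parameters: 1.0 recovers standard Thompson Sampling.
        self.pos_weight = pos_weight
        self.neg_weight = neg_weight
        self.rng = random.Random(seed)

    def select_arm(self):
        """Sample a success rate for each arm and play the largest sample."""
        samples = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        """Update the posterior of the played arm with a 0/1 reward."""
        if reward:
            self.alpha[arm] += self.pos_weight
        else:
            self.beta[arm] += self.neg_weight
```

Setting `pos_weight` below `neg_weight`, for example, would make the agent learn more from losses than from gains, which is one simple way to picture an asymmetric reward-processing bias.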
Contrast Enhancement Estimation for Digital Image Forensics
Title | Contrast Enhancement Estimation for Digital Image Forensics |
Authors | Longyin Wen, Honggang Qi, Siwei Lyu |
Abstract | Inconsistency in contrast enhancement can be used to expose image forgeries. In this work, we describe a new method to estimate contrast enhancement from a single image. Our method takes advantage of the nature of contrast enhancement as a mapping between pixel values, and the distinct characteristics it introduces to the image pixel histogram. Our method recovers the original pixel histogram and the contrast enhancement simultaneously from a single image with an iterative algorithm. Unlike previous methods, our method is robust in the presence of additive noise perturbations that are used to hide the traces of contrast enhancement. Furthermore, we also develop an effective method to detect image regions that have undergone contrast enhancement transformations different from the rest of the image, and use this method to detect composite images. We perform extensive experimental evaluations to demonstrate the efficacy and efficiency of our method. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.03875v1 |
http://arxiv.org/pdf/1706.03875v1.pdf | |
PWC | https://paperswithcode.com/paper/contrast-enhancement-estimation-for-digital |
Repo | |
Framework | |
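The "distinct characteristics" that a pixel-value mapping leaves in the histogram can be illustrated with a small sketch. This is not the paper's estimation algorithm; it only shows, under the assumption of a gamma-correction mapping on 8-bit values, how a nonlinear mapping produces empty bins (gaps) and merged bins (peaks). All function names are hypothetical.

```python
def gamma_lut(gamma, levels=256):
    """Lookup table for gamma correction on integer pixel values."""
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]


def apply_to_histogram(hist, lut):
    """Push every input bin's count to the bin its value is mapped to."""
    out = [0] * len(hist)
    for value, count in enumerate(hist):
        out[lut[value]] += count
    return out


def count_empty_bins(hist):
    """Number of pixel values that no longer occur after the mapping."""
    return sum(1 for count in hist if count == 0)
```

Applied to a flat histogram, `gamma_lut(1.0)` leaves every bin occupied, while `gamma_lut(0.5)` stretches the dark range (leaving gaps) and compresses the bright range (creating peaks).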
Semi-Automatic Terminology Ontology Learning Based on Topic Modeling
Title | Semi-Automatic Terminology Ontology Learning Based on Topic Modeling |
Authors | Monika Rani, Amit Kumar Dhar, O. P. Vyas |
Abstract | Ontologies provide features like a common vocabulary, reusability, and machine-readable content, and also allow for semantic search, facilitate agent interaction, and support ordering and structuring of knowledge for Semantic Web (Web 3.0) applications. However, the challenge in ontology engineering is automatic learning, i.e., there is still a lack of a fully automatic approach to form an ontology from a text corpus or dataset of various topics using machine learning techniques. In this paper, two topic modeling algorithms are explored, namely LSI & SVD and Mr.LDA, for learning topic ontology. The objective is to determine the statistical relationship between documents and terms to build a topic ontology and ontology graph with minimum human intervention. Experimental analysis on building a topic ontology and semantically retrieving the corresponding topic ontology for the user’s query demonstrates the effectiveness of the proposed approach. |
Tasks | |
Published | 2017-08-05 |
URL | http://arxiv.org/abs/1709.01991v1 |
http://arxiv.org/pdf/1709.01991v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-automatic-terminology-ontology-learning |
Repo | |
Framework | |
MNL-Bandit: A Dynamic Learning Approach to Assortment Selection
Title | MNL-Bandit: A Dynamic Learning Approach to Assortment Selection |
Authors | Shipra Agrawal, Vashist Avadhanula, Vineet Goyal, Assaf Zeevi |
Abstract | We consider a dynamic assortment selection problem, where in every round the retailer offers a subset (assortment) of $N$ substitutable products to a consumer, who selects one of these products according to a multinomial logit (MNL) choice model. The retailer observes this choice and the objective is to dynamically learn the model parameters, while optimizing cumulative revenues over a selling horizon of length $T$. We refer to this exploration-exploitation formulation as the MNL-Bandit problem. Existing methods for this problem follow an “explore-then-exploit” approach, which estimates parameters to a desired accuracy and then, treating these estimates as if they were the correct parameter values, offers the optimal assortment based on these estimates. These approaches require certain a priori knowledge of “separability”, determined by the true parameters of the underlying MNL model, and this in turn is critical in determining the length of the exploration period. (Separability refers to the distinguishability of the true optimal assortment from the other sub-optimal alternatives.) In this paper, we give an efficient algorithm that simultaneously explores and exploits, achieving performance independent of the underlying parameters. The algorithm can be implemented in a fully online manner, without knowledge of the horizon length $T$. Furthermore, the algorithm is adaptive in the sense that its performance is near-optimal in both the “well separated” case and the general parameter setting where this separation need not hold. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.03880v2 |
http://arxiv.org/pdf/1706.03880v2.pdf | |
PWC | https://paperswithcode.com/paper/mnl-bandit-a-dynamic-learning-approach-to |
Repo | |
Framework | |
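The MNL choice model in the abstract above can be made concrete with a short sketch. It assumes the common convention that a no-purchase option has weight 1, so product i in assortment S is chosen with probability v_i / (1 + sum of v_j over S); the abstract does not state this normalization, and the function names and the brute-force search are illustrative only, not the paper's algorithm.

```python
from itertools import combinations


def choice_probabilities(assortment, v):
    """MNL choice probabilities; the key None stands for 'no purchase'."""
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    probs[None] = 1.0 / denom  # assumed no-purchase option with weight 1
    return probs


def expected_revenue(assortment, v, r):
    """Expected revenue of offering the assortment once."""
    probs = choice_probabilities(assortment, v)
    return sum(r[i] * probs[i] for i in assortment)


def best_assortment(products, v, r, max_size):
    """Exhaustive search over assortments; only feasible for small N."""
    candidates = (
        subset
        for size in range(1, max_size + 1)
        for subset in combinations(sorted(products), size)
    )
    return max(candidates, key=lambda s: expected_revenue(s, v, r))
```

An explore-then-exploit method would call something like `best_assortment` on estimated weights once exploration ends, whereas the approach described in the abstract interleaves estimation and optimization throughout.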
The YouTube-8M Kaggle Competition: Challenges and Methods
Title | The YouTube-8M Kaggle Competition: Challenges and Methods |
Authors | Haosheng Zou, Kun Xu, Jialian Li, Jun Zhu |
Abstract | We took part in the YouTube-8M Video Understanding Challenge hosted on Kaggle, and achieved 10th place in less than one month. In this paper, we present an extensive analysis and solution to the underlying machine-learning problem based on frame-level data, where major challenges are identified and corresponding preliminary methods are proposed. It is noteworthy that the proposed strategies and a uniformly-averaging multi-crop ensemble alone were sufficient for us to reach our ranking. We also report the methods we believe to be promising but didn’t have enough time to train to convergence. We hope this paper can serve, to some extent, as a review and guideline of the YouTube-8M multi-label video classification benchmark, inspiring future attempts and research. |
Tasks | Video Classification, Video Understanding |
Published | 2017-06-28 |
URL | http://arxiv.org/abs/1706.09274v2 |
http://arxiv.org/pdf/1706.09274v2.pdf | |
PWC | https://paperswithcode.com/paper/the-youtube-8m-kaggle-competition-challenges |
Repo | |
Framework | |
Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data
Title | Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data |
Authors | Anurag Kumar, Bhiksha Raj |
Abstract | The development of audio event recognition models requires labeled training data, which are generally hard to obtain. One promising source of recordings of audio events is the large amount of multimedia data on the web. In particular, if the audio content analysis must itself be performed on web audio, it is important to train the recognizers themselves from such data. Training from these web data, however, poses several challenges, the most important being the availability of labels: labels, if any, that may be obtained for the data are generally weak, and not of the kind conventionally required for training detectors or classifiers. We propose that learning algorithms that can exploit weak labels offer an effective method to learn from web data. We then propose a robust and efficient deep convolutional neural network (CNN) based framework to learn audio event recognizers from weakly labeled data. The proposed method can train from and analyze recordings of variable length in an efficient manner and outperforms a network trained with strongly labeled web data by a considerable margin. |
Tasks | |
Published | 2017-07-09 |
URL | http://arxiv.org/abs/1707.02530v2 |
http://arxiv.org/pdf/1707.02530v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-cnn-framework-for-audio-event |
Repo | |
Framework | |
An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform
Title | An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform |
Authors | Zhenzhen Zhong, Shujiao Huang, Cheng Zhan, Licheng Zhang, Zhiwei Xiao, Chang-Chun Wang, Pei Yang |
Abstract | Large-scale datasets have played a significant role in the progress of neural networks and deep learning. YouTube-8M is such a benchmark dataset for general multi-label video classification. It was created from over 7 million YouTube videos (450,000 hours of video) and includes video labels from a vocabulary of 4716 classes (3.4 labels/video on average). It also comes with pre-extracted audio & visual features from every second of video (3.2 billion feature vectors in total). Google Cloud recently released the datasets and organized the ‘Google Cloud & YouTube-8M Video Understanding Challenge’ on Kaggle. Competitors are challenged to develop classification algorithms that assign video-level labels using the new and improved YouTube-8M V2 dataset. Inspired by the competition, we started exploration of audio understanding and classification using deep learning algorithms and ensemble methods. We built several baseline predictions according to the benchmark paper and public GitHub TensorFlow code. Furthermore, we improved global prediction accuracy (GAP) from the base level of 77% to 80.7% through ensemble approaches. |
Tasks | Video Classification, Video Understanding |
Published | 2017-06-26 |
URL | http://arxiv.org/abs/1706.08217v1 |
http://arxiv.org/pdf/1706.08217v1.pdf | |
PWC | https://paperswithcode.com/paper/an-effective-way-to-improve-youtube-8m |
Repo | |
Framework | |
Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition
Title | Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition |
Authors | Christian Rupprecht, Ansh Kapil, Nan Liu, Lamberto Ballan, Federico Tombari |
Abstract | Webly-supervised learning has recently emerged as an alternative paradigm to traditional supervised learning based on large-scale datasets with manual annotations. The key idea is that models such as CNNs can be learned from the noisy visual data available on the web. In this work we aim to exploit web data for video understanding tasks such as action recognition and detection. One of the main problems in webly-supervised learning is cleaning the noisy labeled data from the web. The state-of-the-art paradigm relies on training a first classifier on noisy data that is then used to clean the remaining dataset. Our key insight is that this procedure biases the second classifier towards samples that the first one understands. Here we train two independent CNNs: an RGB network on web images and video frames, and a second network using temporal information from optical flow. We show that training the networks independently is vastly superior to selecting the frames for the flow classifier by using our RGB network. Moreover, we show benefits in enriching the training set with different data sources from heterogeneous public web databases. We demonstrate that our framework outperforms all other webly-supervised methods on two public benchmarks, UCF-101 and Thumos’14. |
Tasks | Optical Flow Estimation, Temporal Action Localization, Video Understanding |
Published | 2017-06-14 |
URL | http://arxiv.org/abs/1706.04589v1 |
http://arxiv.org/pdf/1706.04589v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-without-prejudice-avoiding-bias-in |
Repo | |
Framework | |
Action Understanding with Multiple Classes of Actors
Title | Action Understanding with Multiple Classes of Actors |
Authors | Chenliang Xu, Caiming Xiong, Jason J. Corso |
Abstract | Despite the rapid progress, existing works on action understanding focus strictly on one type of action agent, which we call an actor (a human adult), ignoring the diversity of actions performed by other actors. To overcome this narrow viewpoint, our paper marks the first effort in the computer vision community to jointly consider algorithmic understanding of various types of actors undergoing various actions. To begin with, we collect a large annotated Actor-Action Dataset (A2D) that consists of 3782 short videos and 31 temporally untrimmed long videos. We formulate the general actor-action understanding problem and instantiate it at various granularities: video-level single- and multiple-label actor-action recognition, and pixel-level actor-action segmentation. We propose and examine a comprehensive set of graphical models that consider the various types of interplay among actors and actions. Our findings provide conclusive evidence that the joint modeling of actor and action improves performance over modeling each of them independently, and that further improvement can be obtained by considering the multi-scale nature of video understanding. Hence, our paper settles the argument for the value of explicit consideration of various actors in comprehensive action understanding and provides a dataset and a benchmark for later works exploring this new problem. |
Tasks | action segmentation, Temporal Action Localization, Video Understanding |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08723v1 |
http://arxiv.org/pdf/1704.08723v1.pdf | |
PWC | https://paperswithcode.com/paper/action-understanding-with-multiple-classes-of |
Repo | |
Framework | |
A Reproducible Study on Remote Heart Rate Measurement
Title | A Reproducible Study on Remote Heart Rate Measurement |
Authors | Guillaume Heusch, André Anjos, Sébastien Marcel |
Abstract | This paper studies the problem of reproducible research in remote photoplethysmography (rPPG). Most of the work published in this domain is assessed on privately-owned databases, making it difficult to evaluate proposed algorithms in a standard and principled manner. As a consequence, we present a new, publicly available database containing a relatively large number of subjects recorded under two different lighting conditions. Also, three state-of-the-art rPPG algorithms from the literature were selected, implemented and released as open source free software. After a thorough, unbiased experimental evaluation in various settings, it is shown that none of the selected algorithms is precise enough to be used in a real-world scenario. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.00962v1 |
http://arxiv.org/pdf/1709.00962v1.pdf | |
PWC | https://paperswithcode.com/paper/a-reproducible-study-on-remote-heart-rate |
Repo | |
Framework | |
How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?
Title | How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse? |
Authors | Georg Heigold, Günter Neumann, Josef van Genabith |
Abstract | This paper investigates the robustness of NLP against perturbed word forms. While neural approaches can achieve (almost) human-like accuracy for certain tasks and conditions, they are often sensitive to small changes in the input such as non-canonical input (e.g., typos). Yet both stability and robustness are desired properties in applications involving user-generated content, all the more so as humans easily cope with such noisy or adversarial conditions. In this paper, we study the impact of noisy input. We consider different noise distributions (one type of noise, combination of noise types) and mismatched noise distributions for training and testing. Moreover, we empirically evaluate the robustness of different models (convolutional neural networks, recurrent neural networks, non-neural models), different basic units (characters, byte pair encoding units), and different NLP tasks (morphological tagging, machine translation). |
Tasks | Machine Translation, Morphological Tagging, Word Embeddings |
Published | 2017-04-14 |
URL | http://arxiv.org/abs/1704.04441v1 |
http://arxiv.org/pdf/1704.04441v1.pdf | |
PWC | https://paperswithcode.com/paper/how-robust-are-character-based-word |
Repo | |
Framework | |
Well-Founded Operators for Normal Hybrid MKNF Knowledge Bases
Title | Well-Founded Operators for Normal Hybrid MKNF Knowledge Bases |
Authors | Jianmin Ji, Fangfang Liu, Jia-Huai You |
Abstract | Hybrid MKNF knowledge bases have been considered one of the dominant approaches to combining open world ontology languages with closed world rule-based languages. Currently, the only known inference methods are based on the approach of guess-and-verify, while most modern SAT/ASP solvers are built under the DPLL architecture. The central impediment here is that it is not clear what constitutes a constraint propagator, a key component employed in any DPLL-based solver. In this paper, we address this problem by formulating the notion of unfounded sets for nondisjunctive hybrid MKNF knowledge bases, based on which we propose and study two new well-founded operators. We show that by employing a well-founded operator as a constraint propagator, a sound and complete DPLL search engine can be readily defined. We compare our approach with the operator based on the alternating fixpoint construction by Knorr et al. [2011] and show that, when applied to arbitrary partial partitions, the new well-founded operators not only propagate more truth values but also circumvent the non-converging behavior of the latter. In addition, we study the possibility of simplifying a given hybrid MKNF knowledge base by employing a well-founded operator, and show that, of the two operators proposed in this paper, the weaker one can be applied for this purpose and the stronger one cannot. These observations are useful in implementing a grounder for hybrid MKNF knowledge bases, which can be applied before the computation of MKNF models. The paper is under consideration for acceptance in TPLP. |
Tasks | |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01959v2 |
http://arxiv.org/pdf/1707.01959v2.pdf | |
PWC | https://paperswithcode.com/paper/well-founded-operators-for-normal-hybrid-mknf |
Repo | |
Framework | |
Robust Covariate Shift Prediction with General Losses and Feature Views
Title | Robust Covariate Shift Prediction with General Losses and Feature Views |
Authors | Anqi Liu, Brian D. Ziebart |
Abstract | Covariate shift relaxes the widely-employed independent and identically distributed (IID) assumption by allowing different training and testing input distributions. Unfortunately, common methods for addressing covariate shift by trying to remove the bias between training and testing distributions using importance weighting often provide poor performance guarantees in theory and unreliable predictions with high variance in practice. Recently developed methods that construct a predictor that is inherently robust to the difficulties of learning under covariate shift are restricted to minimizing logloss and can be too conservative when faced with high-dimensional learning tasks. We address these limitations in two ways: by robustly minimizing various loss functions, including non-convex ones, under the testing distribution; and by separately shaping the influence of covariate shift according to different feature-based views of the relationship between input variables and example labels. These generalizations make robust covariate shift prediction applicable to more task scenarios. We demonstrate the benefits on classification under covariate shift tasks. |
Tasks | |
Published | 2017-12-28 |
URL | http://arxiv.org/abs/1712.10043v1 |
http://arxiv.org/pdf/1712.10043v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-covariate-shift-prediction-with |
Repo | |
Framework | |
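The importance-weighting approach that the abstract above criticizes can be sketched as follows, to show where the high variance comes from. The example assumes one-dimensional Gaussian input distributions with known parameters, which is a simplification chosen for illustration; the function names are hypothetical and this is not the robust method the paper proposes.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, train, test):
    """Density ratio p_test(x) / p_train(x); train and test are (mean, std)."""
    return gaussian_pdf(x, *test) / gaussian_pdf(x, *train)


def weighted_mean_loss(xs, losses, train, test):
    """Importance-weighted estimate of the test loss from training samples."""
    weights = [importance_weight(x, train, test) for x in xs]
    return sum(w * l for w, l in zip(weights, losses)) / len(xs)
```

With training inputs drawn from N(0, 1) and test inputs from N(1, 1), a training point at x = 3 already receives a weight above 12, so a handful of rare samples can dominate the estimate.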
Scene-centric Joint Parsing of Cross-view Videos
Title | Scene-centric Joint Parsing of Cross-view Videos |
Authors | Hang Qi, Yuanlu Xu, Tao Yuan, Tianfu Wu, Song-Chun Zhu |
Abstract | Cross-view video understanding is an important yet under-explored area in computer vision. In this paper, we introduce a joint parsing framework that integrates view-centric proposals into scene-centric parse graphs that represent a coherent scene-centric understanding of cross-view scenes. Our key observations are that overlapping fields of views embed rich appearance and geometry correlations and that knowledge fragments corresponding to individual vision tasks are governed by consistency constraints available in commonsense knowledge. The proposed joint parsing framework represents such correlations and constraints explicitly and generates semantic scene-centric parse graphs. Quantitative experiments show that scene-centric predictions in the parse graph outperform view-centric predictions. |
Tasks | Video Understanding |
Published | 2017-09-16 |
URL | http://arxiv.org/abs/1709.05436v3 |
http://arxiv.org/pdf/1709.05436v3.pdf | |
PWC | https://paperswithcode.com/paper/scene-centric-joint-parsing-of-cross-view |
Repo | |
Framework | |