July 29, 2019

3137 words 15 mins read

Paper Group ANR 64


Robust Contextual Bandit via the Capped-$\ell_{2}$ norm. Any-gram Kernels for Sentence Classification: A Sentiment Analysis Case Study. Discriminative Learning of Prediction Intervals. Neural Semantic Parsing over Multiple Knowledge-bases. Susceptibility Propagation by Using Diagonal Consistency. LED-based Photometric Stereo: Modeling, Calibration …

Robust Contextual Bandit via the Capped-$\ell_{2}$ norm

Title Robust Contextual Bandit via the Capped-$\ell_{2}$ norm
Authors Feiyun Zhu, Xinliang Zhu, Sheng Wang, Jiawen Yao, Junzhou Huang
Abstract This paper considers the actor-critic contextual bandit for the mobile health (mHealth) intervention. State-of-the-art decision-making methods in mHealth generally assume that the noise in the dynamic system follows a Gaussian distribution, and use least-squares-based algorithms to estimate the expected reward, which are vulnerable to outliers. To deal with outliers, we propose a novel robust actor-critic contextual bandit method for the mHealth intervention. In the critic updating, the capped-$\ell_{2}$ norm is used to measure the approximation error, which prevents outliers from dominating our objective. The critic updating yields a set of weights that define a weighted objective for the actor updating: samples badly corrupted by noise in the critic updating receive zero weight in the actor updating. As a result, the robustness of both the actor and the critic updating is enhanced. The capped-$\ell_{2}$ norm has a key parameter, and we provide a reliable method to set it properly by making use of one of the most fundamental definitions of outliers in statistics. Extensive experimental results demonstrate that our method achieves almost identical results to the state-of-the-art methods on datasets without outliers and dramatically outperforms them on datasets corrupted by outliers.
Tasks Decision Making
Published 2017-08-17
URL http://arxiv.org/abs/1708.05446v1
PDF http://arxiv.org/pdf/1708.05446v1.pdf
PWC https://paperswithcode.com/paper/robust-contextual-bandit-via-the-capped-ell_2
Repo
Framework
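As a rough sketch of the critic idea above: the capped-$\ell_{2}$ loss $\min(r_i^2, \epsilon)$ reduces to iteratively reweighted least squares in which any sample whose squared residual exceeds the cap gets zero weight. This toy version assumes a plain linear reward model and a hand-picked cap (the paper instead derives the cap from a standard statistical definition of outliers); it is not the authors' code.

```python
import numpy as np

def capped_l2_fit(X, y, eps, iters=10):
    """Iteratively reweighted least squares under the capped-l2 loss
    min(r_i^2, eps): samples whose squared residual exceeds the cap
    get zero weight and stop influencing the fit."""
    d = X.shape[1]
    weights = np.ones(len(y))          # start by trusting every sample
    theta = np.zeros(d)
    for _ in range(iters):
        Xw = X * weights[:, None]      # builds X^T W X and X^T W y
        theta = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(d), Xw.T @ y)
        weights = ((y - X @ theta) ** 2 <= eps).astype(float)
    return theta, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + 0.01 * rng.normal(size=200)
y[:10] += 50.0                          # corrupt ten samples with outliers
theta, weights = capped_l2_fit(X, y, eps=4.0)
```

After a couple of reweighting rounds the ten corrupted samples carry zero weight and the fit essentially matches the clean-data solution.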

Any-gram Kernels for Sentence Classification: A Sentiment Analysis Case Study

Title Any-gram Kernels for Sentence Classification: A Sentiment Analysis Case Study
Authors Rasoul Kaljahi, Jennifer Foster
Abstract Any-gram kernels are a flexible and efficient way to employ bag-of-n-gram features when learning from textual data. They are also compatible with the use of word embeddings so that word similarities can be accounted for. While the original any-gram kernels are implemented on top of tree kernels, we propose a new approach which is independent of tree kernels and is more efficient. We also propose a more effective way to make use of word embeddings than the original any-gram formulation. When applied to the task of sentiment classification, our new formulation achieves significantly better performance.
Tasks Sentence Classification, Sentiment Analysis, Word Embeddings
Published 2017-12-19
URL http://arxiv.org/abs/1712.07004v1
PDF http://arxiv.org/pdf/1712.07004v1.pdf
PWC https://paperswithcode.com/paper/any-gram-kernels-for-sentence-classification
Repo
Framework
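To make the any-gram idea concrete, here is a deliberately naive kernel that counts matching n-grams of *every* order between two token sequences. This is an illustration only, not the paper's efficient algorithm, and it uses exact token matches rather than the word-embedding-softened matches the paper also supports.

```python
from collections import Counter

def any_gram_kernel(s, t):
    """Similarity as the number of matching n-grams of every order
    between two token lists (Counter returns 0 for absent n-grams)."""
    total = 0
    for n in range(1, min(len(s), len(t)) + 1):
        grams_s = Counter(tuple(s[i:i + n]) for i in range(len(s) - n + 1))
        grams_t = Counter(tuple(t[i:i + n]) for i in range(len(t) - n + 1))
        total += sum(c * grams_t[g] for g, c in grams_s.items())
    return total

k = any_gram_kernel("the movie was great".split(), "the movie was awful".split())
```

The two sentences share three unigrams, two bigrams, and one trigram, so the kernel value is 6.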

Discriminative Learning of Prediction Intervals

Title Discriminative Learning of Prediction Intervals
Authors Nir Rosenfeld, Yishay Mansour, Elad Yom-Tov
Abstract In this work we consider the task of constructing prediction intervals in an inductive batch setting. We present a discriminative learning framework which optimizes the expected error rate under a budget constraint on the interval sizes. Most current methods for constructing prediction intervals offer guarantees for a single new test point. Applying these methods to multiple test points can result in a high computational overhead and degraded statistical guarantees. By focusing on expected errors, our method allows for variability in the per-example conditional error rates. As we demonstrate both analytically and empirically, this flexibility can increase the overall accuracy, or alternatively, reduce the average interval size. While the problem we consider is of a regressive flavor, the loss we use is combinatorial. This allows us to provide PAC-style, finite-sample guarantees. Computationally, we show that our original objective is NP-hard, and suggest a tractable convex surrogate. We conclude with a series of experimental evaluations.
Tasks
Published 2017-10-16
URL http://arxiv.org/abs/1710.05888v2
PDF http://arxiv.org/pdf/1710.05888v2.pdf
PWC https://paperswithcode.com/paper/discriminative-learning-of-prediction
Repo
Framework

Neural Semantic Parsing over Multiple Knowledge-bases

Title Neural Semantic Parsing over Multiple Knowledge-bases
Authors Jonathan Herzig, Jonathan Berant
Abstract A fundamental challenge in developing semantic parsers is the paucity of strong supervision in the form of language utterances annotated with logical forms. In this paper, we propose to exploit structural regularities in language across different domains and train semantic parsers over multiple knowledge-bases (KBs) while sharing information across datasets. We find that we can substantially improve parsing accuracy by training a single sequence-to-sequence model over multiple KBs, when providing an encoding of the domain at decoding time. Our model achieves state-of-the-art performance on the Overnight dataset (containing eight domains), improving accuracy over a single-KB baseline from 75.6% to 79.6% while obtaining a 7x reduction in the number of model parameters.
Tasks Semantic Parsing
Published 2017-02-06
URL http://arxiv.org/abs/1702.01569v2
PDF http://arxiv.org/pdf/1702.01569v2.pdf
PWC https://paperswithcode.com/paper/neural-semantic-parsing-over-multiple
Repo
Framework
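One simple way to realize "an encoding of the domain at decoding time" is to tag each training pair with its KB before feeding it to a shared seq2seq model. Prepending a special token to the target sequence, as sketched below, is one plausible variant; the token format and example data here are hypothetical, and the paper evaluates its own encoding schemes.

```python
def encode_domain(example, domain):
    """Tag a (utterance, logical_form) pair so one seq2seq parser
    trained over several KBs can condition decoding on the domain.
    The '@domain' token convention is an assumption for illustration."""
    utterance, logical_form = example
    return utterance, f"@{domain} {logical_form}"

tagged = encode_domain(("list all papers", "(call listPapers)"), "publications")
```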

Susceptibility Propagation by Using Diagonal Consistency

Title Susceptibility Propagation by Using Diagonal Consistency
Authors Muneki Yasuda, Kazuyuki Tanaka
Abstract Susceptibility propagation, constructed by combining belief propagation with a linear response method, is used for approximate computation in Markov random fields. Herein, we formulate a new, improved susceptibility propagation by using the concept of a diagonal matching method that is based on mean-field approaches to inverse Ising problems. The proposed susceptibility propagation is robust for various network structures, and it reduces to the ordinary susceptibility propagation and to the adaptive Thouless-Anderson-Palmer equation in special cases.
Tasks
Published 2017-12-01
URL http://arxiv.org/abs/1712.00155v1
PDF http://arxiv.org/pdf/1712.00155v1.pdf
PWC https://paperswithcode.com/paper/susceptibility-propagation-by-using-diagonal
Repo
Framework

LED-based Photometric Stereo: Modeling, Calibration and Numerical Solution

Title LED-based Photometric Stereo: Modeling, Calibration and Numerical Solution
Authors Yvain Quéau, Bastien Durix, Tao Wu, Daniel Cremers, François Lauze, Jean-Denis Durou
Abstract We conduct a thorough study of photometric stereo under nearby point light source illumination, from modeling to numerical solution, through calibration. In the classical formulation of photometric stereo, the luminous fluxes are assumed to be directional, which is very difficult to achieve in practice. Rather, we use light-emitting diodes (LEDs) to illuminate the scene to be reconstructed. Such point light sources are very convenient to use, yet they yield a more complex photometric stereo model which is arduous to solve. We first derive this model in a physically sound manner and show how to calibrate its parameters. Then, we discuss two state-of-the-art numerical solutions. The first alternately estimates the albedo and the normals, and then integrates the normals into a depth map. It is shown empirically to be independent of the initialization, but the convergence of this sequential approach is not established. The second directly recovers the depth by formulating photometric stereo as a system of PDEs which are partially linearized using image ratios. Although the sequential approach is avoided, initialization matters a lot and convergence is not established either. Therefore, we introduce a provably convergent alternating reweighted least-squares scheme for solving the original system of PDEs, without resorting to image ratios for linearization. Finally, we extend this study to the case of RGB images.
Tasks Calibration
Published 2017-07-04
URL http://arxiv.org/abs/1707.01018v2
PDF http://arxiv.org/pdf/1707.01018v2.pdf
PWC https://paperswithcode.com/paper/led-based-photometric-stereo-modeling
Repo
Framework
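The core modeling point — point sources give each pixel its own lighting vector, unlike the constant direction of classical photometric stereo — can be sketched for a single Lambertian surface point. This omits the LED anisotropy and shadowing the paper models and hand-picks a synthetic setup, so treat it as an illustrative assumption, not the paper's solver.

```python
import numpy as np

def led_light_vectors(p, led_positions, phi):
    """Lighting vectors for nearby point sources: direction from surface
    point p to each LED, scaled by source intensity phi and the
    inverse-square fall-off (d / dist^3 = unit direction / dist^2)."""
    d = led_positions - p
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    return phi[:, None] * d / dist**3

def solve_albedo_normal(I, L):
    """Lambertian model I_i = rho * <n, l_i>: least squares for m = rho*n,
    then split magnitude (albedo) from direction (normal)."""
    m, *_ = np.linalg.lstsq(L, I, rcond=None)
    rho = np.linalg.norm(m)
    return rho, m / rho

# synthetic check: one surface point, four LEDs, known albedo and normal
p = np.zeros(3)
leds = np.array([[1.0, 0, 1], [-1, 0, 1], [0, 1, 1], [0, -1, 1]])
phi = np.ones(4)
n_true, rho_true = np.array([0.0, 0.0, 1.0]), 0.5
L = led_light_vectors(p, leds, phi)
I = rho_true * (L @ n_true)
rho, n = solve_albedo_normal(I, L)
```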

Development and validation of a novel dementia of Alzheimer’s type (DAT) score based on metabolism FDG-PET imaging

Title Development and validation of a novel dementia of Alzheimer’s type (DAT) score based on metabolism FDG-PET imaging
Authors Karteek Popuri, Rakesh Balachandar, Kathryn Alpert, Donghuan Lu, Mahadev Bhalla, Ian Mackenzie, Robin Ging-Yuek Hsiung, Lei Wang, Mirza Faisal Beg, the Alzheimer’s Disease Neuroimaging Initiative
Abstract Fluorodeoxyglucose positron emission tomography (FDG-PET) imaging based 3D topographic brain glucose metabolism patterns from normal controls (NC) and individuals with dementia of Alzheimer’s type (DAT) are used to train a novel multi-scale ensemble classification model. This ensemble model outputs an FDG-PET DAT score (FPDS) between 0 and 1 denoting the probability that a subject will be clinically diagnosed with DAT based on their metabolism profile. A novel 7-group image stratification scheme is devised that groups images not only by their associated clinical diagnosis but also by the past and future trajectories of those diagnoses, yielding a more continuous representation of the stages of the DAT spectrum that mimics a real-world clinical setting. The potential for using FPDS as a DAT biomarker was validated on a large number of FDG-PET images (N=2984) obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database taken across the proposed stratification, and a good classification AUC (area under the curve) of 0.78 was achieved in distinguishing between images belonging to subjects on a DAT trajectory and those taken from subjects not progressing to a DAT diagnosis. Further, the FPDS biomarker achieved state-of-the-art performance on the mild cognitive impairment (MCI) to DAT conversion prediction task with AUCs of 0.81, 0.80, and 0.77 for the 2-, 3-, and 5-years-to-conversion windows respectively.
Tasks
Published 2017-11-02
URL http://arxiv.org/abs/1711.00671v1
PDF http://arxiv.org/pdf/1711.00671v1.pdf
PWC https://paperswithcode.com/paper/development-and-validation-of-a-novel
Repo
Framework

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization

Title GIANT: Globally Improved Approximate Newton Method for Distributed Optimization
Authors Shusen Wang, Farbod Roosta-Khorasani, Peng Xu, Michael W. Mahoney
Abstract For a distributed computing environment, we consider the empirical risk minimization problem and propose a distributed and communication-efficient Newton-type optimization method. At every iteration, each worker locally finds an Approximate NewTon (ANT) direction, which is sent to the main driver. The main driver then averages all the ANT directions received from workers to form a {\it Globally Improved ANT} (GIANT) direction. GIANT is highly communication-efficient and naturally exploits the trade-offs between local computation and global communication in that more local computation results in fewer overall rounds of communication. Theoretically, we show that GIANT enjoys an improved convergence rate compared with first-order methods and existing distributed Newton-type methods. Further, and in sharp contrast with many existing distributed Newton-type methods as well as popular first-order methods, a highly advantageous practical feature of GIANT is that it involves only one tuning parameter. We conduct large-scale experiments on a computer cluster and empirically demonstrate the superior performance of GIANT.
Tasks Distributed Optimization
Published 2017-09-11
URL http://arxiv.org/abs/1709.03528v5
PDF http://arxiv.org/pdf/1709.03528v5.pdf
PWC https://paperswithcode.com/paper/giant-globally-improved-approximate-newton
Repo
Framework
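The worker/driver scheme above can be sketched for ridge regression: the driver forms the global gradient, each worker solves its *local* Newton system against that gradient (its ANT direction), and the driver averages the directions. The unit step size and the toy shard layout are assumptions (the paper uses line search and a real cluster); this is a single-process simulation, not the authors' implementation.

```python
import numpy as np

def giant_step(shards, w, lam):
    """One GIANT iteration for ridge regression with equal-size shards."""
    n = sum(len(y) for _, y in shards)
    # global gradient (one communication round in a real deployment)
    g = sum(X.T @ (X @ w - y) for X, y in shards) / n + lam * w
    # each worker: local Hessian, local ANT direction; driver averages
    dirs = [np.linalg.solve(X.T @ X / len(y) + lam * np.eye(len(w)), g)
            for X, y in shards]
    return np.mean(dirs, axis=0)

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))
w_true = rng.normal(size=5)
y = X @ w_true
shards = [(X[i::4], y[i::4]) for i in range(4)]   # four simulated workers
w = np.zeros(5)
for _ in range(15):
    w = w - giant_step(shards, w, lam=1e-6)
```

Because each local Hessian approximates the global one, the averaged direction is close to an exact Newton step and the iterates converge in a handful of rounds.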

Transformation-Grounded Image Generation Network for Novel 3D View Synthesis

Title Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
Authors Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, Alexander C. Berg
Abstract We present a transformation-grounded image generation network for novel 3D view synthesis from a single image. Instead of taking a ‘blank slate’ approach, we first explicitly infer the parts of the geometry visible in both the input and novel views, and then re-cast the remaining synthesis problem as image completion. Specifically, we predict a flow to move the pixels from the input to the novel view, along with a novel visibility map that helps deal with occlusion/disocclusion. Next, conditioned on those intermediate results, we hallucinate (infer) parts of the object invisible in the input image. In addition to the new network structure, training with a combination of adversarial and perceptual losses reduces common artifacts of novel view synthesis such as distortions and holes, while successfully generating high-frequency details and preserving visual aspects of the input image. We evaluate our approach on a wide range of synthetic and real examples. Both qualitative and quantitative results show that our method achieves significantly better results than existing methods.
Tasks Image Generation, Novel View Synthesis
Published 2017-03-08
URL http://arxiv.org/abs/1703.02921v1
PDF http://arxiv.org/pdf/1703.02921v1.pdf
PWC https://paperswithcode.com/paper/transformation-grounded-image-generation
Repo
Framework
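The "flow plus visibility map" step can be illustrated with a toy nearest-neighbour warp: pixels are moved along a predicted flow field, and pixels the visibility map marks as disoccluded are blanked for a completion network to fill in. This is a stand-in for the paper's learned, differentiable pipeline; the flow convention (row/column offsets) is an assumption.

```python
import numpy as np

def warp_with_visibility(src, flow, visibility):
    """Sample source pixels along a flow field, then blank disoccluded
    pixels (visibility == 0) that would be hallucinated downstream."""
    H, W = src.shape
    ys, xs = np.indices((H, W))
    sy = np.clip(ys + flow[..., 0], 0, H - 1).astype(int)   # row offsets
    sx = np.clip(xs + flow[..., 1], 0, W - 1).astype(int)   # column offsets
    warped = src[sy, sx]                  # nearest-neighbour resampling
    warped[visibility == 0] = 0.0         # disoccluded: left for completion
    return warped

src = np.arange(4, dtype=float).reshape(2, 2)
identity_flow = np.zeros((2, 2, 2))       # zero flow: pixels stay in place
vis = np.array([[1.0, 1.0], [1.0, 0.0]])  # bottom-right pixel disoccluded
out = warp_with_visibility(src, identity_flow, vis)
```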

End-To-End Visual Speech Recognition With LSTMs

Title End-To-End Visual Speech Recognition With LSTMs
Authors Stavros Petridis, Zuwei Li, Maja Pantic
Abstract Traditional visual speech recognition systems consist of two stages, feature extraction and classification. Recently, several deep learning approaches have been presented which automatically extract features from the mouth images and aim to replace the feature extraction stage. However, research on joint learning of features and classification is very limited. In this work, we present an end-to-end visual speech recognition system based on Long Short-Term Memory (LSTM) networks. To the best of our knowledge, this is the first model which simultaneously learns to extract features directly from the pixels and perform classification, and also achieves state-of-the-art performance in visual speech classification. The model consists of two streams which extract features directly from the mouth and difference images, respectively. The temporal dynamics in each stream are modelled by an LSTM, and the fusion of the two streams takes place via a Bidirectional LSTM (BLSTM). An absolute improvement of 9.7% over the baseline is reported on the OuluVS2 database, and 1.5% on the CUAVE database when compared with other methods which use a similar visual front-end.
Tasks Speech Recognition, Visual Speech Recognition
Published 2017-01-20
URL http://arxiv.org/abs/1701.05847v1
PDF http://arxiv.org/pdf/1701.05847v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-visual-speech-recognition-with
Repo
Framework

Label-Dependencies Aware Recurrent Neural Networks

Title Label-Dependencies Aware Recurrent Neural Networks
Authors Yoann Dupont, Marco Dinarelli, Isabelle Tellier
Abstract In the last few years, Recurrent Neural Networks (RNNs) have proved effective on several NLP tasks. Despite such great success, their ability to model \emph{sequence labeling} is still limited. This led research toward solutions where RNNs are combined with models that have already proved effective in this domain, such as CRFs. In this work we propose a far simpler but very effective solution: an evolution of the simple Jordan RNN, where labels are re-injected as input into the network and converted into embeddings, in the same way as words. We compare this RNN variant to all the other RNN models, Elman and Jordan RNNs, LSTM and GRU, on two well-known Spoken Language Understanding (SLU) tasks. Thanks to label embeddings and their combination at the hidden layer, the proposed variant, which uses more parameters than Elman and Jordan RNNs but far fewer than LSTM and GRU, is not only more effective than the other RNNs but also outperforms sophisticated CRF models.
Tasks Spoken Language Understanding
Published 2017-06-06
URL http://arxiv.org/abs/1706.01740v1
PDF http://arxiv.org/pdf/1706.01740v1.pdf
PWC https://paperswithcode.com/paper/label-dependencies-aware-recurrent-neural
Repo
Framework
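The label re-injection mechanism is easy to see in a forward pass: at each step the previously predicted label is embedded exactly like a word and concatenated into the input. The toy dimensions and random (untrained) parameters below are assumptions purely to show the wiring, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
D_W, D_L, D_H, N_LABELS = 8, 4, 16, 5           # toy sizes (assumed)
E_words = rng.normal(size=(20, D_W))             # word embeddings
E_labels = rng.normal(size=(N_LABELS, D_L))      # label embeddings, learned like words
Wx = 0.1 * rng.normal(size=(D_H, D_W + D_L))     # input -> hidden
Wh = 0.1 * rng.normal(size=(D_H, D_H))           # hidden -> hidden
Wy = 0.1 * rng.normal(size=(N_LABELS, D_H))      # hidden -> label scores

def tag_sequence(word_ids):
    """Greedy decoding: the previous predicted label's embedding is
    re-injected as part of the input, Jordan-style with label embeddings."""
    h = np.zeros(D_H)
    prev = 0                                     # conventional start label
    out = []
    for i in word_ids:
        inp = np.concatenate([E_words[i], E_labels[prev]])
        h = np.tanh(Wx @ inp + Wh @ h)
        prev = int(np.argmax(Wy @ h))
        out.append(prev)
    return out

labels = tag_sequence([1, 5, 3])
```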

Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading

Title Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading
Authors Chunlin Tian, Weijun Ji
Abstract Audio-visual Speech Recognition (AVSR), which employs both video and audio information for Automatic Speech Recognition (ASR), is an application of multimodal learning that makes ASR systems more robust and accurate. Traditional models usually treated AVSR as inference or projection, but strict priors limited their ability. With the revival of deep learning, Deep Neural Networks (DNNs) have become an important toolkit for many traditional classification tasks, including ASR, image classification, and natural language processing. Some DNN models, such as Multimodal Deep Autoencoders (MDAEs), the Multimodal Deep Belief Network (MDBN) and the Multimodal Deep Boltzmann Machine (MDBM), have been used in AVSR and indeed work better than traditional methods. However, such DNN models have several shortcomings: (1) they do not balance modal fusion and temporal fusion, or even lack temporal fusion altogether; (2) their architecture is not end-to-end, which makes training and testing cumbersome. We propose a DNN model, Auxiliary Multimodal LSTM (am-LSTM), to overcome these weaknesses. The am-LSTM can be trained and tested in a single stage; moreover, it is easy to train and prevents overfitting automatically. Extensibility and flexibility are also taken into consideration. The experiments show that am-LSTM is much better than traditional methods and other DNN models on three datasets.
Tasks Audio-Visual Speech Recognition, Image Classification, Lipreading, Speech Recognition, Visual Speech Recognition
Published 2017-01-16
URL http://arxiv.org/abs/1701.04224v2
PDF http://arxiv.org/pdf/1701.04224v2.pdf
PWC https://paperswithcode.com/paper/auxiliary-multimodal-lstm-for-audio-visual
Repo
Framework

Implicit Entity Linking in Tweets

Title Implicit Entity Linking in Tweets
Authors Sujan Perera, Pablo N. Mendes, Adarsh Alex, Amit Sheth, Krishnaprasad Thirunarayan
Abstract Over the years, Twitter has become one of the largest communication platforms providing key data to various applications such as brand monitoring, trend detection, among others. Entity linking is one of the major tasks in natural language understanding from tweets and it associates entity mentions in text to corresponding entries in knowledge bases in order to provide unambiguous interpretation and additional context. State-of-the-art techniques have focused on linking explicitly mentioned entities in tweets with reasonable success. However, we argue that in addition to explicit mentions (e.g. “The movie Gravity was more expensive than the mars orbiter mission”), entities (the movie Gravity) can also be mentioned implicitly (e.g. “This new space movie is crazy. you must watch it!”). This paper introduces the problem of implicit entity linking in tweets. We propose an approach that models the entities by exploiting their factual and contextual knowledge. We demonstrate how to use these models to perform implicit entity linking on a ground truth dataset with 397 tweets from two domains, namely, Movie and Book. Specifically, we show: 1) the importance of linking implicit entities and its value addition to the standard entity linking task, and 2) the importance of exploiting contextual knowledge associated with an entity for linking their implicit mentions. We also make the ground truth dataset publicly available to foster research in this new area.
Tasks Entity Linking
Published 2017-07-26
URL http://arxiv.org/abs/1707.08470v1
PDF http://arxiv.org/pdf/1707.08470v1.pdf
PWC https://paperswithcode.com/paper/implicit-entity-linking-in-tweets
Repo
Framework
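The flavor of linking a tweet that never names its entity can be sketched with bag-of-words entity models scored by cosine similarity. The entity names, knowledge snippets, and example tweet below are hypothetical stand-ins for the paper's much richer factual and contextual models.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(c * b.get(t, 0) for t, c in a.items())
    den = (math.sqrt(sum(c * c for c in a.values()))
           * math.sqrt(sum(c * c for c in b.values())))
    return num / den if den else 0.0

# hypothetical factual + contextual knowledge per candidate entity
entity_models = {
    "Gravity (film)": bow("space movie sandra bullock astronaut orbit stunning"),
    "Interstellar (film)": bow("space movie wormhole nolan time black hole"),
}

def link_implicit(tweet):
    """Return the candidate entity whose model best matches the tweet."""
    v = bow(tweet)
    return max(entity_models, key=lambda e: cosine(v, entity_models[e]))

best = link_implicit("that astronaut movie with sandra bullock was stunning")
```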

A Laplacian Framework for Option Discovery in Reinforcement Learning

Title A Laplacian Framework for Option Discovery in Reinforcement Learning
Authors Marlos C. Machado, Marc G. Bellemare, Michael Bowling
Abstract Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned representations. The options discovered from eigenpurposes traverse the principal directions of the state space. They are useful for multiple tasks because they are discovered without taking the environment’s rewards into consideration. Moreover, different options act at different time scales, making them helpful for exploration. We demonstrate features of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.
Tasks Atari Games, Representation Learning
Published 2017-03-02
URL http://arxiv.org/abs/1703.00956v2
PDF http://arxiv.org/pdf/1703.00956v2.pdf
PWC https://paperswithcode.com/paper/a-laplacian-framework-for-option-discovery-in
Repo
Framework
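The eigenpurpose construction can be shown on a tiny tabular domain: take the graph Laplacian of the state-transition graph, and read each eigenvector $e$ (a proto-value function) as an intrinsic reward $r(s, s') = e[s'] - e[s]$. The 5-state chain below is a toy assumption; on it, the first non-constant eigenvector varies monotonically along the chain, so its eigenpurpose consistently rewards moving toward one end of the state space.

```python
import numpy as np

def eigenpurposes(adjacency):
    """Eigen-decomposition of the (combinatorial) graph Laplacian of the
    state-transition graph; each eigenvector defines an eigenpurpose."""
    L = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.eigh(L)              # eigenvalues in ascending order

# 5-state chain MDP: states 0-4, moves between neighbours
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
vals, vecs = eigenpurposes(A)
e = vecs[:, 1]                            # first non-constant PVF (Fiedler vector)
rewards = [e[i + 1] - e[i] for i in range(4)]   # intrinsic reward for stepping right
```

Options greedily maximizing such rewards traverse a principal direction of the state space, which is why they are useful for exploration without ever consulting the environment's own reward.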

Learned Spectral Super-Resolution

Title Learned Spectral Super-Resolution
Authors Silvano Galliani, Charis Lanaras, Dimitrios Marmanis, Emmanuel Baltsavias, Konrad Schindler
Abstract We describe a novel method for blind, single-image spectral super-resolution. While conventional super-resolution aims to increase the spatial resolution of an input image, our goal is to spectrally enhance the input, i.e., generate an image with the same spatial resolution, but a greatly increased number of narrow (hyper-spectral) wave-length bands. Just like the spatial statistics of natural images have rich structure, which one can exploit as a prior to predict high-frequency content from a low-resolution image, the same is true in the spectral domain: the materials and lighting conditions of the observed world induce structure in the spectrum of wavelengths observed at a given pixel. Surprisingly, very little work exists that attempts to exploit this structure and achieve blind spectral super-resolution from single images. We start from the conjecture that, just like in the spatial domain, we can learn the statistics of natural image spectra, and with their help generate finely resolved hyper-spectral images from RGB input. Technically, we follow the current best practice and implement a convolutional neural network (CNN), which is trained to carry out the end-to-end mapping from an entire RGB image to the corresponding hyperspectral image of equal size. We demonstrate spectral super-resolution both for conventional RGB images and for multi-spectral satellite data, outperforming the state-of-the-art.
Tasks Super-Resolution
Published 2017-03-28
URL http://arxiv.org/abs/1703.09470v1
PDF http://arxiv.org/pdf/1703.09470v1.pdf
PWC https://paperswithcode.com/paper/learned-spectral-super-resolution
Repo
Framework
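To make the RGB-to-spectrum mapping concrete, here is the crudest possible baseline: a per-pixel linear map fit by least squares on synthetic data. The random spectra and camera sensitivity curves are assumptions; the paper instead trains a CNN that also exploits spatial context and real spectral statistics.

```python
import numpy as np

rng = np.random.default_rng(2)
# toy data: 31-band spectra and the RGB triples a camera would record
spectra = rng.random(size=(1000, 31))
sensitivities = rng.random(size=(31, 3))   # hypothetical camera curves
rgb = spectra @ sensitivities

# per-pixel linear map RGB -> spectrum, fit by least squares: a crude
# stand-in for the learned end-to-end CNN mapping
M, *_ = np.linalg.lstsq(rgb, spectra, rcond=None)
recon = rgb @ M
```

Three numbers per pixel cannot pin down 31 bands, which is exactly why a learned prior over natural spectra is needed; the linear baseline only recovers the component of the spectrum explainable from RGB.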