May 5, 2019

Paper Group ANR 461

A Characterization of Prediction Errors. Deep CNNs along the Time Axis with Intermap Pooling for Robustness to Spectral Variations. Pano2Vid: Automatic Cinematography for Watching 360$^{\circ}$ Videos. Neural Net Models for Open-Domain Discourse Coherence. Unsupervised Learning with Truncated Gaussian Graphical Models. Dimension Projection among La …

A Characterization of Prediction Errors


Title	A Characterization of Prediction Errors
Authors	Christopher Meek
Abstract	Understanding prediction errors and determining how to fix them is critical to building effective predictive systems. In this paper, we delineate four types of prediction errors and demonstrate that these four types characterize all prediction errors. In addition, we describe potential remedies and tools that can be used to reduce the uncertainty when trying to determine the source of a prediction error and when trying to take action to remove a prediction error.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1611.05955v1
PDF	http://arxiv.org/pdf/1611.05955v1.pdf
PWC	https://paperswithcode.com/paper/a-characterization-of-prediction-errors
Repo
Framework

Deep CNNs along the Time Axis with Intermap Pooling for Robustness to Spectral Variations


Title	Deep CNNs along the Time Axis with Intermap Pooling for Robustness to Spectral Variations
Authors	Hwaran Lee, Geonmin Kim, Ho-Gyeong Kim, Sang-Hoon Oh, Soo-Young Lee
Abstract	Convolutional neural networks (CNNs) with convolutional and pooling operations along the frequency axis have been proposed to attain invariance to frequency shifts of features. However, this is inappropriate because acoustic features vary in frequency. In this paper, we contend that convolution along the time axis is more effective. We also propose the addition of an intermap pooling (IMP) layer to deep CNNs. In this layer, filters in each group extract common but spectrally variant features, then the layer pools the feature maps of each group. As a result, the proposed IMP CNN can achieve insensitivity to spectral variations characteristic of different speakers and utterances. The effectiveness of the IMP CNN architecture is demonstrated on several LVCSR tasks. Even without speaker adaptation techniques, the architecture achieved a WER of 12.7% on the SWB part of the Hub5’2000 evaluation test set, which is competitive with other state-of-the-art methods.
Tasks	Large Vocabulary Continuous Speech Recognition
Published	2016-06-10
URL	http://arxiv.org/abs/1606.03207v2
PDF	http://arxiv.org/pdf/1606.03207v2.pdf
PWC	https://paperswithcode.com/paper/deep-cnns-along-the-time-axis-with-intermap
Repo
Framework
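
The intermap pooling step described in the abstract above can be illustrated with a small sketch. It assumes that feature maps are stored as plain lists (one list of activations per map), that maps are grouped consecutively, and that pooling is an element-wise max within each group. The function name `intermap_max_pool` and the group size are illustrative and not taken from the paper.

```python
def intermap_max_pool(feature_maps, group_size):
    """Pool across feature maps: element-wise max within each group of maps.

    feature_maps: list of maps, each a list of activations of equal length.
    group_size: number of consecutive maps pooled together (assumed grouping).
    """
    if len(feature_maps) % group_size != 0:
        raise ValueError("number of maps must be divisible by group_size")
    pooled = []
    for start in range(0, len(feature_maps), group_size):
        group = feature_maps[start:start + group_size]
        # zip(*group) walks the maps in lockstep, one position at a time.
        pooled.append([max(values) for values in zip(*group)])
    return pooled


# Four toy maps pooled in groups of two give two output maps.
maps = [
    [0.1, 0.9, 0.2],
    [0.4, 0.3, 0.8],
    [0.5, 0.5, 0.5],
    [0.7, 0.1, 0.6],
]
pooled = intermap_max_pool(maps, group_size=2)
print(pooled)
```

Each output map keeps, at every position, the strongest response among the maps in its group, so a feature detected by any filter of the group survives the pooling.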

Pano2Vid: Automatic Cinematography for Watching 360$^{\circ}$ Videos


Title	Pano2Vid: Automatic Cinematography for Watching 360$^{\circ}$ Videos
Authors	Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman
Abstract	We introduce the novel task of Pano2Vid: automatic cinematography in panoramic 360$^{\circ}$ videos. Given a 360$^{\circ}$ video, the goal is to direct an imaginary camera to virtually capture natural-looking normal field-of-view (NFOV) video. By selecting “where to look” within the panorama at each time step, Pano2Vid aims to free both the videographer and the end viewer from the task of determining what to watch. Towards this goal, we first compile a dataset of 360$^{\circ}$ videos downloaded from the web, together with human-edited NFOV camera trajectories to facilitate evaluation. Next, we propose AutoCam, a data-driven approach to solve the Pano2Vid task. AutoCam leverages NFOV web video to discriminatively identify space-time “glimpses” of interest at each time instant, and then uses dynamic programming to select optimal human-like camera trajectories. Through experimental evaluation on multiple newly defined Pano2Vid performance measures against several baselines, we show that our method successfully produces informative videos that could conceivably have been captured by human videographers.
Tasks
Published	2016-12-07
URL	http://arxiv.org/abs/1612.02335v1
PDF	http://arxiv.org/pdf/1612.02335v1.pdf
PWC	https://paperswithcode.com/paper/pano2vid-automatic-cinematography-for
Repo
Framework
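
The dynamic-programming step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the AutoCam implementation: it assumes a precomputed interest score for every candidate glimpse at every time step and a smoothness rule that lets the camera move at most `max_step` glimpse positions between consecutive steps. The names `best_trajectory`, `scores`, and `max_step` are hypothetical.

```python
def best_trajectory(scores, max_step=1):
    """Pick one glimpse index per time step maximizing the total score,
    while moving at most `max_step` indices between consecutive steps.

    scores[t][k] is an (assumed) interest score for glimpse k at time t.
    """
    num_steps, num_glimpses = len(scores), len(scores[0])
    best = [list(scores[0])]          # best[t][k]: best total ending at k
    back = []                         # back[t-1][k]: predecessor of k at t
    for t in range(1, num_steps):
        row, pointers = [], []
        for k in range(num_glimpses):
            lo = max(0, k - max_step)
            hi = min(num_glimpses, k + max_step + 1)
            prev = max(range(lo, hi), key=lambda j: best[t - 1][j])
            row.append(best[t - 1][prev] + scores[t][k])
            pointers.append(prev)
        best.append(row)
        back.append(pointers)
    # Trace back from the best final glimpse.
    k = max(range(num_glimpses), key=lambda j: best[-1][j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path, max(best[-1])


# Toy scores for 3 time steps and 3 candidate glimpses.
scores = [
    [1, 0, 5],
    [0, 2, 0],
    [4, 0, 0],
]
path, total = best_trajectory(scores, max_step=1)
print(path, total)
```

Greedy per-step selection would jump from glimpse 2 straight to glimpse 0, which the smoothness rule forbids; the dynamic program instead finds the best trajectory among those that respect it.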

Neural Net Models for Open-Domain Discourse Coherence


Title	Neural Net Models for Open-Domain Discourse Coherence
Authors	Jiwei Li, Dan Jurafsky
Abstract	Discourse coherence is strongly associated with text quality, making it important to natural language generation and understanding. Yet existing models of coherence focus on measuring individual aspects of coherence (lexical overlap, rhetorical structure, entity centering) in narrow domains. In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences. We study both discriminative models that learn to distinguish coherent from incoherent discourse, and generative models that produce coherent text, including a novel neural latent-variable Markovian generative model that captures the latent discourse dependencies between sentences in a text. Our work achieves state-of-the-art performance on multiple coherence evaluations, and marks an initial step in generating coherent texts given discourse contexts.
Tasks	Text Generation
Published	2016-06-05
URL	http://arxiv.org/abs/1606.01545v3
PDF	http://arxiv.org/pdf/1606.01545v3.pdf
PWC	https://paperswithcode.com/paper/neural-net-models-for-open-domain-discourse
Repo
Framework

Unsupervised Learning with Truncated Gaussian Graphical Models


Title	Unsupervised Learning with Truncated Gaussian Graphical Models
Authors	Qinliang Su, Xuejun Liao, Chunyuan Li, Zhe Gan, Lawrence Carin
Abstract	Gaussian graphical models (GGMs) are widely used for statistical modeling, because of ease of inference and the ubiquitous use of the normal distribution in practical approximations. However, they are also known for their limited modeling abilities, due to the Gaussian assumption. In this paper, we introduce a novel variant of GGMs, which relaxes the Gaussian restriction and yet admits efficient inference. Specifically, we impose a bipartite structure on the GGM and govern the hidden variables by truncated normal distributions. The nonlinearity of the model is revealed by its connection to rectified linear unit (ReLU) neural networks. Meanwhile, thanks to the bipartite structure and appealing properties of truncated normals, we are able to train the models efficiently using contrastive divergence. We consider three output constructs, accounting for real-valued, binary and count data. We further extend the model to deep constructions and show that deep models can be used for unsupervised pre-training of rectifier neural networks. Extensive experimental results are provided to validate the proposed models and demonstrate their superiority over competing models.
Tasks
Published	2016-11-15
URL	http://arxiv.org/abs/1611.04920v2
PDF	http://arxiv.org/pdf/1611.04920v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-learning-with-truncated-gaussian
Repo
Framework
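
One way to build intuition for the link between truncated normals and ReLU units mentioned in the abstract above is to look at the mean of a normal distribution restricted to nonnegative values. The sketch below uses the standard truncated-normal mean formula and compares it with `max(0, mu)`; it is an illustration under that assumption, not the derivation used in the paper, and the function names are hypothetical.

```python
import math


def truncated_normal_mean(mu, sigma):
    """Mean of N(mu, sigma^2) restricted to x >= 0 (standard formula)."""
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # erfc keeps precision in the lower tail, where 1 + erf(...) would round to 0.
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return mu + sigma * pdf / cdf


def relu(x):
    return max(0.0, x)


# As sigma shrinks, the truncated mean approaches the ReLU of mu.
for mu in (-1.0, 0.5, 2.0):
    for sigma in (1.0, 0.1):
        print(mu, sigma, truncated_normal_mean(mu, sigma), relu(mu))
```

For large `sigma` the truncated mean is a smooth function of `mu`; for small `sigma` it hugs the rectifier shape.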

Dimension Projection among Languages based on Pseudo-relevant Documents for Query Translation


Title	Dimension Projection among Languages based on Pseudo-relevant Documents for Query Translation
Authors	Javid Dadashkarimi, Mahsa S. Shahshahani, Amirhossein Tebbifakhr, Heshaam Faili, Azadeh Shakery
Abstract	Using top-ranked documents in response to a query has been shown to be an effective approach to improve the quality of query translation in dictionary-based cross-language information retrieval. In this paper, we propose a new method for dictionary-based query translation based on dimension projection of embedded vectors from the pseudo-relevant documents in the source language to their equivalents in the target language. To this end, we first learn low-dimensional vectors of the words in the pseudo-relevant collections separately and then aim to find a query-dependent transformation matrix between the vectors of translation pairs that appear in the collections. In the next step, the representation of each query term is projected to the target language and then, after applying a softmax function, a query-dependent translation model is built. Finally, the model is used for query translation. Our experiments on four CLEF collections in French, Spanish, German, and Italian demonstrate that the proposed method outperforms a word embedding baseline based on bilingual shuffling and a number of further competitive baselines. The proposed method reaches up to 87% of the performance of machine translation (MT) on short queries and yields considerable improvements on verbose queries.
Tasks	Information Retrieval, Machine Translation
Published	2016-05-25
URL	http://arxiv.org/abs/1605.07844v2
PDF	http://arxiv.org/pdf/1605.07844v2.pdf
PWC	https://paperswithcode.com/paper/dimension-projection-among-languages-based-on
Repo
Framework
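
The projection and softmax steps described in the abstract above can be sketched with toy numbers. The sketch assumes the transformation matrix is already available (the paper learns it from translation pairs) and that target-language candidates are scored by a dot product with the projected vector; both the scoring rule and all names and values below are illustrative assumptions.

```python
import math


def project(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def translation_distribution(query_vec, transform, candidates):
    """Project a source-language term and score target-language candidates.

    candidates: dict mapping a target word to its embedding (toy values).
    Scoring by dot product is an assumption made for this sketch.
    """
    projected = project(transform, query_vec)
    words = list(candidates)
    scores = [sum(p * c for p, c in zip(projected, candidates[w])) for w in words]
    return dict(zip(words, softmax(scores)))


transform = [[0.0, 1.0], [1.0, 0.0]]   # toy matrix: swaps the two dimensions
query_vec = [2.0, 0.0]                 # toy embedding of a source term
candidates = {"maison": [0.0, 1.0], "chat": [1.0, 0.0]}
probs = translation_distribution(query_vec, transform, candidates)
print(probs)
```

The softmax turns raw similarity scores into a probability distribution over candidate translations, which is the form a translation model needs.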

A DNN Framework For Text Image Rectification From Planar Transformations


Title	A DNN Framework For Text Image Rectification From Planar Transformations
Authors	Chengzhe Yan, Jie Hu, Changshui Zhang
Abstract	In this paper, a novel neural network architecture is proposed that attempts to rectify text images under mild assumptions. A new dataset of text images is collected to verify our model and is open to the public. We explored the capability of deep neural networks in learning geometric transformations and found that the model could segment the text image without explicit supervised segmentation information. Experiments show that the proposed architecture can restore planar transformations robustly and effectively.
Tasks
Published	2016-11-14
URL	http://arxiv.org/abs/1611.04298v1
PDF	http://arxiv.org/pdf/1611.04298v1.pdf
PWC	https://paperswithcode.com/paper/a-dnn-framework-for-text-image-rectification
Repo
Framework

Equilibrium Graphs


Title	Equilibrium Graphs
Authors	Pedro Cabalar, Carlos Pérez, Gilberto Pérez
Abstract	In this paper we present an extension of Peirce’s existential graphs to provide a diagrammatic representation of expressions in Quantified Equilibrium Logic (QEL). Using this formalisation, logical connectives are replaced by encircled regions (circles and squares) and quantified variables are represented as “identity” lines. Although the expressive power is equivalent to that of QEL, the new representation can be useful for illustrative or educational purposes.
Tasks
Published	2016-09-07
URL	http://arxiv.org/abs/1609.02010v1
PDF	http://arxiv.org/pdf/1609.02010v1.pdf
PWC	https://paperswithcode.com/paper/equilibrium-graphs
Repo
Framework

Performance of a community detection algorithm based on semidefinite programming


Title	Performance of a community detection algorithm based on semidefinite programming
Authors	Adel Javanmard, Andrea Montanari, Federico Ricci-Tersenghi
Abstract	The problem of detecting communities in a graph is perhaps one of the most studied inference problems, given its simplicity and widespread diffusion among several disciplines. A very common benchmark for this problem is the stochastic block model or planted partition problem, where a phase transition takes place in the detection of the planted partition by changing the signal-to-noise ratio. Optimal algorithms for the detection exist which are based on spectral methods, but we show these are extremely sensitive to slight modifications in the generative model. Recently Javanmard, Montanari and Ricci-Tersenghi (arXiv:1511.08769) have used statistical physics arguments and numerical simulations to show that finding communities in the stochastic block model via semidefinite programming is quasi optimal. Further, the resulting semidefinite relaxation can be solved efficiently, and is very robust with respect to changes in the generative model. In this paper we study in detail several practical aspects of this new algorithm based on semidefinite programming for the detection of the planted partition. The algorithm turns out to be very fast, allowing the solution of problems with $O(10^5)$ variables in a few seconds on a laptop computer.
Tasks	Community Detection
Published	2016-03-30
URL	http://arxiv.org/abs/1603.09045v1
PDF	http://arxiv.org/pdf/1603.09045v1.pdf
PWC	https://paperswithcode.com/paper/performance-of-a-community-detection
Repo
Framework

Towards an “In-the-Wild” Emotion Dataset Using a Game-based Framework


Title	Towards an “In-the-Wild” Emotion Dataset Using a Game-based Framework
Authors	Wei Li, Farnaz Abtahi, Christina Tsangouri, Zhigang Zhu
Abstract	In order to create an “in-the-wild” dataset of facial emotions with a large number of balanced samples, this paper proposes a game-based data collection framework. The framework mainly includes three components: a game engine, a game interface, and a data collection and evaluation module. We use a deep learning approach to build an emotion classifier as the game engine. An emotion web game then lets gamers enjoy the game while the data collection module obtains automatically labelled emotion images. Using our game, we have collected more than 15,000 images within a month of the test run and built an emotion dataset, “GaMo”. To evaluate the dataset, we compared the performance of two deep learning models trained on both GaMo and CIFE. The results of our experiments show that, because it is large and balanced, GaMo can be used to build a more robust emotion detector than the emotion detector trained on CIFE, which was used in the game engine to collect the face images.
Tasks
Published	2016-07-10
URL	http://arxiv.org/abs/1607.02678v1
PDF	http://arxiv.org/pdf/1607.02678v1.pdf
PWC	https://paperswithcode.com/paper/towards-an-in-the-wild-emotion-dataset-using
Repo
Framework


Modification of Question Writing Style Influences Content Popularity in a Social Q&A System


Title	Modification of Question Writing Style Influences Content Popularity in a Social Q&A System
Authors	Igor A. Podgorny
Abstract	TurboTax AnswerXchange is a social Q&A system supporting users working on federal and state tax returns. Using 2015 data, we demonstrate that content popularity (or number of views per AnswerXchange question) can be predicted with reasonable accuracy based on attributes of the question alone. We also employ probabilistic topic analysis and uplift modeling to identify question features with the highest impact on popularity. We demonstrate that content popularity is driven by behavioral attributes of AnswerXchange users and depends on complex interactions between search ranking algorithms, psycholinguistic factors and question writing style. Our findings provide a rationale for employing popularity predictions to guide users in formulating better questions and editing existing ones. For example, starting the question title with a question word or adding details to the question increases the number of views per question. A similar approach can be applied to promoting AnswerXchange content indexed by Google to drive organic traffic to TurboTax.
Tasks
Published	2016-01-15
URL	http://arxiv.org/abs/1601.04075v1
PDF	http://arxiv.org/pdf/1601.04075v1.pdf
PWC	https://paperswithcode.com/paper/modification-of-question-writing-style
Repo
Framework

NonSTOP: A NonSTationary Online Prediction Method for Time Series


Title	NonSTOP: A NonSTationary Online Prediction Method for Time Series
Authors	Christopher Xie, Avleen Bijral, Juan Lavista Ferres
Abstract	We present online prediction methods for time series that let us explicitly handle nonstationary artifacts (e.g. trend and seasonality) present in most real time series. Specifically, we show that applying appropriate transformations to such time series before prediction can lead to improved theoretical and empirical prediction performance. Moreover, since these transformations are usually unknown, we employ the learning with experts setting to develop a fully online method (NonSTOP-NonSTationary Online Prediction) for predicting nonstationary time series. This framework allows for seasonality and/or other trends in univariate time series and cointegration in multivariate time series. Our algorithms and regret analysis subsume recent related work while significantly expanding the applicability of such methods. For all the methods, we provide sub-linear regret bounds using relaxed assumptions. The theoretical guarantees do not fully capture the benefits of the transformations, thus we provide a data-dependent analysis of the follow-the-leader algorithm that provides insight into the success of using such transformations. We support all of our results with experiments on simulated and real data.
Tasks	Time Series
Published	2016-11-08
URL	http://arxiv.org/abs/1611.02365v4
PDF	http://arxiv.org/pdf/1611.02365v4.pdf
PWC	https://paperswithcode.com/paper/nonstop-a-nonstationary-online-prediction
Repo
Framework
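
The idea in the abstract above of transforming a series before predicting it can be illustrated with differencing, a standard transformation for removing trend and seasonality. The sketch assumes a toy series with a known season length of 4; it does not reproduce the NonSTOP method itself, which chooses among transformations online in a learning-with-experts setting.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy series: linear trend (2 per step) plus a period-4 seasonal pattern.
season = [0, 3, -1, 1]
series = [2 * t + season[t % 4] for t in range(12)]

deseasonalized = difference(series, lag=4)   # removes the period-4 pattern
print(deseasonalized)

# Predict the next value by undoing the transformation:
# x[t] = x[t - 4] + (most recent seasonal difference).
next_value = series[-4] + deseasonalized[-1]
print(next_value)
```

After differencing at the seasonal lag, the toy series becomes constant, so even a trivial predictor on the transformed series yields an exact forecast once the transformation is inverted.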

The AGI Containment Problem


Title	The AGI Containment Problem
Authors	James Babcock, Janos Kramar, Roman Yampolskiy
Abstract	There is considerable uncertainty about what properties, capabilities and motivations future AGIs will have. In some plausible scenarios, AGIs may pose security risks arising from accidents and defects. In order to mitigate these risks, prudent early AGI research teams will perform significant testing on their creations before use. Unfortunately, if an AGI has human-level or greater intelligence, testing itself may not be safe; some natural AGI goal systems create emergent incentives for AGIs to tamper with their test environments, make copies of themselves on the internet, or convince developers and operators to do dangerous things. In this paper, we survey the AGI containment problem - the question of how to build a container in which tests can be conducted safely and reliably, even on AGIs with unknown motivations and capabilities that could be dangerous. We identify requirements for AGI containers, available mechanisms, and weaknesses that need to be addressed.
Tasks
Published	2016-04-02
URL	http://arxiv.org/abs/1604.00545v3
PDF	http://arxiv.org/pdf/1604.00545v3.pdf
PWC	https://paperswithcode.com/paper/the-agi-containment-problem
Repo
Framework

Privacy-Friendly Mobility Analytics using Aggregate Location Data


Title	Privacy-Friendly Mobility Analytics using Aggregate Location Data
Authors	Apostolos Pyrgelis, Emiliano De Cristofaro, Gordon Ross
Abstract	Location data can be extremely useful to study commuting patterns and disruptions, as well as to predict real-time traffic volumes. At the same time, however, the fine-grained collection of user locations raises serious privacy concerns, as this can reveal sensitive information about the users, such as lifestyle, political and religious inclinations, or even identities. In this paper, we study the feasibility of crowd-sourced mobility analytics over aggregate location information: users periodically report their location, using a privacy-preserving aggregation protocol, so that the server can only recover aggregates – i.e., how many, but not which, users are in a region at a given time. We experiment with real-world mobility datasets obtained from the Transport For London authority and the San Francisco Cabs network, and present a novel methodology based on time series modeling that is geared to forecast traffic volumes in regions of interest and to detect mobility anomalies in them. In the presence of anomalies, we also make enhanced traffic volume predictions by feeding our model with additional information from correlated regions. Finally, we present and evaluate a mobile app prototype, called Mobility Data Donors (MDD), in terms of computation, communication, and energy overhead, demonstrating the real-world deployability of our techniques.
Tasks	Time Series
Published	2016-09-21
URL	http://arxiv.org/abs/1609.06582v2
PDF	http://arxiv.org/pdf/1609.06582v2.pdf
PWC	https://paperswithcode.com/paper/privacy-friendly-mobility-analytics-using
Repo
Framework
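
The property described in the abstract above, where the server learns how many users are in a region but not which ones, can be illustrated with additive masking. This is a generic textbook-style sketch and not necessarily the protocol used in the paper: pairs of users share random masks that cancel in the sum. The modulus, the function names, and the use of a single shared random generator are assumptions for illustration; a real deployment would derive pairwise masks from a key agreement between users.

```python
import random

MODULUS = 2 ** 32  # assumed modulus; true totals must stay below it


def masked_reports(counts, rng):
    """Each user i holds counts[i] (a list of per-region counts).

    For every pair (i, j) with i < j, a shared random mask is added by
    user i and subtracted by user j, so all masks cancel in the sum.
    """
    num_users, num_regions = len(counts), len(counts[0])
    reports = [list(c) for c in counts]
    for i in range(num_users):
        for j in range(i + 1, num_users):
            for r in range(num_regions):
                mask = rng.randrange(MODULUS)
                reports[i][r] = (reports[i][r] + mask) % MODULUS
                reports[j][r] = (reports[j][r] - mask) % MODULUS
    return reports


def aggregate(reports):
    """Server side: sum the masked reports per region."""
    return [sum(column) % MODULUS for column in zip(*reports)]


# Three users, two regions; a 1 means "I am in this region".
counts = [[1, 0], [1, 0], [0, 1]]
reports = masked_reports(counts, random.Random(0))
totals = aggregate(reports)
print(totals)
```

Each individual report looks like random noise, yet the per-region totals come out exactly because every mask is added once and subtracted once.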


Sweep Distortion Removal from THz Images via Blind Demodulation


Title	Sweep Distortion Removal from THz Images via Blind Demodulation
Authors	Alireza Aghasi, Barmak Heshmat, Albert Redo-Sanchez, Justin Romberg, Ramesh Raskar
Abstract	Heavy sweep distortion induced by alignments and inter-reflections of layers of a sample is a major burden in recovering 2D and 3D information in time-resolved spectral imaging. This problem cannot be addressed by conventional denoising and signal processing techniques as it heavily depends on the physics of the acquisition. Here we propose and implement an algorithmic framework based on low-rank matrix recovery and alternating minimization that exploits the forward model for THz acquisition. The method allows recovering the original signal in spite of the presence of temporal-spatial distortions. We address a blind-demodulation problem, where based on several observations of the sample texture modulated by an undesired sweep pattern, the two classes of signals are separated. The performance of the method is examined on both synthetic and experimental data, and successful reconstructions are demonstrated. The proposed general scheme can be implemented to advance inspection and imaging applications in THz and other time-resolved sensing modalities.
Tasks	Denoising
Published	2016-03-29
URL	http://arxiv.org/abs/1604.03426v1
PDF	http://arxiv.org/pdf/1604.03426v1.pdf
PWC	https://paperswithcode.com/paper/sweep-distortion-removal-from-thz-images-via
Repo
Framework
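
Alternating minimization for a low-rank model, as mentioned in the abstract above, can be illustrated on the simplest possible case: a rank-1 matrix whose entries are the product of a "texture" factor and a "sweep" factor. The sketch below is a toy under that assumption and does not reproduce the paper's forward model for THz acquisition; the function name `rank_one_als` and the data are hypothetical.

```python
def rank_one_als(matrix, iterations=50):
    """Fit matrix[i][j] ~ a[i] * b[j] by alternating least squares."""
    rows, cols = len(matrix), len(matrix[0])
    a = [1.0] * rows
    b = [1.0] * cols
    for _ in range(iterations):
        # Fix b, solve for each a[i] in closed form.
        bb = sum(x * x for x in b)
        a = [sum(matrix[i][j] * b[j] for j in range(cols)) / bb
             for i in range(rows)]
        # Fix a, solve for each b[j] in closed form.
        aa = sum(x * x for x in a)
        b = [sum(matrix[i][j] * a[i] for i in range(rows)) / aa
             for j in range(cols)]
    return a, b


# Toy observation: a "texture" vector modulated by a "sweep" vector.
texture = [1.0, 2.0, 3.0]
sweep = [0.5, 1.0, 2.0, 4.0]
observed = [[t * s for s in sweep] for t in texture]

a, b = rank_one_als(observed)
reconstructed = [[x * y for y in b] for x in a]
error = max(abs(reconstructed[i][j] - observed[i][j])
            for i in range(3) for j in range(4))
print(error)
```

The two factors are recovered only up to a shared scale (multiplying `a` by a constant and dividing `b` by it gives the same product), which is a general feature of blind separation problems of this kind.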