Paper Group ANR 461
A Characterization of Prediction Errors
Title | A Characterization of Prediction Errors |
Authors | Christopher Meek |
Abstract | Understanding prediction errors and determining how to fix them is critical to building effective predictive systems. In this paper, we delineate four types of prediction errors and demonstrate that these four types characterize all prediction errors. In addition, we describe potential remedies and tools that can be used to reduce the uncertainty when trying to determine the source of a prediction error and when trying to take action to remove a prediction error. |
Tasks | |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.05955v1 |
http://arxiv.org/pdf/1611.05955v1.pdf | |
PWC | https://paperswithcode.com/paper/a-characterization-of-prediction-errors |
Repo | |
Framework | |
Deep CNNs along the Time Axis with Intermap Pooling for Robustness to Spectral Variations
Title | Deep CNNs along the Time Axis with Intermap Pooling for Robustness to Spectral Variations |
Authors | Hwaran Lee, Geonmin Kim, Ho-Gyeong Kim, Sang-Hoon Oh, Soo-Young Lee |
Abstract | Convolutional neural networks (CNNs) with convolutional and pooling operations along the frequency axis have been proposed to attain invariance to frequency shifts of features. However, this is inappropriate with regard to the fact that acoustic features vary in frequency. In this paper, we contend that convolution along the time axis is more effective. We also propose the addition of an intermap pooling (IMP) layer to deep CNNs. In this layer, filters in each group extract common but spectrally variant features, then the layer pools the feature maps of each group. As a result, the proposed IMP CNN can achieve insensitivity to spectral variations characteristic of different speakers and utterances. The effectiveness of the IMP CNN architecture is demonstrated on several LVCSR tasks. Even without speaker adaptation techniques, the architecture achieved a WER of 12.7% on the SWB part of the Hub5’2000 evaluation test set, which is competitive with other state-of-the-art methods. |
Tasks | Large Vocabulary Continuous Speech Recognition |
Published | 2016-06-10 |
URL | http://arxiv.org/abs/1606.03207v2 |
http://arxiv.org/pdf/1606.03207v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-cnns-along-the-time-axis-with-intermap |
Repo | |
Framework | |
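The intermap pooling (IMP) layer described in the abstract can be sketched in a few lines of numpy. This is an illustrative reading of the abstract, not the authors' implementation; the array shapes and group size are assumptions:

```python
import numpy as np

def intermap_max_pool(feature_maps, group_size):
    """Max-pool across the feature maps within each group, per position.

    feature_maps: (num_maps, height, width); num_maps must be divisible
    by group_size. Returns (num_maps // group_size, height, width).
    """
    num_maps, h, w = feature_maps.shape
    assert num_maps % group_size == 0
    grouped = feature_maps.reshape(num_maps // group_size, group_size, h, w)
    # Filters in a group respond to spectrally shifted variants of the same
    # feature; pooling over the group makes the output shift-insensitive.
    return grouped.max(axis=1)

# Toy check: two maps in one group holding shifted copies of a feature.
maps = np.zeros((2, 1, 5))
maps[0, 0, 1] = 1.0   # feature at frequency bin 1 in map 0
maps[1, 0, 3] = 1.0   # same feature, shifted, in map 1
pooled = intermap_max_pool(maps, group_size=2)
```

Because the max is taken across the maps in a group, a feature that shifts in frequency between maps still produces the same group output, which is the insensitivity to spectral variation the abstract describes.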
Pano2Vid: Automatic Cinematography for Watching 360$^{\circ}$ Videos
Title | Pano2Vid: Automatic Cinematography for Watching 360$^{\circ}$ Videos |
Authors | Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman |
Abstract | We introduce the novel task of Pano2Vid $-$ automatic cinematography in panoramic 360$^{\circ}$ videos. Given a 360$^{\circ}$ video, the goal is to direct an imaginary camera to virtually capture natural-looking normal field-of-view (NFOV) video. By selecting “where to look” within the panorama at each time step, Pano2Vid aims to free both the videographer and the end viewer from the task of determining what to watch. Towards this goal, we first compile a dataset of 360$^{\circ}$ videos downloaded from the web, together with human-edited NFOV camera trajectories to facilitate evaluation. Next, we propose AutoCam, a data-driven approach to solve the Pano2Vid task. AutoCam leverages NFOV web video to discriminatively identify space-time “glimpses” of interest at each time instant, and then uses dynamic programming to select optimal human-like camera trajectories. Through experimental evaluation on multiple newly defined Pano2Vid performance measures against several baselines, we show that our method successfully produces informative videos that could conceivably have been captured by human videographers. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02335v1 |
http://arxiv.org/pdf/1612.02335v1.pdf | |
PWC | https://paperswithcode.com/paper/pano2vid-automatic-cinematography-for |
Repo | |
Framework | |
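The trajectory-selection step (score glimpses per time instant, then use dynamic programming to pick a smooth camera path) can be sketched with a Viterbi-style recursion. The movement penalty and candidate positions below are illustrative assumptions, not AutoCam's actual cost model:

```python
import numpy as np

def select_trajectory(scores, positions, smooth_weight):
    """Viterbi-style DP: pick one glimpse per time step, maximizing the
    total interest score minus a penalty on camera movement.

    scores: (T, K) interest score of candidate glimpse k at time t.
    positions: (K,) angular position of each candidate glimpse.
    Returns the optimal glimpse index per time step.
    """
    T, K = scores.shape
    move_cost = smooth_weight * np.abs(positions[:, None] - positions[None, :])
    best = scores[0].copy()              # best total ending at each glimpse
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = best[:, None] - move_cost          # (prev K, cur K)
        back[t] = cand.argmax(axis=0)             # best predecessor
        best = cand.max(axis=0) + scores[t]
    path = [int(best.argmax())]                   # backtrack the path
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# With a heavy movement penalty, staying put beats chasing one high score.
positions = np.array([0.0, 10.0])
scores = np.array([[5.0, 0.0], [0.0, 6.0], [5.0, 0.0]])
path = select_trajectory(scores, positions, smooth_weight=1.0)
```

The greedy per-step choice would jump between glimpses; the DP trades instantaneous interest against smooth, human-like camera motion.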
Neural Net Models for Open-Domain Discourse Coherence
Title | Neural Net Models for Open-Domain Discourse Coherence |
Authors | Jiwei Li, Dan Jurafsky |
Abstract | Discourse coherence is strongly associated with text quality, making it important to natural language generation and understanding. Yet existing models of coherence focus on measuring individual aspects of coherence (lexical overlap, rhetorical structure, entity centering) in narrow domains. In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences. We study both discriminative models that learn to distinguish coherent from incoherent discourse, and generative models that produce coherent text, including a novel neural latent-variable Markovian generative model that captures the latent discourse dependencies between sentences in a text. Our work achieves state-of-the-art performance on multiple coherence evaluations, and marks an initial step in generating coherent texts given discourse contexts. |
Tasks | Text Generation |
Published | 2016-06-05 |
URL | http://arxiv.org/abs/1606.01545v3 |
http://arxiv.org/pdf/1606.01545v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-net-models-for-open-domain-discourse |
Repo | |
Framework | |
Unsupervised Learning with Truncated Gaussian Graphical Models
Title | Unsupervised Learning with Truncated Gaussian Graphical Models |
Authors | Qinliang Su, Xuejun Liao, Chunyuan Li, Zhe Gan, Lawrence Carin |
Abstract | Gaussian graphical models (GGMs) are widely used for statistical modeling, because of ease of inference and the ubiquitous use of the normal distribution in practical approximations. However, they are also known for their limited modeling abilities, due to the Gaussian assumption. In this paper, we introduce a novel variant of GGMs, which relaxes the Gaussian restriction and yet admits efficient inference. Specifically, we impose a bipartite structure on the GGM and govern the hidden variables by truncated normal distributions. The nonlinearity of the model is revealed by its connection to rectified linear unit (ReLU) neural networks. Meanwhile, thanks to the bipartite structure and appealing properties of truncated normals, we are able to train the models efficiently using contrastive divergence. We consider three output constructs, accounting for real-valued, binary and count data. We further extend the model to deep constructions and show that deep models can be used for unsupervised pre-training of rectifier neural networks. Extensive experimental results are provided to validate the proposed models and demonstrate their superiority over competing models. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04920v2 |
http://arxiv.org/pdf/1611.04920v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-with-truncated-gaussian |
Repo | |
Framework | |
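The ReLU connection mentioned in the abstract can be made concrete: the mean of a normal distribution truncated to the nonnegative reals approaches max(0, mu) as its scale shrinks. A minimal sketch using the standard closed-form mean of a truncated normal (the specific values are illustrative):

```python
import math

def truncated_normal_mean(mu, sigma):
    """Mean of N(mu, sigma^2) truncated to [0, inf).

    Uses the standard formula mu + sigma * phi(a) / Q(a) with a = -mu/sigma,
    where phi is the standard normal pdf and Q the upper tail probability.
    erfc is used for Q so the tails stay numerically accurate.
    """
    a = -mu / sigma
    phi = math.exp(-0.5 * a * a) / math.sqrt(2 * math.pi)
    Q = 0.5 * math.erfc(a / math.sqrt(2))   # P(standard normal > a)
    return mu + sigma * phi / Q

# As sigma shrinks, the truncated-normal mean approaches ReLU(mu).
for mu in (-2.0, -0.5, 0.5, 2.0):
    print(mu, truncated_normal_mean(mu, 0.1), max(0.0, mu))
```

A hidden unit whose conditional distribution is a truncated normal therefore behaves like a smoothed rectifier, which is the nonlinearity the abstract attributes to the model.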
Dimension Projection among Languages based on Pseudo-relevant Documents for Query Translation
Title | Dimension Projection among Languages based on Pseudo-relevant Documents for Query Translation |
Authors | Javid Dadashkarimi, Mahsa S. Shahshahani, Amirhossein Tebbifakhr, Heshaam Faili, Azadeh Shakery |
Abstract | Using top-ranked documents in response to a query has been shown to be an effective approach to improve the quality of query translation in dictionary-based cross-language information retrieval. In this paper, we propose a new method for dictionary-based query translation based on dimension projection of embedded vectors from the pseudo-relevant documents in the source language to their equivalents in the target language. To this end, we first learn low-dimensional vectors of the words in the pseudo-relevant collections separately and then aim to find a query-dependent transformation matrix between the vectors of translation pairs appearing in the collections. In the next step, the representation of each query term is projected to the target language and then, after applying a softmax function, a query-dependent translation model is built. Finally, the model is used for query translation. Our experiments on four CLEF collections in French, Spanish, German, and Italian demonstrate that the proposed method outperforms a word embedding baseline based on bilingual shuffling and several other competitive baselines. The proposed method reaches up to 87% of the performance of machine translation (MT) on short queries and achieves considerable improvements on verbose queries. |
Tasks | Information Retrieval, Machine Translation |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07844v2 |
http://arxiv.org/pdf/1605.07844v2.pdf | |
PWC | https://paperswithcode.com/paper/dimension-projection-among-languages-based-on |
Repo | |
Framework | |
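The core projection step (fit a transformation matrix between embedding spaces from translation pairs, project a query term, then apply a softmax over candidate translations) can be sketched with least squares. The toy vectors and cosine scoring below are assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings: 4 translation pairs in 3-d source and target spaces.
src = rng.normal(size=(4, 3))      # source-language vectors of the pairs
true_map = rng.normal(size=(3, 3))
tgt = src @ true_map               # target-language vectors (noise-free toy)

# Fit the query-dependent transformation matrix W by least squares,
# so that src @ W approximates tgt over the observed translation pairs.
W, *_ = np.linalg.lstsq(src, tgt, rcond=None)

# Project a query term into the target space, then build a softmax
# translation model over the candidate target-language words.
query_vec = src[0]
projected = query_vec @ W
tgt_unit = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
logits = tgt_unit @ (projected / np.linalg.norm(projected))  # cosine scores
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

In this noise-free toy the projection lands exactly on the correct translation's vector, so the softmax puts its highest probability on the right candidate.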
A DNN Framework For Text Image Rectification From Planar Transformations
Title | A DNN Framework For Text Image Rectification From Planar Transformations |
Authors | Chengzhe Yan, Jie Hu, Changshui Zhang |
Abstract | In this paper, a novel neural network architecture is proposed to rectify text images under mild assumptions. A new dataset of text images is collected to verify our model and is made publicly available. We explored the capability of deep neural networks in learning geometric transformations and found that the model can segment the text image without explicit supervised segmentation information. Experiments show that the proposed architecture can restore planar transformations with strong robustness and effectiveness. |
Tasks | |
Published | 2016-11-14 |
URL | http://arxiv.org/abs/1611.04298v1 |
http://arxiv.org/pdf/1611.04298v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dnn-framework-for-text-image-rectification |
Repo | |
Framework | |
Equilibrium Graphs
Title | Equilibrium Graphs |
Authors | Pedro Cabalar, Carlos Pérez, Gilberto Pérez |
Abstract | In this paper we present an extension of Peirce’s existential graphs to provide a diagrammatic representation of expressions in Quantified Equilibrium Logic (QEL). Using this formalisation, logical connectives are replaced by encircled regions (circles and squares) and quantified variables are represented as “identity” lines. Although the expressive power is equivalent to that of QEL, the new representation can be useful for illustrative or educational purposes. |
Tasks | |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.02010v1 |
http://arxiv.org/pdf/1609.02010v1.pdf | |
PWC | https://paperswithcode.com/paper/equilibrium-graphs |
Repo | |
Framework | |
Performance of a community detection algorithm based on semidefinite programming
Title | Performance of a community detection algorithm based on semidefinite programming |
Authors | Adel Javanmard, Andrea Montanari, Federico Ricci-Tersenghi |
Abstract | The problem of detecting communities in a graph is perhaps one of the most studied inference problems, given its simplicity and widespread diffusion among several disciplines. A very common benchmark for this problem is the stochastic block model or planted partition problem, where a phase transition takes place in the detection of the planted partition by changing the signal-to-noise ratio. Optimal algorithms for the detection exist which are based on spectral methods, but we show these are extremely sensitive to slight modifications of the generative model. Recently Javanmard, Montanari and Ricci-Tersenghi (arXiv:1511.08769) have used statistical physics arguments and numerical simulations to show that finding communities in the stochastic block model via semidefinite programming is quasi-optimal. Further, the resulting semidefinite relaxation can be solved efficiently, and is very robust with respect to changes in the generative model. In this paper we study in detail several practical aspects of this new algorithm based on semidefinite programming for the detection of the planted partition. The algorithm turns out to be very fast, allowing the solution of problems with $O(10^5)$ variables in a few seconds on a laptop computer. |
Tasks | Community Detection |
Published | 2016-03-30 |
URL | http://arxiv.org/abs/1603.09045v1 |
http://arxiv.org/pdf/1603.09045v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-of-a-community-detection |
Repo | |
Framework | |
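A low-rank (Burer-Monteiro style) coordinate-ascent surrogate for the semidefinite relaxation conveys the flavor of the approach on a toy stochastic block model. The centering, rank, and rounding below are illustrative choices, not the authors' exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two-community stochastic block model with a strong signal.
n, p_in, p_out = 60, 0.6, 0.02
labels = np.array([1] * (n // 2) + [-1] * (n // 2))
same = np.equal.outer(labels, labels)
A = (rng.random((n, n)) < np.where(same, p_in, p_out)).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # symmetric, zero diagonal
B = A - A.mean()                             # crude centering

# Burer-Monteiro surrogate for the SDP: maximize sum_ij B_ij <s_i, s_j>
# over unit vectors s_i by coordinate ascent (each update is closed form).
r = 5
S = rng.normal(size=(n, r))
S /= np.linalg.norm(S, axis=1, keepdims=True)
for _ in range(50):
    for i in range(n):
        g = B[i] @ S                          # direction maximizing s_i's term
        norm = np.linalg.norm(g)
        if norm > 0:
            S[i] = g / norm

# Round to two communities using the top eigenvector of the Gram matrix.
_, vecs = np.linalg.eigh(S @ S.T)
guess = np.sign(vecs[:, -1])
overlap = abs(guess @ labels) / n            # 1.0 means perfect recovery
```

Each coordinate update is cheap and local, which is what makes this family of SDP solvers fast enough for problems with many variables.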
Towards an “In-the-Wild” Emotion Dataset Using a Game-based Framework
Title | Towards an “In-the-Wild” Emotion Dataset Using a Game-based Framework |
Authors | Wei Li, Farnaz Abtahi, Christina Tsangouri, Zhigang Zhu |
Abstract | In order to create an “in-the-wild” dataset of facial emotions with a large number of balanced samples, this paper proposes a game-based data collection framework. The framework mainly includes three components: a game engine, a game interface, and a data collection and evaluation module. We use a deep learning approach to build an emotion classifier as the game engine. An emotion web game then lets gamers enjoy playing while the data collection module obtains automatically-labelled emotion images. Using our game, we have collected more than 15,000 images within a month of the test run and built an emotion dataset “GaMo”. To evaluate the dataset, we compared the performance of two deep learning models trained on both GaMo and CIFE. The results of our experiments show that because GaMo is large and balanced, it can be used to build a more robust emotion detector than one trained on CIFE, which was used in the game engine to collect the face images. |
Tasks | |
Published | 2016-07-10 |
URL | http://arxiv.org/abs/1607.02678v1 |
http://arxiv.org/pdf/1607.02678v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-in-the-wild-emotion-dataset-using |
Repo | |
Framework | |
Modification of Question Writing Style Influences Content Popularity in a Social Q&A System
Title | Modification of Question Writing Style Influences Content Popularity in a Social Q&A System |
Authors | Igor A. Podgorny |
Abstract | TurboTax AnswerXchange is a social Q&A system supporting users working on federal and state tax returns. Using 2015 data, we demonstrate that content popularity (or the number of views per AnswerXchange question) can be predicted with reasonable accuracy based on attributes of the question alone. We also employ probabilistic topic analysis and uplift modeling to identify question features with the highest impact on popularity. We demonstrate that content popularity is driven by behavioral attributes of AnswerXchange users and depends on complex interactions between search ranking algorithms, psycholinguistic factors and question writing style. Our findings provide a rationale for employing popularity predictions to guide users into formulating better questions and editing existing ones. For example, starting the question title with a question word or adding details to the question increases the number of views per question. A similar approach can be applied to promoting AnswerXchange content indexed by Google to drive organic traffic to TurboTax. |
Tasks | |
Published | 2016-01-15 |
URL | http://arxiv.org/abs/1601.04075v1 |
http://arxiv.org/pdf/1601.04075v1.pdf | |
PWC | https://paperswithcode.com/paper/modification-of-question-writing-style |
Repo | |
Framework | |
NonSTOP: A NonSTationary Online Prediction Method for Time Series
Title | NonSTOP: A NonSTationary Online Prediction Method for Time Series |
Authors | Christopher Xie, Avleen Bijral, Juan Lavista Ferres |
Abstract | We present online prediction methods for time series that let us explicitly handle nonstationary artifacts (e.g. trend and seasonality) present in most real time series. Specifically, we show that applying appropriate transformations to such time series before prediction can lead to improved theoretical and empirical prediction performance. Moreover, since these transformations are usually unknown, we employ the learning with experts setting to develop a fully online method (NonSTOP-NonSTationary Online Prediction) for predicting nonstationary time series. This framework allows for seasonality and/or other trends in univariate time series and cointegration in multivariate time series. Our algorithms and regret analysis subsume recent related work while significantly expanding the applicability of such methods. For all the methods, we provide sub-linear regret bounds using relaxed assumptions. The theoretical guarantees do not fully capture the benefits of the transformations, thus we provide a data-dependent analysis of the follow-the-leader algorithm that provides insight into the success of using such transformations. We support all of our results with experiments on simulated and real data. |
Tasks | Time Series |
Published | 2016-11-08 |
URL | http://arxiv.org/abs/1611.02365v4 |
http://arxiv.org/pdf/1611.02365v4.pdf | |
PWC | https://paperswithcode.com/paper/nonstop-a-nonstationary-online-prediction |
Repo | |
Framework | |
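The central idea, transform the series to remove nonstationarity and run an online learner on the transformed data, can be sketched with first differencing plus a follow-the-leader least-squares AR fit. The AR order and the toy series are assumptions, not the paper's experimental setup:

```python
import numpy as np

def online_predict_with_differencing(y, order=1):
    """One-step-ahead prediction of y after first differencing.

    Fits an AR(order) model on the differenced series by follow-the-leader
    (least squares on everything seen so far), then inverts the transform:
    the prediction of y[t + 1] is y[t] plus the predicted difference.
    """
    d = np.diff(y)                      # removes the linear-trend component
    preds = []
    for t in range(order, len(d)):
        X = np.array([d[s - order:s] for s in range(order, t)])
        z = d[order:t]
        if len(z) == 0:
            w = np.zeros(order)         # nothing seen yet
        else:
            w, *_ = np.linalg.lstsq(X, z, rcond=None)
        dhat = d[t - order:t] @ w       # predicted next difference
        preds.append(y[t] + dhat)       # prediction of y[t + 1]
    return np.array(preds)

# A trend-dominated series: differencing makes online prediction easy.
steps = np.arange(200, dtype=float)
y = 0.5 * steps + np.sin(0.3 * steps)
preds = online_predict_with_differencing(y)
errs = np.abs(preds - y[2:])
```

Without the differencing transform, the learner would chase the trend itself; with it, the residual series is nearly stationary and the simple online AR fit tracks it closely.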
The AGI Containment Problem
Title | The AGI Containment Problem |
Authors | James Babcock, Janos Kramar, Roman Yampolskiy |
Abstract | There is considerable uncertainty about what properties, capabilities and motivations future AGIs will have. In some plausible scenarios, AGIs may pose security risks arising from accidents and defects. In order to mitigate these risks, prudent early AGI research teams will perform significant testing on their creations before use. Unfortunately, if an AGI has human-level or greater intelligence, testing itself may not be safe; some natural AGI goal systems create emergent incentives for AGIs to tamper with their test environments, make copies of themselves on the internet, or convince developers and operators to do dangerous things. In this paper, we survey the AGI containment problem - the question of how to build a container in which tests can be conducted safely and reliably, even on AGIs with unknown motivations and capabilities that could be dangerous. We identify requirements for AGI containers, available mechanisms, and weaknesses that need to be addressed. |
Tasks | |
Published | 2016-04-02 |
URL | http://arxiv.org/abs/1604.00545v3 |
http://arxiv.org/pdf/1604.00545v3.pdf | |
PWC | https://paperswithcode.com/paper/the-agi-containment-problem |
Repo | |
Framework | |
Privacy-Friendly Mobility Analytics using Aggregate Location Data
Title | Privacy-Friendly Mobility Analytics using Aggregate Location Data |
Authors | Apostolos Pyrgelis, Emiliano De Cristofaro, Gordon Ross |
Abstract | Location data can be extremely useful to study commuting patterns and disruptions, as well as to predict real-time traffic volumes. At the same time, however, the fine-grained collection of user locations raises serious privacy concerns, as this can reveal sensitive information about the users, such as, life style, political and religious inclinations, or even identities. In this paper, we study the feasibility of crowd-sourced mobility analytics over aggregate location information: users periodically report their location, using a privacy-preserving aggregation protocol, so that the server can only recover aggregates – i.e., how many, but not which, users are in a region at a given time. We experiment with real-world mobility datasets obtained from the Transport For London authority and the San Francisco Cabs network, and present a novel methodology based on time series modeling that is geared to forecast traffic volumes in regions of interest and to detect mobility anomalies in them. In the presence of anomalies, we also make enhanced traffic volume predictions by feeding our model with additional information from correlated regions. Finally, we present and evaluate a mobile app prototype, called Mobility Data Donors (MDD), in terms of computation, communication, and energy overhead, demonstrating the real-world deployability of our techniques. |
Tasks | Time Series |
Published | 2016-09-21 |
URL | http://arxiv.org/abs/1609.06582v2 |
http://arxiv.org/pdf/1609.06582v2.pdf | |
PWC | https://paperswithcode.com/paper/privacy-friendly-mobility-analytics-using |
Repo | |
Framework | |
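The aggregation property, the server recovers how many users are in a region but not which ones, can be illustrated with additive secret sharing. The paper's actual protocol is not reproduced here; the share count and modulus are assumptions:

```python
import secrets

MOD = 2**32

def make_shares(value, num_shares):
    """Split an integer into additive shares summing to value mod MOD.

    Any subset of fewer than num_shares shares is uniformly random and
    reveals nothing about value.
    """
    shares = [secrets.randbelow(MOD) for _ in range(num_shares - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

# Each user reports a 0/1 indicator for a region, split across 3 aggregators.
user_indicators = [1, 0, 1, 1, 0]                  # 5 users, one region
all_shares = [make_shares(v, 3) for v in user_indicators]

# Each aggregator sums the shares it received; no single aggregator (or the
# server) learns any individual user's indicator.
aggregator_sums = [sum(col) % MOD for col in zip(*all_shares)]

# Combining the aggregate sums recovers only the count for the region.
total = sum(aggregator_sums) % MOD
```

Per-region counts like this are exactly the input the paper's time-series models consume for forecasting and anomaly detection.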
Sweep Distortion Removal from THz Images via Blind Demodulation
Title | Sweep Distortion Removal from THz Images via Blind Demodulation |
Authors | Alireza Aghasi, Barmak Heshmat, Albert Redo-Sanchez, Justin Romberg, Ramesh Raskar |
Abstract | Heavy sweep distortion induced by alignments and inter-reflections of layers of a sample is a major burden in recovering 2D and 3D information in time resolved spectral imaging. This problem cannot be addressed by conventional denoising and signal processing techniques as it heavily depends on the physics of the acquisition. Here we propose and implement an algorithmic framework based on low-rank matrix recovery and alternating minimization that exploits the forward model for THz acquisition. The method allows recovering the original signal in spite of the presence of temporal-spatial distortions. We address a blind-demodulation problem, where based on several observations of the sample texture modulated by an undesired sweep pattern, the two classes of signals are separated. The performance of the method is examined in both synthetic and experimental data, and the successful reconstructions are demonstrated. The proposed general scheme can be implemented to advance inspection and imaging applications in THz and other time-resolved sensing modalities. |
Tasks | Denoising |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1604.03426v1 |
http://arxiv.org/pdf/1604.03426v1.pdf | |
PWC | https://paperswithcode.com/paper/sweep-distortion-removal-from-thz-images-via |
Repo | |
Framework | |
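A stripped-down version of the separation problem, an observed image equal to a rank-1 modulation plus noise recovered by alternating minimization, can be sketched as follows. The rank-1 model and toy signals are assumptions far simpler than the paper's THz forward model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: observed image = outer(sweep, texture) + noise, i.e. the
# sample texture modulated row-wise by an undesired sweep pattern.
sweep = 1.0 + 0.5 * np.sin(np.linspace(0, 6, 40))   # undesired modulation
texture = rng.random(50)                            # sample texture
Y = np.outer(sweep, texture) + 0.01 * rng.normal(size=(40, 50))

# Alternating minimization over the two factors: each subproblem is a
# closed-form least-squares update, and alternating them separates the
# modulation from the texture (up to an overall scale).
s = np.ones(40)
for _ in range(30):
    t = Y.T @ s / (s @ s)        # best texture given the sweep estimate
    s = Y @ t / (t @ t)          # best sweep given the texture estimate

# Factors are identifiable only up to scale, so compare directions.
cos_sim = abs(s @ sweep) / (np.linalg.norm(s) * np.linalg.norm(sweep))
```

For this rank-1 case the alternation reduces to power iteration on Y, which is why it converges quickly; the paper's setting adds the THz acquisition physics and low-rank structure on top of the same alternating principle.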