Paper Group ANR 149
Using Neural Generative Models to Release Synthetic Twitter Corpora with Reduced Stylometric Identifiability of Users
Title | Using Neural Generative Models to Release Synthetic Twitter Corpora with Reduced Stylometric Identifiability of Users |
Authors | Alexander G. Ororbia II, Fridolin Linder, Joshua Snoke |
Abstract | We present a method for generating synthetic versions of Twitter data using neural generative models. The goal is protecting individuals in the source data from stylometric re-identification attacks while still releasing data that carries research value. Specifically, we generate tweet corpora that maintain user-level word distributions by augmenting the neural language models with user-specific components. We compare our approach to two standard text data protection methods: redaction and iterative translation. We evaluate the three methods on measures of risk and utility. We define risk following the stylometric models of re-identification, and we define utility based on two general word distribution measures and two common text analysis research tasks. We find that neural models are able to significantly lower risk over previous methods with little cost to utility. We also demonstrate that the neural models allow data providers to actively control the risk-utility trade-off through model tuning parameters. This work presents promising results for a new tool addressing the problem of privacy for free text and sharing social media data in a way that respects privacy and is ethically responsible. |
Tasks | |
Published | 2016-06-03 |
URL | http://arxiv.org/abs/1606.01151v4 |
http://arxiv.org/pdf/1606.01151v4.pdf | |
PWC | https://paperswithcode.com/paper/using-neural-generative-models-to-release |
Repo | |
Framework | |
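As a rough illustration of the abstract's risk-utility trade-off, the sketch below is a deliberately non-neural stand-in: synthetic words are sampled from a per-user word distribution smoothed toward the global corpus distribution, with a mixing weight `lam` playing the role of the model tuning parameter (near 1: user-faithful, higher utility and higher re-identification risk; near 0: flattened, lower risk). The function name and toy counts are invented for illustration; the paper itself uses neural language models with user-specific components.

```python
import numpy as np

def synthetic_words(user_counts, global_counts, lam, n, rng):
    """Sample n synthetic words from a mixture of the user's word
    distribution (weight lam) and the global distribution (1 - lam)."""
    vocab = sorted(global_counts)
    g = np.array([global_counts[w] for w in vocab], dtype=float)
    u = np.array([user_counts.get(w, 0) for w in vocab], dtype=float)
    p = lam * u / u.sum() + (1 - lam) * g / g.sum()
    return list(rng.choice(vocab, size=n, p=p))

rng = np.random.default_rng(0)
glob = {"the": 50, "cat": 10, "dog": 10, "sat": 30}   # corpus-level counts
user = {"cat": 9, "the": 1}                           # one user's counts
words = synthetic_words(user, glob, lam=0.9, n=200, rng=rng)
# With lam near 1, the user's characteristic word dominates the output.
assert words.count("cat") > words.count("dog")
```

Lowering `lam` pushes the output toward the global distribution, which is the knob-turning behaviour the abstract describes for its tuning parameters.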
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Title | Hierarchical Boundary-Aware Neural Encoder for Video Captioning |
Authors | Lorenzo Baraldi, Costantino Grana, Rita Cucchiara |
Abstract | The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose a novel LSTM cell, which can identify discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We evaluate our approach on three large-scale datasets: the Montreal Video Annotation dataset, the MPII Movie Description dataset and the Microsoft Video Description Corpus. Experiments show that our approach can discover appropriate hierarchical representations of input videos and improve the state-of-the-art results on movie description datasets. |
Tasks | Video Captioning, Video Description |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09312v3 |
http://arxiv.org/pdf/1611.09312v3.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-boundary-aware-neural-encoder |
Repo | |
Framework | |
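The core idea of the abstract, resetting the encoder's temporal connections at detected boundaries so each segment is encoded fresh, can be caricatured in a few lines. This toy version substitutes a frame-difference heuristic for the paper's learned boundary gate and a plain tanh RNN cell for its LSTM; both substitutions are assumptions made for brevity.

```python
import numpy as np

def boundary_aware_encode(frames, W, U, threshold=1.0):
    """Encode a sequence of frame vectors with a simple RNN, resetting
    the hidden state whenever consecutive frames differ sharply.
    Returns the final state and the list of detected boundary indices."""
    h = np.zeros(U.shape[0])
    boundaries = []
    prev = None
    for t, x in enumerate(frames):
        if prev is not None and np.linalg.norm(x - prev) > threshold:
            h = np.zeros_like(h)        # sever the temporal connection
            boundaries.append(t)
        h = np.tanh(W @ x + U @ h)      # ordinary recurrent update
        prev = x
    return h, boundaries

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3)) * 0.1
U = rng.standard_normal((4, 4)) * 0.1
shot_a = [np.zeros(3)] * 3              # a static shot...
shot_b = [np.ones(3) * 5] * 3           # ...then an abrupt content change
_, cuts = boundary_aware_encode(shot_a + shot_b, W, U)
assert cuts == [3]                      # the shot change is detected
```

In the paper the reset decision is itself learned end-to-end inside the LSTM cell rather than thresholded as here.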
Spectral Convolution Networks
Title | Spectral Convolution Networks |
Authors | Maria Francesca, Arthur Hughes, David Gregg |
Abstract | Previous research has shown that computation of convolution in the frequency domain provides a significant speedup versus traditional convolution network implementations. However, this performance increase comes at the expense of repeatedly computing the transform and its inverse in order to apply other network operations such as activation, pooling, and dropout. We show, mathematically, how convolution and activation can both be implemented in the frequency domain using either the Fourier or Laplace transformation. The main contributions are a description of spectral activation under the Fourier transform and a further description of an efficient algorithm for computing both convolution and activation under the Laplace transform. By computing both the convolution and activation functions in the frequency domain, we can reduce the number of transforms required, as well as reducing overall complexity. Our description of a spectral activation function, together with existing spectral analogs of other network functions may then be used to compose a fully spectral implementation of a convolution network. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05378v1 |
http://arxiv.org/pdf/1611.05378v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-convolution-networks |
Repo | |
Framework | |
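The speedup premise of the abstract rests on the convolution theorem: convolution in the spatial domain equals element-wise multiplication in the frequency domain. A minimal numerical check of that equivalence for circular convolution (function names here are illustrative):

```python
import numpy as np

def direct_circular_conv(x, k):
    """Circular convolution computed directly in the spatial domain: O(n^2)."""
    n = len(x)
    return np.array([sum(x[(i - j) % n] * k[j] for j in range(n))
                     for i in range(n)])

def spectral_conv(x, k):
    """The same circular convolution via the Fourier transform: O(n log n)."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
k = rng.standard_normal(8)
assert np.allclose(direct_circular_conv(x, k), spectral_conv(x, k))
```

The paper's contribution is to keep activation (not just convolution) in the frequency domain, so the transform and its inverse need not be recomputed between the two operations.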
Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks
Title | Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks |
Authors | Carsten Schnober, Steffen Eger, Erik-Lân Do Dinh, Iryna Gurevych |
Abstract | We analyze the performance of encoder-decoder neural models and compare them with well-known established methods. The latter represent different classes of traditional approaches that are applied to the monotone sequence-to-sequence tasks OCR post-correction, spelling correction, grapheme-to-phoneme conversion, and lemmatization. Such tasks are of practical relevance for various higher-level research fields including digital humanities, automatic text correction, and speech recognition. We investigate how well generic deep-learning approaches adapt to these tasks, and how they perform in comparison with established and more specialized methods, including our own adaptation of pruned CRFs. |
Tasks | Lemmatization, Optical Character Recognition, Speech Recognition, Spelling Correction |
Published | 2016-10-25 |
URL | http://arxiv.org/abs/1610.07796v2 |
http://arxiv.org/pdf/1610.07796v2.pdf | |
PWC | https://paperswithcode.com/paper/still-not-there-comparing-traditional |
Repo | |
Framework | |
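For readers unfamiliar with the traditional side of this comparison, the simplest conceivable baseline for a monotone string task such as spelling correction is nearest-neighbour lookup in a lexicon under Levenshtein edit distance. This is background illustration only; the paper's traditional baselines (e.g. pruned CRFs) are considerably stronger.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def correct(word, lexicon):
    """Return the lexicon entry closest to the observed word."""
    return min(lexicon, key=lambda w: edit_distance(word, w))

assert correct("lemmatiztion", ["lemmatization", "translation"]) == "lemmatization"
```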
A Feature based Approach for Video Compression
Title | A Feature based Approach for Video Compression |
Authors | Rajer Sindhu |
Abstract | Panoramic image stitching via image-matching algorithms is computationally expensive and impractical for real-time use. In this paper, we take full advantage of the Harris corner detector's invariance to light intensity, translation, and rotation to build a real-time panoramic image stitching algorithm. Based on the basic characteristics of the classical algorithm and the performance of the FPGA, modules such as feature-point extraction and match description are optimized with feature-based logic, yielding a real-time system with high matching precision. The new algorithm processes images in the pixel domain, acquired from a CCD camera on a Xilinx Spartan-6 hardware platform. After stitching, the result is delivered through a portable interface as high-definition content for display. The results show that the proposed algorithm achieves higher precision with good real-time performance and robustness. |
Tasks | Image Stitching, Video Compression |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08470v1 |
http://arxiv.org/pdf/1605.08470v1.pdf | |
PWC | https://paperswithcode.com/paper/a-feature-based-approach-for-video |
Repo | |
Framework | |
Dynamic Allocation of Crowd Contributions for Sentiment Analysis during the 2016 U.S. Presidential Election
Title | Dynamic Allocation of Crowd Contributions for Sentiment Analysis during the 2016 U.S. Presidential Election |
Authors | Mehrnoosh Sameki, Mattia Gentil, Kate K. Mays, Lei Guo, Margrit Betke |
Abstract | Opinions about the 2016 U.S. Presidential Candidates have been expressed in millions of tweets that are challenging to analyze automatically. Crowdsourcing the analysis of political tweets effectively is also difficult, due to large inter-rater disagreements when sarcasm is involved. Each tweet is typically analyzed by a fixed number of workers, with the final label decided by majority voting. We here propose a crowdsourcing framework that instead allocates the number of workers dynamically. We explore two dynamic-allocation methods: (1) the number of workers queried to label a tweet is computed offline, based on the predicted difficulty of discerning the sentiment of a particular tweet; (2) the number of crowd workers is determined online, during an iterative crowdsourcing process, based on inter-rater agreement between labels. We applied our approach to 1,000 Twitter messages about the four U.S. presidential candidates Clinton, Cruz, Sanders, and Trump, collected during February 2016. We implemented the two proposed methods using decision trees that allocate more crowd effort to tweets predicted to be sarcastic. We show that our framework outperforms the traditional static-allocation scheme: it collects opinion labels from the crowd at a much lower cost while maintaining labeling accuracy. |
Tasks | Sentiment Analysis |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08953v2 |
http://arxiv.org/pdf/1608.08953v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-allocation-of-crowd-contributions-for |
Repo | |
Framework | |
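The online allocation method described above can be sketched as a stopping rule: keep querying workers for a tweet until the majority label's share of the votes reaches an agreement threshold, or a worker budget is exhausted. The thresholds and the label stream below are illustrative assumptions, not the paper's exact parameters.

```python
from collections import Counter

def dynamic_label(get_label, min_workers=3, max_workers=7, agreement=0.7):
    """Query workers one at a time; stop early once the majority label's
    share of collected votes reaches the agreement threshold.
    Returns (label, number_of_workers_used)."""
    votes = []
    while len(votes) < max_workers:
        votes.append(get_label())
        if len(votes) >= min_workers:
            label, count = Counter(votes).most_common(1)[0]
            if count / len(votes) >= agreement:
                return label, len(votes)
    return Counter(votes).most_common(1)[0][0], len(votes)

# An easy (non-sarcastic) tweet: unanimous workers stop after 3 votes.
labels = iter(["positive"] * 10)
label, n = dynamic_label(lambda: next(labels))
assert label == "positive" and n == 3
```

Harder (e.g. sarcastic) tweets trigger more queries, which is how the scheme spends the crowd budget where it matters.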
Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding
Title | Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding |
Authors | Edoardo Maria Ponti, Elisabetta Jezek, Bernardo Magnini |
Abstract | Lexical sets contain the words filling the argument positions of a verb in one of its senses. They can be grounded empirically through their automatic extraction from corpora. The purpose of this paper is to demonstrate that their vector representation based on word embeddings provides insights into many linguistic phenomena, and in particular into verbs undergoing the causative-inchoative alternation. A first experiment investigates the internal structure of the sets, which are known to be cognitively radial and continuous categories. A second experiment shows that the distance between the subject set and the object set is correlated with a semantic factor, namely the spontaneity of the verb. |
Tasks | |
Published | 2016-10-03 |
URL | http://arxiv.org/abs/1610.00765v1 |
http://arxiv.org/pdf/1610.00765v1.pdf | |
PWC | https://paperswithcode.com/paper/grounding-the-lexical-sets-of-causative |
Repo | |
Framework | |
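The measure behind the second experiment can be sketched as follows: represent a verb's subject set and object set by the centroids of their word vectors, and take the cosine distance between centroids. The tiny 3-d vectors below are invented for illustration; the paper uses embeddings trained on corpora.

```python
import numpy as np

def centroid_cosine_distance(subj_vecs, obj_vecs):
    """Cosine distance between the centroids of two sets of word vectors."""
    s = np.mean(subj_vecs, axis=0)
    o = np.mean(obj_vecs, axis=0)
    return 1.0 - s @ o / (np.linalg.norm(s) * np.linalg.norm(o))

# Hypothetical vectors: for a verb like "break", animate subjects and
# inanimate objects should sit far apart in embedding space.
subjects = [np.array([1.0, 0.1, 0.0]), np.array([0.9, 0.2, 0.1])]
objects = [np.array([0.0, 0.1, 1.0]), np.array([0.1, 0.0, 0.9])]
assert centroid_cosine_distance(subjects, objects) > \
       centroid_cosine_distance(subjects, subjects)
```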
Training Constrained Deconvolutional Networks for Road Scene Semantic Segmentation
Title | Training Constrained Deconvolutional Networks for Road Scene Semantic Segmentation |
Authors | German Ros, Simon Stent, Pablo F. Alcantarilla, Tomoki Watanabe |
Abstract | In this work we investigate the problem of road scene semantic segmentation using Deconvolutional Networks (DNs). Several constraints limit the practical performance of DNs in this context: firstly, the paucity of existing pixel-wise labelled training data, and secondly, the memory constraints of embedded hardware, which rule out the practical use of state-of-the-art DN architectures such as fully convolutional networks (FCN). To address the first constraint, we introduce a Multi-Domain Road Scene Semantic Segmentation (MDRS3) dataset, aggregating data from six existing densely and sparsely labelled datasets for training our models, and two existing, separate datasets for testing their generalisation performance. We show that, while MDRS3 offers a greater volume and variety of data, end-to-end training of a memory efficient DN does not yield satisfactory performance. We propose a new training strategy to overcome this, based on (i) the creation of a best-possible source network (S-Net) from the aggregated data, ignoring time and memory constraints; and (ii) the transfer of knowledge from S-Net to the memory-efficient target network (T-Net). We evaluate different techniques for S-Net creation and T-Net transferral, and demonstrate that training a constrained deconvolutional network in this manner can unlock better performance than existing training approaches. Specifically, we show that a target network can be trained to achieve improved accuracy versus an FCN despite using less than 1% of the memory. We believe that our approach can be useful beyond automotive scenarios where labelled data is similarly scarce or fragmented and where practical constraints exist on the desired model size. We make available our network models and aggregated multi-domain dataset for reproducibility. |
Tasks | Semantic Segmentation |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01545v1 |
http://arxiv.org/pdf/1604.01545v1.pdf | |
PWC | https://paperswithcode.com/paper/training-constrained-deconvolutional-networks |
Repo | |
Framework | |
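One common instance of the S-Net-to-T-Net transfer described above is distillation: training the compact target on the large source network's softened per-class scores. The sketch below shows only the loss computation under that assumption; the paper evaluates several transfer techniques, and this is not claimed to be its exact procedure.

```python
import numpy as np

def softmax(logits, T=1.0, axis=-1):
    """Temperature-scaled softmax, numerically stabilized."""
    z = logits / T
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between softened teacher and student distributions,
    averaged over pixels (rows). Lower when the student mimics the teacher."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum(axis=-1).mean())

rng = np.random.default_rng(1)
teacher = rng.standard_normal((4, 3))   # 4 pixels, 3 classes of S-Net logits
# A student matching the teacher incurs a lower loss than a random one.
assert distillation_loss(teacher, teacher) < \
       distillation_loss(rng.standard_normal((4, 3)), teacher)
```

The temperature `T` softens the teacher's distribution so the student also learns the relative scores of non-argmax classes.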
Interacting Particle Markov Chain Monte Carlo
Title | Interacting Particle Markov Chain Monte Carlo |
Authors | Tom Rainforth, Christian A. Naesseth, Fredrik Lindsten, Brooks Paige, Jan-Willem van de Meent, Arnaud Doucet, Frank Wood |
Abstract | We introduce interacting particle Markov chain Monte Carlo (iPMCMC), a PMCMC method based on an interacting pool of standard and conditional sequential Monte Carlo samplers. Like related methods, iPMCMC is a Markov chain Monte Carlo sampler on an extended space. We present empirical results that show significant improvements in mixing rates relative to both non-interacting PMCMC samplers, and a single PMCMC sampler with an equivalent memory and computational budget. An additional advantage of the iPMCMC method is that it is suitable for distributed and multi-core architectures. |
Tasks | |
Published | 2016-02-16 |
URL | http://arxiv.org/abs/1602.05128v3 |
http://arxiv.org/pdf/1602.05128v3.pdf | |
PWC | https://paperswithcode.com/paper/interacting-particle-markov-chain-monte-carlo |
Repo | |
Framework | |
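PMCMC methods build on sequential Monte Carlo (SMC), so as background for iPMCMC, here is a minimal bootstrap particle filter for a toy linear-Gaussian model x_t = 0.9 x_{t-1} + v_t, y_t = x_t + w_t. The model and parameters are illustrative, not from the paper; a full iPMCMC sampler runs a pool of such unconditional SMC sweeps alongside conditional SMC sweeps and lets them exchange retained trajectories.

```python
import numpy as np

def smc(ys, n_particles=500, rho=0.9, sv=1.0, sw=0.5, rng=None):
    """Bootstrap particle filter; returns the log marginal likelihood
    estimate log p(y_{1:T}) for the toy linear-Gaussian model above."""
    rng = rng or np.random.default_rng(0)
    x = rng.normal(0.0, sv, n_particles)        # particles from the prior
    log_z = 0.0
    for y in ys:
        # Gaussian observation log-likelihood of each particle.
        logw = -0.5 * ((y - x) / sw) ** 2 - np.log(sw * np.sqrt(2 * np.pi))
        m = logw.max()
        w = np.exp(logw - m)
        log_z += m + np.log(w.mean())           # incremental evidence
        idx = rng.choice(n_particles, n_particles, p=w / w.sum())
        x = rng.normal(rho * x[idx], sv)        # resample, then propagate
    return log_z

log_z = smc([0.1, -0.2, 0.3])
assert np.isfinite(log_z)
```

Each such sweep yields an unbiased estimate of the marginal likelihood, which is the quantity the interacting pool in iPMCMC uses to swap trajectories between its standard and conditional samplers.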
An Empirical-Bayes Score for Discrete Bayesian Networks
Title | An Empirical-Bayes Score for Discrete Bayesian Networks |
Authors | Marco Scutari |
Abstract | Bayesian network structure learning is often performed in a Bayesian setting, by evaluating candidate structures using their posterior probabilities for a given data set. Score-based algorithms then use those posterior probabilities as an objective function and return the maximum a posteriori network as the learned model. For discrete Bayesian networks, the canonical choice for a posterior score is the Bayesian Dirichlet equivalent uniform (BDeu) marginal likelihood with a uniform (U) graph prior (Heckerman et al., 1995). Its favourable theoretical properties descend from assuming a uniform prior both on the space of the network structures and on the space of the parameters of the network. In this paper, we revisit the limitations of these assumptions; and we introduce an alternative set of assumptions and the resulting score: the Bayesian Dirichlet sparse (BDs) empirical Bayes marginal likelihood with a marginal uniform (MU) graph prior. We evaluate its performance in an extensive simulation study, showing that MU+BDs is more accurate than U+BDeu both in learning the structure of the network and in predicting new observations, while not being computationally more complex to estimate. |
Tasks | |
Published | 2016-05-12 |
URL | http://arxiv.org/abs/1605.03884v3 |
http://arxiv.org/pdf/1605.03884v3.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-bayes-score-for-discrete |
Repo | |
Framework | |
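For concreteness, the baseline U+BDeu score discussed above has a closed Dirichlet-multinomial form per node. The textbook sketch below computes that local log marginal likelihood (this is the standard BDeu baseline, not the proposed BDs score):

```python
from math import lgamma

def bdeu_local_score(counts, ess=1.0):
    """Log BDeu local marginal likelihood for one discrete node.
    counts[j][k] = N_ijk: observations of node state k under parent
    configuration j; ess is the equivalent (imaginary) sample size."""
    q = len(counts)          # number of parent configurations
    r = len(counts[0])       # number of node states
    a_jk = ess / (q * r)     # per-cell Dirichlet pseudo-count
    a_j = ess / q            # per-configuration pseudo-count
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score

# Data concentrated on one state scores higher than uniform noise.
assert bdeu_local_score([[10, 0]]) > bdeu_local_score([[5, 5]])
```

The paper's BDs score replaces the uniform pseudo-count allocation with an empirical-Bayes choice that assigns mass only to parent configurations observed in the data.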
System Combination for Short Utterance Speaker Recognition
Title | System Combination for Short Utterance Speaker Recognition |
Authors | Lantian Li, Dong Wang, Xiaodong Zhang, Thomas Fang Zheng, Panshi Jin |
Abstract | In text-independent short-utterance speaker recognition (SUSR), performance often degrades dramatically. This paper presents a combination approach to SUSR tasks with two phonetic-aware systems: one is the DNN-based i-vector system and the other is our recently proposed subregion-based GMM-UBM system. The former employs phone posteriors to construct an i-vector model in which the shared statistics offer stronger robustness against limited test data, while the latter establishes a phone-dependent GMM-UBM system which represents speaker characteristics in more detail. A score-level fusion is implemented to integrate the respective advantages of the two systems. Experimental results show that for the text-independent SUSR task, both the DNN-based i-vector system and the subregion-based GMM-UBM system outperform their respective baselines, and the score-level system combination delivers a further performance improvement. |
Tasks | Speaker Recognition |
Published | 2016-03-31 |
URL | http://arxiv.org/abs/1603.09460v2 |
http://arxiv.org/pdf/1603.09460v2.pdf | |
PWC | https://paperswithcode.com/paper/system-combination-for-short-utterance |
Repo | |
Framework | |
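The score-level fusion step mentioned above reduces, in its simplest form, to normalizing each system's trial scores and taking a weighted sum. The weight value below is illustrative; in practice it is tuned on development data.

```python
import numpy as np

def zscore(scores):
    """Z-normalize a vector of trial scores."""
    s = np.asarray(scores, dtype=float)
    return (s - s.mean()) / s.std()

def fuse(scores_a, scores_b, w=0.5):
    """Weighted sum of z-normalized scores from two recognition systems."""
    return w * zscore(scores_a) + (1 - w) * zscore(scores_b)

# Trials both systems agree on keep their ranking after fusion.
fused = fuse([2.0, 0.5, -1.0], [1.5, 0.2, -0.8])
assert fused.argmax() == 0
```

Normalizing first matters because the i-vector and GMM-UBM systems produce scores on different scales; without it one system would dominate the sum.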
Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection
Title | Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection |
Authors | Jiaolong Yang, Hongdong Li, Yuchao Dai, Robby T. Tan |
Abstract | This paper deals with a challenging, frequently encountered, yet not properly investigated problem in two-frame optical flow estimation. That is, the input frames are compounds of two imaging layers – one desired background layer of the scene, and one distracting, possibly moving layer due to transparency or reflection. In this situation, the conventional brightness constancy constraint – the cornerstone of most existing optical flow methods – will no longer be valid. In this paper, we propose a robust solution to this problem. The proposed method performs both optical flow estimation, and image layer separation. It exploits a generalized double-layer brightness consistency constraint connecting these two tasks, and utilizes the priors for both of them. Experiments on both synthetic data and real images have confirmed the efficacy of the proposed method. To the best of our knowledge, this is the first attempt towards handling generic optical flow fields of two-frame images containing transparency or reflection. |
Tasks | Optical Flow Estimation |
Published | 2016-05-06 |
URL | http://arxiv.org/abs/1605.01825v1 |
http://arxiv.org/pdf/1605.01825v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-optical-flow-estimation-of-double |
Repo | |
Framework | |
Identifying Structures in Social Conversations in NSCLC Patients through the Semi-Automatic extraction of Topical Taxonomies
Title | Identifying Structures in Social Conversations in NSCLC Patients through the Semi-Automatic extraction of Topical Taxonomies |
Authors | Giancarlo Crocetti, Amir A. Delay, Fatemeh Seyedmendhi |
Abstract | The exploration of social conversations for addressing patients' needs is an important analytical task to which many scholarly publications are contributing, filling the knowledge gap in this area. The main difficulty remains the inability to turn such contributions into pragmatic processes that the pharmaceutical industry can leverage in order to generate insight from social media data, which can be considered one of the most challenging sources of information available today due to its sheer volume and noise. This study builds on the work by Scott Spangler and Jeffrey Kreulen and applies it to identify structure in social media through the extraction of a topical taxonomy able to capture the latent knowledge in social conversations on health-related sites. A mechanism for automatically identifying and generating a taxonomy from social conversations is developed and pressure-tested using public data from media sites focused on the needs of cancer patients and their families. Moreover, a novel method for generating category labels and determining an optimal number of categories is presented, which extends Spangler and Kreulen's research in a meaningful way. We assume the reader is familiar with taxonomies, what they are, and how they are used. |
Tasks | |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.04709v1 |
http://arxiv.org/pdf/1602.04709v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-structures-in-social |
Repo | |
Framework | |
Very Fast Kernel SVM under Budget Constraints
Title | Very Fast Kernel SVM under Budget Constraints |
Authors | David Picard |
Abstract | In this paper we propose a fast online Kernel SVM algorithm under tight budget constraints. We propose to split the input space using LVQ and to train a Kernel SVM in each cluster. To allow for online training, we limit the size of the support vector set of each cluster using different strategies. Our experiments show that the algorithm achieves high accuracy while processing a very high number of samples per second, both in training and in evaluation. |
Tasks | |
Published | 2016-12-31 |
URL | http://arxiv.org/abs/1701.00167v1 |
http://arxiv.org/pdf/1701.00167v1.pdf | |
PWC | https://paperswithcode.com/paper/very-fast-kernel-svm-under-budget-constraints |
Repo | |
Framework | |
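The budget mechanism described above can be sketched with a kernel perceptron standing in for the SVM solver: train online, add a support vector only on mistakes, and discard the oldest one whenever the set exceeds the budget (one of several possible removal strategies; the paper compares alternatives). A full version would route each sample to its nearest LVQ prototype and keep one such budgeted model per cluster.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel between two vectors."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

class BudgetKernelPerceptron:
    def __init__(self, budget=10):
        self.budget, self.sv, self.alpha = budget, [], []

    def decision(self, x):
        """Kernel expansion over the current support set."""
        return sum(a * rbf(s, x) for s, a in zip(self.sv, self.alpha))

    def fit_one(self, x, y):               # y in {-1, +1}
        if y * self.decision(x) <= 0:      # mistake-driven update
            self.sv.append(x)
            self.alpha.append(float(y))
            if len(self.sv) > self.budget: # enforce the budget:
                self.sv.pop(0)             # drop the oldest support vector
                self.alpha.pop(0)

rng = np.random.default_rng(0)
model = BudgetKernelPerceptron(budget=20)
for _ in range(3):                         # a few online passes
    for _ in range(100):
        x = rng.standard_normal(2)
        y = 1.0 if x[0] > 0 else -1.0      # a linearly separable toy task
        model.fit_one(x, y)
assert len(model.sv) <= 20                 # the budget holds throughout
```

Because `decision` only touches the bounded support set, prediction cost stays constant no matter how many samples stream past, which is what makes the per-second throughput claims possible.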
A Benchmark Dataset and Saliency-guided Stacked Autoencoders for Video-based Salient Object Detection
Title | A Benchmark Dataset and Saliency-guided Stacked Autoencoders for Video-based Salient Object Detection |
Authors | Jia Li, Changqun Xia, Xiaowu Chen |
Abstract | Image-based salient object detection (SOD) has been extensively studied in the past decades. However, video-based SOD is much less explored, since large-scale video datasets in which salient objects are unambiguously defined and annotated are lacking. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos (64 minutes). In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in video can be defined as objects that consistently pop out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed without supervision to automatically infer a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. Experimental results show that the proposed unsupervised approach outperforms 30 state-of-the-art models on the proposed dataset, including 19 image-based & classic (unsupervised or non-deep-learning) models, 6 image-based & deep-learning models, and 5 video-based & unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD. |
Tasks | Eye Tracking, Object Detection, Salient Object Detection |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00135v2 |
http://arxiv.org/pdf/1611.00135v2.pdf | |
PWC | https://paperswithcode.com/paper/a-benchmark-dataset-and-saliency-guided |
Repo | |
Framework | |