Paper Group ANR 149
Using Neural Generative Models to Release Synthetic Twitter Corpora with Reduced Stylometric Identifiability of Users
Title | Using Neural Generative Models to Release Synthetic Twitter Corpora with Reduced Stylometric Identifiability of Users |
Authors | Alexander G. Ororbia II, Fridolin Linder, Joshua Snoke |
Abstract | We present a method for generating synthetic versions of Twitter data using neural generative models. The goal is protecting individuals in the source data from stylometric re-identification attacks while still releasing data that carries research value. Specifically, we generate tweet corpora that maintain user-level word distributions by augmenting the neural language models with user-specific components. We compare our approach to two standard text data protection methods: redaction and iterative translation. We evaluate the three methods on measures of risk and utility. We define risk following the stylometric models of re-identification, and we define utility based on two general word distribution measures and two common text analysis research tasks. We find that neural models are able to significantly lower risk over previous methods with little cost to utility. We also demonstrate that the neural models allow data providers to actively control the risk-utility trade-off through model tuning parameters. This work presents promising results for a new tool addressing the problem of privacy for free text and sharing social media data in a way that respects privacy and is ethically responsible. |
Tasks | |
Published | 2016-06-03 |
URL | http://arxiv.org/abs/1606.01151v4 |
http://arxiv.org/pdf/1606.01151v4.pdf | |
PWC | https://paperswithcode.com/paper/using-neural-generative-models-to-release |
Repo | |
Framework | |
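As a rough illustration of the abstract's risk-utility trade-off, the sketch below is a deliberately non-neural stand-in: synthetic words are sampled from a per-user word distribution smoothed toward the global corpus distribution, with a mixing weight `lam` playing the role of the model tuning parameter (near 1: user-faithful, higher utility and higher re-identification risk; near 0: flattened, lower risk). The function name and toy counts are invented for illustration; the paper itself uses neural language models with user-specific components.

```python
import numpy as np

def synthetic_words(user_counts, global_counts, lam, n, rng):
    """Sample n synthetic words from a mixture of the user's word
    distribution (weight lam) and the global distribution (1 - lam)."""
    vocab = sorted(global_counts)
    g = np.array([global_counts[w] for w in vocab], dtype=float)
    u = np.array([user_counts.get(w, 0) for w in vocab], dtype=float)
    p = lam * u / u.sum() + (1 - lam) * g / g.sum()
    return list(rng.choice(vocab, size=n, p=p))

rng = np.random.default_rng(0)
glob = {"the": 50, "cat": 10, "dog": 10, "sat": 30}   # corpus-level counts
user = {"cat": 9, "the": 1}                           # one user's counts
words = synthetic_words(user, glob, lam=0.9, n=200, rng=rng)
# With lam near 1, the user's characteristic word dominates the output.
assert words.count("cat") > words.count("dog")
```

Lowering `lam` pushes the output toward the global distribution, which is the knob-turning behaviour the abstract describes for its tuning parameters.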
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Title | Hierarchical Boundary-Aware Neural Encoder for Video Captioning |
Authors | Lorenzo Baraldi, Costantino Grana, Rita Cucchiara |
Abstract | The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose a novel LSTM cell, which can identify discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We evaluate our approach on three large-scale datasets: the Montreal Video Annotation dataset, the MPII Movie Description dataset and the Microsoft Video Description Corpus. Experiments show that our approach can discover appropriate hierarchical representations of input videos and improve the state-of-the-art results on movie description datasets. |
Tasks | Video Captioning, Video Description |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09312v3 |
http://arxiv.org/pdf/1611.09312v3.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-boundary-aware-neural-encoder |
Repo | |
Framework | |
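The core idea of the abstract, resetting the encoder's temporal connections at detected boundaries so each segment is encoded fresh, can be caricatured in a few lines. This toy version substitutes a frame-difference heuristic for the paper's learned boundary gate and a plain tanh RNN cell for its LSTM; both substitutions are assumptions made for brevity.

```python
import numpy as np

def boundary_aware_encode(frames, W, U, threshold=1.0):
    """Encode a sequence of frame vectors with a simple RNN, resetting
    the hidden state whenever consecutive frames differ sharply.
    Returns the final state and the list of detected boundary indices."""
    h = np.zeros(U.shape[0])
    boundaries = []
    prev = None
    for t, x in enumerate(frames):
        if prev is not None and np.linalg.norm(x - prev) > threshold:
            h = np.zeros_like(h)        # sever the temporal connection
            boundaries.append(t)
        h = np.tanh(W @ x + U @ h)      # ordinary recurrent update
        prev = x
    return h, boundaries

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3)) * 0.1
U = rng.standard_normal((4, 4)) * 0.1
shot_a = [np.zeros(3)] * 3              # a static shot...
shot_b = [np.ones(3) * 5] * 3           # ...then an abrupt content change
_, cuts = boundary_aware_encode(shot_a + shot_b, W, U)
assert cuts == [3]                      # the shot change is detected
```

In the paper the reset decision is itself learned end-to-end inside the LSTM cell rather than thresholded as here.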
Spectral Convolution Networks
Title | Spectral Convolution Networks |
Authors | Maria Francesca, Arthur Hughes, David Gregg |
Abstract | Previous research has shown that computation of convolution in the frequency domain provides a significant speedup versus traditional convolution network implementations. However, this performance increase comes at the expense of repeatedly computing the transform and its inverse in order to apply other network operations such as activation, pooling, and dropout. We show, mathematically, how convolution and activation can both be implemented in the frequency domain using either the Fourier or Laplace transformation. The main contributions are a description of spectral activation under the Fourier transform and a further description of an efficient algorithm for computing both convolution and activation under the Laplace transform. By computing both the convolution and activation functions in the frequency domain, we can reduce the number of transforms required, as well as reducing overall complexity. Our description of a spectral activation function, together with existing spectral analogs of other network functions may then be used to compose a fully spectral implementation of a convolution network. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05378v1 |
http://arxiv.org/pdf/1611.05378v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-convolution-networks |
Repo | |
Framework | |
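The speedup premise of the abstract rests on the convolution theorem: convolution in the spatial domain equals element-wise multiplication in the frequency domain. A minimal numerical check of that equivalence for circular convolution (function names here are illustrative):

```python
import numpy as np

def direct_circular_conv(x, k):
    """Circular convolution computed directly in the spatial domain: O(n^2)."""
    n = len(x)
    return np.array([sum(x[(i - j) % n] * k[j] for j in range(n))
                     for i in range(n)])

def spectral_conv(x, k):
    """The same circular convolution via the Fourier transform: O(n log n)."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
k = rng.standard_normal(8)
assert np.allclose(direct_circular_conv(x, k), spectral_conv(x, k))
```

The paper's contribution is to keep activation (not just convolution) in the frequency domain, so the transform and its inverse need not be recomputed between the two operations.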
Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks
Title | Still not there? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks |
Authors | Carsten Schnober, Steffen Eger, Erik-Lân Do Dinh, Iryna Gurevych |
Abstract | We analyze the performance of encoder-decoder neural models and compare them with well-known established methods. The latter represent different classes of traditional approaches that are applied to the monotone sequence-to-sequence tasks OCR post-correction, spelling correction, grapheme-to-phoneme conversion, and lemmatization. Such tasks are of practical relevance for various higher-level research fields including digital humanities, automatic text correction, and speech recognition. We investigate how well generic deep-learning approaches adapt to these tasks, and how they perform in comparison with established and more specialized methods, including our own adaptation of pruned CRFs. |
Tasks | Lemmatization, Optical Character Recognition, Speech Recognition, Spelling Correction |
Published | 2016-10-25 |
URL | http://arxiv.org/abs/1610.07796v2 |
http://arxiv.org/pdf/1610.07796v2.pdf | |
PWC | https://paperswithcode.com/paper/still-not-there-comparing-traditional |
Repo | |
Framework | |
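For readers unfamiliar with the traditional side of this comparison, the simplest conceivable baseline for a monotone string task such as spelling correction is nearest-neighbour lookup in a lexicon under Levenshtein edit distance. This is background illustration only; the paper's traditional baselines (e.g. pruned CRFs) are considerably stronger.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def correct(word, lexicon):
    """Return the lexicon entry closest to the observed word."""
    return min(lexicon, key=lambda w: edit_distance(word, w))

assert correct("lemmatiztion", ["lemmatization", "translation"]) == "lemmatization"
```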
A Feature based Approach for Video Compression
Title | A Feature based Approach for Video Compression |
Authors | Rajer Sindhu |
Abstract | Panoramic image stitching via image-matching algorithms is computationally expensive and impractical for real-time use. In this paper, we take full advantage of the Harris corner detector's invariance to light intensity, translation, and rotation to build a real-time panoramic image stitching algorithm. Based on the basic characteristics of the classical algorithm and the performance of the FPGA, modules such as feature-point extraction and match description are optimized with feature-based logic, yielding a real-time system with high matching precision. The new algorithm processes images in the pixel domain, acquired from a CCD camera on a Xilinx Spartan-6 hardware platform. After stitching, the result is delivered through a portable interface as high-definition content for display. The results show that the proposed algorithm achieves higher precision with good real-time performance and robustness. |
Tasks | Image Stitching, Video Compression |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08470v1 |
http://arxiv.org/pdf/1605.08470v1.pdf | |
PWC | https://paperswithcode.com/paper/a-feature-based-approach-for-video |
Repo | |
Framework | |
Dynamic Allocation of Crowd Contributions for Sentiment Analysis during the 2016 U.S. Presidential Election
Title | Dynamic Allocation of Crowd Contributions for Sentiment Analysis during the 2016 U.S. Presidential Election |
Authors | Mehrnoosh Sameki, Mattia Gentil, Kate K. Mays, Lei Guo, Margrit Betke |
Abstract | Opinions about the 2016 U.S. Presidential Candidates have been expressed in millions of tweets that are challenging to analyze automatically. Crowdsourcing the analysis of political tweets effectively is also difficult, due to large inter-rater disagreements when sarcasm is involved. Each tweet is typically analyzed by a fixed number of workers, with the final label decided by majority voting. We here propose a crowdsourcing framework that instead allocates the number of workers dynamically. We explore two dynamic-allocation methods: (1) the number of workers queried to label a tweet is computed offline, based on the predicted difficulty of discerning the sentiment of a particular tweet; (2) the number of crowd workers is determined online, during an iterative crowdsourcing process, based on inter-rater agreement between labels. We applied our approach to 1,000 Twitter messages about the four U.S. presidential candidates Clinton, Cruz, Sanders, and Trump, collected during February 2016. We implemented the two proposed methods using decision trees that allocate more crowd effort to tweets predicted to be sarcastic. We show that our framework outperforms the traditional static-allocation scheme: it collects opinion labels from the crowd at a much lower cost while maintaining labeling accuracy. |
Tasks | Sentiment Analysis |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08953v2 |
http://arxiv.org/pdf/1608.08953v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-allocation-of-crowd-contributions-for |
Repo | |
Framework | |
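The online allocation method described above can be sketched as a stopping rule: keep querying workers for a tweet until the majority label's share of the votes reaches an agreement threshold, or a worker budget is exhausted. The thresholds and the label stream below are illustrative assumptions, not the paper's exact parameters.

```python
from collections import Counter

def dynamic_label(get_label, min_workers=3, max_workers=7, agreement=0.7):
    """Query workers one at a time; stop early once the majority label's
    share of collected votes reaches the agreement threshold.
    Returns (label, number_of_workers_used)."""
    votes = []
    while len(votes) < max_workers:
        votes.append(get_label())
        if len(votes) >= min_workers:
            label, count = Counter(votes).most_common(1)[0]
            if count / len(votes) >= agreement:
                return label, len(votes)
    return Counter(votes).most_common(1)[0][0], len(votes)

# An easy (non-sarcastic) tweet: unanimous workers stop after 3 votes.
labels = iter(["positive"] * 10)
label, n = dynamic_label(lambda: next(labels))
assert label == "positive" and n == 3
```

Harder (e.g. sarcastic) tweets trigger more queries, which is how the scheme spends the crowd budget where it matters.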
Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding
Title | Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding |
Authors | Edoardo Maria Ponti, Elisabetta Jezek, Bernardo Magnini |
Abstract | Lexical sets contain the words filling the argument positions of a verb in one of its senses. They can be grounded empirically through their automatic extraction from corpora. The purpose of this paper is to demonstrate that their vector representation based on word embeddings provides insights into many linguistic phenomena, and in particular into verbs undergoing the causative-inchoative alternation. A first experiment investigates the internal structure of the sets, which are known to be cognitively radial and continuous categories. A second experiment shows that the distance between the subject set and the object set is correlated with a semantic factor, namely the spontaneity of the verb. |
Tasks | |
Published | 2016-10-03 |
URL | http://arxiv.org/abs/1610.00765v1 |
http://arxiv.org/pdf/1610.00765v1.pdf | |
PWC | https://paperswithcode.com/paper/grounding-the-lexical-sets-of-causative |
Repo | |
Framework | |
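The measure behind the second experiment can be sketched as follows: represent a verb's subject set and object set by the centroids of their word vectors, and take the cosine distance between centroids. The tiny 3-d vectors below are invented for illustration; the paper uses embeddings trained on corpora.

```python
import numpy as np

def centroid_cosine_distance(subj_vecs, obj_vecs):
    """Cosine distance between the centroids of two sets of word vectors."""
    s = np.mean(subj_vecs, axis=0)
    o = np.mean(obj_vecs, axis=0)
    return 1.0 - s @ o / (np.linalg.norm(s) * np.linalg.norm(o))

# Hypothetical vectors: for a verb like "break", animate subjects and
# inanimate objects should sit far apart in embedding space.
subjects = [np.array([1.0, 0.1, 0.0]), np.array([0.9, 0.2, 0.1])]
objects = [np.array([0.0, 0.1, 1.0]), np.array([0.1, 0.0, 0.9])]
assert centroid_cosine_distance(subjects, objects) > \
       centroid_cosine_distance(subjects, subjects)
```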
Training Constrained Deconvolutional Networks for Road Scene Semantic Segmentation
Title | Training Constrained Deconvolutional Networks for Road Scene Semantic Segmentation |
Authors | German Ros, Simon Stent, Pablo F. Alcantarilla, Tomoki Watanabe |
Abstract | In this work we investigate the problem of road scene semantic segmentation using Deconvolutional Networks (DNs). Several constraints limit the practical performance of DNs in this context: firstly, the paucity of existing pixel-wise labelled training data, and secondly, the memory constraints of embedded hardware, which rule out the practical use of state-of-the-art DN architectures such as fully convolutional networks (FCN). To address the first constraint, we introduce a Multi-Domain Road Scene Semantic Segmentation (MDRS3) dataset, aggregating data from six existing densely and sparsely labelled datasets for training our models, and two existing, separate datasets for testing their generalisation performance. We show that, while MDRS3 offers a greater volume and variety of data, end-to-end training of a memory efficient DN does not yield satisfactory performance. We propose a new training strategy to overcome this, based on (i) the creation of a best-possible source network (S-Net) from the aggregated data, ignoring time and memory constraints; and (ii) the transfer of knowledge from S-Net to the memory-efficient target network (T-Net). We evaluate different techniques for S-Net creation and T-Net transferral, and demonstrate that training a constrained deconvolutional network in this manner can unlock better performance than existing training approaches. Specifically, we show that a target network can be trained to achieve improved accuracy versus an FCN despite using less than 1% of the memory. We believe that our approach can be useful beyond automotive scenarios where labelled data is similarly scarce or fragmented and where practical constraints exist on the desired model size. We make available our network models and aggregated multi-domain dataset for reproducibility. |
Tasks | Semantic Segmentation |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01545v1 |
http://arxiv.org/pdf/1604.01545v1.pdf | |
PWC | https://paperswithcode.com/paper/training-constrained-deconvolutional-networks |
Repo | |
Framework | |
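One common instance of the S-Net-to-T-Net transfer described above is distillation: training the compact target on the large source network's softened per-class scores. The sketch below shows only the loss computation under that assumption; the paper evaluates several transfer techniques, and this is not claimed to be its exact procedure.

```python
import numpy as np

def softmax(logits, T=1.0, axis=-1):
    """Temperature-scaled softmax, numerically stabilized."""
    z = logits / T
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between softened teacher and student distributions,
    averaged over pixels (rows). Lower when the student mimics the teacher."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum(axis=-1).mean())

rng = np.random.default_rng(1)
teacher = rng.standard_normal((4, 3))   # 4 pixels, 3 classes of S-Net logits
# A student matching the teacher incurs a lower loss than a random one.
assert distillation_loss(teacher, teacher) < \
       distillation_loss(rng.standard_normal((4, 3)), teacher)
```

The temperature `T` softens the teacher's distribution so the student also learns the relative scores of non-argmax classes.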
Interacting Particle Markov Chain Monte Carlo
Title | Interacting Particle Markov Chain Monte Carlo |
Authors | Tom Rainforth, Christian A. Naesseth, Fredrik Lindsten, Brooks Paige, Jan-Willem van de Meent, Arnaud Doucet, Frank Wood |
Abstract | We introduce interacting particle Markov chain Monte Carlo (iPMCMC), a PMCMC method based on an interacting pool of standard and conditional sequential Monte Carlo samplers. Like related methods, iPMCMC is a Markov chain Monte Carlo sampler on an extended space. We present empirical results that show significant improvements in mixing rates relative to both non-interacting PMCMC samplers, and a single PMCMC sampler with an equivalent memory and computational budget. An additional advantage of the iPMCMC method is that it is suitable for distributed and multi-core architectures. |
Tasks | |
Published | 2016-02-16 |
URL | http://arxiv.org/abs/1602.05128v3 |
http://arxiv.org/pdf/1602.05128v3.pdf | |
PWC | https://paperswithcode.com/paper/interacting-particle-markov-chain-monte-carlo |
Repo | |
Framework | |
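PMCMC methods build on sequential Monte Carlo (SMC), so as background for iPMCMC, here is a minimal bootstrap particle filter for a toy linear-Gaussian model x_t = 0.9 x_{t-1} + v_t, y_t = x_t + w_t. The model and parameters are illustrative, not from the paper; a full iPMCMC sampler runs a pool of such unconditional SMC sweeps alongside conditional SMC sweeps and lets them exchange retained trajectories.

```python
import numpy as np

def smc(ys, n_particles=500, rho=0.9, sv=1.0, sw=0.5, rng=None):
    """Bootstrap particle filter; returns the log marginal likelihood
    estimate log p(y_{1:T}) for the toy linear-Gaussian model above."""
    rng = rng or np.random.default_rng(0)
    x = rng.normal(0.0, sv, n_particles)        # particles from the prior
    log_z = 0.0
    for y in ys:
        # Gaussian observation log-likelihood of each particle.
        logw = -0.5 * ((y - x) / sw) ** 2 - np.log(sw * np.sqrt(2 * np.pi))
        m = logw.max()
        w = np.exp(logw - m)
        log_z += m + np.log(w.mean())           # incremental evidence
        idx = rng.choice(n_particles, n_particles, p=w / w.sum())
        x = rng.normal(rho * x[idx], sv)        # resample, then propagate
    return log_z

log_z = smc([0.1, -0.2, 0.3])
assert np.isfinite(log_z)
```

Each such sweep yields an unbiased estimate of the marginal likelihood, which is the quantity the interacting pool in iPMCMC uses to swap trajectories between its standard and conditional samplers.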
An Empirical-Bayes Score for Discrete Bayesian Networks
Title | An Empirical-Bayes Score for Discrete Bayesian Networks |
Authors | Marco Scutari |
Abstract | Bayesian network structure learning is often performed in a Bayesian setting, by evaluating candidate structures using their posterior probabilities for a given data set. Score-based algorithms then use those posterior probabilities as an objective function and return the maximum a posteriori network as the learned model. For discrete Bayesian networks, the canonical choice for a posterior score is the Bayesian Dirichlet equivalent uniform (BDeu) marginal likelihood with a uniform (U) graph prior (Heckerman et al., 1995). Its favourable theoretical properties descend from assuming a uniform prior both on the space of the network structures and on the space of the parameters of the network. In this paper, we revisit the limitations of these assumptions; and we introduce an alternative set of assumptions and the resulting score: the Bayesian Dirichlet sparse (BDs) empirical Bayes marginal likelihood with a marginal uniform (MU) graph prior. We evaluate its performance in an extensive simulation study, showing that MU+BDs is more accurate than U+BDeu both in learning the structure of the network and in predicting new observations, while not being computationally more complex to estimate. |
Tasks | |
Published | 2016-05-12 |
URL | http://arxiv.org/abs/1605.03884v3 |
http://arxiv.org/pdf/1605.03884v3.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-bayes-score-for-discrete |
Repo | |
Framework | |
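For concreteness, the baseline U+BDeu score discussed above has a closed Dirichlet-multinomial form per node. The textbook sketch below computes that local log marginal likelihood (this is the standard BDeu baseline, not the proposed BDs score):

```python
from math import lgamma

def bdeu_local_score(counts, ess=1.0):
    """Log BDeu local marginal likelihood for one discrete node.
    counts[j][k] = N_ijk: observations of node state k under parent
    configuration j; ess is the equivalent (imaginary) sample size."""
    q = len(counts)          # number of parent configurations
    r = len(counts[0])       # number of node states
    a_jk = ess / (q * r)     # per-cell Dirichlet pseudo-count
    a_j = ess / q            # per-configuration pseudo-count
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score

# Data concentrated on one state scores higher than uniform noise.
assert bdeu_local_score([[10, 0]]) > bdeu_local_score([[5, 5]])
```

The paper's BDs score replaces the uniform pseudo-count allocation with an empirical-Bayes choice that assigns mass only to parent configurations observed in the data.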
System Combination for Short Utterance Speaker Recognition
Title | System Combination for Short Utterance Speaker Recognition |
Authors | Lantian Li, Dong Wang, Xiaodong Zhang, Thomas Fang Zheng, Panshi Jin |
Abstract | In text-independent short-utterance speaker recognition (SUSR), performance often degrades dramatically. This paper presents a combination approach to SUSR tasks with two phonetic-aware systems: one is the DNN-based i-vector system and the other is our recently proposed subregion-based GMM-UBM system. The former employs phone posteriors to construct an i-vector model in which the shared statistics offer stronger robustness against limited test data, while the latter establishes a phone-dependent GMM-UBM system which represents speaker characteristics in more detail. A score-level fusion is implemented to integrate the respective advantages of the two systems. Experimental results show that for the text-independent SUSR task, both the DNN-based i-vector system and the subregion-based GMM-UBM system outperform their respective baselines, and the score-level system combination delivers a further performance improvement. |
Tasks | Speaker Recognition |
Published | 2016-03-31 |
URL | http://arxiv.org/abs/1603.09460v2 |
http://arxiv.org/pdf/1603.09460v2.pdf | |
PWC | https://paperswithcode.com/paper/system-combination-for-short-utterance |
Repo | |
Framework | |
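The score-level fusion step mentioned above reduces, in its simplest form, to normalizing each system's trial scores and taking a weighted sum. The weight value below is illustrative; in practice it is tuned on development data.

```python
import numpy as np

def zscore(scores):
    """Z-normalize a vector of trial scores."""
    s = np.asarray(scores, dtype=float)
    return (s - s.mean()) / s.std()

def fuse(scores_a, scores_b, w=0.5):
    """Weighted sum of z-normalized scores from two recognition systems."""
    return w * zscore(scores_a) + (1 - w) * zscore(scores_b)

# Trials both systems agree on keep their ranking after fusion.
fused = fuse([2.0, 0.5, -1.0], [1.5, 0.2, -0.8])
assert fused.argmax() == 0
```

Normalizing first matters because the i-vector and GMM-UBM systems produce scores on different scales; without it one system would dominate the sum.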
Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection
Title | Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection |
Authors | Jiaolong Yang, Hongdong Li, Yuchao Dai, Robby T. Tan |
Abstract | This paper deals with a challenging, frequently encountered, yet not properly investigated problem in two-frame optical flow estimation. That is, the input frames are compounds of two imaging layers – one desired background layer of the scene, and one distracting, possibly moving layer due to transparency or reflection. In this situation, the conventional brightness constancy constraint – the cornerstone of most existing optical flow methods – will no longer be valid. In this paper, we propose a robust solution to this problem. The proposed method performs both optical flow estimation, and image layer separation. It exploits a generalized double-layer brightness consistency constraint connecting these two tasks, and utilizes the priors for both of them. Experiments on both synthetic data and real images have confirmed the efficacy of the proposed method. To the best of our knowledge, this is the first attempt towards handling generic optical flow fields of two-frame images containing transparency or reflection. |
Tasks | Optical Flow Estimation |
Published | 2016-05-06 |
URL | http://arxiv.org/abs/1605.01825v1 |
http://arxiv.org/pdf/1605.01825v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-optical-flow-estimation-of-double |
Repo | |
Framework | |
Identifying Structures in Social Conversations in NSCLC Patients through the Semi-Automatic extraction of Topical Taxonomies
Title | Identifying Structures in Social Conversations in NSCLC Patients through the Semi-Automatic extraction of Topical Taxonomies |
Authors | Giancarlo Crocetti, Amir A. Delay, Fatemeh Seyedmendhi |
Abstract | The exploration of social conversations for addressing patients' needs is an important analytical task to which many scholarly publications are contributing, filling the knowledge gap in this area. The main difficulty remains the inability to turn such contributions into pragmatic processes that the pharmaceutical industry can leverage in order to generate insight from social media data, which can be considered one of the most challenging sources of information available today due to its sheer volume and noise. This study builds on the work by Scott Spangler and Jeffrey Kreulen and applies it to identify structure in social media through the extraction of a topical taxonomy able to capture the latent knowledge in social conversations on health-related sites. A mechanism for automatically identifying and generating a taxonomy from social conversations is developed and pressure-tested using public data from media sites focused on the needs of cancer patients and their families. Moreover, a novel method for generating category labels and determining an optimal number of categories is presented, which extends Spangler and Kreulen's research in a meaningful way. We assume the reader is familiar with taxonomies, what they are, and how they are used. |
Tasks | |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.04709v1 |
http://arxiv.org/pdf/1602.04709v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-structures-in-social |
Repo | |
Framework | |
Very Fast Kernel SVM under Budget Constraints
Title | Very Fast Kernel SVM under Budget Constraints |
Authors | David Picard |
Abstract | In this paper we propose a fast online Kernel SVM algorithm under tight budget constraints. We propose to split the input space using LVQ and to train a Kernel SVM in each cluster. To allow for online training, we limit the size of the support vector set of each cluster using different strategies. Our experiments show that the algorithm achieves high accuracy while processing a very high number of samples per second, both in training and in evaluation. |
Tasks | |
Published | 2016-12-31 |
URL | http://arxiv.org/abs/1701.00167v1 |
http://arxiv.org/pdf/1701.00167v1.pdf | |
PWC | https://paperswithcode.com/paper/very-fast-kernel-svm-under-budget-constraints |
Repo | |
Framework | |
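The budget mechanism described above can be sketched with a kernel perceptron standing in for the SVM solver: train online, add a support vector only on mistakes, and discard the oldest one whenever the set exceeds the budget (one of several possible removal strategies; the paper compares alternatives). A full version would route each sample to its nearest LVQ prototype and keep one such budgeted model per cluster.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel between two vectors."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

class BudgetKernelPerceptron:
    def __init__(self, budget=10):
        self.budget, self.sv, self.alpha = budget, [], []

    def decision(self, x):
        """Kernel expansion over the current support set."""
        return sum(a * rbf(s, x) for s, a in zip(self.sv, self.alpha))

    def fit_one(self, x, y):               # y in {-1, +1}
        if y * self.decision(x) <= 0:      # mistake-driven update
            self.sv.append(x)
            self.alpha.append(float(y))
            if len(self.sv) > self.budget: # enforce the budget:
                self.sv.pop(0)             # drop the oldest support vector
                self.alpha.pop(0)

rng = np.random.default_rng(0)
model = BudgetKernelPerceptron(budget=20)
for _ in range(3):                         # a few online passes
    for _ in range(100):
        x = rng.standard_normal(2)
        y = 1.0 if x[0] > 0 else -1.0      # a linearly separable toy task
        model.fit_one(x, y)
assert len(model.sv) <= 20                 # the budget holds throughout
```

Because `decision` only touches the bounded support set, prediction cost stays constant no matter how many samples stream past, which is what makes the per-second throughput claims possible.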
A Benchmark Dataset and Saliency-guided Stacked Autoencoders for Video-based Salient Object Detection
Title | A Benchmark Dataset and Saliency-guided Stacked Autoencoders for Video-based Salient Object Detection |
Authors | Jia Li, Changqun Xia, Xiaowu Chen |
Abstract | Image-based salient object detection (SOD) has been extensively studied in the past decades. However, video-based SOD is much less explored, since large-scale video datasets in which salient objects are unambiguously defined and annotated are lacking. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos (64 minutes). In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in video can be defined as objects that consistently pop out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed without supervision to automatically infer a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. Experimental results show that the proposed unsupervised approach outperforms 30 state-of-the-art models on the proposed dataset, including 19 image-based & classic (unsupervised or non-deep-learning) models, 6 image-based & deep-learning models, and 5 video-based & unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD. |
Tasks | Eye Tracking, Object Detection, Salient Object Detection |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00135v2 |
http://arxiv.org/pdf/1611.00135v2.pdf | |
PWC | https://paperswithcode.com/paper/a-benchmark-dataset-and-saliency-guided |
Repo | |
Framework | |