July 27, 2019

2864 words 14 mins read

Paper Group ANR 526

An Iterative BP-CNN Architecture for Channel Decoding. Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation. The Stixel world: A medium-level representation of traffic scenes. Interpretation of Semantic Tweet Representations. Consistent Multitask Learning with Nonlinear Output Relations. A rando …

An Iterative BP-CNN Architecture for Channel Decoding

Title An Iterative BP-CNN Architecture for Channel Decoding
Authors Fei Liang, Cong Shen, Feng Wu
Abstract Inspired by recent advances in deep learning, we propose a novel iterative BP-CNN architecture for channel decoding under correlated noise. This architecture concatenates a trained convolutional neural network (CNN) with a standard belief-propagation (BP) decoder. The standard BP decoder is used to estimate the coded bits, followed by a CNN to remove the estimation errors of the BP decoder and obtain a more accurate estimation of the channel noise. Iterating between BP and CNN will gradually improve the decoding SNR and hence result in better decoding performance. To train a well-behaved CNN model, we define a new loss function which involves not only the accuracy of the noise estimation but also the normality test for the estimation errors, i.e., to measure how likely the estimation errors follow a Gaussian distribution. The introduction of the normality test to the CNN training shapes the residual noise distribution and further reduces the BER of the iterative decoding, compared to using the standard quadratic loss function. We carry out extensive experiments to analyze and verify the proposed framework. The iterative BP-CNN decoder has better BER performance with lower complexity, is suitable for parallel implementation, does not rely on any specific channel model or encoding method, and is robust against training mismatches. All of these features make it a good candidate for decoding modern channel codes.
Tasks
Published 2017-07-18
URL http://arxiv.org/abs/1707.05697v1
PDF http://arxiv.org/pdf/1707.05697v1.pdf
PWC https://paperswithcode.com/paper/an-iterative-bp-cnn-architecture-for-channel
Repo
Framework
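
To make the training objective above concrete, here is a minimal sketch of a loss that combines the quadratic noise-estimation error with a moment-based normality penalty on the residual (skewness and excess kurtosis). The weight `lam` and the particular normality statistic are assumptions for illustration; the paper's exact test statistic may differ.

```python
# Hedged sketch: L = ||n_hat - n||^2 + lam * normality(residual), where the normality
# term penalises skewness and excess kurtosis of the residual (one common way to
# encode a normality test; the paper's exact statistic may differ).
import numpy as np

def bp_cnn_loss(noise_est, noise_true, lam=0.1):
    """Quadratic estimation error plus a penalty on non-Gaussian residuals."""
    residual = noise_est - noise_true
    mse = np.mean(residual ** 2)

    # Moment-based normality measure (skewness and excess kurtosis of the residual).
    r = residual - residual.mean()
    std = r.std() + 1e-12
    skew = np.mean((r / std) ** 3)
    kurt = np.mean((r / std) ** 4) - 3.0
    normality_penalty = skew ** 2 + kurt ** 2

    return mse + lam * normality_penalty

# Toy usage with synthetic noise and an imperfect estimate.
rng = np.random.default_rng(0)
noise = rng.normal(size=1000)
estimate = noise + 0.1 * rng.normal(size=1000)
print(bp_cnn_loss(estimate, noise))
```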

Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation

Title Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation
Authors Diego Molla
Abstract Supervised approaches for text summarisation suffer from the problem of mismatch between the target labels/scores of individual sentences and the evaluation score of the final summary. Reinforcement learning can solve this problem by providing a learning mechanism that uses the score of the final summary as a guide to determine the decisions made at the time of selection of each sentence. In this paper we present a proof-of-concept approach that applies a policy-gradient algorithm to learn a stochastic policy using an undiscounted reward. The method has been applied to a policy consisting of a simple neural network and simple features. The resulting deep reinforcement learning system is able to learn a global policy and obtain encouraging results.
Tasks
Published 2017-11-10
URL http://arxiv.org/abs/1711.03859v2
PDF http://arxiv.org/pdf/1711.03859v2.pdf
PWC https://paperswithcode.com/paper/towards-the-use-of-deep-reinforcement
Repo
Framework
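
A minimal sketch of the policy-gradient idea described above: a logistic policy over include/exclude decisions per sentence, trained with REINFORCE using a single undiscounted reward for the whole summary. The feature matrix, reward function, and learning rate here are placeholders, not the paper's setup.

```python
# Minimal REINFORCE sketch for extractive summarisation, assuming: per-sentence
# feature vectors, a reward that scores the final summary (mocked here; in practice
# e.g. ROUGE against a reference), and a logistic include/exclude policy.
import numpy as np

rng = np.random.default_rng(0)
n_sentences, n_features = 20, 8
X = rng.normal(size=(n_sentences, n_features))      # per-sentence features (assumed)
w = np.zeros(n_features)                            # policy parameters

def summary_reward(selected):
    # Placeholder reward; the paper would use the evaluation score of the summary
    # built from the selected sentences.
    return float(np.sum(selected[:3])) - 0.1 * selected.sum()

lr = 0.05
for episode in range(500):
    probs = 1.0 / (1.0 + np.exp(-X @ w))            # P(include sentence i)
    actions = (rng.random(n_sentences) < probs).astype(float)
    R = summary_reward(actions)                     # undiscounted episode reward
    # REINFORCE gradient: R * d/dw log pi(actions)
    grad = X.T @ (actions - probs) * R
    w += lr * grad

print("learned weights:", np.round(w, 2))
```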

The Stixel world: A medium-level representation of traffic scenes

Title The Stixel world: A medium-level representation of traffic scenes
Authors Marius Cordts, Timo Rehfeld, Lukas Schneider, David Pfeiffer, Markus Enzweiler, Stefan Roth, Marc Pollefeys, Uwe Franke
Abstract Recent progress in advanced driver assistance systems and the race towards autonomous vehicles is mainly driven by two factors: (1) increasingly sophisticated algorithms that interpret the environment around the vehicle and react accordingly, and (2) the continuous improvements of sensor technology itself. In terms of cameras, these improvements typically include higher spatial resolution, which as a consequence requires more data to be processed. The trend of adding multiple cameras to cover the entire surroundings of the vehicle only adds to the amount of data that must be handled. At the same time, an increasing number of special-purpose algorithms need access to the sensor input data to correctly interpret the various complex situations that can occur, particularly in urban traffic. These trends make it clear that a key challenge for vision architectures in intelligent vehicles is to share computational resources. We believe this challenge should be faced by introducing a representation of the sensory data that provides compressed and structured access to all relevant visual content of the scene. The Stixel World discussed in this paper is such a representation. It is a medium-level model of the environment that is specifically designed to compress information about obstacles by leveraging the typical layout of outdoor traffic scenes. It has proven useful for a multitude of automotive vision applications, including object detection, tracking, segmentation, and mapping. In this paper, we summarize the ideas behind the model and generalize it to take into account multiple dense input streams: the image itself, stereo depth maps, and semantic class probability maps that can be generated, e.g., by CNNs. Our generalization is embedded into a novel mathematical formulation for the Stixel model. We further sketch how the free parameters of the model can be learned using structured SVMs.
Tasks Autonomous Vehicles, Object Detection
Published 2017-04-02
URL http://arxiv.org/abs/1704.00280v1
PDF http://arxiv.org/pdf/1704.00280v1.pdf
PWC https://paperswithcode.com/paper/the-stixel-world-a-medium-level
Repo
Framework
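
As a rough illustration of the representation described above, here is a hedged sketch of a Stixel as a data structure: a vertical column segment carrying a depth estimate and, in the generalised model, a semantic class. Field names and values are illustrative, not the paper's implementation.

```python
# Hedged sketch of the Stixel data structure: each image column is partitioned into a
# small number of vertical segments ("stixels"), each carrying a depth estimate and
# (in the generalised model) a semantic class. Field names are illustrative.
from dataclasses import dataclass
from typing import List

@dataclass
class Stixel:
    column: int        # image column (or column group) index
    v_top: int         # top row of the segment
    v_bottom: int      # bottom row of the segment
    disparity: float   # representative stereo disparity / depth of the segment
    label: str         # semantic class, e.g. "road", "car", "building", "sky"

# A column of the Stixel World: ordered segments covering the column from bottom to top.
column_42: List[Stixel] = [
    Stixel(column=42, v_top=300, v_bottom=479, disparity=40.0, label="road"),
    Stixel(column=42, v_top=180, v_bottom=299, disparity=25.0, label="car"),
    Stixel(column=42, v_top=0,   v_bottom=179, disparity=0.5,  label="sky"),
]
print(f"{len(column_42)} stixels summarise 480 image rows in this column")
```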

Interpretation of Semantic Tweet Representations

Title Interpretation of Semantic Tweet Representations
Authors J Ganesh, Manish Gupta, Vasudeva Varma
Abstract Research on the analysis of microblogging platforms is experiencing a renewed surge, with a large number of works applying representation learning models to applications such as sentiment analysis, semantic textual similarity computation, hashtag prediction, etc. Although the performance of representation learning models has been better than the traditional baselines for such tasks, little is known about the elementary properties of a tweet encoded within these representations, or why particular representations work better for certain tasks. The work presented here constitutes the first step in opening the black box of vector embeddings for tweets. Traditional feature engineering methods for high-level applications have exploited various elementary properties of tweets. We believe that a tweet representation is effective for an application because it meticulously encodes the application-specific elementary properties of tweets. To understand the elementary properties encoded in a tweet representation, we evaluate the representations on how accurately they can model each of those properties, such as tweet length, presence of particular words, hashtags, mentions, capitalization, etc. Our systematic, extensive study of nine supervised and four unsupervised tweet representations against the eight most popular textual and five social elementary properties reveals that Bi-directional LSTMs (BLSTMs) and Skip-Thought Vectors (STV) best encode the textual and social properties of tweets, respectively. FastText is the best model for low-resource settings, degrading very little as the embedding size is reduced. Finally, we draw interesting insights by correlating the model performance obtained on the elementary property prediction tasks with the high-level downstream applications.
Tasks Feature Engineering, Representation Learning, Semantic Textual Similarity, Sentiment Analysis
Published 2017-04-04
URL http://arxiv.org/abs/1704.00898v2
PDF http://arxiv.org/pdf/1704.00898v2.pdf
PWC https://paperswithcode.com/paper/interpretation-of-semantic-tweet
Repo
Framework
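
A minimal sketch of the probing methodology described above: train a simple classifier on frozen tweet embeddings to predict an elementary property and read its accuracy as a measure of how well the representation encodes that property. The embeddings and property labels below are synthetic stand-ins.

```python
# Hedged sketch of the probing setup: fit a simple classifier on frozen tweet
# embeddings to predict an elementary property (here a synthetic binary label) and
# use the probe's accuracy as a proxy for how well the representation encodes it.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_tweets, dim = 2000, 64
embeddings = rng.normal(size=(n_tweets, dim))      # stand-in for BLSTM/STV/FastText vectors
labels = (embeddings[:, 0] > 0).astype(int)        # synthetic elementary property

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels, test_size=0.3, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("property-prediction accuracy:", probe.score(X_te, y_te))
```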

Consistent Multitask Learning with Nonlinear Output Relations

Title Consistent Multitask Learning with Nonlinear Output Relations
Authors Carlo Ciliberto, Alessandro Rudi, Lorenzo Rosasco, Massimiliano Pontil
Abstract Key to multitask learning is exploiting relationships between different tasks to improve prediction performance. If the relations are linear, regularization approaches can be used successfully. However, in practice assuming the tasks to be linearly related might be restrictive, and allowing for nonlinear structures is a challenge. In this paper, we tackle this issue by casting the problem within the framework of structured prediction. Our main contribution is a novel algorithm for learning multiple tasks which are related by a system of nonlinear equations that their joint outputs need to satisfy. We show that the algorithm is consistent and can be efficiently implemented. Experimental results show the potential of the proposed method.
Tasks Structured Prediction
Published 2017-05-23
URL http://arxiv.org/abs/1705.08118v2
PDF http://arxiv.org/pdf/1705.08118v2.pdf
PWC https://paperswithcode.com/paper/consistent-multitask-learning-with-nonlinear
Repo
Framework
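
The following is an illustrative sketch, not the paper's estimator: if two task outputs must jointly satisfy a nonlinear equation, independent per-task predictions can be projected onto the constraint set. The paper instead builds the constraints into a provably consistent structured-prediction algorithm; the circle constraint here is an assumed toy example.

```python
# Illustrative sketch (not the paper's algorithm): post-process unconstrained per-task
# predictions so that the joint outputs satisfy a nonlinear relation, here
# g(y) = y1^2 + y2^2 - 1 = 0, by Euclidean projection onto the constraint set.
import numpy as np
from scipy.optimize import minimize

def project_onto_constraint(y_pred):
    """Find the closest point to y_pred satisfying y1^2 + y2^2 = 1."""
    constraint = {"type": "eq", "fun": lambda y: y[0] ** 2 + y[1] ** 2 - 1.0}
    res = minimize(lambda y: np.sum((y - y_pred) ** 2), x0=y_pred,
                   constraints=[constraint], method="SLSQP")
    return res.x

y_hat = np.array([1.3, 0.4])                 # independent per-task predictions
print(project_onto_constraint(y_hat))        # jointly consistent outputs on the circle
```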

A random matrix analysis and improvement of semi-supervised learning for large dimensional data

Title A random matrix analysis and improvement of semi-supervised learning for large dimensional data
Authors Xiaoyi Mai, Romain Couillet
Abstract This article provides an original understanding of the behavior of a class of graph-oriented semi-supervised learning algorithms in the limit of large and numerous data. It is demonstrated that the intuition at the root of these methods collapses in this limit and that, as a result, most of them become inconsistent. Corrective measures and a new data-driven parametrization scheme are proposed, along with a theoretical analysis of the asymptotic performance of the resulting approach. A surprisingly close agreement between the theoretical performance on Gaussian mixture models and the performance observed on real datasets is also illustrated throughout the article, thereby suggesting the importance of the proposed analysis for dealing with practical data. As a result, significant performance gains are observed on practical data classification using the proposed parametrization.
Tasks
Published 2017-11-09
URL http://arxiv.org/abs/1711.03404v1
PDF http://arxiv.org/pdf/1711.03404v1.pdf
PWC https://paperswithcode.com/paper/a-random-matrix-analysis-and-improvement-of
Repo
Framework
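
A hedged sketch of the class of graph-oriented semi-supervised algorithms the analysis targets: label propagation over an RBF similarity graph with a few labelled points. The kernel bandwidth and data are toy assumptions, and the paper's corrective measures and data-driven parametrisation are not reproduced here.

```python
# Hedged sketch of graph-based semi-supervised learning: propagate labels over a
# similarity graph built from the data. The paper's corrective parametrisation of
# this class of methods is not shown.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
X = np.vstack([rng.normal(0.0, 1, (n // 2, p)), rng.normal(0.5, 1, (n // 2, p))])
y = np.repeat([1.0, -1.0], n // 2)
labelled = rng.choice(n, size=20, replace=False)        # few labelled points

# RBF similarity graph (bandwidth chosen arbitrarily for the toy example).
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists / p)
D_inv = 1.0 / W.sum(axis=1)

# Iterative label propagation: scores are repeatedly averaged over graph neighbours.
scores = np.zeros(n)
scores[labelled] = y[labelled]
for _ in range(100):
    scores = D_inv * (W @ scores)
    scores[labelled] = y[labelled]                      # clamp the labelled points

print("label-propagation accuracy:", np.mean(np.sign(scores) == y))
```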

Object Recognition by Using Multi-level Feature Point Extraction

Title Object Recognition by Using Multi-level Feature Point Extraction
Authors Yang Cheng, Timeo Dubois
Abstract In this paper, we present a novel approach for real-time object recognition by employing multilevel feature analysis, and we demonstrate the practicality of adapting feature extraction into a Naive Bayesian classification framework that enables simple, efficient, and robust performance. We also show that the proposed method scales well as the number of level-classes grows. To effectively understand the patches surrounding a keypoint, the trained classifier uses hundreds of simple binary features and models class posterior probabilities. In addition, the classification process is computationally cheap under the assumed independence between arbitrary sets of features, even though this assumption can be invalid in some scenarios. We demonstrate that the efficient classifier nevertheless performs remarkably well on image datasets with large variations in illumination and image capture perspective. The experimental results show that consistent accuracy can be achieved on many challenging datasets while offering interactive speed for large-resolution images. The method demonstrates promising results that outperform state-of-the-art methods on pattern recognition.
Tasks Object Recognition
Published 2017-10-28
URL http://arxiv.org/abs/1710.10522v1
PDF http://arxiv.org/pdf/1710.10522v1.pdf
PWC https://paperswithcode.com/paper/object-recognition-by-using-multi-level
Repo
Framework
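
A minimal sketch of the classification core suggested by the abstract: patches described by many simple binary features, combined by a naive Bayes model into class posteriors under the independence assumption. The features and class profiles below are synthetic, and the feature design is not the paper's.

```python
# Hedged sketch: many binary features per patch, combined into class posteriors under
# a naive independence assumption. Training data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features, n_train = 5, 300, 1000

# Synthetic training data: each class has its own Bernoulli profile over binary tests.
class_profiles = rng.random((n_classes, n_features))
labels = rng.integers(0, n_classes, n_train)
features = (rng.random((n_train, n_features)) < class_profiles[labels]).astype(int)

# Estimate P(feature_j = 1 | class) with Laplace smoothing.
counts = np.array([features[labels == c].sum(axis=0) for c in range(n_classes)])
totals = np.array([(labels == c).sum() for c in range(n_classes)])[:, None]
p_feat = (counts + 1.0) / (totals + 2.0)

def classify(x):
    # Log-posterior under the independence assumption (uniform prior over classes).
    log_post = (np.log(p_feat) * x + np.log(1 - p_feat) * (1 - x)).sum(axis=1)
    return int(np.argmax(log_post))

test = (rng.random(n_features) < class_profiles[2]).astype(int)
print("predicted class:", classify(test))   # should usually recover class 2
```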

Latent tree models

Title Latent tree models
Authors Piotr Zwiernik
Abstract Latent tree models are graphical models defined on trees, in which only a subset of variables is observed. They were first discussed by Judea Pearl as tree-decomposable distributions to generalise star-decomposable distributions such as the latent class model. Latent tree models, or their submodels, are widely used in phylogenetic analysis, network tomography, computer vision, causal modeling, and data clustering. They also contain other well-known classes of models such as hidden Markov models, the Brownian motion tree model, the Ising model on a tree, and many popular models used in phylogenetics. This article offers a concise introduction to the theory of latent tree models. We emphasise the role of tree metrics in the structural description of this model class, in designing learning algorithms, and in understanding fundamental limits on what can be learned and when.
Tasks
Published 2017-08-02
URL http://arxiv.org/abs/1708.00847v1
PDF http://arxiv.org/pdf/1708.00847v1.pdf
PWC https://paperswithcode.com/paper/latent-tree-models
Repo
Framework
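
Tied to the abstract's emphasis on tree metrics, here is a small sketch of the classical four-point condition, a structural test that a set of pairwise distances is consistent with some tree. The example distance matrix is constructed by hand for a four-leaf tree with unit edge lengths.

```python
# Four-point condition: for any four points, the two largest of the three pairwise
# distance sums must be equal. This is a basic structural test for tree metrics.
import itertools
import numpy as np

def satisfies_four_point(D, tol=1e-9):
    n = D.shape[0]
    for i, j, k, l in itertools.combinations(range(n), 4):
        sums = sorted([D[i, j] + D[k, l], D[i, k] + D[j, l], D[i, l] + D[j, k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Distances between the 4 leaves of a tree with unit edge lengths: leaves a, b hang
# off one internal node, leaves c, d off another, joined by an internal edge.
D = np.array([[0, 2, 3, 3],
              [2, 0, 3, 3],
              [3, 3, 0, 2],
              [3, 3, 2, 0]], dtype=float)
print(satisfies_four_point(D))   # True: these distances come from a tree
```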

Smart, Sparse Contours to Represent and Edit Images

Title Smart, Sparse Contours to Represent and Edit Images
Authors Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, William T. Freeman
Abstract We study the problem of reconstructing an image from information stored at contour locations. We show that high-quality reconstructions with high fidelity to the source image can be obtained from sparse input, e.g., comprising less than 6% of image pixels. This is a significant improvement over existing contour-based reconstruction methods that require much denser input to capture subtle texture information and to ensure image quality. Our model, based on generative adversarial networks, synthesizes texture and details in regions where no input information is provided. The semantic knowledge encoded into our model and the sparsity of the input allow contours to be used as an intuitive interface for semantically aware image manipulation: local edits in the contour domain translate to long-range, coherent changes in pixel space. We can perform complex structural changes, such as changing a facial expression, through simple edits of the contours. Our experiments demonstrate that humans, as well as a face recognition system, mostly cannot distinguish between our reconstructions and the source images.
Tasks Face Recognition
Published 2017-12-21
URL http://arxiv.org/abs/1712.08232v2
PDF http://arxiv.org/pdf/1712.08232v2.pdf
PWC https://paperswithcode.com/paper/smart-sparse-contours-to-represent-and-edit
Repo
Framework
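
A hedged sketch of the input side only: keep pixel values at strong-gradient (contour) locations and drop the rest, yielding an input comprising a few percent of the pixels. The gradient-threshold detector is a stand-in; the paper's GAN-based reconstruction is not reproduced here.

```python
# Hedged sketch of preparing a sparse contour input: retain image values only at
# strong-gradient locations. The learned reconstruction from this input is not shown.
import numpy as np

rng = np.random.default_rng(0)
img = np.zeros((128, 128))
img[32:96, 32:96] = 1.0                          # toy image: a bright square
img += 0.05 * rng.normal(size=img.shape)

# Gradient magnitude as a simple contour detector (the paper learns where to sample).
gy, gx = np.gradient(img)
grad_mag = np.hypot(gx, gy)
mask = grad_mag > np.quantile(grad_mag, 0.97)    # keep the strongest ~3% of pixels

sparse_input = np.where(mask, img, np.nan)       # values only at contour locations
print(f"kept {mask.mean():.1%} of pixels as contour input")
```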

Enhancing Interpretability of Black-box Soft-margin SVM by Integrating Data-based Priors

Title Enhancing Interpretability of Black-box Soft-margin SVM by Integrating Data-based Priors
Authors Shaohan Chen, Chuanhou Gao, Ping Zhang
Abstract The lack of interpretability often makes black-box models difficult to apply in many practical domains. For this reason, the current work proposes, from the input side of the black-box model, to incorporate data-based prior information into the black-box soft-margin SVM model to enhance its interpretability. The concept and incorporation mechanism of data-based prior information are successively developed, based on which the interpretable or partly interpretable SVM optimization model is designed and then solved by rewriting the optimization problem as a nonlinear quadratic programming problem. An algorithm for mining data-based linear prior information from the data set is also proposed, which generates a linear expression with respect to two appropriate inputs identified from all inputs of the system. Finally, the proposed interpretability-enhancement strategy is applied to eight benchmark examples to demonstrate its effectiveness.
Tasks
Published 2017-10-09
URL http://arxiv.org/abs/1710.02924v2
PDF http://arxiv.org/pdf/1710.02924v2.pdf
PWC https://paperswithcode.com/paper/enhancing-interpretability-of-black-box-soft
Repo
Framework
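
One plausible, hedged reading of the prior-mining step described above: select the two inputs most correlated with the output and fit a linear expression in them. The paper's actual mining algorithm, and how the mined prior enters the SVM quadratic program, may well differ from this sketch.

```python
# Hedged sketch of mining a linear prior from data: pick the two inputs most
# correlated with the output and fit y ~ a * x_i + b * x_j + c. This is an
# illustrative reading, not necessarily the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 8
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=n)   # synthetic system output

# Select the two inputs with the strongest absolute correlation to the output.
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(p)])
top_two = np.argsort(corr)[-2:]

# Fit the linear prior expression on those two inputs.
A = np.column_stack([X[:, top_two], np.ones(n)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("selected inputs:", top_two, "linear prior coefficients:", np.round(coef, 2))
```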

AirCode: Unobtrusive Physical Tags for Digital Fabrication

Title AirCode: Unobtrusive Physical Tags for Digital Fabrication
Authors Dingzeyu Li, Avinash S. Nair, Shree K. Nayar, Changxi Zheng
Abstract We present AirCode, a technique that allows the user to tag physically fabricated objects with given information. An AirCode tag consists of a group of carefully designed air pockets placed beneath the object surface. These air pockets are easily produced during the fabrication process of the object, without any additional material or postprocessing. Meanwhile, the air pockets affect only the subsurface scattering of light, and are thus hard to notice with the naked eye. With a computational imaging method, however, the tags become detectable. We present a tool that automates the design of air pockets for the user to encode information. The AirCode system also allows the user to retrieve the information from captured images via a robust decoding algorithm. We demonstrate our tagging technique with applications in metadata embedding, robotic grasping, and conveying object affordances.
Tasks Robotic Grasping
Published 2017-07-18
URL http://arxiv.org/abs/1707.05754v2
PDF http://arxiv.org/pdf/1707.05754v2.pdf
PWC https://paperswithcode.com/paper/aircode-unobtrusive-physical-tags-for-digital
Repo
Framework
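
Purely illustrative sketch, not the paper's coding scheme or imaging pipeline: information can be laid out as a presence/absence grid of air pockets beneath the surface. The grid size and bit mapping are assumptions made for the example.

```python
# Illustrative only: map a bit string to a grid of present/absent air pockets.
# The paper's actual tag geometry, coding, and decoding are not reproduced here.
import numpy as np

def layout_air_pockets(bits, grid_size=4):
    """Map a bit string to a grid_size x grid_size presence/absence pattern."""
    assert len(bits) == grid_size * grid_size
    return np.array([int(b) for b in bits]).reshape(grid_size, grid_size)

tag = layout_air_pockets("1011001110010110")
print(tag)   # 1 = air pocket placed under the surface, 0 = solid material
```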

Deep Learning the Physics of Transport Phenomena

Title Deep Learning the Physics of Transport Phenomena
Authors Amir Barati Farimani, Joseph Gomes, Vijay S. Pande
Abstract We have developed a new data-driven paradigm for the rapid inference, modeling, and simulation of the physics of transport phenomena by deep learning. Using conditional generative adversarial networks (cGANs), we train models for the direct generation of solutions to steady-state heat conduction and incompressible fluid flow purely from observations, without knowledge of the underlying governing equations. Rather than using iterative numerical methods to approximate the solution of the constitutive equations, cGANs learn to directly generate the solutions to these phenomena, given arbitrary boundary conditions and domain, with high test accuracy (MAE < 1%) and state-of-the-art computational performance. The cGAN framework can be used to learn causal models directly from experimental observations where the underlying physical model is complex or unknown.
Tasks
Published 2017-09-07
URL http://arxiv.org/abs/1709.02432v1
PDF http://arxiv.org/pdf/1709.02432v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-the-physics-of-transport
Repo
Framework
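
For context on what the cGAN is trained to generate, here is a minimal sketch of a steady-state heat-conduction (Laplace) solution computed by Jacobi iteration for fixed boundary temperatures. The paper's models produce such fields directly from the boundary conditions instead of iterating; the grid size and boundary values below are arbitrary.

```python
# Hedged sketch of the target field: steady-state heat conduction on a square domain
# with fixed boundary temperatures, computed by Jacobi relaxation. The cGAN in the
# paper learns to emit such solutions directly, without this iteration.
import numpy as np

n = 64
T = np.zeros((n, n))
T[0, :], T[-1, :], T[:, 0], T[:, -1] = 100.0, 0.0, 25.0, 25.0   # boundary conditions

for _ in range(5000):  # Jacobi updates toward the steady state
    T[1:-1, 1:-1] = 0.25 * (T[:-2, 1:-1] + T[2:, 1:-1] + T[1:-1, :-2] + T[1:-1, 2:])

print("interior temperature range:", T[1:-1, 1:-1].min(), T[1:-1, 1:-1].max())
```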

Musical Instrument Recognition Using Their Distinctive Characteristics in Artificial Neural Networks

Title Musical Instrument Recognition Using Their Distinctive Characteristics in Artificial Neural Networks
Authors Babak Toghiani-Rizi, Marcus Windmark
Abstract In this study, an artificial neural network was trained to classify musical instruments using audio samples transformed to the frequency domain. Different features of the sound, in both the time and frequency domains, were analyzed and compared in terms of how much information could be derived from that limited data. The study concluded that, compared with the base experiment, which had an accuracy of 93.5%, using only the attack resulted in 80.2% and using only the initial 100 Hz in 64.2%.
Tasks
Published 2017-05-14
URL http://arxiv.org/abs/1705.04971v1
PDF http://arxiv.org/pdf/1705.04971v1.pdf
PWC https://paperswithcode.com/paper/musical-instrument-recognition-using-their
Repo
Framework
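
A hedged sketch of the kind of feature extraction compared in the study: a frequency-domain transform of an audio signal, plus restricted variants such as only the attack portion or only the lowest 100 Hz. The toy tone, sample rate, and attack window are assumptions; the classifier itself is not shown.

```python
# Hedged sketch of the compared feature variants: full spectrum, attack-only segment,
# and only the lowest 100 Hz of the spectrum. Signal and durations are toy choices.
import numpy as np

sr = 44100                                   # sample rate (assumed)
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)   # toy decaying 440 Hz tone

spectrum = np.abs(np.fft.rfft(signal))       # full frequency-domain representation
freqs = np.fft.rfftfreq(len(signal), d=1 / sr)

attack_only = signal[: int(0.05 * sr)]       # first 50 ms, roughly the note's attack
low_band = spectrum[freqs <= 100]            # only the initial 100 Hz of the spectrum

print(len(spectrum), len(low_band), len(attack_only))
```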

Audio Spectrogram Representations for Processing with Convolutional Neural Networks

Title Audio Spectrogram Representations for Processing with Convolutional Neural Networks
Authors L. Wyse
Abstract One of the decisions that arise when designing a neural network for any application is how the data should be represented in order to be presented to, and possibly generated by, a neural network. For audio, the choice is less obvious than it seems to be for visual images, and a variety of representations have been used for different applications including the raw digitized sample stream, hand-crafted features, machine discovered features, MFCCs and variants that include deltas, and a variety of spectral representations. This paper reviews some of these representations and issues that arise, focusing particularly on spectrograms for generating audio using neural networks for style transfer.
Tasks Style Transfer
Published 2017-06-29
URL http://arxiv.org/abs/1706.09559v1
PDF http://arxiv.org/pdf/1706.09559v1.pdf
PWC https://paperswithcode.com/paper/audio-spectrogram-representations-for
Repo
Framework
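
A minimal sketch of the spectrogram representation discussed above: a short-time Fourier transform yielding a 2-D time-frequency magnitude array suitable as CNN input. Frame length, hop size, and the toy chirp signal are arbitrary choices for the example.

```python
# Minimal spectrogram sketch: frame the signal, window each frame, take the FFT
# magnitude, and stack frames into a 2-D time-frequency array.
import numpy as np

def spectrogram(signal, frame_len=1024, hop=256):
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T   # (freq bins, time frames)

sr = 16000
t = np.arange(2 * sr) / sr
audio = np.sin(2 * np.pi * (200 + 100 * t) * t)               # toy chirp
S = spectrogram(audio)
print("spectrogram shape (freq, time):", S.shape)
```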

Synthesising Sign Language from semantics, approaching “from the target and back”

Title Synthesising Sign Language from semantics, approaching “from the target and back”
Authors Michael Filhol, Gilles Falquet
Abstract We present a Sign Language modelling approach that allows building grammars and creating linguistic input for Sign synthesis through avatars. We comment on the type of grammar it allows one to build, and observe a resemblance between the resulting expressions and traditional semantic representations. Comparing the ways in which the paradigms are designed, we name and contrast two essentially different strategies for building higher-level linguistic input: “source-and-forward” vs. “target-and-back”. We conclude in favour of the latter, acknowledging the power of being able to automatically generate output from semantically relevant input straight into articulations of the target language.
Tasks Language Modelling
Published 2017-07-25
URL http://arxiv.org/abs/1707.08041v1
PDF http://arxiv.org/pdf/1707.08041v1.pdf
PWC https://paperswithcode.com/paper/synthesising-sign-language-from-semantics
Repo
Framework