Paper Group AWR 97
PointCloud Saliency Maps. Deep UL2DL: Channel Knowledge Transfer from Uplink to Downlink. Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. Identification of LTV Dynamical Models with Smooth or Discontinuous Time Evolution by means of Convex Optimization. Transformation Autoregressive Networks. …
PointCloud Saliency Maps
Title | PointCloud Saliency Maps |
Authors | Tianhang Zheng, Changyou Chen, Junsong Yuan, Bo Li, Kui Ren |
Abstract | 3D point-cloud recognition with PointNet and its variants has seen remarkable progress. A missing ingredient, however, is the ability to automatically evaluate point-wise importance w.r.t. classification performance, which is usually reflected by a saliency map. A saliency map is an important tool as it allows one to perform further processing on point-cloud data. In this paper, we propose a novel way of characterizing critical points and segments to build point-cloud saliency maps. Our method assigns each point a score reflecting its contribution to the model-recognition loss. The saliency map explicitly explains which points are key for model recognition. Furthermore, aggregations of highly-scored points indicate important segments/subsets in a point cloud. We construct the saliency map via point dropping, which is a non-differentiable operation. To overcome this issue, we approximate point dropping with a differentiable procedure of shifting points towards the cloud centroid. Consequently, each saliency score can be efficiently measured by the corresponding gradient of the loss w.r.t. the point under spherical coordinates. Extensive evaluations on several state-of-the-art point-cloud recognition models, including PointNet, PointNet++ and DGCNN, demonstrate the veracity and generality of our proposed saliency map. Code for the experiments is released at \url{https://github.com/tianzheng4/PointCloud-Saliency-Maps}. |
Tasks | |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1812.01687v6 |
https://arxiv.org/pdf/1812.01687v6.pdf | |
PWC | https://paperswithcode.com/paper/learning-saliency-maps-for-adversarial-point |
Repo | https://github.com/tianzheng4/Learning-PointCloud-Saliency-Maps |
Framework | tf |
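A minimal PyTorch sketch of the scoring rule the abstract describes, under one plausible reading (the `model`, point layout, and cross-entropy loss are my assumptions; the authors' TensorFlow code is in the repo above):

```python
import torch
import torch.nn.functional as F

def saliency_scores(model, points, labels):
    """Score each point by the first-order loss change from shifting it
    to the cloud centroid, i.e., from approximately 'dropping' it."""
    pts = points.detach().clone().requires_grad_(True)  # (B, N, 3)
    loss = F.cross_entropy(model(pts), labels)
    grad, = torch.autograd.grad(loss, pts)              # dL/dx for every point
    with torch.no_grad():
        centroid = pts.mean(dim=1, keepdim=True)        # the cloud centroid
        return (grad * (centroid - pts)).sum(dim=-1)    # (B, N); high = salient
```

Dropping the top-scored points and re-checking accuracy is the natural way to validate such a map.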
Deep UL2DL: Channel Knowledge Transfer from Uplink to Downlink
Title | Deep UL2DL: Channel Knowledge Transfer from Uplink to Downlink |
Authors | Mohammad Sadegh Safari, Vahid Pourahmadi, Shabnam Sodagari |
Abstract | Knowledge of the channel state information (CSI) at the transmitter side is one of the primary sources of information that can be used for the efficient allocation of wireless resources. Obtaining downlink (DL) CSI in Frequency Division Duplexing (FDD) systems from uplink (UL) CSI is not as straightforward as in TDD systems. Therefore, users usually feed the DL-CSI back to the transmitter. To remove the need for feedback (and thus reduce signaling overhead), we propose to use two recent deep neural network structures, i.e., convolutional neural networks and generative adversarial networks (GANs), to infer the DL-CSI by observing the UL-CSI. The core idea of our data-driven scheme is to exploit the fact that both DL and UL channels share the same propagation environment. As such, we extract the environment information from the UL channel response into a latent domain and then transfer the derived environment information from the latent domain to predict the DL channel. To avoid an incorrect latent domain and the problem of overly simplistic assumptions, we did not use any specific parametric model; instead, we used data-driven approaches to discover the underlying structure of the data without any prior model assumptions. To overcome the challenge of capturing the UL-DL joint distribution, we used a mean square error-based variant of the GAN structure with improved convergence properties, called the boundary equilibrium GAN (BEGAN). For training and testing, we used simulated data of the Extended Vehicular-A (EVA) and Extended Typical Urban (ETU) models. Simulation results verified that our methods can accurately infer and predict the downlink CSI from the uplink CSI for different multipath environments in FDD communications. |
Tasks | Transfer Learning |
Published | 2018-12-16 |
URL | https://arxiv.org/abs/1812.07518v3 |
https://arxiv.org/pdf/1812.07518v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-ul2dl-channel-knowledge-transfer-from |
Repo | https://github.com/safarisadegh/UL2DL |
Framework | tf |
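For intuition only, a toy encoder-decoder in PyTorch (an illustrative stand-in, not the authors' TensorFlow CNN or BEGAN models): the uplink CSI grid goes in, a latent "environment" representation is formed, and the downlink CSI grid comes out.

```python
import torch
import torch.nn as nn

class UL2DLNet(nn.Module):
    """Illustrative UL-to-DL mapper: CSI is stored as real/imag channels
    over a time x frequency grid."""
    def __init__(self, hidden=32):
        super().__init__()
        self.encode = nn.Sequential(              # UL CSI -> latent environment
            nn.Conv2d(2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(              # latent -> predicted DL CSI
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),
        )

    def forward(self, ul_csi):                    # (B, 2, T, F)
        return self.decode(self.encode(ul_csi))

# training would minimize, e.g., MSE between net(ul_csi) and the true DL CSI
```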
Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition
Title | Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition |
Authors | Christoph Wick, Christian Reul, Frank Puppe |
Abstract | Optical Character Recognition (OCR) on contemporary and historical data is still in the focus of many researchers. Especially historical prints require book-specific trained OCR models to achieve applicable results (Springmann and Lüdeling, 2016; Reul et al., 2017a). To reduce the human effort for manually annotating ground truth (GT), various techniques such as voting and pretraining have been shown to be very efficient (Reul et al., 2018a; Reul et al., 2018b). Calamari is a new open-source OCR line recognition software that uses state-of-the-art Deep Neural Networks (DNNs) implemented in Tensorflow and gives native support for techniques such as pretraining and voting. The customizable network architectures, constructed of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) layers, are trained with the Connectionist Temporal Classification (CTC) algorithm of Graves et al. (2006). Optional usage of a GPU drastically reduces the computation times for both training and prediction. We use two different datasets to compare the performance of Calamari to OCRopy, OCRopus3, and Tesseract 4. Calamari reaches a Character Error Rate (CER) of 0.11% on the UW3 dataset written in modern English and 0.18% on the DTA19 dataset written in German Fraktur, considerably outperforming the results of the existing software. |
Tasks | Optical Character Recognition |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02004v3 |
http://arxiv.org/pdf/1807.02004v3.pdf | |
PWC | https://paperswithcode.com/paper/calamari-a-high-performance-tensorflow-based |
Repo | https://github.com/chreul/mptv |
Framework | tf |
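A generic CNN-LSTM-CTC line recognizer in PyTorch, to illustrate the architecture family named in the abstract (Calamari itself is TensorFlow-based, and its layer sizes are not reproduced here):

```python
import torch
import torch.nn as nn

class CTCLineRecognizer(nn.Module):
    """Illustrative text-line model: a conv stack over the line image, a
    bidirectional LSTM over the width axis, and per-column character
    logits for CTC decoding."""
    def __init__(self, n_chars, height=48):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(64 * (height // 4), 128,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(256, n_chars + 1)   # +1 for the CTC blank

    def forward(self, x):                         # x: (B, 1, H, W)
        f = self.conv(x)                          # (B, 64, H/4, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)      # (B, W/4, 64 * H/4)
        out, _ = self.lstm(f)
        return self.head(out).log_softmax(-1)     # pair with torch.nn.CTCLoss
```

Calamari combines several such models with confidence voting; this sketch stops at a single model.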
Identification of LTV Dynamical Models with Smooth or Discontinuous Time Evolution by means of Convex Optimization
Title | Identification of LTV Dynamical Models with Smooth or Discontinuous Time Evolution by means of Convex Optimization |
Authors | Fredrik Bagge Carlson, Anders Robertsson, Rolf Johansson |
Abstract | We establish a connection between trend filtering and system identification which results in a family of new identification methods for linear, time-varying (LTV) dynamical models based on convex optimization. We demonstrate how the design of the cost function promotes either a model whose dynamics change continuously over time, or one whose coefficients change discontinuously at a finite (sparse) set of time instances. We further discuss the introduction of priors on the model parameters for situations where excitation is insufficient for identification. The identification problems are cast as convex optimization problems and are applicable to, e.g., ARX models and state-space models with time-varying parameters. We illustrate usage of the methods in simulations of jump-linear systems, a nonlinear robot arm with non-smooth friction and stiff contacts, as well as in model-based, trajectory-centric reinforcement learning on a smooth nonlinear system. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09794v1 |
http://arxiv.org/pdf/1802.09794v1.pdf | |
PWC | https://paperswithcode.com/paper/identification-of-ltv-dynamical-models-with |
Repo | https://github.com/baggepinnen/LTVModels.jl |
Framework | none |
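A sketch of the trend-filtering formulation the abstract suggests, in Python/cvxpy (the authors ship the Julia package LTVModels.jl; the exact costs below are my assumption): a squared difference penalty yields smoothly drifting parameters, while a group-sparse penalty yields piecewise-constant parameters with a few jumps.

```python
import cvxpy as cp
import numpy as np

def fit_ltv_arx(Phi, y, lam, smooth=True):
    """Fit per-time parameters theta_t of a scalar ARX-style model
    y_t ~ Phi_t . theta_t, penalizing their evolution over time."""
    T, d = Phi.shape
    theta = cp.Variable((T, d))
    fit = cp.sum_squares(cp.sum(cp.multiply(Phi, theta), axis=1) - y)
    diffs = theta[1:] - theta[:-1]
    # squared penalty -> smooth drift; group-L1 penalty -> sparse jumps
    reg = cp.sum_squares(diffs) if smooth else cp.sum(cp.norm(diffs, 2, axis=1))
    cp.Problem(cp.Minimize(fit + lam * reg)).solve()
    return theta.value

# smooth=False with a suitable lam recovers jump-linear behavior
```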
Transformation Autoregressive Networks
Title | Transformation Autoregressive Networks |
Authors | Junier B. Oliva, Avinava Dubey, Manzil Zaheer, Barnabás Póczos, Ruslan Salakhutdinov, Eric P. Xing, Jeff Schneider |
Abstract | The fundamental task of general density estimation $p(x)$ has been of keen interest to machine learning. In this work, we attempt to systematically characterize methods for density estimation. Broadly speaking, most of the existing methods can be categorized as using either: \textit{a}) autoregressive models to estimate the conditional factors of the chain rule, $p(x_{i} \mid x_{i-1}, \ldots)$; or \textit{b}) non-linear transformations of variables of a simple base distribution. Based on the study of the characteristics of these categories, we propose multiple novel methods for each category. For example, we propose RNN-based transformations to model non-Markovian dependencies. Further, through a comprehensive study over both real-world and synthetic data, we show that jointly leveraging transformations of variables and autoregressive conditional models results in a considerable improvement in performance. We illustrate the use of our models in outlier detection and image modeling. Finally, we introduce a novel data-driven framework for learning a family of distributions. |
Tasks | Density Estimation, Outlier Detection |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.09819v5 |
http://arxiv.org/pdf/1801.09819v5.pdf | |
PWC | https://paperswithcode.com/paper/transformation-autoregressive-networks |
Repo | https://github.com/lupalab/tan |
Framework | tf |
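For reference, the two identities behind categories (a) and (b), written out in my notation:

```latex
\underbrace{p(x) = \prod_{i=1}^{d} p(x_i \mid x_{i-1}, \ldots, x_1)}_{\text{(a) autoregressive chain rule}}
\qquad
\underbrace{p(x) = p_z\bigl(f(x)\bigr)\left|\det \frac{\partial f(x)}{\partial x}\right|}_{\text{(b) change of variables}}
```

The paper's point is that composing (b) with flexible conditionals from (a) on the transformed variables outperforms either alone.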
Rethinking floating point for deep learning
Title | Rethinking floating point for deep learning |
Authors | Jeff Johnson |
Abstract | Reducing hardware overhead of neural networks for faster or lower power inference and training is an active area of research. Uniform quantization using integer multiply-add has been thoroughly investigated, which requires learning many quantization parameters, fine-tuning training or other prerequisites. Little effort is made to improve floating point relative to this baseline; it remains energy inefficient, and word size reduction yields drastic loss in needed dynamic range. We improve floating point to be more energy efficient than equivalent bit width integer hardware on a 28 nm ASIC process while retaining accuracy in 8 bits with a novel hybrid log multiply/linear add, Kulisch accumulation and tapered encodings from Gustafson’s posit format. With no network retraining, and drop-in replacement of all math and float32 parameters via round-to-nearest-even only, this open-sourced 8-bit log float is within 0.9% top-1 and 0.2% top-5 accuracy of the original float32 ResNet-50 CNN model on ImageNet. Unlike int8 quantization, it is still a general-purpose floating point arithmetic, interpretable out-of-the-box. Our 8/38-bit log float multiply-add is synthesized and power profiled at 28 nm at 0.96x the power and 1.12x the area of 8/32-bit integer multiply-add. In 16 bits, our log float multiply-add is 0.59x the power and 0.68x the area of IEEE 754 float16 fused multiply-add, maintaining the same significand precision and dynamic range, proving useful for training ASICs as well. |
Tasks | Quantization |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.01721v1 |
http://arxiv.org/pdf/1811.01721v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-floating-point-for-deep-learning |
Repo | https://github.com/facebookresearch/deepfloat |
Framework | pytorch |
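A toy Python illustration of the two halves of the hybrid scheme (real posit encodings taper precision into a few bits; this sketch uses full-precision floats and nonzero inputs): multiplication becomes an add in the log domain, and products are accumulated in the linear domain, Kulisch-style.

```python
import math

def to_log(x):
    # toy encoding of a nonzero value as (sign, log2 |x|); real posit
    # encodings taper precision and use only a handful of bits
    return (math.copysign(1.0, x), math.log2(abs(x)))

def log_mul(a, b):
    # the "log multiply" half: a multiply becomes an exponent add
    return (a[0] * b[0], a[1] + b[1])

def linear_sum(products):
    # the "linear add" half: convert back and accumulate in a wide
    # (Kulisch-style) accumulator; Python floats stand in here
    return sum(s * 2.0 ** e for s, e in products)

# dot product [1.5, -2.0] . [4.0, 0.5] = 6.0 - 1.0 = 5.0
prods = [log_mul(to_log(1.5), to_log(4.0)),
         log_mul(to_log(-2.0), to_log(0.5))]
print(linear_sum(prods))  # ~5.0
```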
Semi-Implicit Variational Inference
Title | Semi-Implicit Variational Inference |
Authors | Mingzhang Yin, Mingyuan Zhou |
Abstract | Semi-implicit variational inference (SIVI) is introduced to expand the commonly used analytic variational distribution family, by mixing the variational parameter with a flexible distribution. This mixing distribution can assume any density function, explicit or not, as long as independent random samples can be generated via reparameterization. Not only does SIVI expand the variational family to incorporate highly flexible variational distributions, including implicit ones that have no analytic density functions, but it also sandwiches the evidence lower bound (ELBO) between a lower bound and an upper bound, and further derives an asymptotically exact surrogate ELBO that is amenable to optimization via stochastic gradient ascent. With a substantially expanded variational family and a novel optimization algorithm, SIVI is shown to closely match the accuracy of MCMC in inferring the posterior in a variety of Bayesian inference tasks. |
Tasks | Bayesian Inference |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11183v1 |
http://arxiv.org/pdf/1805.11183v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-implicit-variational-inference |
Repo | https://github.com/mingzhang-yin/SIVI |
Framework | none |
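The semi-implicit construction itself fits in a few lines; a toy PyTorch sampler (the mixing network, its dimensions, and the fixed conditional scale are my assumptions):

```python
import torch
import torch.nn as nn

mixing_net = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 2))
sigma = 0.1  # fixed scale of the explicit conditional layer (assumed)

def sivi_sample(n):
    eps = torch.randn(n, 8)          # mixing noise; any samplable source works
    mu = mixing_net(eps)             # implicit mixing distribution over mu
    return mu + sigma * torch.randn_like(mu)  # reparameterized conditional draw
```

Because every step is reparameterized, the surrogate ELBO the paper derives can be optimized through this sampler by stochastic gradient ascent.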
Neural networks for stock price prediction
Title | Neural networks for stock price prediction |
Authors | Yue-Gang Song, Yu-Long Zhou, Ren-Jie Han |
Abstract | Due to the extremely volatile nature of financial markets, it is commonly accepted that stock price prediction is a challenging task. However, in order to make profits or understand the essence of the equity market, numerous market participants and researchers try to forecast stock prices using various statistical, econometric, or even neural network models. In this work, we survey and compare the predictive power of five neural network models, namely, the back propagation (BP) neural network, radial basis function (RBF) neural network, general regression neural network (GRNN), support vector machine regression (SVMR), and least squares support vector machine regression (LS-SVMR). We apply the five models to predict the prices of three individual stocks, namely, Bank of China, Vanke A, and Kweichou Moutai. Adopting mean square error and average absolute percentage error as criteria, we find that the BP neural network consistently and robustly outperforms the other four models. |
Tasks | Stock Price Prediction |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11317v1 |
http://arxiv.org/pdf/1805.11317v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-for-stock-price-prediction |
Repo | https://github.com/aflorial/DeepDayTrade |
Framework | none |
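The two criteria named in the abstract, as commonly defined (the paper's exact normalization may differ):

```python
import numpy as np

def mse(y, yhat):
    # mean square error
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean((y - yhat) ** 2))

def mape(y, yhat):
    # average absolute percentage error; assumes no zero prices
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs((y - yhat) / y)) * 100.0)
```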
Optimizing for Generalization in Machine Learning with Cross-Validation Gradients
Title | Optimizing for Generalization in Machine Learning with Cross-Validation Gradients |
Authors | Shane Barratt, Rishi Sharma |
Abstract | Cross-validation is the workhorse of modern applied statistics and machine learning, as it provides a principled framework for selecting the model that maximizes generalization performance. In this paper, we show that the cross-validation risk is differentiable with respect to the hyperparameters and training data for many common machine learning algorithms, including logistic regression, elastic-net regression, and support vector machines. Leveraging this property of differentiability, we propose a cross-validation gradient method (CVGM) for hyperparameter optimization. Our method enables efficient optimization, in high-dimensional hyperparameter spaces, of the cross-validation risk, the best surrogate of the true generalization ability of our learning algorithm. |
Tasks | Hyperparameter Optimization |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07072v1 |
http://arxiv.org/pdf/1805.07072v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-for-generalization-in-machine-1 |
Repo | https://github.com/sbarratt/crossval |
Framework | none |
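A compact instance of the idea using PyTorch autograd with ridge regression (the paper derives gradients for several learners; treating ridge's closed-form fit as the differentiable inner step is my simplification):

```python
import torch

def cv_risk(lam_log, X, y, k=5):
    """k-fold cross-validation risk of ridge regression, differentiable
    in the log regularization strength lam_log."""
    lam = torch.exp(lam_log)
    folds = torch.arange(X.shape[0]) % k
    risk = 0.0
    for i in range(k):
        tr, va = folds != i, folds == i
        A = X[tr].T @ X[tr] + lam * torch.eye(X.shape[1])
        w = torch.linalg.solve(A, X[tr].T @ y[tr])   # closed-form ridge fit
        risk = risk + ((X[va] @ w - y[va]) ** 2).mean()
    return risk / k

# gradient descent directly on the hyperparameter
X, y = torch.randn(100, 5), torch.randn(100)
lam_log = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([lam_log], lr=0.1)
for _ in range(50):
    opt.zero_grad()
    cv_risk(lam_log, X, y).backward()
    opt.step()
```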
DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques
Title | DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques |
Authors | Oscar Araque, Lorenzo Gatti, Jacopo Staiano, Marco Guerini |
Abstract | Several lexica for sentiment analysis have been developed and made available in the NLP community. While most of these come with word polarity annotations (e.g. positive/negative), attempts at building lexica for finer-grained emotion analysis (e.g. happiness, sadness) have recently attracted significant attention. Such lexica are often exploited as a building block in the process of developing learning models for which emotion recognition is needed, and/or used as baselines against which to compare the performance of such models. In this work, we contribute two new resources to the community: a) an extension of an existing and widely used emotion lexicon for English; and b) a novel version of the lexicon targeting Italian. Furthermore, we show how simple techniques can be used, both in supervised and unsupervised experimental settings, to boost performance on datasets and tasks of varying degrees of domain specificity. |
Tasks | Emotion Recognition, Sentiment Analysis |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03660v1 |
http://arxiv.org/pdf/1810.03660v1.pdf | |
PWC | https://paperswithcode.com/paper/depechemood-a-bilingual-emotion-lexicon-built |
Repo | https://github.com/marcoguerini/DepecheMood |
Framework | none |
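Lexicon-based emotion scoring is simple enough to show end to end; the word scores below are made up and merely stand in for real DepecheMood++ entries:

```python
# toy lexicon: word -> emotion scores (the real DepecheMood++ lexicon
# maps tens of thousands of words to several emotion dimensions)
LEXICON = {"happy": {"joy": 0.9, "sadness": 0.1},
           "rain":  {"joy": 0.2, "sadness": 0.7}}

def emotion_profile(text):
    """Average the emotion vectors of the lexicon words found in text."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    if not hits:
        return {}
    keys = hits[0].keys()
    return {k: sum(h[k] for h in hits) / len(hits) for k in keys}

print(emotion_profile("Happy rain"))  # {'joy': 0.55, 'sadness': 0.4}
```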
Supervised Fitting of Geometric Primitives to 3D Point Clouds
Title | Supervised Fitting of Geometric Primitives to 3D Point Clouds |
Authors | Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas |
Abstract | Fitting geometric primitives to 3D point cloud data bridges a gap between low-level digitized 3D data and high-level structural information on the underlying 3D shapes. As such, it enables many downstream applications in 3D data processing. For a long time, RANSAC-based methods have been the gold standard for such primitive fitting problems, but they require careful per-input parameter tuning and thus do not scale well for large datasets with diverse shapes. In this work, we introduce Supervised Primitive Fitting Network (SPFN), an end-to-end neural network that can robustly detect a varying number of primitives at different scales without any user control. The network is supervised using ground truth primitive surfaces and primitive membership for the input points. Instead of directly predicting the primitives, our architecture first predicts per-point properties and then uses a differentiable model estimation module to compute the primitive type and parameters. We evaluate our approach on a novel benchmark of ANSI 3D mechanical component models and demonstrate a significant improvement over both the state-of-the-art RANSAC-based methods and the direct neural prediction. |
Tasks | Shape Representation Of 3D Point Clouds |
Published | 2018-11-22 |
URL | https://arxiv.org/abs/1811.08988v4 |
https://arxiv.org/pdf/1811.08988v4.pdf | |
PWC | https://paperswithcode.com/paper/supervised-fitting-of-geometric-primitives-to |
Repo | https://github.com/csimstu2/SPFN |
Framework | tf |
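One concrete example of a differentiable model-estimation module, in PyTorch (my illustration of the "per-point properties, then primitive parameters" split, not SPFN's full pipeline): a weighted least-squares plane fit that stays differentiable w.r.t. both the points and the predicted membership weights.

```python
import torch

def fit_plane(points, weights):
    """Weighted plane fit: points (N, 3), weights (N,) in [0, 1].
    Returns (unit normal, offset) with gradients flowing through."""
    w = weights / weights.sum()
    centroid = (w[:, None] * points).sum(0)
    diffs = points - centroid
    cov = (w[:, None] * diffs).T @ diffs
    # the plane normal is the eigenvector with the smallest eigenvalue
    eigvals, eigvecs = torch.linalg.eigh(cov)
    normal = eigvecs[:, 0]
    return normal, normal @ centroid
```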
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision
Title | Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision |
Authors | Mohsen Joneidi, Alireza Zaeemzadeh, Nazanin Rahnavard, Mubarak Shah |
Abstract | The goal of data selection is to capture the most structural information from a set of data. This paper presents a fast and accurate data selection method, in which the selected samples are optimized to span the subspace of all data. We propose a new selection algorithm, referred to as iterative projection and matching (IPM), with linear complexity w.r.t. the number of data points, and without any parameters to be tuned. In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is neglected in subsequent iterations by projection onto the null space of the previously selected samples. The computational efficiency and the selection accuracy of our proposed algorithm outperform those of conventional methods. Furthermore, the superiority of the proposed algorithm is shown on active learning for video action recognition on UCF-101; learning using representatives on ImageNet; training a generative adversarial network (GAN) to generate multi-view images from a single-view input on the CMU Multi-PIE dataset; and video summarization on the UTE Egocentric dataset. |
Tasks | Active Learning, Data Summarization, Feature Selection, Temporal Action Localization, Video Summarization |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12326v1 |
http://arxiv.org/pdf/1811.12326v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-projection-and-matching-finding |
Repo | https://github.com/zaeemzadeh/IPM |
Framework | none |
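The select-then-project loop is compact enough to sketch in numpy (the matching rule and normalizations below are my guesses at the details):

```python
import numpy as np

def ipm_select(X, k):
    """Pick k rows of X: at each step, match the sample best aligned
    with the residual data's dominant direction, then project all rows
    onto the null space of the pick."""
    A = X.astype(float).copy()
    selected = []
    for _ in range(k):
        _, _, vt = np.linalg.svd(A, full_matrices=False)
        u = vt[0]                                     # dominant direction
        norms = np.maximum(np.linalg.norm(A, axis=1), 1e-12)
        i = int(np.argmax(np.abs(A @ u) / norms))     # matching step
        selected.append(i)
        v = A[i] / max(np.linalg.norm(A[i]), 1e-12)
        A = A - np.outer(A @ v, v)                    # projection step
    return selected

# idx = ipm_select(np.random.randn(100, 10), k=5)
```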
Classifier-agnostic saliency map extraction
Title | Classifier-agnostic saliency map extraction |
Authors | Konrad Zolna, Krzysztof J. Geras, Kyunghyun Cho |
Abstract | Extracting saliency maps, which indicate parts of the image important to classification, requires many tricks to achieve satisfactory performance when using classifier-dependent methods. Instead, we propose classifier-agnostic saliency map extraction, which finds all parts of the image that any classifier could use, not just one given in advance. We observe that the proposed approach extracts higher quality saliency maps and outperforms existing weakly-supervised localization techniques, setting a new state-of-the-art result on the ImageNet dataset. We made our code publicly available at https://github.com/kondiz/casme. |
Tasks | |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08249v2 |
http://arxiv.org/pdf/1805.08249v2.pdf | |
PWC | https://paperswithcode.com/paper/classifier-agnostic-saliency-map-extraction |
Repo | https://github.com/kondiz/casme |
Framework | pytorch |
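A very loose sketch of one mask-model update in PyTorch (the losses are assumed, not the authors' exact objective): a good saliency mask removes whatever evidence the current classifier uses while staying small, and alternating such updates with freshly trained classifiers is what makes the maps classifier-agnostic.

```python
import torch
import torch.nn.functional as F

def masker_step(masker, classifier, opt_m, x, y):
    """One assumed mask-model update: masking OUT the salient region
    should break the current classifier, with a small-mask penalty."""
    m = masker(x).clamp(0, 1)               # (B, 1, H, W) soft mask
    logits = classifier(x * (1 - m))        # image with salient parts removed
    # maximize the classifier's loss on the masked-out image (unbounded
    # in this toy form), while keeping the mask area small
    loss = -F.cross_entropy(logits, y) + 0.1 * m.mean()
    opt_m.zero_grad()
    loss.backward()
    opt_m.step()
    return loss.item()
```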
Cross-view image synthesis using geometry-guided conditional GANs
Title | Cross-view image synthesis using geometry-guided conditional GANs |
Authors | Krishna Regmi, Ali Borji |
Abstract | We address the problem of generating images across two drastically different views, namely ground (street) and aerial (overhead) views. Image synthesis by itself is a very challenging computer vision task, and it is even more so when generation is conditioned on an image in another view. Due to the difference in viewpoints, there is only a small overlapping field of view and little common content between the two views. Here, we try to preserve the pixel information between the views so that the generated image is a realistic representation of the cross-view input image. To this end, we propose to use a homography as a guide to map the images between the views based on the common field of view, preserving the details in the input image. We then use generative adversarial networks to inpaint the missing regions in the transformed image and add realism to it. Our exhaustive evaluation and model comparison demonstrate that utilizing geometry constraints adds fine details to the generated images and can be a better approach for cross-view image synthesis than purely pixel-based synthesis methods. |
Tasks | Cross-View Image-to-Image Translation, Image Generation |
Published | 2018-08-14 |
URL | https://arxiv.org/abs/1808.05469v2 |
https://arxiv.org/pdf/1808.05469v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-view-image-synthesis-using-geometry |
Repo | https://github.com/kregmi/cross-view-image-synthesis |
Framework | pytorch |
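The geometry-guided step reduces to a homography warp; a toy OpenCV version (the matrix below is made up, whereas the paper derives it from the known view geometry):

```python
import cv2
import numpy as np

# stand-in street-view image and an illustrative homography H
street = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
H = np.array([[1.0, 0.3,   0.0],
              [0.0, 1.0,   0.0],
              [0.0, 0.001, 1.0]])
warped = cv2.warpPerspective(street, H, (256, 256))
# pixels with no source remain empty; the paper's GAN stage inpaints
# those regions and adds realism
```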
Image Processing in Quantum Computers
Title | Image Processing in Quantum Computers |
Authors | Aditya Dendukuri, Khoa Luu |
Abstract | Quantum Image Processing (QIP) is an exciting new field showing a lot of promise as a powerful addition to the arsenal of image processing techniques. Representing an image pixel by pixel using classical information requires an enormous amount of computational resources. Hence, exploring methods to represent images in a different paradigm of information is important. In this work, we study the representation of images in quantum information. The main motivation for this pursuit is the ability to store N bits of classical information in only log_2(N) quantum bits (qubits). A promising first step was the exponentially efficient implementation of the Fourier transform on quantum computers compared to the Fast Fourier Transform on classical computers. In addition, images encoded in quantum information can exhibit unique quantum properties like superposition and entanglement. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.11042v3 |
http://arxiv.org/pdf/1812.11042v3.pdf | |
PWC | https://paperswithcode.com/paper/image-processing-in-quantum-computers |
Repo | https://github.com/Shedka/citiesatnight |
Framework | none |
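A worked example of the qubit-count claim via amplitude encoding (one common QIP representation; the paper surveys the representation question more broadly):

```python
import numpy as np

# amplitude encoding: 2^n amplitudes fit in n qubits, so N classical
# values need only log2(N) qubits; here 4 pixel intensities -> 2 qubits
pixels = np.array([0.2, 0.4, 0.8, 0.4])
state = pixels / np.linalg.norm(pixels)  # normalized quantum state vector
print(len(state), "amplitudes in", int(np.log2(len(state))), "qubits")
```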