Paper Group AWR 97
PointCloud Saliency Maps. Deep UL2DL: Channel Knowledge Transfer from Uplink to Downlink. Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. Identification of LTV Dynamical Models with Smooth or Discontinuous Time Evolution by means of Convex Optimization. Transformation Autoregressive Networks. …
PointCloud Saliency Maps
Title | PointCloud Saliency Maps |
Authors | Tianhang Zheng, Changyou Chen, Junsong Yuan, Bo Li, Kui Ren |
Abstract | 3D point-cloud recognition with PointNet and its variants has seen remarkable progress. A missing ingredient, however, is the ability to automatically evaluate point-wise importance w.r.t. classification performance, which is usually reflected by a saliency map. A saliency map is an important tool as it allows one to perform further processing on point-cloud data. In this paper, we propose a novel way of characterizing critical points and segments to build point-cloud saliency maps. Our method assigns each point a score reflecting its contribution to the model-recognition loss. The saliency map explicitly explains which points are key for model recognition. Furthermore, aggregations of highly-scored points indicate important segments/subsets in a point cloud. We construct the saliency map via point dropping, which is a non-differentiable operation. To overcome this issue, we approximate point dropping with a differentiable procedure of shifting points towards the cloud centroid. Consequently, each saliency score can be efficiently measured by the corresponding gradient of the loss w.r.t. the point under spherical coordinates. Extensive evaluations on several state-of-the-art point-cloud recognition models, including PointNet, PointNet++ and DGCNN, demonstrate the veracity and generality of our proposed saliency map. Code for the experiments is released at \url{https://github.com/tianzheng4/PointCloud-Saliency-Maps}. |
Tasks | |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1812.01687v6 |
https://arxiv.org/pdf/1812.01687v6.pdf | |
PWC | https://paperswithcode.com/paper/learning-saliency-maps-for-adversarial-point |
Repo | https://github.com/tianzheng4/Learning-PointCloud-Saliency-Maps |
Framework | tf |
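A minimal PyTorch sketch of the scoring rule the abstract describes, under one plausible reading (the `model`, point layout, and cross-entropy loss are my assumptions; the authors' TensorFlow code is in the repo above):

```python
import torch
import torch.nn.functional as F

def saliency_scores(model, points, labels):
    """Score each point by the first-order loss change from shifting it
    to the cloud centroid, i.e., from approximately 'dropping' it."""
    pts = points.detach().clone().requires_grad_(True)  # (B, N, 3)
    loss = F.cross_entropy(model(pts), labels)
    grad, = torch.autograd.grad(loss, pts)              # dL/dx for every point
    with torch.no_grad():
        centroid = pts.mean(dim=1, keepdim=True)        # the cloud centroid
        return (grad * (centroid - pts)).sum(dim=-1)    # (B, N); high = salient
```

Dropping the top-scored points and re-checking accuracy is the natural way to validate such a map.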
Deep UL2DL: Channel Knowledge Transfer from Uplink to Downlink
Title | Deep UL2DL: Channel Knowledge Transfer from Uplink to Downlink |
Authors | Mohammad Sadegh Safari, Vahid Pourahmadi, Shabnam Sodagari |
Abstract | Knowledge of the channel state information (CSI) at the transmitter side is one of the primary sources of information that can be used for the efficient allocation of wireless resources. Obtaining downlink (DL) CSI in Frequency Division Duplexing (FDD) systems from uplink (UL) CSI is not as straightforward as in TDD systems. Therefore, users usually feed the DL-CSI back to the transmitter. To remove the need for feedback (and thus reduce signaling overhead), we propose to use two recent deep neural network structures, i.e., convolutional neural networks and generative adversarial networks (GANs), to infer the DL-CSI by observing the UL-CSI. The core idea of our data-driven scheme is to exploit the fact that both DL and UL channels share the same propagation environment. As such, we extract the environment information from the UL channel response into a latent domain and then transfer the derived environment information from the latent domain to predict the DL channel. To avoid an incorrect latent domain and the problem of overly simplistic assumptions, we did not use any specific parametric model; instead, we used data-driven approaches to discover the underlying structure of the data without any prior model assumptions. To overcome the challenge of capturing the UL-DL joint distribution, we used a mean square error-based variant of the GAN structure with improved convergence properties, called the boundary equilibrium GAN (BEGAN). For training and testing, we used simulated data of the Extended Vehicular-A (EVA) and Extended Typical Urban (ETU) models. Simulation results verified that our methods can accurately infer and predict the downlink CSI from the uplink CSI for different multipath environments in FDD communications. |
Tasks | Transfer Learning |
Published | 2018-12-16 |
URL | https://arxiv.org/abs/1812.07518v3 |
https://arxiv.org/pdf/1812.07518v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-ul2dl-channel-knowledge-transfer-from |
Repo | https://github.com/safarisadegh/UL2DL |
Framework | tf |
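For intuition only, a toy encoder-decoder in PyTorch (an illustrative stand-in, not the authors' TensorFlow CNN or BEGAN models): the uplink CSI grid goes in, a latent "environment" representation is formed, and the downlink CSI grid comes out.

```python
import torch
import torch.nn as nn

class UL2DLNet(nn.Module):
    """Illustrative UL-to-DL mapper: CSI is stored as real/imag channels
    over a time x frequency grid."""
    def __init__(self, hidden=32):
        super().__init__()
        self.encode = nn.Sequential(              # UL CSI -> latent environment
            nn.Conv2d(2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(              # latent -> predicted DL CSI
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),
        )

    def forward(self, ul_csi):                    # (B, 2, T, F)
        return self.decode(self.encode(ul_csi))

# training would minimize, e.g., MSE between net(ul_csi) and the true DL CSI
```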
Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition
Title | Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition |
Authors | Christoph Wick, Christian Reul, Frank Puppe |
Abstract | Optical Character Recognition (OCR) on contemporary and historical data is still in the focus of many researchers. Especially historical prints require book-specific trained OCR models to achieve applicable results (Springmann and Lüdeling, 2016; Reul et al., 2017a). To reduce the human effort for manually annotating ground truth (GT), various techniques such as voting and pretraining have been shown to be very efficient (Reul et al., 2018a; Reul et al., 2018b). Calamari is a new open-source OCR line recognition software that uses state-of-the-art Deep Neural Networks (DNNs) implemented in Tensorflow and gives native support for techniques such as pretraining and voting. The customizable network architectures, constructed of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) layers, are trained with the Connectionist Temporal Classification (CTC) algorithm of Graves et al. (2006). Optional usage of a GPU drastically reduces the computation times for both training and prediction. We use two different datasets to compare the performance of Calamari to OCRopy, OCRopus3, and Tesseract 4. Calamari reaches a Character Error Rate (CER) of 0.11% on the UW3 dataset written in modern English and 0.18% on the DTA19 dataset written in German Fraktur, considerably outperforming the results of the existing software. |
Tasks | Optical Character Recognition |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02004v3 |
http://arxiv.org/pdf/1807.02004v3.pdf | |
PWC | https://paperswithcode.com/paper/calamari-a-high-performance-tensorflow-based |
Repo | https://github.com/chreul/mptv |
Framework | tf |
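A generic CNN-LSTM-CTC line recognizer in PyTorch, to illustrate the architecture family named in the abstract (Calamari itself is TensorFlow-based, and its layer sizes are not reproduced here):

```python
import torch
import torch.nn as nn

class CTCLineRecognizer(nn.Module):
    """Illustrative text-line model: a conv stack over the line image, a
    bidirectional LSTM over the width axis, and per-column character
    logits for CTC decoding."""
    def __init__(self, n_chars, height=48):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(64 * (height // 4), 128,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(256, n_chars + 1)   # +1 for the CTC blank

    def forward(self, x):                         # x: (B, 1, H, W)
        f = self.conv(x)                          # (B, 64, H/4, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)      # (B, W/4, 64 * H/4)
        out, _ = self.lstm(f)
        return self.head(out).log_softmax(-1)     # pair with torch.nn.CTCLoss
```

Calamari combines several such models with confidence voting; this sketch stops at a single model.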
Identification of LTV Dynamical Models with Smooth or Discontinuous Time Evolution by means of Convex Optimization
Title | Identification of LTV Dynamical Models with Smooth or Discontinuous Time Evolution by means of Convex Optimization |
Authors | Fredrik Bagge Carlson, Anders Robertsson, Rolf Johansson |
Abstract | We establish a connection between trend filtering and system identification which results in a family of new identification methods for linear, time-varying (LTV) dynamical models based on convex optimization. We demonstrate how the design of the cost function promotes either a model whose dynamics change continuously over time, or one whose coefficients change discontinuously at a finite (sparse) set of time instances. We further discuss the introduction of priors on the model parameters for situations where excitation is insufficient for identification. The identification problems are cast as convex optimization problems and are applicable to, e.g., ARX models and state-space models with time-varying parameters. We illustrate usage of the methods in simulations of jump-linear systems, a nonlinear robot arm with non-smooth friction and stiff contacts, as well as in model-based, trajectory-centric reinforcement learning on a smooth nonlinear system. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09794v1 |
http://arxiv.org/pdf/1802.09794v1.pdf | |
PWC | https://paperswithcode.com/paper/identification-of-ltv-dynamical-models-with |
Repo | https://github.com/baggepinnen/LTVModels.jl |
Framework | none |
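A sketch of the trend-filtering formulation the abstract suggests, in Python/cvxpy (the authors ship the Julia package LTVModels.jl; the exact costs below are my assumption): a squared difference penalty yields smoothly drifting parameters, while a group-sparse penalty yields piecewise-constant parameters with a few jumps.

```python
import cvxpy as cp
import numpy as np

def fit_ltv_arx(Phi, y, lam, smooth=True):
    """Fit per-time parameters theta_t of a scalar ARX-style model
    y_t ~ Phi_t . theta_t, penalizing their evolution over time."""
    T, d = Phi.shape
    theta = cp.Variable((T, d))
    fit = cp.sum_squares(cp.sum(cp.multiply(Phi, theta), axis=1) - y)
    diffs = theta[1:] - theta[:-1]
    # squared penalty -> smooth drift; group-L1 penalty -> sparse jumps
    reg = cp.sum_squares(diffs) if smooth else cp.sum(cp.norm(diffs, 2, axis=1))
    cp.Problem(cp.Minimize(fit + lam * reg)).solve()
    return theta.value

# smooth=False with a suitable lam recovers jump-linear behavior
```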
Transformation Autoregressive Networks
Title | Transformation Autoregressive Networks |
Authors | Junier B. Oliva, Avinava Dubey, Manzil Zaheer, Barnabás Póczos, Ruslan Salakhutdinov, Eric P. Xing, Jeff Schneider |
Abstract | The fundamental task of general density estimation $p(x)$ has been of keen interest to machine learning. In this work, we attempt to systematically characterize methods for density estimation. Broadly speaking, most of the existing methods can be categorized as using either: \textit{a}) autoregressive models to estimate the conditional factors of the chain rule, $p(x_{i} \mid x_{i-1}, \ldots)$; or \textit{b}) non-linear transformations of variables of a simple base distribution. Based on the study of the characteristics of these categories, we propose multiple novel methods for each category. For example, we propose RNN-based transformations to model non-Markovian dependencies. Further, through a comprehensive study over both real-world and synthetic data, we show that jointly leveraging transformations of variables and autoregressive conditional models results in a considerable improvement in performance. We illustrate the use of our models in outlier detection and image modeling. Finally, we introduce a novel data-driven framework for learning a family of distributions. |
Tasks | Density Estimation, Outlier Detection |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.09819v5 |
http://arxiv.org/pdf/1801.09819v5.pdf | |
PWC | https://paperswithcode.com/paper/transformation-autoregressive-networks |
Repo | https://github.com/lupalab/tan |
Framework | tf |
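For reference, the two identities behind categories (a) and (b), written out in my notation:

```latex
\underbrace{p(x) = \prod_{i=1}^{d} p(x_i \mid x_{i-1}, \ldots, x_1)}_{\text{(a) autoregressive chain rule}}
\qquad
\underbrace{p(x) = p_z\bigl(f(x)\bigr)\left|\det \frac{\partial f(x)}{\partial x}\right|}_{\text{(b) change of variables}}
```

The paper's point is that composing (b) with flexible conditionals from (a) on the transformed variables outperforms either alone.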
Rethinking floating point for deep learning
Title | Rethinking floating point for deep learning |
Authors | Jeff Johnson |
Abstract | Reducing hardware overhead of neural networks for faster or lower power inference and training is an active area of research. Uniform quantization using integer multiply-add has been thoroughly investigated, which requires learning many quantization parameters, fine-tuning training or other prerequisites. Little effort is made to improve floating point relative to this baseline; it remains energy inefficient, and word size reduction yields drastic loss in needed dynamic range. We improve floating point to be more energy efficient than equivalent bit width integer hardware on a 28 nm ASIC process while retaining accuracy in 8 bits with a novel hybrid log multiply/linear add, Kulisch accumulation and tapered encodings from Gustafson’s posit format. With no network retraining, and drop-in replacement of all math and float32 parameters via round-to-nearest-even only, this open-sourced 8-bit log float is within 0.9% top-1 and 0.2% top-5 accuracy of the original float32 ResNet-50 CNN model on ImageNet. Unlike int8 quantization, it is still a general-purpose floating point arithmetic, interpretable out-of-the-box. Our 8/38-bit log float multiply-add is synthesized and power profiled at 28 nm at 0.96x the power and 1.12x the area of 8/32-bit integer multiply-add. In 16 bits, our log float multiply-add is 0.59x the power and 0.68x the area of IEEE 754 float16 fused multiply-add, maintaining the same significand precision and dynamic range, proving useful for training ASICs as well. |
Tasks | Quantization |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.01721v1 |
http://arxiv.org/pdf/1811.01721v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-floating-point-for-deep-learning |
Repo | https://github.com/facebookresearch/deepfloat |
Framework | pytorch |
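A toy Python illustration of the two halves of the hybrid scheme (real posit encodings taper precision into a few bits; this sketch uses full-precision floats and nonzero inputs): multiplication becomes an add in the log domain, and products are accumulated in the linear domain, Kulisch-style.

```python
import math

def to_log(x):
    # toy encoding of a nonzero value as (sign, log2 |x|); real posit
    # encodings taper precision and use only a handful of bits
    return (math.copysign(1.0, x), math.log2(abs(x)))

def log_mul(a, b):
    # the "log multiply" half: a multiply becomes an exponent add
    return (a[0] * b[0], a[1] + b[1])

def linear_sum(products):
    # the "linear add" half: convert back and accumulate in a wide
    # (Kulisch-style) accumulator; Python floats stand in here
    return sum(s * 2.0 ** e for s, e in products)

# dot product [1.5, -2.0] . [4.0, 0.5] = 6.0 - 1.0 = 5.0
prods = [log_mul(to_log(1.5), to_log(4.0)),
         log_mul(to_log(-2.0), to_log(0.5))]
print(linear_sum(prods))  # ~5.0
```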
Semi-Implicit Variational Inference
Title | Semi-Implicit Variational Inference |
Authors | Mingzhang Yin, Mingyuan Zhou |
Abstract | Semi-implicit variational inference (SIVI) is introduced to expand the commonly used analytic variational distribution family, by mixing the variational parameter with a flexible distribution. This mixing distribution can assume any density function, explicit or not, as long as independent random samples can be generated via reparameterization. Not only does SIVI expand the variational family to incorporate highly flexible variational distributions, including implicit ones that have no analytic density functions, but it also sandwiches the evidence lower bound (ELBO) between a lower bound and an upper bound, and further derives an asymptotically exact surrogate ELBO that is amenable to optimization via stochastic gradient ascent. With a substantially expanded variational family and a novel optimization algorithm, SIVI is shown to closely match the accuracy of MCMC in inferring the posterior in a variety of Bayesian inference tasks. |
Tasks | Bayesian Inference |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11183v1 |
http://arxiv.org/pdf/1805.11183v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-implicit-variational-inference |
Repo | https://github.com/mingzhang-yin/SIVI |
Framework | none |
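The semi-implicit construction itself fits in a few lines; a toy PyTorch sampler (the mixing network, its dimensions, and the fixed conditional scale are my assumptions):

```python
import torch
import torch.nn as nn

mixing_net = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 2))
sigma = 0.1  # fixed scale of the explicit conditional layer (assumed)

def sivi_sample(n):
    eps = torch.randn(n, 8)          # mixing noise; any samplable source works
    mu = mixing_net(eps)             # implicit mixing distribution over mu
    return mu + sigma * torch.randn_like(mu)  # reparameterized conditional draw
```

Because every step is reparameterized, the surrogate ELBO the paper derives can be optimized through this sampler by stochastic gradient ascent.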
Neural networks for stock price prediction
Title | Neural networks for stock price prediction |
Authors | Yue-Gang Song, Yu-Long Zhou, Ren-Jie Han |
Abstract | Due to the extremely volatile nature of financial markets, it is commonly accepted that stock price prediction is a challenging task. However, in order to make profits or understand the essence of the equity market, numerous market participants and researchers try to forecast stock prices using various statistical, econometric, or even neural network models. In this work, we survey and compare the predictive power of five neural network models, namely, the back propagation (BP) neural network, radial basis function (RBF) neural network, general regression neural network (GRNN), support vector machine regression (SVMR), and least squares support vector machine regression (LS-SVMR). We apply the five models to predict the prices of three individual stocks, namely, Bank of China, Vanke A, and Kweichou Moutai. Adopting mean square error and average absolute percentage error as criteria, we find that the BP neural network consistently and robustly outperforms the other four models. |
Tasks | Stock Price Prediction |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11317v1 |
http://arxiv.org/pdf/1805.11317v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-for-stock-price-prediction |
Repo | https://github.com/aflorial/DeepDayTrade |
Framework | none |
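The two criteria named in the abstract, as commonly defined (the paper's exact normalization may differ):

```python
import numpy as np

def mse(y, yhat):
    # mean square error
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean((y - yhat) ** 2))

def mape(y, yhat):
    # average absolute percentage error; assumes no zero prices
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs((y - yhat) / y)) * 100.0)
```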
Optimizing for Generalization in Machine Learning with Cross-Validation Gradients
Title | Optimizing for Generalization in Machine Learning with Cross-Validation Gradients |
Authors | Shane Barratt, Rishi Sharma |
Abstract | Cross-validation is the workhorse of modern applied statistics and machine learning, as it provides a principled framework for selecting the model that maximizes generalization performance. In this paper, we show that the cross-validation risk is differentiable with respect to the hyperparameters and training data for many common machine learning algorithms, including logistic regression, elastic-net regression, and support vector machines. Leveraging this property of differentiability, we propose a cross-validation gradient method (CVGM) for hyperparameter optimization. Our method enables efficient optimization, in high-dimensional hyperparameter spaces, of the cross-validation risk, the best surrogate of the true generalization ability of our learning algorithm. |
Tasks | Hyperparameter Optimization |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07072v1 |
http://arxiv.org/pdf/1805.07072v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-for-generalization-in-machine-1 |
Repo | https://github.com/sbarratt/crossval |
Framework | none |
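A compact instance of the idea using PyTorch autograd with ridge regression (the paper derives gradients for several learners; treating ridge's closed-form fit as the differentiable inner step is my simplification):

```python
import torch

def cv_risk(lam_log, X, y, k=5):
    """k-fold cross-validation risk of ridge regression, differentiable
    in the log regularization strength lam_log."""
    lam = torch.exp(lam_log)
    folds = torch.arange(X.shape[0]) % k
    risk = 0.0
    for i in range(k):
        tr, va = folds != i, folds == i
        A = X[tr].T @ X[tr] + lam * torch.eye(X.shape[1])
        w = torch.linalg.solve(A, X[tr].T @ y[tr])   # closed-form ridge fit
        risk = risk + ((X[va] @ w - y[va]) ** 2).mean()
    return risk / k

# gradient descent directly on the hyperparameter
X, y = torch.randn(100, 5), torch.randn(100)
lam_log = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([lam_log], lr=0.1)
for _ in range(50):
    opt.zero_grad()
    cv_risk(lam_log, X, y).backward()
    opt.step()
```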
DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques
Title | DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques |
Authors | Oscar Araque, Lorenzo Gatti, Jacopo Staiano, Marco Guerini |
Abstract | Several lexica for sentiment analysis have been developed and made available in the NLP community. While most of these come with word polarity annotations (e.g. positive/negative), attempts at building lexica for finer-grained emotion analysis (e.g. happiness, sadness) have recently attracted significant attention. Such lexica are often exploited as a building block in the process of developing learning models for which emotion recognition is needed, and/or used as baselines against which to compare the performance of such models. In this work, we contribute two new resources to the community: a) an extension of an existing and widely used emotion lexicon for English; and b) a novel version of the lexicon targeting Italian. Furthermore, we show how simple techniques can be used, both in supervised and unsupervised experimental settings, to boost performance on datasets and tasks of varying degrees of domain specificity. |
Tasks | Emotion Recognition, Sentiment Analysis |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03660v1 |
http://arxiv.org/pdf/1810.03660v1.pdf | |
PWC | https://paperswithcode.com/paper/depechemood-a-bilingual-emotion-lexicon-built |
Repo | https://github.com/marcoguerini/DepecheMood |
Framework | none |
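Lexicon-based emotion scoring is simple enough to show end to end; the word scores below are made up and merely stand in for real DepecheMood++ entries:

```python
# toy lexicon: word -> emotion scores (the real DepecheMood++ lexicon
# maps tens of thousands of words to several emotion dimensions)
LEXICON = {"happy": {"joy": 0.9, "sadness": 0.1},
           "rain":  {"joy": 0.2, "sadness": 0.7}}

def emotion_profile(text):
    """Average the emotion vectors of the lexicon words found in text."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    if not hits:
        return {}
    keys = hits[0].keys()
    return {k: sum(h[k] for h in hits) / len(hits) for k in keys}

print(emotion_profile("Happy rain"))  # {'joy': 0.55, 'sadness': 0.4}
```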
Supervised Fitting of Geometric Primitives to 3D Point Clouds
Title | Supervised Fitting of Geometric Primitives to 3D Point Clouds |
Authors | Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas |
Abstract | Fitting geometric primitives to 3D point cloud data bridges a gap between low-level digitized 3D data and high-level structural information on the underlying 3D shapes. As such, it enables many downstream applications in 3D data processing. For a long time, RANSAC-based methods have been the gold standard for such primitive fitting problems, but they require careful per-input parameter tuning and thus do not scale well for large datasets with diverse shapes. In this work, we introduce Supervised Primitive Fitting Network (SPFN), an end-to-end neural network that can robustly detect a varying number of primitives at different scales without any user control. The network is supervised using ground truth primitive surfaces and primitive membership for the input points. Instead of directly predicting the primitives, our architecture first predicts per-point properties and then uses a differentiable model estimation module to compute the primitive type and parameters. We evaluate our approach on a novel benchmark of ANSI 3D mechanical component models and demonstrate a significant improvement over both the state-of-the-art RANSAC-based methods and the direct neural prediction. |
Tasks | Shape Representation Of 3D Point Clouds |
Published | 2018-11-22 |
URL | https://arxiv.org/abs/1811.08988v4 |
https://arxiv.org/pdf/1811.08988v4.pdf | |
PWC | https://paperswithcode.com/paper/supervised-fitting-of-geometric-primitives-to |
Repo | https://github.com/csimstu2/SPFN |
Framework | tf |
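One concrete example of a differentiable model-estimation module, in PyTorch (my illustration of the "per-point properties, then primitive parameters" split, not SPFN's full pipeline): a weighted least-squares plane fit that stays differentiable w.r.t. both the points and the predicted membership weights.

```python
import torch

def fit_plane(points, weights):
    """Weighted plane fit: points (N, 3), weights (N,) in [0, 1].
    Returns (unit normal, offset) with gradients flowing through."""
    w = weights / weights.sum()
    centroid = (w[:, None] * points).sum(0)
    diffs = points - centroid
    cov = (w[:, None] * diffs).T @ diffs
    # the plane normal is the eigenvector with the smallest eigenvalue
    eigvals, eigvecs = torch.linalg.eigh(cov)
    normal = eigvecs[:, 0]
    return normal, normal @ centroid
```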
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision
Title | Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision |
Authors | Mohsen Joneidi, Alireza Zaeemzadeh, Nazanin Rahnavard, Mubarak Shah |
Abstract | The goal of data selection is to capture the most structural information from a set of data. This paper presents a fast and accurate data selection method, in which the selected samples are optimized to span the subspace of all data. We propose a new selection algorithm, referred to as iterative projection and matching (IPM), with linear complexity w.r.t. the number of data points, and without any parameters to be tuned. In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is neglected in subsequent iterations by projection onto the null space of the previously selected samples. The computational efficiency and the selection accuracy of our proposed algorithm outperform those of conventional methods. Furthermore, the superiority of the proposed algorithm is shown on active learning for video action recognition on UCF-101; learning using representatives on ImageNet; training a generative adversarial network (GAN) to generate multi-view images from a single-view input on the CMU Multi-PIE dataset; and video summarization on the UTE Egocentric dataset. |
Tasks | Active Learning, Data Summarization, Feature Selection, Temporal Action Localization, Video Summarization |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12326v1 |
http://arxiv.org/pdf/1811.12326v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-projection-and-matching-finding |
Repo | https://github.com/zaeemzadeh/IPM |
Framework | none |
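The select-then-project loop is compact enough to sketch in numpy (the matching rule and normalizations below are my guesses at the details):

```python
import numpy as np

def ipm_select(X, k):
    """Pick k rows of X: at each step, match the sample best aligned
    with the residual data's dominant direction, then project all rows
    onto the null space of the pick."""
    A = X.astype(float).copy()
    selected = []
    for _ in range(k):
        _, _, vt = np.linalg.svd(A, full_matrices=False)
        u = vt[0]                                     # dominant direction
        norms = np.maximum(np.linalg.norm(A, axis=1), 1e-12)
        i = int(np.argmax(np.abs(A @ u) / norms))     # matching step
        selected.append(i)
        v = A[i] / max(np.linalg.norm(A[i]), 1e-12)
        A = A - np.outer(A @ v, v)                    # projection step
    return selected

# idx = ipm_select(np.random.randn(100, 10), k=5)
```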
Classifier-agnostic saliency map extraction
Title | Classifier-agnostic saliency map extraction |
Authors | Konrad Zolna, Krzysztof J. Geras, Kyunghyun Cho |
Abstract | Extracting saliency maps, which indicate parts of the image important to classification, requires many tricks to achieve satisfactory performance when using classifier-dependent methods. Instead, we propose classifier-agnostic saliency map extraction, which finds all parts of the image that any classifier could use, not just one given in advance. We observe that the proposed approach extracts higher quality saliency maps and outperforms existing weakly-supervised localization techniques, setting a new state-of-the-art result on the ImageNet dataset. We made our code publicly available at https://github.com/kondiz/casme. |
Tasks | |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08249v2 |
http://arxiv.org/pdf/1805.08249v2.pdf | |
PWC | https://paperswithcode.com/paper/classifier-agnostic-saliency-map-extraction |
Repo | https://github.com/kondiz/casme |
Framework | pytorch |
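A very loose sketch of one mask-model update in PyTorch (the losses are assumed, not the authors' exact objective): a good saliency mask removes whatever evidence the current classifier uses while staying small, and alternating such updates with freshly trained classifiers is what makes the maps classifier-agnostic.

```python
import torch
import torch.nn.functional as F

def masker_step(masker, classifier, opt_m, x, y):
    """One assumed mask-model update: masking OUT the salient region
    should break the current classifier, with a small-mask penalty."""
    m = masker(x).clamp(0, 1)               # (B, 1, H, W) soft mask
    logits = classifier(x * (1 - m))        # image with salient parts removed
    # maximize the classifier's loss on the masked-out image (unbounded
    # in this toy form), while keeping the mask area small
    loss = -F.cross_entropy(logits, y) + 0.1 * m.mean()
    opt_m.zero_grad()
    loss.backward()
    opt_m.step()
    return loss.item()
```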
Cross-view image synthesis using geometry-guided conditional GANs
Title | Cross-view image synthesis using geometry-guided conditional GANs |
Authors | Krishna Regmi, Ali Borji |
Abstract | We address the problem of generating images across two drastically different views, namely ground (street) and aerial (overhead) views. Image synthesis by itself is a very challenging computer vision task, and it is even more so when generation is conditioned on an image in another view. Due to the difference in viewpoints, there is only a small overlapping field of view and little common content between the two views. Here, we try to preserve the pixel information between the views so that the generated image is a realistic representation of the cross-view input image. To this end, we propose to use a homography as a guide to map the images between the views based on the common field of view, preserving the details in the input image. We then use generative adversarial networks to inpaint the missing regions in the transformed image and add realism to it. Our exhaustive evaluation and model comparison demonstrate that utilizing geometry constraints adds fine details to the generated images and can be a better approach for cross-view image synthesis than purely pixel-based synthesis methods. |
Tasks | Cross-View Image-to-Image Translation, Image Generation |
Published | 2018-08-14 |
URL | https://arxiv.org/abs/1808.05469v2 |
https://arxiv.org/pdf/1808.05469v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-view-image-synthesis-using-geometry |
Repo | https://github.com/kregmi/cross-view-image-synthesis |
Framework | pytorch |
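The geometry-guided step reduces to a homography warp; a toy OpenCV version (the matrix below is made up, whereas the paper derives it from the known view geometry):

```python
import cv2
import numpy as np

# stand-in street-view image and an illustrative homography H
street = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
H = np.array([[1.0, 0.3,   0.0],
              [0.0, 1.0,   0.0],
              [0.0, 0.001, 1.0]])
warped = cv2.warpPerspective(street, H, (256, 256))
# pixels with no source remain empty; the paper's GAN stage inpaints
# those regions and adds realism
```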
Image Processing in Quantum Computers
Title | Image Processing in Quantum Computers |
Authors | Aditya Dendukuri, Khoa Luu |
Abstract | Quantum Image Processing (QIP) is an exciting new field showing a lot of promise as a powerful addition to the arsenal of image processing techniques. Representing an image pixel by pixel using classical information requires an enormous amount of computational resources. Hence, exploring methods to represent images in a different paradigm of information is important. In this work, we study the representation of images in quantum information. The main motivation for this pursuit is the ability to store N bits of classical information in only log_2(N) quantum bits (qubits). A promising first step was the exponentially efficient implementation of the Fourier transform on quantum computers compared to the Fast Fourier Transform on classical computers. In addition, images encoded in quantum information can exhibit unique quantum properties like superposition and entanglement. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.11042v3 |
http://arxiv.org/pdf/1812.11042v3.pdf | |
PWC | https://paperswithcode.com/paper/image-processing-in-quantum-computers |
Repo | https://github.com/Shedka/citiesatnight |
Framework | none |
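A worked example of the qubit-count claim via amplitude encoding (one common QIP representation; the paper surveys the representation question more broadly):

```python
import numpy as np

# amplitude encoding: 2^n amplitudes fit in n qubits, so N classical
# values need only log2(N) qubits; here 4 pixel intensities -> 2 qubits
pixels = np.array([0.2, 0.4, 0.8, 0.4])
state = pixels / np.linalg.norm(pixels)  # normalized quantum state vector
print(len(state), "amplitudes in", int(np.log2(len(state))), "qubits")
```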