Paper Group ANR 449
Deep Generative Models for Library Augmentation in Multiple Endmember Spectral Mixture Analysis. A Novel Topology for End-to-end Temporal Classification and Segmentation with Recurrent Neural Network. A Hybrid Approach Between Adversarial Generative Networks and Actor-Critic Policy Gradient for Low Rate High-Resolution Image Compression. Expected S …
Deep Generative Models for Library Augmentation in Multiple Endmember Spectral Mixture Analysis
Title | Deep Generative Models for Library Augmentation in Multiple Endmember Spectral Mixture Analysis |
Authors | Ricardo Augusto Borsoi, Tales Imbiriba, José Carlos Moreira Bermudez, Cédric Richard |
Abstract | Multiple Endmember Spectral Mixture Analysis (MESMA) is one of the leading approaches to perform spectral unmixing (SU) considering variability of the endmembers (EMs). It represents each endmember in the image using libraries of spectral signatures acquired a priori. However, existing spectral libraries are often small and unable to properly capture the variability of each endmember in practical scenes, which significantly compromises the performance of MESMA. In this paper, we propose a library augmentation strategy to improve the diversity of existing spectral libraries, thus improving their ability to represent the materials in real images. First, the proposed methodology leverages the power of deep generative models (DGMs) to learn the statistical distribution of the endmembers based on the spectral signatures available in the existing libraries. Afterwards, new samples can be drawn from the learned EM distributions and used to augment the spectral libraries, improving the overall quality of the unmixing process. Experimental results using synthetic and real data attest to the superior performance of the proposed method even under library mismatch conditions. |
Tasks | |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09741v1 |
https://arxiv.org/pdf/1909.09741v1.pdf | |
PWC | https://paperswithcode.com/paper/190909741 |
Repo | |
Framework | |
A Novel Topology for End-to-end Temporal Classification and Segmentation with Recurrent Neural Network
Title | A Novel Topology for End-to-end Temporal Classification and Segmentation with Recurrent Neural Network |
Authors | Taiyang Zhao |
Abstract | Connectionist temporal classification (CTC) has matured as an alignment-free approach to sequence transduction and is competitive for end-to-end speech recognition. In the CTC topology, the blank symbol occupies more than half of the state trellis, which results in the spike phenomenon of the non-blank symbols. For classification tasks the spikes work quite well, but for segmentation tasks they do not provide boundary information. In this paper, a novel topology is introduced to combine the temporal classification and segmentation ability in one framework. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04784v1 |
https://arxiv.org/pdf/1912.04784v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-topology-for-end-to-end-temporal |
Repo | |
Framework | |
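
The remark that blanks occupy more than half of the state trellis follows from the standard CTC construction, in which a label sequence of length L is expanded to 2L + 1 states by placing a blank before, between, and after the labels. The sketch below only illustrates that count; the function names and the `<b>` blank token are hypothetical, and it does not implement the paper's new topology.

```python
def ctc_extended_states(labels, blank="<b>"):
    """Interleave blanks around every label, as in the standard CTC topology."""
    states = [blank]
    for label in labels:
        states.extend([label, blank])
    return states


def blank_fraction(labels, blank="<b>"):
    """Fraction of trellis states (per time step) that are blank states."""
    states = ctc_extended_states(labels, blank)
    return states.count(blank) / len(states)
```

For a three-label sequence this gives four blank states out of seven, so the blank share is always (L + 1) / (2L + 1), which exceeds one half for every L.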
A Hybrid Approach Between Adversarial Generative Networks and Actor-Critic Policy Gradient for Low Rate High-Resolution Image Compression
Title | A Hybrid Approach Between Adversarial Generative Networks and Actor-Critic Policy Gradient for Low Rate High-Resolution Image Compression |
Authors | Nicoló Savioli |
Abstract | Image compression is an essential approach for decreasing the size in bytes of an image without deteriorating its quality. Typically, classic algorithms are used, but recently deep learning has been successfully applied. This work presents a deep super-resolution workflow for image compression that maps low-resolution JPEG images to high resolution. The pipeline consists of two components: first, an encoder-decoder neural network learns how to transform the downsampled JPEG images to high resolution. Second, a combination of Generative Adversarial Networks (GANs) and a reinforcement learning Actor-Critic (A3C) loss pushes the encoder-decoder to indirectly maximize the Peak Signal-to-Noise Ratio (PSNR). Although PSNR is a fully differentiable metric, this work opens the door to new solutions for maximizing non-differentiable metrics through an end-to-end approach between encoder-decoder networks and reinforcement learning policy gradient methods. |
Tasks | Image Compression, Policy Gradient Methods, Super-Resolution |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04681v2 |
https://arxiv.org/pdf/1906.04681v2.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-approach-between-adversarial |
Repo | |
Framework | |
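
For reference, the PSNR metric that the pipeline tries to maximize can be computed from the mean squared error between a reference image and its reconstruction. This is a minimal sketch assuming flattened 8-bit pixel sequences (peak value 255); the function name and argument layout are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean a closer reconstruction; identical inputs return infinity because the error term vanishes.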
Expected Sarsa($λ$) with Control Variate for Variance Reduction
Title | Expected Sarsa($λ$) with Control Variate for Variance Reduction |
Authors | Long Yang, Yu Zhang, Jun Wen, Qian Zheng, Pengfei Li, Gang Pan |
Abstract | Off-policy learning is powerful for reinforcement learning. However, the high variance of off-policy evaluation is a critical challenge, which causes off-policy learning to fall into uncontrolled instability. In this paper, to reduce the variance, we introduce the control variate technique to $\mathtt{Expected}$ $\mathtt{Sarsa}$($\lambda$) and propose a tabular $\mathtt{ES}$($\lambda$)-$\mathtt{CV}$ algorithm. We prove that, given a proper estimator of the value function, the proposed $\mathtt{ES}$($\lambda$)-$\mathtt{CV}$ enjoys a lower variance than $\mathtt{Expected}$ $\mathtt{Sarsa}$($\lambda$). Furthermore, to extend $\mathtt{ES}$($\lambda$)-$\mathtt{CV}$ to be a convergent algorithm with linear function approximation, we propose the $\mathtt{GES}$($\lambda$) algorithm under the convex-concave saddle-point formulation. We prove that the convergence rate of $\mathtt{GES}$($\lambda$) achieves $\mathcal{O}(1/T)$, which matches or outperforms many state-of-the-art gradient-based algorithms, while using a more relaxed condition. Numerical experiments show that the proposed algorithm performs better, with lower variance, than several state-of-the-art gradient-based TD learning algorithms: $\mathtt{GQ}$($\lambda$), $\mathtt{GTB}$($\lambda$) and $\mathtt{ABQ}$($\zeta$). |
Tasks | |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.11058v2 |
https://arxiv.org/pdf/1906.11058v2.pdf | |
PWC | https://paperswithcode.com/paper/expected-sarsa-with-control-variate-for |
Repo | |
Framework | |
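
As background for the abstract above, the sketch below shows the plain one-step tabular Expected Sarsa update, whose target averages the next-state action values under the target policy instead of sampling a single next action. It deliberately omits eligibility traces, importance weighting, and the control variate, so it is a baseline illustration rather than the paper's ES($\lambda$)-CV algorithm; the dictionary layout and all names are assumptions.

```python
def expected_sarsa_update(q, state, action, reward, next_state, policy,
                          alpha=0.1, gamma=0.99):
    """One-step tabular Expected Sarsa update; returns the TD error.

    q[s][a] holds action values and policy[s][a] holds the target policy's
    action probabilities (both plain nested dicts in this sketch).
    """
    # Expectation of the next action value under the target policy.
    expected_next = sum(prob * q[next_state][a]
                        for a, prob in policy[next_state].items())
    td_error = reward + gamma * expected_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error
```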
Skeleton Extraction from 3D Point Clouds by Decomposing the Object into Parts
Title | Skeleton Extraction from 3D Point Clouds by Decomposing the Object into Parts |
Authors | Vijai Jayadevan, Edward Delp, Zygmunt Pizlo |
Abstract | Decomposing a point cloud into its components and extracting curve skeletons from point clouds are two related problems. Decomposition of a shape into its components is often obtained as a byproduct of skeleton extraction. In this work, we propose to extract curve skeletons from unorganized point clouds by decomposing the object into its parts, identifying part skeletons and then linking these part skeletons together to obtain the complete skeleton. We believe it is the most natural way to extract skeletons in the sense that this would be the way a human would approach the problem. Our parts are generalized cylinders (GCs). Since the axis of a GC is an integral part of its definition, the parts have natural skeletal representations. We use translational symmetry, the fundamental property of GCs, to extract parts from point clouds. We demonstrate how this method can handle a large variety of shapes. We compare our method with state-of-the-art methods and show how a part-based approach can deal with some of the limitations of other methods. We present an improved version of an existing point set registration algorithm and demonstrate its utility in extracting parts from point clouds. We also show how this method can be used to extract skeletons from and identify parts of noisy point clouds. A part-based approach also provides a natural and intuitive interface for user interaction. We demonstrate the ease with which mistakes, if any, can be fixed with minimal user interaction with the help of a graphical user interface. |
Tasks | |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11932v1 |
https://arxiv.org/pdf/1912.11932v1.pdf | |
PWC | https://paperswithcode.com/paper/skeleton-extraction-from-3d-point-clouds-by |
Repo | |
Framework | |
Towards Lossless Encoding of Sentences
Title | Towards Lossless Encoding of Sentences |
Authors | Gabriele Prato, Mathieu Duchesneau, Sarath Chandar, Alain Tapp |
Abstract | A lot of work has been done in the field of image compression via machine learning, but not much attention has been given to the compression of natural language. Compressing text into lossless representations while making features easily retrievable is not a trivial task, yet has huge benefits. Most methods designed to produce feature rich sentence embeddings focus solely on performing well on downstream tasks and are unable to properly reconstruct the original sequence from the learned embedding. In this work, we propose a near lossless method for encoding long sequences of texts as well as all of their sub-sequences into feature rich representations. We test our method on sentiment analysis and show good performance across all sub-sentence and sentence embeddings. |
Tasks | Image Compression, Sentence Embeddings, Sentiment Analysis |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01659v2 |
https://arxiv.org/pdf/1906.01659v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-lossless-encoding-of-sentences |
Repo | |
Framework | |
Single Image Super Resolution based on a Modified U-net with Mixed Gradient Loss
Title | Single Image Super Resolution based on a Modified U-net with Mixed Gradient Loss |
Authors | Zhengyang Lu, Ying Chen |
Abstract | Single image super-resolution (SISR) is the task of inferring a high-resolution image from a single low-resolution image. Recent research on super-resolution has achieved great progress due to the development of deep convolutional neural networks in the field of computer vision. Existing super-resolution reconstruction methods perform well under the Mean Square Error (MSE) criterion, but most methods fail to reconstruct an image with sharp edges. To solve this problem, the mixed gradient error, which is composed of MSE and a weighted mean gradient error, is proposed in this work and applied to a modified U-net network as the loss function. The modified U-net removes all batch normalization layers and one of the convolution layers in each block. This reduces the number of parameters and therefore accelerates the reconstruction. Compared with existing image super-resolution algorithms, the proposed reconstruction method has better performance and lower time consumption. The experiments demonstrate that the modified U-net network architecture with mixed gradient loss yields high-level results on three image datasets: SET14, BSD300, ICDAR2003. Code is available online. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09428v1 |
https://arxiv.org/pdf/1911.09428v1.pdf | |
PWC | https://paperswithcode.com/paper/single-image-super-resolution-based-on-a |
Repo | |
Framework | |
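
To make the idea of a mixed gradient loss concrete, the sketch below combines pixel-wise MSE with a weighted MSE between gradient-magnitude maps, so that blurred edges are penalized in addition to intensity errors. It assumes simple forward differences on small grayscale images stored as nested lists and an arbitrary weight of 0.1; the paper's actual gradient operator and weighting are not given in the abstract, so every name and constant here is illustrative.

```python
def mse(a, b):
    """Mean squared error between two equally sized 2D lists."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def gradient_magnitude(img):
    """Forward-difference gradient magnitude; differences past the border are zero."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            gx = img[i][j + 1] - img[i][j] if j + 1 < w else 0.0
            gy = img[i + 1][j] - img[i][j] if i + 1 < h else 0.0
            row.append((gx * gx + gy * gy) ** 0.5)
        out.append(row)
    return out


def mixed_gradient_loss(pred, target, weight=0.1):
    """Pixel MSE plus a weighted MSE between gradient-magnitude maps."""
    mge = mse(gradient_magnitude(pred), gradient_magnitude(target))
    return mse(pred, target) + weight * mge
```

A uniformly shifted prediction is penalized only by the MSE term, while a prediction that introduces or removes an edge also pays the gradient term.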
Programmable Neural Network Trojan for Pre-Trained Feature Extractor
Title | Programmable Neural Network Trojan for Pre-Trained Feature Extractor |
Authors | Yu Ji, Zixin Liu, Xing Hu, Peiqi Wang, Youhui Zhang |
Abstract | Neural network (NN) trojaning attack is an emerging and important attack model that can broadly damage systems deployed with NN models. Existing studies have explored the outsourced training attack scenario and the transfer learning attack scenario on some small datasets for specific domains, with limited numbers of fixed target classes. In this paper, we propose a more powerful trojaning attack method for both outsourced training attacks and transfer learning attacks, which outperforms existing studies in capability, generality, and stealthiness. First, the attack is programmable: the malicious misclassification target is not fixed and can be generated on demand even after the victim’s deployment. Second, our trojan attack is not limited to a small domain; one trojaned model on a large-scale dataset can affect applications of different domains that reuse its general features. Third, our trojan design is hard to detect or eliminate even if the victims fine-tune the whole model. |
Tasks | Transfer Learning |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.07766v1 |
http://arxiv.org/pdf/1901.07766v1.pdf | |
PWC | https://paperswithcode.com/paper/programmable-neural-network-trojan-for-pre |
Repo | |
Framework | |
Deep auscultation: Predicting respiratory anomalies and diseases via recurrent neural networks
Title | Deep auscultation: Predicting respiratory anomalies and diseases via recurrent neural networks |
Authors | Diego Perna, Andrea Tagarelli |
Abstract | Respiratory diseases are among the most common causes of severe illness and death worldwide. Prevention and early diagnosis are essential to limit or even reverse the trend that characterizes the diffusion of such diseases. In this regard, the development of advanced computational tools for the analysis of respiratory auscultation sounds can become a game changer for detecting disease-related anomalies, or diseases themselves. In this work, we propose a novel learning framework for respiratory auscultation sound data. Our approach combines state-of-the-art feature extraction techniques and advanced deep-neural-network architectures. Remarkably, to the best of our knowledge, we are the first to model a recurrent-neural-network based learning framework to support the clinician in detecting respiratory diseases, at either level of abnormal sounds or pathology classes. Results obtained on the ICBHI benchmark dataset show that our approach outperforms competing methods on both anomaly-driven and pathology-driven prediction tasks, thus advancing the state-of-the-art in respiratory disease analysis. |
Tasks | |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05708v1 |
https://arxiv.org/pdf/1907.05708v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-auscultation-predicting-respiratory |
Repo | |
Framework | |
Data-Driven Multi-step Demand Prediction for Ride-hailing Services Using Convolutional Neural Network
Title | Data-Driven Multi-step Demand Prediction for Ride-hailing Services Using Convolutional Neural Network |
Authors | Chao Wang, Yi Hou, Matthew Barth |
Abstract | Ride-hailing services are growing rapidly and becoming one of the most disruptive technologies in the transportation realm. Accurate prediction of ride-hailing trip demand not only enables cities to better understand people’s activity patterns, but also helps ride-hailing companies and drivers make informed decisions to reduce deadheading vehicle miles traveled, traffic congestion, and energy consumption. In this study, a convolutional neural network (CNN)-based deep learning model is proposed for multi-step ride-hailing demand prediction using the trip request data in Chengdu, China, offered by DiDi Chuxing. The CNN model is capable of accurately predicting the ride-hailing pick-up demand in each 1-km by 1-km zone in the city of Chengdu for every 10 minutes. Compared with another deep learning model based on long short-term memory, the CNN model is 30% faster in training and prediction. The proposed model can also be easily extended to make multi-step predictions, which would benefit on-demand shared autonomous vehicle applications and fleet operators in terms of supply-demand rebalancing. The prediction error attenuation analysis shows that the accuracy stays acceptable as the model predicts more steps. |
Tasks | Autonomous Vehicles |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03441v1 |
https://arxiv.org/pdf/1911.03441v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-multi-step-demand-prediction-for |
Repo | |
Framework | |
Fast and Bayes-consistent nearest neighbors
Title | Fast and Bayes-consistent nearest neighbors |
Authors | Klim Efremenko, Aryeh Kontorovich, Moshe Noivirt |
Abstract | Research on nearest-neighbor methods tends to focus somewhat dichotomously either on the statistical or the computational aspects – either on, say, Bayes consistency and rates of convergence or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm’s prediction runtime compares favorably against state-of-the-art approximate NN methods, while maintaining Bayes-consistency and attaining rates comparable to minimax. On samples of size $n$ in $\mathbb{R}^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d\log n)$ per query point. |
Tasks | |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.05270v2 |
https://arxiv.org/pdf/1910.05270v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-bayes-consistent-nearest-neighbors |
Repo | |
Framework | |
On Identifiability in Transformers
Title | On Identifiability in Transformers |
Authors | Gino Brunner, Yang Liu, Damián Pascual, Oliver Richter, Massimiliano Ciaramita, Roger Wattenhofer |
Abstract | In this paper we delve deep into the Transformer architecture by investigating two of its core components: self-attention and contextual embeddings. In particular, we study the identifiability of attention weights and token embeddings, and the aggregation of context into hidden tokens. We show that, for sequences longer than the attention head dimension, attention weights are not identifiable. We propose effective attention as a complementary tool for improving explanatory interpretations based on attention. Furthermore, we show that input tokens retain their identity to a large degree across the model. We also find evidence suggesting that identity information is mainly encoded in the angle of the embeddings and gradually decreases with depth. Finally, we demonstrate strong mixing of input information in the generation of contextual embeddings by means of a novel quantification method based on gradient attribution. Overall, we show that self-attention distributions are not directly interpretable and present tools to better understand and further investigate Transformer models. |
Tasks | |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04211v4 |
https://arxiv.org/pdf/1908.04211v4.pdf | |
PWC | https://paperswithcode.com/paper/on-the-validity-of-self-attention-as |
Repo | |
Framework | |
Learning from Label Proportions with Consistency Regularization
Title | Learning from Label Proportions with Consistency Regularization |
Authors | Kuen-Han Tsai, Hsuan-Tien Lin |
Abstract | The problem of learning from label proportions (LLP) involves training classifiers with weak labels on bags of instances, rather than strong labels on individual instances. The weak labels only contain the label proportion of each bag. The LLP problem is important for many practical applications that only allow label proportions to be collected because of data privacy or annotation cost, and has recently received considerable research attention. Most existing works focus on extending supervised learning models to solve the LLP problem, but the weak learning nature makes it hard to further improve LLP performance from a supervised angle. In this paper, we take a different angle from semi-supervised learning. In particular, we propose a novel model inspired by consistency regularization, a popular concept in semi-supervised learning that encourages the model to produce a decision boundary that better describes the data manifold. With the introduction of consistency regularization, we further extend our study to non-uniform bag-generation and validation-based parameter-selection procedures that better match practical needs. Experiments not only justify that LLP with consistency regularization achieves superior performance, but also demonstrate the practical usability of the proposed procedures. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13188v1 |
https://arxiv.org/pdf/1910.13188v1.pdf | |
PWC | https://paperswithcode.com/paper/191013188 |
Repo | |
Framework | |
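
To illustrate what training on bag-level weak labels can look like, the sketch below computes a bag-level proportion loss: the cross-entropy between a bag's known label proportions and the average of the model's per-instance class probabilities. This is a commonly used LLP objective and is assumed here for illustration only; the abstract does not state the exact loss, and the consistency regularization term is not shown. All names are hypothetical.

```python
import math


def proportion_loss(instance_probs, bag_proportions, eps=1e-12):
    """Cross-entropy between bag label proportions and mean predicted probabilities.

    instance_probs: list of per-instance class-probability lists for one bag.
    bag_proportions: known class proportions for that bag (sums to 1).
    """
    n = len(instance_probs)
    k = len(bag_proportions)
    # Average the per-instance predictions to get the predicted bag proportions.
    mean_pred = [sum(p[c] for p in instance_probs) / n for c in range(k)]
    return -sum(q * math.log(max(m, eps)) for q, m in zip(bag_proportions, mean_pred))
```

The loss never looks at individual labels: any assignment of predictions whose average matches the bag proportions attains the minimum, which is why additional regularization is attractive in this setting.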
Dimensionality Reduction for Tukey Regression
Title | Dimensionality Reduction for Tukey Regression |
Authors | Kenneth L. Clarkson, Ruosong Wang, David P. Woodruff |
Abstract | We give the first dimensionality reduction methods for the overconstrained Tukey regression problem. The Tukey loss function $\|y\|_M = \sum_i M(y_i)$ has $M(y_i) \approx |y_i|^p$ for residual errors $y_i$ smaller than a prescribed threshold $\tau$, but $M(y_i)$ becomes constant for errors $|y_i| > \tau$. Our results depend on a new structural result, proven constructively, showing that for any $d$-dimensional subspace $L \subset \mathbb{R}^n$, there is a fixed bounded-size subset of coordinates containing, for every $y \in L$, all the large coordinates, with respect to the Tukey loss function, of $y$. Our methods reduce a given Tukey regression problem to a smaller weighted version, whose solution is a provably good approximate solution to the original problem. Our reductions are fast, simple and easy to implement, and we give empirical results demonstrating their practicality, using existing heuristic solvers for the small versions. We also give exponential-time algorithms giving provably good solutions, and hardness results suggesting that a significant speedup in the worst case is unlikely. |
Tasks | Dimensionality Reduction |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05376v1 |
https://arxiv.org/pdf/1905.05376v1.pdf | |
PWC | https://paperswithcode.com/paper/dimensionality-reduction-for-tukey-regression |
Repo | |
Framework | |
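
The behaviour of the loss described in the abstract (growing like a power of the residual below the threshold and flat above it) can be illustrated with a clipped power loss. The function below is a simplified stand-in chosen for clarity, not the paper's exact definition of $M$; the name and default parameters are assumptions.

```python
def tukey_like_loss(residuals, tau=1.0, p=2):
    """Sum of min(|y_i|, tau)**p: |y_i|**p below the threshold, constant tau**p above it."""
    return sum(min(abs(y), tau) ** p for y in residuals)
```

Because the per-coordinate cost saturates at `tau ** p`, an arbitrarily large outlier contributes no more than a residual sitting exactly at the threshold, which is what makes this family of losses robust.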
Not All are Made Equal: Consistency of Weighted Averaging Estimators Under Active Learning
Title | Not All are Made Equal: Consistency of Weighted Averaging Estimators Under Active Learning |
Authors | Jack Goetz, Ambuj Tewari |
Abstract | Active learning seeks to build the best possible model with a budget of labelled data by sequentially selecting the next point to label. However, the training set is no longer i.i.d., violating the conditions required by existing consistency results. Inspired by the success of Stone’s Theorem, we aim to regain consistency for weighted averaging estimators under active learning. Based on ideas in Dasgupta (2012), our approach is to enforce a small amount of random sampling by running an augmented version of the underlying active learning algorithm. We generalize Stone’s Theorem in the noise-free setting, proving consistency for well-known classifiers such as $k$-NN, histogram and kernel estimators under conditions which mirror classical results. However, in the presence of noise we can no longer deal with these estimators in a unified manner; for some, satisfying this condition also guarantees sufficiency in the noisy case, while for others we can achieve near-perfect inconsistency while this condition holds. Finally, we provide conditions for consistency in the presence of noise, which give insight into why these estimators can behave so differently under the combination of noise and active learning. |
Tasks | Active Learning |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05321v1 |
https://arxiv.org/pdf/1910.05321v1.pdf | |
PWC | https://paperswithcode.com/paper/not-all-are-made-equal-consistency-of |
Repo | |
Framework | |