Paper Group ANR 1406
End-to-End Deep Residual Learning with Dilated Convolutions for Myocardial Infarction Detection and Localization. Reinforcement Learning for Portfolio Management. Multi-Target Multiple Instance Learning for Hyperspectral Target Detection. A type of generalization error induced by initialization in deep neural networks. On the Convergence of Projected-Gradient Methods with Low-Rank Projections for Smooth Convex Minimization over Trace-Norm Balls and Related Problems. Knowledge is Never Enough: Towards Web Aided Deep Open World Recognition. Simplified calcium signaling cascade for synaptic plasticity. Teaching DNNs to design fast fashion. A Hybrid Stochastic Optimization Framework for Stochastic Composite Nonconvex Optimization. Deep and Dense Sarcasm Detection. Bundle Method Sketching for Low Rank Semidefinite Programming. Signal Coding and Perfect Reconstruction using Spike Trains. Weather Influence and Classification with Automotive Lidar Sensors. Spectral Analysis of Kernel and Neural Embeddings: Optimization and Generalization. The Role of Memory in Stochastic Optimization.
End-to-End Deep Residual Learning with Dilated Convolutions for Myocardial Infarction Detection and Localization
Title | End-to-End Deep Residual Learning with Dilated Convolutions for Myocardial Infarction Detection and Localization |
Authors | Iván López-Espejo |
Abstract | In this report, I investigate the use of end-to-end deep residual learning with dilated convolutions for myocardial infarction (MI) detection and localization from electrocardiogram (ECG) signals. Although deep residual learning has already been applied to MI detection and localization, I propose a more accurate system that distinguishes among a higher number (i.e., six) of MI locations. Inspired by speech waveform processing with neural networks, I found a front-end more robust than directly arranging the multi-lead ECG signal into an input matrix: a single one-dimensional convolutional layer per ECG lead extracts a pseudo-time-frequency representation, yielding a compact and discriminative input feature volume. As a result, I end up with a system achieving an MI detection and localization accuracy of 99.99% on the well-known Physikalisch-Technische Bundesanstalt (PTB) database. |
Tasks | Myocardial infarction detection |
Published | 2019-09-15 |
URL | https://arxiv.org/abs/1909.12923v1 |
https://arxiv.org/pdf/1909.12923v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-deep-residual-learning-with |
Repo | |
Framework | |
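The per-lead convolutional front-end described in the abstract is compact enough to sketch. Below is a minimal PyTorch rendering of the idea: one one-dimensional convolutional layer per ECG lead, whose outputs are stacked into an input feature volume. Filter count, kernel size, and stride are illustrative assumptions, and the residual network with dilated convolutions that would consume the volume is omitted.

```python
import torch
import torch.nn as nn

class PerLeadFrontEnd(nn.Module):
    """One 1-D convolution per ECG lead; each lead's filter-bank output is a
    pseudo-time-frequency map, and the maps are stacked into a feature volume.
    Filter count, kernel size and stride are assumptions, not the paper's."""
    def __init__(self, n_leads=12, n_filters=32, kernel_size=64, stride=8):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(1, n_filters, kernel_size, stride=stride)
             for _ in range(n_leads)]
        )

    def forward(self, x):  # x: (batch, n_leads, samples)
        maps = [conv(x[:, i:i + 1, :]) for i, conv in enumerate(self.convs)]
        return torch.stack(maps, dim=1)  # (batch, n_leads, n_filters, frames)

front_end = PerLeadFrontEnd()
ecg = torch.randn(4, 12, 2048)          # 4 ECGs, 12 leads, 2048 samples each
print(front_end(ecg).shape)             # torch.Size([4, 12, 32, 249])
```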
Reinforcement Learning for Portfolio Management
Title | Reinforcement Learning for Portfolio Management |
Authors | Angelos Filos |
Abstract | In this thesis, we develop a comprehensive account of the expressive power, modelling efficiency, and performance advantages of so-called trading agents (i.e., Deep Soft Recurrent Q-Network (DSRQN) and Mixture of Score Machines (MSM)), based on both traditional system identification (model-based approach) as well as on context-independent agents (model-free approach). The analysis provides conclusive support for the ability of model-free reinforcement learning methods to act as universal trading agents, which are not only capable of reducing the computational and memory complexity (owing to their linear scaling with the size of the universe), but also serve as generalizing strategies across assets and markets, regardless of the trading universe on which they have been trained. The relatively low volume of daily returns in financial market data is addressed via data augmentation (a generative approach) and a choice of pre-training strategies, both of which are validated against current state-of-the-art models. For rigour, a risk-sensitive framework which includes transaction costs is considered, and its performance advantages are demonstrated in a variety of scenarios, from synthetic time-series (sinusoidal, sawtooth and chirp waves), simulated market series (surrogate data based), through to real market data (S&P 500 and EURO STOXX 50). The analysis and simulations confirm the superiority of universal model-free reinforcement learning agents over current portfolio management models in asset allocation strategies, with an achieved performance advantage of as much as 9.2% in annualized cumulative returns and 13.4% in annualized Sharpe Ratio. |
Tasks | Data Augmentation, Time Series |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.09571v1 |
https://arxiv.org/pdf/1909.09571v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-portfolio |
Repo | |
Framework | |
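The two headline metrics, annualized cumulative return and annualized Sharpe ratio, follow from standard formulas. A small NumPy helper, assuming 252 trading days per year and a zero risk-free rate:

```python
import numpy as np

def annualized_metrics(daily_returns, periods_per_year=252):
    """Annualized (geometric) cumulative return and Sharpe ratio from simple
    daily portfolio returns; zero risk-free rate assumed."""
    r = np.asarray(daily_returns)
    growth = np.prod(1.0 + r)                                # cumulative growth
    ann_return = growth ** (periods_per_year / len(r)) - 1.0
    sharpe = np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)
    return ann_return, sharpe

rng = np.random.default_rng(0)
ret, sr = annualized_metrics(rng.normal(5e-4, 1e-2, size=504))  # ~2 years
print(f"annualized return {ret:.2%}, Sharpe {sr:.2f}")
```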
Multi-Target Multiple Instance Learning for Hyperspectral Target Detection
Title | Multi-Target Multiple Instance Learning for Hyperspectral Target Detection |
Authors | Susan Meerdink, James Bocinsky, Alina Zare, Nicholas Kroeger, Connor McCurley, Daniel Shats, Paul Gader |
Abstract | In remote sensing, it is often challenging to acquire or collect a large dataset that is accurately labeled. This difficulty is usually due to several issues, including but not limited to the study site’s spatial area and accessibility, errors in the global positioning system (GPS), and mixed pixels caused by an image’s spatial resolution. We propose an approach, with two variations, that estimates multiple target signatures from training samples with imprecise labels: Multi-Target Multiple Instance Adaptive Cosine Estimator (Multi-Target MI-ACE) and Multi-Target Multiple Instance Spectral Match Filter (Multi-Target MI-SMF). The proposed methods address the problems above by directly considering the multiple-instance, imprecisely labeled dataset. They learn a dictionary of target signatures that optimizes detection against a background using the Adaptive Cosine Estimator (ACE) and Spectral Match Filter (SMF). Experiments were conducted to test the proposed algorithms using a simulated hyperspectral dataset, the MUUFL Gulfport hyperspectral dataset collected over the University of Southern Mississippi-Gulfpark Campus, and the AVIRIS hyperspectral dataset collected over Santa Barbara County, California. Both simulated and real hyperspectral target detection experiments show the proposed algorithms are effective at learning target signatures and performing target detection. |
Tasks | Multiple Instance Learning |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03316v3 |
https://arxiv.org/pdf/1909.03316v3.pdf | |
PWC | https://paperswithcode.com/paper/multi-target-multiple-instance-learning-for |
Repo | |
Framework | |
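The two detectors the method optimizes against a background are classical. Below is a hedged NumPy sketch of ACE, SMF, and the max-over-dictionary multi-target rule implied by the abstract; the exact normalizations and the signature-learning step are in the paper and are not reproduced here.

```python
import numpy as np

def _inv_sqrt(cov):
    """Inverse square root of the background covariance (whitening matrix)."""
    w, V = np.linalg.eigh(cov)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def ace(X, s, mu_b, cov_b):
    """Adaptive Cosine Estimator: cosine similarity between whitened,
    background-centered pixels and the whitened target signature."""
    W = _inv_sqrt(cov_b)
    Xw, sw = (X - mu_b) @ W, W @ s
    return (Xw @ sw) / (np.linalg.norm(Xw, axis=1) * np.linalg.norm(sw))

def smf(X, s, mu_b, cov_b):
    """Spectral Match Filter: same whitened correlation, normalized by the
    target energy only (no per-pixel normalization)."""
    W = _inv_sqrt(cov_b)
    Xw, sw = (X - mu_b) @ W, W @ s
    return (Xw @ sw) / (sw @ sw)

def multi_target(X, signatures, mu_b, cov_b, det=ace):
    """Multi-target rule: each pixel takes the maximum detector response over
    the learned dictionary of target signatures."""
    return np.max([det(X, s, mu_b, cov_b) for s in signatures], axis=0)

rng = np.random.default_rng(0)
bg = rng.normal(size=(500, 8))                       # background spectra
mu_b, cov_b = bg.mean(axis=0), np.cov(bg.T)
signatures = [rng.normal(size=8) for _ in range(3)]  # stand-in dictionary
print(multi_target(rng.normal(size=(10, 8)), signatures, mu_b, cov_b))
```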
A type of generalization error induced by initialization in deep neural networks
Title | A type of generalization error induced by initialization in deep neural networks |
Authors | Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma |
Abstract | How different initializations and loss functions affect the learning of a deep neural network (DNN), specifically its generalization error, is an important problem in practice. In this work, focusing on regression problems, we develop a kernel-norm minimization framework for the analysis of DNNs in the kernel regime in which the number of neurons in each hidden layer is sufficiently large (Jacot et al. 2018, Lee et al. 2019). We find that, in the kernel regime, for any loss in a general class of functions, e.g., any $L^p$ loss for $1 < p < \infty$, the DNN finds the same global minimum: the one that is nearest to the initial value in the parameter space, or equivalently, the one that is closest to the initial DNN output in the corresponding reproducing kernel Hilbert space. With this framework, we prove that a non-zero initial output increases the generalization error of the DNN. We further propose an antisymmetrical initialization (ASI) trick that eliminates this type of error and accelerates the training. We also demonstrate experimentally that even for DNNs in the non-kernel regime, our theoretical analysis and the ASI trick remain effective. Overall, our work provides insight into how initialization and loss function quantitatively affect the generalization of DNNs, and also provides guidance for the training of DNNs. |
Tasks | |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07777v1 |
https://arxiv.org/pdf/1905.07777v1.pdf | |
PWC | https://paperswithcode.com/paper/a-type-of-generalization-error-induced-by |
Repo | |
Framework | |
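The ASI trick itself is concrete enough to sketch: duplicate the network at initialization and output the scaled difference of the two copies, so the initial output is exactly zero while both copies train jointly. A minimal PyTorch version; the 1/sqrt(2) scaling is the natural choice for keeping the effective kernel unchanged, taken here as an assumption:

```python
import copy
import torch
import torch.nn as nn

class ASI(nn.Module):
    """Antisymmetrical initialization: two networks that start as exact copies,
    with the model output defined as their scaled difference. The output is
    exactly zero at initialization, removing the initialization-induced error
    the abstract describes, while gradients remain non-trivial."""
    def __init__(self, net):
        super().__init__()
        self.net_a = net
        self.net_b = copy.deepcopy(net)   # identical parameters at t = 0

    def forward(self, x):
        return (self.net_a(x) - self.net_b(x)) / 2 ** 0.5

base = nn.Sequential(nn.Linear(3, 256), nn.Tanh(), nn.Linear(256, 1))
model = ASI(base)
x = torch.randn(5, 3)
print(model(x).abs().max())   # exactly zero at initialization
```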
On the Convergence of Projected-Gradient Methods with Low-Rank Projections for Smooth Convex Minimization over Trace-Norm Balls and Related Problems
Title | On the Convergence of Projected-Gradient Methods with Low-Rank Projections for Smooth Convex Minimization over Trace-Norm Balls and Related Problems |
Authors | Dan Garber |
Abstract | Smooth convex minimization over the unit trace-norm ball is an important optimization problem in machine learning, signal processing, statistics and other fields, that underlies many tasks in which one wishes to recover a low-rank matrix given certain measurements. While first-order methods for convex optimization enjoy optimal convergence rates, in the worst case they require computing a full-rank SVD on each iteration in order to compute the projection onto the trace-norm ball. These full-rank SVD computations, however, prohibit the application of such methods to large problems. A simple and natural heuristic to reduce the computational cost is to approximate the projection using only a low-rank SVD. This raises the question of whether, and under what conditions, this simple heuristic can indeed result in provable convergence to the optimal solution. In this paper we show that any optimal solution is the center of a Euclidean ball inside which the projected-gradient mapping admits a rank that is at most the multiplicity of the largest singular value of the gradient vector. Moreover, the radius of the ball scales with the spectral gap of this gradient vector. We show how this readily implies the local convergence (i.e., from a “warm-start” initialization) of standard first-order methods, using only low-rank SVD computations. We also quantify the effect of “over-parameterization”, i.e., using SVD computations with higher rank, on the radius of this ball, showing it can increase dramatically with moderately larger rank. We extend our results also to the setting of optimization with trace-norm regularization and optimization over bounded-trace positive semidefinite matrices. Our theoretical investigation is supported by concrete empirical evidence that demonstrates the *correct* convergence of first-order methods with low-rank projections on real-world datasets. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01644v1 |
http://arxiv.org/pdf/1902.01644v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-of-projected-gradient |
Repo | |
Framework | |
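The heuristic in question can be stated in a few lines: the exact projection onto the trace-norm ball is an SVD followed by projecting the singular values onto an l1-ball, and the low-rank variant simply truncates the SVD first. A NumPy sketch under those standard definitions:

```python
import numpy as np

def project_l1(v, tau):
    """Euclidean projection of a nonnegative, descending vector (e.g. singular
    values) onto the l1-ball of radius tau."""
    if v.sum() <= tau:
        return v
    css = np.cumsum(v)
    k = np.arange(1, len(v) + 1)
    rho = np.max(np.where(v - (css - tau) / k > 0)[0])
    return np.maximum(v - (css[rho] - tau) / (rho + 1), 0.0)

def project_trace_ball(Y, tau, rank=None):
    """Projection onto {X : ||X||_* <= tau}. rank=None uses a full SVD; a small
    rank is the low-rank heuristic, exact whenever the true projection has
    rank <= rank."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    if rank is not None:                          # low-rank SVD heuristic
        U, s, Vt = U[:, :rank], s[:rank], Vt[:rank]
    return U @ np.diag(project_l1(s, tau)) @ Vt

Y = np.random.default_rng(0).normal(size=(30, 20))
X = project_trace_ball(Y, tau=1.0, rank=5)
print(np.linalg.svd(X, compute_uv=False).sum())   # trace norm <= 1.0
```

A projected-gradient step is then `X = project_trace_ball(X - eta * G, tau, rank=r)`; per the abstract's local result, from a warm start this suffices once `r` covers the multiplicity of the largest singular value of the gradient.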
Knowledge is Never Enough: Towards Web Aided Deep Open World Recognition
Title | Knowledge is Never Enough: Towards Web Aided Deep Open World Recognition |
Authors | Massimiliano Mancini, Hakan Karaoguz, Elisa Ricci, Patric Jensfelt, Barbara Caputo |
Abstract | While today’s robots are able to perform sophisticated tasks, they can only act on objects they have been trained to recognize. This is a severe limitation: any robot will inevitably see new objects in unconstrained settings, and thus will always have visual knowledge gaps. However, standard visual modules are usually built on a limited set of classes and are based on the strong prior that an object must belong to one of those classes. Identifying whether an instance does not belong to the set of known categories (i.e. open set recognition), only partially tackles this problem, as a truly autonomous agent should be able not only to detect what it does not know, but also to extend dynamically its knowledge about the world. We contribute to this challenge with a deep learning architecture that can dynamically update its known classes in an end-to-end fashion. The proposed deep network, based on a deep extension of a non-parametric model, detects whether a perceived object belongs to the set of categories known by the system and learns it without the need to retrain the whole system from scratch. Annotated images about the new category can be provided by an ‘oracle’ (i.e. human supervision), or by autonomous mining of the Web. Experiments on two different databases and on a robot platform demonstrate the promise of our approach. |
Tasks | Open Set Learning |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01258v1 |
https://arxiv.org/pdf/1906.01258v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-is-never-enough-towards-web-aided |
Repo | |
Framework | |
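As a rough stand-in for the "deep extension of a non-parametric model", here is a nearest-class-mean classifier over deep features that flags unknowns with a distance threshold and grows new classes on the fly, with no retraining. The threshold, metric, and feature extractor are all assumptions; the paper's architecture is end-to-end:

```python
import numpy as np

class NearestClassMean:
    """Open-world classifier sketch: each class is a running mean of feature
    vectors; an instance farther than `threshold` from every mean is flagged
    unknown, and a new class can be added on the fly."""
    def __init__(self, threshold=2.0):
        self.means, self.counts, self.threshold = {}, {}, threshold

    def add_example(self, label, feat):
        n = self.counts.get(label, 0)
        mu = self.means.get(label, np.zeros_like(feat))
        self.means[label] = (n * mu + feat) / (n + 1)   # incremental mean
        self.counts[label] = n + 1

    def predict(self, feat):
        if not self.means:
            return "unknown"
        label, mu = min(self.means.items(),
                        key=lambda kv: np.linalg.norm(feat - kv[1]))
        return label if np.linalg.norm(feat - mu) < self.threshold else "unknown"

clf = NearestClassMean()
clf.add_example("mug", np.array([1.0, 0.0]))
print(clf.predict(np.array([0.9, 0.1])))   # 'mug'
print(clf.predict(np.array([9.0, 9.0])))   # 'unknown' -> mine the Web, add class
```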
Simplified calcium signaling cascade for synaptic plasticity
Title | Simplified calcium signaling cascade for synaptic plasticity |
Authors | Vladimir Kornijcuk, Dohun Kim, Guhyun Kim, Doo Seok Jeong |
Abstract | We propose a model for synaptic plasticity based on a calcium signaling cascade. The model simplifies the full signaling pathways from a calcium influx to the phosphorylation (potentiation) and dephosphorylation (depression) of glutamate receptors that are gated by fictive C1 and C2 catalysts, respectively. This model is based on tangible chemical reactions, including fictive catalysts, for long-term plasticity, rather than on the conceptual theories commonplace in various models, such as preset thresholds of calcium concentration. Our simplified model successfully reproduced the experimental synaptic plasticity induced by different protocols such as (i) a synchronous pairing protocol and (ii) correlated presynaptic and postsynaptic action potentials (APs). Further, the ocular dominance plasticity (or the experimental verification of the celebrated Bienenstock–Cooper–Munro theory) was reproduced by two model synapses that compete by means of back-propagating APs (bAPs). The key to this competition is synapse-specific bAPs, with reference to bAP boosting on physiological grounds. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11326v1 |
https://arxiv.org/pdf/1911.11326v1.pdf | |
PWC | https://paperswithcode.com/paper/simplified-calcium-signaling-cascade-for |
Repo | |
Framework | |
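The paper's reaction network is not reproduced here, but the balance it describes, phosphorylation versus dephosphorylation of receptors gated by two calcium-driven catalysts, can be caricatured as a two-rate kinetic equation. Every rate law and constant below is invented for illustration only:

```python
import numpy as np

def simulate_weight(ca, dt=1e-3, kp=4.0, kd=1.0):
    """Toy kinetics in the spirit of the model: a phosphorylation (potentiation)
    rate and a dephosphorylation (depression) rate, each gated by a fictive
    calcium-activated catalyst. All functional forms are assumptions."""
    w = 0.5                            # fraction of phosphorylated receptors
    c1 = lambda c: c**2 / (c**2 + 1)   # assumed C1 (potentiation) activation
    c2 = lambda c: c / (c + 1)         # assumed C2 (depression) activation
    trace = []
    for c in ca:
        w += (kp * c1(c) * (1 - w) - kd * c2(c) * w) * dt
        trace.append(w)
    return np.array(trace)

trace = simulate_weight(np.full(5000, 2.0))   # sustained calcium influx
print(trace[-1])                              # settles near the potentiated state
```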
Teaching DNNs to design fast fashion
Title | Teaching DNNs to design fast fashion |
Authors | Abhinav Ravi, Arun Patro, Vikram Garg, Anoop Kolar Rajagopal, Aruna Rajan, Rajdeep Hazra Banerjee |
Abstract | “Fast Fashion” spearheads the biggest disruption in fashion, one that has enabled the engineering of resilient supply chains that respond quickly to changing fashion trends. The conventional design process in commercial manufacturing is often fed through “trends”, or prevailing modes of dressing around the world, that indicate sudden interest in a new form of expression, cyclic patterns, and popular modes of expression for a given time frame. In this work, we propose a fully automated system to explore, detect, and finally synthesize trends in fashion into design elements by designing representative prototypes of apparel given time series signals generated from social media feeds. Our system is envisioned to be the first step in the design of Fast Fashion, where the production cycle for clothes from design inception to manufacturing is meant to be rapid and responsive to current “trends”. It also works to reduce wastage in fashion production by taking in customer feedback on sellability at the time of design generation. We also provide an interface wherein the designers can play with multiple trending styles in fashion and visualize designs as interpolations of elements of these styles. We aim to aid the creative process by generating interesting and inspiring combinations for a designer to mull over by running them through her key customers. |
Tasks | Time Series |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.12159v2 |
https://arxiv.org/pdf/1906.12159v2.pdf | |
PWC | https://paperswithcode.com/paper/teaching-dnns-to-design-fast-fashion |
Repo | |
Framework | |
A Hybrid Stochastic Optimization Framework for Stochastic Composite Nonconvex Optimization
Title | A Hybrid Stochastic Optimization Framework for Stochastic Composite Nonconvex Optimization |
Authors | Quoc Tran-Dinh, Nhan H. Pham, Dzung T. Phan, Lam M. Nguyen |
Abstract | In this paper, we introduce a new approach to develop stochastic optimization algorithms for solving stochastic composite and possibly nonconvex optimization problems. The main idea is to combine two stochastic estimators to form a new hybrid one. We first introduce our hybrid estimator and then investigate its fundamental properties to form a foundation theory for algorithmic development. Next, we apply our theory to develop several variants of stochastic gradient methods to solve both expectation and finite-sum composite optimization problems. Our first algorithm can be viewed as a variant of proximal stochastic gradient methods with a single loop, but can achieve an $\mathcal{O}(\sigma^3\varepsilon^{-1} + \sigma\varepsilon^{-3})$ complexity bound that is significantly better than the $\mathcal{O}(\sigma^2\varepsilon^{-4})$-complexity in state-of-the-art stochastic gradient methods, where $\sigma$ is the variance and $\varepsilon$ is a desired accuracy. Then, we consider two different variants of our method: adaptive step-size and double-loop schemes that have the same theoretical guarantees as in our first algorithm. We also study two mini-batch variants and develop two hybrid SARAH-SVRG algorithms to solve the finite-sum problems. In all cases, we achieve the best-known complexity bounds under standard assumptions. We test our methods on several numerical examples with real datasets and compare them with state-of-the-art methods. Our numerical experiments show that the new methods are comparable and, in many cases, outperform their competitors. |
Tasks | Stochastic Optimization |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03793v1 |
https://arxiv.org/pdf/1907.03793v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-stochastic-optimization-framework |
Repo | |
Framework | |
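The hybrid estimator at the core of the paper mixes a SARAH-style recursive difference with an independent, unbiased stochastic gradient. A NumPy sketch of the resulting single-loop method on a toy finite-sum problem; the step size and mixing weight are illustrative:

```python
import numpy as np

def hybrid_sgd(grad, x0, n_samples, beta=0.9, lr=0.05, steps=300, seed=0):
    """Single-loop hybrid method: v_t is a convex combination of a SARAH-style
    recursive difference (sample i) and a plain stochastic gradient (an
    independent sample j)."""
    rng = np.random.default_rng(seed)
    x_prev = x0.copy()
    v = grad(x0, rng.integers(n_samples))        # initial estimator
    x = x0 - lr * v
    for _ in range(steps):
        i, j = rng.integers(n_samples, size=2)   # two independent samples
        v = beta * (v + grad(x, i) - grad(x_prev, i)) + (1 - beta) * grad(x, j)
        x_prev, x = x, x - lr * v
    return x

# Toy finite sum: f(x) = mean_i 0.5 * ||x - a_i||^2, minimizer = mean_i a_i.
A = np.random.default_rng(1).normal(size=(50, 3))
sol = hybrid_sgd(lambda x, i: x - A[i], np.zeros(3), n_samples=50)
print(np.linalg.norm(sol - A.mean(axis=0)))      # small residual
```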
Deep and Dense Sarcasm Detection
Title | Deep and Dense Sarcasm Detection |
Authors | Devin Pelser, Hugh Murrell |
Abstract | Recent work in automated sarcasm detection has placed a heavy focus on context and meta-data. Whilst certain utterances indeed require background knowledge and commonsense reasoning, previous works have only explored shallow models for capturing the lexical, syntactic and semantic cues present within a text. In this paper, we propose a deep, 56-layer network, implemented with dense connectivity, to model the isolated utterance and extract richer features therein. We compare our approach against recent state-of-the-art architectures which make considerable use of extrinsic information, and demonstrate competitive results whilst using only the local features of the text. Further, we provide an analysis of the dependency of prior convolution outputs in generating the final feature maps. Finally, a case study is presented, showing that our approach accurately classifies additional uses of clear sarcasm which a standard CNN misclassifies. |
Tasks | Sarcasm Detection |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07474v2 |
https://arxiv.org/pdf/1911.07474v2.pdf | |
PWC | https://paperswithcode.com/paper/dense-and-deep-sarcasm-detection |
Repo | |
Framework | |
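The architectural ingredient is dense connectivity over 1-D text features. A small PyTorch block in that style, with depth and growth rate scaled far below the paper's 56 layers:

```python
import torch
import torch.nn as nn

class DenseBlock1d(nn.Module):
    """DenseNet-style 1-D block: every layer receives the concatenation of all
    earlier feature maps. Depth and growth rate here are small stand-ins."""
    def __init__(self, in_ch, growth=12, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.BatchNorm1d(in_ch + k * growth),
                nn.ReLU(),
                nn.Conv1d(in_ch + k * growth, growth, kernel_size=3, padding=1),
            )
            for k in range(n_layers)
        ])

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

block = DenseBlock1d(in_ch=50)                  # e.g. 50-dim word embeddings
print(block(torch.randn(2, 50, 40)).shape)      # torch.Size([2, 98, 40])
```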
Bundle Method Sketching for Low Rank Semidefinite Programming
Title | Bundle Method Sketching for Low Rank Semidefinite Programming |
Authors | Lijun Ding, Benjamin Grimmer |
Abstract | In this paper, we show that the bundle method can be applied to solve semidefinite programming problems with a low rank solution without ever constructing a full matrix. To accomplish this, we use recent results from randomly sketching matrix optimization problems and from the analysis of bundle methods. Under strong duality and strict complementarity of the SDP, we achieve $\tilde{O}(\frac{1}{\epsilon})$ convergence rates for both the primal and the dual sequences, and the proposed algorithm outputs an $O(\sqrt{\epsilon})$-approximate solution $\hat{X}$ (measured by distance) with a low rank representation, within at most $\tilde{O}(\frac{1}{\epsilon})$ iterations. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04443v1 |
https://arxiv.org/pdf/1911.04443v1.pdf | |
PWC | https://paperswithcode.com/paper/bundle-method-sketching-for-low-rank |
Repo | |
Framework | |
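The sketching side can be illustrated with the standard randomized Nyström primitive for positive semidefinite matrices: store only the sketch $Y = X\Omega$ and reconstruct a rank-$r$ approximation, never forming the full matrix. How the bundle method maintains such a sketch across iterations is in the paper; this only shows the primitive:

```python
import numpy as np

def nystrom_reconstruct(X, r, seed=0):
    """Nystrom sketch of a PSD matrix: keep Y = X @ Omega (n x r numbers rather
    than n x n) and reconstruct a rank-r approximation from it."""
    rng = np.random.default_rng(seed)
    Omega = rng.normal(size=(X.shape[0], r))
    Y = X @ Omega                        # the only quantity that must be stored
    core = np.linalg.pinv(Omega.T @ Y)   # r x r core matrix
    return Y @ core @ Y.T

# Exactly rank-3 PSD matrix: a rank-3 sketch reconstructs it (near) exactly.
G = np.random.default_rng(2).normal(size=(100, 3))
X = G @ G.T
X_hat = nystrom_reconstruct(X, r=3)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # ~ machine precision
```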
Signal Coding and Perfect Reconstruction using Spike Trains
Title | Signal Coding and Perfect Reconstruction using Spike Trains |
Authors | Anik Chattopadhyay, Arunava Banerjee |
Abstract | In many animal sensory pathways, the transformation from external stimuli to spike trains is essentially deterministic. In this context, a new mathematical framework for coding and reconstruction, based on a biologically plausible model of the spiking neuron, is presented. The framework considers encoding of a signal through spike trains generated by an ensemble of neurons via a standard convolve-then-threshold mechanism. Neurons are distinguished by their convolution kernels and threshold values. Reconstruction is posited as a convex optimization minimizing energy. Formal conditions under which perfect reconstruction of the signal from the spike trains is possible are then identified in this setup. Finally, a stochastic gradient descent mechanism is proposed to achieve these conditions. Simulation experiments are presented to demonstrate the strength and efficacy of the framework. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1906.00092v2 |
https://arxiv.org/pdf/1906.00092v2.pdf | |
PWC | https://paperswithcode.com/paper/190600092 |
Repo | |
Framework | |
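The encoding half is straightforward to sketch: filter the signal with a neuron's kernel and emit a spike at each threshold crossing. The refractory handling and constants below are assumptions, and the reconstruction half (the convex energy minimization) is omitted:

```python
import numpy as np

def encode_spikes(signal, kernel, threshold, refractory=20):
    """Convolve-then-threshold encoder: spike whenever the filtered signal
    reaches the neuron's threshold, then stay silent for a short refractory
    window (an assumed simplification)."""
    filtered = np.convolve(signal, kernel, mode="same")
    spikes, t = [], 0
    while t < len(filtered):
        if filtered[t] >= threshold:
            spikes.append(t)
            t += refractory
        else:
            t += 1
    return np.array(spikes)

t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 5 * t)                 # 5 Hz test signal
kernel = np.exp(-np.linspace(0, 1, 50) / 0.1)      # decaying-exponential kernel
print(encode_spikes(signal, kernel, threshold=2.0))
```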
Weather Influence and Classification with Automotive Lidar Sensors
Title | Weather Influence and Classification with Automotive Lidar Sensors |
Authors | Robin Heinzler, Philipp Schindler, Jürgen Seekircher, Werner Ritter, Wilhelm Stork |
Abstract | Lidar sensors are often used in mobile robots and autonomous vehicles to complement camera, radar and ultrasonic sensors for environment perception. Typically, perception algorithms are trained to only detect moving and static objects as well as to estimate the ground, but intentionally ignore weather effects to reduce false detections. In this work, we present an in-depth analysis of automotive lidar performance under harsh weather conditions, i.e. heavy rain and dense fog. An extensive data set has been recorded for various fog and rain conditions, which forms the basis for the analysis of the point cloud under changing environmental conditions. In addition, we introduce a novel approach to detect and classify rain or fog with lidar sensors only, and achieve a mean intersection over union of 97.14% for a data set in controlled environments. The analysis of weather influences on the performance of lidar sensors and the weather detection are important steps towards improving safety levels for autonomous driving in adverse weather conditions by providing reliable information to adapt vehicle behavior. |
Tasks | Autonomous Driving, Autonomous Vehicles |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07675v1 |
https://arxiv.org/pdf/1906.07675v1.pdf | |
PWC | https://paperswithcode.com/paper/weather-influence-and-classification-with |
Repo | |
Framework | |
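For reference, the reported score is in the mean intersection-over-union family. A minimal NumPy version for point-wise weather labels (e.g. clear / rain / fog):

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """Mean intersection over union for per-point class predictions."""
    ious = []
    for c in range(n_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union:
            ious.append(inter / union)
    return np.mean(ious)

pred = np.array([0, 0, 1, 2, 2, 1])
target = np.array([0, 0, 1, 2, 1, 1])
print(mean_iou(pred, target, n_classes=3))   # (1 + 2/3 + 1/2) / 3 ~ 0.72
```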
Spectral Analysis of Kernel and Neural Embeddings: Optimization and Generalization
Title | Spectral Analysis of Kernel and Neural Embeddings: Optimization and Generalization |
Authors | Arman Rahbar, Emilio Jorge, Devdatt Dubhashi, Morteza Haghir Chehreghani |
Abstract | We extend the recent results of Arora et al. (2019) by a spectral analysis of the representations corresponding to the kernel and neural embeddings. They showed that in a simple single-layer network, the alignment of the labels to the eigenvectors of the corresponding Gram matrix determines both the convergence of the optimization during training and the generalization properties. We generalize their result to kernel and neural representations and show that these extensions improve both the optimization and generalization of the basic setup studied in Arora et al. (2019). In particular, we first extend the setup to the Gaussian kernel and its approximation by random Fourier features, as well as to the embeddings produced by two-layer networks trained on different tasks. We then study the use of more sophisticated kernels and embeddings: those designed optimally for deep neural networks and those developed for the classification task of interest given the data and the training labels, independent of any specific classification model. |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05095v2 |
https://arxiv.org/pdf/1905.05095v2.pdf | |
PWC | https://paperswithcode.com/paper/spectral-analysis-of-kernel-and-neural |
Repo | |
Framework | |
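The random Fourier feature approximation of the Gaussian kernel (Rahimi and Recht, 2007) that the paper builds on takes a few lines of NumPy; with enough features, inner products of the feature maps approach the exact kernel value:

```python
import numpy as np

def random_fourier_features(X, n_features=512, gamma=1.0, seed=0):
    """Feature map z(x) with E[z(x) @ z(y)] = exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(5, 3))
Z = random_fourier_features(X, n_features=20000)
exact = np.exp(-np.sum((X[0] - X[1]) ** 2))   # Gaussian kernel, gamma = 1
print(exact, Z[0] @ Z[1])                      # the two should be close
```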
The Role of Memory in Stochastic Optimization
Title | The Role of Memory in Stochastic Optimization |
Authors | Antonio Orvieto, Jonas Kohler, Aurelien Lucchi |
Abstract | The choice of how to retain information about past gradients dramatically affects the convergence properties of state-of-the-art stochastic optimization methods, such as Heavy-ball, Nesterov’s momentum, RMSprop and Adam. Building on this observation, we use stochastic differential equations (SDEs) to explicitly study the role of memory in gradient-based algorithms. We first derive a general continuous-time model that can incorporate arbitrary types of memory, for both deterministic and stochastic settings. We provide convergence guarantees for this SDE for weakly-quasi-convex and quadratically growing functions. We then demonstrate how to discretize this SDE to get a flexible discrete-time algorithm that can implement a broad spectrum of memories ranging from short- to long-term. Not only does this algorithm increase the degrees of freedom in algorithmic choice for practitioners, but it also comes with better stability properties than classical momentum in the convex stochastic setting. In particular, no iterate averaging is needed for convergence. Interestingly, our analysis also provides a novel interpretation of Nesterov’s momentum as stable gradient amplification and highlights a possible reason for its unstable behavior in the (convex) stochastic setting. Furthermore, we discuss the use of long term memory for second-moment estimation in adaptive methods, such as Adam and RMSprop. Finally, we provide an extensive experimental study of the effect of different types of memory in both convex and nonconvex settings. |
Tasks | Stochastic Optimization |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01678v2 |
https://arxiv.org/pdf/1907.01678v2.pdf | |
PWC | https://paperswithcode.com/paper/the-role-of-memory-in-stochastic-optimization |
Repo | |
Framework | |
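The short-to-long memory spectrum the abstract describes can be sketched as gradient descent on a weighted memory of past gradients; a single exponential-decay parameter already spans both extremes. This illustrates the idea only, not the paper's SDE or its discretization:

```python
import numpy as np

def memory_gd(grad, x0, lr=0.1, decay=0.9, steps=2000):
    """Gradient descent driven by an exponentially weighted memory of past
    gradients: m_t = decay * m_{t-1} + (1 - decay) * g_t. decay -> 0 is
    short-term memory (plain GD); decay -> 1 is long-term memory."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)
    for _ in range(steps):
        m = decay * m + (1 - decay) * grad(x)
        x = x - lr * m
    return x

# Quadratic f(x) = 0.5 * ||x||^2: both memory settings reach the minimum at 0.
print(memory_gd(lambda x: x, np.array([5.0, -3.0]), decay=0.1))
print(memory_gd(lambda x: x, np.array([5.0, -3.0]), decay=0.99))
```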