Paper Group ANR 1451
Temporal Registration in Application to In-utero MRI Time Series
Title | Temporal Registration in Application to In-utero MRI Time Series |
Authors | Ruizhi Liao, Esra A. Turk, Miaomiao Zhang, Jie Luo, Elfar Adalsteinsson, P. Ellen Grant, Polina Golland |
Abstract | We present a robust method to correct for motion in volumetric in-utero MRI time series. Time-course analysis for in-utero volumetric MRI time series often suffers from substantial and unpredictable fetal motion. Registration provides voxel correspondences between images and is commonly employed for motion correction. Current registration methods often fail when aligning images that are substantially different from a template (reference image). To achieve accurate and robust alignment, we make a Markov assumption on the nature of motion and take advantage of the temporal smoothness in the image data. Forward message passing in the corresponding hidden Markov model (HMM) yields an estimation algorithm that only has to account for relatively small motion between consecutive frames. We evaluate the utility of the temporal model in the context of in-utero MRI time series alignment by examining the accuracy of propagated segmentation label maps. Our results suggest that the proposed model captures accurately the temporal dynamics of transformations in in-utero MRI time series. |
Tasks | Time Series, Time Series Alignment |
Published | 2019-03-06 |
URL | http://arxiv.org/abs/1903.02959v1 |
PDF | http://arxiv.org/pdf/1903.02959v1.pdf |
PWC | https://paperswithcode.com/paper/temporal-registration-in-application-to-in |
Repo | |
Framework | |
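The forward message passing mentioned in the abstract above can be illustrated with a generic discrete-state HMM filter. This is only a minimal sketch under stated assumptions, not the authors' method: the paper estimates spatial transformations between MRI frames, whereas the toy below assumes a small finite state space, and the names `forward_filter`, `prior`, `transition`, and `likelihoods` are hypothetical.

```python
import numpy as np

def forward_filter(prior, transition, likelihoods):
    """Forward (filtering) pass of a discrete HMM.

    prior:       (S,) initial state distribution (hypothetical toy input)
    transition:  (S, S) matrix, transition[i, j] = P(next = j | current = i)
    likelihoods: (T, S) matrix, likelihoods[t, s] = P(observation_t | state = s)
    Returns a (T, S) array of normalized filtered beliefs.
    """
    belief = np.asarray(prior, dtype=float) * likelihoods[0]
    belief = belief / belief.sum()
    beliefs = [belief]
    for lik in likelihoods[1:]:
        predicted = transition.T @ belief   # propagate one step forward in time
        belief = predicted * lik            # weight by the new observation
        belief = belief / belief.sum()      # renormalize to a distribution
        beliefs.append(belief)
    return np.array(beliefs)

prior = np.array([0.5, 0.5])
transition = np.array([[0.9, 0.1],
                       [0.1, 0.9]])
likelihoods = np.array([[0.8, 0.2],
                        [0.8, 0.2]])
beliefs = forward_filter(prior, transition, likelihoods)
```

Each step only conditions on the previous belief, which mirrors the abstract's point that the estimator has to account only for the change between consecutive frames.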
Learning One-hidden-layer neural networks via Provable Gradient Descent with Random Initialization
Title | Learning One-hidden-layer neural networks via Provable Gradient Descent with Random Initialization |
Authors | Shuhao Xia, Yuanming Shi |
Abstract | Although deep learning has shown its powerful performance in many applications, the mathematical principles behind neural networks are still mysterious. In this paper, we consider the problem of learning a one-hidden-layer neural network with quadratic activations. We focus on the under-parameterized regime where the number of hidden units is smaller than the dimension of the inputs. We propose to solve the problem via a provable gradient-based method with random initialization. For the non-convex neural network training problem, we show that the gradient descent iterates enter a local region that enjoys strong convexity and smoothness within a few iterations, and then provably converge to a globally optimal model at a linear rate with near-optimal sample complexity. We further corroborate our theoretical findings via various experiments. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.06594v2 |
PDF | https://arxiv.org/pdf/1907.06594v2.pdf |
PWC | https://paperswithcode.com/paper/learning-one-hidden-layer-neural-networks-via |
Repo | |
Framework | |
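The setting in the abstract above (a one-hidden-layer network with quadratic activations, trained by gradient descent from a random start) can be sketched in a few lines of NumPy. This is an illustrative toy under assumptions of my own: the squared loss, the problem sizes, the step size, and the names `loss_and_grad` and `train` are not taken from the paper.

```python
import numpy as np

def loss_and_grad(W, X, y):
    """Squared loss for the model f(x) = sum_j (w_j . x)^2 and its gradient in W."""
    Z = X @ W.T                               # (n, k) hidden pre-activations
    residual = (Z ** 2).sum(axis=1) - y       # (n,) prediction errors
    loss = 0.25 * np.mean(residual ** 2)
    grad = (Z * residual[:, None]).T @ X / len(y)   # (k, d)
    return loss, grad

def train(X, y, k, lr=0.01, steps=500, seed=0):
    """Plain gradient descent from a small random initialization."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((k, X.shape[1]))
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(W, X, y)
        losses.append(loss)
        W -= lr * grad
    return W, losses

# Synthetic data with k < d hidden units (assumed sizes, for illustration only).
rng = np.random.default_rng(1)
d, k, n = 10, 3, 500
X = rng.standard_normal((n, d))
W_true = rng.standard_normal((k, d))
W_true /= np.linalg.norm(W_true, axis=1, keepdims=True)
y = ((X @ W_true.T) ** 2).sum(axis=1)

W_hat, losses = train(X, y, k)
```

Plotting `losses` on a log scale is one way to look for the two phases the abstract describes, a short initial phase followed by fast local convergence, although this toy makes no claim about reproducing the paper's rates.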
Distributed Machine Learning on Mobile Devices: A Survey
Title | Distributed Machine Learning on Mobile Devices: A Survey |
Authors | Renjie Gu, Shuo Yang, Fan Wu |
Abstract | In recent years, mobile devices have developed rapidly, gaining stronger computation capability and larger storage. Some of the computation-intensive machine learning and deep learning tasks can now be run on mobile devices. To take advantage of the resources available on mobile devices and preserve users’ privacy, the idea of mobile distributed machine learning has been proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and only uploads computation results instead of original data to contribute to the optimization of the global model. This architecture can not only relieve computation and storage burden on servers, but also protect the users’ sensitive information. Another benefit is the bandwidth reduction, as various kinds of local data can now participate in the training process without being uploaded to the server. In this paper, we provide a comprehensive survey on recent studies of mobile distributed machine learning. We survey a number of widely-used mobile distributed machine learning methods. We also present an in-depth discussion on the challenges and future directions in this area. We believe that this survey can provide a clear overview of mobile distributed machine learning and offer guidelines on applying mobile distributed machine learning to real applications. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08329v1 |
PDF | https://arxiv.org/pdf/1909.08329v1.pdf |
PWC | https://paperswithcode.com/paper/distributed-machine-learning-on-mobile |
Repo | |
Framework | |
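The idea in the abstract above, where devices upload computation results rather than raw data and the server folds them into a global model, can be sketched with a sample-size-weighted average of client parameters. The abstract does not name a specific aggregation rule; the sketch below assumes a federated-averaging-style update, and `federated_average`, `client_weights`, and `client_sizes` are hypothetical names.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Combine locally trained parameter vectors into one global vector.

    client_weights: list of 1-D parameter vectors, one per device
    client_sizes:   number of local training samples on each device
    Devices with more local data contribute proportionally more.
    """
    stacked = np.stack([np.asarray(w, dtype=float) for w in client_weights])
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(stacked, axis=0, weights=sizes)

# Two devices: only their parameter vectors leave the device, never the data.
global_weights = federated_average([[1.0, 1.0], [4.0, 4.0]], [1, 3])
```

Only the parameter vectors and the sample counts cross the network here, which is the bandwidth and privacy argument the abstract makes.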
Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators
Title | Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators |
Authors | Adamantios Ntakaris, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis |
Abstract | Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and test their validity on short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean squares, and linear discriminant analysis. We also build a new quantitative feature based on adaptive logistic regression for online learning, which is constantly selected first among the majority of the proposed feature selection methods. This study examines the best combination of features using high frequency limit order book data from Nasdaq Nordic. Our results suggest that sorting methods and classifiers can be used in such a way that one can reach the best performance with a combination of only very few advanced hand-crafted features. |
Tasks | Feature Selection, Stock Price Prediction |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.09452v1 |
PDF | https://arxiv.org/pdf/1907.09452v1.pdf |
PWC | https://paperswithcode.com/paper/mid-price-prediction-based-on-machine |
Repo | |
Framework | |
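For readers unfamiliar with the prediction target in the abstract above, the mid-price of a limit order book is conventionally the average of the best bid and best ask. The sketch below shows that definition plus one possible way to turn it into a movement label; the relative-change threshold and the names `mid_price` and `movement_label` are assumptions for illustration, not the labeling scheme used in the paper.

```python
def mid_price(best_bid, best_ask):
    """Average of the best bid and best ask prices."""
    return (best_bid + best_ask) / 2.0

def movement_label(mid_now, mid_future, threshold=0.0):
    """Label the relative mid-price change as 'up', 'down', or 'stationary'.

    threshold is a hypothetical tolerance on the relative change.
    """
    change = (mid_future - mid_now) / mid_now
    if change > threshold:
        return "up"
    if change < -threshold:
        return "down"
    return "stationary"

current = mid_price(99.0, 101.0)
label = movement_label(current, mid_price(99.5, 101.5), threshold=0.001)
```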
Spherical U-Net on Cortical Surfaces: Methods and Applications
Title | Spherical U-Net on Cortical Surfaces: Methods and Applications |
Authors | Fenqiang Zhao, Shunren Xia, Zhengwang Wu, Dingna Duan, Li Wang, Weili Lin, John H Gilmore, Dinggang Shen, Gang Li |
Abstract | Convolutional Neural Networks (CNNs) have been providing the state-of-the-art performance for learning-related problems involving 2D/3D images in Euclidean space. However, unlike in the Euclidean space, the shapes of many structures in medical imaging have a spherical topology in a manifold space, e.g., brain cortical or subcortical surfaces represented by triangular meshes, with large inter-subject and intra-subject variations in vertex number and local connectivity. Hence, there is no consistent neighborhood definition and thus no straightforward convolution/transposed convolution operations for cortical/subcortical surface data. In this paper, by leveraging the regular and consistent geometric structure of the resampled cortical surface mapped onto the spherical space, we propose a novel convolution filter analogous to the standard convolution on the image grid. Accordingly, we develop corresponding operations for convolution, pooling, and transposed convolution for spherical surface data and thus construct spherical CNNs. Specifically, we propose the Spherical U-Net architecture by replacing all operations in the standard U-Net with their spherical operation counterparts. We then apply the Spherical U-Net to two challenging and neuroscientifically important tasks in infant brains: cortical surface parcellation and cortical attribute map development prediction. Both applications demonstrate the competitive accuracy, computational efficiency, and effectiveness of our proposed Spherical U-Net in comparison with the state-of-the-art methods. |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00906v1 |
PDF | http://arxiv.org/pdf/1904.00906v1.pdf |
PWC | https://paperswithcode.com/paper/spherical-u-net-on-cortical-surfaces-methods |
Repo | |
Framework | |
Visualizing and Measuring the Geometry of BERT
Title | Visualizing and Measuring the Geometry of BERT |
Authors | Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, Martin Wattenberg |
Abstract | Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces. We find evidence of a fine-grained geometric representation of word senses. We also present empirical descriptions of syntactic representations in both attention matrices and individual word embeddings, as well as a mathematical argument to explain the geometry of these representations. |
Tasks | Word Embeddings |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02715v2 |
PDF | https://arxiv.org/pdf/1906.02715v2.pdf |
PWC | https://paperswithcode.com/paper/visualizing-and-measuring-the-geometry-of |
Repo | |
Framework | |
Hybrid symbiotic organisms search feedforward neural network model for stock price prediction
Title | Hybrid symbiotic organisms search feedforward neural network model for stock price prediction |
Authors | Bradley J. Pillay, Absalom E. Ezugwu |
Abstract | The prediction of stock prices is an important task in economics, investment and financial decision-making. For several decades, it has spurred the interest of many researchers to design stock price predictive models. In this paper, the symbiotic organisms search algorithm, a new metaheuristic algorithm, is employed as an efficient method for training feedforward neural networks (FFNN). The training process is used to build a better stock price predictive model. The Straits Times Index, Nikkei 225, NASDAQ Composite, S&P 500, and Dow Jones Industrial Average indices were utilized as time series data sets for training and testing the proposed predictive model. Three evaluation methods, namely Root Mean Squared Error, Mean Absolute Percentage Error and Mean Absolute Deviation, are used to compare the results of the implemented model. The computational results obtained revealed that the hybrid Symbiotic Organisms Search Algorithm exhibited outstanding predictive performance when compared to the hybrid Particle Swarm Optimization, Genetic Algorithm, and ARIMA based models. The new model is a promising predictive technique for solving high dimensional nonlinear time series data that are difficult to capture by traditional models. |
Tasks | Decision Making, Stock Price Prediction, Time Series |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.10121v2 |
PDF | https://arxiv.org/pdf/1906.10121v2.pdf |
PWC | https://paperswithcode.com/paper/hybrid-symbiotic-organisms-search-feedforward |
Repo | |
Framework | |
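The three evaluation measures named in the abstract above have standard textbook definitions, sketched below in plain Python. The function names are hypothetical, and "Mean Absolute Deviation" is read here in its forecasting sense (the mean of the absolute errors), which is an assumption about the paper's usage.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def mad(actual, predicted):
    """Mean Absolute Deviation of the forecast errors."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

actual = [100.0, 200.0]
predicted = [110.0, 190.0]
scores = (rmse(actual, predicted), mape(actual, predicted), mad(actual, predicted))
```

For this toy pair the absolute errors are both 10, so RMSE and MAD coincide at 10 while MAPE averages 10% and 5% to 7.5%.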
Label-efficient audio classification through multitask learning and self-supervision
Title | Label-efficient audio classification through multitask learning and self-supervision |
Authors | Tyler Lee, Ting Gong, Suchismita Padhy, Andrew Rouditchenko, Anthony Ndirango |
Abstract | While deep learning has been incredibly successful in modeling tasks with large, carefully curated labeled datasets, its application to problems with limited labeled data remains a challenge. The aim of the present work is to improve the label efficiency of large neural networks operating on audio data through a combination of multitask learning and self-supervised learning on unlabeled data. We trained an end-to-end audio feature extractor based on WaveNet that feeds into simple, yet versatile task-specific neural networks. We describe several easily implemented self-supervised learning tasks that can operate on any large, unlabeled audio corpus. We demonstrate that, in scenarios with limited labeled training data, one can significantly improve the performance of three different supervised classification tasks individually by up to 6% through simultaneous training with these additional self-supervised tasks. We also show that incorporating data augmentation into our multitask setting leads to even further gains in performance. |
Tasks | Audio Classification, Data Augmentation |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.12587v1 |
PDF | https://arxiv.org/pdf/1910.12587v1.pdf |
PWC | https://paperswithcode.com/paper/label-efficient-audio-classification-through |
Repo | |
Framework | |
LaTeS: Latent Space Distillation for Teacher-Student Driving Policy Learning
Title | LaTeS: Latent Space Distillation for Teacher-Student Driving Policy Learning |
Authors | Albert Zhao, Tong He, Yitao Liang, Haibin Huang, Guy Van den Broeck, Stefano Soatto |
Abstract | We describe a policy learning approach to map visual inputs to driving controls that leverages side information on semantics and affordances of objects in the scene from a secondary teacher model. While the teacher receives semantic segmentation and stop “intention” values as inputs and produces an estimate of the driving controls, the primary student model only receives images as inputs, and attempts to imitate the controls while being biased towards the latent representation of the teacher model. The latent representation encodes task-relevant information in the inputs of the teacher model, which are semantic segmentation of the image, and intention values for driving controls in the presence of objects in the scene such as vehicles, pedestrians and traffic lights. Our student model does not attempt to infer semantic segmentation or intention values from its inputs, nor to mimic the output behavior of the teacher. It instead attempts to capture the representation of the teacher inputs that are relevant for driving. Our training does not require laborious annotations such as maps or objects in three dimensions; even the teacher model just requires two-dimensional segmentation and intention values. Moreover, our model runs in real time at 59 FPS. We test our approach on recent simulated and real-world driving datasets, and introduce a more challenging but realistic evaluation protocol that considers a run that reaches the destination successful only if it does not violate common traffic rules. |
Tasks | Semantic Segmentation |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.02973v1 |
PDF | https://arxiv.org/pdf/1912.02973v1.pdf |
PWC | https://paperswithcode.com/paper/lates-latent-space-distillation-for-teacher |
Repo | |
Framework | |
Particle Filter Recurrent Neural Networks
Title | Particle Filter Recurrent Neural Networks |
Authors | Xiao Ma, Peter Karkus, David Hsu, Wee Sun Lee |
Abstract | Recurrent neural networks (RNNs) have been extraordinarily successful for prediction with sequential data. To tackle highly variable and noisy real-world data, we introduce Particle Filter Recurrent Neural Networks (PF-RNNs), a new RNN family that explicitly models uncertainty in its internal structure: while an RNN relies on a long, deterministic latent state vector, a PF-RNN maintains a latent state distribution, approximated as a set of particles. For effective learning, we provide a fully differentiable particle filter algorithm that updates the PF-RNN latent state distribution according to the Bayes rule. Experiments demonstrate that the proposed PF-RNNs outperform the corresponding standard gated RNNs on a synthetic robot localization dataset and 10 real-world sequence prediction datasets for text classification, stock price prediction, etc. |
Tasks | Stock Price Prediction, Text Classification |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12885v2 |
PDF | https://arxiv.org/pdf/1905.12885v2.pdf |
PWC | https://paperswithcode.com/paper/particle-filter-recurrent-neural-networks |
Repo | |
Framework | |
Zero-Shot Audio Classification Based on Class Label Embeddings
Title | Zero-Shot Audio Classification Based on Class Label Embeddings |
Authors | Huang Xie, Tuomas Virtanen |
Abstract | This paper proposes a zero-shot learning approach for audio classification based on the textual information about class labels without any audio samples from target classes. We propose an audio classification system built on the bilinear model, which takes audio feature embeddings and semantic class label embeddings as input, and measures the compatibility between an audio feature embedding and a class label embedding. We use VGGish to extract audio feature embeddings from audio recordings. We treat textual labels as semantic side information of audio classes, and use Word2Vec to generate class label embeddings. Results on the ESC-50 dataset show that the proposed system can perform zero-shot audio classification with a small training dataset. It can achieve accuracy (26% on average) better than random guess (10%) on each audio category. Particularly, it reaches up to 39.7% for the category of natural audio classes. |
Tasks | Audio Classification, Zero-Shot Learning |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.01926v2 |
PDF | https://arxiv.org/pdf/1905.01926v2.pdf |
PWC | https://paperswithcode.com/paper/zero-shot-audio-classification-based-on-class |
Repo | |
Framework | |
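The bilinear compatibility model described in the abstract above scores an audio embedding against a class label embedding and picks the best-scoring class. A minimal sketch follows, assuming the common form score = aᵀ W c; the toy embeddings stand in for VGGish and Word2Vec outputs, the matrix `W` would normally be learned, and the names `compatibility` and `predict` are hypothetical.

```python
import numpy as np

def compatibility(audio_emb, W, label_emb):
    """Bilinear score a^T W c between one audio and one label embedding."""
    return float(audio_emb @ W @ label_emb)

def predict(audio_emb, W, label_embs):
    """Score the audio embedding against every class and return the best index.

    label_embs has one row per candidate class, including classes that
    contributed no audio at training time (the zero-shot case).
    """
    scores = label_embs @ (W.T @ audio_emb)
    return int(np.argmax(scores)), scores

audio_emb = np.array([1.0, 0.0])           # stand-in for an audio embedding
W = np.eye(2)                              # stand-in for a learned matrix
label_embs = np.array([[0.0, 1.0],         # class 0
                       [1.0, 0.0]])        # class 1
best, scores = predict(audio_emb, W, label_embs)
```

Because an unseen class only needs a label embedding, new rows can be added to `label_embs` at test time without retraining `W`.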
Reinforcement Learning for Mean Field Game
Title | Reinforcement Learning for Mean Field Game |
Authors | Mridul Agarwal, Vaneet Aggarwal, Arnob Ghosh, Nilay Tiwari |
Abstract | Stochastic games provide a framework for interactions among multiple agents and enable a myriad of applications. In these games, agents decide on actions simultaneously, the state of every agent moves to the next state, and each agent receives a reward. However, finding an equilibrium (if one exists) in this game is often difficult when the number of agents becomes large. This paper focuses on finding a mean-field equilibrium (MFE) in an action coupled stochastic game setting in an episodic framework. It is assumed that the impact of the other agents can be captured by the empirical distribution of the mean of the actions. All agents know the action distribution and employ lower-myopic best response dynamics to choose the optimal oblivious strategy. This paper proposes a posterior sampling based approach for reinforcement learning in the mean-field game, where each agent samples a transition probability from the previous transitions. We show that the policy and action distributions converge to the optimal oblivious strategy and the limiting distribution, respectively, which constitute an MFE. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13357v2 |
PDF | https://arxiv.org/pdf/1905.13357v2.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-mean-field-game |
Repo | |
Framework | |
Learning with Noisy Labels for Sentence-level Sentiment Classification
Title | Learning with Noisy Labels for Sentence-level Sentiment Classification |
Authors | Hao Wang, Bing Liu, Chaozhuo Li, Yan Yang, Tianrui Li |
Abstract | Deep neural networks (DNNs) can fit (or even over-fit) the training data very well. If a DNN model is trained using data with noisy labels and tested on data with clean labels, the model may perform poorly. This paper studies the problem of learning with noisy labels for sentence-level sentiment classification. We propose a novel DNN model called NetAb (as shorthand for convolutional neural Networks with Ab-networks) to handle noisy labels during training. NetAb consists of two convolutional neural networks, one with a noise transition layer for dealing with the input noisy labels and the other for predicting ‘clean’ labels. We train the two networks using their respective loss functions in a mutual reinforcement manner. Experimental results demonstrate the effectiveness of the proposed model. |
Tasks | Sentiment Analysis |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00124v1 |
PDF | https://arxiv.org/pdf/1909.00124v1.pdf |
PWC | https://paperswithcode.com/paper/learning-with-noisy-labels-for-sentence-level |
Repo | |
Framework | |
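The noise transition layer mentioned in the abstract above can be pictured as a row-stochastic matrix that maps a distribution over clean labels to a distribution over noisy labels. The sketch below shows only that generic mapping; it is not the NetAb architecture, and the matrix values and the name `noisy_label_probs` are assumptions for illustration.

```python
import numpy as np

def noisy_label_probs(clean_probs, transition):
    """Map clean-label probabilities to noisy-label probabilities.

    transition[i, j] = P(noisy label = j | clean label = i); rows sum to 1.
    """
    return clean_probs @ transition

clean_probs = np.array([0.9, 0.1])          # e.g. positive vs. negative sentiment
transition = np.array([[0.8, 0.2],
                       [0.3, 0.7]])
noisy_probs = noisy_label_probs(clean_probs, transition)
```

Training against the noisy labels through such a layer lets the underlying classifier keep predicting clean-label probabilities, which is the general motivation for transition layers.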
Non-linear Multitask Learning with Deep Gaussian Processes
Title | Non-linear Multitask Learning with Deep Gaussian Processes |
Authors | Ayman Boustati, Theodoros Damoulas, Richard S. Savage |
Abstract | We present a multi-task learning formulation for Deep Gaussian processes (DGPs), through non-linear mixtures of latent processes. The latent space is composed of private processes that capture within-task information and shared processes that capture across-task dependencies. We propose two different methods for segmenting the latent space: through hard coding shared and task-specific processes or through soft sharing with Automatic Relevance Determination kernels. We show that our formulation is able to improve the learning performance and transfer information between the tasks, outperforming other probabilistic multi-task learning models across real-world and benchmarking settings. |
Tasks | Gaussian Processes, Multi-Task Learning |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12407v2 |
PDF | https://arxiv.org/pdf/1905.12407v2.pdf |
PWC | https://paperswithcode.com/paper/multi-task-learning-in-deep-gaussian |
Repo | |
Framework | |
DARC: Differentiable ARchitecture Compression
Title | DARC: Differentiable ARchitecture Compression |
Authors | Shashank Singh, Ashish Khetan, Zohar Karnin |
Abstract | In many learning situations, resources at inference time are significantly more constrained than resources at training time. This paper studies a general paradigm, called Differentiable ARchitecture Compression (DARC), that combines model compression and architecture search to learn models that are resource-efficient at inference time. Given a resource-intensive base architecture, DARC utilizes the training data to learn which sub-components can be replaced by cheaper alternatives. The high-level technique can be applied to any neural architecture, and we report experiments on state-of-the-art convolutional neural networks for image classification. For a WideResNet with $97.2\%$ accuracy on CIFAR-10, we improve single-sample inference speed by $2.28\times$ and memory footprint by $5.64\times$, with no accuracy loss. For a ResNet with $79.15\%$ Top1 accuracy on ImageNet, we improve batch inference speed by $1.29\times$ and memory footprint by $3.57\times$ with $1\%$ accuracy loss. We also give theoretical Rademacher complexity bounds in simplified cases, showing how DARC avoids overfitting despite over-parameterization. |
Tasks | Image Classification, Model Compression, Neural Architecture Search |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08170v1 |
PDF | https://arxiv.org/pdf/1905.08170v1.pdf |
PWC | https://paperswithcode.com/paper/darc-differentiable-architecture-compression |
Repo | |
Framework | |