July 26, 2019


Paper Group ANR 771


Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods


Title	Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods
Authors	Cheng Ju, Mary Combs, Samuel D Lendle, Jessica M Franklin, Richard Wyss, Sebastian Schneeweiss, Mark J. van der Laan
Abstract	The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a “library” of candidate prediction models. The SL is not restricted to a single prediction model, but uses the strengths of a variety of learning algorithms to adapt to different databases. While the SL has been shown to perform well in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied and evaluated the performance of the SL in its ability to predict treatment assignment using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also considered a novel strategy for prediction modeling that combines the SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.
Tasks
Published	2017-03-07
URL	http://arxiv.org/abs/1703.02236v2
PDF	http://arxiv.org/pdf/1703.02236v2.pdf
PWC	https://paperswithcode.com/paper/propensity-score-prediction-for-electronic
Repo
Framework
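
The cross-validated selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two candidate learners (`fit_marginal`, `fit_threshold`), the toy data, and the function `cv_select` are all hypothetical, and the sketch covers only the selector that picks the single best candidate by cross-validated negative log-likelihood. It does not include the hdPS variable selection step or any weighted combination of learners.

```python
import math

def neg_log_lik(y_true, p_pred, eps=1e-12):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def fit_marginal(xs, ys):
    """Hypothetical candidate 1: ignore the covariate, predict the overall treated rate."""
    rate = sum(ys) / len(ys)
    return lambda x: rate

def fit_threshold(xs, ys):
    """Hypothetical candidate 2: split at the training median, predict a rate per side."""
    cut = sorted(xs)[len(xs) // 2]
    def smoothed_rate(labels):
        return (sum(labels) + 1) / (len(labels) + 2)  # Laplace smoothing
    low = smoothed_rate([y for x, y in zip(xs, ys) if x < cut])
    high = smoothed_rate([y for x, y in zip(xs, ys) if x >= cut])
    return lambda x: high if x >= cut else low

def cv_select(xs, ys, library, k=5):
    """Return the candidate with the lowest k-fold cross-validated loss, plus all scores."""
    n = len(xs)
    folds = [list(range(start, n, k)) for start in range(k)]
    scores = {}
    for name, fit in library.items():
        losses = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(n) if i not in held]
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            preds = [model(xs[i]) for i in held_out]
            losses.append(neg_log_lik([ys[i] for i in held_out], preds))
        scores[name] = sum(losses) / k
    return min(scores, key=scores.get), scores

# Toy data (an assumption): treatment mostly follows x >= 0.5, with every 7th label flipped.
xs = [i / 200 for i in range(200)]
ys = [(1 if i >= 100 else 0) ^ (1 if i % 7 == 3 else 0) for i in range(200)]
library = {"marginal": fit_marginal, "threshold": fit_threshold}
best, scores = cv_select(xs, ys, library)
print(best, scores)
```

Because the candidates are scored only on held-out folds, the selector favors whichever learner generalizes best on the data at hand, which is the adaptivity the abstract refers to.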

On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms


Title	On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms
Authors	Nicolas Garcia Trillos, Zachary Kaplan, Thabo Samakhoana, Daniel Sanz-Alonso
Abstract	A popular approach to semi-supervised learning proceeds by endowing the input data with a graph structure in order to extract geometric information and incorporate it into a Bayesian framework. We introduce new theory that gives appropriate scalings of graph parameters that provably lead to a well-defined limiting posterior as the size of the unlabeled data set grows. Furthermore, we show that these consistency results have profound algorithmic implications. When consistency holds, carefully designed graph-based Markov chain Monte Carlo algorithms are proved to have a uniform spectral gap, independent of the number of unlabeled inputs. Several numerical experiments corroborate both the statistical consistency and the algorithmic scalability established by the theory.
Tasks
Published	2017-10-20
URL	https://arxiv.org/abs/1710.07702v2
PDF	https://arxiv.org/pdf/1710.07702v2.pdf
PWC	https://paperswithcode.com/paper/on-the-consistency-of-graph-based-bayesian
Repo
Framework

Actively Learning what makes a Discrete Sequence Valid


Title	Actively Learning what makes a Discrete Sequence Valid
Authors	David Janz, Jos van der Westhuizen, José Miguel Hernández-Lobato
Abstract	Deep learning techniques have been hugely successful for traditional supervised and unsupervised machine learning problems. In large part, these techniques solve continuous optimization problems. Recently however, discrete generative deep learning models have been successfully used to efficiently search high-dimensional discrete spaces. These methods work by representing discrete objects as sequences, for which powerful sequence-based deep models can be employed. Unfortunately, these techniques are significantly hindered by the fact that these generative models often produce invalid sequences. As a step towards solving this problem, we propose to learn a deep recurrent validator model. Given a partial sequence, our model learns the probability of that sequence occurring as the beginning of a full valid sequence. Thus this identifies valid versus invalid sequences and crucially it also provides insight about how individual sequence elements influence the validity of discrete objects. To learn this model we propose an approach inspired by seminal work in Bayesian active learning. On a synthetic dataset, we demonstrate the ability of our model to distinguish valid and invalid sequences. We believe this is a key step toward learning generative models that faithfully produce valid discrete objects.
Tasks	Active Learning
Published	2017-08-15
URL	http://arxiv.org/abs/1708.04465v1
PDF	http://arxiv.org/pdf/1708.04465v1.pdf
PWC	https://paperswithcode.com/paper/actively-learning-what-makes-a-discrete
Repo
Framework
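
To make the target quantity concrete, the sketch below labels every prefix of a sequence as "can still be completed to a valid sequence" or not. It is only an illustration under assumptions: the grammar (balanced parentheses of length at most `max_len`) and the helpers `is_valid_prefix` and `prefix_labels` are hypothetical and hand-written, whereas the paper learns a recurrent model that estimates this quantity as a probability.

```python
def is_valid_prefix(prefix, max_len):
    """True if `prefix` can be extended to a balanced string of length <= max_len.

    Assumes the alphabet is exactly "(" and ")".
    """
    depth = 0
    for ch in prefix:
        depth += 1 if ch == "(" else -1
        if depth < 0:  # a ")" with no matching "(" can never be repaired
            return False
    # Exactly `depth` closing symbols are still needed; they must fit in the budget.
    return max_len - len(prefix) >= depth

def prefix_labels(sequence, max_len):
    """Validity label for each non-empty prefix, showing where a sequence goes wrong."""
    return [is_valid_prefix(sequence[:i], max_len) for i in range(1, len(sequence) + 1)]

print(prefix_labels("())(", 4))
```

The per-position labels show which element first makes a sequence unrecoverable, the kind of insight into individual sequence elements that the abstract mentions.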

Vision-based Detection of Acoustic Timed Events: a Case Study on Clarinet Note Onsets


Title	Vision-based Detection of Acoustic Timed Events: a Case Study on Clarinet Note Onsets
Authors	A. Bazzica, J. C. van Gemert, C. C. S. Liem, A. Hanjalic
Abstract	Acoustic events often have a visual counterpart. Knowledge of visual information can aid the understanding of complex auditory scenes, even when only a stereo mixdown is available in the audio domain, e.g., identifying which musicians are playing in large musical ensembles. In this paper, we consider a vision-based approach to note onset detection. As a case study we focus on challenging, real-world clarinetist videos and carry out preliminary experiments on a 3D convolutional neural network based on multiple streams and purposely avoiding temporal pooling. We release an audiovisual dataset with 4.5 hours of clarinetist videos together with cleaned annotations which include about 36,000 onsets and the coordinates for a number of salient points and regions of interest. By performing several training trials on our dataset, we learned that the problem is challenging. We found that the CNN model is highly sensitive to the optimization algorithm and hyper-parameters, and that treating the problem as binary classification may prevent the joint optimization of precision and recall. To encourage further research, we publicly share our dataset, annotations and all models and detail which issues we came across during our preliminary experiments.
Tasks
Published	2017-06-29
URL	http://arxiv.org/abs/1706.09556v1
PDF	http://arxiv.org/pdf/1706.09556v1.pdf
PWC	https://paperswithcode.com/paper/vision-based-detection-of-acoustic-timed
Repo
Framework

Let Features Decide for Themselves: Feature Mask Network for Person Re-identification


Title	Let Features Decide for Themselves: Feature Mask Network for Person Re-identification
Authors	Guodong Ding, Salman Khan, Zhenmin Tang, Fatih Porikli
Abstract	Person re-identification aims at establishing the identity of a pedestrian from a gallery that contains images of multiple people obtained from a multi-camera system. Many challenges such as occlusions, drastic lighting and pose variations across the camera views, indiscriminate visual appearances, cluttered backgrounds, imperfect detections, motion blur, and noise make this task highly challenging. While most approaches focus on learning features and metrics to derive better representations, we hypothesize that both local and global contextual cues are crucial for an accurate identity matching. To this end, we propose a Feature Mask Network (FMN) that takes advantage of ResNet high-level features to predict a feature map mask and then imposes it on the low-level features to dynamically reweight different object parts for a locally aware feature representation. This serves as an effective attention mechanism by allowing the network to focus on local details selectively. Given the resemblance of person re-identification with classification and retrieval tasks, we frame the network training as a multi-task objective optimization, which further improves the learned feature descriptions. We conduct experiments on Market-1501, DukeMTMC-reID and CUHK03 datasets, where the proposed approach respectively achieves significant improvements of 5.3%, 9.1% and 10.7% in mAP measure relative to the state-of-the-art.
Tasks	Person Re-Identification
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07155v1
PDF	http://arxiv.org/pdf/1711.07155v1.pdf
PWC	https://paperswithcode.com/paper/let-features-decide-for-themselves-feature
Repo
Framework

Anticipating Daily Intention using On-Wrist Motion Triggered Sensing


Title	Anticipating Daily Intention using On-Wrist Motion Triggered Sensing
Authors	Tz-Ying Wu, Ting-An Chien, Cheng-Sheng Chan, Chan-Wei Hu, Min Sun
Abstract	Anticipating human intention by observing one’s actions has many applications. For instance, picking up a cellphone, then a charger (actions) implies that one wants to charge the cellphone (intention). By anticipating the intention, an intelligent system can guide the user to the closest power outlet. We propose an on-wrist motion triggered sensing system for anticipating daily intentions, where the on-wrist sensors help us to persistently observe one’s actions. The core of the system is a novel Recurrent Neural Network (RNN) and Policy Network (PN), where the RNN encodes visual and motion observation to anticipate intention, and the PN parsimoniously triggers the process of visual observation to reduce computation requirement. We jointly trained the whole network using policy gradient and cross-entropy loss. To evaluate, we collect the first daily “intention” dataset consisting of 2379 videos with 34 intentions and 164 unique action sequences. Our method achieves 92.68%, 90.85%, 97.56% accuracy on three users while processing only 29% of the visual observation on average.
Tasks
Published	2017-10-20
URL	http://arxiv.org/abs/1710.07477v1
PDF	http://arxiv.org/pdf/1710.07477v1.pdf
PWC	https://paperswithcode.com/paper/anticipating-daily-intention-using-on-wrist
Repo
Framework

Director Field Analysis (DFA): Exploring Local White Matter Geometric Structure in diffusion MRI


Title	Director Field Analysis (DFA): Exploring Local White Matter Geometric Structure in diffusion MRI
Authors	Jian Cheng, Peter J. Basser
Abstract	In Diffusion Tensor Imaging (DTI) or High Angular Resolution Diffusion Imaging (HARDI), a tensor field or a spherical function field (e.g., an orientation distribution function field), can be estimated from measured diffusion weighted images. In this paper, inspired by the microscopic theoretical treatment of phases in liquid crystals, we introduce a novel mathematical framework, called Director Field Analysis (DFA), to study local geometric structural information of white matter based on the reconstructed tensor field or spherical function field: 1) We propose a set of mathematical tools to process general director data, which consists of dyadic tensors that have orientations but no direction. 2) We propose Orientational Order (OO) and Orientational Dispersion (OD) indices to describe the degree of alignment and dispersion of a spherical function in a single voxel or in a region, respectively; 3) We also show how to construct a local orthogonal coordinate frame in each voxel exhibiting anisotropic diffusion; 4) Finally, we define three indices to describe three types of orientational distortion (splay, bend, and twist) in a local spatial neighborhood, and a total distortion index to describe distortions of all three types. To our knowledge, this is the first work to quantitatively describe orientational distortion (splay, bend, and twist) in general spherical function fields from DTI or HARDI data. The proposed DFA and its related mathematical tools can be used to process not only diffusion MRI data but also general director field data, and the proposed scalar indices are useful for detecting local geometric changes of white matter for voxel-based or tract-based analysis in both DTI and HARDI acquisitions. The related codes and a tutorial for DFA will be released in DMRITool.
Tasks
Published	2017-06-06
URL	http://arxiv.org/abs/1706.01862v2
PDF	http://arxiv.org/pdf/1706.01862v2.pdf
PWC	https://paperswithcode.com/paper/director-field-analysis-dfa-exploring-local
Repo
Framework

Minimax Lower Bounds for Ridge Combinations Including Neural Nets


Title	Minimax Lower Bounds for Ridge Combinations Including Neural Nets
Authors	Jason M. Klusowski, Andrew R. Barron
Abstract	Estimation of functions of $ d $ variables is considered using ridge combinations of the form $ \textstyle\sum_{k=1}^m c_{1,k} \phi(\textstyle\sum_{j=1}^d c_{0,j,k}x_j-b_k) $ where the activation function $ \phi $ is a function with bounded value and derivative. These include single-hidden layer neural networks, polynomials, and sinusoidal models. From a sample of size $ n $ of possibly noisy values at random sites $ X \in B = [-1,1]^d $, the minimax mean square error is examined for functions in the closure of the $ \ell_1 $ hull of ridge functions with activation $ \phi $. It is shown to be of order $ d/n $ to a fractional power (when $ d $ is of smaller order than $ n $), and to be of order $ (\log d)/n $ to a fractional power (when $ d $ is of larger order than $ n $). Dependence on constraints $ v_0 $ and $ v_1 $ on the $ \ell_1 $ norms of inner parameter $ c_0 $ and outer parameter $ c_1 $, respectively, is also examined. Also, lower and upper bounds on the fractional power are given. The heart of the analysis is development of information-theoretic packing numbers for these classes of functions.
Tasks
Published	2017-02-09
URL	http://arxiv.org/abs/1702.02828v1
PDF	http://arxiv.org/pdf/1702.02828v1.pdf
PWC	https://paperswithcode.com/paper/minimax-lower-bounds-for-ridge-combinations
Repo
Framework
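
The ridge combination in the abstract can be evaluated directly from its formula. The sketch below is a plain transcription under assumptions: the function name `ridge_combination`, the layout `c0[j][k]`, the choice of `tanh` as an activation with bounded value and derivative, and the example numbers are all illustrative and do not come from the paper.

```python
import math

def ridge_combination(x, c0, c1, b, phi=math.tanh):
    """Evaluate sum_k c1[k] * phi(sum_j c0[j][k] * x[j] - b[k]).

    x  : list of d inputs
    c0 : d-by-m nested list of inner weights, indexed c0[j][k]
    c1 : list of m outer weights
    b  : list of m offsets
    """
    total = 0.0
    for k in range(len(c1)):
        inner = sum(c0[j][k] * x[j] for j in range(len(x))) - b[k]
        total += c1[k] * phi(inner)
    return total

# Illustrative values with d = 2 inputs and m = 2 ridge terms.
x = [1.0, 0.0]
c0 = [[1.0, 0.5], [1.0, -0.5]]
c1 = [2.0, -1.0]
b = [0.0, 0.5]
value = ridge_combination(x, c0, c1, b)
print(value)
```

Since `tanh` is bounded by 1 in absolute value, the output is bounded by the sum of the absolute outer weights, which is the quantity that the constraint $ v_1 $ on the $ \ell_1 $ norm of $ c_1 $ controls.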

From Characters to Words to in Between: Do We Capture Morphology?


Title	From Characters to Words to in Between: Do We Capture Morphology?
Authors	Clara Vania, Adam Lopez
Abstract	Words can be represented by composing the representations of subword units such as word segments, characters, and/or character n-grams. While such representations are effective and may capture the morphological regularities of words, they have not been systematically compared, and it is not understood how they interact with different morphological typologies. On a language modeling task, we present experiments that systematically vary (1) the basic unit of representation, (2) the composition of these representations, and (3) the morphological typology of the language modeled. Our results extend previous findings that character representations are effective across typologies, and we find that a previously unstudied combination of character trigram representations composed with bi-LSTMs outperforms most others. But we also find room for improvement: none of the character-level models match the predictive accuracy of a model with access to true morphological analyses, even when learned from an order of magnitude more data.
Tasks	Language Modelling
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08352v1
PDF	http://arxiv.org/pdf/1704.08352v1.pdf
PWC	https://paperswithcode.com/paper/from-characters-to-words-to-in-between-do-we
Repo
Framework

Automated flow for compressing convolution neural networks for efficient edge-computation with FPGA


Title	Automated flow for compressing convolution neural networks for efficient edge-computation with FPGA
Authors	Farhan Shafiq, Takato Yamada, Antonio T. Vilchez, Sakyasingha Dasgupta
Abstract	Deep convolutional neural networks (CNN) based solutions are the current state-of-the-art for computer vision tasks. Due to the large size of these models, they are typically run on clusters of CPUs or GPUs. However, power requirements and cost budgets can be a major hindrance in adoption of CNN for IoT applications. Recent research highlights that CNN contain significant redundancy in their structure and can be quantized to lower bit-width parameters and activations, while maintaining acceptable accuracy. Low bit-width and especially single bit-width (binary) CNN are particularly suitable for mobile applications based on FPGA implementation, due to the bitwise logic operations involved in binarized CNN. Moreover, the transition to lower bit-widths opens new avenues for performance optimizations and model improvement. In this paper, we present an automatic flow from trained TensorFlow models to FPGA system on chip implementation of binarized CNN. This flow involves quantization of model parameters and activations, generation of network and model in embedded-C, followed by automatic generation of the FPGA accelerator for binary convolutions. The automated flow is demonstrated through implementation of binarized “YOLOV2” on the low cost, low power Cyclone-V FPGA device. Experiments on object detection using binarized YOLOV2 demonstrate significant performance benefit in terms of model size and inference speed on FPGA as compared to CPU and mobile CPU platforms. Furthermore, the entire automated flow from trained models to FPGA synthesis can be completed within one hour.
Tasks	Object Detection, Quantization
Published	2017-12-18
URL	http://arxiv.org/abs/1712.06272v1
PDF	http://arxiv.org/pdf/1712.06272v1.pdf
PWC	https://paperswithcode.com/paper/automated-flow-for-compressing-convolution
Repo
Framework
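
The bitwise operations mentioned in the abstract can be illustrated with the standard identity for binarized dot products: when weights and activations are restricted to +1 and -1 and packed as bits, their dot product equals the vector length minus twice the number of mismatching bits. The sketch below is a generic illustration of that identity only. The names `binarize`, `binary_dot`, and `sign_dot` and the sample values are assumptions, and nothing here reproduces the authors' generated embedded-C code or FPGA accelerator.

```python
def binarize(values):
    """Pack the signs of `values` into an integer: bit i is 1 for +1 (value >= 0), 0 for -1."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {+1, -1} vectors using XOR and a population count."""
    mask = (1 << n) - 1
    mismatches = bin((a_bits ^ b_bits) & mask).count("1")
    return n - 2 * mismatches  # matches contribute +1, mismatches contribute -1

def sign_dot(a, b):
    """Reference implementation: multiply the signs element by element."""
    sign = lambda v: 1 if v >= 0 else -1
    return sum(sign(x) * sign(y) for x, y in zip(a, b))

weights = [0.3, -1.2, 0.7, -0.1, 2.0]
activations = [-0.5, -0.4, 1.1, 0.2, 0.9]
fast = binary_dot(binarize(weights), binarize(activations), len(weights))
slow = sign_dot(weights, activations)
print(fast, slow)
```

A multiply-accumulate loop thus collapses into one XOR and one bit count per machine word, which is the kind of logic that maps naturally onto FPGA fabric.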

Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic


Title	Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic
Authors	Alexander Erdmann, Nizar Habash, Dima Taji, Houda Bouamor
Abstract	We present the second ever evaluated Arabic dialect-to-dialect machine translation effort, and the first to leverage external resources beyond a small parallel corpus. The subject has not previously received serious attention due to lack of naturally occurring parallel data; yet its importance is evidenced by dialectal Arabic’s wide usage and breadth of inter-dialect variation, comparable to that of Romance languages. Our results suggest that modeling morphology and syntax significantly improves dialect-to-dialect translation, though optimizing such data-sparse models requires consideration of the linguistic differences between dialects and the nature of available data and resources. On a single-reference blind test set where untranslated input scores 6.5 BLEU and a model trained only on parallel data reaches 14.6, pivot techniques and morphosyntactic modeling significantly improve performance to 17.5.
Tasks	Machine Translation
Published	2017-12-18
URL	http://arxiv.org/abs/1712.06273v1
PDF	http://arxiv.org/pdf/1712.06273v1.pdf
PWC	https://paperswithcode.com/paper/low-resourced-machine-translation-via-morpho
Repo
Framework

Toward Inverse Control of Physics-Based Sound Synthesis


Title	Toward Inverse Control of Physics-Based Sound Synthesis
Authors	A. Pfalz, E. Berdahl
Abstract	Long Short-Term Memory networks (LSTMs) can be trained to realize inverse control of physics-based sound synthesizers. Physics-based sound synthesizers simulate the laws of physics to produce output sound according to input gesture signals. When a user’s gestures are measured in real time, she or he can use them to control physics-based sound synthesizers, thereby creating simulated virtual instruments. An intriguing question is how to program a computer to learn to play such physics-based models. This work demonstrates that LSTMs can be trained to accomplish this inverse control task with four physics-based sound synthesizers.
Tasks
Published	2017-06-29
URL	http://arxiv.org/abs/1706.09551v1
PDF	http://arxiv.org/pdf/1706.09551v1.pdf
PWC	https://paperswithcode.com/paper/toward-inverse-control-of-physics-based-sound
Repo
Framework

Curvature-aided Incremental Aggregated Gradient Method


Title	Curvature-aided Incremental Aggregated Gradient Method
Authors	Hoi-To Wai, Wei Shi, Angelia Nedic, Anna Scaglione
Abstract	We propose a new algorithm for finite sum optimization which we call the curvature-aided incremental aggregated gradient (CIAG) method. Motivated by the problem of training a classifier for a d-dimensional problem, where the number of training data is $m$ and $m \gg d \gg 1$, the CIAG method seeks to accelerate incremental aggregated gradient (IAG) methods using aids from the curvature (or Hessian) information, while avoiding the evaluation of matrix inverses required by the incremental Newton (IN) method. Specifically, our idea is to exploit the incrementally aggregated Hessian matrix to trace the full gradient vector at every incremental step, therefore achieving an improved linear convergence rate over the state-of-the-art IAG methods. For strongly convex problems, the fast linear convergence rate requires the objective function to be close to quadratic, or the initial point to be close to optimal solution. Importantly, we show that running one iteration of the CIAG method yields the same improvement to the optimality gap as running one iteration of the full gradient method, while the complexity is $O(d^2)$ for CIAG and $O(md)$ for the full gradient. Overall, the CIAG method strikes a balance between the high computation complexity incremental Newton-type methods and the slow IAG method. Our numerical results support the theoretical findings and show that the CIAG method often converges with much fewer iterations than IAG, and requires much shorter running time than IN when the problem dimension is high.
Tasks
Published	2017-10-24
URL	http://arxiv.org/abs/1710.08936v1
PDF	http://arxiv.org/pdf/1710.08936v1.pdf
PWC	https://paperswithcode.com/paper/curvature-aided-incremental-aggregated
Repo
Framework
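
One way to read the idea of tracing the full gradient with aggregated Hessian information is that each stale component gradient is corrected by a first-order Taylor expansion around the point where it was last evaluated. The scalar sketch below follows that reading and is an assumption, not the authors' algorithm as published: the function `ciag_scalar`, the cyclic component order, the quadratic test components, and the step size are all illustrative, and the exact update rule should be taken from the paper.

```python
def ciag_scalar(grads, hessians, theta0, step, iterations):
    """Curvature-aided incremental aggregated gradient, sketched for a scalar parameter.

    grads[i](t) and hessians[i](t) return the first and second derivative of component i.
    The full gradient at theta is approximated by
        sum_i [ g_i(t_i) + h_i(t_i) * (theta - t_i) ]  =  b + H * theta,
    where t_i is the point at which component i was last evaluated.
    """
    m = len(grads)
    stored = [theta0] * m
    b = sum(g(theta0) - h(theta0) * theta0 for g, h in zip(grads, hessians))
    H = sum(h(theta0) for h in hessians)
    theta = theta0
    for k in range(iterations):
        i = k % m  # refresh a single component per iteration (cyclic order assumed)
        old = stored[i]
        b += (grads[i](theta) - hessians[i](theta) * theta) - (grads[i](old) - hessians[i](old) * old)
        H += hessians[i](theta) - hessians[i](old)
        stored[i] = theta
        theta -= step * (b + H * theta)  # gradient step with the traced gradient
    return theta

# Toy components f_i(t) = 0.5 * a_i * (t - c_i)^2, whose joint minimizer is sum(a*c) / sum(a).
a = [1.0, 2.0, 3.0]
c = [0.0, 1.0, 2.0]
grads = [lambda t, ai=ai, ci=ci: ai * (t - ci) for ai, ci in zip(a, c)]
hessians = [lambda t, ai=ai: ai for ai in a]
theta = ciag_scalar(grads, hessians, theta0=10.0, step=0.5 / sum(a), iterations=200)
print(theta)
```

For quadratic components the Taylor correction is exact, so this toy run only demonstrates the bookkeeping. In $d$ dimensions `H` becomes a $d \times d$ matrix and the product `H * theta` costs $O(d^2)$ per iteration, consistent with the complexity quoted in the abstract.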

Atari games and Intel processors


Title	Atari games and Intel processors
Authors	Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski
Abstract	The asynchronous nature of the state-of-the-art reinforcement learning algorithms such as the Asynchronous Advantage Actor-Critic algorithm, makes them exceptionally suitable for CPU computations. However, given the fact that deep reinforcement learning often deals with interpreting visual information, a large part of the train and inference time is spent performing convolutions. In this work we present our results on learning strategies in Atari games using a Convolutional Neural Network, the Math Kernel Library and TensorFlow 0.11rc0 machine learning framework. We also analyze effects of asynchronous computations on the convergence of reinforcement learning algorithms.
Tasks	Atari Games
Published	2017-05-19
URL	http://arxiv.org/abs/1705.06936v1
PDF	http://arxiv.org/pdf/1705.06936v1.pdf
PWC	https://paperswithcode.com/paper/atari-games-and-intel-processors
Repo
Framework

Deep Neural Network Approximation using Tensor Sketching


Title	Deep Neural Network Approximation using Tensor Sketching
Authors	Shiva Prasad Kasiviswanathan, Nina Narodytska, Hongxia Jin
Abstract	Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: Given a target network architecture can we design a smaller network architecture that approximates the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever increasing storage and memory requirements of these networks pose a problem in resource constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.
Tasks
Published	2017-10-21
URL	http://arxiv.org/abs/1710.07850v1
PDF	http://arxiv.org/pdf/1710.07850v1.pdf
PWC	https://paperswithcode.com/paper/deep-neural-network-approximation-using
Repo
Framework