January 31, 2020

Paper Group ANR 1

LoRAS: An oversampling approach for imbalanced datasets

Title LoRAS: An oversampling approach for imbalanced datasets
Authors Saptarshi Bej, Narek Davtyan, Markus Wolfien, Mariam Nassar, Olaf Wolkenhauer
Abstract The Synthetic Minority Oversampling TEchnique (SMOTE) is widely used for the analysis of imbalanced datasets. It is known that SMOTE frequently over-generalizes the minority class, leading to misclassifications for the majority class and affecting the overall balance of the model. In this article, we present an approach that overcomes this limitation of SMOTE, employing Localized Random Affine Shadowsampling (LoRAS) to oversample from an approximated data manifold of the minority class. We benchmarked our algorithm on 12 publicly available imbalanced datasets using three different Machine Learning (ML) algorithms and, comparing the performance of LoRAS, SMOTE, and several SMOTE extensions, observed that LoRAS, on average, generates better ML models in terms of F1-Score and Balanced accuracy. Another key observation is that while most of the SMOTE extensions we tested improve the F1-Score with respect to SMOTE on average, they compromise on the Balanced accuracy of a classification model. LoRAS, on the contrary, improves both the F1-Score and the Balanced accuracy, and thus produces better classification models. Moreover, to explain the success of the algorithm, we have constructed a mathematical framework to prove that the LoRAS oversampling technique provides a better estimate for the mean of the underlying local data distribution of the minority class data space.
Tasks
Published 2019-08-22
URL https://arxiv.org/abs/1908.08346v3
PDF https://arxiv.org/pdf/1908.08346v3.pdf
PWC https://paperswithcode.com/paper/loras-an-oversampling-approach-for-imbalanced
Repo
Framework
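
The LoRAS abstract above describes oversampling from an approximated minority-class manifold using "shadowsamples" and random affine combinations. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the Gaussian noise scale `sigma`, and the use of normalized uniform weights (so each synthetic point is a convex combination) are all assumptions made for illustration.

```python
import random

def make_shadowsamples(neighborhood, n_shadow, sigma, rng):
    """Perturb every neighborhood point n_shadow times with Gaussian noise."""
    shadows = []
    for point in neighborhood:
        for _ in range(n_shadow):
            shadows.append([x + rng.gauss(0.0, sigma) for x in point])
    return shadows

def random_affine_combination(points, rng):
    """Combine points with random non-negative weights that sum to 1."""
    raw = [rng.random() for _ in points]
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]

def loras_like_sample(neighborhood, n_shadow=5, sigma=0.01, n_affine=3, rng=None):
    """Return one synthetic point and the shadowsamples it was built from."""
    if rng is None:
        rng = random.Random()
    shadows = make_shadowsamples(neighborhood, n_shadow, sigma, rng)
    chosen = rng.sample(shadows, n_affine)
    return random_affine_combination(chosen, rng), chosen

# Hypothetical minority-class neighborhood (a point and two of its neighbors).
rng = random.Random(0)
neighborhood = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.8]]
synthetic, chosen = loras_like_sample(neighborhood, rng=rng)
```

Because the weights are non-negative and sum to 1, each synthetic point stays inside the convex hull of the selected shadowsamples, unlike SMOTE-style interpolation along a single line segment between two points.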

Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning

Title Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning
Authors Xudong Sun, Bernd Bischl
Abstract Aiming at a comprehensive and concise tutorial survey, we recap variational inference and reinforcement learning with Probabilistic Graphical Models, giving detailed derivations. We review and compare recent advances in deep reinforcement learning from various aspects. We offer detailed derivations for a taxonomy of Probabilistic Graphical Model and Variational Inference methods in deep reinforcement learning, which serves as complementary material on top of the original contributions.
Tasks
Published 2019-08-25
URL https://arxiv.org/abs/1908.09381v5
PDF https://arxiv.org/pdf/1908.09381v5.pdf
PWC https://paperswithcode.com/paper/tutorial-and-survey-on-probabilistic
Repo
Framework

Non-locally Encoder-Decoder Convolutional Network for Whole Brain QSM Inversion

Title Non-locally Encoder-Decoder Convolutional Network for Whole Brain QSM Inversion
Authors Juan Liu, Kevin M. Koch
Abstract Quantitative Susceptibility Mapping (QSM) reconstruction is a challenging inverse problem driven by ill conditioning of its field-to-susceptibility transformation. State-of-the-art QSM reconstruction methods suffer from either image artifacts or long computation times, which limits QSM clinical translation efforts. To overcome these limitations, a non-locally encoder-decoder gated convolutional neural network is trained to infer a whole-brain susceptibility map, using the local field and brain mask as inputs. The performance of the proposed method is evaluated on synthetic data, a publicly available challenge dataset, and clinical datasets. The proposed approach can outperform existing methods on quantitative metrics and visual assessment of image sharpness and streaking artifacts. The estimated susceptibility maps can preserve conspicuity of fine features and suppress streaking artifacts. The demonstrated methods have potential value in advancing QSM clinical research and aiding in the translation of QSM to clinical operations.
Tasks
Published 2019-04-11
URL http://arxiv.org/abs/1904.05493v1
PDF http://arxiv.org/pdf/1904.05493v1.pdf
PWC https://paperswithcode.com/paper/non-locally-encoder-decoder-convolutional
Repo
Framework

Image Aesthetics Assessment using Multi Channel Convolutional Neural Networks

Title Image Aesthetics Assessment using Multi Channel Convolutional Neural Networks
Authors Nishi Doshi, Gitam Shikhenawis, Suman K Mitra
Abstract Image Aesthetics Assessment is one of the emerging domains in research. The domain deals with classifying images into categories based on how pleasant they are for users to look at. In this article, the focus is on categorizing images as high quality or low quality. Deep convolutional neural networks are used to classify the images. Instead of using just the raw image as input, different crops and saliency maps of the images are also used as input to the proposed multi-channel CNN architecture. The experiments reported on the widely used AVA database show an improvement in aesthetic assessment performance over existing approaches.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09301v1
PDF https://arxiv.org/pdf/1911.09301v1.pdf
PWC https://paperswithcode.com/paper/image-aesthetics-assessment-using-multi
Repo
Framework

Theoretical Limitations of Self-Attention in Neural Sequence Models

Title Theoretical Limitations of Self-Attention in Neural Sequence Models
Authors Michael Hahn
Abstract Transformers are emerging as the new workhorse of NLP, showing great success across tasks. Unlike LSTMs, transformers process input sequences entirely through self-attention. Previous work has suggested that the computational capabilities of self-attention to process hierarchical structures are limited. In this work, we mathematically investigate the computational power of self-attention to model formal languages. Across both soft and hard attention, we show strong theoretical limitations of the computational abilities of self-attention, finding that it cannot model periodic finite-state languages, nor hierarchical structure, unless the number of layers or heads increases with input length. These limitations seem surprising given the practical success of self-attention and the prominent role assigned to hierarchical structure in linguistics, suggesting that natural language can be approximated well with models that are too weak for the formal languages typically assumed in theoretical linguistics.
Tasks
Published 2019-06-16
URL https://arxiv.org/abs/1906.06755v2
PDF https://arxiv.org/pdf/1906.06755v2.pdf
PWC https://paperswithcode.com/paper/theoretical-limitations-of-self-attention-in
Repo
Framework
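
To make the two language classes named in the abstract above concrete, here is a small sketch of one periodic finite-state language (PARITY, strings with an even number of 1s) and one hierarchical language (well-nested brackets). The abstract does not name specific languages, so these choices and the function names are illustrative assumptions. The code shows only what the languages are, not the paper's proofs.

```python
def parity_accepts(bits):
    """Two-state automaton: accept strings containing an even number of '1's."""
    state = 0
    for b in bits:
        if b == "1":
            state ^= 1  # flip state on every '1'
    return state == 0

def dyck_accepts(s, pairs=None):
    """Stack-based check that brackets are well nested (hierarchical structure)."""
    if pairs is None:
        pairs = {"(": ")", "[": "]"}
    stack = []
    for ch in s:
        if ch in pairs:
            stack.append(pairs[ch])  # remember which closer we expect
        elif not stack or stack.pop() != ch:
            return False
    return not stack

even = parity_accepts("1001")
odd = parity_accepts("1101")
nested = dyck_accepts("([])()")
crossed = dyck_accepts("([)]")
```

Both recognizers read the input once, left to right, and carry state across the whole string. The abstract argues that self-attention cannot model this kind of dependence unless the number of layers or heads grows with input length.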

Dual Adversarial Inference for Text-to-Image Synthesis

Title Dual Adversarial Inference for Text-to-Image Synthesis
Authors Qicheng Lao, Mohammad Havaei, Ahmad Pesaranghader, Francis Dutil, Lisa Di Jorio, Thomas Fevens
Abstract Synthesizing images from a given text description involves engaging two types of information: the content, which includes information explicitly described in the text (e.g., color, composition, etc.), and the style, which is usually not well described in the text (e.g., location, quantity, size, etc.). However, in previous works, it is typically treated as a process of generating images only from the content, i.e., without considering learning meaningful style representations. In this paper, we aim to learn two variables that are disentangled in the latent space, representing content and style respectively. We achieve this by augmenting current text-to-image synthesis frameworks with a dual adversarial inference mechanism. Through extensive experiments, we show that our model learns, in an unsupervised manner, style representations corresponding to certain meaningful information present in the image that are not well described in the text. The new framework also improves the quality of synthesized images when evaluated on Oxford-102, CUB and COCO datasets.
Tasks Image Generation
Published 2019-08-14
URL https://arxiv.org/abs/1908.05324v1
PDF https://arxiv.org/pdf/1908.05324v1.pdf
PWC https://paperswithcode.com/paper/dual-adversarial-inference-for-text-to-image
Repo
Framework

Breadth-first, Depth-next Training of Random Forests

Title Breadth-first, Depth-next Training of Random Forests
Authors Andreea Anghel, Nikolas Ioannou, Thomas Parnell, Nikolaos Papandreou, Celestine Mendler-Dünner, Haris Pozidis
Abstract In this paper we analyze, evaluate, and improve the performance of training Random Forest (RF) models on modern CPU architectures. An exact, state-of-the-art binary decision tree building algorithm is used as the basis of this study. Firstly, we investigate the trade-offs between using different tree building algorithms, namely breadth-first-search (BFS) and depth-first-search (DFS). We design a novel, dynamic, hybrid BFS-DFS algorithm and demonstrate that it performs better than both BFS and DFS, and is more robust in the presence of workloads with different characteristics. Secondly, we identify CPU performance bottlenecks when generating trees using this approach, and propose optimizations to alleviate them. The proposed hybrid tree building algorithm for RF is implemented in the Snap Machine Learning framework, and speeds up the training of RFs by 7.8x on average when compared to state-of-the-art RF solvers (sklearn, H2O, and xgboost) on a range of datasets, RF configurations, and multi-core CPU architectures.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06853v1
PDF https://arxiv.org/pdf/1910.06853v1.pdf
PWC https://paperswithcode.com/paper/breadth-first-depth-next-training-of-random
Repo
Framework

Method of Contraction-Expansion (MOCE) for Simultaneous Inference in Linear Models

Title Method of Contraction-Expansion (MOCE) for Simultaneous Inference in Linear Models
Authors Fei Wang, Ling Zhou, Lu Tang, Peter X. -K. Song
Abstract Simultaneous inference after model selection is of critical importance to address scientific hypotheses involving a set of parameters. In this paper, we consider a high-dimensional linear regression model in which a regularization procedure such as LASSO is applied to yield a sparse model. To establish simultaneous post-model-selection inference, we propose a method of contraction and expansion (MOCE) along the lines of debiasing estimation that enables us to balance the bias-variance trade-off so that the super-sparsity assumption may be relaxed. We establish key theoretical results for the proposed MOCE procedure, from which the expanded model can be selected with theoretical guarantees and simultaneous confidence regions can be constructed from the joint asymptotic normal distribution. In comparison with existing methods, our proposed method exhibits stable and reliable coverage at a nominal significance level with substantially less computational burden, and thus it is trustworthy for application to real-world problems.
Tasks Model Selection
Published 2019-08-04
URL https://arxiv.org/abs/1908.01253v1
PDF https://arxiv.org/pdf/1908.01253v1.pdf
PWC https://paperswithcode.com/paper/method-of-contraction-expansion-moce-for
Repo
Framework

Towards Safety-Aware Computing System Design in Autonomous Vehicles

Title Towards Safety-Aware Computing System Design in Autonomous Vehicles
Authors Hengyu Zhao, Yubo Zhang, Pingfan Meng, Hui Shi, Li Erran Li, Tiancheng Lou, Jishen Zhao
Abstract Recently, autonomous driving development has ignited competition among car makers and technical corporations. Low-level automation cars are already commercially available. But highly automated vehicles, where the vehicle drives by itself without human monitoring, are still in their infancy. Such autonomous vehicles (AVs) rely on the computing system in the car to interpret the environment and make driving decisions. Therefore, computing system design is essential, particularly in enhancing driving safety. However, to our knowledge, no clear guideline exists so far regarding safety-aware AV computing system and architecture design. To understand the safety requirements of AV computing systems, we performed a field study by running industrial Level-4 autonomous driving fleets in various locations, road conditions, and traffic patterns. The field study indicates that traditional computing system performance metrics, such as tail latency, average latency, maximum latency, and timeout, cannot fully satisfy the safety requirements for AV computing system design. To address this issue, we propose a 'safety score' as a primary metric for measuring the level of safety in AV computing system design. Furthermore, we propose a perception latency model, which helps architects estimate the safety score of a given architecture and system design without physically testing it in an AV. We demonstrate the use of our safety score and latency model by developing and evaluating a safety-aware hardware resource management scheme for AV computing systems.
Tasks Autonomous Driving, Autonomous Vehicles
Published 2019-05-21
URL https://arxiv.org/abs/1905.08453v2
PDF https://arxiv.org/pdf/1905.08453v2.pdf
PWC https://paperswithcode.com/paper/towards-safety-aware-computing-system-design
Repo
Framework

Least Angle Regression in Tangent Space and LASSO for Generalized Linear Model

Title Least Angle Regression in Tangent Space and LASSO for Generalized Linear Model
Authors Yoshihiro Hirose
Abstract We propose sparse estimation methods for generalized linear models, which run Least Angle Regression (LARS) and the Least Absolute Shrinkage and Selection Operator (LASSO) in the tangent space of the manifold of the statistical model. Our approach is to roughly approximate the statistical model and to subsequently use exact calculations. LARS was proposed as an efficient algorithm for parameter estimation and variable selection for the normal linear model. The LARS algorithm is described in terms of Euclidean geometry, with correlation regarded as the metric of the space. Since the LARS algorithm only works in Euclidean space, we transform the manifold of the statistical model into the tangent space at the origin. In generalized linear regression, this transformation allows us to run the original LARS algorithm for generalized linear models. The proposed methods are efficient and perform well. Real-data analysis shows that the proposed methods output results similar to those of $l_1$-penalized maximum likelihood estimation for generalized linear models. Numerical experiments show that our methods work well and that they can be better than $l_1$-penalization for generalized linear models in generalization, parameter estimation, and model selection.
Tasks Model Selection
Published 2019-07-18
URL https://arxiv.org/abs/1907.08100v2
PDF https://arxiv.org/pdf/1907.08100v2.pdf
PWC https://paperswithcode.com/paper/least-angle-regression-in-tangent-space-and
Repo
Framework

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

Title Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Authors Sawyer Birnbaum, Volodymyr Kuleshov, Zayd Enam, Pang Wei Koh, Stefano Ermon
Abstract Learning representations that accurately capture long-range dependencies in sequential inputs (including text, audio, and genomic data) is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields, while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM), a novel architectural component inspired by adaptive batch normalization and its extensions, that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.
Tasks Audio Super-Resolution, Super-Resolution, Text Classification
Published 2019-09-14
URL https://arxiv.org/abs/1909.06628v2
PDF https://arxiv.org/pdf/1909.06628v2.pdf
PWC https://paperswithcode.com/paper/temporal-film-capturing-long-range-sequence
Repo
Framework
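
The TFiLM abstract above describes a recurrent network that alters the activations of a convolutional model through feature-wise linear modulation. The sketch below illustrates one way this could look, under heavy simplifying assumptions: activations are split into fixed-size blocks, each block is max-pooled, and a toy per-channel tanh recurrence (standing in for a real RNN) produces a scale `gamma` and shift `beta` for that block. The block pooling, the diagonal recurrence, and the parameters `wx`, `wh`, `a`, `b` are hypothetical and not taken from the paper.

```python
import math

def tfilm_like(activations, block_size, wx, wh, a, b):
    """Modulate activations block by block.

    activations: list of T timesteps, each a list of C channel values.
    wx, wh, a, b: per-channel weights (length C) of the toy recurrence.
    """
    n_channels = len(activations[0])
    hidden = [0.0] * n_channels
    out = []
    for start in range(0, len(activations), block_size):
        block = activations[start:start + block_size]
        # Summarize the block with a per-channel max pool.
        pooled = [max(step[c] for step in block) for c in range(n_channels)]
        # Toy recurrent update that carries information across blocks.
        hidden = [math.tanh(wx[c] * pooled[c] + wh[c] * hidden[c])
                  for c in range(n_channels)]
        gamma = [1.0 + a[c] * hidden[c] for c in range(n_channels)]
        beta = [b[c] * hidden[c] for c in range(n_channels)]
        # Feature-wise linear modulation: scale and shift every channel.
        for step in block:
            out.append([gamma[c] * step[c] + beta[c] for c in range(n_channels)])
    return out

acts = [[1.0, -1.0], [0.5, 0.0], [2.0, 1.0], [0.0, 0.5]]
zeros = [0.0, 0.0]
unchanged = tfilm_like(acts, 2, zeros, zeros, zeros, zeros)
modulated = tfilm_like(acts, 2, [0.5, 0.5], [0.9, 0.9], [1.0, 1.0], [0.1, 0.1])
```

With all weights set to zero the layer reduces to the identity. With non-zero recurrent weights, the modulation applied to a later block depends on earlier blocks, which is how a component of this kind could extend the effective receptive field.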

Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax

Title Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax
Authors Yinfei Yang, Gustavo Hernandez Abrego, Steve Yuan, Mandy Guo, Qinlan Shen, Daniel Cer, Yun-hsuan Sung, Brian Strope, Ray Kurzweil
Abstract In this paper, we present an approach to learn multilingual sentence embeddings using a bi-directional dual-encoder with additive margin softmax. The embeddings are able to achieve state-of-the-art results on the United Nations (UN) parallel corpus retrieval task. In all the languages tested, the system achieves P@1 of 86% or higher. We use pairs retrieved by our approach to train NMT models that achieve similar performance to models trained on gold pairs. We explore simple document-level embeddings constructed by averaging our sentence embeddings. On the UN document-level retrieval task, document embeddings achieve around 97% on P@1 for all experimented language pairs. Lastly, we evaluate the proposed model on the BUCC mining task. The learned embeddings with raw cosine similarity scores achieve competitive results compared to current state-of-the-art models, and with a second-stage scorer we achieve a new state-of-the-art level on this task.
Tasks Sentence Embedding, Sentence Embeddings
Published 2019-02-22
URL https://arxiv.org/abs/1902.08564v2
PDF https://arxiv.org/pdf/1902.08564v2.pdf
PWC https://paperswithcode.com/paper/improving-multilingual-sentence-embedding
Repo
Framework
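
The abstract above names a bi-directional dual encoder trained with additive margin softmax but does not spell out the loss. A minimal sketch of one common formulation follows. Given a matrix of similarity scores between source sentence i and target sentence j, a margin is subtracted from the matching (diagonal) pairs before a softmax cross-entropy, and the loss is summed over both retrieval directions. The margin value of 0.3 and the function names are assumptions for illustration.

```python
import math

def _row_softmax_xent(scores, margin):
    """Mean cross-entropy where row i's correct column is i.

    The margin is subtracted from the diagonal (true-pair) scores only.
    """
    n = len(scores)
    total = 0.0
    for i in range(n):
        logits = [scores[i][j] - (margin if i == j else 0.0) for j in range(n)]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_z - logits[i]
    return total / n

def bidirectional_ams_loss(scores, margin=0.3):
    """Sum of source-to-target and target-to-source ranking losses."""
    transposed = [list(col) for col in zip(*scores)]
    return _row_softmax_xent(scores, margin) + _row_softmax_xent(transposed, margin)

# Hypothetical cosine similarities for a batch of two translation pairs.
scores = [[0.9, 0.1], [0.2, 0.8]]
loss_with_margin = bidirectional_ams_loss(scores, margin=0.3)
loss_without_margin = bidirectional_ams_loss(scores, margin=0.0)
```

Subtracting the margin makes the true pair harder to rank first, so the loss is larger for the same scores. Training then has to push true pairs further above the in-batch negatives.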

Read classification using semi-supervised deep learning

Title Read classification using semi-supervised deep learning
Authors Tomislav Šebrek, Jan Tomljanović, Josip Krapac, Mile Šikić
Abstract In this paper, we propose a semi-supervised deep learning method for detecting the specific types of reads that impede the de novo genome assembly process. Instead of dealing directly with sequenced reads, we analyze their coverage graphs converted to 1D signals. We noticed that specific signal patterns occur in each relevant class of reads. A semi-supervised approach is chosen because manually labeling the data is a very slow and tedious process, so our goal was to facilitate the assembly process with as little labeled data as possible. We tested two models to learn patterns in the coverage graphs: M1+M2 and semi-GAN. We evaluated the performance of each model on a manually labeled dataset that comprises various reads from multiple reference genomes, with respect to the number of labeled examples used during training. In addition, we embedded our detection in the assembly process, which improved the quality of assemblies.
Tasks
Published 2019-04-23
URL http://arxiv.org/abs/1904.10353v1
PDF http://arxiv.org/pdf/1904.10353v1.pdf
PWC https://paperswithcode.com/paper/read-classification-using-semi-supervised
Repo
Framework

Stability selection enables robust learning of partial differential equations from limited noisy data

Title Stability selection enables robust learning of partial differential equations from limited noisy data
Authors Suryanarayana Maddu, Bevan L. Cheeseman, Ivo F. Sbalzarini, Christian L. Müller
Abstract We present a statistical learning framework for robust identification of partial differential equations from noisy spatiotemporal data. Extending previous sparse regression approaches for inferring PDE models from simulated data, we address key issues that have thus far limited the application of these methods to noisy experimental data, namely their robustness against noise and the need for manual parameter tuning. We address both points by proposing a stability-based model selection scheme to determine the level of regularization required for reproducible recovery of the underlying PDE. This avoids manual parameter tuning and provides a principled way to improve the method’s robustness against noise in the data. Our stability selection approach, termed PDE-STRIDE, can be combined with any sparsity-promoting penalized regression model and provides an interpretable criterion for model component importance. We show that in particular the combination of stability selection with the iterative hard-thresholding algorithm from compressed sensing provides a fast, parameter-free, and robust computational framework for PDE inference that outperforms previous algorithmic approaches with respect to recovery accuracy, amount of data required, and robustness to noise. We illustrate the performance of our approach on a wide range of noise-corrupted simulated benchmark problems, including 1D Burgers, 2D vorticity-transport, and 3D reaction-diffusion problems. We demonstrate the practical applicability of our method on real-world data by considering a purely data-driven re-evaluation of the advective triggering hypothesis for an embryonic polarization system in C. elegans. Using fluorescence microscopy images of C. elegans zygotes as input data, our framework is able to recover the PDE model for the regulatory reaction-diffusion-flow network of the associated proteins.
Tasks Model Selection
Published 2019-07-17
URL https://arxiv.org/abs/1907.07810v1
PDF https://arxiv.org/pdf/1907.07810v1.pdf
PWC https://paperswithcode.com/paper/stability-selection-enables-robust-learning
Repo
Framework
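
The abstract above combines stability selection with iterative hard thresholding (IHT). The sketch below shows the general pattern, not the PDE-STRIDE implementation. IHT is run on random half-subsamples of the data, and each candidate term is scored by how often it ends up with a non-zero coefficient. The fixed sparsity level `k`, the step size, the half-sample scheme, and the toy design matrix (rows are unit vectors, so each observation measures one coefficient plus noise) are all assumptions chosen to keep the example small.

```python
import random

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero the rest."""
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]

def iht(A, y, k, step, n_iter=200):
    """Iterative hard thresholding for y ~ A x with at most k non-zeros."""
    n_feat = len(A[0])
    x = [0.0] * n_feat
    for _ in range(n_iter):
        resid = [yi - sum(a * xj for a, xj in zip(row, x)) for row, yi in zip(A, y)]
        grad = [sum(A[r][j] * resid[r] for r in range(len(A))) for j in range(n_feat)]
        x = hard_threshold([xj + step * g for xj, g in zip(x, grad)], k)
    return x

def stability_selection(A, y, k, step, n_subsamples=20, rng=None):
    """Fraction of random half-subsamples in which each feature is selected."""
    if rng is None:
        rng = random.Random(0)
    n = len(A)
    counts = [0] * len(A[0])
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        x = iht([A[i] for i in idx], [y[i] for i in idx], k, step)
        for j, v in enumerate(x):
            if v != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]

# Toy data: only features 0 and 2 are active in the true model.
rng = random.Random(0)
x_true = [2.0, 0.0, -3.0, 0.0]
A, y = [], []
for _ in range(10):
    for j in range(4):
        A.append([1.0 if c == j else 0.0 for c in range(4)])
        y.append(x_true[j] + rng.gauss(0.0, 0.1))
freqs = stability_selection(A, y, k=2, step=0.05, rng=rng)
```

Features with a selection frequency close to 1 across subsamples are the ones a stability-based criterion would retain. Features that appear only occasionally are treated as artifacts of noise.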

Improving Branch Prediction By Modeling Global History with Convolutional Neural Networks

Title Improving Branch Prediction By Modeling Global History with Convolutional Neural Networks
Authors Stephen J Tarsa, Chit-Kwan Lin, Gokce Keskin, Gautham Chinya, Hong Wang
Abstract CPU branch prediction has hit a wall: existing techniques achieve near-perfect accuracy on 99% of static branches, and yet the mispredictions that remain hide major performance gains. In a companion report, we show that a primary source of mispredictions is a handful of systematically hard-to-predict branches (H2Ps), e.g. just 10 static instructions per SimPoint phase in SPECint 2017. The lost opportunity posed by these mispredictions is significant to the CPU: 14.0% in instructions-per-cycle (IPC) on Intel SkyLake and 37.4% IPC when the pipeline is scaled four-fold, on par with gains from process technology. However, up to 80% of this upside is unreachable by the best known branch predictors, even when afforded exponentially more resources. New approaches are needed, and machine learning (ML) provides a palette of powerful predictors. A growing body of work has shown that ML models are deployable within the microarchitecture to optimize hardware at runtime, and are one way to customize CPUs post-silicon by training to customer applications. We develop this scenario for branch prediction using convolutional neural networks (CNNs) to boost accuracy for H2Ps. Step-by-step, we (1) map CNNs to the global history data used by existing branch predictors; (2) show how CNNs improve H2P prediction in SPEC 2017; (3) adapt 2-bit CNN inference to the constraints of current branch prediction units; and (4) establish that CNN helper predictors are reusable across application executions on different inputs, enabling us to amortize offline training and deploy ML pattern matching to improve IPC.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.09889v1
PDF https://arxiv.org/pdf/1906.09889v1.pdf
PWC https://paperswithcode.com/paper/improving-branch-prediction-by-modeling
Repo
Framework