January 25, 2020

Paper Group NAWR 13

hULMonA: The Universal Language Model in Arabic. Shallow RNN: Accurate Time-series Classification on Resource Constrained Devices. Planning in entropy-regularized Markov decision processes and games. Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration. Efficient Pure Exploration in Adaptive Round model. Visual …

hULMonA: The Universal Language Model in Arabic

Title hULMonA: The Universal Language Model in Arabic
Authors Obeida ElJundi, Wissam Antoun, Nour El Droubi, Hazem Hajj, Wassim El-Hajj, Khaled Shaban
Abstract Arabic is a complex language with limited resources which makes it challenging to produce accurate text classification tasks such as sentiment analysis. The utilization of transfer learning (TL) has recently shown promising results for advancing accuracy of text classification in English. TL models are pre-trained on large corpora, and then fine-tuned on task-specific datasets. In particular, universal language models (ULMs), such as recently developed BERT, have achieved state-of-the-art results in various NLP tasks in English. In this paper, we hypothesize that similar success can be achieved for Arabic. The work aims at supporting the hypothesis by developing the first Universal Language Model in Arabic (hULMonA - حلمنا meaning our dream), demonstrating its use for Arabic classifications tasks, and demonstrating how a pre-trained multi-lingual BERT can also be used for Arabic. We then conduct a benchmark study to evaluate both ULM successes with Arabic sentiment analysis. Experiment results show that the developed hULMonA and multi-lingual ULM are able to generalize well to multiple Arabic data sets and achieve new state of the art results in Arabic Sentiment Analysis for some of the tested sets.
Tasks Arabic Sentiment Analysis, Language Modelling, Sentiment Analysis, Text Classification, Transfer Learning
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4608/
PDF https://www.aclweb.org/anthology/W19-4608
PWC https://paperswithcode.com/paper/hulmona-the-universal-language-model-in
Repo https://github.com/aub-mind/hULMonA
Framework none

Shallow RNN: Accurate Time-series Classification on Resource Constrained Devices

Title Shallow RNN: Accurate Time-series Classification on Resource Constrained Devices
Authors Don Dennis, Durmus Alp Emre Acar, Vikram Mandikal, Vinu Sankar Sadasivan, Venkatesh Saligrama, Harsha Vardhan Simhadri, Prateek Jain
Abstract Recurrent Neural Networks (RNNs) capture long dependencies and context, and hence are the key component of typical sequential data based tasks. However, the sequential nature of RNNs dictates a large inference cost for long sequences even if the hardware supports parallelization. To induce long-term dependencies, and yet admit parallelization, we introduce novel shallow RNNs. In this architecture, the first layer splits the input sequence and runs several independent RNNs. The second layer consumes the output of the first layer using a second RNN thus capturing long dependencies. We provide theoretical justification for our architecture under weak assumptions that we verify on real-world benchmarks. Furthermore, we show that for time-series classification, our technique leads to substantially improved inference time over standard RNNs without compromising accuracy. For example, we can deploy audio-keyword classification on tiny Cortex M4 devices (100MHz processor, 256KB RAM, no DSP available) which was not possible using standard RNN models. Similarly, using SRNN in the popular Listen-Attend-Spell (LAS) architecture for phoneme classification [4], we can reduce the lag in phoneme classification by 10-12x while maintaining state-of-the-art accuracy.
Tasks Time Series, Time Series Classification
Published 2019-12-01
URL http://papers.nips.cc/paper/9451-shallow-rnn-accurate-time-series-classification-on-resource-constrained-devices
PDF http://papers.nips.cc/paper/9451-shallow-rnn-accurate-time-series-classification-on-resource-constrained-devices.pdf
PWC https://paperswithcode.com/paper/shallow-rnn-accurate-time-series
Repo https://github.com/Microsoft/EdgeML
Framework tf
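
The two-layer structure described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the scalar tanh cell, the weight values, the choice to share one set of weights across all first-layer bricks, and the function names. It is not the EdgeML implementation.

```python
import math


def rnn_last_state(xs, w_in, w_rec, h0=0.0):
    """Run a scalar tanh RNN over xs and return its final hidden state."""
    h = h0
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def shallow_rnn(sequence, brick_size, w1=(0.5, 0.3), w2=(0.8, 0.4)):
    """Two-layer shallow RNN sketch (hypothetical weights w1, w2).

    Layer 1 splits the sequence into bricks and summarizes each brick with
    an independent RNN pass; the passes do not depend on one another, so
    they could run in parallel.
    Layer 2 runs a second RNN over the brick summaries.
    Returns the final state and the number of bricks.
    """
    bricks = [sequence[i:i + brick_size]
              for i in range(0, len(sequence), brick_size)]
    summaries = [rnn_last_state(brick, *w1) for brick in bricks]
    return rnn_last_state(summaries, *w2), len(bricks)


if __name__ == "__main__":
    state, n_bricks = shallow_rnn([0.1 * t for t in range(16)], brick_size=4)
    print(n_bricks, state)
```

With 16 inputs and a brick size of 4, the longest sequential chain is 4 steps in the first layer plus 4 in the second, rather than 16 steps for a single RNN over the whole sequence.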

Planning in entropy-regularized Markov decision processes and games

Title Planning in entropy-regularized Markov decision processes and games
Authors Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Menard, Remi Munos, Michal Valko
Abstract We propose SmoothCruiser, a new planning algorithm for estimating the value function in entropy-regularized Markov decision processes and two-player games, given a generative model of the environment. SmoothCruiser makes use of the smoothness of the Bellman operator promoted by the regularization to achieve problem-independent sample complexity of order $\tilde{\mathcal{O}}(1/\epsilon^4)$ for a desired accuracy $\epsilon$, whereas for non-regularized settings there are no known algorithms with guaranteed polynomial sample complexity in the worst case.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9405-planning-in-entropy-regularized-markov-decision-processes-and-games
PDF http://papers.nips.cc/paper/9405-planning-in-entropy-regularized-markov-decision-processes-and-games.pdf
PWC https://paperswithcode.com/paper/planning-in-entropy-regularized-markov
Repo https://github.com/omardrwch/smoothcruiser-check
Framework none
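
To see why regularization makes the Bellman operator smooth, it helps to look at one common form of entropy regularization, where the hard maximum over actions is replaced by a scaled log-sum-exp. The sketch below shows only that replacement, not SmoothCruiser itself; the function names and the symbol `lam` for the regularization strength are assumptions.

```python
import math


def soft_max_value(q_values, lam):
    """Entropy-regularized value: lam * log(sum_a exp(Q(a) / lam)).

    Computed with the usual max-shift for numerical stability.
    """
    m = max(q_values)
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))


def soft_policy(q_values, lam):
    """Softmax policy over actions that attains the regularized value."""
    m = max(q_values)
    weights = [math.exp((q - m) / lam) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]
    for lam in (1.0, 0.5, 0.01):
        print(lam, soft_max_value(q, lam), soft_policy(q, lam))
```

The regularized value always lies between `max(q)` and `max(q) + lam * log(len(q))`, and it approaches the hard maximum as `lam` goes to zero. Unlike the hard maximum, it is differentiable in every Q-value.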

Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration

Title Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration
Authors Jianchun Chen, Lingjing Wang, Xiang Li, Yi Fang
Abstract This paper concerns the undetermined problem of estimating geometric transformation between image pairs. Recent methods introduce deep neural networks to predict the controlling parameters of hand-crafted geometric transformation models (e.g. thin-plate spline) for image registration and matching. However, the low-dimension parametric models are incapable of estimating a highly complex geometric transform with limited flexibility to model the actual geometric deformation from image pairs. To address this issue, we present an end-to-end trainable deep neural network, named Arbitrary Continuous Geometric Transformation Networks (Arbicon-Net), to directly predict the dense displacement field for pairwise image alignment. Arbicon-Net is generalized from training data to predict the desired arbitrary continuous geometric transformation in a data-driven manner for unseen pairs of images. Particularly, without imposing penalization terms, the predicted displacement vector function is proven to be spatially continuous and smooth. To verify the performance of Arbicon-Net, we conducted semantic alignment tests over both synthetic and real image datasets with various experimental settings. The results demonstrate that Arbicon-Net outperforms the previous image alignment techniques in identifying the image correspondences.
Tasks Image Registration
Published 2019-12-01
URL http://papers.nips.cc/paper/8602-arbicon-net-arbitrary-continuous-geometric-transformation-networks-for-image-registration
PDF http://papers.nips.cc/paper/8602-arbicon-net-arbitrary-continuous-geometric-transformation-networks-for-image-registration.pdf
PWC https://paperswithcode.com/paper/arbicon-net-arbitrary-continuous-geometric
Repo https://github.com/nyummvc/Arbicon-Net
Framework pytorch

Efficient Pure Exploration in Adaptive Round model

Title Efficient Pure Exploration in Adaptive Round model
Authors Tianyuan Jin, Jieming Shi, Xiaokui Xiao, Enhong Chen
Abstract In the adaptive setting, many multi-armed bandit applications allow the learner to adaptively draw samples and adjust sampling strategy in rounds. In many real applications, not only the query complexity but also the round complexity need to be optimized. In this paper, we study both PAC and exact top-$k$ arm identification problems and design efficient algorithms considering both round complexity and query complexity. For PAC problem, we achieve optimal query complexity and use only $O(\log_{\frac{k}{\delta}}^*(n))$ rounds, which matches the lower bound of round complexity, while most of existing works need $\Theta(\log \frac{n}{k})$ rounds. For exact top-$k$ arm identification, we improve the round complexity factor from $\log n$ to $\log_{\frac{1}{\delta}}^*(n)$, and achieve near optimal query complexity. In experiments, our algorithms conduct far fewer rounds, and outperform state of the art by orders of magnitude with respect to query cost.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8887-efficient-pure-exploration-in-adaptive-round-model
PDF http://papers.nips.cc/paper/8887-efficient-pure-exploration-in-adaptive-round-model.pdf
PWC https://paperswithcode.com/paper/efficient-pure-exploration-in-adaptive-round
Repo https://github.com/jmshi123/mab-nips-2019
Framework none

Visual Tracking via Adaptive Spatially-Regularized Correlation Filters

Title Visual Tracking via Adaptive Spatially-Regularized Correlation Filters
Authors Kenan Dai, Dong Wang, Huchuan Lu, Chong Sun, Jianhua Li
Abstract In this work, we propose a novel adaptive spatially-regularized correlation filters (ASRCF) model to simultaneously optimize the filter coefficients and the spatial regularization weight. First, this adaptive spatial regularization scheme could learn an effective spatial weight for a specific object and its appearance variations, and therefore result in more reliable filter coefficients during the tracking process. Second, our ASRCF model can be effectively optimized based on the alternating direction method of multipliers, where each subproblem has a closed-form solution. Third, our tracker applies two kinds of CF models to estimate the location and scale respectively. The location CF model exploits ensembles of shallow and deep features to determine the optimal position accurately. The scale CF model works on multi-scale shallow features to estimate the optimal scale efficiently. Extensive experiments on five recent benchmarks show that our tracker performs favorably against many state-of-the-art algorithms, with real-time performance of 28fps.
Tasks Visual Tracking
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Dai_Visual_Tracking_via_Adaptive_Spatially-Regularized_Correlation_Filters_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Dai_Visual_Tracking_via_Adaptive_Spatially-Regularized_Correlation_Filters_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/visual-tracking-via-adaptive-spatially
Repo https://github.com/Daikenan/ASRCF
Framework none

Katib: A Distributed General AutoML Platform on Kubernetes

Title Katib: A Distributed General AutoML Platform on Kubernetes
Authors Jinan Zhou, Andrey Velichkevich, Kirill Prosvirov, Anubhav Garg, Yuji Oshima, Debo Dutta
Abstract Automatic Machine Learning (AutoML) is a powerful mechanism to design and tune models. We present Katib, a scalable Kubernetes-native general AutoML platform that can support a range of AutoML algorithms including both hyper-parameter tuning and neural architecture search. The system is divided into separate components, encapsulated as micro-services. Each micro-service operates within a Kubernetes pod and communicates with others via well-defined APIs, thus allowing flexible management and scalable deployment at a minimal cost. Together with a powerful user interface, Katib provides a universal platform for researchers as well as enterprises to try, compare and deploy their AutoML algorithms, on any Kubernetes platform.
Tasks AutoML, Hyperparameter Optimization, Neural Architecture Search
Published 2019-01-01
URL https://www.usenix.org/conference/opml19/presentation/zhou
PDF https://www.usenix.org/system/files/opml19papers-zhou.pdf
PWC https://paperswithcode.com/paper/katib-a-distributed-general-automl-platform
Repo https://github.com/kubeflow/katib
Framework pytorch

AttPool: Towards Hierarchical Feature Representation in Graph Convolutional Networks via Attention Mechanism

Title AttPool: Towards Hierarchical Feature Representation in Graph Convolutional Networks via Attention Mechanism
Authors Jingjia Huang, Zhangheng Li, Nannan Li, Shan Liu, Ge Li
Abstract Graph convolutional networks (GCNs) are potentially short of the ability to learn hierarchical representation for graph embedding, which holds them back in the graph classification task. Here, we propose AttPool, which is a novel graph pooling module based on attention mechanism, to remedy the problem. It is able to select nodes that are significant for graph representation adaptively, and generate hierarchical features via aggregating the attention-weighted information in nodes. Additionally, we devise a hierarchical prediction architecture to sufficiently leverage the hierarchical representation and facilitate the model learning. The AttPool module together with the entire training structure can be integrated into existing GCNs, and is trained in an end-to-end fashion conveniently. The experimental results on several graph-classification benchmark datasets with various scales demonstrate the effectiveness of our method.
Tasks Graph Classification, Graph Embedding
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Huang_AttPool_Towards_Hierarchical_Feature_Representation_in_Graph_Convolutional_Networks_via_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Huang_AttPool_Towards_Hierarchical_Feature_Representation_in_Graph_Convolutional_Networks_via_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/attpool-towards-hierarchical-feature
Repo https://github.com/hjjpku/Attention_in_Graph
Framework pytorch

Neural Lyapunov Control

Title Neural Lyapunov Control
Authors Ya-Chien Chang, Nima Roohi, Sicun Gao
Abstract We propose new methods for learning control policies and neural network Lyapunov functions for nonlinear control problems, with provable guarantee of stability. The framework consists of a learner that attempts to find the control and Lyapunov functions, and a falsifier that finds counterexamples to quickly guide the learner towards solutions. The procedure terminates when no counterexample is found by the falsifier, in which case the controlled nonlinear system is provably stable. The approach significantly simplifies the process of Lyapunov control design, provides end-to-end correctness guarantee, and can obtain much larger regions of attraction than existing methods such as LQR and SOS/SDP. We show experiments on how the new methods obtain high-quality solutions for challenging robot control problems such as path tracking for wheeled vehicles and humanoid robot balancing.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8587-neural-lyapunov-control
PDF http://papers.nips.cc/paper/8587-neural-lyapunov-control.pdf
PWC https://paperswithcode.com/paper/neural-lyapunov-control
Repo https://github.com/YaChienChang/Neural-Lyapunov-Control
Framework pytorch
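
The learner/falsifier loop from the abstract above can be sketched on a toy problem. All specifics are assumptions chosen for illustration: the one-dimensional dynamics `f`, the one-parameter candidate `V(x, c) = c * x**2`, the hinge-loss update, and a grid search standing in for the falsifier. A grid search is not a sound falsifier, so this sketch gives no proof of stability, unlike the provable guarantee the paper describes.

```python
# Sample points in [-2, 2], excluding the origin where V must equal 0.
GRID = [i / 10.0 for i in range(-20, 21) if i != 0]


def f(x):
    """Hypothetical closed-loop dynamics dx/dt = -x**3."""
    return -x ** 3


def V(x, c):
    """Candidate Lyapunov function with a single learnable parameter c."""
    return c * x * x


def lie_derivative(x, c):
    """dV/dt along trajectories: (dV/dx) * f(x)."""
    return 2.0 * c * x * f(x)


def falsifier(c, grid):
    """Return the grid points that violate V > 0 or dV/dt < 0."""
    return [x for x in grid if V(x, c) <= 0.0 or lie_derivative(x, c) >= 0.0]


def learn(c=-1.0, lr=0.5, max_iters=100):
    """Alternate falsifier and learner until no counterexample remains."""
    for iteration in range(max_iters):
        counterexamples = falsifier(c, GRID)
        if not counterexamples:
            return c, iteration
        # Learner step: gradient of the mean hinge loss max(0, -V(x, c))
        # with respect to c is -x**2 at each violating point.
        grad = sum(-x * x for x in counterexamples) / len(counterexamples)
        c -= lr * grad
    raise RuntimeError("no certificate found within max_iters")


if __name__ == "__main__":
    print(learn())
```

The loop stops as soon as the falsifier returns an empty list, mirroring the termination condition in the abstract.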

When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images

Title When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images
Authors Mahmoud Afifi, Brian Price, Scott Cohen, Michael S. Brown
Abstract This paper focuses on correcting a camera image that has been improperly white-balanced. This situation occurs when a camera’s auto white balance fails or when the wrong manual white-balance setting is used. Even after decades of computational color constancy research, there are no effective solutions to this problem. The challenge lies not in identifying what the correct white balance should have been, but in the fact that the in-camera white-balance procedure is followed by several camera-specific nonlinear color manipulations that make it challenging to correct the image’s colors in post-processing. This paper introduces the first method to explicitly address this problem. Our method is enabled by a dataset of over 65,000 pairs of incorrectly white-balanced images and their corresponding correctly white-balanced images. Using this dataset, we introduce a k-nearest neighbor strategy that is able to compute a nonlinear color mapping function to correct the image’s colors. We show our method is highly effective and generalizes well to camera models not in the training set.
Tasks Color Constancy
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Afifi_When_Color_Constancy_Goes_Wrong_Correcting_Improperly_White-Balanced_Images_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Afifi_When_Color_Constancy_Goes_Wrong_Correcting_Improperly_White-Balanced_Images_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/when-color-constancy-goes-wrong-correcting
Repo https://github.com/mahmoudnafifi/WB_sRGB
Framework none

RelGAN: Relational Generative Adversarial Networks for Text Generation

Title RelGAN: Relational Generative Adversarial Networks for Text Generation
Authors Weili Nie, Nina Narodytska, Ankit Patel
Abstract Generative adversarial networks (GANs) have achieved great success at generating realistic images. However, the text generation still remains a challenging task for modern GAN architectures. In this work, we propose RelGAN, a new GAN architecture for text generation, consisting of three main components: a relational memory based generator for the long-distance dependency modeling, the Gumbel-Softmax relaxation for training GANs on discrete data, and multiple embedded representations in the discriminator to provide a more informative signal for the generator updates. Our experiments show that RelGAN outperforms current state-of-the-art models in terms of sample quality and diversity, and we also reveal via ablation studies that each component of RelGAN contributes critically to its performance improvements. Moreover, a key advantage of our method, that distinguishes it from other GANs, is the ability to control the trade-off between sample quality and diversity via the use of a single adjustable parameter. Finally, RelGAN is the first architecture that makes GANs with Gumbel-Softmax relaxation succeed in generating realistic text.
Tasks Text Generation
Published 2019-05-01
URL https://openreview.net/forum?id=rJedV3R5tm
PDF https://openreview.net/pdf?id=rJedV3R5tm
PWC https://paperswithcode.com/paper/relgan-relational-generative-adversarial
Repo https://github.com/weilinie/RelGAN
Framework tf
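
The Gumbel-Softmax relaxation named in the abstract above replaces a discrete token sample with a differentiable vector on the probability simplex. The following is a minimal standard-library sketch of the sampling step only, not RelGAN's training code; the function name and the example logits are assumptions.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed categorical sample.

    Adds independent Gumbel(0, 1) noise to each logit, then applies a
    temperature-scaled softmax. Lower temperatures push the sample
    toward a one-hot vector.
    """
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)          # avoid log(0)
        gumbel = -math.log(-math.log(u))
        perturbed.append((logit + gumbel) / temperature)
    m = max(perturbed)                        # max-shift for stability
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [math.log(0.7), math.log(0.2), math.log(0.1)]
    for tau in (5.0, 1.0, 0.1):
        print(tau, gumbel_softmax(logits, tau, random.Random(0)))
```

Reusing the same seed across temperatures keeps the noise fixed, which makes it easy to see the sample sharpen as the temperature drops.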

Multi-task Temporal Convolutional Network for Predicting Water Quality Sensor

Title Multi-task Temporal Convolutional Network for Predicting Water Quality Sensor
Authors Zhang, Yifan; Thorburn, Peter; Fitch, Peter
Abstract Predicting the trend of water quality is essential in environmental management decision support systems. Despite various data-driven models in water quality prediction, most studies focus on predicting a single water quality variable. When multiple water quality variables need to be estimated, preparing several data-driven models may require unaffordable computing resources. Also, the changing patterns of several water quality variables can only be revealed by processing long term historical observations, which is not well supported by conventional data-driven models. In this paper, we propose a multi-task temporal convolution network (MTCN) for predicting multiple water quality variables. The temporal convolution offers one the capability to explore the temporal dependencies among a remarkably long historical period. Furthermore, instead of providing predictions for only one water quality variable, the MTCN is designed to predict multiple water quality variables simultaneously. Data collected from the Burnett River, Queensland is used to evaluate the MTCN. Compared to training a set of single-task TCNs for each variable separately, the proposed MTCN achieves the best RMSE scores in predicting both temperature and DO in the following 48 time steps but only requires 53% of the total training time of the TCN. Therefore, the MTCN is an encouraging approach for water quality management by processing a large amount of sensor data.
Tasks Time Series Prediction
Published 2019-12-05
URL https://link.springer.com/chapter/10.1007/978-3-030-36808-1_14#citeas
PDF https://www.ivivan.com/papers/ICONIP2019.pdf
PWC https://paperswithcode.com/paper/multi-task-temporal-convolutional-network-for
Repo https://github.com/ivivan/MTCN
Framework tf
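
Temporal convolutional networks typically reach long histories by stacking dilated causal convolutions. The sketch below shows that building block in plain Python under stated assumptions: a single channel, zero left-padding, and hypothetical function names. The abstract does not specify MTCN's exact layers, so this is not the authors' architecture.

```python
def dilated_causal_conv(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Inputs before the start of the series are treated as zero, so y[t]
    never depends on values later than x[t].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of past time steps visible after stacking the given layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_causal_conv(series, [1.0, 1.0], dilation=2))
    print(receptive_field(2, [1, 2, 4, 8]))
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is how a small stack can cover a long historical window.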

Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function

Title Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function
Authors Devendra Singh Sachan, Manzil Zaheer, Ruslan Salakhutdinov
Abstract In this paper, we do a careful study of a bidirectional LSTM network for the task of text classification using both supervised and semi-supervised approaches. In prior work, it has been reported that in order to get good classification accuracy using LSTM models for text classification task, pretraining the LSTM model parameters using unsupervised learning methods such as language modeling or sequence auto-encoder is necessary [2, 20]. However, we find that our simple model, when trained with cross-entropy loss is able to achieve competitive results compared with the more complex models. Furthermore, in addition to cross-entropy loss, by using a combination of entropy minimization, adversarial, and virtual adversarial losses for both labeled and unlabeled data, we report new state-of-the-art results for text classification task on four benchmark datasets. In particular, on ACL-IMDB sentiment analysis and AG-News topic classification datasets, our method outperforms current approaches by a substantial margin.
Tasks Language Modelling, Sentiment Analysis, Text Classification
Published 2019-02-01
URL https://www.semanticscholar.org/paper/Revisiting-LSTM-Networks-for-Semi-Supervised-Text-Sachan-Petuum/c3f89364aecd661eb032840d2fe3efd0f6d1698c
PDF https://www.aaai.org/Papers/AAAI/2019/AAAI-SachanD.7236.pdf
PWC https://paperswithcode.com/paper/revisiting-lstm-networks-for-semi-supervised
Repo https://github.com/DevSinghSachan/ssl_text_classification
Framework none

Examining Hyperparameters of Neural Networks Trained Using Local Search

Title Examining Hyperparameters of Neural Networks Trained Using Local Search
Authors Ahmed Aly, Gianluca Guadagni, Joanne Bechta Dugan
Abstract Deep neural networks (DNNs) have been found useful for many applications. However, training and designing those networks can be challenging and is considered more of an art or an engineering process than rigorous science. In this regard, the important process of choosing hyperparameters is relevant. In addition, training neural networks with derivative-free methods is somewhat understudied, particularly with regard to hyperparameter selection. The paper presents a small-scale study of the choice of 3 hyperparameters for convolutional neural networks (CNNs). The networks were trained with two single-candidate optimization algorithms: Stochastic Gradient Descent (derivative-based) and Local Search (derivative-free). The CNN is trained on a subset of the FashionMNIST dataset. Experimental results show that hyperparameter selection can be detrimental for Local Search, especially regarding network parametrization. Moreover, the best hyperparameter choices didn’t match for both algorithms. Future investigation into the training dynamics of Local Search is likely needed.
Tasks
Published 2019-12-10
URL https://www.researchgate.net/publication/338501734_Examining_Hyperparameters_of_Neural_Networks_Trained_Using_Local_Search
PDF https://www.researchgate.net/publication/338501734_Examining_Hyperparameters_of_Neural_Networks_Trained_Using_Local_Search
PWC https://paperswithcode.com/paper/examining-hyperparameters-of-neural-networks
Repo https://github.com/AroMorin/DNNOP
Framework pytorch
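
For readers unfamiliar with derivative-free training, a generic single-candidate local search can be sketched as follows. This is a textbook-style illustration, not the DNNOP implementation: the Gaussian perturbation, the `step` size, the accept-if-better rule, and the toy quadratic loss are all assumptions.

```python
import random


def local_search(loss, params, step=0.1, iters=200, seed=0):
    """Derivative-free local search over a flat list of parameters.

    Keeps one candidate, perturbs it with Gaussian noise, and accepts the
    perturbation only when it lowers the loss. No gradients are used.
    """
    rng = random.Random(seed)
    best = list(params)
    best_loss = loss(best)
    for _ in range(iters):
        candidate = [p + rng.gauss(0.0, step) for p in best]
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss


if __name__ == "__main__":
    def quadratic(p):
        """Toy loss with its minimum at (1, 1)."""
        return sum((v - 1.0) ** 2 for v in p)

    print(local_search(quadratic, [5.0, -3.0], step=0.5))
```

The perturbation scale `step` plays a role loosely analogous to a learning rate, which is one reason hyperparameter choices that suit gradient descent need not transfer to this kind of search.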

Expressive power of tensor-network factorizations for probabilistic modeling

Title Expressive power of tensor-network factorizations for probabilistic modeling
Authors Ivan Glasser, Ryan Sweke, Nicola Pancotti, Jens Eisert, Ignacio Cirac
Abstract Tensor-network techniques have recently proven useful in machine learning, both as a tool for the formulation of new learning algorithms and for enhancing the mathematical understanding of existing methods. Inspired by these developments, and the natural correspondence between tensor networks and probabilistic graphical models, we provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. These factorizations include non-negative tensor-trains/MPS, which are in correspondence with hidden Markov models, and Born machines, which are naturally related to the probabilistic interpretation of quantum circuits. When used to model probability distributions, they exhibit tractable likelihoods and admit efficient learning algorithms. Interestingly, we prove that there exist probability distributions for which there are unbounded separations between the resource requirements of some of these tensor-network factorizations. Of particular interest, using complex instead of real tensors can lead to an arbitrarily large reduction in the number of parameters of the network. Additionally, we introduce locally purified states (LPS), a new factorization inspired by techniques for the simulation of quantum systems, with provably better expressive power than all other representations considered. The ramifications of this result are explored through numerical experiments.
Tasks Tensor Networks
Published 2019-12-01
URL http://papers.nips.cc/paper/8429-expressive-power-of-tensor-network-factorizations-for-probabilistic-modeling
PDF http://papers.nips.cc/paper/8429-expressive-power-of-tensor-network-factorizations-for-probabilistic-modeling.pdf
PWC https://paperswithcode.com/paper/expressive-power-of-tensor-network-1
Repo https://github.com/glivan/tensor_networks_for_probabilistic_modeling
Framework none