October 20, 2019

3152 words 15 mins read

Paper Group AWR 244

Towards Machine Learning-Based Optimal HAS. Deep Semantic Segmentation in an AUV for Online Posidonia Oceanica Meadows identification. NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System. Higher-order Coreference Resolution with Coarse-to-fine Inference. Generating Adaptive and Robust Filter Sets Using …

Towards Machine Learning-Based Optimal HAS

Title Towards Machine Learning-Based Optimal HAS
Authors Christian Sieber, Korbinian Hagn, Christian Moldovan, Tobias Hoßfeld, Wolfgang Kellerer
Abstract Mobile video consumption is increasing and sophisticated video quality adaptation strategies are required to deal with mobile throughput fluctuations. These adaptation strategies have to keep the switching frequency low, the average quality high and prevent stalling occurrences to ensure customer satisfaction. This paper proposes a novel methodology for the design of machine learning-based adaptation logics named HASBRAIN. Furthermore, the performance of a trained neural network against two algorithms from the literature is evaluated. We first use a modified existing optimization formulation to calculate optimal adaptation paths with a minimum number of quality switches for a wide range of videos and for challenging mobile throughput patterns. Afterwards we use the resulting optimal adaptation paths to train and compare different machine learning models. The evaluation shows that an artificial neural network-based model can reach a high average quality with a low number of switches in the mobile scenario. The proposed methodology is general enough to be extended for further designs of machine learning-based algorithms and the provided model can be deployed in on-demand streaming scenarios or be further refined using reward-based mechanisms such as reinforcement learning. All tools, models and datasets created during the work are provided as open-source software.
Tasks
Published 2018-08-24
URL http://arxiv.org/abs/1808.08065v1
PDF http://arxiv.org/pdf/1808.08065v1.pdf
PWC https://paperswithcode.com/paper/towards-machine-learning-based-optimal-has
Repo https://github.com/csieber/pydashsim
Framework tf
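
The supervised step the abstract describes (imitating precomputed optimal adaptation paths with a neural network) can be illustrated with a small sketch. Everything below, including the features, the synthetic "optimal" labels and the MLPClassifier setup, is an assumption for illustration and not the HASBRAIN pipeline or dataset.

```python
# Minimal sketch: imitate optimal HAS adaptation decisions with a small MLP.
# The synthetic "optimal paths" below are placeholders, not the HASBRAIN data.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
buffer_s = rng.uniform(0, 30, n)          # playout buffer level [s]
throughput = rng.uniform(0.3, 8.0, n)     # recent throughput estimate [Mbit/s]
# Placeholder "optimal" label: pick one of 4 quality levels from throughput,
# but drop one level when the buffer is nearly empty (stalling risk).
label = np.digitize(throughput, [1.0, 2.5, 5.0])
label = np.where(buffer_s < 5, np.maximum(label - 1, 0), label)

X = np.column_stack([buffer_s, throughput])
X_tr, X_te, y_tr, y_te = train_test_split(X, label, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("imitation accuracy:", clf.score(X_te, y_te))
```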

Deep Semantic Segmentation in an AUV for Online Posidonia Oceanica Meadows identification

Title Deep Semantic Segmentation in an AUV for Online Posidonia Oceanica Meadows identification
Authors Miguel Martin-Abadal, Eric Guerrero-Font, Francisco Bonin-Font, Yolanda Gonzalez-Cid
Abstract Recent studies have shown evidence of a significant decline of the Posidonia oceanica (P.O.) meadows on a global scale. The monitoring and mapping of these meadows are fundamental tools for measuring their status. We present an approach based on a deep neural network to automatically perform high-precision semantic segmentation of P.O. meadows in sea-floor images, offering several improvements over state-of-the-art techniques. Our network demonstrates outstanding performance over diverse test sets, reaching a precision of 96.57% and an accuracy of 96.81%, surpassing the reliability of labelling the images manually. Also, the network is implemented in an Autonomous Underwater Vehicle (AUV), performing online P.O. segmentation, which will be used to generate real-time semantic coverage maps.
Tasks Semantic Segmentation
Published 2018-06-22
URL http://arxiv.org/abs/1807.03117v2
PDF http://arxiv.org/pdf/1807.03117v2.pdf
PWC https://paperswithcode.com/paper/deep-semantic-segmentation-in-an-auv-for
Repo https://github.com/srv/Posidonia-semantic-segmentation
Framework tf
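
To make the online-segmentation idea concrete, here is a minimal sketch of dense per-pixel P.O. prediction with a tiny fully convolutional network in PyTorch. The architecture, input size and random weights are placeholders; the paper's network and training procedure are not reproduced.

```python
# Minimal sketch of per-pixel P.O. segmentation with a tiny fully convolutional
# network. The architecture and random weights are placeholders only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(32, 1, 1)  # 1 channel: P.O. vs background

    def forward(self, x):
        h = self.enc(x)
        logits = self.head(h)
        # Upsample back to the input resolution for a dense prediction.
        return F.interpolate(logits, size=x.shape[-2:], mode="bilinear",
                             align_corners=False)

net = TinySegNet().eval()
img = torch.rand(1, 3, 240, 320)          # stand-in for a sea-floor frame
with torch.no_grad():
    prob = torch.sigmoid(net(img))        # per-pixel P.O. probability
mask = (prob > 0.5).squeeze().numpy()
print(mask.shape, mask.mean())
```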

NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System

Title NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System
Authors Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, Michael D. Ernst
Abstract We present new data and semantic parsing methods for the problem of mapping English sentences to Bash commands (NL2Bash). Our long-term goal is to enable any user to perform operations such as file manipulation, search, and application-specific scripting by simply stating their goals in English. We take a first step in this domain, by providing a new dataset of challenging but commonly used Bash commands and expert-written English descriptions, along with baseline methods to establish performance levels on this task.
Tasks Semantic Parsing
Published 2018-02-25
URL http://arxiv.org/abs/1802.08979v2
PDF http://arxiv.org/pdf/1802.08979v2.pdf
PWC https://paperswithcode.com/paper/nl2bash-a-corpus-and-semantic-parser-for
Repo https://github.com/TellinaTool/nl2bash
Framework tf
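
As a rough illustration of the NL-to-Bash task, the sketch below implements a trivial retrieval baseline: return the command whose English description is most similar to the query under TF-IDF cosine similarity. The (description, command) pairs are invented for the example and are not drawn from the NL2Bash corpus, nor is this necessarily one of the paper's baselines.

```python
# Minimal retrieval baseline for NL -> Bash: return the command whose English
# description is closest to the query under TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pairs = [
    ("list all files in the current directory including hidden ones", "ls -la"),
    ("find files larger than 100 megabytes under /var", "find /var -size +100M"),
    ("count the number of lines in file.txt", "wc -l file.txt"),
    ("search recursively for the string TODO in python files",
     "grep -rn --include='*.py' TODO ."),
]
descriptions = [d for d, _ in pairs]

vec = TfidfVectorizer()
D = vec.fit_transform(descriptions)

def nl2bash(query: str) -> str:
    q = vec.transform([query])
    best = cosine_similarity(q, D).argmax()
    return pairs[best][1]

print(nl2bash("how many lines are in file.txt"))   # -> wc -l file.txt
```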

Higher-order Coreference Resolution with Coarse-to-fine Inference

Title Higher-order Coreference Resolution with Coarse-to-fine Inference
Authors Kenton Lee, Luheng He, Luke Zettlemoyer
Abstract We introduce a fully differentiable approximation to higher-order inference for coreference resolution. Our approach uses the antecedent distribution from a span-ranking architecture as an attention mechanism to iteratively refine span representations. This enables the model to softly consider multiple hops in the predicted clusters. To alleviate the computational cost of this iterative process, we introduce a coarse-to-fine approach that incorporates a less accurate but more efficient bilinear factor, enabling more aggressive pruning without hurting accuracy. Compared to the existing state-of-the-art span-ranking approach, our model significantly improves accuracy on the English OntoNotes benchmark, while being far more computationally efficient.
Tasks Coreference Resolution
Published 2018-04-15
URL http://arxiv.org/abs/1804.05392v1
PDF http://arxiv.org/pdf/1804.05392v1.pdf
PWC https://paperswithcode.com/paper/higher-order-coreference-resolution-with
Repo https://github.com/kkjawz/coref-ee
Framework tf
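
The coarse-to-fine idea can be sketched in a few lines: a cheap bilinear score g_i^T W g_j ranks candidate antecedents so that only the top-K survive for the expensive fine scorer. The span embeddings, dimensions and K below are placeholder assumptions.

```python
# Minimal sketch of coarse-to-fine antecedent pruning: a cheap bilinear score
# keeps only the top-K candidate antecedents per span before fine scoring.
import torch

torch.manual_seed(0)
num_spans, dim, top_k = 50, 64, 10

g = torch.randn(num_spans, dim)              # span representations (placeholders)
W = torch.randn(dim, dim)                    # coarse bilinear weights

coarse = g @ W @ g.t()                       # [num_spans, num_spans]
# A span may only attend to earlier spans as antecedents.
mask = torch.tril(torch.ones(num_spans, num_spans), diagonal=-1).bool()
coarse = coarse.masked_fill(~mask, float("-inf"))

# Keep the K highest-scoring antecedents per span; the fine scorer would only
# be evaluated on these pairs. Rows with fewer than K valid antecedents keep
# -inf scores, which downstream scoring would mask out.
top_scores, top_idx = coarse.topk(top_k, dim=1)
print(top_idx[5])   # candidate antecedents retained for span 5
```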

Generating Adaptive and Robust Filter Sets Using an Unsupervised Learning Framework

Title Generating Adaptive and Robust Filter Sets Using an Unsupervised Learning Framework
Authors Mohit Prabhushankar, Dogancan Temel, Ghassan AlRegib
Abstract In this paper, we introduce an adaptive unsupervised learning framework, which utilizes natural images to train filter sets. The applicability of these filter sets is demonstrated by evaluating their performance in two contrasting applications - image quality assessment and texture retrieval. While assessing image quality, the filters need to capture perceptual differences based on dissimilarities between a reference image and its distorted version. In texture retrieval, the filters need to assess similarity between texture images to retrieve closest matching textures. Based on experiments, we show that the filter responses span a set in which a monotonicity-based metric can measure both the perceptual dissimilarity of natural images and the similarity of texture images. In addition, we corrupt the images in the test set and demonstrate that the proposed method leads to robust and reliable retrieval performance compared to existing methods.
Tasks Image Quality Assessment
Published 2018-11-21
URL http://arxiv.org/abs/1811.08927v1
PDF http://arxiv.org/pdf/1811.08927v1.pdf
PWC https://paperswithcode.com/paper/generating-adaptive-and-robust-filter-sets
Repo https://github.com/olivesgatech/Adaptive-and-Robust-Filter-Sets
Framework none
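
Purely as a hedged stand-in for the framework described above (the paper's training scheme and metric are not reproduced), the sketch below learns a small filter dictionary from image patches and compares the filter responses of a reference and a distorted image with one monotonicity-based measure, the Spearman rank correlation.

```python
# Hedged sketch only: learn a small filter dictionary from image patches and
# compare filter responses of a reference vs. a distorted image with a
# monotonicity-based measure (Spearman correlation).
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
reference = rng.random((64, 64))                       # stand-in natural image
distorted = np.clip(reference + 0.1 * rng.standard_normal((64, 64)), 0, 1)

def patch_features(img, dico=None):
    patches = extract_patches_2d(img, (8, 8), max_patches=500, random_state=0)
    flat = patches.reshape(len(patches), -1)
    flat -= flat.mean(axis=1, keepdims=True)
    if dico is None:
        dico = MiniBatchDictionaryLearning(n_components=16, random_state=0)
        dico.fit(flat)
    return dico, dico.transform(flat).mean(axis=0)     # mean filter response

dico, ref_resp = patch_features(reference)
_, dist_resp = patch_features(distorted, dico)
rho, _ = spearmanr(ref_resp, dist_resp)
print("monotonicity of filter responses:", rho)
```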

AU R-CNN: Encoding Expert Prior Knowledge into R-CNN for Action Unit Detection

Title AU R-CNN: Encoding Expert Prior Knowledge into R-CNN for Action Unit Detection
Authors Chen Ma, Li Chen, Junhai Yong
Abstract Detecting action units (AUs) on human faces is challenging because various AUs cause subtle changes in facial appearance over various regions at different scales. Current works have attempted to recognize AUs by emphasizing important regions. However, the incorporation of expert prior knowledge into region definition remains under-exploited, and current AU detection approaches do not use regional convolutional neural networks (R-CNN) with expert prior knowledge to focus adaptively on AU-related regions. By incorporating expert prior knowledge, we propose a novel R-CNN based model named AU R-CNN. The proposed solution offers two main contributions: (1) AU R-CNN directly observes different facial regions, where various AUs are located. Specifically, we define an AU partition rule which encodes the expert prior knowledge into the region definition and RoI-level label definition. This design produces considerably better detection performance than existing approaches. (2) We integrate various dynamic models (including convolutional long short-term memory, two-stream networks, conditional random fields, and a temporal action localization network) into AU R-CNN and then investigate and analyze the reasons behind the performance of dynamic models. Experimental results demonstrate that the AU R-CNN using only static RGB image information, with no optical flow, surpasses the variants fused with dynamic models. AU R-CNN is also superior to traditional CNNs that use the same backbone on varying image resolutions. State-of-the-art recognition performance for AU detection is achieved. The complete network is end-to-end trainable. Experiments on the BP4D and DISFA datasets show the effectiveness of our approach. The implementation code is available online.
Tasks Action Unit Detection, Temporal Action Localization
Published 2018-12-14
URL https://arxiv.org/abs/1812.05788v2
PDF https://arxiv.org/pdf/1812.05788v2.pdf
PWC https://paperswithcode.com/paper/au-r-cnn-encoding-expert-prior-knowledge-into
Repo https://github.com/sharpstill/AU_R-CNN
Framework none
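
A minimal sketch of the RoI mechanism the abstract relies on: crop AU-related regions from a backbone feature map with RoI-Align and score each region with a multi-label sigmoid head. The boxes, feature map and 12-AU head below are hypothetical; the paper derives its regions from an expert AU partition rule over facial landmarks.

```python
# Minimal sketch of the RoI idea: crop AU-related facial regions from a feature
# map with RoI-Align and classify each region with a multi-label (sigmoid) head.
# The boxes below are placeholders, not the paper's expert partition rule.
import torch
import torch.nn as nn
from torchvision.ops import roi_align

feat = torch.randn(1, 64, 56, 56)          # backbone feature map (stride 4)
# Hypothetical AU-region boxes in image coordinates: (x1, y1, x2, y2).
boxes = torch.tensor([[40.,  40., 120.,  90.],    # e.g. brow/eye region
                      [60., 120., 160., 200.]])   # e.g. mouth region
rois = torch.cat([torch.zeros(len(boxes), 1), boxes], dim=1)  # prepend batch idx

pooled = roi_align(feat, rois, output_size=(7, 7), spatial_scale=0.25)
head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 7 * 7, 12))  # 12 AUs (example)
au_probs = torch.sigmoid(head(pooled))     # per-region multi-label AU scores
print(au_probs.shape)                      # (num_regions, num_AUs)
```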

Compositional coding capsule network with k-means routing for text classification

Title Compositional coding capsule network with k-means routing for text classification
Authors Hao Ren, Hong Lu
Abstract Text classification is a challenging problem which aims to identify the category of texts. Recently, Capsule Networks (CapsNets) have been proposed for image classification. It has been shown that CapsNets have several advantages over Convolutional Neural Networks (CNNs), while their validity in the text domain has been less explored. An effective method named deep compositional code learning has been proposed lately. This method can save many word-embedding parameters without any significant sacrifice in performance. In this paper, we introduce the Compositional Coding (CC) mechanism between capsules, and we propose a new routing algorithm, which is based on k-means clustering theory. Experiments conducted on eight challenging text classification datasets show that the proposed method achieves competitive accuracy compared to the state-of-the-art approach with significantly fewer parameters.
Tasks Sentiment Analysis, Text Classification
Published 2018-10-22
URL http://arxiv.org/abs/1810.09177v3
PDF http://arxiv.org/pdf/1810.09177v3.pdf
PWC https://paperswithcode.com/paper/compositional-coding-capsule-network-with-k
Repo https://github.com/leftthomas/CCCapsNet
Framework pytorch
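
A hedged sketch of what k-means-style routing between capsule layers can look like: treat the output capsules as cluster centres, weight each lower capsule's prediction by its similarity to the centres, and recompute the centres as weighted means. The shapes and the exact update rule are my reading of the idea, not the authors' implementation.

```python
# Hedged sketch of k-means-style routing between capsule layers: output
# capsules act as cluster centres that are iteratively refined from the
# similarity-weighted predictions of the lower capsules.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_in, n_out, dim = 32, 10, 16
u_hat = torch.randn(n_in, n_out, dim)       # predictions of lower capsules

v = u_hat.mean(dim=0)                       # initial centres [n_out, dim]
for _ in range(3):                          # routing iterations
    sim = F.cosine_similarity(u_hat, v.unsqueeze(0), dim=-1)   # [n_in, n_out]
    c = F.softmax(sim, dim=1)               # soft assignment of each input
    v = (c.unsqueeze(-1) * u_hat).sum(dim=0) / c.sum(dim=0).unsqueeze(-1)

print(v.shape)                              # [n_out, dim] routed capsule outputs
```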

Novel Deep Learning Model for Traffic Sign Detection Using Capsule Networks

Title Novel Deep Learning Model for Traffic Sign Detection Using Capsule Networks
Authors Amara Dinesh Kumar
Abstract Convolutional neural networks are the most widely used deep learning algorithms for traffic sign classification to date, but they fail to capture the pose, view and orientation of images because of the intrinsic limitations of the max-pooling layer. This paper proposes a novel method for traffic sign detection using a deep learning architecture called capsule networks that achieves outstanding performance on the German traffic sign dataset. A capsule network consists of capsules, which are groups of neurons representing the instantiation parameters of an object, such as pose and orientation, using the dynamic routing and routing-by-agreement algorithms. Unlike previous approaches based on manual feature extraction or multiple deep neural networks with many parameters, our method eliminates the manual effort and provides resistance to spatial variances. CNNs can be fooled easily using various adversarial attacks, whereas capsule networks can overcome such attacks and offer more reliability in traffic sign detection for autonomous vehicles. The capsule network achieves a state-of-the-art accuracy of 97.6% on the German Traffic Sign Recognition Benchmark (GTSRB).
Tasks Traffic Sign Recognition
Published 2018-05-11
URL http://arxiv.org/abs/1805.04424v1
PDF http://arxiv.org/pdf/1805.04424v1.pdf
PWC https://paperswithcode.com/paper/novel-deep-learning-model-for-traffic-sign
Repo https://github.com/dineshresearch/Novel-Deep-Learning-Model-for-Traffic-Sign-Detection-Using-Capsule-Networks
Framework tf
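
For concreteness, here is the margin loss commonly used to train capsule classifiers (following Sabour et al.), applied to capsule output lengths for a 43-class problem such as GTSRB. The hyperparameters are the usual defaults and are not taken from this paper.

```python
# Sketch of the standard capsule-network margin loss applied to capsule output
# lengths for a 43-class sign problem such as GTSRB.
import torch
import torch.nn.functional as F

def margin_loss(lengths, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    # lengths: [batch, num_classes] norms of the class capsules
    t = F.one_hot(target, num_classes=lengths.size(1)).float()
    pos = t * F.relu(m_pos - lengths).pow(2)
    neg = lam * (1 - t) * F.relu(lengths - m_neg).pow(2)
    return (pos + neg).sum(dim=1).mean()

lengths = torch.rand(8, 43)                 # stand-in capsule lengths
target = torch.randint(0, 43, (8,))
print(margin_loss(lengths, target).item())
```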

Learning from the Syndrome

Title Learning from the Syndrome
Authors Loren Lugosch, Warren J. Gross
Abstract In this paper, we introduce the syndrome loss, an alternative loss function for neural error-correcting decoders based on a relaxation of the syndrome. The syndrome loss penalizes the decoder for producing outputs that do not correspond to valid codewords. We show that training with the syndrome loss yields decoders with consistently lower frame error rate for a number of short block codes, at little additional cost during training and no additional cost during inference. The proposed method does not depend on knowledge of the transmitted codeword, making it a promising tool for online adaptation to changing channel conditions.
Tasks
Published 2018-10-23
URL http://arxiv.org/abs/1810.10902v1
PDF http://arxiv.org/pdf/1810.10902v1.pdf
PWC https://paperswithcode.com/paper/181010902
Repo https://github.com/lorenlugosch/neural-min-sum-decoding
Framework tf
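
One natural relaxation of the syndrome can be sketched as follows (this is illustrative and not necessarily the paper's exact formulation): for each parity check, take the product of tanh(llr/2) over its bits, which approaches +1 when the decoder output is a confidently valid codeword, and penalize any shortfall from 1.

```python
# Illustrative soft-syndrome penalty (one natural relaxation of the syndrome;
# not necessarily the paper's exact loss).
import torch

def soft_syndrome_loss(llr, H):
    # llr: [batch, n] decoder output log-likelihood ratios
    # H:   [m, n] binary parity-check matrix
    soft_bits = torch.tanh(llr / 2.0)                     # in (-1, 1)
    # Product over the bits participating in each check: mask non-members to 1.
    masked = torch.where(H.bool().unsqueeze(0), soft_bits.unsqueeze(1),
                         torch.ones_like(soft_bits).unsqueeze(1))
    soft_parity = masked.prod(dim=-1)                     # [batch, m]
    return torch.relu(1.0 - soft_parity).mean()

# Parity-check matrix of the (7,4) Hamming code (columns are 1..7 in binary).
H = torch.tensor([[0, 0, 0, 1, 1, 1, 1],
                  [0, 1, 1, 0, 0, 1, 1],
                  [1, 0, 1, 0, 1, 0, 1]], dtype=torch.float32)
llr = torch.randn(4, 7) * 3                               # stand-in decoder output
print(soft_syndrome_loss(llr, H).item())
```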

Pruning Techniques for Mixed Ensembles of Genetic Programming Models

Title Pruning Techniques for Mixed Ensembles of Genetic Programming Models
Authors Mauro Castelli, Ivo Gonçalves, Luca Manzoni, Leonardo Vanneschi
Abstract The objective of this paper is to define an effective strategy for building an ensemble of Genetic Programming (GP) models. Ensemble methods are widely used in machine learning due to their features: they average out biases, they reduce the variance and they usually generalize better than single models. Despite these advantages, building ensembles of GP models is not a well-developed topic in the evolutionary computation community. To fill this gap, we propose a strategy that blends individuals produced by standard syntax-based GP and individuals produced by geometric semantic genetic programming, one of the newest semantics-based methods developed in GP. In fact, recent literature showed that combining syntax and semantics could improve the generalization ability of a GP model. Additionally, to improve the diversity of the GP models used to build up the ensemble, we propose different pruning criteria based on correlation and on entropy, a commonly used measure in information theory. Experimental results, obtained over different complex problems, suggest that the pruning criteria based on correlation and entropy can be effective in improving the generalization ability of the ensemble model and in reducing the computational burden required to build it.
Tasks
Published 2018-01-23
URL http://arxiv.org/abs/1801.07668v1
PDF http://arxiv.org/pdf/1801.07668v1.pdf
PWC https://paperswithcode.com/paper/pruning-techniques-for-mixed-ensembles-of
Repo https://github.com/evaboost/evaboost
Framework none
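
The correlation-based pruning criterion can be illustrated with a greedy filter: keep ensemble members one at a time and skip any whose predictions correlate too strongly with a member already kept. The threshold and greedy order are illustrative choices, not the paper's exact criteria, and the entropy-based variant is not shown.

```python
# Hedged illustration of correlation-based ensemble pruning: skip any member
# whose predictions correlate too strongly with one already kept.
import numpy as np

def prune_by_correlation(predictions, threshold=0.95):
    # predictions: [n_members, n_samples] outputs of each GP model
    kept = []
    for i in range(predictions.shape[0]):
        if all(abs(np.corrcoef(predictions[i], predictions[j])[0, 1]) < threshold
               for j in kept):
            kept.append(i)
    return kept

rng = np.random.default_rng(0)
base = rng.standard_normal(200)
preds = np.vstack([base + 0.05 * rng.standard_normal(200),   # near-duplicates
                   base + 0.05 * rng.standard_normal(200),
                   rng.standard_normal(200)])                 # diverse member
print(prune_by_correlation(preds))   # e.g. [0, 2]
```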

A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees

Title A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees
Authors Min Wu, Matthew Wicker, Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska
Abstract Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. In this paper, we study two variants of pointwise robustness, the maximum safe radius problem, which for a given input sample computes the minimum distance to an adversarial example, and the feature robustness problem, which aims to quantify the robustness of individual features to adversarial perturbations. We demonstrate that, under the assumption of Lipschitz continuity, both problems can be approximated using finite optimisation by discretising the input space, and the approximation has provable guarantees, i.e., the error is bounded. We then show that the resulting optimisation problems can be reduced to the solution of two-player turn-based games, where the first player selects features and the second perturbs the image within the feature. While the second player aims to minimise the distance to an adversarial example, depending on the optimisation objective the first player can be cooperative or competitive. We employ an anytime approach to solve the games, in the sense of approximating the value of a game by monotonically improving its upper and lower bounds. The Monte Carlo tree search algorithm is applied to compute upper bounds for both games, and the Admissible A* and the Alpha-Beta Pruning algorithms are, respectively, used to compute lower bounds for the maximum safety radius and feature robustness games. When working on the upper bound of the maximum safe radius problem, our tool demonstrates competitive performance against existing adversarial example crafting algorithms. Furthermore, we show how our framework can be deployed to evaluate pointwise robustness of neural networks in safety-critical applications such as traffic sign recognition in self-driving cars.
Tasks Adversarial Attack, Adversarial Defense, Self-Driving Cars, Traffic Sign Recognition
Published 2018-07-10
URL http://arxiv.org/abs/1807.03571v2
PDF http://arxiv.org/pdf/1807.03571v2.pdf
PWC https://paperswithcode.com/paper/a-game-based-approximate-verification-of-deep
Repo https://github.com/TrustAI/DeepGame
Framework tf
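
To give a feel for the discretisation the abstract mentions, the sketch below brute-forces per-pixel perturbations on a grid and records the smallest L_inf change that flips the label, which upper-bounds the maximum safe radius. The game formulation, MCTS, Admissible A* and Alpha-Beta Pruning used in the paper are not reproduced; the classifier and image are toy stand-ins.

```python
# Brute-force illustration only: discretize per-pixel perturbations and search
# for a label change; the distance of any example found upper-bounds the
# maximum safe radius.
import numpy as np

def upper_bound_safe_radius(classify, x, pixels, magnitudes=(0.1, 0.2, 0.3)):
    """classify: flat image -> label. Perturb one pixel of the chosen feature
    at a time on a discrete grid and record the smallest change that flips
    the label."""
    original = classify(x)
    best = np.inf
    for p in pixels:
        for m in magnitudes:
            for sign in (-1.0, 1.0):
                x_adv = x.copy()
                x_adv[p] = np.clip(x_adv[p] + sign * m, 0.0, 1.0)
                if classify(x_adv) != original:
                    best = min(best, abs(x_adv[p] - x[p]))
    return best

# Toy stand-in classifier: thresholds the mean intensity of a flat "image".
classify = lambda img: int(img.mean() > 0.5)
x = np.full(16, 0.49)
print(upper_bound_safe_radius(classify, x, pixels=range(16)))
```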

Total Recall: Understanding Traffic Signs using Deep Hierarchical Convolutional Neural Networks

Title Total Recall: Understanding Traffic Signs using Deep Hierarchical Convolutional Neural Networks
Authors Sourajit Saha, Sharif Amit Kamran, Ali Shihab Sabbir
Abstract Recognizing traffic signs using intelligent systems can drastically reduce the number of accidents happening worldwide. With the arrival of self-driving cars, it has become a staple challenge to solve the automatic recognition of traffic and hand-held signs in major streets. Various machine learning techniques, like Random Forests and SVMs, as well as deep learning models, have been proposed for classifying traffic signs. Though they reach state-of-the-art performance on a particular dataset, they fall short of tackling multiple traffic sign recognition benchmarks. In this paper, we propose a novel one-for-all architecture that aces multiple benchmarks with a better overall score than the state-of-the-art architectures. Our model is made of residual convolutional blocks with hierarchical dilated skip connections joined in steps. With this we score 99.33% accuracy on the German sign recognition benchmark and 99.17% accuracy on the Belgian traffic sign classification benchmark. Moreover, we propose a newly devised dilated residual learning representation technique which is very low in both memory and computational complexity.
Tasks Self-Driving Cars, Traffic Sign Recognition
Published 2018-08-30
URL http://arxiv.org/abs/1808.10524v2
PDF http://arxiv.org/pdf/1808.10524v2.pdf
PWC https://paperswithcode.com/paper/total-recall-understanding-traffic-signs
Repo https://github.com/Sourajit2110/DilatedSkipTotalRecall
Framework none
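
A residual block with a dilated convolution and an identity skip connection, the kind of building block the abstract describes, can be sketched as follows. The channel count and dilation rate are illustrative and not the paper's configuration.

```python
# Minimal sketch of a residual block with a dilated convolution and an identity
# skip connection; channel counts and dilation rates are illustrative.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, channels, dilation=2):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3,
                               padding=dilation, dilation=dilation)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)            # residual / skip connection

block = DilatedResidualBlock(32, dilation=2)
print(block(torch.randn(1, 32, 48, 48)).shape)   # spatial size preserved
```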

Revisiting Distributional Correspondence Indexing: A Python Reimplementation and New Experiments

Title Revisiting Distributional Correspondence Indexing: A Python Reimplementation and New Experiments
Authors Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani
Abstract This paper introduces PyDCI, a new implementation of Distributional Correspondence Indexing (DCI) written in Python. DCI is a transfer learning method for cross-domain and cross-lingual text classification for which we had provided an implementation (here called JaDCI) built on top of JaTeCS, a Java framework for text classification. PyDCI is a stand-alone version of DCI that exploits scikit-learn and the SciPy stack. We here report on new experiments that we have carried out in order to test PyDCI, and in which we use as baselines new high-performing methods that have appeared after DCI was originally proposed. These experiments show that, thanks to a few subtle ways in which we have improved DCI, PyDCI outperforms both JaDCI and the above-mentioned high-performing methods, and delivers the best known results on the two popular benchmarks on which we had tested DCI, i.e., MultiDomainSentiment (a.k.a. MDS – for cross-domain adaptation) and Webis-CLS-10 (for cross-lingual adaptation). PyDCI, together with the code allowing to replicate our experiments, is available at https://github.com/AlexMoreo/pydci .
Tasks Domain Adaptation, Sentiment Analysis, Text Classification, Transfer Learning
Published 2018-10-19
URL http://arxiv.org/abs/1810.09311v1
PDF http://arxiv.org/pdf/1810.09311v1.pdf
PWC https://paperswithcode.com/paper/revisiting-distributional-correspondence
Repo https://github.com/AlexMoreo/pydci
Framework none
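
The following is a toy rendering of the distributional-correspondence intuition only, not PyDCI's API: represent each term by its cosine correspondence with a few pivot terms' document-occurrence vectors, and a document by the mean profile of its terms. The pivots and documents are invented for the example.

```python
# Toy rendering of the distributional-correspondence idea (not PyDCI's API):
# each term gets a profile of cosine correspondences to pivot terms, and a
# document is embedded as the mean profile of its terms.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["great plot and great acting",
        "terrible acting and a boring plot",
        "excellent battery and great screen",
        "awful battery life and a terrible screen"]
pivots = ["great", "terrible"]            # domain-independent opinion words

vec = CountVectorizer()
X = vec.fit_transform(docs).toarray().T   # term-by-document occurrence matrix
vocab = vec.vocabulary_

pivot_vecs = X[[vocab[p] for p in pivots]]            # [n_pivots, n_docs]
profiles = cosine_similarity(X, pivot_vecs)           # each term -> pivot profile

def embed(doc):
    idx = [vocab[w] for w in vec.build_analyzer()(doc) if w in vocab]
    return profiles[idx].mean(axis=0)

print(embed("great screen"))      # leans towards the "great" pivot
print(embed("terrible plot"))     # leans towards the "terrible" pivot
```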

Fully Convolutional Networks for Automated Segmentation of Abdominal Adipose Tissue Depots in Multicenter Water-Fat MRI

Title Fully Convolutional Networks for Automated Segmentation of Abdominal Adipose Tissue Depots in Multicenter Water-Fat MRI
Authors Taro Langner, Anders Hedström, Katharina Mörwald, Daniel Weghuber, Anders Forslund, Peter Bergsten, Håkan Ahlström, Joel Kullberg
Abstract Purpose: An approach for the automated segmentation of visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT) in multicenter water-fat MRI scans of the abdomen was investigated, using two different neural network architectures. Methods: The two fully convolutional network architectures U-Net and V-Net were trained, evaluated and compared on the water-fat MRI data. Data of the study Tellus with 90 scans from a single center was used for a 10-fold cross-validation in which the most successful configuration for both networks was determined. These configurations were then tested on 20 scans of the multicenter study beta-cell function in JUvenile Diabetes and Obesity (BetaJudo), which involved a different study population and scanning device. Results: The U-Net outperformed the used implementation of the V-Net in both cross-validation and testing. In cross-validation, the U-Net reached average dice scores of 0.988 (VAT) and 0.992 (SAT). The average absolute quantification errors amount to 0.67% (VAT) and 0.39% (SAT). On the multi-center test data, the U-Net performs only slightly worse, with average dice scores of 0.970 (VAT) and 0.987 (SAT) and quantification errors of 2.80% (VAT) and 1.65% (SAT). Conclusion: The segmentations generated by the U-Net allow for reliable quantification and could therefore be viable for high-quality automated measurements of VAT and SAT in large-scale studies with minimal need for human intervention. The high performance on the multicenter test data furthermore shows the robustness of this approach for data of different patient demographics and imaging centers, as long as a consistent imaging protocol is used.
Tasks
Published 2018-06-26
URL http://arxiv.org/abs/1807.03122v5
PDF http://arxiv.org/pdf/1807.03122v5.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-networks-for-automated
Repo https://github.com/tarolangner/fcn_vatsat
Framework pytorch
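
The two reported measures, the Dice score and the relative quantification error, are easy to state in code. The masks below are random placeholders rather than VAT/SAT segmentations.

```python
# Sketch of the two reported measures for a binary mask: Dice score and the
# relative (absolute) volume quantification error, on placeholder masks.
import numpy as np

def dice_score(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def quantification_error(pred, truth):
    return abs(pred.sum() - truth.sum()) / truth.sum() * 100.0   # percent

rng = np.random.default_rng(0)
truth = rng.random((64, 64, 32)) > 0.7
pred = np.logical_xor(truth, rng.random(truth.shape) > 0.98)     # slightly off
print(f"Dice: {dice_score(pred, truth):.3f}, "
      f"error: {quantification_error(pred, truth):.2f}%")
```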

DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors

Title DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
Authors Arash Vahdat, Evgeny Andriyash, William G. Macready
Abstract Boltzmann machines are powerful distributions that have been shown to be an effective prior over binary latent variables in variational autoencoders (VAEs). However, previous methods for training discrete VAEs have used the evidence lower bound and not the tighter importance-weighted bound. We propose two approaches for relaxing Boltzmann machines to continuous distributions that permit training with importance-weighted bounds. These relaxations are based on generalized overlapping transformations and the Gaussian integral trick. Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. An implementation which reproduces these results is available at https://github.com/QuadrantAI/dvae .
Tasks Omniglot
Published 2018-05-18
URL http://arxiv.org/abs/1805.07445v4
PDF http://arxiv.org/pdf/1805.07445v4.pdf
PWC https://paperswithcode.com/paper/dvae-discrete-variational-autoencoders-with
Repo https://github.com/QuadrantAI/dvae
Framework tf
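
The importance-weighted bound that the abstract contrasts with the plain ELBO can be written down directly: with k samples z_i ~ q(z|x), it is the log of the average importance weight p(x, z_i)/q(z_i|x), computed stably with logsumexp. The log-densities below are random placeholders standing in for a trained encoder and decoder.

```python
# Sketch of the importance-weighted bound:
#   log (1/k) * sum_i p(x, z_i) / q(z_i | x),
# computed stably with logsumexp; compared against the plain ELBO average.
import math
import torch

def iwae_bound(log_p_xz, log_q_z):
    # log_p_xz, log_q_z: [k, batch] values of log p(x, z_i) and log q(z_i | x)
    k = log_p_xz.size(0)
    log_w = log_p_xz - log_q_z
    return torch.logsumexp(log_w, dim=0) - math.log(k)   # [batch]

k, batch = 50, 8
log_p_xz = -torch.rand(k, batch) * 10        # placeholder joint log-densities
log_q_z = -torch.rand(k, batch) * 10         # placeholder posterior log-densities
elbo = (log_p_xz - log_q_z).mean(dim=0)      # plain ELBO-style average
print("ELBO  :", elbo.mean().item())
print("IWAE-k:", iwae_bound(log_p_xz, log_q_z).mean().item())
```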