October 17, 2019

Paper Group ANR 963

Zero-Resource Neural Machine Translation with Multi-Agent Communication Game

Title Zero-Resource Neural Machine Translation with Multi-Agent Communication Game
Authors Yun Chen, Yang Liu, Victor O. K. Li
Abstract While end-to-end neural machine translation (NMT) has achieved notable success in the past years in translating a handful of resource-rich language pairs, it still suffers from the data scarcity problem for low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively exposed to large amounts of parallel corpora, our learners (implemented as encoder-decoder architecture) engage in cooperative image description games, and thus develop their own image captioning or neural machine translation model from the need to communicate in order to succeed at the game. Experimental results on the IAPR-TC12 and Multi30K datasets show that the proposed learning mechanism significantly improves over the state-of-the-art methods.
Tasks Image Captioning, Machine Translation
Published 2018-02-09
URL http://arxiv.org/abs/1802.03116v1
PDF http://arxiv.org/pdf/1802.03116v1.pdf
PWC https://paperswithcode.com/paper/zero-resource-neural-machine-translation-with
Repo
Framework

Noise generation for compression algorithms

Title Noise generation for compression algorithms
Authors Renata Khasanova, Jan Wassenberg, Jyrki Alakuijala
Abstract In various Computer Vision and Signal Processing applications, noise is typically perceived as a drawback of the image capturing system that ought to be removed. We, on the other hand, claim that image noise, just as texture, is important for visual perception and, therefore, critical for lossy compression algorithms that tend to make decompressed images look less realistic by removing small image details. In this paper we propose a physically and biologically inspired technique that learns a noise model at the encoding step of the compression algorithm and then generates the appropriate amount of additive noise at the decoding step. Our method can significantly increase the realism of the decompressed image at the cost of few bytes of additional memory space regardless of the original image size. The implementation of our method is open-sourced and available at https://github.com/google/pik.
Tasks
Published 2018-03-24
URL http://arxiv.org/abs/1803.09165v1
PDF http://arxiv.org/pdf/1803.09165v1.pdf
PWC https://paperswithcode.com/paper/noise-generation-for-compression-algorithms
Repo
Framework

Text-Independent Speaker Verification Based on Deep Neural Networks and Segmental Dynamic Time Warping

Title Text-Independent Speaker Verification Based on Deep Neural Networks and Segmental Dynamic Time Warping
Authors Mohamed Adel, Mohamed Afify, Akram Gaballah
Abstract In this paper we present a new method for text-independent speaker verification that combines segmental dynamic time warping (SDTW) and the d-vector approach. The d-vectors, generated from a feed-forward deep neural network trained to distinguish between speakers, are used as features to perform alignment and hence calculate the overall distance between the enrolment and test utterances. We present results on the NIST 2008 data set for speaker verification, where the proposed method outperforms the conventional i-vector baseline with PLDA scores and outperforms the d-vector approach with local distances based on cosine and PLDA scores. Score combination with the i-vector/PLDA baseline also leads to significant gains over both methods.
Tasks Speaker Verification, Text-Independent Speaker Verification
Published 2018-06-26
URL http://arxiv.org/abs/1806.09932v1
PDF http://arxiv.org/pdf/1806.09932v1.pdf
PWC https://paperswithcode.com/paper/text-independent-speaker-verification-based
Repo
Framework
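
As a rough illustration of the alignment idea in the abstract above, the sketch below computes a dynamic time warping distance between two sequences of frame-level embeddings using cosine local distances. It is only a sketch under stated assumptions: it uses plain DTW rather than the segmental variant the paper describes, the d-vectors are stand-in random arrays, and the names `cosine_distance_matrix` and `dtw_distance` as well as the path-length normalization are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def cosine_distance_matrix(a, b):
    """Pairwise cosine distance between rows of a (n x d) and b (m x d).
    Assumes no row is the zero vector."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a_n @ b_n.T

def dtw_distance(enrol, test):
    """Plain DTW over cosine local distances (not segmental DTW).
    The result is divided by n + m, an assumed normalization."""
    local = cosine_distance_matrix(enrol, test)
    n, m = local.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Best predecessor: insertion, deletion, or match.
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return acc[n, m] / (n + m)

# Stand-in "d-vectors": 5 frames of 4-dimensional embeddings.
rng = np.random.default_rng(0)
enrol = rng.normal(size=(5, 4))
stretched = np.repeat(enrol, 2, axis=0)  # same content, half the speaking rate
print(dtw_distance(enrol, stretched))
```

Because DTW may align one frame with several consecutive frames, the time-stretched copy aligns at essentially zero cost, which is the property that makes warping-based scoring tolerant of differences in speaking rate.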

CAAD 2018: Iterative Ensemble Adversarial Attack

Title CAAD 2018: Iterative Ensemble Adversarial Attack
Authors Jiayang Liu, Weiming Zhang, Nenghai Yu
Abstract Deep Neural Networks (DNNs) have recently led to significant improvements in many fields. However, DNNs are vulnerable to adversarial examples, which are samples with imperceptible perturbations that nonetheless dramatically mislead the DNNs. Adversarial attacks can be used to evaluate the robustness of deep learning models before they are deployed. Unfortunately, most existing adversarial attacks can only fool a black-box model with a low success rate. To improve the success rates for black-box adversarial attacks, we proposed an iterated adversarial attack against an ensemble of image classifiers. With this method, we won 5th place in the CAAD 2018 Targeted Adversarial Attack competition.
Tasks Adversarial Attack
Published 2018-11-07
URL http://arxiv.org/abs/1811.03456v1
PDF http://arxiv.org/pdf/1811.03456v1.pdf
PWC https://paperswithcode.com/paper/caad-2018-iterative-ensemble-adversarial
Repo
Framework
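
The abstract above does not spell out the attack, so the following is only a toy sketch of one common way to iterate a targeted attack against an ensemble: take signed-gradient steps on the loss averaged over all models and clip the perturbation to a fixed budget. Everything here is an assumption made for illustration: the models are tiny linear logistic classifiers rather than image networks, and `iterative_ensemble_attack`, `eps`, `alpha`, and `steps` are hypothetical names and values, not the authors' settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ensemble_target_loss(x, weights, biases):
    """Mean cross-entropy toward the target class (label 1) over all models."""
    logits = weights @ x + biases
    return float(np.mean(np.log1p(np.exp(-logits))))

def iterative_ensemble_attack(x0, weights, biases, eps=0.1, alpha=0.02, steps=10):
    """Signed-gradient descent on the ensemble-averaged target loss,
    keeping the perturbation inside an L-infinity ball of radius eps."""
    x = x0.copy()
    for _ in range(steps):
        logits = weights @ x + biases
        # d/dx of softplus(-z_k) is -(1 - sigmoid(z_k)) * w_k; average over models.
        grad = -((1.0 - sigmoid(logits))[:, None] * weights).mean(axis=0)
        x = x - alpha * np.sign(grad)
        x = np.clip(x, x0 - eps, x0 + eps)
    return x

# Toy ensemble: three linear models (one weight row each) on 3-dimensional inputs.
weights = np.array([[1.0, -2.0, 0.5],
                    [0.5, -1.0, 1.0],
                    [2.0, -1.0, 0.5]])
biases = np.zeros(3)
x0 = np.zeros(3)
x_adv = iterative_ensemble_attack(x0, weights, biases)
print(ensemble_target_loss(x0, weights, biases),
      ensemble_target_loss(x_adv, weights, biases))
```

Averaging the loss over several models is one way to discourage the perturbation from overfitting to a single classifier, which is the usual motivation for ensemble attacks in black-box settings.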

Estimation and Restoration of Compositional Degradation Using Convolutional Neural Networks

Title Estimation and Restoration of Compositional Degradation Using Convolutional Neural Networks
Authors Kazutaka Uchida, Masayuki Tanaka, Masatoshi Okutomi
Abstract Image restoration from a single image degradation type, such as blurring, hazing, random noise, and compression has been investigated for decades. However, image degradations in practice are often a mixture of several types of degradation. Such compositional degradations complicate restoration because they require the differentiation of different degradation types and levels. In this paper, we propose a convolutional neural network (CNN) model for estimating the degradation properties of a given degraded image. Furthermore, we introduce an image restoration CNN model that adopts the estimated degradation properties as its input. Experimental results show that the proposed degradation estimation model can successfully infer the degradation properties of compositionally degraded images. The proposed restoration model can restore degraded images by exploiting the estimated degradation properties and can achieve both blind and nonblind image restorations.
Tasks Image Restoration
Published 2018-12-23
URL http://arxiv.org/abs/1812.09629v1
PDF http://arxiv.org/pdf/1812.09629v1.pdf
PWC https://paperswithcode.com/paper/estimation-and-restoration-of-compositional
Repo
Framework

Bilinear Factorization For Low-Rank SDP Learning

Title Bilinear Factorization For Low-Rank SDP Learning
Authors En-Liang Hu
Abstract Many machine learning problems can be reduced to learning a low-rank positive semidefinite matrix (denoted as $Z$), which leads to a semidefinite program (SDP). Existing SDP solvers are often expensive for large-scale learning. To avoid directly solving the SDP, some works convert it into a nonconvex program by factorizing $Z$ quadratically as $XX^\top$. However, this introduces higher-order nonlinearity, leaving little structure to exploit in subsequent optimization. In this paper, we propose a novel surrogate for SDP-related learning, in which the structure of the subproblems is exploited. More specifically, we replace the unconstrained SDP with a biconvex problem by factorizing $Z$ bilinearly as $XY^\top$ and using a Courant penalty to penalize the difference between $X$ and $Y$, so that the resulting subproblems are convex. Furthermore, we provide a theoretical bound for the associated penalty parameter under the assumption that the subobjective function is $L$-Lipschitz-smooth and $\sigma$-strongly convex, such that the proposed surrogate solves the original SDP when the penalty parameter is larger than this bound, that is, $\gamma>\frac{1}{4}(L-\sigma)$. Experiments on two SDP-related machine learning applications demonstrate that the proposed algorithm is as accurate as the state-of-the-art, but is faster on large-scale learning.
Tasks
Published 2018-11-03
URL https://arxiv.org/abs/1811.01198v4
PDF https://arxiv.org/pdf/1811.01198v4.pdf
PWC https://paperswithcode.com/paper/biconvex-landscape-in-sdp-related-learning
Repo
Framework
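
The chain of reformulations described in the abstract above can be written out schematically as follows. This is a hedged reading rather than the paper's exact statement: the objective symbol $f$, the Frobenius norm, and the $\frac{\gamma}{2}$ scaling of the penalty are assumptions chosen for illustration; only the factorizations $XX^\top$ and $XY^\top$ and the bound on $\gamma$ come from the abstract.

$$
\min_{Z \succeq 0} f(Z)
\;\longrightarrow\;
\min_{X} f(XX^\top)
\;\longrightarrow\;
\min_{X,Y} f(XY^\top) + \frac{\gamma}{2}\,\|X - Y\|_F^2,
\qquad \gamma > \tfrac{1}{4}(L-\sigma).
$$

With $Y$ held fixed, the map $X \mapsto XY^\top$ is linear, so a convex $f$ yields a convex subproblem in $X$ (and symmetrically in $Y$), which is what makes the surrogate biconvex.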

Viewpoint Estimation-Insights & Model

Title Viewpoint Estimation-Insights & Model
Authors Gilad Divon, Ayellet Tal
Abstract This paper addresses the problem of viewpoint estimation of an object in a given image. It presents five key insights that should be taken into consideration when designing a CNN that solves the problem. Based on these insights, the paper proposes a network in which (i) the architecture jointly solves detection, classification, and viewpoint estimation; (ii) new types of data are added and trained on; and (iii) a novel loss function, which takes into account both the geometry of the problem and the new types of data, is proposed. Our network improves the state-of-the-art results for this problem by 9.8%.
Tasks Viewpoint Estimation
Published 2018-07-03
URL http://arxiv.org/abs/1807.01312v1
PDF http://arxiv.org/pdf/1807.01312v1.pdf
PWC https://paperswithcode.com/paper/viewpoint-estimation-insights-model
Repo
Framework

The Wasserstein transform

Title The Wasserstein transform
Authors Facundo Mémoli, Zane Smith, Zhengchao Wan
Abstract We introduce the Wasserstein transform, a method for enhancing and denoising datasets defined on general metric spaces. The construction draws inspiration from Optimal Transportation ideas. We establish precise connections with the mean shift family of algorithms and establish the stability of both our method and mean shift under data perturbation.
Tasks Denoising
Published 2018-10-17
URL http://arxiv.org/abs/1810.07793v1
PDF http://arxiv.org/pdf/1810.07793v1.pdf
PWC https://paperswithcode.com/paper/the-wasserstein-transform
Repo
Framework

A new Taxonomy of Continuous Global Optimization Algorithms

Title A new Taxonomy of Continuous Global Optimization Algorithms
Authors Jörg Stork, A. E. Eiben, Thomas Bartz-Beielstein
Abstract Surrogate-based optimization and nature-inspired metaheuristics have become the state-of-the-art in solving real-world optimization problems. Still, it is difficult for beginners and even experts to get an overview that explains their advantages in comparison to the large number of available methods in the scope of continuous optimization. Available taxonomies lack the integration of surrogate-based approaches and thus their embedding in the larger context of this broad field. This article presents a taxonomy of the field, which further matches the idea of nature-inspired algorithms, as it is based on the human behavior in path finding. Intuitive analogies make it easy to conceive the most basic principles of the search algorithms, even for beginners and non-experts in this area of research. However, this scheme does not oversimplify the high complexity of the different algorithms, as the class identifier only defines a descriptive meta-level of the algorithm search strategies. The taxonomy was established by exploring and matching algorithm schemes, extracting similarities and differences, and creating a set of classification indicators to distinguish between five distinct classes. In practice, this taxonomy allows recommendations for the applicability of the corresponding algorithms and helps developers trying to create or improve their own algorithms.
Tasks
Published 2018-08-27
URL http://arxiv.org/abs/1808.08818v1
PDF http://arxiv.org/pdf/1808.08818v1.pdf
PWC https://paperswithcode.com/paper/a-new-taxonomy-of-continuous-global
Repo
Framework

Approximation Algorithms for D-optimal Design

Title Approximation Algorithms for D-optimal Design
Authors Mohit Singh, Weijun Xie
Abstract Experimental design is a classical statistics problem and its aim is to estimate an unknown $m$-dimensional vector $\beta$ from linear measurements where Gaussian noise is introduced in each measurement. For the combinatorial experimental design problem, the goal is to pick $k$ out of the given $n$ experiments so as to make the most accurate estimate of the unknown parameters, denoted as $\hat{\beta}$. In this paper, we study one of the most robust measures of error estimation, the $D$-optimality criterion, which corresponds to minimizing the volume of the confidence ellipsoid for the estimation error $\beta-\hat{\beta}$. The problem gives rise to two natural variants depending on whether repetitions of experiments are allowed or not. We first propose an approximation algorithm with a $\frac1e$-approximation for the $D$-optimal design problem with and without repetitions, giving the first constant factor approximation for the problem. We then analyze another sampling approximation algorithm and prove that it is a $(1-\epsilon)$-approximation if $k\geq \frac{4m}{\epsilon}+\frac{12}{\epsilon^2}\log(\frac{1}{\epsilon})$ for any $\epsilon \in (0,1)$. Finally, for $D$-optimal design with repetitions, we study a different algorithm proposed in the literature and show that it can improve this asymptotic approximation ratio.
Tasks
Published 2018-02-23
URL https://arxiv.org/abs/1802.08372v2
PDF https://arxiv.org/pdf/1802.08372v2.pdf
PWC https://paperswithcode.com/paper/approximate-positively-correlated
Repo
Framework
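
To make the quantities in the abstract above concrete, the sketch below evaluates the stated threshold on $k$ and a $D$-optimality score for a chosen subset of experiments. It rests on several assumptions: the logarithm is taken to be natural, the score is the determinant of the information matrix raised to the power $1/m$ (a common normalization, assumed here), and the helper names `min_k_for_ratio` and `d_optimality_value` are hypothetical. It does not implement either approximation algorithm.

```python
import math
import numpy as np

def min_k_for_ratio(m, eps):
    """Smallest integer k with k >= 4m/eps + (12/eps^2) * log(1/eps).
    Natural log is an assumption; the abstract does not state the base."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    return math.ceil(4 * m / eps + 12 / eps**2 * math.log(1 / eps))

def d_optimality_value(experiments, chosen):
    """det(sum_i a_i a_i^T)^(1/m) over the chosen rows (repeats allowed).
    Returns 0.0 when the information matrix is singular."""
    a = experiments[chosen]
    m = experiments.shape[1]
    sign, logdet = np.linalg.slogdet(a.T @ a)
    if sign <= 0:
        return 0.0
    return math.exp(logdet / m)

print(min_k_for_ratio(10, 0.5))
print(d_optimality_value(np.eye(3), [0, 1, 2]))
```

Maximizing this determinant is the usual way of expressing the goal of shrinking the confidence ellipsoid, since the ellipsoid's volume decreases as the determinant of the information matrix grows.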

Extensible Grounding of Speech for Robot Instruction

Title Extensible Grounding of Speech for Robot Instruction
Authors Jonathan Connell
Abstract Spoken language is a convenient interface for commanding a mobile robot. Yet for this to work a number of base terms must be grounded in perceptual and motor skills. We detail the language processing used on our robot ELI and explain how this grounding is performed, how it interacts with user gestures, and how it handles phenomena such as anaphora. More importantly, however, there are certain concepts which the robot cannot be preprogrammed with, such as the names of various objects in a household or the nature of specific tasks it may be requested to perform. In these cases it is vital that there exist a method for extending the grounding, essentially “learning by being told”. We describe how this was successfully implemented for learning new nouns and verbs in a tabletop setting. Creating this language learning kernel may be the last explicit programming the robot ever needs - the core mechanism could eventually be used for imparting a vast amount of knowledge, much as a child learns from its parents and teachers.
Tasks
Published 2018-07-31
URL http://arxiv.org/abs/1807.11838v1
PDF http://arxiv.org/pdf/1807.11838v1.pdf
PWC https://paperswithcode.com/paper/extensible-grounding-of-speech-for-robot
Repo
Framework

Crowd-Powered Data Mining

Title Crowd-Powered Data Mining
Authors Chengliang Chai, Ju Fan, Guoliang Li, Jiannan Wang, Yudian Zheng
Abstract Many data mining tasks cannot be completely addressed by automated processes, such as sentiment analysis and image classification. Crowdsourcing is an effective way to harness the human cognitive ability to process these machine-hard tasks. Thanks to public crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower, we can easily involve hundreds of thousands of ordinary workers (i.e., the crowd) to address these machine-hard tasks. In this tutorial, we will survey and synthesize a wide spectrum of existing studies on crowd-powered data mining. We first give an overview of crowdsourcing, and then summarize the fundamental techniques, including quality control, cost control, and latency control, which must be considered in crowdsourced data mining. Next we review crowd-powered data mining operations, including classification, clustering, pattern mining, machine learning using the crowd (including deep learning, transfer learning and semi-supervised learning) and knowledge discovery. Finally, we provide the emerging challenges in crowdsourced data mining.
Tasks Image Classification, Sentiment Analysis, Transfer Learning
Published 2018-06-13
URL http://arxiv.org/abs/1806.04968v2
PDF http://arxiv.org/pdf/1806.04968v2.pdf
PWC https://paperswithcode.com/paper/crowd-powered-data-mining
Repo
Framework

Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa

Title Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa
Authors Jie Yang, Thomas Drake, Andreas Damianou, Yoelle Maarek
Abstract This paper presents a generic Bayesian framework that enables any deep learning model to actively learn from targeted crowds. Our framework inherits from recent advances in Bayesian deep learning, and extends existing work by considering the targeted crowdsourcing approach, where multiple annotators with unknown expertise contribute an uncontrolled amount (often limited) of annotations. Our framework leverages the low-rank structure in annotations to learn individual annotator expertise, which then helps to infer the true labels from noisy and sparse annotations. It provides a unified Bayesian model to simultaneously infer the true labels and train the deep learning model in order to reach an optimal learning efficacy. Finally, our framework exploits the uncertainty of the deep learning model during prediction as well as the annotators’ estimated expertise to minimize the number of required annotations and annotators for optimally training the deep learning model. We evaluate the effectiveness of our framework for intent classification in Alexa (Amazon’s personal assistant), using both synthetic and real-world datasets. Experiments show that our framework can accurately learn annotator expertise, infer true labels, and effectively reduce the amount of annotations in model training as compared to state-of-the-art approaches. We further discuss the potential of our proposed framework in bridging machine learning and crowdsourcing towards improved human-in-the-loop systems.
Tasks Active Learning, Intent Classification
Published 2018-03-12
URL http://arxiv.org/abs/1803.04223v1
PDF http://arxiv.org/pdf/1803.04223v1.pdf
PWC https://paperswithcode.com/paper/leveraging-crowdsourcing-data-for-deep-active
Repo
Framework

Intelligent Trainer for Model-Based Reinforcement Learning

Title Intelligent Trainer for Model-Based Reinforcement Learning
Authors Yuanlong Li, Linsen Dong, Xin Zhou, Yonggang Wen, Kyle Guan
Abstract Model-based reinforcement learning (MBRL) has been proposed as a promising alternative solution to tackle the high sampling cost challenge in canonical reinforcement learning (RL), by leveraging a learned model to generate synthesized data for policy training. The MBRL framework, nevertheless, is inherently limited by the convoluted process of jointly learning the control policy and configuring hyper-parameters (e.g., global/local models, real and synthesized data, etc.). The training process can be tedious and prohibitively costly. In this research, we propose a “reinforcement on reinforcement” (RoR) architecture to decompose the convoluted tasks into two layers of reinforcement learning. The inner layer is the canonical model-based RL training process environment (TPE), which learns the control policy for the underlying system and exposes interfaces to access states, actions and rewards. The outer layer presents an RL agent, called the AI trainer, to learn an optimal hyper-parameter configuration for the inner TPE. This decomposition approach, called “train the trainer”, provides desirable flexibility to implement different trainer designs. In our research, we propose and optimize two alternative trainer designs: 1) a uni-head trainer and 2) a multi-head trainer. Our proposed RoR framework is evaluated on five tasks in the OpenAI gym (i.e., Pendulum, Mountain Car, Reacher, Half Cheetah and Swimmer). Compared to three other baseline algorithms, our proposed Train-the-Trainer algorithm has competitive performance in auto-tuning capability, with up to 56% expected sampling cost saving without knowing the best parameter setting in advance. The proposed trainer framework can be easily extended to other cases in which hyper-parameter tuning is costly.
Tasks
Published 2018-05-24
URL https://arxiv.org/abs/1805.09496v6
PDF https://arxiv.org/pdf/1805.09496v6.pdf
PWC https://paperswithcode.com/paper/intelligent-trainer-for-model-based
Repo
Framework

ISIC 2017 Skin Lesion Segmentation Using Deep Encoder-Decoder Network

Title ISIC 2017 Skin Lesion Segmentation Using Deep Encoder-Decoder Network
Authors Ngoc-Quang Nguyen
Abstract This paper summarizes our method and validation results for part 1 of the ISBI Challenge 2018. Our algorithm makes use of a deep encoder-decoder network and novel skin lesion data augmentation to segment the challenge objective. In addition, we propose an effective testing strategy by applying multi-model comparison.
Tasks Data Augmentation, Lesion Segmentation
Published 2018-07-24
URL http://arxiv.org/abs/1807.09083v1
PDF http://arxiv.org/pdf/1807.09083v1.pdf
PWC https://paperswithcode.com/paper/isic-2017-skin-lesion-segmentation-using-deep
Repo
Framework