July 30, 2019

Paper Group AWR 10

Reflection Separation Using Guided Annotation. S$^3$FD: Single Shot Scale-invariant Face Detector. UntrimmedNets for Weakly Supervised Action Recognition and Detection. Classical Structured Prediction Losses for Sequence to Sequence Learning. Learning to diagnose from scratch by exploiting dependencies among labels. Image Quality Assessment Guided …

Reflection Separation Using Guided Annotation

Title Reflection Separation Using Guided Annotation
Authors Ofer Springer, Yair Weiss
Abstract Photographs taken through a glass surface often contain an approximately linear superposition of reflected and transmitted layers. Decomposing an image into these layers is generally an ill-posed task and the use of an additional image prior and user provided cues is presently necessary in order to obtain good results. Current annotation approaches rely on a strong sparsity assumption. For images with significant texture this assumption does not typically hold, thus rendering the annotation process unviable. In this paper we show that using a Gaussian Mixture Model patch prior, the correct local decomposition can almost always be found as one of 100 likely modes of the posterior. Thus, the user need only choose one of these modes in a sparse set of patches and the decomposition may then be completed automatically. We demonstrate the performance of our method using synthesized and real reflection images.
Tasks
Published 2017-02-20
URL http://arxiv.org/abs/1702.05958v2
PDF http://arxiv.org/pdf/1702.05958v2.pdf
PWC https://paperswithcode.com/paper/reflection-separation-using-guided-annotation
Repo https://github.com/ofersp/refsep
Framework none

S$^3$FD: Single Shot Scale-invariant Face Detector

Title S$^3$FD: Single Shot Scale-invariant Face Detector
Authors Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Hailin Shi, Xiaobo Wang, Stan Z. Li
Abstract This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S$^3$FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces. Specifically, we try to solve the common problem that anchor-based detectors deteriorate dramatically as the objects become smaller. We make contributions in the following three aspects: 1) proposing a scale-equitable face detection framework to handle different scales of faces well. We tile anchors on a wide range of layers to ensure that all scales of faces have enough features for detection. Besides, we design anchor scales based on the effective receptive field and a proposed equal proportion interval principle; 2) improving the recall rate of small faces by a scale compensation anchor matching strategy; 3) reducing the false positive rate of small faces via a max-out background label. As a consequence, our method achieves state-of-the-art detection performance on all the common face detection benchmarks, including the AFW, PASCAL face, FDDB and WIDER FACE datasets, and can run at 36 FPS on a Nvidia Titan X (Pascal) for VGA-resolution images.
Tasks Face Detection
Published 2017-08-17
URL http://arxiv.org/abs/1708.05237v3
PDF http://arxiv.org/pdf/1708.05237v3.pdf
PWC https://paperswithcode.com/paper/s3fd-single-shot-scale-invariant-face
Repo https://github.com/LeeRel1991/SFD
Framework none

UntrimmedNets for Weakly Supervised Action Recognition and Detection

Title UntrimmedNets for Weakly Supervised Action Recognition and Detection
Authors Limin Wang, Yuanjun Xiong, Dahua Lin, Luc Van Gool
Abstract Current action recognition methods heavily rely on trimmed videos for model training. However, it is expensive and time-consuming to acquire a large-scale trimmed video dataset. This paper presents a new weakly supervised architecture, called UntrimmedNet, which is able to directly learn action recognition models from untrimmed videos without the requirement of temporal annotations of action instances. Our UntrimmedNet couples two important components, the classification module and the selection module, to learn the action models and reason about the temporal duration of action instances, respectively. These two components are implemented with feed-forward networks, and UntrimmedNet is therefore an end-to-end trainable architecture. We exploit the learned models for action recognition (WSR) and detection (WSD) on the untrimmed video datasets of THUMOS14 and ActivityNet. Although our UntrimmedNet only employs weak supervision, our method achieves performance superior or comparable to that of those strongly supervised approaches on these two datasets.
Tasks Temporal Action Localization, Weakly Supervised Action Localization
Published 2017-03-09
URL http://arxiv.org/abs/1703.03329v2
PDF http://arxiv.org/pdf/1703.03329v2.pdf
PWC https://paperswithcode.com/paper/untrimmednets-for-weakly-supervised-action
Repo https://github.com/zhengshou/AutoLoc
Framework none

Classical Structured Prediction Losses for Sequence to Sequence Learning

Title Classical Structured Prediction Losses for Sequence to Sequence Learning
Authors Sergey Edunov, Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato
Abstract There has been much recent work on training neural attention models at the sequence-level using either reinforcement learning-style methods or by optimizing the beam. In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence to sequence models. Our experiments show that these losses can perform surprisingly well by slightly outperforming beam search optimization in a like for like setup. We also report new state of the art results on both IWSLT’14 German-English translation as well as Gigaword abstractive summarization. On the larger WMT’14 English-French translation task, sequence-level training achieves 41.5 BLEU which is on par with the state of the art.
Tasks Abstractive Text Summarization, Machine Translation, Structured Prediction
Published 2017-11-14
URL http://arxiv.org/abs/1711.04956v5
PDF http://arxiv.org/pdf/1711.04956v5.pdf
PWC https://paperswithcode.com/paper/classical-structured-prediction-losses-for
Repo https://github.com/pytorch/fairseq
Framework pytorch

Learning to diagnose from scratch by exploiting dependencies among labels

Title Learning to diagnose from scratch by exploiting dependencies among labels
Authors Li Yao, Eric Poblenz, Dmitry Dagunts, Ben Covington, Devon Bernard, Kevin Lyman
Abstract The field of medical diagnostics contains a wealth of challenges which closely resemble classical machine learning problems; practical constraints, however, complicate the translation of these endpoints naively into classical architectures. Many tasks in radiology, for example, are largely problems of multi-label classification wherein medical images are interpreted to indicate multiple present or suspected pathologies. Clinical settings drive the necessity for high accuracy simultaneously across a multitude of pathological outcomes and greatly limit the utility of tools which consider only a subset. This issue is exacerbated by a general scarcity of training data and maximizes the need to extract clinically relevant features from available samples – ideally without the use of pre-trained models which may carry forward undesirable biases from tangentially related tasks. We present and evaluate a partial solution to these constraints in using LSTMs to leverage interdependencies among target labels in predicting 14 pathologic patterns from chest x-rays and establish state of the art results on the largest publicly available chest x-ray dataset from the NIH without pre-training. Furthermore, we propose and discuss alternative evaluation metrics and their relevance in clinical practice.
Tasks Multi-Label Classification
Published 2017-10-28
URL http://arxiv.org/abs/1710.10501v2
PDF http://arxiv.org/pdf/1710.10501v2.pdf
PWC https://paperswithcode.com/paper/learning-to-diagnose-from-scratch-by
Repo https://github.com/liyu10000/pneumoconiosis
Framework pytorch

Image Quality Assessment Guided Deep Neural Networks Training

Title Image Quality Assessment Guided Deep Neural Networks Training
Authors Zhuo Chen, Weisi Lin, Shiqi Wang, Long Xu, Leida Li
Abstract For many computer vision problems, the deep neural networks are trained and validated based on the assumption that the input images are pristine (i.e., artifact-free). However, digital images are subject to a wide range of distortions in real application scenarios, while the practical issues regarding image quality in high level visual information understanding have been largely ignored. In this paper, in view of the fact that most widely deployed deep learning models are susceptible to various image distortions, the distorted images are involved for data augmentation in the deep neural network training process to learn a reliable model for practical applications. In particular, an image quality assessment based label smoothing method, which aims at regularizing the label distribution of training images, is further proposed to tune the objective functions in learning the neural network. Experimental results show that the proposed method is effective in dealing with both low and high quality images in the typical image classification task.
Tasks Data Augmentation, Image Classification, Image Quality Assessment
Published 2017-08-13
URL http://arxiv.org/abs/1708.03880v1
PDF http://arxiv.org/pdf/1708.03880v1.pdf
PWC https://paperswithcode.com/paper/image-quality-assessment-guided-deep-neural
Repo https://github.com/dzuba29/Deeplom
Framework tf
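
The quality-guided label smoothing described in the abstract above can be illustrated with a minimal sketch. It assumes each training image comes with a quality score in [0, 1] (1 = pristine) and that the smoothing strength grows linearly as quality drops. The function name `quality_smoothed_labels`, the parameter `eps_max`, and the linear mapping are all hypothetical choices for illustration, not the formula used in the paper.

```python
import numpy as np

def quality_smoothed_labels(labels, quality, num_classes, eps_max=0.2):
    """Return soft targets whose smoothing grows as image quality drops.

    labels:  int array of shape (N,) with class indices.
    quality: float array of shape (N,) in [0, 1], where 1 means pristine.
    The linear map eps = eps_max * (1 - quality) is an assumption.
    """
    labels = np.asarray(labels)
    quality = np.clip(np.asarray(quality, dtype=float), 0.0, 1.0)
    eps = eps_max * (1.0 - quality)              # per-sample smoothing strength
    one_hot = np.eye(num_classes)[labels]        # shape (N, C)
    uniform = np.full((len(labels), num_classes), 1.0 / num_classes)
    # Standard label smoothing, applied with a per-sample epsilon.
    return (1.0 - eps)[:, None] * one_hot + eps[:, None] * uniform

# A pristine image keeps its one-hot label; a heavily distorted one is smoothed.
targets = quality_smoothed_labels([0, 2], [1.0, 0.0], num_classes=4)
```

The soft targets can then be fed to a cross-entropy loss in place of one-hot labels, so that distorted images contribute less confident supervision.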

DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks

Title DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks
Authors Lingpeng Kong, Chris Alberti, Daniel Andor, Ivan Bogatyy, David Weiss
Abstract In this work, we present a compact, modular framework for constructing novel recurrent neural architectures. Our basic module is a new generic unit, the Transition Based Recurrent Unit (TBRU). In addition to hidden layer activations, TBRUs have discrete state dynamics that allow network connections to be built dynamically as a function of intermediate activations. By connecting multiple TBRUs, we can extend and combine commonly used architectures such as sequence-to-sequence, attention mechanisms, and recursive tree-structured models. A TBRU can also serve as both an encoder for downstream tasks and as a decoder for its own task simultaneously, resulting in more accurate multi-task learning. We call our approach Dynamic Recurrent Acyclic Graphical Neural Networks, or DRAGNN. We show that DRAGNN is significantly more accurate and efficient than seq2seq with attention for syntactic dependency parsing and yields more accurate multi-task learning for extractive summarization tasks.
Tasks Dependency Parsing, Multi-Task Learning
Published 2017-03-13
URL http://arxiv.org/abs/1703.04474v1
PDF http://arxiv.org/pdf/1703.04474v1.pdf
PWC https://paperswithcode.com/paper/dragnn-a-transition-based-framework-for
Repo https://github.com/tensorflow/models/tree/master/research/syntaxnet
Framework tf

Learning Approximate Stochastic Transition Models

Title Learning Approximate Stochastic Transition Models
Authors Yuhang Song, Christopher Grimm, Xianming Wang, Michael L. Littman
Abstract We examine the problem of learning mappings from state to state, suitable for use in a model-based reinforcement-learning setting, that simultaneously generalize to novel states and can capture stochastic transitions. We show that currently popular generative adversarial networks struggle to learn these stochastic transition models but a modification to their loss functions results in a powerful learning algorithm for this class of problems.
Tasks
Published 2017-10-26
URL http://arxiv.org/abs/1710.09718v1
PDF http://arxiv.org/pdf/1710.09718v1.pdf
PWC https://paperswithcode.com/paper/learning-approximate-stochastic-transition
Repo https://github.com/YuhangSong/SGAN
Framework tf

Inception Recurrent Convolutional Neural Network for Object Recognition

Title Inception Recurrent Convolutional Neural Network for Object Recognition
Authors Md Zahangir Alom, Mahmudul Hasan, Chris Yakopcic, Tarek M. Taha
Abstract Deep convolutional neural networks (DCNNs) are an influential tool for solving various problems in the machine learning and computer vision fields. In this paper, we introduce a new deep learning model called an Inception-Recurrent Convolutional Neural Network (IRCNN), which utilizes the power of an inception network combined with recurrent layers in DCNN architecture. We have empirically evaluated the recognition performance of the proposed IRCNN model using different benchmark datasets such as MNIST, CIFAR-10, CIFAR-100, and SVHN. Experimental results show similar or higher recognition accuracy when compared to most of the popular DCNNs including the RCNN. Furthermore, we have investigated IRCNN performance against equivalent Inception Networks and Inception-Residual Networks using the CIFAR-100 dataset. We report about 3.5%, 3.47% and 2.54% improvement in classification accuracy when compared to the RCNN, equivalent Inception Networks, and Inception-Residual Networks on the augmented CIFAR-100 dataset respectively.
Tasks Object Recognition
Published 2017-04-25
URL http://arxiv.org/abs/1704.07709v1
PDF http://arxiv.org/pdf/1704.07709v1.pdf
PWC https://paperswithcode.com/paper/inception-recurrent-convolutional-neural
Repo https://github.com/Insiyaa/IRCNN-keras
Framework none

Accelerating Neural Architecture Search using Performance Prediction

Title Accelerating Neural Architecture Search using Performance Prediction
Authors Bowen Baker, Otkrist Gupta, Ramesh Raskar, Nikhil Naik
Abstract Methods for neural network hyperparameter optimization and meta-modeling are computationally expensive due to the need to train a large number of model configurations. In this paper, we show that standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validation performance data. We empirically show that our performance prediction models are much more effective than prominent Bayesian counterparts, are simpler to implement, and are faster to train. Our models can predict final performance in both visual classification and language modeling domains, are effective for predicting performance of drastically varying model architectures, and can even generalize between model classes. Using these prediction models, we also propose an early stopping method for hyperparameter optimization and meta-modeling, which obtains a speedup of a factor up to 6x in both hyperparameter optimization and meta-modeling. Finally, we empirically show that our early stopping method can be seamlessly incorporated into both reinforcement learning-based architecture selection algorithms and bandit based search methods. Through extensive experimentation, we empirically show our performance prediction models and early stopping algorithm are state-of-the-art in terms of prediction accuracy and speedup achieved while still identifying the optimal model configurations.
Tasks Hyperparameter Optimization, Language Modelling, Neural Architecture Search, Time Series
Published 2017-05-30
URL http://arxiv.org/abs/1705.10823v2
PDF http://arxiv.org/pdf/1705.10823v2.pdf
PWC https://paperswithcode.com/paper/accelerating-neural-architecture-search-using
Repo https://github.com/nikdnaik/accelerating_nas-1
Framework none
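
The idea in the abstract above, predicting final performance from a partially observed learning curve and stopping unpromising runs early, can be sketched as follows. This is a minimal illustration under stated assumptions: ordinary least squares stands in for the "standard frequentist regression models", and the helper names `curve_features`, `fit_predictor`, and `should_stop`, along with the specific feature set and the `margin` parameter, are hypothetical rather than taken from the paper.

```python
import numpy as np

def curve_features(partial_curve, hyperparams):
    """Build a feature vector from an observed validation curve prefix.

    Features (an illustrative choice): the observed accuracies, their last
    first-difference, the hyperparameters, and a constant bias term.
    """
    c = np.asarray(partial_curve, dtype=float)
    h = np.asarray(hyperparams, dtype=float)
    return np.concatenate([c, [c[-1] - c[-2]], h, [1.0]])

def fit_predictor(X, y):
    """Ordinary least squares: weights w minimising ||X w - y||^2."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def should_stop(w, features, best_so_far, margin=0.0):
    """Stop a run when its predicted final score trails the best by more than margin."""
    return float(features @ w) < best_so_far - margin

# Usage with hand-set weights: predict the final score as the last observed value.
feats = curve_features([0.1, 0.2, 0.3], hyperparams=[0.01])
w_demo = np.zeros(feats.shape[0])
w_demo[2] = 1.0
stop_now = should_stop(w_demo, feats, best_so_far=0.5)
```

In practice the weights would come from `fit_predictor` applied to features and final scores of previously completed runs; the margin trades off speedup against the risk of discarding a configuration that would have finished best.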

Metacontrol for Adaptive Imagination-Based Optimization

Title Metacontrol for Adaptive Imagination-Based Optimization
Authors Jessica B. Hamrick, Andrew J. Ballard, Razvan Pascanu, Oriol Vinyals, Nicolas Heess, Peter W. Battaglia
Abstract Many machine learning systems are built to solve the hardest examples of a particular task, which often makes them large and expensive to run—especially with respect to the easier examples, which might require much less computation. For an agent with a limited computational budget, this “one-size-fits-all” approach may result in the agent wasting valuable computation on easy examples, while not spending enough on hard examples. Rather than learning a single, fixed policy for solving all instances of a task, we introduce a metacontroller which learns to optimize a sequence of “imagined” internal simulations over predictive models of the world in order to construct a more informed, and more economical, solution. The metacontroller component is a model-free reinforcement learning agent, which decides both how many iterations of the optimization procedure to run, as well as which model to consult on each iteration. The models (which we call “experts”) can be state transition models, action-value functions, or any other mechanism that provides information useful for solving the task, and can be learned on-policy or off-policy in parallel with the metacontroller. When the metacontroller, controller, and experts were trained with “interaction networks” (Battaglia et al., 2016) as expert models, our approach was able to solve a challenging decision-making problem under complex non-linear dynamics. The metacontroller learned to adapt the amount of computation it performed to the difficulty of the task, and learned how to choose which experts to consult by factoring in both their reliability and individual computational resource costs. This allowed the metacontroller to achieve a lower overall cost (task loss plus computational cost) than more traditional fixed policy approaches. These results demonstrate that our approach is a powerful framework for using…
Tasks Decision Making
Published 2017-05-07
URL http://arxiv.org/abs/1705.02670v1
PDF http://arxiv.org/pdf/1705.02670v1.pdf
PWC https://paperswithcode.com/paper/metacontrol-for-adaptive-imagination-based
Repo https://github.com/deepmind/spaceship_dataset
Framework none

SGNMT – A Flexible NMT Decoding Platform for Quick Prototyping of New Models and Search Strategies

Title SGNMT – A Flexible NMT Decoding Platform for Quick Prototyping of New Models and Search Strategies
Authors Felix Stahlberg, Eva Hasler, Danielle Saunders, Bill Byrne
Abstract This paper introduces SGNMT, our experimental platform for machine translation research. SGNMT provides a generic interface to neural and symbolic scoring modules (predictors) with left-to-right semantics, such as translation models like NMT, language models, translation lattices, $n$-best lists or other kinds of scores and constraints. Predictors can be combined with other predictors to form complex decoding tasks. SGNMT implements a number of search strategies for traversing the space spanned by the predictors which are appropriate for different predictor constellations. Adding new predictors or decoding strategies is particularly easy, making it a very efficient tool for prototyping new research ideas. SGNMT is actively being used by students in the MPhil program in Machine Learning, Speech and Language Technology at the University of Cambridge for course work and theses, as well as for most of the research work in our group.
Tasks Machine Translation
Published 2017-07-21
URL http://arxiv.org/abs/1707.06885v1
PDF http://arxiv.org/pdf/1707.06885v1.pdf
PWC https://paperswithcode.com/paper/sgnmt-a-flexible-nmt-decoding-platform-for
Repo https://github.com/ucam-smt/sgnmt
Framework tf

SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation

Title SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
Authors Daniel Cer, Mona Diab, Eneko Agirre, Iñigo Lopez-Gazpio, Lucia Specia
Abstract Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in all language tracks. We summarize performance and review a selection of well performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).
Tasks Machine Translation, Question Answering, Semantic Textual Similarity
Published 2017-07-31
URL http://arxiv.org/abs/1708.00055v1
PDF http://arxiv.org/pdf/1708.00055v1.pdf
PWC https://paperswithcode.com/paper/semeval-2017-task-1-semantic-textual
Repo https://github.com/laraolmos/madrid-nlp-meetup
Framework tf

Quantum Neuron: an elementary building block for machine learning on quantum computers

Title Quantum Neuron: an elementary building block for machine learning on quantum computers
Authors Yudong Cao, Gian Giacomo Guerreschi, Alán Aspuru-Guzik
Abstract Even the most sophisticated artificial neural networks are built by aggregating substantially identical units called neurons. A neuron receives multiple signals, internally combines them, and applies a non-linear function to the resulting weighted sum. Several attempts to generalize neurons to the quantum regime have been proposed, but all proposals collided with the difficulty of implementing non-linear activation functions, which is essential for classical neurons, due to the linear nature of quantum mechanics. Here we propose a solution to this roadblock in the form of a small quantum circuit that naturally simulates neurons with threshold activation. Our quantum circuit defines a building block, the “quantum neuron”, that can reproduce a variety of classical neural network constructions while maintaining the ability to process superpositions of inputs and preserve quantum coherence and entanglement. In the construction of feedforward networks of quantum neurons, we provide numerical evidence that the network not only can learn a function when trained with superposition of inputs and the corresponding output, but that this training suffices to learn the function on all individual inputs separately. When arranged to mimic Hopfield networks, quantum neural networks exhibit properties of associative memory. Patterns are encoded using the simple Hebbian rule for the weights and we demonstrate attractor dynamics from corrupted inputs. Finally, the fact that our quantum model closely captures (traditional) neural network dynamics implies that the vast body of literature and results on neural networks becomes directly relevant in the context of quantum machine learning.
Tasks Quantum Machine Learning
Published 2017-11-30
URL http://arxiv.org/abs/1711.11240v1
PDF http://arxiv.org/pdf/1711.11240v1.pdf
PWC https://paperswithcode.com/paper/quantum-neuron-an-elementary-building-block
Repo https://github.com/inJeans/qnn
Framework none

Updating Singular Value Decomposition for Rank One Matrix Perturbation

Title Updating Singular Value Decomposition for Rank One Matrix Perturbation
Authors Ratnik Gandhi, Amoli Rajgor
Abstract An efficient Singular Value Decomposition (SVD) algorithm is an important tool for distributed and streaming computation in big data problems. It is observed that the update of the singular vectors of a rank-1 perturbed matrix is similar to a Cauchy matrix-vector product. With this observation, in this paper, we present an efficient method for updating the Singular Value Decomposition of a rank-1 perturbed matrix in $O(n^2 \log(\frac{1}{\epsilon}))$ time. The method uses the Fast Multipole Method (FMM) for updating singular vectors in $O(n \log(\frac{1}{\epsilon}))$ time, where $\epsilon$ is the precision of computation.
Tasks
Published 2017-07-26
URL http://arxiv.org/abs/1707.08369v1
PDF http://arxiv.org/pdf/1707.08369v1.pdf
PWC https://paperswithcode.com/paper/updating-singular-value-decomposition-for
Repo https://github.com/AmoliR/rank1-svd-update
Framework none
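
The Cauchy structure mentioned in the abstract above can be checked numerically on the closely related symmetric problem. The sketch below is an illustration under simplifying assumptions, not the paper's algorithm: it works with a diagonal matrix plus a rank-one term, obtains the updated eigenvalues from a dense reference solver instead of solving the secular equation, and forms the Cauchy-like matrix explicitly in $O(n^2)$ rather than applying it with the Fast Multipole Method. All variable names are hypothetical.

```python
import numpy as np

n = 6
d = np.arange(1.0, n + 1.0)      # distinct diagonal entries of D
z = np.ones(n)                   # update vector with nonzero components
rho = 0.7                        # strength of the rank-one perturbation

M = np.diag(d) + rho * np.outer(z, z)
lam = np.linalg.eigvalsh(M)      # updated eigenvalues (dense reference solver)

# Cauchy-like matrix C[j, i] = 1 / (d[j] - lam[i]).
# Eigenvector i of M is proportional to z * C[:, i].
C = 1.0 / (d[:, None] - lam[None, :])
V = z[:, None] * C
V /= np.linalg.norm(V, axis=0)   # normalise each column

# Residual of the eigen-equation M V = V diag(lam).
residual = np.linalg.norm(M @ V - V * lam)

# Applying V^T to a vector is a (scaled) Cauchy matrix-vector product,
# which is the operation a fast multipole scheme would accelerate.
x = np.linspace(0.0, 1.0, n)
y = V.T @ x
```

The formula follows from rearranging $(D + \rho z z^T) v = \lambda v$ into $v \propto (D - \lambda I)^{-1} z$, which is valid here because the updated eigenvalues strictly interlace the distinct diagonal entries.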