October 21, 2019

3171 words 15 mins read

Paper Group AWR 18

Fast Dynamic Routing Based on Weighted Kernel Density Estimation. JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition. Bayesian Optimization with Expensive Integrands. Analyzing Neuroimaging Data Through Recurrent Deep Learning Models. Efficient Loss-Based Decoding on Graphs For Extreme Classification. …

Fast Dynamic Routing Based on Weighted Kernel Density Estimation

Title Fast Dynamic Routing Based on Weighted Kernel Density Estimation
Authors Suofei Zhang, Wei Zhao, Xiaofu Wu, Quan Zhou
Abstract Capsules, together with the dynamic routing between them, are recently proposed structures for deep neural networks. A capsule groups data into vectors or matrices as poses, rather than conventional scalars, to represent specific properties of a target instance. Besides its pose, a capsule is attached with a probability (often denoted as activation) for its presence. Dynamic routing helps capsules achieve greater generalization capacity with far fewer model parameters. However, the bottleneck that prevents widespread application of capsules is the computational expense of routing. To address this problem, we generalize existing routing methods within the framework of weighted kernel density estimation, and propose two fast routing methods with different optimization strategies. Our methods improve the time efficiency of routing by nearly 40% with negligible performance degradation. By stacking a hybrid of convolutional layers and capsule layers, we construct a network architecture to handle inputs at a resolution of $64\times{64}$ pixels. The proposed models achieve performance on par with other leading methods on multiple benchmarks.
Tasks Density Estimation
Published 2018-05-28
URL http://arxiv.org/abs/1805.10807v2
PDF http://arxiv.org/pdf/1805.10807v2.pdf
PWC https://paperswithcode.com/paper/fast-dynamic-routing-based-on-weighted-kernel
Repo https://github.com/andyweizhao/capsule_text_classification
Framework tf
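
To make the routing-as-density-estimation idea above concrete, here is a minimal NumPy sketch of one mean-shift-style routing step: each upper-level capsule pose is updated as a kernel-weighted mean of the votes cast by lower-level capsules, weighted by their activations. The shapes, Gaussian kernel, and update schedule are illustrative assumptions, not the authors' optimized routines.

```python
import numpy as np

def wkde_routing(votes, activations, n_iters=3, bandwidth=1.0):
    """Toy mean-shift-style routing via weighted kernel density estimation.

    votes:       (n_lower, n_upper, dim) vote vectors from lower capsules
    activations: (n_lower,) presence probabilities of lower capsules
    Returns upper-capsule poses of shape (n_upper, dim).
    """
    # Initialize each upper pose as the activation-weighted mean of its votes.
    w = activations[:, None, None]
    poses = (w * votes).sum(axis=0) / w.sum(axis=0)

    for _ in range(n_iters):
        # Gaussian kernel between each vote and the current upper pose.
        sq_dist = ((votes - poses[None]) ** 2).sum(axis=-1)   # (n_lower, n_upper)
        k = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        r = k * activations[:, None]                          # routing weights
        r = r / (r.sum(axis=1, keepdims=True) + 1e-9)         # normalize per lower capsule
        # Mean-shift update: kernel-weighted mean of votes.
        poses = (r[..., None] * votes).sum(axis=0) / (r.sum(axis=0)[:, None] + 1e-9)
    return poses

poses = wkde_routing(np.random.randn(32, 10, 16), np.random.rand(32))
print(poses.shape)  # (10, 16)
```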

JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition

Title JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition
Authors Canyu Le, Xin Li
Abstract This paper proposes a novel algorithm to reassemble an arbitrarily shredded image to its original status. Existing reassembly pipelines commonly consist of a local matching stage and a global composition stage. In the local stage, a key challenge in fragment reassembly is to reliably compute and identify correct pairwise matches, for which most existing algorithms use handcrafted features and hence cannot reliably handle complicated puzzles. We build a deep convolutional neural network to detect the compatibility of a pairwise stitching, and use it to prune computed pairwise matches. To improve the network efficiency and accuracy, we confine the CNN's computation to the stitching region and apply a boosting training strategy. In the global composition stage, we replace the commonly adopted greedy edge selection strategies with two new loop-closure-based search algorithms. Extensive experiments show that our algorithm significantly outperforms existing methods on solving various puzzles, especially challenging ones with many fragment pieces.
Tasks
Published 2018-09-11
URL http://arxiv.org/abs/1809.04137v1
PDF http://arxiv.org/pdf/1809.04137v1.pdf
PWC https://paperswithcode.com/paper/jigsawnet-shredded-image-reassembly-using
Repo https://github.com/Lecanyu/JigsawNet
Framework tf
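
As a hedged illustration of the two-stage pipeline above, the sketch below prunes pairwise matches with a stand-in compatibility scorer (a placeholder for the paper's CNN) and validates a cycle of relative transforms by composing them and testing loop closure, which is the intuition behind the loop-based composition stage.

```python
import numpy as np

def loop_closure_error(transforms):
    """Compose relative 3x3 transforms around a cycle; a consistent loop
    should return (approximately) to the identity."""
    acc = np.eye(3)
    for t in transforms:
        acc = acc @ t
    return np.linalg.norm(acc - np.eye(3))

def prune_matches(pairwise, score_fn, threshold=0.5):
    """Keep only pairwise matches the scorer deems compatible.

    pairwise: list of (i, j, transform) candidate alignments
    score_fn: stand-in for the CNN compatibility score in [0, 1]
    """
    return [(i, j, t) for (i, j, t) in pairwise if score_fn(i, j, t) >= threshold]

# Toy usage: a 3-cycle of 2D rigid transforms (rotation by 120 degrees each).
theta = 2 * np.pi / 3
rot = np.array([[np.cos(theta), -np.sin(theta), 0],
                [np.sin(theta),  np.cos(theta), 0],
                [0, 0, 1]])
print(loop_closure_error([rot, rot, rot]))  # ~0: the loop closes
```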

Bayesian Optimization with Expensive Integrands

Title Bayesian Optimization with Expensive Integrands
Authors Saul Toscano-Palmerin, Peter I. Frazier
Abstract We propose a Bayesian optimization algorithm for objective functions that are sums or integrals of expensive-to-evaluate functions, allowing noisy evaluations. These objective functions arise in multi-task Bayesian optimization for tuning machine learning hyperparameters, optimization via simulation, and sequential design of experiments with random environmental conditions. Our method is average-case optimal by construction when a single evaluation of the integrand remains within our evaluation budget. Achieving this one-step optimality requires solving a challenging value of information optimization problem, for which we provide a novel efficient discretization-free computational method. We also provide consistency proofs for our method in both continuum and discrete finite domains for objective functions that are sums. In numerical experiments comparing against previous state-of-the-art methods, including those that also leverage sum or integral structure, our method performs as well or better across a wide range of problems and offers significant improvements when evaluations are noisy or the integrand varies smoothly in the integrated variables.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.08661v1
PDF http://arxiv.org/pdf/1803.08661v1.pdf
PWC https://paperswithcode.com/paper/bayesian-optimization-with-expensive
Repo https://github.com/toscanosaul/bayesian_quadrature_optimization
Framework none
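
The method itself rests on a value-of-information computation too involved to sketch briefly, but the objective structure it exploits is easy to show. The toy below, with a hypothetical integrand, spreads an evaluation budget over (design, condition) pairs and optimizes the estimated sum; the paper instead chooses each pair by a one-step optimal value-of-information criterion.

```python
import numpy as np

# Objective structure the paper exploits: F(x) = sum_w f(x, w), where each
# evaluation of the integrand f is expensive and possibly noisy.
def integrand(x, w, rng):
    """Hypothetical expensive integrand with evaluation noise."""
    return -(x - w) ** 2 + 0.1 * rng.standard_normal()

def naive_sum_optimization(candidates, conditions, budget, seed=0):
    """Naive baseline: spread the evaluation budget over (x, w) pairs and
    optimize the estimated sum. Not the paper's method."""
    rng = np.random.default_rng(seed)
    estimates = {x: [] for x in candidates}
    for k in range(budget):
        x = candidates[k % len(candidates)]            # round-robin over designs
        w = conditions[rng.integers(len(conditions))]  # random condition
        estimates[x].append(integrand(x, w, rng))
    # Scale the sample mean up to an estimate of the full sum over conditions.
    scores = {x: len(conditions) * np.mean(v) for x, v in estimates.items() if v}
    return max(scores, key=scores.get)

best = naive_sum_optimization(candidates=np.linspace(-1, 1, 5),
                              conditions=np.linspace(-1, 1, 8),
                              budget=40)
print("estimated best design:", best)
```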

Analyzing Neuroimaging Data Through Recurrent Deep Learning Models

Title Analyzing Neuroimaging Data Through Recurrent Deep Learning Models
Authors Armin W. Thomas, Hauke R. Heekeren, Klaus-Robert Müller, Wojciech Samek
Abstract The application of deep learning (DL) models to neuroimaging data poses several challenges, due to the high dimensionality, low sample size and complex temporo-spatial dependency structure of these datasets. Even further, DL models act as black-box models, impeding insight into the association between cognitive state and brain activity. To approach these challenges, we introduce the DeepLight framework, which utilizes long short-term memory (LSTM) based DL models to analyze whole-brain functional Magnetic Resonance Imaging (fMRI) data. To decode a cognitive state (e.g., seeing the image of a house), DeepLight separates the fMRI volume into a sequence of axial brain slices, which is then sequentially processed by an LSTM. To maintain interpretability, DeepLight adapts the layer-wise relevance propagation (LRP) technique, thereby decomposing its decoding decision into the contributions of the individual input voxels. Importantly, the decomposition is performed at the level of single fMRI volumes, enabling DeepLight to study the associations between cognitive state and brain activity at several levels of data granularity, from the group level down to single time points. To demonstrate the versatility of DeepLight, we apply it to a large fMRI dataset of the Human Connectome Project. We show that DeepLight outperforms conventional approaches of uni- and multivariate fMRI analysis in decoding the cognitive states and in identifying the physiologically appropriate brain regions associated with these states. We further demonstrate DeepLight’s ability to study the fine-grained temporo-spatial variability of brain activity over sequences of single fMRI samples.
Tasks
Published 2018-10-23
URL http://arxiv.org/abs/1810.09945v2
PDF http://arxiv.org/pdf/1810.09945v2.pdf
PWC https://paperswithcode.com/paper/interpretable-lstms-for-whole-brain
Repo https://github.com/ArrasL/LRP_for_LSTM
Framework none
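
A minimal PyTorch sketch of the slice-sequence idea above: flatten each axial slice, feed the sequence to an LSTM, and decode the cognitive state from the final hidden state. Layer sizes and the absence of per-slice convolutions are simplifying assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SliceLSTMDecoder(nn.Module):
    """Toy version of the DeepLight idea: treat an fMRI volume as a
    sequence of axial slices and decode a cognitive state with an LSTM."""
    def __init__(self, slice_dim, hidden_dim=128, n_states=4):
        super().__init__()
        self.lstm = nn.LSTM(slice_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_states)

    def forward(self, volume):
        # volume: (batch, n_slices, slice_dim) -- each slice flattened to a vector
        _, (h_n, _) = self.lstm(volume)
        return self.head(h_n[-1])  # logits over cognitive states

model = SliceLSTMDecoder(slice_dim=64 * 64)
logits = model(torch.randn(2, 30, 64 * 64))  # 2 volumes, 30 axial slices each
print(logits.shape)  # torch.Size([2, 4])
```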

Efficient Loss-Based Decoding on Graphs For Extreme Classification

Title Efficient Loss-Based Decoding on Graphs For Extreme Classification
Authors Itay Evron, Edward Moroshko, Koby Crammer
Abstract In extreme classification problems, learning algorithms are required to map instances to labels from an extremely large label set. We build on a recent extreme classification framework with logarithmic time and space, and on a general approach for error correcting output coding (ECOC) with loss-based decoding, and introduce a flexible and efficient approach accompanied by theoretical bounds. Our framework employs output codes induced by graphs, for which we show how to perform efficient loss-based decoding to potentially improve accuracy. In addition, our framework offers a tradeoff between accuracy, model size and prediction time. We show how to find the sweet spot of this tradeoff using only the training data. Our experimental study demonstrates the validity of our assumptions and claims, and shows that our method is competitive with state-of-the-art algorithms.
Tasks
Published 2018-03-08
URL http://arxiv.org/abs/1803.03319v2
PDF http://arxiv.org/pdf/1803.03319v2.pdf
PWC https://paperswithcode.com/paper/efficient-loss-based-decoding-on-graphs-for
Repo https://github.com/ievron/wltls
Framework none
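
The core decoding step is easy to illustrate. Below is a generic loss-based ECOC decoder in NumPy: each label's codeword is scored by the summed surrogate loss of the binary classifiers, and the minimizer wins. The paper's contribution is making this efficient when the codewords are induced by paths in a graph; this sketch ignores that structure.

```python
import numpy as np

def loss_based_decode(codes, margins, loss=lambda z: np.log1p(np.exp(-z))):
    """Loss-based decoding for ECOC: pick the label whose codeword
    minimizes the summed surrogate loss of the binary classifiers.

    codes:   (n_labels, n_bits) codewords with entries in {-1, +1}
    margins: (n_bits,) real-valued outputs of the binary classifiers
    """
    # Margin of bit b for label y is codes[y, b] * margins[b].
    losses = loss(codes * margins[None, :]).sum(axis=1)
    return int(np.argmin(losses))

# Toy usage: 4 labels encoded with 3 binary classifiers.
codes = np.array([[+1, +1, +1],
                  [+1, -1, -1],
                  [-1, +1, -1],
                  [-1, -1, +1]])
margins = np.array([2.0, -1.5, -0.5])  # classifier outputs for one instance
print(loss_based_decode(codes, margins))  # 1: the codeword that best agrees
```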

Delta-encoder: an effective sample synthesis method for few-shot object recognition

Title Delta-encoder: an effective sample synthesis method for few-shot object recognition
Authors Eli Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes, Alex M. Bronstein
Abstract Learning to classify new categories based on just one or a few examples is a long-standing challenge in modern computer vision. In this work, we propose a simple yet effective method for few-shot (and one-shot) object recognition. Our approach is based on a modified auto-encoder, denoted Delta-encoder, that learns to synthesize new samples for an unseen category just by seeing a few examples from it. The synthesized samples are then used to train a classifier. The proposed approach learns both to extract transferable intra-class deformations, or “deltas”, between same-class pairs of training examples, and to apply those deltas to the few provided examples of a novel class (unseen during training) in order to efficiently synthesize samples from that new class. The proposed method improves over the state of the art in one-shot object recognition and compares favorably in the few-shot case. Upon acceptance, code will be made available.
Tasks Few-Shot Image Classification, Few-Shot Learning, Object Recognition
Published 2018-06-12
URL http://arxiv.org/abs/1806.04734v3
PDF http://arxiv.org/pdf/1806.04734v3.pdf
PWC https://paperswithcode.com/paper/delta-encoder-an-effective-sample-synthesis
Repo https://github.com/EliSchwartz/DeltaEncoder
Framework tf
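
A minimal PyTorch sketch of the delta-encoder idea: encode the “delta” between a same-class pair into a low-dimensional code, then decode that code against an anchor from a novel class to synthesize a new sample. Feature dimensions and layer choices are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class DeltaEncoder(nn.Module):
    """Sketch: encode the intra-class "delta" between a pair of same-class
    features, then apply it to a novel-class anchor to synthesize a sample."""
    def __init__(self, feat_dim=256, z_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(2 * feat_dim, 512), nn.ReLU(),
                                     nn.Linear(512, z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim + feat_dim, 512), nn.ReLU(),
                                     nn.Linear(512, feat_dim))

    def forward(self, x, anchor):
        z = self.encoder(torch.cat([x, anchor], dim=-1))  # the "delta"
        return self.decoder(torch.cat([z, anchor], dim=-1))

model = DeltaEncoder()
x, anchor = torch.randn(8, 256), torch.randn(8, 256)
recon = model(x, anchor)                    # train with e.g. an L1 loss to x
novel_anchor = torch.randn(8, 256)          # examples of an unseen class
z = model.encoder(torch.cat([x, anchor], dim=-1))
synth = model.decoder(torch.cat([z, novel_anchor], dim=-1))
print(synth.shape)  # synthesized novel-class features: torch.Size([8, 256])
```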

DAC: Data-free Automatic Acceleration of Convolutional Networks

Title DAC: Data-free Automatic Acceleration of Convolutional Networks
Authors Xin Li, Shuai Zhang, Bolan Jiang, Yingyong Qi, Mooi Choo Chuah, Ning Bi
Abstract Deploying a deep learning model on mobile/IoT devices is a challenging task. The difficulty lies in the trade-off between computation speed and accuracy. A complex deep learning model with high accuracy runs slowly on resource-limited devices, while a light-weight model that runs much faster loses accuracy. In this paper, we propose a novel decomposition method, namely DAC, that is capable of factorizing an ordinary convolutional layer into two layers with much fewer parameters. DAC computes the corresponding weights for the newly generated layers directly from the weights of the original convolutional layer. Thus, no training (or fine-tuning) and no data are needed. The experimental results show that DAC reduces a large number of floating-point operations (FLOPs) while maintaining the high accuracy of a pre-trained model. If a 2% accuracy drop is acceptable, DAC saves 53% of the FLOPs of the VGG16 image classification model on the ImageNet dataset, 29% of the FLOPs of the SSD300 object detection model on the PASCAL VOC2007 dataset, and 46% of the FLOPs of a multi-person pose estimation model on the Microsoft COCO dataset. Compared to other existing decomposition methods, DAC achieves better performance.
Tasks Image Classification, Multi-Person Pose Estimation, Object Detection, Pose Estimation
Published 2018-12-20
URL http://arxiv.org/abs/1812.08374v2
PDF http://arxiv.org/pdf/1812.08374v2.pdf
PWC https://paperswithcode.com/paper/dac-data-free-automatic-acceleration-of
Repo https://github.com/baizhenmao95/2019-ZTE-Algorithm-Competition
Framework caffe2
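
DAC's exact factorization is not reproduced here, but the sketch below shows the general data-free recipe it belongs to: a truncated SVD of the pretrained kernel splits one convolution into two smaller layers, computed directly from the weights with no data or fine-tuning.

```python
import numpy as np

def lowrank_decompose_conv(weight, rank):
    """Data-free low-rank factorization of a conv weight (a generic SVD
    scheme in the spirit of DAC, not necessarily its exact method).

    weight: (c_out, c_in, k, k) pretrained kernel
    Returns (w1, w2): a k x k conv with `rank` filters, followed by a
    1 x 1 conv mapping rank -> c_out channels.
    """
    c_out, c_in, k, _ = weight.shape
    mat = weight.reshape(c_out, c_in * k * k)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    w1 = vt[:rank].reshape(rank, c_in, k, k)                  # first layer
    w2 = (u[:, :rank] * s[:rank]).reshape(c_out, rank, 1, 1)  # second layer
    return w1, w2

w = np.random.randn(64, 32, 3, 3)
w1, w2 = lowrank_decompose_conv(w, rank=16)
# Parameter count drops from 64*32*9 to 16*32*9 + 64*16.
approx = (w2.reshape(64, 16) @ w1.reshape(16, -1)).reshape(w.shape)
print(np.linalg.norm(w - approx) / np.linalg.norm(w))  # relative error
```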

Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels

Title Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels
Authors Bo Han, Quanming Yao, Xingrui Yu, Gang Niu, Miao Xu, Weihua Hu, Ivor Tsang, Masashi Sugiyama
Abstract Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they first memorize training data with clean labels and only then data with noisy labels. In this paper we therefore propose a new deep learning paradigm called Co-teaching for combating noisy labels. Namely, we train two deep neural networks simultaneously and let them teach each other on every mini-batch: first, each network feeds forward all data and selects some data with possibly clean labels; second, the two networks communicate to each other which data in the mini-batch should be used for training; finally, each network back-propagates on the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is far superior to state-of-the-art methods in the robustness of the trained deep models.
Tasks
Published 2018-04-18
URL http://arxiv.org/abs/1804.06872v3
PDF http://arxiv.org/pdf/1804.06872v3.pdf
PWC https://paperswithcode.com/paper/co-teaching-robust-training-of-deep-neural
Repo https://github.com/bhanML/Co-teaching
Framework pytorch
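
The training loop described above translates almost directly into code. Below is a hedged PyTorch sketch of one Co-teaching step: each network ranks the mini-batch by loss, keeps its small-loss (likely clean) samples, and its peer updates on that selection. The forget rate, model sizes, and data are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def co_teaching_step(net1, net2, opt1, opt2, x, y, forget_rate):
    """One Co-teaching mini-batch: each network selects its small-loss
    (likely clean) samples and its *peer* trains on them."""
    n_keep = int((1.0 - forget_rate) * len(y))

    with torch.no_grad():
        loss1 = F.cross_entropy(net1(x), y, reduction="none")
        loss2 = F.cross_entropy(net2(x), y, reduction="none")
        idx1 = torch.argsort(loss1)[:n_keep]  # net1's small-loss picks
        idx2 = torch.argsort(loss2)[:n_keep]  # net2's small-loss picks

    # Cross-update: net1 learns from net2's selection and vice versa.
    opt1.zero_grad()
    F.cross_entropy(net1(x[idx2]), y[idx2]).backward()
    opt1.step()

    opt2.zero_grad()
    F.cross_entropy(net2(x[idx1]), y[idx1]).backward()
    opt2.step()

# Toy usage with two small MLPs on random data (shapes are illustrative).
net1 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
net2 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
opt1 = torch.optim.SGD(net1.parameters(), lr=0.1)
opt2 = torch.optim.SGD(net2.parameters(), lr=0.1)
co_teaching_step(net1, net2, opt1, opt2,
                 torch.randn(64, 10), torch.randint(0, 3, (64,)),
                 forget_rate=0.2)
```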

Efficient keyword spotting using dilated convolutions and gating

Title Efficient keyword spotting using dilated convolutions and gating
Authors Alice Coucke, Mohammed Chlieh, Thibault Gisselbrecht, David Leroy, Mathieu Poumeyrol, Thibaut Lavril
Abstract We explore the application of end-to-end stateless temporal modeling to small-footprint keyword spotting, as opposed to recurrent networks that model long-term temporal dependencies using internal states. We propose a model inspired by the recent success of dilated convolutions in sequence modeling applications, allowing deeper architectures to be trained in resource-constrained configurations. Gated activations and residual connections are also added, following a configuration similar to WaveNet. In addition, we apply a custom target labeling that back-propagates loss from specific frames of interest, thereby yielding higher accuracy and only requiring detection of the end of the keyword. Our experimental results show that our model outperforms a recurrent neural network with LSTM cells trained with a max-pooling loss, with a significant decrease in false rejection rate. The underlying dataset - “Hey Snips” utterances recorded by over 2.2K different speakers - has been made publicly available to establish an open reference for wake-word detection.
Tasks Keyword Spotting, Small-Footprint Keyword Spotting
Published 2018-11-19
URL http://arxiv.org/abs/1811.07684v2
PDF http://arxiv.org/pdf/1811.07684v2.pdf
PWC https://paperswithcode.com/paper/efficient-keyword-spotting-using-dilated
Repo https://github.com/snipsco/tract
Framework tf
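
A minimal PyTorch sketch of the building block described above: a dilated causal convolution with a gated tanh/sigmoid activation and a residual connection, following the WaveNet-style configuration the abstract mentions. Channel counts, kernel size, and stack depth are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class GatedDilatedBlock(nn.Module):
    """Dilated causal conv + gated tanh/sigmoid activation + residual."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad only => causal
        self.filter = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.res = nn.Conv1d(channels, channels, 1)

    def forward(self, x):                        # x: (batch, channels, time)
        h = nn.functional.pad(x, (self.pad, 0))  # pad only the past
        h = torch.tanh(self.filter(h)) * torch.sigmoid(self.gate(h))
        return x + self.res(h)                   # residual connection

stack = nn.Sequential(*[GatedDilatedBlock(16, dilation=2 ** i) for i in range(4)])
out = stack(torch.randn(1, 16, 100))
print(out.shape)  # torch.Size([1, 16, 100]) -- receptive field grows with dilation
```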

Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network

Title Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network
Authors Yabin Zhang, Kui Jia, Zhixin Wang
Abstract Fine-grained object categorization aims to distinguish objects of subordinate categories that belong to the same entry-level object category. The task is challenging due to the facts that (1) training images with ground-truth labels are difficult to obtain, and (2) variations among different subordinate categories are subtle. It is well established that characterizing features of different subordinate categories are located on local parts of object instances. In fact, careful part annotations are available in many fine-grained categorization datasets. However, manually annotating object parts requires expertise, and is also difficult to generalize to new fine-grained categorization tasks. In this work, we propose a Weakly Supervised Part Detection Network (PartNet) that is able to detect discriminative local parts for use in fine-grained categorization. A vanilla PartNet builds, on top of a base subnetwork, two parallel streams of upper network layers, which respectively compute scores of classification probabilities (over subordinate categories) and detection probabilities (over a specified number of discriminative part detectors) for local regions of interest (RoIs). The image-level prediction is obtained by aggregating element-wise products of these region-level probabilities. To generate a diverse set of RoIs as inputs of PartNet, we propose a simple Discretized Part Proposals module (DPP) that directly targets proposing candidates of discriminative local parts, with no bridging via object-level proposals. Experiments on the benchmark CUB-200-2011 and Oxford Flower 102 datasets show the efficacy of our proposed method for both discriminative part detection and fine-grained categorization. In particular, we achieve new state-of-the-art performance on the CUB-200-2011 dataset when ground-truth part annotations are not available.
Tasks
Published 2018-06-16
URL https://arxiv.org/abs/1806.06198v2
PDF https://arxiv.org/pdf/1806.06198v2.pdf
PWC https://paperswithcode.com/paper/part-aware-fine-grained-object-categorization
Repo https://github.com/YabinZhang1994/PartNet
Framework pytorch
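
A hedged sketch of the two-stream head described above, read in a simplified WSDDN-like form: one stream produces per-RoI classification probabilities (softmax over categories), the other detection probabilities (softmax over RoIs), and their element-wise product is aggregated into the image-level prediction. The real PartNet's detection stream scores a specified number of part detectors; this toy collapses that detail.

```python
import torch
import torch.nn as nn

class TwoStreamRoIHead(nn.Module):
    """Simplified two-stream aggregation over RoI features."""
    def __init__(self, feat_dim, n_classes):
        super().__init__()
        self.cls_stream = nn.Linear(feat_dim, n_classes)
        self.det_stream = nn.Linear(feat_dim, n_classes)

    def forward(self, roi_feats):                          # (n_rois, feat_dim)
        p_cls = self.cls_stream(roi_feats).softmax(dim=1)  # over categories
        p_det = self.det_stream(roi_feats).softmax(dim=0)  # over RoIs
        return (p_cls * p_det).sum(dim=0)                  # image-level scores

head = TwoStreamRoIHead(feat_dim=512, n_classes=200)
scores = head(torch.randn(50, 512))  # 50 part proposals from a DPP-like module
print(scores.shape)  # torch.Size([200])
```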

Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering

Title Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering
Authors Eric Wallace, Pedro Rodriguez, Shi Feng, Ikuya Yamada, Jordan Boyd-Graber
Abstract Adversarial evaluation stress tests a model’s understanding of natural language. While past approaches expose superficial patterns, the resulting adversarial examples are limited in complexity and diversity. We propose human-in-the-loop adversarial generation, where human authors are guided to break models. We aid the authors with interpretations of model predictions through an interactive user interface. We apply this generation framework to a question answering task called Quizbowl, where trivia enthusiasts craft adversarial questions. The resulting questions are validated via live human–computer matches: although the questions appear ordinary to humans, they systematically stump neural and information retrieval models. The adversarial questions cover diverse phenomena from multi-hop reasoning to entity type distractors, exposing open challenges in robust question answering.
Tasks Information Retrieval, Question Answering
Published 2018-09-07
URL https://arxiv.org/abs/1809.02701v4
PDF https://arxiv.org/pdf/1809.02701v4.pdf
PWC https://paperswithcode.com/paper/trick-me-if-you-can-adversarial-writing-of
Repo https://github.com/Eric-Wallace/trickme-interface
Framework none

Weakly Supervised Instance Segmentation using Class Peak Response

Title Weakly Supervised Instance Segmentation using Class Peak Response
Authors Yanzhao Zhou, Yi Zhu, Qixiang Ye, Qiang Qiu, Jianbin Jiao
Abstract Weakly supervised instance segmentation with image-level labels, instead of expensive pixel-level masks, remains largely unexplored. In this paper, we tackle this challenging problem by exploiting class peak responses to enable a classification network to extract instance masks. With image-level label supervision only, CNN classifiers in a fully convolutional manner can produce class response maps, which specify classification confidence at each image location. We observe that local maxima, i.e., peaks, in a class response map typically correspond to strong visual cues residing inside each instance. Motivated by this, we first design a process to stimulate peaks to emerge from a class response map. The emerged peaks are then back-propagated and effectively mapped to highly informative regions of each object instance, such as instance boundaries. We refer to the above maps generated from class peak responses as Peak Response Maps (PRMs). PRMs provide a fine-detailed instance-level representation, which allows instance masks to be extracted even with some off-the-shelf methods. To the best of our knowledge, we are the first to report results for the challenging image-level supervised instance segmentation task. Extensive experiments show that our method also boosts weakly supervised pointwise localization and semantic segmentation performance, achieving state-of-the-art results on popular benchmarks, including PASCAL VOC 2012 and MS COCO.
Tasks Instance Segmentation, Semantic Segmentation, Weakly-supervised instance segmentation
Published 2018-04-03
URL http://arxiv.org/abs/1804.00880v1
PDF http://arxiv.org/pdf/1804.00880v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-instance-segmentation-using-1
Repo https://github.com/ZhouYanzhao/PRM
Framework pytorch
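
The first step, stimulating peaks in a class response map, is simple to sketch. The NumPy/SciPy toy below finds local maxima with a maximum filter; the paper's actual contribution, back-propagating those peaks into Peak Response Maps, is not reproduced here.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def find_class_peaks(response_map, window=3):
    """Locate local maxima ("peaks") in a class response map -- the cues
    that are back-propagated into Peak Response Maps. A sketch of peak
    finding only, with an ad-hoc mean threshold."""
    local_max = maximum_filter(response_map, size=window)
    peaks = (response_map == local_max) & (response_map > response_map.mean())
    return np.argwhere(peaks)  # (row, col) coordinates of peaks

# Toy class response map with two bright blobs.
resp = np.zeros((32, 32))
resp[8, 8], resp[24, 20] = 1.0, 0.8
print(find_class_peaks(resp))  # [[ 8  8] [24 20]]
```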

An efficient framework for learning sentence representations

Title An efficient framework for learning sentence representations
Authors Lajanugen Logeswaran, Honglak Lee
Abstract In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate the problem of predicting the context in which a sentence appears as a classification problem. Given a sentence and its context, a classifier distinguishes context sentences from other contrastive sentences based on their vector representations. This allows us to efficiently learn different types of encoding functions, and we show that the model learns high-quality sentence representations. We demonstrate that our sentence representations outperform state-of-the-art unsupervised and supervised representation learning methods on several downstream NLP tasks that involve understanding sentence semantics while achieving an order of magnitude speedup in training time.
Tasks Representation Learning
Published 2018-03-07
URL http://arxiv.org/abs/1803.02893v1
PDF http://arxiv.org/pdf/1803.02893v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-framework-for-learning-sentence
Repo https://github.com/lajanugen/S2V
Framework tf
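
The classification reformulation above fits in a few lines. Below is a hedged PyTorch sketch of the objective: score each sentence encoding against all context encodings in the batch, and classify the true context against the rest, which serve as contrastive negatives. The dot-product scorer and in-batch negatives follow the abstract's description; the random “encodings” are placeholders for real encoders.

```python
import torch
import torch.nn.functional as F

def context_classification_loss(sent_emb, ctx_emb):
    """Classify the true context sentence of each input sentence against
    the other sentences in the batch (contrastive negatives).

    sent_emb, ctx_emb: (batch, dim) encodings from two encoders
    """
    scores = sent_emb @ ctx_emb.t()        # (batch, batch) similarity matrix
    targets = torch.arange(len(sent_emb))  # i-th context matches i-th sentence
    return F.cross_entropy(scores, targets)

# Toy usage with random "encodings"; real encoders would be e.g. GRUs.
loss = context_classification_loss(torch.randn(16, 128), torch.randn(16, 128))
print(loss.item())
```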

DeepTileBars: Visualizing Term Distribution for Neural Information Retrieval

Title DeepTileBars: Visualizing Term Distribution for Neural Information Retrieval
Authors Zhiwen Tang, Grace Hui Yang
Abstract Most neural Information Retrieval (Neu-IR) models derive query-to-document ranking scores based on term-level matching. Inspired by TileBars, a classical term distribution visualization method, in this paper, we propose a novel Neu-IR model that handles query-to-document matching at the subtopic and higher levels. Our system first splits the documents into topical segments, “visualizes” the matchings between the query and the segments, and then feeds an interaction matrix into a Neu-IR model, DeepTileBars, to obtain the final ranking scores. DeepTileBars models the relevance signals occurring at different granularities in a document’s topic hierarchy. It better captures the discourse structure of a document and thus the matching patterns. Although its design and implementation are light-weight, DeepTileBars outperforms other state-of-the-art Neu-IR models on benchmark datasets including the Text REtrieval Conference (TREC) 2010-2012 Web Tracks and LETOR 4.0.
Tasks Ad-Hoc Information Retrieval, Document Ranking, Information Retrieval
Published 2018-11-01
URL http://arxiv.org/abs/1811.00606v2
PDF http://arxiv.org/pdf/1811.00606v2.pdf
PWC https://paperswithcode.com/paper/deeptilebars-visualizing-term-distribution
Repo https://github.com/smt-HS/DeepTileBars-release
Framework none
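
A toy version of the “visualization” step above: build a TileBars-style interaction matrix whose entry (i, j) records how strongly query term i matches topical segment j. The naive exact-match counting and ad-hoc segmentation below are stand-ins for the paper's topical segmenter and feature channels.

```python
import numpy as np

def interaction_matrix(query_terms, segments):
    """Entry (i, j) counts occurrences of query term i in segment j.
    DeepTileBars feeds such matrices to a neural ranker; this sketch
    stops at constructing the matrix."""
    mat = np.zeros((len(query_terms), len(segments)))
    for i, term in enumerate(query_terms):
        for j, seg in enumerate(segments):
            mat[i, j] = seg.lower().split().count(term.lower())
    return mat

segments = ["neural ranking models score documents",
            "term distribution matters for ranking",
            "unrelated closing remarks"]
print(interaction_matrix(["ranking", "term"], segments))
# [[1. 1. 0.]
#  [0. 1. 0.]]
```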

Adversarial Generalized Method of Moments

Title Adversarial Generalized Method of Moments
Authors Greg Lewis, Vasilis Syrgkanis
Abstract We provide an approach for learning deep neural net representations of models described via conditional moment restrictions. Conditional moment restrictions are widely used, as they are the language by which social scientists describe the assumptions they make to enable causal inference. We formulate the problem of estimating the underlying model as a zero-sum game between a modeler and an adversary and apply adversarial training. Our approach is similar in nature to Generative Adversarial Networks (GAN), though here the modeler is learning a representation of a function that satisfies a continuum of moment conditions and the adversary is identifying violated moments. We outline ways of constructing effective adversaries in practice, including kernels centered via k-means clustering and random forests. We examine the practical performance of our approach in the setting of non-parametric instrumental variable regression.
Tasks Causal Inference
Published 2018-03-19
URL http://arxiv.org/abs/1803.07164v2
PDF http://arxiv.org/pdf/1803.07164v2.pdf
PWC https://paperswithcode.com/paper/adversarial-generalized-method-of-moments
Repo https://github.com/vsyrgkanis/adversarial_gmm
Framework tf
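
A heavily simplified PyTorch sketch of the zero-sum game above: the modeler minimizes the worst violated conditional moment over a fixed family of RBF test functions. The adversary here is a hard max over hand-placed kernel centers, rather than the learned adversary or k-means-centered kernels the paper describes, and the instrumental-variable data is a toy.

```python
import torch
import torch.nn as nn

def rbf(z, centers, bw=0.5):
    """RBF test functions evaluated at each sample: (n,) -> (n, n_centers)."""
    return torch.exp(-(z[:, None] - centers[None, :]) ** 2 / (2 * bw ** 2))

f = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(f.parameters(), lr=1e-2)
centers = torch.linspace(-2, 2, 8)  # hand-placed, not k-means

# Toy instrumental-variable data: instrument z drives x; truth is y = 2x + noise.
z = torch.randn(512)
x = z + 0.3 * torch.randn(512)
y = 2.0 * x + 0.3 * torch.randn(512)

for step in range(500):
    resid = y - f(x.unsqueeze(1)).squeeze(1)                  # (n,)
    moments = (resid[:, None] * rbf(z, centers)).mean(dim=0)  # one per test fn
    loss = (moments ** 2).max()    # adversary: pick the worst violated moment
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f(torch.tensor([[1.0]])).item())  # ideally close to 2.0 on this toy
```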