Paper Group AWR 18
Fast Dynamic Routing Based on Weighted Kernel Density Estimation
Title | Fast Dynamic Routing Based on Weighted Kernel Density Estimation |
Authors | Suofei Zhang, Wei Zhao, Xiaofu Wu, Quan Zhou |
Abstract | Capsules, and dynamic routing between them, are recently proposed structures for deep neural networks. A capsule groups data into vectors or matrices as poses, rather than conventional scalars, to represent specific properties of a target instance. Besides the pose, a capsule carries a probability (often denoted as activation) for its presence. Dynamic routing helps capsules achieve more generalization capacity with many fewer model parameters. However, the bottleneck that prevents widespread application of capsules is the computational expense of routing. To address this problem, we generalize existing routing methods within the framework of weighted kernel density estimation, and propose two fast routing methods with different optimization strategies. Our methods improve the time efficiency of routing by nearly 40% with negligible performance degradation. By stacking a hybrid of convolutional layers and capsule layers, we construct a network architecture to handle inputs at a resolution of $64\times{64}$ pixels. The proposed models achieve performance on par with other leading methods on multiple benchmarks. |
Tasks | Density Estimation |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10807v2 |
http://arxiv.org/pdf/1805.10807v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-dynamic-routing-based-on-weighted-kernel |
Repo | https://github.com/andyweizhao/capsule_text_classification |
Framework | tf |
JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition
Title | JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition |
Authors | Canyu Le, Xin Li |
Abstract | This paper proposes a novel algorithm to reassemble an arbitrarily shredded image to its original state. Existing reassembly pipelines commonly consist of a local matching stage and a global composition stage. In the local stage, a key challenge in fragment reassembly is to reliably compute and identify correct pairwise matches, for which most existing algorithms use handcrafted features and hence cannot reliably handle complicated puzzles. We build a deep convolutional neural network to detect the compatibility of a pairwise stitching, and use it to prune computed pairwise matches. To improve the network efficiency and accuracy, we transfer the calculation of the CNN to the stitching region and apply a boost training strategy. In the global composition stage, we replace the commonly adopted greedy edge selection strategies with two new loop-closure-based searching algorithms. Extensive experiments show that our algorithm significantly outperforms existing methods on solving various puzzles, especially challenging ones with many fragment pieces. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.04137v1 |
http://arxiv.org/pdf/1809.04137v1.pdf | |
PWC | https://paperswithcode.com/paper/jigsawnet-shredded-image-reassembly-using |
Repo | https://github.com/Lecanyu/JigsawNet |
Framework | tf |
Bayesian Optimization with Expensive Integrands
Title | Bayesian Optimization with Expensive Integrands |
Authors | Saul Toscano-Palmerin, Peter I. Frazier |
Abstract | We propose a Bayesian optimization algorithm for objective functions that are sums or integrals of expensive-to-evaluate functions, allowing noisy evaluations. These objective functions arise in multi-task Bayesian optimization for tuning machine learning hyperparameters, optimization via simulation, and sequential design of experiments with random environmental conditions. Our method is average-case optimal by construction when a single evaluation of the integrand remains within our evaluation budget. Achieving this one-step optimality requires solving a challenging value of information optimization problem, for which we provide a novel efficient discretization-free computational method. We also provide consistency proofs for our method in both continuum and discrete finite domains for objective functions that are sums. In numerical experiments comparing against previous state-of-the-art methods, including those that also leverage sum or integral structure, our method performs as well or better across a wide range of problems and offers significant improvements when evaluations are noisy or the integrand varies smoothly in the integrated variables. |
Tasks | |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08661v1 |
http://arxiv.org/pdf/1803.08661v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-optimization-with-expensive |
Repo | https://github.com/toscanosaul/bayesian_quadrature_optimization |
Framework | none |
Analyzing Neuroimaging Data Through Recurrent Deep Learning Models
Title | Analyzing Neuroimaging Data Through Recurrent Deep Learning Models |
Authors | Armin W. Thomas, Hauke R. Heekeren, Klaus-Robert Müller, Wojciech Samek |
Abstract | The application of deep learning (DL) models to neuroimaging data poses several challenges, due to the high dimensionality, low sample size and complex temporo-spatial dependency structure of these datasets. Furthermore, DL models act as black-box models, impeding insight into the association of cognitive state and brain activity. To approach these challenges, we introduce the DeepLight framework, which utilizes long short-term memory (LSTM) based DL models to analyze whole-brain functional Magnetic Resonance Imaging (fMRI) data. To decode a cognitive state (e.g., seeing the image of a house), DeepLight separates the fMRI volume into a sequence of axial brain slices, which is then sequentially processed by an LSTM. To maintain interpretability, DeepLight adapts the layer-wise relevance propagation (LRP) technique, thereby decomposing its decoding decision into the contributions of the single input voxels to this decision. Importantly, the decomposition is performed on the level of single fMRI volumes, enabling DeepLight to study the associations between cognitive state and brain activity on several levels of data granularity, from the level of the group down to the level of single time points. To demonstrate the versatility of DeepLight, we apply it to a large fMRI dataset of the Human Connectome Project. We show that DeepLight outperforms conventional approaches of uni- and multivariate fMRI analysis in decoding the cognitive states and in identifying the physiologically appropriate brain regions associated with these states. We further demonstrate DeepLight’s ability to study the fine-grained temporo-spatial variability of brain activity over sequences of single fMRI samples. |
Tasks | |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09945v2 |
http://arxiv.org/pdf/1810.09945v2.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-lstms-for-whole-brain |
Repo | https://github.com/ArrasL/LRP_for_LSTM |
Framework | none |
Efficient Loss-Based Decoding on Graphs For Extreme Classification
Title | Efficient Loss-Based Decoding on Graphs For Extreme Classification |
Authors | Itay Evron, Edward Moroshko, Koby Crammer |
Abstract | In extreme classification problems, learning algorithms are required to map instances to labels from an extremely large label set. We build on a recent extreme classification framework with logarithmic time and space, and on a general approach for error correcting output coding (ECOC) with loss-based decoding, and introduce a flexible and efficient approach accompanied by theoretical bounds. Our framework employs output codes induced by graphs, for which we show how to perform efficient loss-based decoding to potentially improve accuracy. In addition, our framework offers a tradeoff between accuracy, model size and prediction time. We show how to find the sweet spot of this tradeoff using only the training data. Our experimental study demonstrates the validity of our assumptions and claims, and shows that our method is competitive with state-of-the-art algorithms. |
Tasks | |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.03319v2 |
http://arxiv.org/pdf/1803.03319v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-loss-based-decoding-on-graphs-for |
Repo | https://github.com/ievron/wltls |
Framework | none |
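The abstract above refers to loss-based decoding for error correcting output codes. As a rough illustration of that general idea only, the sketch below decodes by brute force over an explicit code matrix: each class has a codeword of ±1 bits, each bit has a real-valued predictor margin, and the predicted class is the one whose codeword incurs the smallest total loss. The function name, the toy code matrix, and the choice of exponential loss are assumptions made for illustration; the paper's graph-induced codes and its efficient decoding procedure are not reproduced here.

```python
import math


def loss_based_decode(code_matrix, margins, loss=lambda z: math.exp(-z)):
    """Return the index of the class whose codeword best agrees with the margins.

    code_matrix: one row per class, entries in {+1, -1} (hypothetical toy code).
    margins: real-valued outputs of the binary predictors, one per code bit.
    loss: margin-based loss; exponential loss is an illustrative choice.
    """
    totals = [
        sum(loss(bit * margin) for bit, margin in zip(row, margins))
        for row in code_matrix
    ]
    # The class with the smallest accumulated loss wins.
    return min(range(len(totals)), key=totals.__getitem__)


# Toy example: three classes, three binary predictors.
code = [
    [+1, +1, -1],
    [+1, -1, +1],
    [-1, +1, +1],
]
predicted = loss_based_decode(code, [0.9, -1.2, 0.4])
print(predicted)  # prints 1: the margins' signs match the second codeword
```

This exhaustive version costs time linear in the number of classes, which is exactly what an extreme classification method needs to avoid; it is shown only to make the decoding rule concrete.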
Delta-encoder: an effective sample synthesis method for few-shot object recognition
Title | Delta-encoder: an effective sample synthesis method for few-shot object recognition |
Authors | Eli Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes, Alex M. Bronstein |
Abstract | Learning to classify new categories based on just one or a few examples is a long-standing challenge in modern computer vision. In this work, we propose a simple yet effective method for few-shot (and one-shot) object recognition. Our approach is based on a modified auto-encoder, denoted Delta-encoder, that learns to synthesize new samples for an unseen category just by seeing a few examples from it. The synthesized samples are then used to train a classifier. The proposed approach learns both to extract transferable intra-class deformations, or “deltas”, between same-class pairs of training examples, and to apply those deltas to the few provided examples of a novel class (unseen during training) in order to efficiently synthesize samples from that new class. The proposed method improves over the state-of-the-art in one-shot object recognition and compares favorably in the few-shot case. Upon acceptance code will be made available. |
Tasks | Few-Shot Image Classification, Few-Shot Learning, Object Recognition |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04734v3 |
http://arxiv.org/pdf/1806.04734v3.pdf | |
PWC | https://paperswithcode.com/paper/delta-encoder-an-effective-sample-synthesis |
Repo | https://github.com/EliSchwartz/DeltaEncoder |
Framework | tf |
DAC: Data-free Automatic Acceleration of Convolutional Networks
Title | DAC: Data-free Automatic Acceleration of Convolutional Networks |
Authors | Xin Li, Shuai Zhang, Bolan Jiang, Yingyong Qi, Mooi Choo Chuah, Ning Bi |
Abstract | Deploying a deep learning model on mobile/IoT devices is a challenging task. The difficulty lies in the trade-off between computation speed and accuracy. A complex deep learning model with high accuracy runs slowly on resource-limited devices, while a light-weight model that runs much faster loses accuracy. In this paper, we propose a novel decomposition method, namely DAC, that is capable of factorizing an ordinary convolutional layer into two layers with far fewer parameters. DAC computes the corresponding weights for the newly generated layers directly from the weights of the original convolutional layer. Thus, no training (or fine-tuning) and no data are needed. The experimental results show that DAC removes a large number of floating-point operations (FLOPs) while maintaining the high accuracy of a pre-trained model. If a 2% accuracy drop is acceptable, DAC saves 53% of the FLOPs of the VGG16 image classification model on the ImageNet dataset, 29% of the FLOPs of the SSD300 object detection model on the PASCAL VOC2007 dataset, and 46% of the FLOPs of a multi-person pose estimation model on the Microsoft COCO dataset. Compared to other existing decomposition methods, DAC achieves better performance. |
Tasks | Image Classification, Multi-Person Pose Estimation, Object Detection, Pose Estimation |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08374v2 |
http://arxiv.org/pdf/1812.08374v2.pdf | |
PWC | https://paperswithcode.com/paper/dac-data-free-automatic-acceleration-of |
Repo | https://github.com/baizhenmao95/2019-ZTE-Algorithm-Competition |
Framework | caffe2 |
Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels
Title | Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels |
Authors | Bo Han, Quanming Yao, Xingrui Yu, Gang Niu, Miao Xu, Weihua Hu, Ivor Tsang, Masashi Sugiyama |
Abstract | Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they first memorize training data with clean labels and then data with noisy labels. Therefore, in this paper, we propose a new deep learning paradigm called Co-teaching for combating noisy labels. Namely, we train two deep neural networks simultaneously, and let them teach each other on every mini-batch: first, each network feeds forward all data and selects some data with possibly clean labels; second, the two networks communicate to each other which data in this mini-batch should be used for training; finally, each network back-propagates the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models. |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06872v3 |
http://arxiv.org/pdf/1804.06872v3.pdf | |
PWC | https://paperswithcode.com/paper/co-teaching-robust-training-of-deep-neural |
Repo | https://github.com/bhanML/Co-teaching |
Framework | pytorch |
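The three steps described in the Co-teaching abstract (select, exchange, update on the peer's selection) can be sketched for a single mini-batch as follows. This is a minimal illustration, not the authors' implementation: it assumes that "possibly clean" samples are the ones with the smallest per-sample loss, and the function names and the `keep_ratio` parameter are hypothetical. The per-sample losses would come from a forward pass of each network; here they are plain lists.

```python
def select_small_loss(losses, keep_ratio):
    """Return sorted indices of the keep_ratio fraction of samples with the smallest loss.

    Assumption: a small loss is used as the proxy for a clean label.
    """
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])


def co_teaching_exchange(losses_a, losses_b, keep_ratio):
    """Each network selects samples, then hands its selection to its peer."""
    picked_by_a = select_small_loss(losses_a, keep_ratio)
    picked_by_b = select_small_loss(losses_b, keep_ratio)
    # Network A is updated on B's selection and vice versa.
    return {"train_a_on": picked_by_b, "train_b_on": picked_by_a}


# Per-sample losses of the two networks on the same mini-batch of four samples.
plan = co_teaching_exchange(
    losses_a=[0.1, 2.0, 0.3, 1.5],
    losses_b=[0.2, 0.1, 3.0, 0.4],
    keep_ratio=0.5,
)
print(plan)  # {'train_a_on': [0, 1], 'train_b_on': [0, 2]}
```

In a real training loop, each network would then compute its loss only on the indices in its `train_*_on` list and take a gradient step.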
Efficient keyword spotting using dilated convolutions and gating
Title | Efficient keyword spotting using dilated convolutions and gating |
Authors | Alice Coucke, Mohammed Chlieh, Thibault Gisselbrecht, David Leroy, Mathieu Poumeyrol, Thibaut Lavril |
Abstract | We explore the application of end-to-end stateless temporal modeling to small-footprint keyword spotting, as opposed to recurrent networks that model long-term temporal dependencies using internal states. We propose a model inspired by the recent success of dilated convolutions in sequence modeling applications, which allows deeper architectures to be trained in resource-constrained configurations. Gated activations and residual connections are also added, following a similar configuration to WaveNet. In addition, we apply a custom target labeling that back-propagates loss from specific frames of interest, therefore yielding higher accuracy and requiring only that the end of the keyword be detected. Our experimental results show that our model outperforms a max-pooling loss trained recurrent neural network using LSTM cells, with a significant decrease in false rejection rate. The underlying dataset - “Hey Snips” utterances recorded by over 2.2K different speakers - has been made publicly available to establish an open reference for wake-word detection. |
Tasks | Keyword Spotting, Small-Footprint Keyword Spotting |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07684v2 |
http://arxiv.org/pdf/1811.07684v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-keyword-spotting-using-dilated |
Repo | https://github.com/snipsco/tract |
Framework | tf |
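To make the two building blocks named in the keyword spotting abstract concrete, the sketch below implements a one-dimensional dilated convolution and a WaveNet-style gated activation (tanh of a filter branch multiplied by the sigmoid of a gate branch) in plain Python. It is a single-channel toy under stated assumptions: the convolution is taken to be causal with zero padding on the left, and the weights and function names are made up for illustration rather than taken from the paper's model.

```python
import math


def dilated_causal_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], treating x as zero before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def gated_activation(x, filter_weights, gate_weights, dilation):
    """Gated unit: tanh(filter branch) * sigmoid(gate branch), element-wise."""
    f = dilated_causal_conv(x, filter_weights, dilation)
    g = dilated_causal_conv(x, gate_weights, dilation)
    return [math.tanh(a) / (1.0 + math.exp(-b)) for a, b in zip(f, g)]


signal = [1, 2, 3, 4, 5]
print(dilated_causal_conv(signal, [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0, 8.0]
print(gated_activation(signal, [0.5, -0.5], [0.1, 0.1], dilation=2))
```

With dilation 2 and two taps, each output frame sees the current frame and the one two steps back, so stacking layers with growing dilation widens the receptive field without any recurrent state.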
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network
Title | Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network |
Authors | Yabin Zhang, Kui Jia, Zhixin Wang |
Abstract | Fine-grained object categorization aims to distinguish objects of subordinate categories that belong to the same entry-level object category. The task is challenging because (1) training images with ground-truth labels are difficult to obtain, and (2) variations among different subordinate categories are subtle. It is well established that the characterizing features of different subordinate categories are located on local parts of object instances. In fact, careful part annotations are available in many fine-grained categorization datasets. However, manually annotating object parts requires expertise, which is also difficult to generalize to new fine-grained categorization tasks. In this work, we propose a Weakly Supervised Part Detection Network (PartNet) that is able to detect discriminative local parts for use in fine-grained categorization. A vanilla PartNet builds, on top of a base subnetwork, two parallel streams of upper network layers, which respectively compute scores of classification probabilities (over subordinate categories) and detection probabilities (over a specified number of discriminative part detectors) for local regions of interest (RoIs). The image-level prediction is obtained by aggregating element-wise products of these region-level probabilities. To generate a diverse set of RoIs as inputs to PartNet, we propose a simple Discretized Part Proposals module (DPP) that directly proposes candidates of discriminative local parts, with no bridging via object-level proposals. Experiments on the benchmark CUB-200-2011 and Oxford Flower 102 datasets show the efficacy of our proposed method for both discriminative part detection and fine-grained categorization. In particular, we achieve new state-of-the-art performance on the CUB-200-2011 dataset when ground-truth part annotations are not available. |
Tasks | |
Published | 2018-06-16 |
URL | https://arxiv.org/abs/1806.06198v2 |
https://arxiv.org/pdf/1806.06198v2.pdf | |
PWC | https://paperswithcode.com/paper/part-aware-fine-grained-object-categorization |
Repo | https://github.com/YabinZhang1994/PartNet |
Framework | pytorch |
Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering
Title | Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering |
Authors | Eric Wallace, Pedro Rodriguez, Shi Feng, Ikuya Yamada, Jordan Boyd-Graber |
Abstract | Adversarial evaluation stress tests a model’s understanding of natural language. While past approaches expose superficial patterns, the resulting adversarial examples are limited in complexity and diversity. We propose human-in-the-loop adversarial generation, where human authors are guided to break models. We aid the authors with interpretations of model predictions through an interactive user interface. We apply this generation framework to a question answering task called Quizbowl, where trivia enthusiasts craft adversarial questions. The resulting questions are validated via live human–computer matches: although the questions appear ordinary to humans, they systematically stump neural and information retrieval models. The adversarial questions cover diverse phenomena from multi-hop reasoning to entity type distractors, exposing open challenges in robust question answering. |
Tasks | Information Retrieval, Question Answering |
Published | 2018-09-07 |
URL | https://arxiv.org/abs/1809.02701v4 |
https://arxiv.org/pdf/1809.02701v4.pdf | |
PWC | https://paperswithcode.com/paper/trick-me-if-you-can-adversarial-writing-of |
Repo | https://github.com/Eric-Wallace/trickme-interface |
Framework | none |
Weakly Supervised Instance Segmentation using Class Peak Response
Title | Weakly Supervised Instance Segmentation using Class Peak Response |
Authors | Yanzhao Zhou, Yi Zhu, Qixiang Ye, Qiang Qiu, Jianbin Jiao |
Abstract | Weakly supervised instance segmentation with image-level labels, instead of expensive pixel-level masks, remains unexplored. In this paper, we tackle this challenging problem by exploiting class peak responses to enable a classification network for instance mask extraction. With image-label supervision only, CNN classifiers applied in a fully convolutional manner can produce class response maps, which specify classification confidence at each image location. We observed that local maxima, i.e., peaks, in a class response map typically correspond to strong visual cues residing inside each instance. Motivated by this, we first design a process to stimulate peaks to emerge from a class response map. The emerged peaks are then back-propagated and effectively mapped to highly informative regions of each object instance, such as instance boundaries. We refer to the above maps generated from class peak responses as Peak Response Maps (PRMs). PRMs provide a fine-detailed instance-level representation, which allows instance masks to be extracted even with some off-the-shelf methods. To the best of our knowledge, we report results for the challenging image-level supervised instance segmentation task for the first time. Extensive experiments show that our method also boosts weakly supervised pointwise localization as well as semantic segmentation performance, and reports state-of-the-art results on popular benchmarks, including PASCAL VOC 2012 and MS COCO. |
Tasks | Instance Segmentation, Semantic Segmentation, Weakly-supervised instance segmentation |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.00880v1 |
http://arxiv.org/pdf/1804.00880v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-instance-segmentation-using-1 |
Repo | https://github.com/ZhouYanzhao/PRM |
Framework | pytorch |
An efficient framework for learning sentence representations
Title | An efficient framework for learning sentence representations |
Authors | Lajanugen Logeswaran, Honglak Lee |
Abstract | In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate the problem of predicting the context in which a sentence appears as a classification problem. Given a sentence and its context, a classifier distinguishes context sentences from other contrastive sentences based on their vector representations. This allows us to efficiently learn different types of encoding functions, and we show that the model learns high-quality sentence representations. We demonstrate that our sentence representations outperform state-of-the-art unsupervised and supervised representation learning methods on several downstream NLP tasks that involve understanding sentence semantics while achieving an order of magnitude speedup in training time. |
Tasks | Representation Learning |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02893v1 |
http://arxiv.org/pdf/1803.02893v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-framework-for-learning-sentence |
Repo | https://github.com/lajanugen/S2V |
Framework | tf |
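The classification view described in the sentence representation abstract can be illustrated with a small scoring function: given the vector of a sentence and the vectors of several candidate sentences, it returns a probability for each candidate being the true context. This is a sketch under assumptions, not the paper's model: it scores candidates by inner product followed by a softmax, and the vectors and function name are made up. In the actual framework the vectors would come from learned encoders.

```python
import math


def context_probabilities(sentence_vec, candidate_vecs):
    """Softmax over inner products between a sentence and each candidate.

    Assumption: the inner product is used as the compatibility score.
    """
    scores = [
        sum(a * b for a, b in zip(sentence_vec, cand)) for cand in candidate_vecs
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# One true context candidate (index 0) and two contrastive candidates.
probs = context_probabilities(
    [1.0, 0.0],
    [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
)
print(probs.index(max(probs)))  # 0
```

Training such a classifier would maximize the probability assigned to the true context sentence, which avoids generating the context word by word.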
DeepTileBars: Visualizing Term Distribution for Neural Information Retrieval
Title | DeepTileBars: Visualizing Term Distribution for Neural Information Retrieval |
Authors | Zhiwen Tang, Grace Hui Yang |
Abstract | Most neural Information Retrieval (Neu-IR) models derive query-to-document ranking scores based on term-level matching. Inspired by TileBars, a classical term distribution visualization method, in this paper, we propose a novel Neu-IR model that handles query-to-document matching at the subtopic and higher levels. Our system first splits the documents into topical segments, “visualizes” the matchings between the query and the segments, and then feeds an interaction matrix into a Neu-IR model, DeepTileBars, to obtain the final ranking scores. DeepTileBars models the relevance signals occurring at different granularities in a document’s topic hierarchy. It better captures the discourse structure of a document and thus the matching patterns. Although its design and implementation are light-weight, DeepTileBars outperforms other state-of-the-art Neu-IR models on benchmark datasets including the Text REtrieval Conference (TREC) 2010-2012 Web Tracks and LETOR 4.0. |
Tasks | Ad-Hoc Information Retrieval, Document Ranking, Information Retrieval |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00606v2 |
http://arxiv.org/pdf/1811.00606v2.pdf | |
PWC | https://paperswithcode.com/paper/deeptilebars-visualizing-term-distribution |
Repo | https://github.com/smt-HS/DeepTileBars-release |
Framework | none |
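The DeepTileBars abstract describes splitting a document into segments and building an interaction matrix between the query and those segments. A minimal sketch of such a matrix is given below, with one row per query term and one column per segment. Two simplifications are assumed here: the segments are supplied already tokenized rather than produced by topical segmentation, and the match signal is a plain term count. The function name is hypothetical.

```python
def interaction_matrix(query_terms, segments):
    """Rows are query terms, columns are document segments, entries are term counts.

    Assumption: raw term frequency stands in for the matching signal.
    """
    return [[segment.count(term) for segment in segments] for term in query_terms]


segments = [
    ["neural", "ir", "models"],
    ["ranking", "neural", "neural"],
]
matrix = interaction_matrix(["neural", "ranking"], segments)
print(matrix)  # [[1, 2], [0, 1]]
```

Read row by row, the matrix shows where in the document each query term occurs, which is the TileBars-style picture a downstream ranking network could consume.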
Adversarial Generalized Method of Moments
Title | Adversarial Generalized Method of Moments |
Authors | Greg Lewis, Vasilis Syrgkanis |
Abstract | We provide an approach for learning deep neural net representations of models described via conditional moment restrictions. Conditional moment restrictions are widely used, as they are the language by which social scientists describe the assumptions they make to enable causal inference. We formulate the problem of estimating the underlying model as a zero-sum game between a modeler and an adversary and apply adversarial training. Our approach is similar in nature to Generative Adversarial Networks (GAN), though here the modeler is learning a representation of a function that satisfies a continuum of moment conditions and the adversary is identifying violating moments. We outline ways of constructing effective adversaries in practice, including kernels centered by k-means clustering, and random forests. We examine the practical performance of our approach in the setting of non-parametric instrumental variable regression. |
Tasks | Causal Inference |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.07164v2 |
http://arxiv.org/pdf/1803.07164v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-generalized-method-of-moments |
Repo | https://github.com/vsyrgkanis/adversarial_gmm |
Framework | tf |