Paper Group AWR 75
A Particle Swarm Optimization-based Flexible Convolutional Auto-Encoder for Image Classification
Title | A Particle Swarm Optimization-based Flexible Convolutional Auto-Encoder for Image Classification |
Authors | Yanan Sun, Bing Xue, Mengjie Zhang, Gary G. Yen |
Abstract | Convolutional auto-encoders have shown remarkable performance when stacked into deep convolutional neural networks for classifying image data during the past several years. However, they are unable to construct the state-of-the-art convolutional neural networks due to their intrinsic architectures. In this regard, we propose a flexible convolutional auto-encoder by eliminating the constraints on the numbers of convolutional layers and pooling layers from the traditional convolutional auto-encoder. We also design an architecture discovery method by using particle swarm optimization, which is capable of automatically searching for the optimal architectures of the proposed flexible convolutional auto-encoder with much less computational resource and without any manual intervention. We use the designed architecture optimization algorithm to test the proposed flexible convolutional auto-encoder using a single graphics processing unit card on four extensively used image classification datasets. Experimental results show that our work significantly outperforms the peer competitors, including the state-of-the-art algorithm. |
Tasks | Image Classification |
Published | 2017-12-13 |
URL | http://arxiv.org/abs/1712.05042v2 |
http://arxiv.org/pdf/1712.05042v2.pdf | |
PWC | https://paperswithcode.com/paper/a-particle-swarm-optimization-based-flexible |
Repo | https://github.com/yn-sun/evocae |
Framework | tf |
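As a rough illustration of the search procedure this abstract names, the following is a minimal sketch of a plain global-best particle swarm optimizer on a toy continuous function. It is not the authors' method: their algorithm searches over auto-encoder architectures, whereas everything here (the `pso_minimize` and `sphere` names, the inertia and acceleration weights, the box bounds) is an assumption chosen only to show the velocity and position updates.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100, seed=0):
    """Minimise `objective` over a box with a plain global-best PSO (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position; the swarm shares a global best.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

In an architecture search, the position vector would presumably encode layer choices and the objective would be a validation error, which is far more expensive to evaluate than this toy function.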
The Space of Transferable Adversarial Examples
Title | The Space of Transferable Adversarial Examples |
Authors | Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel |
Abstract | Adversarial examples are maliciously perturbed inputs designed to mislead machine learning (ML) models at test-time. They often transfer: the same adversarial example fools more than one model. In this work, we propose novel methods for estimating the previously unknown dimensionality of the space of adversarial inputs. We find that adversarial examples span a contiguous subspace of large (~25) dimensionality. Adversarial subspaces with higher dimensionality are more likely to intersect. We find that for two different models, a significant fraction of their subspaces is shared, thus enabling transferability. In the first quantitative analysis of the similarity of different models’ decision boundaries, we show that these boundaries are actually close in arbitrary directions, whether adversarial or benign. We conclude by formally studying the limits of transferability. We derive (1) sufficient conditions on the data distribution that imply transferability for simple model classes and (2) examples of scenarios in which transfer does not occur. These findings indicate that it may be possible to design defenses against transfer-based attacks, even for models that are vulnerable to direct attacks. |
Tasks | |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03453v2 |
http://arxiv.org/pdf/1704.03453v2.pdf | |
PWC | https://paperswithcode.com/paper/the-space-of-transferable-adversarial |
Repo | https://github.com/panda1230/Adversarial_NoiseLearning_NoL |
Framework | pytorch |
Relaxed Oracles for Semi-Supervised Clustering
Title | Relaxed Oracles for Semi-Supervised Clustering |
Authors | Taewan Kim, Joydeep Ghosh |
Abstract | Pairwise “same-cluster” queries are one of the most widely used forms of supervision in semi-supervised clustering. However, it is impractical to ask human oracles to answer every query correctly. In this paper, we study the influence of allowing “not-sure” answers from a weak oracle and propose an effective algorithm to handle such uncertainties in query responses. Two realistic weak oracle models are considered where ambiguity in answering depends on the distance between two points. We show that a small query complexity is adequate for effective clustering with high probability by providing better pairs to the weak oracle. Experimental results on synthetic and real data show the effectiveness of our approach in overcoming supervision uncertainties and yielding high quality clusters. |
Tasks | |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07433v1 |
http://arxiv.org/pdf/1711.07433v1.pdf | |
PWC | https://paperswithcode.com/paper/relaxed-oracles-for-semi-supervised |
Repo | https://github.com/twankim/weaksemi |
Framework | none |
Tensor Regression Networks with various Low-Rank Tensor Approximations
Title | Tensor Regression Networks with various Low-Rank Tensor Approximations |
Authors | Xingwei Cao, Guillaume Rabusseau |
Abstract | Tensor regression networks achieve a high compression rate for neural networks with only a slight impact on performance. They do so by imposing low tensor rank structure on the weight matrices of fully connected layers. In recent years, tensor regression networks have been investigated from the perspective of their compressive power; however, the regularization effect of enforcing low-rank tensor structure has not been investigated enough. We study tensor regression networks using various low-rank tensor approximations, aiming to compare the compressive and regularization power of different low-rank constraints. We evaluate the compressive and regularization performance of the proposed model with both deep and shallow convolutional neural networks. The outcome of our experiment suggests the superiority of the Global Average Pooling layer over the Tensor Regression Layer when applied to a deep convolutional neural network on the CIFAR-10 dataset. In contrast, shallow convolutional neural networks with a tensor regression layer and dropout achieved lower test error than both Global Average Pooling and a fully-connected layer with dropout when trained with a small number of samples. |
Tasks | |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09520v2 |
http://arxiv.org/pdf/1712.09520v2.pdf | |
PWC | https://paperswithcode.com/paper/tensor-regression-networks-with-various-low |
Repo | https://github.com/Vixaer/LowRankTRN |
Framework | tf |
Detect to Track and Track to Detect
Title | Detect to Track and Track to Detect |
Authors | Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman |
Abstract | Recent approaches for high accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs detection and tracking, solving the task in a simple and effective way. Our contributions are threefold: (i) we set up a ConvNet architecture for simultaneous detection and tracking, using a multi-task objective for frame-based object detection and across-frame track regression; (ii) we introduce correlation features that represent object co-occurrences across time to aid the ConvNet during tracking; and (iii) we link the frame level detections based on our across-frame tracklets to produce high accuracy detections at the video level. Our ConvNet architecture for spatiotemporal object detection is evaluated on the large-scale ImageNet VID dataset where it achieves state-of-the-art results. Our approach provides better single model performance than the winning method of the last ImageNet challenge while being conceptually much simpler. Finally, we show that by increasing the temporal stride we can dramatically increase the tracker speed. |
Tasks | Object Detection |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.03958v2 |
http://arxiv.org/pdf/1710.03958v2.pdf | |
PWC | https://paperswithcode.com/paper/detect-to-track-and-track-to-detect |
Repo | https://github.com/feichtenhofer/detect-track |
Framework | none |
Offline bilingual word vectors, orthogonal transformations and the inverted softmax
Title | Offline bilingual word vectors, orthogonal transformations and the inverted softmax |
Authors | Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla |
Abstract | Usually bilingual word vectors are trained “online”. Mikolov et al. showed they can also be found “offline”, whereby two pre-trained embeddings are aligned with a linear transformation, using dictionaries compiled from expert knowledge. In this work, we prove that the linear transformation between two spaces should be orthogonal. This transformation can be obtained using the singular value decomposition. We introduce a novel “inverted softmax” for identifying translation pairs, with which we improve the precision @1 of Mikolov’s original mapping from 34% to 43%, when translating a test set composed of both common and rare English words into Italian. Orthogonal transformations are more robust to noise, enabling us to learn the transformation without expert bilingual signal by constructing a “pseudo-dictionary” from the identical character strings which appear in both languages, achieving 40% precision on the same test set. Finally, we extend our method to retrieve the true translations of English sentences from a corpus of 200k Italian sentences with a precision @1 of 68%. |
Tasks | |
Published | 2017-02-13 |
URL | http://arxiv.org/abs/1702.03859v1 |
http://arxiv.org/pdf/1702.03859v1.pdf | |
PWC | https://paperswithcode.com/paper/offline-bilingual-word-vectors-orthogonal |
Repo | https://github.com/jiajunhua/facebookresearch-MUSE |
Framework | pytorch |
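The abstract names the "inverted softmax" without spelling it out. The sketch below shows one common reading of it, stated here as an assumption: each similarity is normalized over source words (per target word) instead of over target words, which penalizes "hub" targets that are close to everything. The function names, the toy similarity matrix and the temperature `beta` are all hypothetical, and the per-target normalization constant is left out because it does not change the ranking within a column.

```python
import math


def nearest_neighbour(sim):
    """For each source word (row), pick the target (column) with the highest score."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in sim]


def inverted_softmax(sim, beta=10.0):
    """Rescale sim[i][j] by a softmax taken over source words i for each target j.

    Assumed form: exp(beta * sim[i][j]) / sum_n exp(beta * sim[n][j]).
    """
    n_src, n_tgt = len(sim), len(sim[0])
    col_norm = [sum(math.exp(beta * sim[n][j]) for n in range(n_src))
                for j in range(n_tgt)]
    return [[math.exp(beta * sim[i][j]) / col_norm[j] for j in range(n_tgt)]
            for i in range(n_src)]


# Toy similarities: target 0 is a hub that scores highly for both source words.
sim = [[0.6, 0.5],
       [0.7, 0.1]]

plain = nearest_neighbour(sim)                       # both sources pick the hub
inverted = nearest_neighbour(inverted_softmax(sim))  # source 0 now picks target 1
```

With plain nearest-neighbour retrieval both source words map to target 0; after the rescaling, source 0 maps to target 1 and source 1 keeps target 0, so each target is used once.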
End-to-end Driving via Conditional Imitation Learning
Title | End-to-end Driving via Conditional Imitation Learning |
Authors | Felipe Codevilla, Matthias Müller, Antonio López, Vladlen Koltun, Alexey Dosovitskiy |
Abstract | Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at https://youtu.be/cFtnflNe5fM |
Tasks | Imitation Learning |
Published | 2017-10-06 |
URL | http://arxiv.org/abs/1710.02410v2 |
http://arxiv.org/pdf/1710.02410v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-driving-via-conditional-imitation |
Repo | https://github.com/bitsauce/Carla-ppo |
Framework | tf |
Deep CNN ensembles and suggestive annotations for infant brain MRI segmentation
Title | Deep CNN ensembles and suggestive annotations for infant brain MRI segmentation |
Authors | Jose Dolz, Christian Desrosiers, Li Wang, Jing Yuan, Dinggang Shen, Ismail Ben Ayed |
Abstract | Precise 3D segmentation of infant brain tissues is an essential step towards comprehensive volumetric studies and quantitative analysis of early brain development. However, computing such segmentations is very challenging, especially for the 6-month infant brain, due to the poor image quality, among other difficulties inherent to infant brain MRI, e.g., the isointense contrast between white and gray matter and the severe partial volume effect due to small brain sizes. This study investigates the problem with an ensemble of semi-dense fully convolutional neural networks (CNNs), which employs T1-weighted and T2-weighted MR images as input. We demonstrate that the ensemble agreement is highly correlated with the segmentation errors. Therefore, our method provides measures that can guide local user corrections. To the best of our knowledge, this work is the first ensemble of 3D CNNs for suggesting annotations within images. Furthermore, inspired by the very recent success of dense networks, we propose a novel architecture, SemiDenseNet, which connects all convolutional layers directly to the end of the network. Our architecture allows the efficient propagation of gradients during training, while limiting the number of parameters, requiring one order of magnitude fewer parameters than popular medical image segmentation networks such as 3D U-Net. Another contribution of our work is the study of the impact that early or late fusions of multiple image modalities might have on the performance of deep architectures. We report evaluations of our method on the public data of the MICCAI iSEG-2017 Challenge on 6-month infant brain MRI segmentation, and show very competitive results among 21 teams, ranking first or second in most metrics. |
Tasks | Infant Brain MRI Segmentation, Medical Image Segmentation, Semantic Segmentation |
Published | 2017-12-14 |
URL | http://arxiv.org/abs/1712.05319v2 |
http://arxiv.org/pdf/1712.05319v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-cnn-ensembles-and-suggestive-annotations |
Repo | https://github.com/josedolz/SemiDenseNet |
Framework | none |
Collaborative Deep Reinforcement Learning
Title | Collaborative Deep Reinforcement Learning |
Authors | Kaixiang Lin, Shu Wang, Jiayu Zhou |
Abstract | Besides independent learning, the human learning process is greatly improved by summarizing what has been learned, communicating it to peers, and subsequently fusing knowledge from different sources to assist the current learning goal. This collaborative learning procedure ensures that the knowledge is shared, continuously refined, and concluded from different perspectives to construct a more profound understanding. The idea of knowledge transfer has led to many advances in machine learning and data mining, but significant challenges remain, especially when it comes to reinforcement learning, heterogeneous model structures, and different learning tasks. Motivated by human collaborative learning, in this paper we propose a collaborative deep reinforcement learning (CDRL) framework that performs adaptive knowledge transfer among heterogeneous learning agents. Specifically, the proposed CDRL conducts a novel deep knowledge distillation method to address the heterogeneity among different learning tasks with a deep alignment network. Furthermore, we present an efficient collaborative Asynchronous Advantage Actor-Critic (cA3C) algorithm to incorporate deep knowledge distillation into the online training of agents, and demonstrate the effectiveness of the CDRL framework using extensive empirical evaluation on OpenAI gym. |
Tasks | Transfer Learning |
Published | 2017-02-19 |
URL | http://arxiv.org/abs/1702.05796v1 |
http://arxiv.org/pdf/1702.05796v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-deep-reinforcement-learning |
Repo | https://github.com/illidanlab/cdrl |
Framework | tf |
Dynamic Routing Between Capsules
Title | Dynamic Routing Between Capsules |
Authors | Sara Sabour, Nicholas Frosst, Geoffrey E Hinton |
Abstract | A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters. Active capsules at one level make predictions, via transformation matrices, for the instantiation parameters of higher-level capsules. When multiple predictions agree, a higher level capsule becomes active. We show that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits. To achieve these results we use an iterative routing-by-agreement mechanism: A lower-level capsule prefers to send its output to higher level capsules whose activity vectors have a big scalar product with the prediction coming from the lower-level capsule. |
Tasks | Image Classification |
Published | 2017-10-26 |
URL | http://arxiv.org/abs/1710.09829v2 |
http://arxiv.org/pdf/1710.09829v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-routing-between-capsules |
Repo | https://github.com/Suraj-Panwar/Capsule_Network_based_Deep_Q_learning |
Framework | tf |
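The routing-by-agreement mechanism described in this abstract can be sketched in a few lines of plain Python. This is a hedged illustration, not the authors' implementation: the `squash` nonlinearity (which keeps a vector's direction and maps its length into [0, 1) so that it can act as a probability), the three routing iterations and the toy prediction vectors are assumptions, and the transformation matrices that produce the predictions are skipped entirely.

```python
import math


def squash(v):
    """Keep the direction of v but map its length into [0, 1)."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def route(u_hat, n_iters=3):
    """u_hat[i][j] is lower capsule i's prediction vector for higher capsule j."""
    n_low, n_high = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * n_high for _ in range(n_low)]  # routing logits, start uniform
    for _ in range(n_iters):
        # Each lower capsule distributes its output over the higher capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_high):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_low))
                 for d in range(dim)]
            v.append(squash(s))
        # Agreement: scalar product of each prediction with the resulting output.
        for i in range(n_low):
            for j in range(n_high):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Toy input: all three lower capsules agree about higher capsule 0,
# while their predictions for higher capsule 1 partly cancel out.
u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
v, c = route(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in v]
```

On this toy input every lower capsule ends up sending most of its output to higher capsule 0, whose activity vector is also the longer of the two.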
Leaf Counting with Deep Convolutional and Deconvolutional Networks
Title | Leaf Counting with Deep Convolutional and Deconvolutional Networks |
Authors | Shubhra Aich, Ian Stavness |
Abstract | In this paper, we investigate the problem of counting rosette leaves from an RGB image, an important task in plant phenotyping. We propose a data-driven approach for this task generalized over different plant species and imaging setups. To accomplish this task, we use state-of-the-art deep learning architectures: a deconvolutional network for initial segmentation and a convolutional network for leaf counting. Evaluation is performed on the leaf counting challenge dataset at CVPPP-2017. Despite the small number of training samples in this dataset, as compared to typical deep learning image sets, we obtain satisfactory performance on segmenting leaves from the background as a whole and counting the number of leaves using simple data augmentation strategies. Comparative analysis is provided against methods evaluated on the previous competition datasets. Our framework achieves mean and standard deviation of absolute count difference of 1.62 and 2.30 averaged over all five test datasets. |
Tasks | Data Augmentation |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07570v2 |
http://arxiv.org/pdf/1708.07570v2.pdf | |
PWC | https://paperswithcode.com/paper/leaf-counting-with-deep-convolutional-and |
Repo | https://github.com/p2irc/leaf_count_ICCVW-2017 |
Framework | none |
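The headline metric in this abstract, the mean and standard deviation of the absolute count difference, is simple to compute. The sketch below uses made-up counts and assumes the population standard deviation; the abstract does not say which convention the evaluation uses, and the function name is hypothetical.

```python
import statistics


def abs_count_difference(predicted, actual):
    """Mean and (population) standard deviation of |predicted - actual| per image."""
    diffs = [abs(p - a) for p, a in zip(predicted, actual)]
    return statistics.mean(diffs), statistics.pstdev(diffs)


# Made-up leaf counts for three images (not data from the paper).
predicted = [5, 7, 9]
actual = [5, 6, 12]
mean_diff, std_diff = abs_count_difference(predicted, actual)
```

Here the absolute differences are 0, 1 and 3, so the mean is 4/3; swapping `pstdev` for `stdev` would give the sample standard deviation instead.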
Deep Learning Methods for Improved Decoding of Linear Codes
Title | Deep Learning Methods for Improved Decoding of Linear Codes |
Authors | Eliya Nachmani, Elad Marciano, Loren Lugosch, Warren J. Gross, David Burshtein, Yair Beery |
Abstract | The problem of low complexity, close to optimal, channel decoding of linear codes with short to moderate block length is considered. It is shown that deep learning methods can be used to improve a standard belief propagation decoder, despite the large example space. Similar improvements are obtained for the min-sum algorithm. It is also shown that tying the parameters of the decoders across iterations, so as to form a recurrent neural network architecture, can be implemented with comparable results. The advantage is that significantly fewer parameters are required. We also introduce a recurrent neural decoder architecture based on the method of successive relaxation. Improvements over standard belief propagation are also observed on sparser Tanner graph representations of the codes. Furthermore, we demonstrate that the neural belief propagation decoder can be used to improve the performance, or alternatively reduce the computational complexity, of a close to optimal decoder of short BCH codes. |
Tasks | |
Published | 2017-06-21 |
URL | http://arxiv.org/abs/1706.07043v2 |
http://arxiv.org/pdf/1706.07043v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-methods-for-improved-decoding |
Repo | https://github.com/lorenlugosch/neural-min-sum-decoding |
Framework | tf |
Learning Visual Reasoning Without Strong Priors
Title | Learning Visual Reasoning Without Strong Priors |
Authors | Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville |
Abstract | Achieving artificial visual reasoning - the ability to answer image-related questions which require a multi-step, high-level process - is an important step towards artificial general intelligence. This multi-modal task requires learning a question-dependent, structured reasoning process over images from language. Standard deep learning approaches tend to exploit biases in the data rather than learn this underlying structure, while leading methods learn to visually reason successfully but are hand-crafted for reasoning. We show that a general-purpose, Conditional Batch Normalization approach achieves state-of-the-art results on the CLEVR Visual Reasoning benchmark with a 2.4% error rate. We outperform the next best end-to-end method (4.5%) and even methods that use extra supervision (3.1%). We probe our model to shed light on how it reasons, showing it has learned a question-dependent, multi-step process. Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively. |
Tasks | Visual Reasoning |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.03017v5 |
http://arxiv.org/pdf/1707.03017v5.pdf | |
PWC | https://paperswithcode.com/paper/learning-visual-reasoning-without-strong |
Repo | https://github.com/GuessWhatGame/clevr |
Framework | tf |
Few-Shot Learning with Graph Neural Networks
Title | Few-Shot Learning with Graph Neural Networks |
Authors | Victor Garcia, Joan Bruna |
Abstract | We propose to study the problem of few-shot learning through the prism of inference on a partially observed graphical model, constructed from a collection of input images whose labels can be either observed or not. By assimilating generic message-passing inference algorithms with their neural-network counterparts, we define a graph neural network architecture that generalizes several of the recently proposed few-shot learning models. Besides providing improved numerical performance, our framework is easily extended to variants of few-shot learning, such as semi-supervised or active learning, demonstrating the ability of graph-based models to operate well on ‘relational’ tasks. |
Tasks | Active Learning, Few-Shot Learning |
Published | 2017-11-10 |
URL | http://arxiv.org/abs/1711.04043v3 |
http://arxiv.org/pdf/1711.04043v3.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-learning-with-graph-neural-networks |
Repo | https://github.com/HoganZhang/few-shot-gnn |
Framework | pytorch |
Understanding Infographics through Textual and Visual Tag Prediction
Title | Understanding Infographics through Textual and Visual Tag Prediction |
Authors | Zoya Bylinskii, Sami Alsheikh, Spandan Madan, Adria Recasens, Kimberli Zhong, Hanspeter Pfister, Fredo Durand, Aude Oliva |
Abstract | We introduce the problem of visual hashtag discovery for infographics: extracting visual elements from an infographic that are diagnostic of its topic. Given an infographic as input, our computational approach automatically outputs textual and visual elements predicted to be representative of the infographic content. Concretely, from a curated dataset of 29K large infographic images sampled across 26 categories and 391 tags, we present an automated two-step approach. First, we extract the text from an infographic and use it to predict text tags indicative of the infographic content. Second, we use these predicted text tags as a supervisory signal to localize the most diagnostic visual elements from within the infographic, i.e., visual hashtags. We report performance on a categorization and multi-label tag prediction problem and compare our proposed visual hashtags to human annotations. |
Tasks | |
Published | 2017-09-26 |
URL | http://arxiv.org/abs/1709.09215v1 |
http://arxiv.org/pdf/1709.09215v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-infographics-through-textual |
Repo | https://github.com/cvzoya/visuallydata |
Framework | none |