Paper Group AWR 88
Steganographic Generative Adversarial Networks. Group-level Emotion Recognition using Transfer Learning from Face Identification. Deep Learning for Patient-Specific Kidney Graft Survival Analysis. libact: Pool-based Active Learning in Python. Fast Single-Class Classification and the Principle of Logit Separation. DropMax: Adaptive Variational Softm …
Steganographic Generative Adversarial Networks
Title | Steganographic Generative Adversarial Networks |
Authors | Denis Volkhonskiy, Ivan Nazarov, Evgeny Burnaev |
Abstract | Steganography is a collection of methods for hiding secret information (“payload”) within non-secret information (“container”). Its counterpart, steganalysis, is the practice of determining whether a message contains a hidden payload, and recovering it if possible. The presence of hidden payloads is typically detected by a binary classifier. In the present study, we propose a new model for generating image-like containers based on Deep Convolutional Generative Adversarial Networks (DCGAN). This approach makes it possible to generate more steganalysis-secure message embedding using standard steganography algorithms. Experimental results demonstrate that the new model successfully deceives the steganography analyzer and can therefore be used in steganographic applications. |
Tasks | |
Published | 2017-03-16 |
URL | https://arxiv.org/abs/1703.05502v2 |
https://arxiv.org/pdf/1703.05502v2.pdf | |
PWC | https://paperswithcode.com/paper/steganographic-generative-adversarial |
Repo | https://github.com/dvolkhonskiy/adversarial-steganography |
Framework | tf |
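A minimal sketch of the embedding step the containers are generated for, under stated assumptions: the DCGAN generator and the adversarial steganalyzer are omitted, `container` stands in for a generated image, and naive LSB embedding is used as the "standard steganography algorithm". This is illustrative, not the authors' code.

```python
import numpy as np

def lsb_embed(container: np.ndarray, payload_bits: np.ndarray) -> np.ndarray:
    """Write payload bits into the least significant bits of uint8 pixels."""
    flat = container.flatten()  # flatten() returns a copy, the input is untouched
    assert payload_bits.size <= flat.size, "payload too large for container"
    flat[: payload_bits.size] = (flat[: payload_bits.size] & 0xFE) | payload_bits
    return flat.reshape(container.shape)

def lsb_extract(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Recover the first n_bits payload bits from a stego image."""
    return stego.flatten()[:n_bits] & 1

container = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in for G(z)
payload = np.random.randint(0, 2, size=512, dtype=np.uint8)
stego = lsb_embed(container, payload)
assert np.array_equal(lsb_extract(stego, payload.size), payload)
```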
Group-level Emotion Recognition using Transfer Learning from Face Identification
Title | Group-level Emotion Recognition using Transfer Learning from Face Identification |
Authors | Alexandr G. Rassadin, Alexey S. Gruzdev, Andrey V. Savchenko |
Abstract | In this paper, we describe the algorithmic approach used for our submissions to the fifth Emotion Recognition in the Wild (EmotiW 2017) group-level emotion recognition sub-challenge. We extracted feature vectors of detected faces using a Convolutional Neural Network trained for the face identification task, rather than the traditional pre-training on emotion recognition problems. In the final pipeline, an ensemble of Random Forest classifiers was learned to predict emotion scores using the available training set. In cases where no faces were detected, one member of our ensemble extracts features from the whole image. In our experimental study, the proposed approach showed the lowest error rate among the techniques we explored. In particular, we achieved 75.4% accuracy on the validation data, which is 20% higher than the handcrafted feature-based baseline. The source code, using the Keras framework, is publicly available. |
Tasks | Emotion Recognition, Face Identification, Transfer Learning |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.01688v3 |
http://arxiv.org/pdf/1709.01688v3.pdf | |
PWC | https://paperswithcode.com/paper/group-level-emotion-recognition-using |
Repo | https://github.com/arassadin/emotiw2017 |
Framework | none |
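A hedged sketch of the described pipeline, not the authors' code: face descriptors from a CNN pre-trained for face identification are aggregated per image and fed to a Random Forest. `extract_face_descriptors` is a hypothetical stand-in for the face detector plus identification CNN; feature dimension and labels are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def image_feature(image, extract_face_descriptors, dim=1024):
    faces = extract_face_descriptors(image)  # (n_faces, dim), possibly empty
    if len(faces) == 0:
        # the paper handles this with a whole-image ensemble member; zeros here
        return np.zeros(dim)
    return np.mean(faces, axis=0)            # aggregate descriptors over faces

# X: one aggregated descriptor per training image, y: group emotion labels
X = np.random.randn(200, 1024)
y = np.random.randint(0, 3, size=200)        # e.g. negative / neutral / positive
clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
print(clf.predict(X[:5]))
```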
Deep Learning for Patient-Specific Kidney Graft Survival Analysis
Title | Deep Learning for Patient-Specific Kidney Graft Survival Analysis |
Authors | Margaux Luck, Tristan Sylvain, Héloïse Cardinal, Andrea Lodi, Yoshua Bengio |
Abstract | An accurate model of patient-specific kidney graft survival distributions can help to improve shared decision making in the treatment and care of patients. In this paper, we propose a deep learning method that directly models the survival function, instead of estimating the hazard function, to predict survival times for graft patients based on the principle of multi-task learning. By learning to jointly predict the time of the event and its rank in the Cox partial log-likelihood framework, our deep learning approach outperforms, in terms of survival time prediction quality and concordance index, other common methods for survival analysis, including the Cox Proportional Hazards model and a network trained on the Cox partial log-likelihood. |
Tasks | Decision Making, Multi-Task Learning, Survival Analysis |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10245v1 |
http://arxiv.org/pdf/1705.10245v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-patient-specific-kidney |
Repo | https://github.com/EbnulMahmood/PatternLab |
Framework | none |
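A minimal numpy sketch of the negative Cox partial log-likelihood used as the ranking term in the multi-task objective described above; this is the generic textbook quantity (ties ignored), not the authors' code.

```python
import numpy as np

def neg_cox_partial_log_likelihood(risk, time, event):
    """risk: predicted risk scores; time: survival/censoring times;
    event: 1 if the event (e.g. graft failure) was observed, 0 if censored."""
    order = np.argsort(-time)                 # sort subjects by descending time
    risk, event = risk[order], event[order]
    # cumulative log-sum-exp: subjects still at risk at each event time
    log_risk_set = np.logaddexp.accumulate(risk)
    # only observed events contribute terms to the partial likelihood
    return -np.sum((risk - log_risk_set)[event.astype(bool)])

risk = np.array([0.2, 1.5, -0.3, 0.8])
time = np.array([5.0, 2.0, 8.0, 3.0])
event = np.array([1, 1, 0, 1])
print(neg_cox_partial_log_likelihood(risk, time, event))
```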
libact: Pool-based Active Learning in Python
Title | libact: Pool-based Active Learning in Python |
Authors | Yao-Yuan Yang, Shao-Chuan Lee, Yu-An Chung, Tung-En Wu, Si-An Chen, Hsuan-Tien Lin |
Abstract | libact is a Python package designed to make active learning easier for general users. The package not only implements several popular active learning strategies, but also features the active-learning-by-learning meta-algorithm that assists users in automatically selecting the best strategy on the fly. Furthermore, the package provides a unified interface for implementing more strategies, models, and application-specific labelers. The package is open-source on GitHub and can be easily installed from the Python Package Index. |
Tasks | Active Learning |
Published | 2017-10-01 |
URL | http://arxiv.org/abs/1710.00379v1 |
http://arxiv.org/pdf/1710.00379v1.pdf | |
PWC | https://paperswithcode.com/paper/libact-pool-based-active-learning-in-python |
Repo | https://github.com/ntucllab/libact |
Framework | none |
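A usage sketch based on libact's documented pool-based interface (see the repository README for the authoritative API): unlabeled entries carry `None` labels, and the query strategy picks which example to label next; the data here is synthetic.

```python
import numpy as np
from libact.base.dataset import Dataset
from libact.models import LogisticRegression
from libact.query_strategies import UncertaintySampling

X = np.random.randn(100, 5)
y_true = (X[:, 0] > 0).astype(int)
y = [y_true[i] if i < 10 else None for i in range(100)]  # only 10 labeled

ds = Dataset(X, y)
qs = UncertaintySampling(ds, model=LogisticRegression())

for _ in range(20):                      # pool-based active-learning loop
    ask_id = qs.make_query()             # index of the most informative example
    ds.update(ask_id, y_true[ask_id])    # an oracle/labeler provides the label
```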
Fast Single-Class Classification and the Principle of Logit Separation
Title | Fast Single-Class Classification and the Principle of Logit Separation |
Authors | Gil Keren, Sivan Sabato, Björn Schuller |
Abstract | We consider neural network training in applications in which there are many possible classes, but at test time the task is a binary classification task of determining whether the given example belongs to a specific class, where the class of interest can be different each time the classifier is applied. For instance, this is the case for real-time image search. We define the Single Logit Classification (SLC) task: training the network so that at test time it is possible to accurately identify whether the example belongs to a given class in a computationally efficient manner, based only on the output logit for this class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing losses suitable for the SLC task. We show that the cross-entropy loss function is not aligned with the Principle of Logit Separation. In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle. In total, we study seven loss functions. Our experiments show that in almost all cases, losses that are aligned with the Principle of Logit Separation obtain at least a 20% relative accuracy improvement in the SLC task compared to losses that are not aligned with it, and sometimes considerably more. Furthermore, we show that fast SLC does not cause any drop in binary classification accuracy compared to standard classification, in which all logits are computed, and yields a speedup that grows with the number of classes. For instance, we demonstrate a 10x speedup when the number of classes is 400,000. TensorFlow code for optimizing the new batch losses is publicly available at https://github.com/cruvadom/Logit_Separation. |
Tasks | Image Retrieval |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10246v4 |
http://arxiv.org/pdf/1705.10246v4.pdf | |
PWC | https://paperswithcode.com/paper/fast-single-class-classification-and-the |
Repo | https://github.com/cruvadom/Logit_Separation |
Framework | tf |
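A hedged numpy sketch of one loss in the spirit of the Principle of Logit Separation: an independent per-class sigmoid (one-vs-all binary cross-entropy), which pushes each logit above or below a shared threshold on its own, so a single logit can answer "is this class k?" at test time. This is an illustrative example, not necessarily one of the paper's seven losses.

```python
import numpy as np

def one_vs_all_bce(logits, labels):
    """logits: (batch, n_classes); labels: (batch,) integer class ids."""
    targets = np.zeros_like(logits)
    targets[np.arange(len(labels)), labels] = 1.0
    # numerically stable binary cross-entropy with logits
    loss = np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    return loss.mean()

def slc_decision(logit_k, threshold=0.0):
    """Single Logit Classification: decide from one output logit only."""
    return logit_k > threshold

logits = np.random.randn(4, 10)
labels = np.array([1, 3, 3, 7])
print(one_vs_all_bce(logits, labels))
print(slc_decision(logits[0, 1]))
```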
DropMax: Adaptive Variational Softmax
Title | DropMax: Adaptive Variational Softmax |
Authors | Hae Beom Lee, Juho Lee, Saehoon Kim, Eunho Yang, Sung Ju Hwang |
Abstract | We propose DropMax, a stochastic version of the softmax classifier which, at each iteration, drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has the effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, learning the dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout probabilities shows that our model indeed selects confusing classes more often when it performs classification. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07834v5 |
http://arxiv.org/pdf/1712.07834v5.pdf | |
PWC | https://paperswithcode.com/paper/dropmax-adaptive-variational-softmax |
Repo | https://github.com/haebeom-lee/dropmax |
Framework | tf |
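A minimal numpy sketch of a DropMax-style forward pass under stated assumptions: Bernoulli masks with keep probabilities `rho` (input-adaptive and variationally learned in the paper, fixed here) drop non-target classes, and the softmax is taken over retained classes only. Not the authors' code.

```python
import numpy as np

def dropmax_forward(logits, rho, target, rng):
    """logits, rho: (n_classes,); target: int class id, kept with certainty."""
    mask = (rng.random(logits.shape) < rho).astype(float)
    mask[target] = 1.0                         # the target class is never dropped
    exp = mask * np.exp(logits - logits.max()) # masked, shifted exponentials
    return exp / exp.sum()                     # softmax over retained classes

rng = np.random.default_rng(0)
logits = rng.normal(size=10)
rho = np.full(10, 0.5)                         # learned per instance in the paper
print(dropmax_forward(logits, rho, target=3, rng=rng))
```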
Using stochastic computation graphs formalism for optimization of sequence-to-sequence model
Title | Using stochastic computation graphs formalism for optimization of sequence-to-sequence model |
Authors | Eugene Golikov, Vlad Zhukov, Maksim Kretov |
Abstract | A variety of machine learning problems can be formulated as an optimization task for some (surrogate) loss function. The calculation of a loss function can be viewed in terms of stochastic computation graphs (SCGs). We use this formalism to analyze the problem of optimizing the well-known sequence-to-sequence model with attention and propose a reformulation of the task. Examples are given for machine translation (MT). Our work provides a unified view of different optimization approaches for sequence-to-sequence models and could help researchers in developing new network architectures with embedded stochastic nodes. |
Tasks | Machine Translation |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07724v2 |
http://arxiv.org/pdf/1711.07724v2.pdf | |
PWC | https://paperswithcode.com/paper/using-stochastic-computation-graphs-formalism |
Repo | https://github.com/deepmipt/seq2seq_scg |
Framework | pytorch |
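A hedged PyTorch sketch of the core SCG idea: a discrete sampling node (e.g. sampling tokens in a seq2seq decoder) is made trainable through a score-function (REINFORCE) surrogate term whose gradient estimates the gradient of the expected reward. Illustrative only, with a toy stand-in reward; not the authors' seq2seq implementation.

```python
import torch

logits = torch.randn(4, 10, requires_grad=True)    # stand-in decoder outputs
dist = torch.distributions.Categorical(logits=logits)
samples = dist.sample()                            # stochastic node (non-differentiable)
reward = -torch.abs(samples.float() - 5.0)         # stand-in sequence-level reward
baseline = reward.mean().detach()                  # variance-reduction baseline
# surrogate: its gradient is the score-function estimate of d E[reward] / d logits
surrogate = -(dist.log_prob(samples) * (reward - baseline).detach()).mean()
surrogate.backward()
print(logits.grad.shape)                           # gradients flow to the logits
```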
Learning to Compare: Relation Network for Few-Shot Learning
Title | Learning to Compare: Relation Network for Few-Shot Learning |
Authors | Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H. S. Torr, Timothy M. Hospedales |
Abstract | We present a conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only a few examples from each. Our method, called the Relation Network (RN), is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting. Once trained, an RN is able to classify images of new classes by computing relation scores between query images and the few examples of each new class without further updating the network. Besides providing improved performance on few-shot learning, our framework is easily extended to zero-shot learning. Extensive experiments on five benchmarks demonstrate that our simple approach provides a unified and effective solution for both of these two tasks. |
Tasks | Few-Shot Image Classification, Few-Shot Learning, Meta-Learning, Zero-Shot Learning |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06025v2 |
http://arxiv.org/pdf/1711.06025v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-compare-relation-network-for-few |
Repo | https://github.com/prolearner/LearningToCompareTF |
Framework | tf |
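A minimal numpy sketch of the Relation Network scoring step under stated assumptions: the embedding module f and relation module g are stand-ins (simple linear maps rather than the paper's convolutional modules), and a query is assigned to the class with the highest relation score. Not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
W_f = rng.normal(size=(64, 128))           # stand-in embedding module f
W_g = rng.normal(size=(2 * 64, 1))         # stand-in relation module g

def embed(x):
    return np.tanh(x @ W_f.T)

def relation_score(query_emb, class_emb):
    pair = np.concatenate([query_emb, class_emb])    # feature concatenation
    return 1 / (1 + np.exp(-(pair @ W_g).item()))    # relation score in [0, 1]

support = rng.normal(size=(5, 128))        # one image per class (5-way 1-shot)
query = rng.normal(size=128)
scores = [relation_score(embed(query), embed(s)) for s in support]
print(int(np.argmax(scores)))              # predicted class of the query
```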
Per-Pixel Feedback for improving Semantic Segmentation
Title | Per-Pixel Feedback for improving Semantic Segmentation |
Authors | Aditya Ganeshan |
Abstract | Semantic segmentation is the task of assigning a label to each pixel in the image. In recent years, deep convolutional neural networks (DCNNs) have been driving advances in multiple tasks related to cognition. Although DCNNs have achieved unprecedented visual recognition performance, they offer little transparency. To understand how DCNN-based models work at the task of semantic segmentation, we analyze DCNN models in semantic segmentation. We try to find the importance of global image information for labeling pixels. Based on experiments on discriminative regions and modeling of fixations, we propose a set of new training loss functions for fine-tuning DCNN-based models. The proposed training regime has shown improvement in the performance of the DeepLab Large-FOV (VGG-16) segmentation model on the PASCAL VOC 2012 dataset. However, further tests remain to conclusively evaluate the benefits of the proposed loss functions across models and datasets. Submitted in partial fulfillment of the requirements for the degree of Integrated Master of Science in Applied Mathematics. Update: further experiments showed minimal benefits. Code available here. |
Tasks | Semantic Segmentation |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02861v1 |
http://arxiv.org/pdf/1712.02861v1.pdf | |
PWC | https://paperswithcode.com/paper/per-pixel-feedback-for-improving-semantic |
Repo | https://github.com/BardOfCodes/Seg-Unravel |
Framework | none |
Large-Scale Evolution of Image Classifiers
Title | Large-Scale Evolution of Image Classifiers |
Authors | Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc Le, Alex Kurakin |
Abstract | Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone. Our goal is to minimize human participation, so we employ evolutionary algorithms to discover such networks automatically. Despite significant computational requirements, we show that it is now possible to evolve models with accuracies within the range of those published in the last year. Specifically, we employ simple evolutionary techniques at unprecedented scales to discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial initial conditions and reaching accuracies of 94.6% (95.6% for ensemble) and 77.0%, respectively. To do this, we use novel and intuitive mutation operators that navigate large search spaces; we stress that no human participation is required once evolution starts and that the output is a fully-trained model. Throughout this work, we place special emphasis on the repeatability of results, the variability in the outcomes and the computational requirements. |
Tasks | Hyperparameter Optimization, Image Classification, Neural Architecture Search |
Published | 2017-03-03 |
URL | http://arxiv.org/abs/1703.01041v2 |
http://arxiv.org/pdf/1703.01041v2.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-evolution-of-image-classifiers |
Repo | https://github.com/marijnvk/LargeScaleEvolution |
Framework | tf |
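A pure-Python sketch of the tournament-style evolutionary loop the paper describes: repeatedly sample two individuals, discard the worse, and insert a mutated, trained copy of the better. `mutate` and `train_and_eval` are hypothetical stand-ins for architecture mutation and child training/evaluation, replaced here by a toy problem.

```python
import random

def evolve(population, mutate, train_and_eval, steps=1000):
    """population: list of (architecture, fitness) pairs."""
    for _ in range(steps):
        a, b = random.sample(range(len(population)), 2)   # tournament of two
        worse, better = sorted([a, b], key=lambda i: population[i][1])
        child_arch = mutate(population[better][0])
        population[worse] = (child_arch, train_and_eval(child_arch))
    return max(population, key=lambda p: p[1])

# toy example: "architectures" are numbers, fitness favors values near 42
mutate = lambda arch: arch + random.uniform(-1, 1)
train_and_eval = lambda arch: -abs(arch - 42)
pop = [(random.uniform(0, 100), -100.0) for _ in range(20)]
print(evolve(pop, mutate, train_and_eval)[0])
```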
Dynamic Word Embeddings
Title | Dynamic Word Embeddings |
Authors | Robert Bamler, Stephan Mandt |
Abstract | We present a probabilistic language model for time-stamped text data which tracks the semantic evolution of individual words over time. The model represents words and contexts by latent trajectories in an embedding space. At each moment in time, the embedding vectors are inferred from a probabilistic version of word2vec [Mikolov et al., 2013]. These embedding vectors are connected in time through a latent diffusion process. We describe two scalable variational inference algorithms, skip-gram smoothing and skip-gram filtering, that allow us to train the model jointly over all times, thus learning on all data while simultaneously allowing word and context vectors to drift. Experimental results on three different corpora demonstrate that our dynamic model infers word embedding trajectories that are more interpretable and lead to higher predictive likelihoods than competing methods based on static models trained separately on time slices. |
Tasks | Language Modelling, Word Embeddings |
Published | 2017-02-27 |
URL | http://arxiv.org/abs/1702.08359v2 |
http://arxiv.org/pdf/1702.08359v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-word-embeddings |
Repo | https://github.com/accessai/dynamic_word_embeddings |
Framework | pytorch |
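A numpy sketch of the latent diffusion (Gaussian random walk) prior that ties a word's embedding across time slices in this kind of model; the variance is illustrative and the variational inference machinery (skip-gram smoothing/filtering) is omitted.

```python
import numpy as np

def diffusion_log_prior(trajectory, sigma=0.1):
    """trajectory: (T, dim) embeddings of one word over T time steps.
    log p = sum_t log N(u_t | u_{t-1}, sigma^2 I), up to additive constants."""
    steps = np.diff(trajectory, axis=0)          # u_t - u_{t-1}
    return -0.5 * np.sum(steps ** 2) / sigma ** 2

rng = np.random.default_rng(0)
smooth = np.cumsum(rng.normal(scale=0.05, size=(10, 3)), axis=0)  # slow drift
jumpy = rng.normal(size=(10, 3))                                  # erratic jumps
print(diffusion_log_prior(smooth) > diffusion_log_prior(jumpy))   # drift is favored
```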
Efficient Architecture Search by Network Transformation
Title | Efficient Architecture Search by Network Transformation |
Authors | Han Cai, Tianyao Chen, Weinan Zhang, Yong Yu, Jun Wang |
Abstract | Techniques for automatically designing deep neural network architectures, such as reinforcement-learning-based approaches, have recently shown promising results. However, their success relies on vast computational resources (e.g. hundreds of GPUs), making them difficult to use widely. A noticeable limitation is that they still design and train each network from scratch during the exploration of the architecture space, which is highly inefficient. In this paper, we propose a new framework for efficient architecture search that explores the architecture space based on the current network and reuses its weights. We employ a reinforcement learning agent as the meta-controller, whose action is to grow the network depth or layer width with function-preserving transformations. As such, previously validated networks can be reused for further exploration, saving a large amount of computational cost. We apply our method to explore the architecture space of plain convolutional neural networks (no skip-connections, branching, etc.) on image benchmark datasets (CIFAR-10, SVHN) with restricted computational resources (5 GPUs). Our method can design highly competitive networks that outperform existing networks using the same design scheme. On CIFAR-10, our model without skip-connections achieves a 4.23% test error rate, exceeding a vast majority of modern architectures and approaching DenseNet. Furthermore, by applying our method to explore the DenseNet architecture space, we are able to achieve more accurate networks with fewer parameters. |
Tasks | Image Classification, Neural Architecture Search |
Published | 2017-07-16 |
URL | http://arxiv.org/abs/1707.04873v2 |
http://arxiv.org/pdf/1707.04873v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-architecture-search-by-network |
Repo | https://github.com/han-cai/PathLevel-EAS |
Framework | pytorch |
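A numpy sketch of a function-preserving widening transformation of the kind the meta-controller applies (in the spirit of Net2WiderNet): duplicated hidden units are chosen at random and their outgoing weights are split, so the widened network computes exactly the same function. Not the authors' code.

```python
import numpy as np

def widen(W_in, W_out, new_width, rng):
    """W_in: (hidden, d_in); W_out: (d_out, hidden). Returns a widened pair."""
    hidden = W_in.shape[0]
    mapping = np.concatenate([np.arange(hidden),
                              rng.integers(0, hidden, new_width - hidden)])
    counts = np.bincount(mapping, minlength=hidden)
    W_in2 = W_in[mapping]                           # replicate incoming weights
    W_out2 = W_out[:, mapping] / counts[mapping]    # split outgoing weights
    return W_in2, W_out2

rng = np.random.default_rng(0)
W_in, W_out = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x = rng.normal(size=3)
W_in2, W_out2 = widen(W_in, W_out, 6, rng)
# identical outputs before and after widening (linear check; ReLU also preserves it)
assert np.allclose(W_out @ (W_in @ x), W_out2 @ (W_in2 @ x))
```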
Learning Language Representations for Typology Prediction
Title | Learning Language Representations for Typology Prediction |
Authors | Chaitanya Malaviya, Graham Neubig, Patrick Littell |
Abstract | One central mystery of neural NLP is what neural models “know” about their subject matter. When a neural machine translation system learns to translate from one language to another, does it learn the syntax or semantics of the languages? Can this knowledge be extracted from the system to fill holes in human scientific knowledge? Existing typological databases contain relatively full feature specifications for only a few hundred languages. Exploiting the existence of parallel texts in more than a thousand languages, we build a massive many-to-one neural machine translation (NMT) system from 1017 languages into English, and use this to predict information missing from typological databases. Experiments show that the proposed method is able to infer not only syntactic, but also phonological and phonetic inventory features, and improves over a baseline that has access to information about the languages’ geographic and phylogenetic neighbors. |
Tasks | Machine Translation |
Published | 2017-07-29 |
URL | http://arxiv.org/abs/1707.09569v1 |
http://arxiv.org/pdf/1707.09569v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-language-representations-for |
Repo | https://github.com/chaitanyamalaviya/lang-reps |
Framework | none |
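A hedged sketch of the downstream prediction step described above: once the massive NMT system has produced a vector per language, each typological feature can be imputed with a simple classifier trained on the languages whose feature values are known. The embeddings, feature values, and 800/217 split here are synthetic stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
lang_vecs = rng.normal(size=(1017, 512))   # one learned vector per language
known = rng.integers(0, 2, size=800)       # e.g. a binary typological feature

# train on languages with known feature values, predict the rest
clf = LogisticRegression(max_iter=1000).fit(lang_vecs[:800], known)
missing = clf.predict(lang_vecs[800:])     # fill holes in the typology database
print(missing[:10])
```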
In Search of a Dataset for Handwritten Optical Music Recognition: Introducing MUSCIMA++
Title | In Search of a Dataset for Handwritten Optical Music Recognition: Introducing MUSCIMA++ |
Authors | Jan Hajič jr., Pavel Pecina |
Abstract | Optical Music Recognition (OMR) has long been without an adequate dataset and ground truth for evaluating OMR systems, which has been a major problem for establishing a state of the art in the field. Furthermore, machine learning methods require training data. We analyze how the OMR processing pipeline can be expressed in terms of gradually more complex ground truth and, based on this analysis, we design the MUSCIMA++ dataset of handwritten music notation that addresses musical symbol recognition and notation reconstruction. Version 0.9 of the MUSCIMA++ dataset consists of 140 pages of handwritten music, with 91,255 manually annotated notation symbols and 82,261 explicitly marked relationships between symbol pairs. The dataset allows training and evaluating models for symbol classification, symbol localization, and notation graph assembly, both in isolation and jointly. Open-source tools are provided for manipulating the dataset, visualizing the data, and further annotation, and the dataset itself is made available under an open license. |
Tasks | |
Published | 2017-03-14 |
URL | http://arxiv.org/abs/1703.04824v1 |
http://arxiv.org/pdf/1703.04824v1.pdf | |
PWC | https://paperswithcode.com/paper/in-search-of-a-dataset-for-handwritten |
Repo | https://github.com/OMR-Research/muscima-pp |
Framework | none |
Quality Aware Network for Set to Set Recognition
Title | Quality Aware Network for Set to Set Recognition |
Authors | Yu Liu, Junjie Yan, Wanli Ouyang |
Abstract | This paper targets the problem of set-to-set recognition, which learns the metric between two image sets. Images in each set belong to the same identity. Since images in a set can be complementary, they hopefully lead to higher accuracy in practical applications. However, the quality of each sample cannot be guaranteed, and samples of poor quality will hurt the metric. In this paper, the quality aware network (QAN) is proposed to confront this problem; the quality of each sample is learned automatically, even though such information is not explicitly provided in the training stage. The network has two branches: the first branch extracts an appearance feature embedding for each sample, and the other branch predicts a quality score for each sample. The features and quality scores of all samples in a set are then aggregated to generate the final feature embedding. We show that the two branches can be trained in an end-to-end manner given only the set-level identity annotation. Analysis of the gradient spread of this mechanism indicates that the quality learned by the network is beneficial to set-to-set recognition and simplifies the distribution that the network needs to fit. Experiments on both face verification and person re-identification show the advantages of the proposed QAN. The source code and network structure can be downloaded at https://github.com/sciencefans/Quality-Aware-Network. |
Tasks | Face Verification, Person Re-Identification |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03373v1 |
http://arxiv.org/pdf/1704.03373v1.pdf | |
PWC | https://paperswithcode.com/paper/quality-aware-network-for-set-to-set |
Repo | https://github.com/sciencefans/Quality-Aware-Network |
Framework | none |
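A minimal numpy sketch of QAN-style set-level aggregation under stated assumptions: per-sample features from the first branch are pooled with quality scores predicted by the second branch (both branches are stand-ins here, and softmax weighting is one plausible normalization). Not the authors' code.

```python
import numpy as np

def aggregate(features, quality):
    """features: (n, dim) per-image embeddings; quality: (n,) raw scores."""
    w = np.exp(quality - quality.max())
    w /= w.sum()                                  # normalized quality weights
    return (w[:, None] * features).sum(axis=0)    # set-level embedding

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 256))                 # a set of 8 images, one identity
qual = rng.normal(size=8)                         # higher = better quality
print(aggregate(feats, qual).shape)               # (256,)
```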