Paper Group AWR 118
NestedNet: Learning Nested Sparse Structures in Deep Neural Networks
Title | NestedNet: Learning Nested Sparse Structures in Deep Neural Networks |
Authors | Eunwoo Kim, Chanho Ahn, Songhwai Oh |
Abstract | Recently, there have been increasing demands to construct compact deep architectures to remove unnecessary redundancy and to improve the inference speed. While many recent works focus on reducing the redundancy by eliminating unneeded weight parameters, it is not possible to apply a single deep architecture for multiple devices with different resources. When a new device or circumstantial condition requires a new deep architecture, it is necessary to construct and train a new network from scratch. In this work, we propose a novel deep learning framework, called a nested sparse network, which exploits an n-in-1-type nested structure in a neural network. A nested sparse network consists of multiple levels of networks with a different sparsity ratio associated with each level, and higher level networks share parameters with lower level networks to enable stable nested learning. The proposed framework realizes a resource-aware versatile architecture as the same network can meet diverse resource requirements. Moreover, the proposed nested network can learn different forms of knowledge in its internal networks at different levels, enabling multiple tasks using a single network, such as coarse-to-fine hierarchical classification. In order to train the proposed nested sparse network, we propose efficient weight connection learning and channel and layer scheduling strategies. We evaluate our network in multiple tasks, including adaptive deep compression, knowledge distillation, and learning class hierarchy, and demonstrate that nested sparse networks perform competitively, but more efficiently, compared to existing methods. |
Tasks | |
Published | 2017-12-11 |
URL | http://arxiv.org/abs/1712.03781v2 |
PDF | http://arxiv.org/pdf/1712.03781v2.pdf |
PWC | https://paperswithcode.com/paper/nestednet-learning-nested-sparse-structures |
Repo | https://github.com/niceday15/nested-network-cifar100 |
Framework | tf |
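The nested structure in the abstract, where several sparsity levels share one set of parameters, can be illustrated with a small sketch. It assumes magnitude-based masks as a stand-in for the connection learning the paper proposes, and the names `nested_masks` and `masked_dot` are hypothetical, not taken from the linked repository.

```python
def nested_masks(weights, keep_ratios):
    """Build one binary mask per level. Each level keeps the largest-magnitude
    weights, so a sparser level is always a subset of a denser one."""
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    masks = []
    for ratio in sorted(keep_ratios):
        keep = set(order[:max(1, round(ratio * len(weights)))])
        masks.append([1 if i in keep else 0 for i in range(len(weights))])
    return masks


def masked_dot(weights, mask, inputs):
    """Forward pass of a single linear unit using only the unmasked weights."""
    return sum(w * m * x for w, m, x in zip(weights, mask, inputs))


# One shared weight vector serves three resource budgets.
weights = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3]
inputs = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0]
masks = nested_masks(weights, keep_ratios=[1 / 3, 2 / 3, 1.0])
outputs = [masked_dot(weights, m, inputs) for m in masks]
```

Because every mask indexes the same `weights` list, moving to a smaller device budget only means selecting a sparser mask, not training a new network.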
On the choice of the low-dimensional domain for global optimization via random embeddings
Title | On the choice of the low-dimensional domain for global optimization via random embeddings |
Authors | Mickaël Binois, David Ginsbourger, Olivier Roustant |
Abstract | The challenge of taking many variables into account in optimization problems may be overcome under the hypothesis of low effective dimensionality. Then, the search for solutions can be reduced to the random embedding of a low dimensional space into the original one, resulting in a more manageable optimization problem. Specifically, in the case of time consuming black-box functions and when the budget of evaluations is severely limited, global optimization with random embeddings appears as a sound alternative to random search. Yet, in the case of box constraints on the native variables, defining suitable bounds on a low dimensional domain appears to be complex. Indeed, a small search domain does not guarantee that a solution is found, even under restrictive hypotheses about the function, while a larger one may slow down convergence dramatically. Here we tackle the issue of low-dimensional domain selection based on a detailed study of the properties of the random embedding, giving insight into the aforementioned difficulties. In particular, we describe a minimal low-dimensional set in correspondence with the embedded search space. We additionally show that an alternative equivalent embedding procedure yields simultaneously a simpler definition of the low-dimensional minimal set and better properties in practice. Finally, the performance and robustness gains of the proposed enhancements for Bayesian optimization are illustrated on numerical examples. |
Tasks | |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05318v3 |
PDF | http://arxiv.org/pdf/1704.05318v3.pdf |
PWC | https://paperswithcode.com/paper/on-the-choice-of-the-low-dimensional-domain |
Repo | https://github.com/Lance-Q/Awesome-Embedding-Optimization-Paper-List |
Framework | none |
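The setting in the abstract can be sketched as follows: a high-dimensional box-constrained function is searched through a random linear map from a low-dimensional domain. This is only an illustration under stated assumptions: the Gaussian matrix, the clipping to the box, the toy `objective`, and the value of `half_width` are choices made here, and plain random search stands in for the Bayesian optimization used in the paper.

```python
import random


def make_embedding(high_dim, low_dim, rng):
    """Draw a random Gaussian matrix A with high_dim rows and low_dim columns."""
    return [[rng.gauss(0.0, 1.0) for _ in range(low_dim)] for _ in range(high_dim)]


def embed(A, y, lower=-1.0, upper=1.0):
    """Map a low-dimensional point y into the box by clipping A @ y coordinate-wise."""
    return [min(upper, max(lower, sum(a * v for a, v in zip(row, y)))) for row in A]


def objective(x):
    """Toy 20-D function with low effective dimensionality: only x[3] and x[11] matter."""
    return (x[3] - 0.5) ** 2 + (x[11] + 0.25) ** 2


def random_search(f, A, low_dim, half_width, budget, rng):
    """Random search over the low-dimensional box [-half_width, half_width]^low_dim."""
    best_y, best_val = None, float("inf")
    for _ in range(budget):
        y = [rng.uniform(-half_width, half_width) for _ in range(low_dim)]
        val = f(embed(A, y))
        if val < best_val:
            best_y, best_val = y, val
    return best_y, best_val


rng = random.Random(0)
A = make_embedding(high_dim=20, low_dim=2, rng=rng)
best_y, best_val = random_search(objective, A, low_dim=2, half_width=2.0, budget=500, rng=rng)
best_x = embed(A, best_y)
```

The `half_width` argument is exactly the quantity the abstract calls hard to choose: too small a low-dimensional box may miss every optimum, while too large a box wastes evaluations on points that clip to the same boundary.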
Online Multiclass Boosting
Title | Online Multiclass Boosting |
Authors | Young Hun Jung, Jack Goetz, Ambuj Tewari |
Abstract | Recent work has extended the theoretical analysis of boosting algorithms to multiclass problems and to online settings. However, the multiclass extension is in the batch setting and the online extensions only consider binary classification. We fill this gap in the literature by defining, and justifying, a weak learning condition for online multiclass boosting. This condition leads to an optimal boosting algorithm that requires the minimal number of weak learners to achieve a certain accuracy. Additionally, we propose an adaptive algorithm which is near optimal and achieves excellent performance on real data due to its adaptive property. |
Tasks | |
Published | 2017-02-23 |
URL | http://arxiv.org/abs/1702.07305v3 |
PDF | http://arxiv.org/pdf/1702.07305v3.pdf |
PWC | https://paperswithcode.com/paper/online-multiclass-boosting |
Repo | https://github.com/yhjung88/OnlineBoostingWithVFDT |
Framework | none |
Reinforcement Learning for Pivoting Task
Title | Reinforcement Learning for Pivoting Task |
Authors | Rika Antonova, Silvia Cruciani, Christian Smith, Danica Kragic |
Abstract | In this work we propose an approach to learn a robust policy for solving the pivoting task. Recently, several model-free continuous control algorithms were shown to learn successful policies without prior knowledge of the dynamics of the task. However, obtaining successful policies required thousands to millions of training episodes, limiting the applicability of these approaches to real hardware. We developed a training procedure that allows us to use a simple custom simulator to learn policies robust to the mismatch of simulation vs robot. In our experiments, we demonstrate that the policy learned in the simulator is able to pivot the object to the desired target angle on the real robot. We also show generalization to an object with different inertia, shape, mass and friction properties than those used during training. This result is a step towards making model-free reinforcement learning available for solving robotics tasks via pre-training in simulators that offer only an imprecise match to the real-world dynamics. |
Tasks | Continuous Control |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00472v1 |
PDF | http://arxiv.org/pdf/1703.00472v1.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-pivoting-task |
Repo | https://github.com/LeoToledo/PivotingTaskRL |
Framework | none |
EAST: An Efficient and Accurate Scene Text Detector
Title | EAST: An Efficient and Accurate Scene Text Detector |
Authors | Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, Jiajun Liang |
Abstract | Previous approaches for scene text detection have already achieved promising performances across various benchmarks. However, they usually fall short when dealing with challenging scenarios, even when equipped with deep neural network models, because the overall performance is determined by the interplay of multiple stages and components in the pipelines. In this work, we propose a simple yet powerful pipeline that yields fast and accurate text detection in natural scenes. The pipeline directly predicts words or text lines of arbitrary orientations and quadrilateral shapes in full images, eliminating unnecessary intermediate steps (e.g., candidate aggregation and word partitioning), with a single neural network. The simplicity of our pipeline allows concentrating efforts on designing loss functions and neural network architecture. Experiments on standard datasets including ICDAR 2015, COCO-Text and MSRA-TD500 demonstrate that the proposed algorithm significantly outperforms state-of-the-art methods in terms of both accuracy and efficiency. On the ICDAR 2015 dataset, the proposed algorithm achieves an F-score of 0.7820 at 13.2fps at 720p resolution. |
Tasks | Curved Text Detection, Scene Text Detection |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03155v2 |
PDF | http://arxiv.org/pdf/1704.03155v2.pdf |
PWC | https://paperswithcode.com/paper/east-an-efficient-and-accurate-scene-text |
Repo | https://github.com/quickgrid/AI-ML-DL-CV-NLP-SP |
Framework | tf |
Limited-Memory Matrix Adaptation for Large Scale Black-box Optimization
Title | Limited-Memory Matrix Adaptation for Large Scale Black-box Optimization |
Authors | Ilya Loshchilov, Tobias Glasmachers, Hans-Georg Beyer |
Abstract | The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is a popular method to deal with nonconvex and/or stochastic optimization problems when the gradient information is not available. Being based on the CMA-ES, the recently proposed Matrix Adaptation Evolution Strategy (MA-ES) provides a rather surprising result that the covariance matrix and all associated operations (e.g., potentially unstable eigendecomposition) can be replaced in the CMA-ES by an updated transformation matrix without any loss of performance. In order to further simplify MA-ES and reduce its $\mathcal{O}\big(n^2\big)$ time and storage complexity to $\mathcal{O}\big(n\log(n)\big)$, we present the Limited-Memory Matrix Adaptation Evolution Strategy (LM-MA-ES) for efficient zeroth order large-scale optimization. The algorithm demonstrates state-of-the-art performance on a set of established large-scale benchmarks. We explore the algorithm on the problem of generating adversarial inputs for a (non-smooth) random forest classifier, demonstrating a surprising vulnerability of the classifier. |
Tasks | Stochastic Optimization |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06693v1 |
PDF | http://arxiv.org/pdf/1705.06693v1.pdf |
PWC | https://paperswithcode.com/paper/limited-memory-matrix-adaptation-for-large |
Repo | https://github.com/Alsr96/LMMAES |
Framework | none |
Dependent landmark drift: robust point set registration with a Gaussian mixture model and a statistical shape model
Title | Dependent landmark drift: robust point set registration with a Gaussian mixture model and a statistical shape model |
Authors | Osamu Hirose |
Abstract | The goal of point set registration is to find point-by-point correspondences between point sets, each of which characterizes the shape of an object. Because local preservation of object geometry is assumed, prevalent algorithms in the area can often elegantly solve the problems without using geometric information specific to the objects. This means that registration performance can be further improved by using prior knowledge of object geometry. In this paper, we propose a novel point set registration method using the Gaussian mixture model with prior shape information encoded as a statistical shape model. Our transformation model is defined as a combination of the similarity transformation, motion coherence, and the statistical shape model. Therefore, the proposed method works effectively if the target point set includes outliers and missing regions, or if it is rotated. The computational cost can be reduced to linear, and therefore the method is scalable to large point sets. The effectiveness of the method is verified through comparisons with existing algorithms using datasets concerning human body shapes, hands, and faces. |
Tasks | |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06588v3 |
PDF | http://arxiv.org/pdf/1711.06588v3.pdf |
PWC | https://paperswithcode.com/paper/dependent-landmark-drift-robust-point-set |
Repo | https://github.com/ohirose/dld |
Framework | none |
Semi-supervised Multitask Learning for Sequence Labeling
Title | Semi-supervised Multitask Learning for Sequence Labeling |
Authors | Marek Rei |
Abstract | We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data. |
Tasks | Chunking, Grammatical Error Detection, Language Modelling, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07156v1 |
PDF | http://arxiv.org/pdf/1704.07156v1.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-multitask-learning-for |
Repo | https://github.com/marekrei/sequence-labeler |
Framework | tf |
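The secondary objective in the abstract, predicting surrounding words alongside the labels, amounts to adding a language modeling term to the tagging loss. The sketch below assumes separate next-word and previous-word predictions and a weighting factor `gamma`. Both are illustrative choices, and `joint_loss` is a hypothetical name rather than a function from the linked repository.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[target]


def joint_loss(tag_logits, tags, next_logits, prev_logits, words, gamma=0.1):
    """Tagging loss plus gamma times the loss for predicting neighbouring words."""
    n = len(words)
    tag_term = sum(cross_entropy(tag_logits[t], tags[t]) for t in range(n))
    # Forward direction predicts the next word; the last position has no target.
    fwd_term = sum(cross_entropy(next_logits[t], words[t + 1]) for t in range(n - 1))
    # Backward direction predicts the previous word; the first position has no target.
    bwd_term = sum(cross_entropy(prev_logits[t], words[t - 1]) for t in range(1, n))
    return tag_term + gamma * (fwd_term + bwd_term)


words = [0, 2, 1]  # word ids in a 3-word vocabulary
tags = [1, 0, 1]   # gold labels drawn from 2 classes
tag_logits = [[0.0, 0.0] for _ in words]
next_logits = [[0.0, 0.0, 0.0] for _ in words]
prev_logits = [[0.0, 0.0, 0.0] for _ in words]
loss = joint_loss(tag_logits, tags, next_logits, prev_logits, words)
```

The language modeling targets come from the same sentences as the labels, which is why the objective needs no additional annotated or unannotated data.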
Deriving Neural Architectures from Sequence and Graph Kernels
Title | Deriving Neural Architectures from Sequence and Graph Kernels |
Authors | Tao Lei, Wengong Jin, Regina Barzilay, Tommi Jaakkola |
Abstract | The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process. In this work, we appeal to kernels over combinatorial structures, such as sequences and graphs, to derive appropriate neural operations. We introduce a class of deep recurrent neural operations and formally characterize their associated kernel spaces. Our recurrent modules compare the input to virtual reference objects (cf. filters in CNN) via the kernels. Similar to traditional neural operations, these reference objects are parameterized and directly optimized in end-to-end training. We empirically evaluate the proposed class of neural architectures on standard applications such as language modeling and molecular graph regression, achieving state-of-the-art results across these applications. |
Tasks | Graph Regression, Language Modelling |
Published | 2017-05-25 |
URL | http://arxiv.org/abs/1705.09037v3 |
PDF | http://arxiv.org/pdf/1705.09037v3.pdf |
PWC | https://paperswithcode.com/paper/deriving-neural-architectures-from-sequence |
Repo | https://github.com/taolei87/icml17_knn |
Framework | tf |
Self-adaptation of Genetic Operators Through Genetic Programming Techniques
Title | Self-adaptation of Genetic Operators Through Genetic Programming Techniques |
Authors | Andres Felipe Cruz Salinas, Jonatan Gomez Perdomo |
Abstract | Here we propose an evolutionary algorithm that self-modifies its operators while candidate solutions are evolved. This tackles convergence and lack of diversity issues, leading to better solutions. Operators are represented as trees and are evolved using genetic programming (GP) techniques. The proposed approach is tested with real benchmark functions and an analysis of operator evolution is provided. |
Tasks | |
Published | 2017-12-17 |
URL | http://arxiv.org/abs/1712.06070v1 |
PDF | http://arxiv.org/pdf/1712.06070v1.pdf |
PWC | https://paperswithcode.com/paper/self-adaptation-of-genetic-operators-through |
Repo | https://github.com/afcruzs/AOEA |
Framework | none |
Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs
Title | Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs |
Authors | Cristóbal Esteban, Stephanie L. Hyland, Gunnar Rätsch |
Abstract | Generative Adversarial Networks (GANs) have shown remarkable success as a framework for training models to produce realistic-looking data. In this work, we propose a Recurrent GAN (RGAN) and Recurrent Conditional GAN (RCGAN) to produce realistic real-valued multi-dimensional time series, with an emphasis on their application to medical data. RGANs make use of recurrent neural networks in the generator and the discriminator. In the case of RCGANs, both of these RNNs are conditioned on auxiliary information. We demonstrate our models in a set of toy datasets, where we show visually and quantitatively (using sample likelihood and maximum mean discrepancy) that they can successfully generate realistic time-series. We also describe novel evaluation methods for GANs, where we generate a synthetic labelled training dataset, and evaluate on a real test set the performance of a model trained on the synthetic data, and vice-versa. We illustrate with these metrics that RCGANs can generate time-series data useful for supervised training, with only minor degradation in performance on real test data. This is demonstrated on digit classification from ‘serialised’ MNIST and by training an early warning system on a medical dataset of 17,000 patients from an intensive care unit. We further discuss and analyse the privacy concerns that may arise when using RCGANs to generate realistic synthetic medical time series data. |
Tasks | Time Series |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02633v2 |
PDF | http://arxiv.org/pdf/1706.02633v2.pdf |
PWC | https://paperswithcode.com/paper/real-valued-medical-time-series-generation |
Repo | https://github.com/LiDan456/GAN-AD |
Framework | tf |
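The maximum mean discrepancy mentioned in the abstract compares a set of generated sequences with a set of real ones through a kernel. A minimal sketch follows, assuming a Gaussian kernel with a fixed `bandwidth` and the biased estimator. The paper's own kernel and bandwidth selection may differ, and the sine-wave data is a toy stand-in.

```python
import math


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel between two equal-length sequences."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of sequences."""
    def mean_kernel(left, right):
        total = sum(rbf_kernel(a, b, bandwidth) for a in left for b in right)
        return total / (len(left) * len(right))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy data: sine waves with nearby phases versus flat, shifted sequences.
real = [[math.sin(0.5 * t + phase) for t in range(8)] for phase in (0.0, 0.3, 0.6, 0.9)]
close = [[math.sin(0.5 * t + phase) for t in range(8)] for phase in (0.1, 0.4, 0.7, 1.0)]
far = [[3.0] * 8 for _ in range(4)]

score_close = mmd_squared(real, close)
score_far = mmd_squared(real, far)
```

A lower score means the two sample sets are harder to tell apart under the chosen kernel, so a generator whose samples resemble `close` would score better than one whose samples resemble `far`.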
Normalized Direction-preserving Adam
Title | Normalized Direction-preserving Adam |
Authors | Zijun Zhang, Lin Ma, Zongpeng Li, Chuan Wu |
Abstract | Adaptive optimization algorithms, such as Adam and RMSprop, have shown better optimization performance than stochastic gradient descent (SGD) in some scenarios. However, recent studies show that they often lead to worse generalization performance than SGD, especially for training deep neural networks (DNNs). In this work, we identify the reasons that Adam generalizes worse than SGD, and develop a variant of Adam to eliminate the generalization gap. The proposed method, normalized direction-preserving Adam (ND-Adam), enables more precise control of the direction and step size for updating weight vectors, leading to significantly improved generalization performance. Following a similar rationale, we further improve the generalization performance in classification tasks by regularizing the softmax logits. By bridging the gap between SGD and Adam, we also hope to shed light on why certain optimization algorithms generalize better than others. |
Tasks | |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.04546v2 |
PDF | http://arxiv.org/pdf/1709.04546v2.pdf |
PWC | https://paperswithcode.com/paper/normalized-direction-preserving-adam |
Repo | https://github.com/zj10/ND-Adam |
Framework | pytorch |
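One way to read "control of the direction and step size for updating weight vectors" is sketched below. It assumes that each weight vector is kept at unit norm, that the gradient is projected onto the tangent space before the Adam moments are updated, and that the second moment is a single scalar per vector. These steps are an interpretation for illustration, not details given in the abstract, and `nd_adam_step` is a hypothetical name.

```python
import math


def nd_adam_step(w, grad, m, v, t, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One update of a unit-norm weight vector; returns the new (w, m, v)."""
    # Remove the gradient component along w so the step only changes direction.
    radial = sum(g * wi for g, wi in zip(grad, w))
    g_tan = [g - radial * wi for g, wi in zip(grad, w)]
    # First moment per coordinate, second moment as one scalar per vector.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, g_tan)]
    v = beta2 * v + (1 - beta2) * sum(g * g for g in g_tan)
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = v / (1 - beta2 ** t)
    stepped = [wi - lr * mh / (math.sqrt(v_hat) + eps) for wi, mh in zip(w, m_hat)]
    # Project back onto the unit sphere.
    norm = math.sqrt(sum(s * s for s in stepped))
    return [s / norm for s in stepped], m, v


# Toy problem: rotate w toward a fixed unit vector by minimizing -dot(w, target).
target = [0.0, 1.0, 0.0]
w, m, v = [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], 0.0
for t in range(1, 201):
    grad = [-c for c in target]  # gradient of the toy loss -dot(w, target)
    w, m, v = nd_adam_step(w, grad, m, v, t)
alignment = sum(wi * c for wi, c in zip(w, target))
```

Using one scalar second moment per vector rescales the whole step uniformly, so the update direction follows the averaged gradient instead of being distorted coordinate by coordinate.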
Improving the Accuracy of Pre-trained Word Embeddings for Sentiment Analysis
Title | Improving the Accuracy of Pre-trained Word Embeddings for Sentiment Analysis |
Authors | Seyed Mahdi Rezaeinia, Ali Ghodsi, Rouhollah Rahmani |
Abstract | Sentiment analysis is one of the well-known tasks and fast growing research areas in natural language processing (NLP) and text classification. This technique has become an essential part of a wide range of applications including politics, business, advertising and marketing. There are various techniques for sentiment analysis, but recently word embedding methods have been widely used in sentiment classification tasks. Word2Vec and GloVe are currently among the most accurate and usable word embedding methods which can convert words into meaningful vectors. However, these methods ignore sentiment information of texts and need a huge corpus of texts for training and generating exact vectors which are used as inputs of deep learning models. As a result, because of the small size of some corpora, researchers often have to use pre-trained word embeddings which were trained on other large text corpora such as Google News with about 100 billion words. The increasing accuracy of pre-trained word embeddings has a great impact on sentiment analysis research. In this paper we propose a novel method, Improved Word Vectors (IWV), which increases the accuracy of pre-trained word embeddings in sentiment analysis. Our method is based on Part-of-Speech (POS) tagging techniques, lexicon-based approaches and Word2Vec/GloVe methods. We tested the accuracy of our method via different deep learning models and sentiment datasets. Our experiment results show that Improved Word Vectors (IWV) are very effective for sentiment analysis. |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis, Word Embeddings |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08609v1 |
PDF | http://arxiv.org/pdf/1711.08609v1.pdf |
PWC | https://paperswithcode.com/paper/improving-the-accuracy-of-pre-trained-word |
Repo | https://github.com/PrashantRanjan09/Improved-Word-Embeddings |
Framework | tf |
Attentive Semantic Video Generation using Captions
Title | Attentive Semantic Video Generation using Captions |
Authors | Tanya Marwah, Gaurav Mittal, Vineeth N. Balasubramanian |
Abstract | This paper proposes a network architecture to perform variable length semantic video generation using captions. We adopt a new perspective towards video generation where we allow the captions to be combined with the long-term and short-term dependencies between video frames and thus generate a video in an incremental manner. Our experiments demonstrate our network architecture’s ability to distinguish between objects, actions and interactions in a video and combine them to generate videos for unseen captions. The network also exhibits the capability to perform spatio-temporal style transfer when asked to generate videos for a sequence of captions. We also show that the network’s ability to learn a latent representation allows it to generate videos in an unsupervised manner and perform other tasks such as action recognition. (Accepted at the International Conference on Computer Vision (ICCV) 2017) |
Tasks | Style Transfer, Temporal Action Localization, Video Generation |
Published | 2017-08-20 |
URL | http://arxiv.org/abs/1708.05980v3 |
PDF | http://arxiv.org/pdf/1708.05980v3.pdf |
PWC | https://paperswithcode.com/paper/attentive-semantic-video-generation-using |
Repo | https://github.com/Singularity42/cap2vid |
Framework | tf |
NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit
Title | NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit |
Authors | Thai-Hoang Pham, Xuan-Khoai Pham, Tuan-Anh Nguyen, Phuong Le-Hong |
Abstract | This paper presents NNVLP, a neural network-based toolkit for essential Vietnamese language processing tasks including part-of-speech (POS) tagging, chunking, and named entity recognition (NER). Our toolkit is a combination of bidirectional Long Short-Term Memory (Bi-LSTM), Convolutional Neural Network (CNN), and Conditional Random Field (CRF), using pre-trained word embeddings as input, which achieves state-of-the-art results on these three tasks. We provide both an API and a web demo for this toolkit. |
Tasks | Chunking, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07241v5 |
PDF | http://arxiv.org/pdf/1708.07241v5.pdf |
PWC | https://paperswithcode.com/paper/nnvlp-a-neural-network-based-vietnamese |
Repo | https://github.com/pth1993/NNVLP |
Framework | none |