Paper Group AWR 32
In-Bed Pose Estimation: Deep Learning with Shallow Dataset
Title | In-Bed Pose Estimation: Deep Learning with Shallow Dataset |
Authors | Shuangjun Liu, Yu Yin, Sarah Ostadabbas |
Abstract | Although human pose estimation for various computer vision (CV) applications has been studied extensively in the last few decades, in-bed pose estimation using camera-based vision methods has been ignored by the CV community because it is assumed to be identical to general purpose pose estimation. However, in-bed pose estimation has its own specialized aspects and comes with specific challenges, including notable differences in lighting conditions throughout the day and a pose distribution different from that of the common human surveillance viewpoint. In this paper, we demonstrate that these challenges significantly lessen the effectiveness of existing general purpose pose estimation models. To address the lighting variation challenge, an infrared selective (IRS) image acquisition technique is proposed to provide uniform quality data under various lighting conditions. In addition, to deal with the unconventional pose perspective, a 2-end histogram of oriented gradient (HOG) rectification method is presented. In this work, we explored the idea of employing a pre-trained convolutional neural network (CNN) model trained on large public datasets of general human poses and fine-tuning the model using our own shallow in-bed IRS dataset. We developed an IRS imaging system and collected IRS image data from several realistic life-size mannequins in a simulated hospital room environment. A pre-trained CNN called the convolutional pose machine (CPM) was repurposed for in-bed pose estimation by fine-tuning its specific intermediate layers. Using the HOG rectification method, the pose estimation performance of CPM improved significantly, by 26.4% under the PCK0.1 criterion, compared to the model without such rectification. |
Tasks | Pose Estimation |
Published | 2017-11-03 |
URL | http://arxiv.org/abs/1711.01005v3 |
PDF | http://arxiv.org/pdf/1711.01005v3.pdf |
PWC | https://paperswithcode.com/paper/in-bed-pose-estimation-deep-learning-with |
Repo | https://github.com/ostadabbas/in-bed-pose-estimation |
Framework | none |
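
The fine-tuning recipe described in the abstract, keeping a pre-trained backbone frozen and updating only selected intermediate layers, can be sketched as follows. This is a minimal sketch rather than the authors' code: CPM is not bundled with torchvision, so a ResNet-18 backbone and the choice of `layer3` are stand-in assumptions.

```python
# Hedged sketch: freeze a pre-trained CNN and fine-tune only an
# intermediate stage, mirroring the "fine-tune specific intermediate
# layers" recipe. ResNet-18 and layer3 are stand-ins for CPM.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

for p in model.parameters():          # freeze the whole backbone
    p.requires_grad = False
for p in model.layer3.parameters():   # hypothetical intermediate stage
    p.requires_grad = True

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```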
Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization
Title | Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization |
Authors | Xipeng Qiu, Jingjing Gong, Xuanjing Huang |
Abstract | In this paper, we give an overview of the shared task at the CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2017): Chinese News Headline Categorization. The dataset for this shared task consists of 18 classes, with 12,000 short texts and their corresponding labels for each class. The dataset and example code can be accessed at https://github.com/FudanNLP/nlpcc2017_news_headline_categorization. |
Tasks | |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.02883v1 |
PDF | http://arxiv.org/pdf/1706.02883v1.pdf |
PWC | https://paperswithcode.com/paper/overview-of-the-nlpcc-2017-shared-task |
Repo | https://github.com/FudanNLP/nlpcc2017_news_headline_categorization |
Framework | tf |
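
Since the task is 18-way short-text classification, a simple hedged baseline makes it concrete. The toy headlines and label names below are illustrative, not from the shared-task data; character n-grams sidestep Chinese word segmentation.

```python
# Hedged sketch: a character n-gram TF-IDF + logistic regression baseline
# for Chinese headline classification. Headlines and labels are toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

headlines = ["国足公布世预赛名单", "央行宣布下调存款准备金率"]  # toy examples
labels = ["sports", "finance"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(headlines, labels)
print(clf.predict(["皇马击败巴萨夺冠"]))
```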
Lifelong Learning with Dynamically Expandable Networks
Title | Lifelong Learning with Dynamically Expandable Networks |
Authors | Jaehong Yoon, Eunho Yang, Jeongtae Lee, Sung Ju Hwang |
Abstract | We propose a novel deep network architecture for lifelong learning, which we refer to as the Dynamically Expandable Network (DEN), that can dynamically decide its network capacity as it trains on a sequence of tasks, to learn a compact overlapping knowledge sharing structure among tasks. DEN is efficiently trained in an online manner by performing selective retraining, dynamically expands network capacity upon arrival of each task with only the necessary number of units, and effectively prevents semantic drift by splitting/duplicating units and timestamping them. We validate DEN on multiple public datasets under lifelong learning scenarios, on which it not only significantly outperforms existing lifelong learning methods for deep networks, but also achieves the same level of performance as its batch counterparts with a substantially smaller number of parameters. Further, the network fine-tuned on all tasks achieved significantly better performance than the batch models, which shows that DEN can be used to estimate the optimal network structure even when all tasks are available from the start. |
Tasks | |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01547v11 |
PDF | http://arxiv.org/pdf/1708.01547v11.pdf |
PWC | https://paperswithcode.com/paper/lifelong-learning-with-dynamically-expandable |
Repo | https://github.com/b5510546671/Chest-Xrays-Leaning |
Framework | tf |
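
One component of DEN, expanding a layer with fresh units when a new task needs extra capacity, can be illustrated in isolation. This is a toy analog under stated assumptions; the full algorithm also performs selective retraining and unit splitting/timestamping, which are not shown.

```python
# Hedged sketch of the dynamic-expansion step: grow a linear layer by k
# output units while preserving the weights learned for earlier tasks.
import torch
import torch.nn as nn

def expand_linear(layer: nn.Linear, k: int) -> nn.Linear:
    """Return a copy of `layer` with k extra output units."""
    new = nn.Linear(layer.in_features, layer.out_features + k)
    with torch.no_grad():
        new.weight[: layer.out_features] = layer.weight  # keep old units
        new.bias[: layer.out_features] = layer.bias
    return new

layer = nn.Linear(16, 8)
layer = expand_linear(layer, k=4)  # capacity grows from 8 to 12 units
print(layer.weight.shape)          # torch.Size([12, 16])
```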
Attacking Binarized Neural Networks
Title | Attacking Binarized Neural Networks |
Authors | Angus Galloway, Graham W. Taylor, Medhat Moussa |
Abstract | Neural networks with low-precision weights and activations offer compelling efficiency advantages over their full-precision equivalents. The two most frequently discussed benefits of quantization are reduced memory consumption, and a faster forward pass when implemented with efficient bitwise operations. We propose a third benefit of very low-precision neural networks: improved robustness against some adversarial attacks, and in the worst case, performance that is on par with full-precision models. We focus on the very low-precision case where weights and activations are both quantized to $\pm$1, and note that stochastically quantizing weights in just one layer can sharply reduce the impact of iterative attacks. We observe that non-scaled binary neural networks exhibit a similar effect to the original defensive distillation procedure that led to gradient masking, and a false notion of security. We address this by conducting both black-box and white-box experiments with binary models that do not artificially mask gradients. |
Tasks | Quantization |
Published | 2017-11-01 |
URL | http://arxiv.org/abs/1711.00449v2 |
PDF | http://arxiv.org/pdf/1711.00449v2.pdf |
PWC | https://paperswithcode.com/paper/attacking-binarized-neural-networks |
Repo | https://github.com/AngusG/cleverhans-attacking-bnns |
Framework | tf |
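
The models under attack quantize weights and activations to ±1; the standard way to train through that non-differentiable step is a straight-through estimator, sketched below. This reflects common BNN practice rather than the paper's exact training code.

```python
# Hedged sketch: +/-1 binarization with a straight-through estimator.
# Forward pass applies sign(); backward pass passes the gradient through
# wherever |x| <= 1 (the usual clipped-identity surrogate).
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
print(x.grad)  # nonzero only where |x| <= 1
```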
Triple Generative Adversarial Nets
Title | Triple Generative Adversarial Nets |
Authors | Chongxuan Li, Kun Xu, Jun Zhu, Bo Zhang |
Abstract | Generative Adversarial Nets (GANs) have shown promise in image generation and semi-supervised learning (SSL). However, existing GANs in SSL have two problems: (1) the generator and the discriminator (i.e. the classifier) may not be optimal at the same time; and (2) the generator cannot control the semantics of the generated samples. The problems essentially arise from the two-player formulation, where a single discriminator shares the incompatible roles of identifying fake samples and predicting labels, and it only estimates the data distribution without considering the labels. To address these problems, we present the triple generative adversarial net (Triple-GAN), which consists of three players: a generator, a discriminator, and a classifier. The generator and the classifier characterize the conditional distributions between images and labels, and the discriminator solely focuses on identifying fake image-label pairs. We design compatible utilities to ensure that the distributions characterized by the classifier and the generator both converge to the data distribution. Our results on various datasets demonstrate that Triple-GAN as a unified model can simultaneously (1) achieve state-of-the-art classification results among deep generative models, and (2) disentangle the classes and styles of the input and transfer smoothly in the data space via class-conditional interpolation in the latent space. |
Tasks | Image Generation |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02291v4 |
PDF | http://arxiv.org/pdf/1703.02291v4.pdf |
PWC | https://paperswithcode.com/paper/triple-generative-adversarial-nets |
Repo | https://github.com/zhenxuan00/triple-gan |
Framework | none |
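
The three-player structure can be made concrete with a toy discriminator loss over (image, label) pairs: real labeled pairs are positives, while both generator pairs and classifier-labeled pairs count as negatives. The networks, dimensions, and the 0.5 mixing weights below are assumptions for illustration; the paper balances the two fake sources with its own trade-off parameter.

```python
# Hedged sketch of the Triple-GAN discriminator objective: D scores
# (image, label) pairs from three sources. All tensors here are toy data.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairDiscriminator(nn.Module):
    """Scores an (image, label-distribution) pair as real or fake."""
    def __init__(self, x_dim=784, y_dim=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim + y_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1), nn.Sigmoid())
    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1))

D = PairDiscriminator()
x_real = torch.randn(8, 784)                      # labeled real images
y_real = F.one_hot(torch.randint(0, 10, (8,)), 10).float()
x_gen = torch.randn(8, 784)                       # generator outputs for given y
y_gen = F.one_hot(torch.randint(0, 10, (8,)), 10).float()
x_unl = torch.randn(8, 784)                       # unlabeled real images
y_cls = torch.softmax(torch.randn(8, 10), dim=1)  # classifier's predicted labels

ones, zeros = torch.ones(8, 1), torch.zeros(8, 1)
d_loss = (F.binary_cross_entropy(D(x_real, y_real), ones)
          + 0.5 * F.binary_cross_entropy(D(x_gen, y_gen), zeros)
          + 0.5 * F.binary_cross_entropy(D(x_unl, y_cls), zeros))
```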
Simple Online and Realtime Tracking with a Deep Association Metric
Title | Simple Online and Realtime Tracking with a Deep Association Metric |
Authors | Nicolai Wojke, Alex Bewley, Dietrich Paulus |
Abstract | Simple Online and Realtime Tracking (SORT) is a pragmatic approach to multiple object tracking with a focus on simple, effective algorithms. In this paper, we integrate appearance information to improve the performance of SORT. Thanks to this extension, we are able to track objects through longer periods of occlusion, effectively reducing the number of identity switches. In the spirit of the original framework, we place much of the computational complexity in an offline pre-training stage where we learn a deep association metric on a large-scale person re-identification dataset. During online application, we establish measurement-to-track associations using nearest neighbor queries in visual appearance space. Experimental evaluation shows that our extensions reduce the number of identity switches by 45%, achieving overall competitive performance at high frame rates. |
Tasks | Large-Scale Person Re-Identification, Multiple Object Tracking, Object Tracking, Person Re-Identification |
Published | 2017-03-21 |
URL | http://arxiv.org/abs/1703.07402v1 |
PDF | http://arxiv.org/pdf/1703.07402v1.pdf |
PWC | https://paperswithcode.com/paper/simple-online-and-realtime-tracking-with-a |
Repo | https://github.com/Cjiangbpcs/cjiang.github.io |
Framework | none |
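
The online step described in the abstract, nearest-neighbor association in appearance space, reduces to a small assignment problem. A minimal sketch: Deep SORT additionally gates these costs with Kalman-filter motion distances and keeps a gallery of embeddings per track, both omitted here.

```python
# Hedged sketch of measurement-to-track association: cosine distances
# between track and detection embeddings, solved as linear assignment.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_embs, det_embs):
    """track_embs: (T, d), det_embs: (D, d); rows are L2-normalized."""
    cost = 1.0 - track_embs @ det_embs.T      # cosine distance matrix
    rows, cols = linear_sum_assignment(cost)  # Hungarian matching
    return list(zip(rows, cols))              # (track, detection) pairs

rng = np.random.default_rng(0)
tracks = rng.standard_normal((3, 128))
tracks /= np.linalg.norm(tracks, axis=1, keepdims=True)
dets = rng.standard_normal((4, 128))
dets /= np.linalg.norm(dets, axis=1, keepdims=True)
print(associate(tracks, dets))
```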
ExprGAN: Facial Expression Editing with Controllable Expression Intensity
Title | ExprGAN: Facial Expression Editing with Controllable Expression Intensity |
Authors | Hui Ding, Kumar Sricharan, Rama Chellappa |
Abstract | Facial expression editing is a challenging task as it needs a high-level semantic understanding of the input face image. In conventional methods, either paired training data is required or the synthetic face resolution is low. Moreover, only the categories of facial expression can be changed. To address these limitations, we propose an Expression Generative Adversarial Network (ExprGAN) for photo-realistic facial expression editing with controllable expression intensity. An expression controller module is specially designed to learn an expressive and compact expression code in addition to the encoder-decoder network. This novel architecture enables the expression intensity to be continuously adjusted from low to high. We further show that our ExprGAN can be applied for other tasks, such as expression transfer, image retrieval, and data augmentation for training improved face expression recognition models. To tackle the small size of the training database, an effective incremental learning scheme is proposed. Quantitative and qualitative evaluations on the widely used Oulu-CASIA dataset demonstrate the effectiveness of ExprGAN. |
Tasks | Data Augmentation, Image Retrieval |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03842v2 |
PDF | http://arxiv.org/pdf/1709.03842v2.pdf |
PWC | https://paperswithcode.com/paper/exprgan-facial-expression-editing-with |
Repo | https://github.com/hengxyz/ExpressionGAN |
Framework | tf |
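
The controllable-intensity idea can be sketched at the interface level: the decoder consumes an identity code plus an expression code, and scaling the expression code sweeps the synthesized intensity. Everything below is a stand-in assumption; ExprGAN's controller module learns the expression code rather than scaling a fixed vector.

```python
# Hedged sketch of continuous expression-intensity control at inference
# time. The decoder and both codes are toy stand-ins.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(64 + 8, 256), nn.ReLU(),
                        nn.Linear(256, 32 * 32))  # toy image decoder

identity_code = torch.randn(1, 64)  # assumed output of the encoder
expr_code = torch.randn(1, 8)       # assumed output of the controller

for intensity in (0.0, 0.5, 1.0):   # sweep expression strength
    img = decoder(torch.cat([identity_code, intensity * expr_code], dim=1))
    print(intensity, img.shape)
```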
Single-Queue Decoding for Neural Machine Translation
Title | Single-Queue Decoding for Neural Machine Translation |
Authors | Raphael Shu, Hideki Nakayama |
Abstract | Neural machine translation models rely on the beam search algorithm for decoding. In practice, we found that the quality of hypotheses in the search space is negatively affected by the fixed beam size. To mitigate this problem, we store all hypotheses in a single priority queue and use a universal score function for hypothesis selection. The proposed algorithm is more flexible, as discarded hypotheses can be revisited at a later step. We further design a penalty function to penalize hypotheses that tend to produce a final translation much longer or shorter than expected. Despite its simplicity, we show that the proposed decoding algorithm is able to select higher-quality hypotheses and improve translation performance. |
Tasks | Machine Translation |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01830v2 |
PDF | http://arxiv.org/pdf/1707.01830v2.pdf |
PWC | https://paperswithcode.com/paper/single-queue-decoding-for-neural-machine |
Repo | https://github.com/zomux/nmtdec |
Framework | none |
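
The decoding idea is directly implementable with a heap: every live hypothesis sits in one priority queue ordered by a universal score, so a hypothesis passed over at one step can be revisited later. The score function and the toy "model" below are assumptions; the paper's length penalty has its own exact form.

```python
# Hedged sketch of single-queue decoding: one priority queue holds every
# live hypothesis, ordered by a universal score (log-prob plus a toy
# length penalty), so hypotheses passed over earlier can resurface.
import heapq
import math

def score(logprob, length, expected_len, alpha=0.1):
    # toy universal score; the paper's penalty has its own exact form
    return logprob - alpha * abs(length - expected_len)

def single_queue_decode(expand, eos, expected_len, max_pops=100):
    queue = [(-score(0.0, 0, expected_len), 0.0, ())]  # (-score, logprob, tokens)
    finished = []
    for _ in range(max_pops):
        if not queue:
            break
        _, lp, toks = heapq.heappop(queue)
        if toks and toks[-1] == eos:
            finished.append((lp, toks))
            continue
        for tok, tok_lp in expand(toks):  # one decoder step (stand-in)
            new = toks + (tok,)
            heapq.heappush(
                queue,
                (-score(lp + tok_lp, len(new), expected_len), lp + tok_lp, new))
    return max(finished, default=None)

# toy "model": prefers token 1, only likes ending (<eos>=0) after 3 tokens
toy = lambda toks: [(0, math.log(0.6 if len(toks) >= 3 else 0.05)),
                    (1, math.log(0.5))]
print(single_queue_decode(toy, eos=0, expected_len=4))
```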
Deep Recurrent Generative Decoder for Abstractive Text Summarization
Title | Deep Recurrent Generative Decoder for Abstractive Text Summarization |
Authors | Piji Li, Wai Lam, Lidong Bing, Zihao Wang |
Abstract | We propose a new framework for abstractive text summarization based on a sequence-to-sequence oriented encoder-decoder model equipped with a deep recurrent generative decoder (DRGD). Latent structure information implied in the target summaries is learned based on a recurrent latent random model to improve summarization quality. Neural variational inference is employed to address the intractable posterior inference for the recurrent latent variables. Abstractive summaries are generated based on both the generative latent variables and the discriminative deterministic states. Extensive experiments on benchmark datasets in different languages show that DRGD achieves improvements over the state-of-the-art methods. |
Tasks | Abstractive Text Summarization, Text Summarization |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00625v1 |
PDF | http://arxiv.org/pdf/1708.00625v1.pdf |
PWC | https://paperswithcode.com/paper/deep-recurrent-generative-decoder-for |
Repo | https://github.com/toru34/li_emnlp_2017 |
Framework | none |
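
The neural variational inference piece rests on the standard reparameterization trick plus an analytic Gaussian KL term, sketched below in isolation from the seq2seq decoder.

```python
# Hedged sketch of the per-step variational machinery: sample the latent
# z differentiably and compute the KL penalty against a standard normal.
import torch

def sample_latent(mu, logvar):
    eps = torch.randn_like(mu)
    z = mu + torch.exp(0.5 * logvar) * eps  # reparameterization trick
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    return z, kl

mu, logvar = torch.zeros(2, 16), torch.zeros(2, 16)
z, kl = sample_latent(mu, logvar)
print(z.shape, kl)  # kl is 0 when the posterior equals the prior
```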
Virtual Adversarial Ladder Networks For Semi-supervised Learning
Title | Virtual Adversarial Ladder Networks For Semi-supervised Learning |
Authors | Saki Shinoda, Daniel E. Worrall, Gabriel J. Brostow |
Abstract | Semi-supervised learning (SSL) partially circumvents the high cost of labeling data by augmenting a small labeled dataset with a large and relatively cheap unlabeled dataset drawn from the same distribution. This paper offers a novel interpretation of two deep learning-based SSL approaches, ladder networks and virtual adversarial training (VAT), as applying distributional smoothing to their respective latent spaces. We propose a class of models that fuse these approaches. We achieve near-supervised accuracy with high consistency on the MNIST dataset using just 5 labels per class: our best model, ladder with layer-wise virtual adversarial noise (LVAN-LW), achieves 1.42% +/- 0.12 average error rate on the MNIST test set, in comparison with 1.62% +/- 0.65 reported for the ladder network. On adversarial examples generated with L2-normalized fast gradient method, LVAN-LW trained with 5 examples per class achieves average error rate 2.4% +/- 0.3 compared to 68.6% +/- 6.5 for the ladder network and 9.9% +/- 7.5 for VAT. |
Tasks | |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07476v2 |
PDF | http://arxiv.org/pdf/1711.07476v2.pdf |
PWC | https://paperswithcode.com/paper/virtual-adversarial-ladder-networks-for-semi |
Repo | https://github.com/sakishinoda/tf-ssl |
Framework | tf |
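
The virtual adversarial ingredient computes, via one step of power iteration, the small input direction that most perturbs the model's output distribution. A minimal sketch, assuming a generic classifier; the ladder-network side and the layer-wise variant (LVAN-LW) are not shown.

```python
# Hedged sketch: one power-iteration step for the virtual adversarial
# perturbation, the direction that most changes the output distribution.
import torch
import torch.nn.functional as F

def vat_perturbation(model, x, xi=1e-6, eps=2.0):
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)             # current predictions
    d = torch.randn_like(x)
    d = xi * d / d.norm(dim=1, keepdim=True)       # tiny random probe
    d.requires_grad_(True)
    q = F.log_softmax(model(x + d), dim=1)
    F.kl_div(q, p, reduction="batchmean").backward()
    r = d.grad / d.grad.norm(dim=1, keepdim=True)  # dominant direction
    return eps * r

model = torch.nn.Linear(10, 3)  # toy classifier stand-in
x = torch.randn(4, 10)
print(vat_perturbation(model, x).shape)  # torch.Size([4, 10])
```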
Learning Distributed Representations of Texts and Entities from Knowledge Base
Title | Learning Distributed Representations of Texts and Entities from Knowledge Base |
Authors | Ikuya Yamada, Hiroyuki Shindo, Hideaki Takeda, Yoshiyasu Takefuji |
Abstract | We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train our proposed model to predict entities that are relevant to the text. Our model is designed to be generic with the ability to address various NLP tasks with ease. We train the model using a large corpus of texts and their entity annotations extracted from Wikipedia. We evaluated the model on three important NLP tasks (i.e., sentence textual similarity, entity linking, and factoid question answering) involving both unsupervised and supervised settings. As a result, we achieved state-of-the-art results on all three of these tasks. Our code and trained models are publicly available for further academic research. |
Tasks | Entity Disambiguation, Entity Linking, Question Answering |
Published | 2017-05-06 |
URL | http://arxiv.org/abs/1705.02494v3 |
PDF | http://arxiv.org/pdf/1705.02494v3.pdf |
PWC | https://paperswithcode.com/paper/learning-distributed-representations-of-texts |
Repo | https://github.com/keel-keywordextraction-entitylinking/entityLinking |
Framework | none |
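
The core scoring idea, ranking KB entities by similarity to a text embedding in a shared space, can be sketched with toy vectors. The vocabulary, entity names, and random vectors below are illustrative only; the real model learns these representations jointly from Wikipedia entity annotations.

```python
# Hedged sketch: rank candidate KB entities by dot-product similarity to
# a text embedding in a shared space. All vectors here are random toys.
import numpy as np

rng = np.random.default_rng(0)
word_vecs = {w: rng.standard_normal(50)
             for w in ["apple", "released", "the", "iphone"]}
entity_vecs = {e: rng.standard_normal(50)
               for e in ["Apple_Inc.", "Apple_(fruit)", "IPhone"]}

text = ["apple", "released", "the", "iphone"]
t = np.mean([word_vecs[w] for w in text], axis=0)  # text embedding

scores = {e: float(v @ t) for e, v in entity_vecs.items()}
print(max(scores, key=scores.get))  # highest-scoring candidate entity
```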
A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers
Title | A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers |
Authors | Yong Sheng Soh, Venkat Chandrasekaran |
Abstract | Regularization techniques are widely employed in optimization-based approaches for solving ill-posed inverse problems in data analysis and scientific computing. These methods are based on augmenting the objective with a penalty function, which is specified based on prior domain-specific expertise to induce a desired structure in the solution. We consider the problem of learning suitable regularization functions from data in settings in which precise domain knowledge is not directly available. Previous work under the title of 'dictionary learning' or 'sparse coding' may be viewed as learning a regularization function that can be computed via linear programming. We describe generalizations of these methods to learn regularizers that can be computed and optimized via semidefinite programming. Our framework for learning such semidefinite regularizers is based on obtaining structured factorizations of data matrices, and our algorithmic approach for computing these factorizations combines recent techniques for rank minimization problems along with an operator analog of Sinkhorn scaling. Under suitable conditions on the input data, our algorithm provides a locally linearly convergent method for identifying the correct regularizer that promotes the type of structure contained in the data. Our analysis is based on the stability properties of Operator Sinkhorn scaling and their relation to geometric aspects of determinantal varieties (in particular tangent spaces with respect to these varieties). The regularizers obtained using our framework can be employed effectively in semidefinite programming relaxations for solving inverse problems. |
Tasks | Dictionary Learning |
Published | 2017-01-05 |
URL | http://arxiv.org/abs/1701.01207v1 |
PDF | http://arxiv.org/pdf/1701.01207v1.pdf |
PWC | https://paperswithcode.com/paper/a-matrix-factorization-approach-for-learning |
Repo | https://github.com/yssoh/yssoh.github.io |
Framework | none |
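
The algorithmic workhorse mentioned in the abstract is an operator analog of Sinkhorn scaling. Its classical matrix form, which the operator version generalizes, is short enough to show in full:

```python
# Hedged sketch of classical matrix Sinkhorn scaling: alternately
# normalize rows and columns of a positive matrix until it is
# (approximately) doubly stochastic.
import numpy as np

def sinkhorn(A, iters=200):
    A = A.copy()
    for _ in range(iters):
        A /= A.sum(axis=1, keepdims=True)  # row normalization
        A /= A.sum(axis=0, keepdims=True)  # column normalization
    return A

A = np.random.rand(4, 4) + 0.1
S = sinkhorn(A)
print(S.sum(axis=0), S.sum(axis=1))  # both approach all-ones vectors
```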
Deep Multitask Learning for Semantic Dependency Parsing
Title | Deep Multitask Learning for Semantic Dependency Parsing |
Authors | Hao Peng, Sam Thomson, Noah A. Smith |
Abstract | We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. By using efficient, nearly arc-factored inference and a bidirectional-LSTM composed with a multi-layer perceptron, our base system is able to significantly improve the state of the art for semantic dependency parsing, without using hand-engineered features or syntax. We then explore two multitask learning approaches—one that shares parameters across formalisms, and one that uses higher-order structures to predict the graphs jointly. We find that both approaches improve performance across formalisms on average, achieving a new state of the art. Our code is open-source and available at https://github.com/Noahs-ARK/NeurboParser. |
Tasks | Dependency Parsing, Semantic Dependency Parsing |
Published | 2017-04-22 |
URL | http://arxiv.org/abs/1704.06855v2 |
PDF | http://arxiv.org/pdf/1704.06855v2.pdf |
PWC | https://paperswithcode.com/paper/deep-multitask-learning-for-semantic |
Repo | https://github.com/Noahs-ARK/NeurboParser |
Framework | none |
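
An arc-factored scorer of the kind the abstract describes, biLSTM features composed with an MLP over head-dependent pairs, can be sketched in a few lines. Dimensions are toy choices, and the formalism-specific components and structured inference are omitted.

```python
# Hedged sketch of an arc-factored scorer: contextualized token vectors
# from a biLSTM feed an MLP that scores every (head, dependent) pair.
import torch
import torch.nn as nn

words = torch.randn(1, 7, 100)                 # one 7-token sentence (toy)
bilstm = nn.LSTM(100, 64, bidirectional=True, batch_first=True)
mlp = nn.Sequential(nn.Linear(4 * 64, 128), nn.ReLU(), nn.Linear(128, 1))

h, _ = bilstm(words)                           # (1, 7, 128)
n = h.size(1)
heads = h.unsqueeze(2).expand(-1, n, n, -1)    # h_i for every pair (i, j)
deps = h.unsqueeze(1).expand(-1, n, n, -1)     # h_j for every pair (i, j)
arc_scores = mlp(torch.cat([heads, deps], dim=-1)).squeeze(-1)
print(arc_scores.shape)                        # (1, 7, 7): score of arc i -> j
```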
Feature-based time-series analysis
Title | Feature-based time-series analysis |
Authors | Ben D. Fulcher |
Abstract | This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations, allowing us to understand which properties of a time-series dataset make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning, aiding understanding of the complex dynamical patterns in the time series we measure from the world. |
Tasks | Time Series, Time Series Analysis |
Published | 2017-09-23 |
URL | http://arxiv.org/abs/1709.08055v2 |
PDF | http://arxiv.org/pdf/1709.08055v2.pdf |
PWC | https://paperswithcode.com/paper/feature-based-time-series-analysis |
Repo | https://github.com/benfulcher/hctsa |
Framework | none |
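
A feature-based representation simply maps each series to a fixed-length vector of interpretable statistics. The four features below are a toy subset of the thousands implemented in libraries such as the linked hctsa.

```python
# Hedged sketch of a feature-based time-series representation: map a
# series to a small vector of interpretable summary statistics.
import numpy as np

def features(x):
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    acf1 = (xc[:-1] @ xc[1:]) / (xc @ xc)  # lag-1 autocorrelation
    return {
        "mean": x.mean(),
        "std": x.std(),
        "acf1": acf1,
        "trend": np.polyfit(np.arange(len(x)), x, 1)[0],  # linear slope
    }

t = np.linspace(0, 10, 200)
print(features(np.sin(t) + 0.1 * t))  # toy noisy-trend series
```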
Simple Recurrent Units for Highly Parallelizable Recurrence
Title | Simple Recurrent Units for Highly Parallelizable Recurrence |
Authors | Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi |
Abstract | Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and scalability. SRU is designed to provide expressive recurrence, enable highly parallelized implementation, and comes with careful initialization to facilitate training of deep models. We demonstrate the effectiveness of SRU on multiple NLP tasks. SRU achieves 5–9x speed-up over cuDNN-optimized LSTM on classification and question answering datasets, and delivers stronger results than LSTM and convolutional models. We also obtain an average of 0.7 BLEU improvement over the Transformer model on translation by incorporating SRU into the architecture. |
Tasks | Machine Translation, Question Answering, Text Classification |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02755v5 |
PDF | http://arxiv.org/pdf/1709.02755v5.pdf |
PWC | https://paperswithcode.com/paper/simple-recurrent-units-for-highly |
Repo | https://github.com/taolei87/sru |
Framework | pytorch |
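
The parallelism claim follows from the SRU recurrence itself: all matrix multiplications are batched over timesteps up front, leaving only cheap elementwise operations in the sequential loop. The sketch below follows a commonly cited formulation of the recurrence and omits the fused CUDA kernel and the scaling terms of the released implementation.

```python
# Hedged sketch of the SRU recurrence: heavy matmuls run in parallel over
# the whole sequence; the time loop is elementwise only.
import torch
import torch.nn as nn

class TinySRU(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.W = nn.Linear(d, 3 * d)  # candidate, forget gate, reset gate

    def forward(self, x):                              # x: (T, B, d)
        xt, f_pre, r_pre = self.W(x).chunk(3, dim=-1)  # parallel over T
        f, r = torch.sigmoid(f_pre), torch.sigmoid(r_pre)
        c = torch.zeros_like(x[0])
        hs = []
        for t in range(x.size(0)):                     # elementwise ops only
            c = f[t] * c + (1 - f[t]) * xt[t]
            hs.append(r[t] * c + (1 - r[t]) * x[t])    # highway connection
        return torch.stack(hs)

out = TinySRU(32)(torch.randn(5, 4, 32))
print(out.shape)  # torch.Size([5, 4, 32])
```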