October 20, 2019

Paper Group AWR 280

A Style-Aware Content Loss for Real-time HD Style Transfer

Title A Style-Aware Content Loss for Real-time HD Style Transfer
Authors Artsiom Sanakoyeu, Dmytro Kotovenko, Sabine Lang, Björn Ommer
Abstract Recently, style transfer has received a lot of attention. While much of this research has aimed at speeding up processing, the approaches are still lacking from a principled, art historical standpoint: a style is more than just a single image or an artist, but previous work is limited to only a single instance of a style or shows no benefit from more images. Moreover, previous work has relied on a direct comparison of art in the domain of RGB images or on CNNs pre-trained on ImageNet, which requires millions of labeled object bounding boxes and can introduce an extra bias, since it has been assembled without artistic consideration. To circumvent these issues, we propose a style-aware content loss, which is trained jointly with a deep encoder-decoder network for real-time, high-resolution stylization of images and videos. We propose a quantitative measure for evaluating the quality of a stylized image and also have art historians rank patches from our approach against those from previous work. These and our qualitative results ranging from small image patches to megapixel stylistic images and videos show that our approach better captures the subtle nature in which a style affects content.
Tasks Image Stylization, Style Transfer
Published 2018-07-26
URL http://arxiv.org/abs/1807.10201v2
PDF http://arxiv.org/pdf/1807.10201v2.pdf
PWC https://paperswithcode.com/paper/a-style-aware-content-loss-for-real-time-hd
Repo https://github.com/peterfind/adaptive-style-transfer-lossChange
Framework tf

Sparse Kernel PCA for Outlier Detection

Title Sparse Kernel PCA for Outlier Detection
Authors Rudrajit Das, Aditya Golatkar, Suyash P. Awate
Abstract In this paper, we propose a new method to perform Sparse Kernel Principal Component Analysis (SKPCA) and also mathematically analyze the validity of SKPCA. We formulate SKPCA as a constrained optimization problem with elastic net regularization (Hastie et al.) in kernel feature space and solve it. We consider outlier detection (where KPCA is employed) as an application for SKPCA, using the RBF kernel. We test it on 5 real-world datasets and show that by using just 4% (or even less) of the principal components (PCs), where each PC has on average less than 12% non-zero elements in the worst case among all 5 datasets, we are able to nearly match and in 3 datasets even outperform KPCA. We also compare the performance of our method with a recently proposed method for SKPCA by Wang et al. and show that our method performs better in terms of both accuracy and sparsity. We also provide a novel probabilistic proof to justify the existence of sparse solutions for KPCA using the RBF kernel. To the best of our knowledge, this is the first attempt at theoretically analyzing the validity of SKPCA.
Tasks Outlier Detection
Published 2018-09-07
URL http://arxiv.org/abs/1809.02497v2
PDF http://arxiv.org/pdf/1809.02497v2.pdf
PWC https://paperswithcode.com/paper/sparse-kernel-pca-for-outlier-detection
Repo https://github.com/AdityaGolatkar/Sparse-Kernel-PCA-for-outlier-detection
Framework none

Improving GAN Training via Binarized Representation Entropy (BRE) Regularization

Title Improving GAN Training via Binarized Representation Entropy (BRE) Regularization
Authors Yanshuai Cao, Gavin Weiguang Ding, Kry Yik-Chau Lui, Ruitong Huang
Abstract We propose a novel regularizer to improve the training of Generative Adversarial Networks (GANs). The motivation is that when the discriminator D spreads out its model capacity in the right way, the learning signals given to the generator G are more informative and diverse. These in turn help G to explore better and discover the real data manifold while avoiding large unstable jumps due to the erroneous extrapolation made by D. Our regularizer guides the rectifier discriminator D to better allocate its model capacity, by encouraging the binary activation patterns on selected internal layers of D to have a high joint entropy. Experimental results on both synthetic data and real datasets demonstrate improvements in stability and convergence speed of the GAN training, as well as higher sample quality. The approach also leads to higher classification accuracies in semi-supervised learning.
Tasks
Published 2018-05-09
URL http://arxiv.org/abs/1805.03644v1
PDF http://arxiv.org/pdf/1805.03644v1.pdf
PWC https://paperswithcode.com/paper/improving-gan-training-via-binarized
Repo https://github.com/BorealisAI/bre-gan
Framework tf

Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders

Title Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders
Authors Wen-Chin Huang, Hsin-Te Hwang, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang
Abstract An effective approach to non-parallel voice conversion (VC) is to utilize deep neural networks (DNNs), specifically variational auto encoders (VAEs), to model the latent structure of speech in an unsupervised manner. A previous study has confirmed the effectiveness of VAE using the STRAIGHT spectra for VC. However, VAE using other types of spectral features such as mel-cepstral coefficients (MCCs), which are related to human perception and have been widely used in VC, have not been properly investigated. Instead of using one specific type of spectral feature, it is expected that VAE may benefit from using multiple types of spectral features simultaneously, thereby improving the capability of VAE for VC. To this end, we propose a novel VAE framework (called cross-domain VAE, CDVAE) for VC. Specifically, the proposed framework utilizes both STRAIGHT spectra and MCCs by explicitly regularizing multiple objectives in order to constrain the behavior of the learned encoder and decoder. Experimental results demonstrate that the proposed CDVAE framework outperforms the conventional VAE framework in terms of subjective tests.
Tasks Voice Conversion
Published 2018-08-29
URL http://arxiv.org/abs/1808.09634v1
PDF http://arxiv.org/pdf/1808.09634v1.pdf
PWC https://paperswithcode.com/paper/voice-conversion-based-on-cross-domain
Repo https://github.com/unilight/cdvae-vc
Framework tf

Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging

Title Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging
Authors Kemal Kurniawan, Alham Fikri Aji
Abstract Previous work in Indonesian part-of-speech (POS) tagging is hard to compare as it is not evaluated on a common dataset. Furthermore, in spite of the success of neural network models for English POS tagging, they are rarely explored for Indonesian. In this paper, we explored various techniques for Indonesian POS tagging, including rule-based, CRF, and neural network-based models. We evaluated our models on the IDN Tagged Corpus. A new state-of-the-art of 97.47 F1 score is achieved with a recurrent neural network. To provide a standard for future work, we release the dataset split that we used publicly.
Tasks Part-Of-Speech Tagging
Published 2018-09-10
URL http://arxiv.org/abs/1809.03391v3
PDF http://arxiv.org/pdf/1809.03391v3.pdf
PWC https://paperswithcode.com/paper/toward-a-standardized-and-more-accurate
Repo https://github.com/kmkurn/id-pos-tagging
Framework none

General solutions for nonlinear differential equations: a rule-based self-learning approach using deep reinforcement learning

Title General solutions for nonlinear differential equations: a rule-based self-learning approach using deep reinforcement learning
Authors Shiyin Wei, Xiaowei Jin, Hui Li
Abstract A universal rule-based self-learning approach using deep reinforcement learning (DRL) is proposed for the first time to solve nonlinear ordinary differential equations and partial differential equations. The solver consists of a deep neural network-structured actor that outputs candidate solutions, and a critic derived only from physical rules (governing equations and boundary and initial conditions). Solutions in discretized time are treated as multiple tasks sharing the same governing equation, and the current step parameters provide an ideal initialization for the next owing to the temporal continuity of the solutions, which shows a transfer learning characteristic and indicates that the DRL solver has captured the intrinsic nature of the equation. The approach is verified through solving the Schrödinger, Navier-Stokes, Burgers’, Van der Pol, and Lorenz equations and an equation of motion. The results indicate that the approach gives solutions with high accuracy, and the solution process promises to get faster.
Tasks Transfer Learning
Published 2018-05-13
URL https://arxiv.org/abs/1805.07297v2
PDF https://arxiv.org/pdf/1805.07297v2.pdf
PWC https://paperswithcode.com/paper/general-solutions-for-nonlinear-differential
Repo https://github.com/HIT-SMC/DRL_solver
Framework none

PnP-AdaNet: Plug-and-Play Adversarial Domain Adaptation Network with a Benchmark at Cross-modality Cardiac Segmentation

Title PnP-AdaNet: Plug-and-Play Adversarial Domain Adaptation Network with a Benchmark at Cross-modality Cardiac Segmentation
Authors Qi Dou, Cheng Ouyang, Cheng Chen, Hao Chen, Ben Glocker, Xiahai Zhuang, Pheng-Ann Heng
Abstract Deep convolutional networks have demonstrated the state-of-the-art performance on various medical image computing tasks. Leveraging images from different modalities for the same analysis task holds clinical benefits. However, the generalization capability of deep models on test data with different distributions remains a major challenge. In this paper, we propose the PnP-AdaNet (plug-and-play adversarial domain adaptation network) for adapting segmentation networks between different modalities of medical images, e.g., MRI and CT. We propose to tackle the significant domain shift by aligning the feature spaces of source and target domains in an unsupervised manner. Specifically, a domain adaptation module flexibly replaces the early encoder layers of the source network, and the higher layers are shared between domains. With adversarial learning, we build two discriminators whose inputs are respectively multi-level features and predicted segmentation masks. We have validated our domain adaptation method on cardiac structure segmentation in unpaired MRI and CT. The experimental results with comprehensive ablation studies demonstrate the excellent efficacy of our proposed PnP-AdaNet. Moreover, we introduce a novel benchmark on the cardiac dataset for the task of unsupervised cross-modality domain adaptation. We will make our code and database publicly available, aiming to promote future studies on this challenging yet important research topic in medical imaging.
Tasks Cardiac Segmentation, Domain Adaptation, Medical Image Generation, Medical Image Segmentation
Published 2018-12-19
URL http://arxiv.org/abs/1812.07907v1
PDF http://arxiv.org/pdf/1812.07907v1.pdf
PWC https://paperswithcode.com/paper/pnp-adanet-plug-and-play-adversarial-domain
Repo https://github.com/carrenD/Med-CMDA
Framework tf

Deep Cosine Metric Learning for Person Re-Identification

Title Deep Cosine Metric Learning for Person Re-Identification
Authors Nicolai Wojke, Alex Bewley
Abstract Metric learning aims to construct an embedding where two extracted features corresponding to the same identity are likely to be closer than features from different identities. This paper presents a method for learning such a feature space where the cosine similarity is effectively optimized through a simple re-parametrization of the conventional softmax classification regime. At test time, the final classification layer can be stripped from the network to facilitate nearest neighbor queries on unseen individuals using the cosine similarity metric. This approach presents a simple alternative to direct metric learning objectives such as siamese networks that have required sophisticated pair or triplet sampling strategies in the past. The method is evaluated on two large-scale pedestrian re-identification datasets where competitive results are achieved overall. In particular, we achieve better generalization on the test set compared to a network trained with triplet loss.
Tasks Metric Learning, Person Re-Identification
Published 2018-12-02
URL http://arxiv.org/abs/1812.00442v1
PDF http://arxiv.org/pdf/1812.00442v1.pdf
PWC https://paperswithcode.com/paper/deep-cosine-metric-learning-for-person-re
Repo https://github.com/seovchinnikov/cosine_softmax_keras
Framework tf
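
The re-parametrization described in the abstract can be illustrated with a minimal sketch: both the feature vector and the per-class weight vectors are L2-normalized, so each logit is a scaled cosine similarity. This is an illustration only, not the authors' implementation. The function names are hypothetical, and the fixed `scale` value is an assumption standing in for whatever temperature the actual model uses.

```python
import math


def l2_normalize(vector):
    """Return the vector scaled to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine_softmax(feature, class_weights, scale=10.0):
    """Softmax over scaled cosine similarities (scale is an assumed constant)."""
    unit_feature = l2_normalize(feature)
    logits = [
        scale * sum(a * b for a, b in zip(unit_feature, l2_normalize(w)))
        for w in class_weights
    ]
    # Subtract the maximum logit for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_similarity(a, b):
    """Similarity used for nearest-neighbor queries once the classifier is removed."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
```

At test time only `cosine_similarity` between embeddings would be needed, which mirrors the abstract's point that the classification layer can be stripped from the network.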

Batch DropBlock Network for Person Re-identification and Beyond

Title Batch DropBlock Network for Person Re-identification and Beyond
Authors Zuozhuo Dai, Mingqiang Chen, Xiaodong Gu, Siyu Zhu, Ping Tan
Abstract Since the person re-identification task often suffers from the problem of pose changes and occlusions, some attentive local features are often suppressed when training CNNs. In this paper, we propose the Batch DropBlock (BDB) Network which is a two branch network composed of a conventional ResNet-50 as the global branch and a feature dropping branch. The global branch encodes the global salient representations. Meanwhile, the feature dropping branch consists of an attentive feature learning module called Batch DropBlock, which randomly drops the same region of all input feature maps in a batch to reinforce the attentive feature learning of local regions. The network then concatenates features from both branches and provides a more comprehensive and spatially distributed feature representation. Albeit simple, our method achieves state-of-the-art on person re-identification and it is also applicable to general metric learning tasks. For instance, we achieve 76.4% Rank-1 accuracy on the CUHK03-Detect dataset and 83.0% Recall-1 score on the Stanford Online Products dataset, outperforming the existing works by a large margin (more than 6%).
Tasks Image Retrieval, Metric Learning, Person Re-Identification
Published 2018-11-17
URL https://arxiv.org/abs/1811.07130v2
PDF https://arxiv.org/pdf/1811.07130v2.pdf
PWC https://paperswithcode.com/paper/batch-feature-erasing-for-person-re
Repo https://github.com/zjjszj/batch-feature-erasing-network
Framework pytorch
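
The core idea in the abstract, dropping the same spatial region from every feature map in a batch, can be sketched in plain Python. This is a simplified illustration on nested lists shaped `[batch][channel][row][col]`, not the authors' code. The function name, the block-size arguments, and the choice to zero out the region are assumptions.

```python
import random


def batch_drop_block(batch, drop_h, drop_w, rng=random):
    """Zero one randomly placed drop_h x drop_w region, shared by the whole batch."""
    height = len(batch[0][0])
    width = len(batch[0][0][0])
    # Sample the block position once so every sample loses the same region.
    top = rng.randint(0, height - drop_h)
    left = rng.randint(0, width - drop_w)
    dropped = [
        [
            [
                [
                    0.0
                    if top <= r < top + drop_h and left <= c < left + drop_w
                    else value
                    for c, value in enumerate(row)
                ]
                for r, row in enumerate(channel)
            ]
            for channel in sample
        ]
        for sample in batch
    ]
    return dropped, (top, left)
```

Sampling the position once per batch, rather than once per sample, is what distinguishes this from per-sample dropout-style masking. The input is left unmodified and a new nested list is returned.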

Open Vocabulary Learning on Source Code with a Graph-Structured Cache

Title Open Vocabulary Learning on Source Code with a Graph-Structured Cache
Authors Milan Cvitkovic, Badal Singh, Anima Anandkumar
Abstract Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written using an open, rapidly changing vocabulary due to, e.g., the coinage of new variable and method names. Reasoning over such a vocabulary is not something for which most NLP methods are designed. We introduce a Graph-Structured Cache to address this problem; this cache contains a node for each new word the model encounters with edges connecting each word to its occurrences in the code. We find that combining this graph-structured cache strategy with recent Graph-Neural-Network-based models for supervised learning on code improves the models’ performance on a code completion task and a variable naming task (with over 100% relative improvement on the latter) at the cost of a moderate increase in computation time.
Tasks
Published 2018-10-18
URL https://arxiv.org/abs/1810.08305v2
PDF https://arxiv.org/pdf/1810.08305v2.pdf
PWC https://paperswithcode.com/paper/open-vocabulary-learning-on-source-code-with
Repo https://github.com/Microsoft/graph-based-code-modelling
Framework tf

Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks

Title Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks
Authors Amanpreet Singh, Tushar Jain, Sainbayar Sukhbaatar
Abstract Learning when to communicate and doing that effectively is essential in multi-agent tasks. Recent works show that continuous communication allows efficient training with back-propagation in multi-agent scenarios, but have been restricted to fully-cooperative tasks. In this paper, we present Individualized Controlled Continuous Communication Model (IC3Net) which has better training efficiency than a simple continuous communication model, and can be applied to semi-cooperative and competitive settings along with the cooperative settings. IC3Net controls continuous communication with a gating mechanism and uses individualized rewards for each agent to gain better performance and scalability while fixing credit assignment issues. Using a variety of tasks including StarCraft BroodWars explore and combat scenarios, we show that our network yields better performance and convergence rates than the baselines as the scale increases. Our results convey that IC3Net agents learn when to communicate based on the scenario and profitability.
Tasks Starcraft
Published 2018-12-23
URL http://arxiv.org/abs/1812.09755v1
PDF http://arxiv.org/pdf/1812.09755v1.pdf
PWC https://paperswithcode.com/paper/learning-when-to-communicate-at-scale-in
Repo https://github.com/apsdehal/gym-starcraft
Framework none

Learning a Discriminative Feature Network for Semantic Segmentation

Title Learning a Discriminative Feature Network for Semantic Segmentation
Authors Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang
Abstract Most existing methods of semantic segmentation still suffer from two aspects of challenges: intra-class inconsistency and inter-class indistinction. To tackle these two problems, we propose a Discriminative Feature Network (DFN), which contains two sub-networks: Smooth Network and Border Network. Specifically, to handle the intra-class inconsistency problem, we specially design a Smooth Network with Channel Attention Block and global average pooling to select the more discriminative features. Furthermore, we propose a Border Network to make the bilateral features of boundary distinguishable with deep semantic boundary supervision. Based on our proposed DFN, we achieve state-of-the-art performance 86.2% mean IOU on PASCAL VOC 2012 and 80.3% mean IOU on Cityscapes dataset.
Tasks Semantic Segmentation
Published 2018-04-25
URL http://arxiv.org/abs/1804.09337v1
PDF http://arxiv.org/pdf/1804.09337v1.pdf
PWC https://paperswithcode.com/paper/learning-a-discriminative-feature-network-for
Repo https://github.com/ycszen/TorchSeg
Framework pytorch

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization

Title Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization
Authors Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini
Abstract As application demands for zeroth-order (gradient-free) optimization accelerate, the need for variance reduced and faster converging approaches is also intensifying. This paper addresses these challenges by presenting: a) a comprehensive theoretical analysis of variance reduced zeroth-order (ZO) optimization, b) a novel variance reduced ZO algorithm, called ZO-SVRG, and c) an experimental evaluation of our approach in the context of two compelling applications, black-box chemical material classification and generation of adversarial examples from black-box deep neural network models. Our theoretical analysis uncovers an essential difficulty in the analysis of ZO-SVRG: the unbiased assumption on gradient estimates no longer holds. We prove that compared to its first-order counterpart, ZO-SVRG with a two-point random gradient estimator could suffer an additional error of order $O(1/b)$, where $b$ is the mini-batch size. To mitigate this error, we propose two accelerated versions of ZO-SVRG utilizing variance reduced gradient estimators, which achieve the best rate known for ZO stochastic optimization (in terms of iterations). Our extensive experimental results show that our approaches outperform other state-of-the-art ZO algorithms, and strike a balance between the convergence rate and the function query complexity.
Tasks Material Classification, Stochastic Optimization
Published 2018-05-25
URL http://arxiv.org/abs/1805.10367v2
PDF http://arxiv.org/pdf/1805.10367v2.pdf
PWC https://paperswithcode.com/paper/zeroth-order-stochastic-variance-reduction
Repo https://github.com/IBM/ZOSVRG-BlackBox-Adv
Framework none
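
The two-point random gradient estimator mentioned in the abstract can be sketched as follows. This is a minimal illustration of one commonly used forward-difference form, scaled by the dimension `d` with directions drawn uniformly from the unit sphere. It is an assumption that this matches the paper's exact estimator, and the sketch covers only the gradient estimate, not the ZO-SVRG algorithm itself. All function and parameter names are hypothetical.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def zo_gradient_estimate(f, x, mu=1e-3, num_directions=1, rng=random):
    """Estimate the gradient of f at x using function values only.

    Each direction u contributes (d / mu) * (f(x + mu * u) - f(x)) * u,
    and the contributions are averaged over num_directions draws.
    """
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(num_directions):
        u = random_unit_vector(d, rng)
        shifted = [xi + mu * ui for xi, ui in zip(x, u)]
        coeff = d * (f(shifted) - fx) / mu
        for i in range(d):
            grad[i] += coeff * u[i] / num_directions
    return grad
```

With a single direction the estimate is noisy, which is the kind of variance the abstract's variance-reduced estimators are meant to address. Averaging over more directions lowers the variance at the cost of more function queries.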

Selfless Sequential Learning

Title Selfless Sequential Learning
Authors Rahaf Aljundi, Marcus Rohrbach, Tinne Tuytelaars
Abstract Sequential learning, also called lifelong learning, studies the problem of learning tasks in a sequence with access restricted to only the data of the current task. In this paper we look at a scenario with fixed model capacity, and postulate that the learning process should not be selfish, i.e. it should account for future tasks to be added and thus leave enough capacity for them. To achieve Selfless Sequential Learning we study different regularization strategies and activation functions. We find that imposing sparsity at the level of the representation (i.e. neuron activations) is more beneficial for sequential learning than encouraging parameter sparsity. In particular, we propose a novel regularizer that encourages representation sparsity by means of neural inhibition. It results in few active neurons which in turn leaves more free neurons to be utilized by upcoming tasks. As neural inhibition over an entire layer can be too drastic, especially for complex tasks requiring strong representations, our regularizer only inhibits other neurons in a local neighbourhood, inspired by lateral inhibition processes in the brain. We combine our novel regularizer with state-of-the-art lifelong learning methods that penalize changes to important previously learned parts of the network. We show that our new regularizer leads to increased sparsity which translates into consistent performance improvement over alternative regularizers we studied on diverse datasets.
Tasks
Published 2018-06-14
URL http://arxiv.org/abs/1806.05421v5
PDF http://arxiv.org/pdf/1806.05421v5.pdf
PWC https://paperswithcode.com/paper/selfless-sequential-learning
Repo https://github.com/rahafaljundi/Selfless-Sequential-Learning
Framework pytorch

An Ontology-Based Dialogue Management System for Banking and Finance Dialogue Systems

Title An Ontology-Based Dialogue Management System for Banking and Finance Dialogue Systems
Authors Duygu Altinok
Abstract Keeping the dialogue state in dialogue systems is a notoriously difficult task. We introduce an ontology-based dialogue manager (OntoDM), a dialogue manager that keeps the state of the conversation, provides a basis for anaphora resolution and drives the conversation via domain ontologies. The banking and finance area promises great potential for disambiguating the context via a rich set of products and specificity of proper nouns, named entities and verbs. We used ontologies both as a knowledge base and a basis for the dialogue manager; the knowledge base component and dialogue manager components coalesce in a sense. Domain knowledge is used to track Entities of Interest, i.e. nodes (classes) of the ontology which happen to be products and services. In this way we also introduced conversation memory and attention in a sense. We finely blended linguistic methods, domain-driven keyword ranking and domain ontologies to create ways of domain-driven conversation. The proposed framework is used in our in-house German language banking and finance chatbots. General challenges of German language processing and finance-banking domain chatbot language models and lexicons are also introduced. This work is still in progress, hence no success metrics have been introduced yet.
Tasks Chatbot, Dialogue Management
Published 2018-04-13
URL http://arxiv.org/abs/1804.04838v1
PDF http://arxiv.org/pdf/1804.04838v1.pdf
PWC https://paperswithcode.com/paper/an-ontology-based-dialogue-management-system
Repo https://github.com/TimKettenacker/puffin
Framework none