Paper Group ANR 1106
Computationally Efficient Approaches for Image Style Transfer
Title | Computationally Efficient Approaches for Image Style Transfer |
Authors | Ram Krishna Pandey, Samarjit Karmakar, A G Ramakrishnan |
Abstract | In this work, we have investigated various style transfer approaches and (i) examined how the stylized reconstruction changes with the change of loss function and (ii) provided a computationally efficient solution for the same. We have used elegant techniques like depth-wise separable convolution in place of convolution and nearest neighbor interpolation in place of transposed convolution. Further, we have also added multiple interpolations in place of transposed convolution. The results obtained are perceptually similar in quality, while being computationally very efficient. The decrease in the computational complexity of our architecture is validated by the decrease in the testing time by 26.1%, 39.1%, and 57.1%, respectively. |
Tasks | Style Transfer |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05927v1 |
PDF | http://arxiv.org/pdf/1807.05927v1.pdf |
PWC | https://paperswithcode.com/paper/computationally-efficient-approaches-for |
Repo | |
Framework | |
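To make the efficiency argument in the abstract above concrete, here is a minimal sketch (not the authors' code) that compares weight counts for a standard convolution and a depth-wise separable one, and shows nearest-neighbor upsampling on a tiny grid. The function names, the 3x3 kernel, and the channel sizes are illustrative assumptions; biases are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise separable convolution (biases ignored):
    one kernel x kernel filter per input channel, then a 1x1 pointwise mix."""
    return kernel * kernel * c_in + c_in * c_out


def nearest_upsample(image, scale):
    """Nearest-neighbor upsampling of a 2-D list by an integer factor.
    It has no learnable weights, unlike a transposed convolution."""
    return [
        [pixel for pixel in row for _ in range(scale)]
        for row in image
        for _ in range(scale)
    ]


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
    print(conv_params(3, 64, 128), separable_conv_params(3, 64, 128))
    print(nearest_upsample([[1, 2], [3, 4]], 2))
```

For this hypothetical layer the standard convolution has 73,728 weights and the separable one has 8,768. These are weight counts only, not the testing-time reductions reported in the abstract.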
Revisiting Adversarial Risk
Title | Revisiting Adversarial Risk |
Authors | Arun Sai Suggala, Adarsh Prasad, Vaishnavh Nagarajan, Pradeep Ravikumar |
Abstract | Recent works on adversarial perturbations show that there is an inherent trade-off between standard test accuracy and adversarial accuracy. Specifically, they show that no classifier can simultaneously be robust to adversarial perturbations and achieve high standard test accuracy. However, this is contrary to the standard notion that on tasks such as image classification, humans are robust classifiers with low error rate. In this work, we show that the main reason behind this confusion is the inexact definition of adversarial perturbation that is used in the literature. To fix this issue, we propose a slight, yet important modification to the existing definition of adversarial perturbation. Based on the modified definition, we show that there is no trade-off between adversarial and standard accuracies; there exist classifiers that are robust and achieve high standard accuracy. We further study several properties of this new definition of adversarial risk and its relation to the existing definition. |
Tasks | Image Classification |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02924v5 |
PDF | http://arxiv.org/pdf/1806.02924v5.pdf |
PWC | https://paperswithcode.com/paper/revisiting-adversarial-risk |
Repo | |
Framework | |
IMS at the PolEval 2018: A Bulky Ensemble Depedency Parser meets 12 Simple Rules for Predicting Enhanced Dependencies in Polish
Title | IMS at the PolEval 2018: A Bulky Ensemble Depedency Parser meets 12 Simple Rules for Predicting Enhanced Dependencies in Polish |
Authors | Agnieszka Falenska, Anders Björkelund, Xiang Yu, Jonas Kuhn |
Abstract | This paper presents the IMS contribution to the PolEval 2018 Shared Task. We submitted systems for both of the Subtasks of Task 1. In Subtask (A), which was about dependency parsing, we used our ensemble system from the CoNLL 2017 UD Shared Task. The system first preprocesses the sentences with a CRF POS/morphological tagger and predicts supertags with a neural tagger. Then, it employs multiple instances of three different parsers and merges their outputs by applying blending. The system achieved the second place out of four participating teams. In this paper we show which components of the system were the most responsible for its final performance. The goal of Subtask (B) was to predict enhanced graphs. Our approach consisted of two steps: parsing the sentences with our ensemble system from Subtask (A), and applying 12 simple rules to obtain the final dependency graphs. The rules introduce additional enhanced arcs only for tokens with “conj” heads (conjuncts). They do not predict semantic relations at all. The system ranked first out of three participating teams. In this paper we show examples of rules we designed and analyze the relation between the quality of automatically parsed trees and the accuracy of the enhanced graphs. |
Tasks | Dependency Parsing |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03036v1 |
PDF | http://arxiv.org/pdf/1811.03036v1.pdf |
PWC | https://paperswithcode.com/paper/ims-at-the-poleval-2018-a-bulky-ensemble |
Repo | |
Framework | |
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition
Title | Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition |
Authors | Jianwei Yang, Jiasen Lu, Stefan Lee, Dhruv Batra, Devi Parikh |
Abstract | In an open-world setting, it is inevitable that an intelligent agent (e.g., a robot) will encounter visual objects, attributes or relationships it does not recognize. In this work, we develop an agent empowered with visual curiosity, i.e., the ability to ask questions to an Oracle (e.g., human) about the contents in images (e.g., What is the object on the left side of the red cube?) and build a visual recognition model based on the answers received (e.g., Cylinder). In order to do this, the agent must (1) understand what it recognizes and what it does not, (2) formulate a valid, unambiguous and informative language query (a question) to ask the Oracle, (3) derive the parameters of visual classifiers from the Oracle response and (4) leverage the updated visual classifiers to ask more clarified questions. Specifically, we propose a novel framework and formulate the learning of visual curiosity as a reinforcement learning problem. In this framework, all components of our agent, visual recognition module (to see), question generation policy (to ask), answer digestion module (to understand) and graph memory module (to memorize), are learned entirely end-to-end to maximize the reward derived from the scene graph obtained by the agent as a consequence of the dialog with the Oracle. Importantly, the question generation policy is disentangled from the visual recognition system and specifics of the environment. Consequently, we demonstrate a sort of double generalization. Our question generation policy generalizes to new environments and a new pair of eyes, i.e., a new visual system. Trained on a synthetic dataset, our results show that our agent learns new visual concepts significantly faster than several heuristic baselines, even when tested on synthetic environments with novel objects, as well as in a realistic environment. |
Tasks | Question Generation |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00912v1 |
PDF | http://arxiv.org/pdf/1810.00912v1.pdf |
PWC | https://paperswithcode.com/paper/visual-curiosity-learning-to-ask-questions-to |
Repo | |
Framework | |
Semantic Frame Parsing for Information Extraction : the CALOR corpus
Title | Semantic Frame Parsing for Information Extraction : the CALOR corpus |
Authors | Gabriel Marzinotto, Jeremy Auguste, Frederic Bechet, Géraldine Damnati, Alexis Nasr |
Abstract | This paper presents a publicly available corpus of French encyclopedic history texts annotated according to the Berkeley FrameNet formalism. The main difference in our approach compared to previous works on semantic parsing with FrameNet is that we are not interested here in full text parsing but rather in partial parsing. The goal is to select from the FrameNet resources the minimal set of frames that are going to be useful for the applicative framework targeted, in our case Information Extraction from encyclopedic documents. Such an approach leverages the manual annotation of larger corpora than those obtained through full text parsing and therefore opens the door to alternative methods for Frame parsing than those used so far on the FrameNet 1.5 benchmark corpus. The approaches compared in this study rely on an integrated sequence labeling model which jointly optimizes frame identification and semantic role segmentation and identification. The models compared are CRFs and multitask bi-LSTMs. |
Tasks | Semantic Parsing |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.08039v1 |
PDF | http://arxiv.org/pdf/1812.08039v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-frame-parsing-for-information |
Repo | |
Framework | |
Reinforced Co-Training
Title | Reinforced Co-Training |
Authors | Jiawei Wu, Lei Li, William Yang Wang |
Abstract | Co-training is a popular semi-supervised learning framework to utilize a large amount of unlabeled data in addition to a small labeled set. Co-training methods exploit predicted labels on the unlabeled data and select samples based on prediction confidence to augment the training. However, the selection of samples in existing co-training methods is based on a predetermined policy, which ignores the sampling bias between the unlabeled and the labeled subsets, and fails to explore the data space. In this paper, we propose a novel method, Reinforced Co-Training, to select high-quality unlabeled samples to better co-train on. More specifically, our approach uses Q-learning to learn a data selection policy with a small labeled dataset, and then exploits this policy to train the co-training classifiers automatically. Experimental results on clickbait detection and generic text classification tasks demonstrate that our proposed method can obtain more accurate text classification results. |
Tasks | Clickbait Detection, Q-Learning, Text Classification |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06035v1 |
PDF | http://arxiv.org/pdf/1804.06035v1.pdf |
PWC | https://paperswithcode.com/paper/reinforced-co-training |
Repo | |
Framework | |
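The abstract above contrasts a learned selection policy with the predetermined, confidence-based selection used in existing co-training methods. As a rough illustration of that baseline only (not the Q-learning policy the paper proposes), the sketch below keeps unlabeled samples whose top predicted class probability reaches a fixed threshold. The function name, the 0.9 threshold, and the probability lists are hypothetical.

```python
def select_confident(predictions, threshold=0.9):
    """Return (sample_index, predicted_label) pairs for unlabeled samples
    whose highest class probability is at least `threshold`.

    `predictions` is a list of per-sample class-probability lists, as one
    co-training classifier might produce them; the values are made up here.
    """
    selected = []
    for idx, probs in enumerate(predictions):
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((idx, label))
    return selected


if __name__ == "__main__":
    unlabeled_predictions = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
    print(select_confident(unlabeled_predictions))
```

In classic co-training the pseudo-labeled samples selected by one classifier are added to the training set of the other. The fixed threshold here stands in for the "predetermined policy" the abstract criticizes.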
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Title | Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations |
Authors | Xiaoqin Zhang, Huimin Ma |
Abstract | Pretraining with expert demonstrations has been found useful in speeding up the training process of deep reinforcement learning algorithms since less online simulation data is required. Some approaches use supervised learning to speed up the process of feature learning, while others pretrain the policies by imitating expert demonstrations. However, these methods are unstable and not suitable for actor-critic reinforcement learning algorithms. Also, some existing methods rely on the global optimum assumption, which is not true in most scenarios. In this paper, we employ expert demonstrations in an actor-critic reinforcement learning framework, and meanwhile ensure that the performance is not affected by the fact that expert demonstrations are not globally optimal. We theoretically derive a method for computing policy gradients and value estimators with only expert demonstrations. Our method is theoretically plausible for actor-critic reinforcement learning algorithms that pretrain both policy and value functions. We apply our method to two of the typical actor-critic reinforcement learning algorithms, DDPG and ACER, and demonstrate with experiments that our method not only outperforms the RL algorithms without a pretraining process, but also is more simulation efficient. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10459v2 |
PDF | http://arxiv.org/pdf/1801.10459v2.pdf |
PWC | https://paperswithcode.com/paper/pretraining-deep-actor-critic-reinforcement |
Repo | |
Framework | |
Computer Assisted Localization of a Heart Arrhythmia
Title | Computer Assisted Localization of a Heart Arrhythmia |
Authors | Chris Vogl, Peng Zheng, Stephen P. Seslar, Aleksandr Y. Aravkin |
Abstract | We consider the problem of locating a point-source heart arrhythmia using data from a standard diagnostic procedure, where a reference catheter is placed in the heart, and arrival times from a second diagnostic catheter are recorded as the diagnostic catheter moves around within the heart. We model this situation as a nonconvex feasibility problem, where given a set of arrival times, we look for a source location that is consistent with the available data. We develop a new optimization approach and fast algorithm to obtain online proposals for the next location to suggest to the operator as she collects data. We validate the procedure using a Monte Carlo simulation based on patients’ electrophysiological data. The proposed procedure robustly and quickly locates the source of arrhythmias without any prior knowledge of heart anatomy. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03091v1 |
PDF | http://arxiv.org/pdf/1807.03091v1.pdf |
PWC | https://paperswithcode.com/paper/computer-assisted-localization-of-a-heart |
Repo | |
Framework | |
Counterfactual Learning from Human Proofreading Feedback for Semantic Parsing
Title | Counterfactual Learning from Human Proofreading Feedback for Semantic Parsing |
Authors | Carolin Lawrence, Stefan Riezler |
Abstract | In semantic parsing for question-answering, it is often too expensive to collect gold parses or even gold answers as supervision signals. We propose to convert model outputs into a set of human-understandable statements which allow non-expert users to act as proofreaders, providing error markings as learning signals to the parser. Because model outputs were suggested by a historic system, we operate in a counterfactual, or off-policy, learning setup. We introduce new estimators which can effectively leverage the given feedback and which avoid known degeneracies in counterfactual learning, while still being applicable to stochastic gradient optimization for neural semantic parsing. Furthermore, we discuss how our feedback collection method can be seamlessly integrated into deployed virtual personal assistants that embed a semantic parser. Our work is the first to show that semantic parsers can be improved significantly by counterfactual learning from logged human feedback data. |
Tasks | Question Answering, Semantic Parsing |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12239v1 |
PDF | http://arxiv.org/pdf/1811.12239v1.pdf |
PWC | https://paperswithcode.com/paper/counterfactual-learning-from-human |
Repo | |
Framework | |
Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks
Title | Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks |
Authors | Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos |
Abstract | We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events. The framework leverages the power of convolutional recurrent neural network architectures; convolutional layers learn effective features over which higher recurrent layers perform sequential modelling. Furthermore, the output layer is designed to handle arbitrary degrees of event overlap. At each time step in the recurrent output sequence, an output triple is dedicated to each event category of interest to jointly model event occurrence and temporal boundaries. That is, the network jointly determines whether an event of this category occurs, and when it occurs, by estimating onset and offset positions at each recurrent time step. We then introduce three sequential losses for network training: multi-label classification loss, distance estimation loss, and confidence loss. We demonstrate good generalization on two datasets: ITC-Irst for isolated audio event detection, and TUT-SED-Synthetic-2016 for overlapping audio event detection. |
Tasks | Multi-Label Classification |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.01092v2 |
PDF | http://arxiv.org/pdf/1811.01092v2.pdf |
PWC | https://paperswithcode.com/paper/unifying-isolated-and-overlapping-audio-event |
Repo | |
Framework | |
Mixed batches and symmetric discriminators for GAN training
Title | Mixed batches and symmetric discriminators for GAN training |
Authors | Thomas Lucas, Corentin Tallec, Jakob Verbeek, Yann Ollivier |
Abstract | Generative adversarial networks (GANs) are powerful generative models based on providing feedback to a generative network via a discriminator network. However, the discriminator usually assesses individual samples. This prevents the discriminator from accessing global distributional statistics of generated samples, and often leads to mode dropping: the generator models only part of the target distribution. We propose to feed the discriminator with mixed batches of true and fake samples, and train it to predict the ratio of true samples in the batch. The latter score does not depend on the order of samples in a batch. Rather than learning this invariance, we introduce a generic permutation-invariant discriminator architecture. This architecture is provably a universal approximator of all symmetric functions. Experimentally, our approach reduces mode collapse in GANs on two synthetic datasets, and obtains good results on the CIFAR10 and CelebA datasets, both qualitatively and quantitatively. |
Tasks | |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07185v1 |
PDF | http://arxiv.org/pdf/1806.07185v1.pdf |
PWC | https://paperswithcode.com/paper/mixed-batches-and-symmetric-discriminators |
Repo | |
Framework | |
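The two ideas in the abstract above, mixed batches labeled with the ratio of true samples and a discriminator score that ignores sample order, can be sketched in a few lines. This is an illustrative toy on scalar samples, not the paper's architecture: the per-sample feature `tanh(w * x + b)`, the mean pooling, and all names are assumptions chosen to make the invariance easy to see.

```python
import math
import random


def make_mixed_batch(real, fake, batch_size, rng):
    """Draw a batch mixing real and fake samples; the training target is
    the fraction of real samples in the batch."""
    n_real = rng.randint(0, batch_size)
    batch = rng.sample(real, n_real) + rng.sample(fake, batch_size - n_real)
    rng.shuffle(batch)
    return batch, n_real / batch_size


def invariant_score(batch, w=1.0, b=0.0):
    """Toy permutation-invariant score: apply the same feature map to each
    sample, pool with a mean, then squash to (0, 1).
    math.fsum is exactly rounded, so the result does not depend on order."""
    pooled = math.fsum(math.tanh(w * x + b) for x in batch) / len(batch)
    return 1.0 / (1.0 + math.exp(-pooled))


if __name__ == "__main__":
    rng = random.Random(0)
    real = [float(i) for i in range(1, 11)]    # hypothetical "true" samples
    fake = [-float(i) for i in range(1, 11)]   # hypothetical generated samples
    batch, ratio = make_mixed_batch(real, fake, 8, rng)
    print(ratio, invariant_score(batch) == invariant_score(batch[::-1]))
```

Because pooling happens before the final nonlinearity, any reordering of the batch gives the same score, so the invariance is built in rather than learned. A trainable version would fit `w` and `b` (or a deeper feature map) so that the score matches the ratio target.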
A Multi-layer LSTM-based Approach for Robot Command Interaction Modeling
Title | A Multi-layer LSTM-based Approach for Robot Command Interaction Modeling |
Authors | Martino Mensio, Emanuele Bastianelli, Ilaria Tiddi, Giuseppe Rizzo |
Abstract | As the first robotic platforms slowly approach our everyday life, we can imagine a near future where service robots will be easily accessible by non-expert users through vocal interfaces. The capability of managing natural language would indeed speed up the process of integrating such platforms into ordinary life. Semantic parsing is a fundamental task of the Natural Language Understanding process, as it allows extracting the meaning of a user utterance to be used by a machine. In this paper, we present a preliminary study to semantically parse user vocal commands for a House Service robot, using a multi-layer Long-Short Term Memory neural network with attention mechanism. The system is trained on the Human Robot Interaction Corpus, and it is preliminarily compared with previous approaches. |
Tasks | Semantic Parsing |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05242v1 |
PDF | http://arxiv.org/pdf/1811.05242v1.pdf |
PWC | https://paperswithcode.com/paper/a-multi-layer-lstm-based-approach-for-robot |
Repo | |
Framework | |
An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method
Title | An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method |
Authors | Li Shen, Peng Sun, Yitong Wang, Wei Liu, Tong Zhang |
Abstract | We propose a novel algorithmic framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-gradient (VMOR-HPE) method with a global convergence guarantee for the maximal monotone operator inclusion problem. Its iteration complexities and local linear convergence rate are provided, which theoretically demonstrate that a large over-relaxed step-size contributes to accelerating the proposed VMOR-HPE as a byproduct. Specifically, we find that a large class of primal and primal-dual operator splitting algorithms are all special cases of VMOR-HPE. Hence, the proposed framework offers a new insight into these operator splitting algorithms. In addition, we apply VMOR-HPE to the Karush-Kuhn-Tucker (KKT) generalized equation of linear equality constrained multi-block composite convex optimization, yielding a new algorithm, namely nonsymmetric Proximal Alternating Direction Method of Multipliers with a preconditioned Extra-gradient step in which the preconditioned metric is generated by a blockwise Barzilai-Borwein line search technique (PADMM-EBB). We also establish iteration complexities of PADMM-EBB in terms of the KKT residual. Finally, we apply PADMM-EBB to handle the nonnegative dual graph regularized low-rank representation problem. Promising results on synthetic and real datasets corroborate the efficacy of PADMM-EBB. |
Tasks | |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06137v2 |
PDF | http://arxiv.org/pdf/1805.06137v2.pdf |
PWC | https://paperswithcode.com/paper/180506137 |
Repo | |
Framework | |
Quantile Regression Under Memory Constraint
Title | Quantile Regression Under Memory Constraint |
Authors | Xi Chen, Weidong Liu, Yichen Zhang |
Abstract | This paper studies the inference problem in quantile regression (QR) for a large sample size $n$ but under a limited memory constraint, where the memory can only store a small batch of data of size $m$. A natural method is the naive divide-and-conquer approach, which splits data into batches of size $m$, computes the local QR estimator for each batch, and then aggregates the estimators via averaging. However, this method only works when $n=o(m^2)$ and is computationally expensive. This paper proposes a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregations. Theoretically, as long as $n$ grows polynomially in $m$, we establish the asymptotic normality for the obtained estimator and show that our estimator with only a few rounds of aggregations achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the case that the dimensionality $p$ goes to infinity. The proposed method can also be applied to address the QR problem under distributed computing environment (e.g., in a large-scale sensor network) or for real-time streaming data. |
Tasks | |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08264v1 |
PDF | http://arxiv.org/pdf/1810.08264v1.pdf |
PWC | https://paperswithcode.com/paper/quantile-regression-under-memory-constraint |
Repo | |
Framework | |
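The naive divide-and-conquer baseline described in the abstract above (split into batches of size $m$, estimate locally, average) can be sketched for the simplest possible case. The assumption here, which is not from the paper, is an intercept-only model, where the local QR estimator reduces to a sample quantile of the batch. This illustrates the baseline only, not the multi-round refinement the paper proposes, and the function names are hypothetical.

```python
import math


def sample_quantile(values, tau):
    """Sample tau-quantile: the ceil(tau * n)-th smallest value.
    For an intercept-only model this value minimizes the QR check loss."""
    ordered = sorted(values)
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def naive_dc_quantile(data, m, tau):
    """Naive divide-and-conquer: only one batch of size m is processed at a
    time, and the local estimates are aggregated by plain averaging.
    The last batch may be smaller than m if len(data) is not a multiple."""
    batches = [data[i:i + m] for i in range(0, len(data), m)]
    local_estimates = [sample_quantile(batch, tau) for batch in batches]
    return sum(local_estimates) / len(local_estimates)


if __name__ == "__main__":
    data = list(range(1, 13))          # toy data: 1, 2, ..., 12
    print(naive_dc_quantile(data, m=4, tau=0.5))
    print(sample_quantile(data, 0.5))  # estimator computed on all the data
```

On this toy input the averaged local medians (2, 6, and 10) happen to equal the full-sample median. In general the two differ, which is the limitation the abstract's $n=o(m^2)$ condition refers to.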
TET-GAN: Text Effects Transfer via Stylization and Destylization
Title | TET-GAN: Text Effects Transfer via Stylization and Destylization |
Authors | Shuai Yang, Jiaying Liu, Wenjing Wang, Zongming Guo |
Abstract | Text effects transfer technology automatically makes the text dramatically more impressive. However, previous style transfer methods either study the model for general style, which cannot handle the highly-structured text effects along the glyph, or require manual design of subtle matching criteria for text effects. In this paper, we focus on the use of the powerful representation abilities of deep neural features for text effects transfer. For this purpose, we propose a novel Texture Effects Transfer GAN (TET-GAN), which consists of a stylization subnetwork and a destylization subnetwork. The key idea is to train our network to accomplish both the objective of style transfer and style removal, so that it can learn to disentangle and recombine the content and style features of text effects images. To support the training of our network, we propose a new text effects dataset with as many as 64 professionally designed styles on 837 characters. We show that the disentangled feature representations enable us to transfer or remove all these styles on arbitrary glyphs using one network. Furthermore, the flexible network design empowers TET-GAN to efficiently extend to a new text style via one-shot learning where only one example is required. We demonstrate the superiority of the proposed method in generating high-quality stylized text over the state-of-the-art methods. |
Tasks | One-Shot Learning, Style Transfer, Text Effects Transfer |
Published | 2018-12-16 |
URL | http://arxiv.org/abs/1812.06384v2 |
PDF | http://arxiv.org/pdf/1812.06384v2.pdf |
PWC | https://paperswithcode.com/paper/tet-gan-text-effects-transfer-via-stylization |
Repo | |
Framework | |