January 25, 2020

2980 words 14 mins read

Paper Group ANR 1734

Learning to Generalize One Sample at a Time with Self-Supervision

Title Learning to Generalize One Sample at a Time with Self-Supervision
Authors Antonio D’Innocente, Silvia Bucci, Barbara Caputo, Tatiana Tommasi
Abstract Although deep networks have significantly increased the performance of visual recognition methods, it is still challenging to achieve the robustness across visual domains that is necessary for real-world applications. To tackle this issue, research on domain adaptation and generalization has flourished over the last decade. An important aspect to consider when assessing the work done in the literature so far is the amount of data annotation necessary for training each approach, both at the source and target level. In this paper we argue that the data annotation burden should be minimal, as annotation is costly. Hence, we propose to use self-supervised learning to achieve domain generalization and adaptation. We consider learning regularities from non-annotated data as an auxiliary task, and cast the problem within a principled Auxiliary Learning framework. Moreover, we suggest further exploiting the ability to learn about visual domains from non-annotated images by learning from target data at test time, as data are presented to the algorithm one sample at a time. Results on three different scenarios confirm the value of our approach. (A minimal sketch of the joint training objective follows this entry.)
Tasks Auxiliary Learning, Domain Adaptation, Domain Generalization
Published 2019-10-09
URL https://arxiv.org/abs/1910.03915v3
PDF https://arxiv.org/pdf/1910.03915v3.pdf
PWC https://paperswithcode.com/paper/learning-to-generalize-one-sample-at-a-time
Repo
Framework
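
The joint objective described in the abstract above can be illustrated with a short sketch: a shared backbone feeds both an object-classification head and a self-supervised rotation-prediction head, and the two cross-entropy losses are summed. This is a generic illustration under assumed choices (rotation prediction as the auxiliary task, a placeholder backbone and loss weight `alpha`), not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadNet(nn.Module):
    """Shared backbone with an object-classification head and a
    self-supervised rotation-prediction head (0/90/180/270 degrees)."""
    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone                    # any feature extractor
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.rot_head = nn.Linear(feat_dim, 4)

    def forward(self, x):
        f = self.backbone(x)
        return self.cls_head(f), self.rot_head(f)

def rotate_batch(x):
    """Rotate each image (C, H, W) by a random multiple of 90 degrees."""
    k = torch.randint(0, 4, (x.size(0),))
    rotated = torch.stack([torch.rot90(img, int(ki), dims=(1, 2))
                           for img, ki in zip(x, k)])
    return rotated, k

def training_step(model, x, y, alpha=0.5):
    """Supervised loss on labeled source data plus the self-supervised
    auxiliary loss; alpha is an illustrative trade-off weight."""
    logits, _ = model(x)
    x_rot, rot_labels = rotate_batch(x)
    _, rot_logits = model(x_rot)
    return F.cross_entropy(logits, y) + alpha * F.cross_entropy(rot_logits, rot_labels)
```

At test time, the same auxiliary loss can be minimized on each unlabeled target sample before predicting, which is the one-sample-at-a-time adaptation the abstract refers to.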

Explanations can be manipulated and geometry is to blame

Title Explanations can be manipulated and geometry is to blame
Authors Ann-Kathrin Dombrowski, Maximilian Alber, Christopher J. Anders, Marcel Ackermann, Klaus-Robert Müller, Pan Kessel
Abstract Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be manipulated arbitrarily by applying visually almost imperceptible perturbations to the input that keep the network’s output approximately constant. We establish theoretically that this phenomenon can be related to certain geometrical properties of neural networks. This allows us to derive an upper bound on the susceptibility of explanations to manipulations. Based on this result, we propose effective mechanisms to enhance the robustness of explanations. (A generic form of the manipulation objective is sketched after this entry.)
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.07983v2
PDF https://arxiv.org/pdf/1906.07983v2.pdf
PWC https://paperswithcode.com/paper/explanations-can-be-manipulated-and-geometry
Repo
Framework
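
The manipulation described in the abstract can be written as a generic optimization problem; the symbols and the trade-off weight $\gamma$ below are notation chosen for this sketch rather than taken from the paper.

```latex
% Find a perturbed input x_adv close to x whose explanation h(x_adv)
% matches a chosen target map h_t while the network output f stays constant:
\min_{x_{\mathrm{adv}}} \;
  \bigl\| h(x_{\mathrm{adv}}) - h_t \bigr\|^2
  \;+\; \gamma \, \bigl\| f(x_{\mathrm{adv}}) - f(x) \bigr\|^2,
\qquad \text{subject to } \|x_{\mathrm{adv}} - x\| \text{ being small.}
```

The geometric analysis in the paper relates how susceptible explanations are to this kind of attack to curvature properties of the network, which is what motivates the robustness mechanisms the authors propose.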

Using Structured Representation and Data: A Hybrid Model for Negation and Sentiment in Customer Service Conversations

Title Using Structured Representation and Data: A Hybrid Model for Negation and Sentiment in Customer Service Conversations
Authors Amita Misra, Mansurul Bhuiyan, Jalal Mahmud, Saurabh Tripathy
Abstract Twitter customer service interactions have recently emerged as an effective platform to respond and engage with customers. In this work, we explore the role of negation in customer service interactions, particularly as applied to sentiment analysis. We define rules to identify true negation cues and scope that are better suited to conversational data than existing rules built for general review data. Using semantic knowledge and syntactic structure from constituency parse trees, we propose an algorithm for scope detection that performs comparably to a state-of-the-art BiLSTM. We further investigate the effect of negation scope detection on the sentiment prediction task for customer service conversation data, using both a traditional SVM and a neural network. We propose an antonym-dictionary-based method for negation, applied to a combined CNN-LSTM model for sentiment analysis. Experimental results show that the antonym-based method outperforms the previous lexicon-based and neural network methods. (A toy illustration of the antonym substitution follows this entry.)
Tasks Sentiment Analysis
Published 2019-06-11
URL https://arxiv.org/abs/1906.04706v1
PDF https://arxiv.org/pdf/1906.04706v1.pdf
PWC https://paperswithcode.com/paper/using-structured-representation-and-data-a-1
Repo
Framework
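
The antonym-dictionary idea can be illustrated with a toy example: tokens inside a detected negation scope are replaced by antonyms when one is available, and the cue itself is dropped before the text is passed to the sentiment model. The dictionary and the scope-detection stub below are placeholders; the paper derives scopes from constituency parse trees and semantic knowledge.

```python
# Toy illustration only: the dictionary and the scope detector are
# placeholders, not the paper's actual resources or rules.
ANTONYMS = {"good": "bad", "happy": "unhappy", "helpful": "unhelpful"}
NEGATION_CUES = {"not", "never", "no", "n't"}

def detect_scope(tokens, cue_idx):
    """Stub: treat everything after the cue up to punctuation as the scope."""
    scope = []
    for i in range(cue_idx + 1, len(tokens)):
        if tokens[i] in {".", ",", "!", "?"}:
            break
        scope.append(i)
    return scope

def rewrite_negation(tokens):
    """Replace in-scope tokens with antonyms and drop the negation cue."""
    drop = set()
    for i, tok in enumerate(tokens):
        if tok in NEGATION_CUES:
            drop.add(i)
            for j in detect_scope(tokens, i):
                tokens[j] = ANTONYMS.get(tokens[j], tokens[j])
    return [t for i, t in enumerate(tokens) if i not in drop]

print(rewrite_negation("the agent was not helpful at all .".split()))
# -> ['the', 'agent', 'was', 'unhelpful', 'at', 'all', '.']
```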

MSDF: A Deep Reinforcement Learning Framework for Service Function Chain Migration

Title MSDF: A Deep Reinforcement Learning Framework for Service Function Chain Migration
Authors Ruoyun Chen, Hancheng Lu, Yujiao Lu, Jinxue Liu
Abstract Under dynamic traffic, service function chain (SFC) migration is considered an effective way to improve resource utilization. However, the lack of future network information leads to non-optimal solutions, which motivates us to study reinforcement learning based SFC migration from a long-term perspective. In this paper, we formulate the SFC migration problem as the minimization of total network operation cost under constraints on users’ quality of service. We first design a deep Q-network based algorithm to solve the single-SFC migration problem, which can adjust the migration strategy online without knowing future information. Further, a novel multi-agent cooperative framework, called MSDF, is proposed to address the migration of multiple SFCs on the basis of the single-SFC solution. MSDF reduces complexity and thus accelerates convergence, especially in large-scale networks. Experimental results demonstrate that MSDF outperforms typical heuristic algorithms under various scenarios. (A simplified Q-learning stand-in for the single-SFC agent is sketched after this entry.)
Tasks
Published 2019-11-12
URL https://arxiv.org/abs/1911.04801v2
PDF https://arxiv.org/pdf/1911.04801v2.pdf
PWC https://paperswithcode.com/paper/msdf-a-deep-reinforcement-learning-framework
Repo
Framework
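
A simplified, tabular stand-in for the single-SFC agent gives the flavor of the approach: the state encodes the current placement and traffic, actions are candidate migration targets, and the reward is the negative operation cost plus QoS penalty. In the paper the Q-function is a deep network and multiple cooperating agents form MSDF; everything below (state encoding, action set, cost callables, constants) is a placeholder chosen for illustration.

```python
import random
from collections import defaultdict

Q = defaultdict(float)              # tabular stand-in for the deep Q-network
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1  # illustrative constants

def reward(state, action, migration_cost, qos_penalty):
    """Negative total operation cost: migration overhead plus QoS violations."""
    return -(migration_cost(state, action) + qos_penalty(state, action))

def choose_action(state, actions):
    """Epsilon-greedy selection over candidate migration targets."""
    if random.random() < EPS:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, r, next_state, actions):
    """Standard temporal-difference update toward r + gamma * max_a Q(s', a)."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
```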

Robust Semantic Parsing with Adversarial Learning for Domain Generalization

Title Robust Semantic Parsing with Adversarial Learning for Domain Generalization
Authors Gabriel Marzinotto, Geraldine Damnati, Frédéric Béchet, Benoit Favre
Abstract This paper addresses the issue of generalization for Semantic Parsing in an adversarial framework. Building models that are more robust to inter-document variability is crucial for the integration of Semantic Parsing technologies in real applications. The underlying question throughout this study is whether adversarial learning can be used to train models at a higher level of abstraction in order to increase their robustness to lexical and stylistic variations. We propose to perform Semantic Parsing with a domain-classification adversarial task, without explicit knowledge of the domain. The strategy is first evaluated on a French corpus of encyclopedic documents annotated with FrameNet, from an information retrieval perspective, and then on the PropBank Semantic Role Labeling task on the CoNLL-2005 benchmark. We show that adversarial learning increases all models’ generalization capabilities on both in-domain and out-of-domain data. (A minimal gradient-reversal sketch of this style of adversarial training follows this entry.)
Tasks Domain Generalization, Information Retrieval, Semantic Parsing, Semantic Role Labeling
Published 2019-10-01
URL https://arxiv.org/abs/1910.06700v1
PDF https://arxiv.org/pdf/1910.06700v1.pdf
PWC https://paperswithcode.com/paper/robust-semantic-parsing-with-adversarial-1
Repo
Framework
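
Adversarial domain training of the kind described here is commonly implemented with a gradient-reversal layer placed between the shared encoder and the domain classifier; the sketch below shows that standard construction (it is not code from the paper, and the lambda weight is a placeholder).

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda in the
    backward pass, so the encoder learns to fool the domain classifier while
    the classifier itself is trained normally."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage sketch: domain_logits = domain_classifier(grad_reverse(encoder(x)))
# and the total loss is the semantic parsing loss plus the domain loss.
```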

Ultra-Low Energy and High Speed LIF Neuron using Silicon Bipolar Impact Ionization MOSFET for Spiking Neural Networks

Title Ultra-Low Energy and High Speed LIF Neuron using Silicon Bipolar Impact Ionization MOSFET for Spiking Neural Networks
Authors Alok Kumar Kamal, Jawar Singh
Abstract The silicon bipolar impact ionization MOSFET offers the potential for realizing a leaky integrate-and-fire (LIF) neuron, owing to the parasitic BJT present in its floating body. In this work, we propose an L-shaped gate bipolar impact ionization MOS (L-BIMOS) with reduced breakdown voltage ($V_{B}$ = 1.68 V) and demonstrate LIF neuron operation based on the positive feedback mechanism of the parasitic BJT. Using 2-D TCAD simulations, we show that the proposed L-BIMOS exhibits a low threshold voltage (0.2 V) for firing a spike, and the minimum energy required to fire a single spike is calculated to be 0.18 pJ, which makes the proposed device $194\times$ more energy efficient than a PD-SOI MOSFET silicon neuron and $5\times10^{3}$ times more energy efficient than a conventional analog/digital-circuit neuron. Furthermore, the proposed L-BIMOS silicon neuron exhibits spiking frequencies in the GHz range when the drain is biased at $V_{DG}$ = 2.0 V. (The standard LIF dynamics such a device emulates are sketched after this entry.)
Tasks
Published 2019-09-02
URL https://arxiv.org/abs/1909.00669v1
PDF https://arxiv.org/pdf/1909.00669v1.pdf
PWC https://paperswithcode.com/paper/ultra-low-energy-and-high-speed-lif-neuron
Repo
Framework
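
For context, the leaky integrate-and-fire dynamics that such a device is designed to emulate can be simulated in a few lines; the time constant, threshold, and input scaling below are illustrative values, not the device's measured figures.

```python
import numpy as np

def lif_simulate(i_in, dt=1e-9, tau=5e-9, v_rest=0.0, v_th=0.2, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire: the membrane potential leaks
    toward v_rest, integrates the (voltage-scaled) input, and emits a spike
    when it crosses v_th, after which it is reset."""
    v, spikes = v_rest, []
    for t, i_t in enumerate(i_in):
        v += (dt / tau) * (-(v - v_rest) + i_t)
        if v >= v_th:
            spikes.append(t)
            v = v_reset
    return spikes

print(lif_simulate(np.full(50, 0.5)))   # constant drive -> regular spiking
```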

Modeling e-Learners’ Cognitive and Metacognitive Strategy in Comparative Question Solving

Title Modeling e-Learners’ Cognitive and Metacognitive Strategy in Comparative Question Solving
Authors Feng Tian, Jia Yue, Kuo-ming Chao, Buyue Qian, Nazaraf Shah, Longzhuang Li, Haiping Zhu, Yan Chen, Bin Zeng, Qinghua Zheng
Abstract Cognitive and metacognitive strategies play a significant role in self-regulated learning (SRL), and the appropriate use of strategies is beneficial for effective learning and question-solving tasks during human-computer interaction. This paper proposes a novel method that combines a Knowledge Map (KM) based data mining technique with Thinking Maps (TM) to detect a learner’s cognitive and metacognitive strategies in question-solving scenarios. In particular, a graph-based mining algorithm is designed to automatically map cognitive strategies to metacognitive strategies by raising the abstraction level, making the cognitive and metacognitive process observable; it acts like a reverse-engineering engine that explains how a learner thinks when solving a question. Additionally, we developed an online learning environment in which participants learn and have their behaviors recorded. To corroborate the effectiveness of our approach and algorithm, we conducted experiments with 173 postgraduate and undergraduate students, who were asked to complete question-solving tasks such as “What are similarities and differences between array and pointer?” from “The C Programming Language” course and “What are similarities and differences between packet switching and circuit switching?” from the “Computer Network Principle” course. The mined strategy patterns are encouraging and support the proposed method well.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.03074v1
PDF https://arxiv.org/pdf/1906.03074v1.pdf
PWC https://paperswithcode.com/paper/modeling-e-learners-cognitive-and
Repo
Framework

Simplifying Neural Networks with the Marabou Verification Engine

Title Simplifying Neural Networks with the Marabou Verification Engine
Authors Sumathi Gokulanathan, Alexander Feldsher, Adi Malca, Clark Barrett, Guy Katz
Abstract Deep neural network (DNN) verification is an emerging field, with diverse verification engines quickly becoming available. Demonstrating the effectiveness of these tools on real-world DNNs is an important step towards their wider adoption. We focus here on the recently proposed Marabou verification tool, and demonstrate its usage for a novel application: simplifying neural networks, by reducing the size of a DNN without harming its accuracy. We report on the work-flow of the simplification process, and on its potential significance and applicability to domains of interest.
Tasks
Published 2019-10-25
URL https://arxiv.org/abs/1910.12396v1
PDF https://arxiv.org/pdf/1910.12396v1.pdf
PWC https://paperswithcode.com/paper/simplifying-neural-networks-with-the-marabou
Repo
Framework

WaveFlow: A Compact Flow-based Model for Raw Audio

Title WaveFlow: A Compact Flow-based Model for Raw Audio
Authors Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song
Abstract In this work, we propose WaveFlow, a small-footprint generative flow for raw audio, which is directly trained with maximum likelihood. It handles the long-range structure of waveforms with a dilated 2-D convolutional architecture, while modeling local variations using expressive autoregressive functions. WaveFlow provides a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow as special cases. It generates speech with fidelity comparable to WaveNet, while synthesizing several orders of magnitude faster, since it requires only a few sequential steps to generate very long waveforms. Furthermore, it can significantly reduce the likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Finally, our small-footprint WaveFlow has only 5.91M parameters, 15$\times$ fewer than WaveGlow. It can generate 22.05 kHz high-fidelity audio 42.6$\times$ faster than real time on a V100 GPU without engineered inference kernels. (The generic flow training objective is given after this entry.)
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01219v3
PDF https://arxiv.org/pdf/1912.01219v3.pdf
PWC https://paperswithcode.com/paper/waveflow-a-compact-flow-based-model-for-raw-1
Repo
Framework
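
Like other likelihood-based flows, a model of this kind is trained by maximizing the exact log-likelihood given by the change-of-variables formula; the generic form is shown below (this is standard flow notation, not the paper's specific 2-D parameterization).

```latex
% x is the raw waveform and z = f_theta(x) its image under the invertible map:
\log p_\theta(x) \;=\; \log p_Z\!\bigl(f_\theta(x)\bigr)
  \;+\; \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|,
\qquad p_Z = \mathcal{N}(0, I).
```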

AVD: Adversarial Video Distillation

Title AVD: Adversarial Video Distillation
Authors Mohammad Tavakolian, Mohammad Sabokrou, Abdenour Hadid
Abstract In this paper, we present a simple yet efficient approach for video representation, called Adversarial Video Distillation (AVD). The key idea is to represent videos by compressing them into the form of realistic images, which can be used in a variety of video-based scene analysis applications. Representing a video as a single image enables us to address video analysis with image analysis techniques. To this end, we exploit a 3D convolutional encoder-decoder network to encode the input video as an image by minimizing the reconstruction error. Furthermore, weak supervision by an adversarial training procedure is imposed on the output of the encoder to generate semantically realistic images. The encoder learns to extract semantically meaningful representations from a given input video by mapping the 3D input into a 2D latent representation. The obtained representation can simply be used as the input of deep models pre-trained on images for video classification. We evaluated the effectiveness of the proposed method for video-based activity recognition on three standard and challenging benchmark datasets, i.e., UCF101, HMDB51, and Kinetics. The experimental results demonstrate that AVD achieves strong performance, outperforming state-of-the-art methods for video classification. (The combined reconstruction and adversarial objective is summarized after this entry.)
Tasks Activity Recognition, Video Classification
Published 2019-07-12
URL https://arxiv.org/abs/1907.05640v1
PDF https://arxiv.org/pdf/1907.05640v1.pdf
PWC https://paperswithcode.com/paper/avd-adversarial-video-distillation
Repo
Framework
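
The training signal sketched in the abstract, reconstruction of the video from its single-image code plus weak adversarial supervision on that code, can be summarized generically as below; the symbols, the squared-error choice, and the weight $\lambda$ are notation for this sketch, not the paper's exact formulation.

```latex
% E encodes video v into one image, D decodes it back, and Disc is the
% discriminator that pushes E(v) toward the distribution of real images:
\mathcal{L}(E, D) \;=\;
  \underbrace{\bigl\| D(E(v)) - v \bigr\|^2}_{\text{reconstruction}}
  \;+\; \lambda \,
  \underbrace{\log\bigl(1 - \mathrm{Disc}(E(v))\bigr)}_{\text{adversarial term}} .
```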

SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

Title SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
Authors Zhize Li
Abstract We analyze stochastic gradient algorithms for optimizing nonconvex problems. In particular, our goal is to find local minima (second-order stationary points) instead of just first-order stationary points, which may be bad unstable saddle points. We show that a simple perturbed version of the stochastic recursive gradient descent algorithm (called SSRGD) can find an $(\epsilon,\delta)$-second-order stationary point with $\widetilde{O}(\sqrt{n}/\epsilon^2 + \sqrt{n}/\delta^4 + n/\delta^3)$ stochastic gradient complexity for nonconvex finite-sum problems. As a by-product, SSRGD finds an $\epsilon$-first-order stationary point with $O(n+\sqrt{n}/\epsilon^2)$ stochastic gradients. These results are almost optimal, since Fang et al. [2018] provided a lower bound of $\Omega(\sqrt{n}/\epsilon^2)$ for finding even just an $\epsilon$-first-order stationary point. We emphasize that the SSRGD algorithm for finding second-order stationary points is as simple as the one for finding first-order stationary points, requiring only an occasional uniform perturbation, while all other algorithms for finding second-order stationary points with similar gradient complexity need to be combined with a negative-curvature search subroutine (e.g., Neon2 [Allen-Zhu and Li, 2018]). Moreover, the simple SSRGD algorithm admits a simpler analysis. Finally, we extend our results from nonconvex finite-sum problems to nonconvex online (expectation) problems, and prove the corresponding convergence results. (A simplified sketch of the perturbed recursive gradient update follows this entry.)
Tasks
Published 2019-04-19
URL https://arxiv.org/abs/1904.09265v2
PDF https://arxiv.org/pdf/1904.09265v2.pdf
PWC https://paperswithcode.com/paper/ssrgd-simple-stochastic-recursive-gradient
Repo
Framework
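
A heavily simplified sketch of the recursive gradient estimator with occasional uniform perturbation follows; the step size, perturbation radius, epoch length, and gradient oracles are placeholders, and the paper's actual epoch structure, parameter choices, and stopping rules are more involved.

```python
import numpy as np

def ssrgd_sketch(grad_full, grad_i, x0, n, eta=0.01, epochs=10, m=50,
                 g_thresh=1e-3, radius=1e-3, rng=np.random.default_rng(0)):
    """grad_full(x): full gradient; grad_i(x, i): gradient of component i.
    All constants are illustrative, not the paper's theoretical choices."""
    x = x0.copy()
    for _ in range(epochs):
        v = grad_full(x)                      # anchor the recursive estimator
        if np.linalg.norm(v) <= g_thresh:     # near a stationary point:
            x = x + rng.uniform(-radius, radius, size=x.shape)  # perturb to escape saddles
            v = grad_full(x)
        for _ in range(m):
            x_new = x - eta * v
            i = rng.integers(n)
            v = grad_i(x_new, i) - grad_i(x, i) + v   # recursive (SARAH-style) update
            x = x_new
    return x
```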

Efficiently Exploring Ordering Problems through Conflict-directed Search

Title Efficiently Exploring Ordering Problems through Conflict-directed Search
Authors Jingkai Chen, Cheng Fang, David Wang, Andrew Wang, Brian Williams
Abstract In planning and scheduling, solving problems with both state and temporal constraints is hard, since these constraints may be highly coupled. Judicious orderings of events enable solvers to efficiently make decisions over sequences of actions to satisfy complex hybrid specifications; the ordering problem is thus fundamental to planning. Promising recent works have explored the ordering problem as search, incorporating a special tree structure for efficiency. However, such approaches only reason over partial order specifications. When an ordering is found to be inconsistent with the underlying constraints, prior works do not exploit the tree structure to efficiently generate orderings that resolve the inconsistency. In this paper, we present Conflict-directed Incremental Total Ordering (CDITO), a conflict-directed search method that incrementally and systematically generates event total orders given ordering relations and conflicts returned by sub-solvers. Due to its ability to reason over conflicts, CDITO is much more efficient than Incremental Total Ordering. We demonstrate this by benchmarking on temporal network configuration problems that involve routing network flows and allocating bandwidth resources over time.
Tasks
Published 2019-04-15
URL http://arxiv.org/abs/1904.07366v1
PDF http://arxiv.org/pdf/1904.07366v1.pdf
PWC https://paperswithcode.com/paper/efficiently-exploring-ordering-problems
Repo
Framework

Deep Neural Rejection against Adversarial Examples

Title Deep Neural Rejection against Adversarial Examples
Authors Angelo Sotgiu, Ambra Demontis, Marco Melis, Battista Biggio, Giorgio Fumera, Xiaoyi Feng, Fabio Roli
Abstract Despite the impressive performance reported by deep neural networks in different application domains, they remain largely vulnerable to adversarial examples, i.e., input samples that are carefully perturbed to cause misclassification at test time. In this work, we propose a deep neural rejection mechanism to detect adversarial examples, based on the idea of rejecting samples that exhibit anomalous feature representations at different network layers. With respect to competing approaches, our method does not require generating adversarial examples at training time, and it is less computationally demanding. To properly evaluate our method, we define an adaptive white-box attack that is aware of the defense mechanism and aims to bypass it. Under this worst-case setting, we empirically show that our approach outperforms previously proposed methods that detect adversarial examples by analyzing only the feature representation provided by the output network layer. (The rejection rule is sketched after this entry.)
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.00470v2
PDF https://arxiv.org/pdf/1910.00470v2.pdf
PWC https://paperswithcode.com/paper/deep-neural-rejection-against-adversarial
Repo
Framework
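
The decision rule itself is simple to sketch: per-layer classifiers score the input's intermediate representations, the scores are combined, and the sample is rejected when even the best class score falls below a threshold. The simple averaging below stands in for the paper's combination of layer-wise classifiers, and the per-layer scores are assumed to come from classifiers already fit on the network's internal features.

```python
import numpy as np

def predict_with_rejection(layer_scores, threshold):
    """layer_scores: list of per-class score vectors, one per network layer,
    for a single input. Reject when the top combined score is low, the
    behavior expected for inputs whose representations look anomalous."""
    combined = np.mean(np.stack(layer_scores, axis=0), axis=0)  # placeholder combiner
    best_class = int(np.argmax(combined))
    if combined[best_class] < threshold:
        return "reject"
    return best_class

# Example with made-up scores from three layers over four classes:
print(predict_with_rejection([np.array([0.1, 0.2, 0.1, 0.1]),
                              np.array([0.2, 0.1, 0.1, 0.1]),
                              np.array([0.1, 0.1, 0.2, 0.1])], threshold=0.5))
# -> "reject" (no class is scored confidently across layers)
```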

Importance Resampling for Off-policy Prediction

Title Importance Resampling for Off-policy Prediction
Authors Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White
Abstract Importance sampling (IS) is a common reweighting strategy for off-policy prediction in reinforcement learning. While it is consistent and unbiased, it can result in high-variance updates to the weights of the value function. In this work, we explore a resampling strategy as an alternative to reweighting. We propose Importance Resampling (IR) for off-policy prediction, which resamples experience from a replay buffer and applies standard on-policy updates. The approach avoids using importance sampling ratios in the update, instead correcting the distribution before the update. We characterize the bias and consistency of IR, particularly compared to Weighted IS (WIS). We demonstrate in several microworlds that IR has improved sample efficiency and lower-variance updates compared to IS and several variance-reduced IS strategies, including variants of WIS and V-trace, which clips IS ratios. We also provide a demonstration showing that IR improves over IS when learning a value function from images in a racing car simulator. (The core resampling step is sketched after this entry.)
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04328v2
PDF https://arxiv.org/pdf/1906.04328v2.pdf
PWC https://paperswithcode.com/paper/importance-resampling-for-off-policy
Repo
Framework
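
The core resampling step can be sketched in a few lines: transitions are drawn from the replay buffer with probability proportional to the importance ratio, and ordinary on-policy updates are then applied to the resampled batch. The buffer layout and probability arrays below are placeholders, and refinements discussed in the paper are omitted.

```python
import numpy as np

def importance_resample(buffer, target_probs, behavior_probs, batch_size,
                        rng=np.random.default_rng(0)):
    """Sample transitions with probability proportional to pi(a|s)/mu(a|s),
    so that standard on-policy TD updates can be applied to the batch
    without per-sample importance-sampling corrections."""
    ratios = np.asarray(target_probs) / np.asarray(behavior_probs)
    p = ratios / ratios.sum()
    idx = rng.choice(len(buffer), size=batch_size, p=p)
    return [buffer[i] for i in idx]
```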

Pretrained AI Models: Performativity, Mobility, and Change

Title Pretrained AI Models: Performativity, Mobility, and Change
Authors Lav R. Varshney, Nitish Shirish Keskar, Richard Socher
Abstract The paradigm of pretrained deep learning models has recently emerged in artificial intelligence practice, allowing deployment in numerous societal settings with limited computational resources, but also embedding biases and enabling unintended negative uses. In this paper, we treat pretrained models as objects of study and discuss the ethical impacts of their sociological position. We discuss how pretrained models are developed and compared under the common task framework, but how this may make self-regulation inadequate. We further discuss how pretrained models may have a performative effect on society that exacerbates biases. We then discuss how pretrained models move through actor networks as a kind of computationally immutable mobile, but how users also act as agents of technological change by reinterpreting them via fine-tuning and transfer. We further discuss how users may use pretrained models in malicious ways, drawing a novel connection between the responsible innovation and user-centered innovation literatures. We close by discussing how this sociological understanding of pretrained models can inform AI governance frameworks for fairness, accountability, and transparency.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.03290v1
PDF https://arxiv.org/pdf/1909.03290v1.pdf
PWC https://paperswithcode.com/paper/pretrained-ai-models-performativity-mobility
Repo
Framework