February 1, 2020

Paper Group AWR 368

TrueLearn: A Family of Bayesian Algorithms to Match Lifelong Learners to Open Educational Resources

Title TrueLearn: A Family of Bayesian Algorithms to Match Lifelong Learners to Open Educational Resources
Authors Sahan Bulathwela, Maria Perez-Ortiz, Emine Yilmaz, John Shawe-Taylor
Abstract The recent advances in computer-assisted learning systems and the availability of open educational resources today promise a pathway to providing cost-efficient, high-quality education to large masses of learners. One of the most ambitious use cases of computer-assisted learning is to build a lifelong learning recommendation system. Unlike short-term courses, lifelong learning presents unique challenges, requiring sophisticated recommendation models that account for a wide range of factors such as background knowledge of learners or novelty of the material while effectively maintaining knowledge states of masses of learners for significantly longer periods of time (ideally, a lifetime). This work presents the foundations towards building a dynamic, scalable and transparent recommendation system for education, modelling learner’s knowledge from implicit data in the form of engagement with open educational resources. We i) use a text ontology based on Wikipedia to automatically extract knowledge components of educational resources and, ii) propose a set of online Bayesian strategies inspired by the well-known areas of item response theory and knowledge tracing. Our proposal, TrueLearn, focuses on recommendations for which the learner has enough background knowledge (so they are able to understand and learn from the material), and the material has enough novelty that would help the learner improve their knowledge about the subject and keep them engaged. We further construct a large open educational video lectures dataset and test the performance of the proposed algorithms, which show clear promise towards building an effective educational recommendation system.
Tasks Knowledge Tracing
Published 2019-11-21
URL https://arxiv.org/abs/1911.09471v1
PDF https://arxiv.org/pdf/1911.09471v1.pdf
PWC https://paperswithcode.com/paper/truelearn-a-family-of-bayesian-algorithms-to
Repo https://github.com/sahanbull/TrueLearn
Framework none

Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection

Title Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection
Authors Xuanyi Dong, Yi Yang
Abstract Facial landmark detection aims to localize the anatomically defined points of human faces. In this paper, we study facial landmark detection from partially labeled facial images. A typical approach is to (1) train a detector on the labeled images; (2) generate new training samples using this detector’s prediction as pseudo labels of unlabeled images; (3) retrain the detector on the labeled samples and partial pseudo labeled samples. In this way, the detector can learn from both labeled and unlabeled data to become robust. In this paper, we propose an interaction mechanism between a teacher and two students to generate more reliable pseudo labels for unlabeled data, which are beneficial to semi-supervised facial landmark detection. Specifically, the two students are instantiated as dual detectors. The teacher learns to judge the quality of the pseudo labels generated by the students and filter out unqualified samples before the retraining stage. In this way, the student detectors get feedback from their teacher and are retrained on premium data generated by themselves. Since the two students are trained on different samples, a combination of their predictions will be more robust as the final prediction compared to either prediction. Extensive experiments on 300-W and AFLW benchmarks show that the interactions between teacher and students contribute to better utilization of the unlabeled data and achieve state-of-the-art performance.
Tasks Facial Landmark Detection
Published 2019-08-06
URL https://arxiv.org/abs/1908.02116v3
PDF https://arxiv.org/pdf/1908.02116v3.pdf
PWC https://paperswithcode.com/paper/teacher-supervises-students-how-to-learn-from
Repo https://github.com/D-X-Y/landmark-detection
Framework pytorch
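
The filtering step described in the abstract above (a teacher judges pseudo labels and drops unqualified samples before retraining) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names `select_pseudo_labels`, `student`, `teacher_score` and `threshold` are hypothetical, and a fixed score threshold is an assumption standing in for whatever criterion the learned teacher applies.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
Label = Sequence[float]


def select_pseudo_labels(
    unlabeled: List[Sample],
    student: Callable[[Sample], Label],
    teacher_score: Callable[[Sample, Label], float],
    threshold: float,
) -> List[Tuple[Sample, Label]]:
    """Keep only the pseudo-labeled samples the teacher rates at or above threshold."""
    kept = []
    for x in unlabeled:
        y_hat = student(x)  # the student's prediction serves as the pseudo label
        if teacher_score(x, y_hat) >= threshold:  # hypothetical quality score
            kept.append((x, y_hat))
    return kept


# Toy stand-ins: the "student" doubles its input, the "teacher" scores by magnitude.
toy_student = lambda x: [2.0 * x[0]]
toy_teacher = lambda x, y: y[0]
retrain_set = select_pseudo_labels([[0.0], [1.0]], toy_student, toy_teacher, 1.0)
print(retrain_set)
```

In a full pipeline the kept pairs would be merged with the labeled set before the student is retrained; that training loop is omitted here.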

ExCL: Extractive Clip Localization Using Natural Language Descriptions

Title ExCL: Extractive Clip Localization Using Natural Language Descriptions
Authors Soham Ghosh, Anuva Agarwal, Zarana Parekh, Alexander Hauptmann
Abstract The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames. Prior approaches such as sliding window classifiers are inefficient, while text-clip similarity driven ranking-based approaches such as segment proposal networks are far more complicated. In order to select the most relevant video clip corresponding to the given text description, we propose a novel extractive approach that predicts the start and end frames by leveraging cross-modal interactions between the text and video - this removes the need to retrieve and re-rank multiple proposal segments. Using recurrent networks we encode the two modalities into a joint representation which is then used in different variants of start-end frame predictor networks. Through extensive experimentation and ablative analysis, we demonstrate that our simple and elegant approach significantly outperforms state of the art on two datasets and has comparable performance on a third.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02755v1
PDF http://arxiv.org/pdf/1904.02755v1.pdf
PWC https://paperswithcode.com/paper/excl-extractive-clip-localization-using
Repo https://github.com/jayleicn/TVRetrieval
Framework pytorch
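
To make the extractive idea in the abstract above concrete, here is a minimal decoding sketch: given per-frame start and end scores, pick the best pair with the start not after the end. The scoring networks themselves are not shown, and the function name `best_span` and the additive score are assumptions for illustration; the paper describes several predictor variants that this does not reproduce.

```python
from typing import Sequence, Tuple


def best_span(start_scores: Sequence[float], end_scores: Sequence[float]) -> Tuple[int, int]:
    """Return (start, end) maximizing start_scores[s] + end_scores[e] with s <= e."""
    best = (0, 0)
    best_score = float("-inf")
    best_start = 0  # index of the highest start score seen so far
    for e in range(len(end_scores)):
        if start_scores[e] > start_scores[best_start]:
            best_start = e
        score = start_scores[best_start] + end_scores[e]
        if score > best_score:
            best_score = score
            best = (best_start, e)
    return best


# The unconstrained maximum (start=1, end=0) is invalid, so the constraint matters here.
span = best_span([0.1, 0.9, 0.2], [0.8, 0.1, 0.7])
print(span)
```

A single pass like this replaces retrieving and re-ranking many candidate segments, which is the efficiency argument the abstract makes.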

ExtremeC3Net: Extreme Lightweight Portrait Segmentation Networks using Advanced C3-modules

Title ExtremeC3Net: Extreme Lightweight Portrait Segmentation Networks using Advanced C3-modules
Authors Hyojin Park, Lars Lowe Sjösund, YoungJoon Yoo, Jihwan Bang, Nojun Kwak
Abstract Designing a lightweight and robust portrait segmentation algorithm is an important task for a wide range of face applications. However, the problem has been considered as a subset of the object segmentation problem. Obviously, portrait segmentation has its unique requirements. First, because the portrait segmentation is performed in the middle of a whole process of many real-world applications, it requires extremely lightweight models. Second, there have not been any public datasets in this domain that contain a sufficient number of images with unbiased statistics. To solve the problems, we introduce a new extremely lightweight portrait segmentation model consisting of a two-branched architecture based on the concentrated-comprehensive convolutions block. Our method reduces the number of parameters from 2.1M to 37.7K (around 98.2% reduction), while maintaining the accuracy within a 1% margin from the state-of-the-art portrait segmentation method. In our qualitative and quantitative analysis on the EG1800 dataset, we show that our method outperforms various existing lightweight segmentation models. Second, we propose a simple method to create additional portrait segmentation data which can improve accuracy on the EG1800 dataset. Also, we analyze the bias in public datasets by additionally annotating race, gender, and age on our own. The augmented dataset, the additional annotations and code are available in https://github.com/HYOJINPARK/ExtPortraitSeg .
Tasks Semantic Segmentation
Published 2019-08-08
URL https://arxiv.org/abs/1908.03093v3
PDF https://arxiv.org/pdf/1908.03093v3.pdf
PWC https://paperswithcode.com/paper/extremec3net-extreme-lightweight-portrait
Repo https://github.com/HYOJINPARK/ExtPortraitSeg
Framework pytorch
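
The reduction figure quoted in the abstract above can be checked with a line of arithmetic. The helper name `relative_reduction` is just for illustration; the two parameter counts are the ones stated in the abstract.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100.0


# 2.1M parameters reduced to 37.7K, as stated in the abstract.
reduction = relative_reduction(2.1e6, 37.7e3)
print(f"{reduction:.1f}%")  # prints 98.2%
```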

Universal Adversarial Perturbations for CNN Classifiers in EEG-Based BCIs

Title Universal Adversarial Perturbations for CNN Classifiers in EEG-Based BCIs
Authors Zihan Liu, Xiao Zhang, Dongrui Wu
Abstract Multiple convolutional neural network (CNN) classifiers have been proposed for electroencephalogram (EEG) based brain-computer interfaces (BCIs). However, CNN models have been found vulnerable to universal adversarial perturbations (UAPs), which are small and example-independent, yet powerful enough to degrade the performance of a CNN model, when added to a benign example. This paper proposes a novel total loss minimization (TLM) approach to generate UAPs for EEG-based BCIs. Experimental results demonstrate the effectiveness of TLM on three popular CNN classifiers for both target and non-target attacks. We also verify the transferability of UAPs in EEG-based BCI systems. To our knowledge, this is the first study on UAPs of CNN classifiers in EEG-based BCIs, and also the first study on UAPs for target attacks. UAPs are easy to construct, and can attack BCIs in real-time, exposing a critical security concern of BCIs.
Tasks EEG
Published 2019-12-03
URL https://arxiv.org/abs/1912.01171v1
PDF https://arxiv.org/pdf/1912.01171v1.pdf
PWC https://paperswithcode.com/paper/universal-adversarial-perturbations-for-cnn
Repo https://github.com/ZihanLiu95/UAP_EEG
Framework none

Variational Federated Multi-Task Learning

Title Variational Federated Multi-Task Learning
Authors Luca Corinzia, Joachim M. Buhmann
Abstract In classical federated learning a central server coordinates the training of a single model on a massively distributed network of devices. This setting can be naturally extended to a multi-task learning framework, to handle real-world federated datasets that typically show strong non-IID data distributions among devices. Even though federated multi-task learning has been shown to be an effective paradigm for real world datasets, it has been applied only to convex models. In this work we introduce VIRTUAL, an algorithm for federated multi-task learning with non-convex models. In VIRTUAL the federated network of the server and the clients is treated as a star-shaped Bayesian network, and learning is performed on the network using approximated variational inference. We show that this method is effective on real-world federated datasets, outperforming the current state-of-the-art for federated learning.
Tasks Multi-Task Learning
Published 2019-06-14
URL https://arxiv.org/abs/1906.06268v1
PDF https://arxiv.org/pdf/1906.06268v1.pdf
PWC https://paperswithcode.com/paper/variational-federated-multi-task-learning
Repo https://github.com/lucori/virtual
Framework tf

Visualization of Convolutional Neural Networks for Monocular Depth Estimation

Title Visualization of Convolutional Neural Networks for Monocular Depth Estimation
Authors Junjie Hu, Yan Zhang, Takayuki Okatani
Abstract Recently, convolutional neural networks (CNNs) have shown great success on the task of monocular depth estimation. A fundamental yet unanswered question is: how CNNs can infer depth from a single image. Toward answering this question, we consider visualization of inference of a CNN by identifying relevant pixels of an input image to depth estimation. We formulate it as an optimization problem of identifying the smallest number of image pixels from which the CNN can estimate a depth map with the minimum difference from the estimate from the entire image. To cope with a difficulty with optimization through a deep CNN, we propose to use another network to predict those relevant image pixels in a forward computation. In our experiments, we first show the effectiveness of this approach, and then apply it to different depth estimation networks on indoor and outdoor scene datasets. The results provide several findings that help exploration of the above question.
Tasks Depth Estimation, Interpretable Machine Learning
Published 2019-04-06
URL http://arxiv.org/abs/1904.03380v1
PDF http://arxiv.org/pdf/1904.03380v1.pdf
PWC https://paperswithcode.com/paper/visualization-of-convolutional-neural
Repo https://github.com/JunjH/Visualizing-CNNs-for-monocular-depth-estimation
Framework pytorch

Universal Self-Attention Network for Graph Classification

Title Universal Self-Attention Network for Graph Classification
Authors Dai Quoc Nguyen, Tu Dinh Nguyen, Dinh Phung
Abstract Existing graph neural network-based models have mainly been biased towards a supervised training setting; and they often share the common limitations in exploiting potential dependencies among nodes. To this end, we present U2GNN, a novel embedding model leveraging on the strength of the recently introduced universal self-attention network (Dehghani et al., 2019), to learn low-dimensional embeddings of graphs which can be used for graph classification. In particular, given an input graph, U2GNN first applies a self-attention computation, which is then followed by a recurrent transition to iteratively memorize its attention on vector representations of each node and its neighbors across each iteration. Thus, U2GNN can address the limitations in the existing models to produce plausible node embeddings whose sum is the final embedding of the whole graph. Experimental results in both supervised and unsupervised training settings show that our U2GNN produces new state-of-the-art performances on a range of well-known benchmark datasets for the graph classification task. To the best of our knowledge, this is the first work showing that an unsupervised model can significantly work better than supervised models by a large margin.
Tasks Graph Classification, Graph Embedding
Published 2019-09-26
URL https://arxiv.org/abs/1909.11855v4
PDF https://arxiv.org/pdf/1909.11855v4.pdf
PWC https://paperswithcode.com/paper/unsupervised-universal-self-attention-network
Repo https://github.com/daiquocnguyen/U2GNN
Framework tf

Neural Approximate Dynamic Programming for On-Demand Ride-Pooling

Title Neural Approximate Dynamic Programming for On-Demand Ride-Pooling
Authors Sanket Shah, Meghna Lowalekar, Pradeep Varakantham
Abstract On-demand ride-pooling (e.g., UberPool) has recently become popular because of its ability to lower costs for passengers while simultaneously increasing revenue for drivers and aggregation companies. Unlike in Taxi on Demand (ToD) services – where a vehicle is only assigned one passenger at a time – in on-demand ride-pooling, each (possibly partially filled) vehicle can be assigned a group of passenger requests with multiple different origin and destination pairs. To ensure near real-time response, existing solutions to the real-time ride-pooling problem are myopic in that they optimise the objective (e.g., maximise the number of passengers served) for the current time step without considering its effect on future assignments. This is because even a myopic assignment in ride-pooling involves considering which combinations of passenger requests can be assigned to vehicles, which adds a layer of combinatorial complexity to the ToD problem. A popular approach that addresses the limitations of myopic assignments in ToD problems is Approximate Dynamic Programming (ADP). Existing ADP methods for ToD can only handle Linear Program (LP) based assignments, however, while the assignment problem in ride-pooling requires an Integer Linear Program (ILP) with bad LP relaxations. To this end, our key technical contribution is in providing a general ADP method that can learn from ILP-based assignments. Additionally, we handle the extra combinatorial complexity from combinations of passenger requests by using a Neural Network based approximate value function and show a connection to Deep Reinforcement Learning that allows us to learn this value-function with increased stability and sample-efficiency. We show that our approach outperforms past approaches on a real-world dataset by up to 16%, a significant improvement in city-scale transportation problems.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.08842v1
PDF https://arxiv.org/pdf/1911.08842v1.pdf
PWC https://paperswithcode.com/paper/neural-approximate-dynamic-programming-for-on
Repo https://github.com/sanketkshah/NeurADP-for-Ride-Pooling
Framework none

Full-Gradient Representation for Neural Network Visualization

Title Full-Gradient Representation for Neural Network Visualization
Authors Suraj Srinivas, Francois Fleuret
Abstract We introduce a new tool for interpreting neural net responses, namely full-gradients, which decomposes the neural net response into input sensitivity and per-neuron sensitivity components. This is the first proposed representation which satisfies two key properties: completeness and weak dependence, which provably cannot be satisfied by any saliency map-based interpretability method. For convolutional nets, we also propose an approximate saliency map representation, called FullGrad, obtained by aggregating the full-gradient components. We experimentally evaluate the usefulness of FullGrad in explaining model behaviour with two quantitative tests: pixel perturbation and remove-and-retrain. Our experiments reveal that our method explains model behaviour correctly, and more comprehensively than other methods in the literature. Visual inspection also reveals that our saliency maps are sharper and more tightly confined to object regions than other methods.
Tasks Interpretable Machine Learning
Published 2019-05-02
URL https://arxiv.org/abs/1905.00780v4
PDF https://arxiv.org/pdf/1905.00780v4.pdf
PWC https://paperswithcode.com/paper/full-jacobian-representation-of-neural
Repo https://github.com/idiap/fullgrad-saliency
Framework pytorch
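
The completeness property mentioned in the abstract above can be illustrated on a tiny one-hidden-layer ReLU network, where the output equals an input-gradient term plus a bias-gradient term. This is a hand-rolled sketch under stated assumptions: the network, its weights, and the function names are hypothetical, the gradients are written out by hand for this specific architecture, and it does not reproduce the FullGrad saliency aggregation from the paper.

```python
from typing import List, Tuple


def full_gradient_terms(
    x: List[float],
    W1: List[List[float]],
    b1: List[float],
    w2: List[float],
    b2: float,
) -> Tuple[float, float, float]:
    """For f(x) = w2 . relu(W1 x + b1) + b2, return (f(x), input term, bias term)."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    mask = [1.0 if p > 0 else 0.0 for p in pre]  # ReLU activation pattern
    out = sum(w * m * p for w, m, p in zip(w2, mask, pre)) + b2

    # Gradient of the output w.r.t. the input and w.r.t. the hidden-layer biases.
    grad_x = [
        sum(w2[j] * mask[j] * W1[j][i] for j in range(len(W1)))
        for i in range(len(x))
    ]
    grad_b1 = [w2[j] * mask[j] for j in range(len(W1))]

    input_term = sum(g * xi for g, xi in zip(grad_x, x))
    # The gradient w.r.t. the output bias b2 is 1, so its contribution is b2 itself.
    bias_term = sum(g * b for g, b in zip(grad_b1, b1)) + b2
    return out, input_term, bias_term


out, input_term, bias_term = full_gradient_terms(
    x=[1.0, -2.0],
    W1=[[1.0, 0.5], [-1.0, 2.0]],
    b1=[0.5, 1.0],
    w2=[2.0, -3.0],
    b2=0.25,
)
print(out, input_term + bias_term)  # the two values agree
```

In this example the input term is zero while the output is not, which shows why an input-gradient map alone cannot account for the full response.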

CAMEL: A Weakly Supervised Learning Framework for Histopathology Image Segmentation

Title CAMEL: A Weakly Supervised Learning Framework for Histopathology Image Segmentation
Authors Gang Xu, Zhigang Song, Zhuo Sun, Calvin Ku, Zhe Yang, Cancheng Liu, Shuhao Wang, Jianpeng Ma, Wei Xu
Abstract Histopathology image analysis plays a critical role in cancer diagnosis and treatment. To automatically segment the cancerous regions, fully supervised segmentation algorithms require labor-intensive and time-consuming labeling at the pixel level. In this research, we propose CAMEL, a weakly supervised learning framework for histopathology image segmentation using only image-level labels. Using multiple instance learning (MIL)-based label enrichment, CAMEL splits the image into latticed instances and automatically generates instance-level labels. After label enrichment, the instance-level labels are further assigned to the corresponding pixels, producing the approximate pixel-level labels and making fully supervised training of segmentation models possible. CAMEL achieves comparable performance with the fully supervised approaches in both instance-level classification and pixel-level segmentation on CAMELYON16 and a colorectal adenoma dataset. Moreover, the generality of the automatic labeling methodology may benefit future weakly supervised learning studies for histopathology image analysis.
Tasks Multiple Instance Learning, Semantic Segmentation
Published 2019-08-28
URL https://arxiv.org/abs/1908.10555v1
PDF https://arxiv.org/pdf/1908.10555v1.pdf
PWC https://paperswithcode.com/paper/camel-a-weakly-supervised-learning-framework
Repo https://github.com/ThoroughImages/CAMEL
Framework none

Action Robust Reinforcement Learning and Applications in Continuous Control

Title Action Robust Reinforcement Learning and Applications in Continuous Control
Authors Chen Tessler, Yonathan Efroni, Shie Mannor
Abstract A policy is said to be robust if it maximizes the reward while considering a bad, or even adversarial, model. In this work we formalize two new criteria of robustness to action uncertainty. Specifically, we consider two scenarios in which the agent attempts to perform an action $a$, and (i) with probability $\alpha$, an alternative adversarial action $\bar a$ is taken, or (ii) an adversary adds a perturbation to the selected action in the case of continuous action space. We show that our criteria are related to common forms of uncertainty in robotics domains, such as the occurrence of abrupt forces, and suggest algorithms in the tabular case. Building on the suggested algorithms, we generalize our approach to deep reinforcement learning (DRL) and provide extensive experiments in the various MuJoCo domains. Our experiments show that not only does our approach produce robust policies, but it also improves the performance in the absence of perturbations. This generalization indicates that action-robustness can be thought of as implicit regularization in RL problems.
Tasks Continuous Control
Published 2019-01-26
URL https://arxiv.org/abs/1901.09184v2
PDF https://arxiv.org/pdf/1901.09184v2.pdf
PWC https://paperswithcode.com/paper/action-robust-reinforcement-learning-and
Repo https://github.com/icml2019-anonymous-author/Action-Robust-Reinforcement-Learning
Framework pytorch
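
The two uncertainty scenarios in the abstract above can be sketched as small helpers. This is a simplified illustration, not the authors' code: `executed_action` models scenario (i), where the adversary's action replaces the agent's with probability alpha, and `perturbed_action` models scenario (ii) as a plain additive perturbation, which is an assumption about the form the perturbation takes.

```python
import random
from typing import List, TypeVar

A = TypeVar("A")


def executed_action(agent_action: A, adversary_action: A, alpha: float, rng: random.Random) -> A:
    """Scenario (i): with probability alpha the adversary's action is taken instead."""
    return adversary_action if rng.random() < alpha else agent_action


def perturbed_action(action: List[float], perturbation: List[float]) -> List[float]:
    """Scenario (ii), continuous actions: an adversarial perturbation is added (assumed additive)."""
    return [a + d for a, d in zip(action, perturbation)]


rng = random.Random(0)
trials = [executed_action("agent", "adversary", 0.1, rng) for _ in range(1000)]
print(trials.count("adversary") / len(trials))  # close to alpha
print(perturbed_action([1.0, 2.0], [0.5, -0.5]))
```

A robust policy in this setting is one trained against such interference, so that its return degrades gracefully when the executed action differs from the intended one.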

Graph WaveNet for Deep Spatial-Temporal Graph Modeling

Title Graph WaveNet for Deep Spatial-Temporal Graph Modeling
Authors Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Chengqi Zhang
Abstract Spatial-temporal graph modeling is an important task to analyze the spatial relations and temporal trends of components in a system. Existing approaches mostly capture the spatial dependency on a fixed graph structure, assuming that the underlying relation between entities is pre-determined. However, the explicit graph structure (relation) does not necessarily reflect the true dependency and genuine relation may be missing due to the incomplete connections in the data. Furthermore, existing methods are ineffective at capturing the temporal trends as the RNNs or CNNs employed in these methods cannot capture long-range temporal sequences. To overcome these limitations, we propose in this paper a novel graph neural network architecture, Graph WaveNet, for spatial-temporal graph modeling. By developing a novel adaptive dependency matrix and learning it through node embedding, our model can precisely capture the hidden spatial dependency in the data. With a stacked dilated 1D convolution component whose receptive field grows exponentially as the number of layers increases, Graph WaveNet is able to handle very long sequences. These two components are integrated seamlessly in a unified framework and the whole framework is learned in an end-to-end manner. Experimental results on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate the superior performance of our algorithm.
Tasks Traffic Prediction
Published 2019-05-31
URL https://arxiv.org/abs/1906.00121v1
PDF https://arxiv.org/pdf/1906.00121v1.pdf
PWC https://paperswithcode.com/paper/190600121
Repo https://github.com/nnzhan/Graph-WaveNet
Framework pytorch
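
The claim in the abstract above that the receptive field of stacked dilated 1D convolutions grows exponentially with depth can be checked with a short calculation. The kernel size of 2 and the doubling dilation schedule used here are assumptions chosen for illustration; the abstract does not state the configuration the model actually uses.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of a stack of stride-1 dilated 1D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed schedule: the dilation doubles at every layer (1, 2, 4, ...).
for n_layers in range(1, 6):
    dilations = [2 ** i for i in range(n_layers)]
    print(n_layers, receptive_field(2, dilations))
```

With this schedule the receptive field doubles with each added layer, whereas with a constant dilation of 1 it would grow only linearly.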

Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction

Title Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction
Authors Qinyuan Ye, Liyuan Liu, Maosen Zhang, Xiang Ren
Abstract In recent years there has been a surge of interest in applying distant supervision (DS) to automatically generate training data for relation extraction (RE). In this paper, we study the problem of what limits the performance of DS-trained neural models, conduct thorough analyses, and identify a factor that can influence the performance greatly, shifted label distribution. Specifically, we found this problem commonly exists in real-world DS datasets, and without special handling, typical DS-RE models cannot automatically adapt to this shift, thus achieving deteriorated performance. To further validate our intuition, we develop a simple yet effective adaptation method for DS-trained models, bias adjustment, which updates models learned over the source domain (i.e., DS training set) with a label distribution estimated on the target domain (i.e., test set). Experiments demonstrate that bias adjustment achieves consistent performance gains on DS-trained models, especially on neural models, with an up to 23% relative F1 improvement, which verifies our assumptions. Our code and data can be found at https://github.com/INK-USC/shifted-label-distribution.
Tasks Relation Extraction
Published 2019-04-19
URL https://arxiv.org/abs/1904.09331v2
PDF https://arxiv.org/pdf/1904.09331v2.pdf
PWC https://paperswithcode.com/paper/190409331
Repo https://github.com/INK-USC/shifted-label-distribution
Framework pytorch
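
One common way to adapt a trained classifier to a shifted label distribution is to add the log-ratio of target to source class priors to its logits. The sketch below shows that generic prior-shift correction; whether it matches the exact bias adjustment used in the paper is an assumption, and the function names and numbers are made up for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_logits(
    logits: Sequence[float],
    source_prior: Sequence[float],
    target_prior: Sequence[float],
) -> List[float]:
    """Shift each class logit by log(target prior) - log(source prior)."""
    return [
        z + math.log(t) - math.log(s)
        for z, s, t in zip(logits, source_prior, target_prior)
    ]


# A model that is indifferent between two relations under a balanced source prior
# leans towards the first one once the target prior says it is far more frequent.
adjusted = adjust_logits([0.0, 0.0], source_prior=[0.5, 0.5], target_prior=[0.9, 0.1])
probs = softmax(adjusted)
print(probs)
```

Because only the output bias changes, a correction of this kind needs no retraining, just an estimate of the target label distribution.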

Hierarchical Multi-Task Natural Language Understanding for Cross-domain Conversational AI: HERMIT NLU

Title Hierarchical Multi-Task Natural Language Understanding for Cross-domain Conversational AI: HERMIT NLU
Authors Andrea Vanzo, Emanuele Bastianelli, Oliver Lemon
Abstract We present a new neural architecture for wide-coverage Natural Language Understanding in Spoken Dialogue Systems. We develop a hierarchical multi-task architecture, which delivers a multi-layer representation of sentence meaning (i.e., Dialogue Acts and Frame-like structures). The architecture is a hierarchy of self-attention mechanisms and BiLSTM encoders followed by CRF tagging layers. We describe a variety of experiments, showing that our approach obtains promising results on a dataset annotated with Dialogue Acts and Frame Semantics. Moreover, we demonstrate its applicability to a different, publicly available NLU dataset annotated with domain-specific intents and corresponding semantic roles, providing overall performance higher than state-of-the-art tools such as RASA, Dialogflow, LUIS, and Watson. For example, we show an average 4.45% improvement in entity tagging F-score over Rasa, Dialogflow and LUIS.
Tasks Spoken Dialogue Systems
Published 2019-10-02
URL https://arxiv.org/abs/1910.00912v1
PDF https://arxiv.org/pdf/1910.00912v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-multi-task-natural-language
Repo https://github.com/RasaHQ/rasa
Framework none