February 1, 2020

Paper Group AWR 368

TrueLearn: A Family of Bayesian Algorithms to Match Lifelong Learners to Open Educational Resources


Title	TrueLearn: A Family of Bayesian Algorithms to Match Lifelong Learners to Open Educational Resources
Authors	Sahan Bulathwela, Maria Perez-Ortiz, Emine Yilmaz, John Shawe-Taylor
Abstract	The recent advances in computer-assisted learning systems and the availability of open educational resources today promise a pathway to providing cost-efficient, high-quality education to large masses of learners. One of the most ambitious use cases of computer-assisted learning is to build a lifelong learning recommendation system. Unlike short-term courses, lifelong learning presents unique challenges, requiring sophisticated recommendation models that account for a wide range of factors such as background knowledge of learners or novelty of the material while effectively maintaining knowledge states of masses of learners for significantly longer periods of time (ideally, a lifetime). This work presents the foundations towards building a dynamic, scalable and transparent recommendation system for education, modelling learner’s knowledge from implicit data in the form of engagement with open educational resources. We i) use a text ontology based on Wikipedia to automatically extract knowledge components of educational resources and, ii) propose a set of online Bayesian strategies inspired by the well-known areas of item response theory and knowledge tracing. Our proposal, TrueLearn, focuses on recommendations for which the learner has enough background knowledge (so they are able to understand and learn from the material), and the material has enough novelty that would help the learner improve their knowledge about the subject and keep them engaged. We further construct a large open educational video lectures dataset and test the performance of the proposed algorithms, which show clear promise towards building an effective educational recommendation system.
Tasks	Knowledge Tracing
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09471v1
PDF	https://arxiv.org/pdf/1911.09471v1.pdf
PWC	https://paperswithcode.com/paper/truelearn-a-family-of-bayesian-algorithms-to
Repo	https://github.com/sahanbull/TrueLearn
Framework	none

Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection


Title	Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection
Authors	Xuanyi Dong, Yi Yang
Abstract	Facial landmark detection aims to localize the anatomically defined points of human faces. In this paper, we study facial landmark detection from partially labeled facial images. A typical approach is to (1) train a detector on the labeled images; (2) generate new training samples using this detector’s predictions as pseudo labels for unlabeled images; (3) retrain the detector on the labeled samples and some of the pseudo-labeled samples. In this way, the detector can learn from both labeled and unlabeled data to become robust. In this paper, we propose an interaction mechanism between a teacher and two students to generate more reliable pseudo labels for unlabeled data, which are beneficial to semi-supervised facial landmark detection. Specifically, the two students are instantiated as dual detectors. The teacher learns to judge the quality of the pseudo labels generated by the students and filters out unqualified samples before the retraining stage. In this way, the student detectors get feedback from their teacher and are retrained on premium data that they generated themselves. Since the two students are trained on different samples, a combination of their predictions will be more robust as the final prediction than either prediction alone. Extensive experiments on the 300-W and AFLW benchmarks show that the interactions between teacher and students contribute to better utilization of the unlabeled data and achieve state-of-the-art performance.
Tasks	Facial Landmark Detection
Published	2019-08-06
URL	https://arxiv.org/abs/1908.02116v3
PDF	https://arxiv.org/pdf/1908.02116v3.pdf
PWC	https://paperswithcode.com/paper/teacher-supervises-students-how-to-learn-from
Repo	https://github.com/D-X-Y/landmark-detection
Framework	pytorch
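
The filtering and combination steps described in this abstract can be sketched roughly as below. This is a minimal illustration and not the authors' code: `teacher_score`, the `threshold` value, and the choice to combine the two students by simple averaging are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Landmarks = List[Point]


def filter_pseudo_labels(
    pseudo_labels: List[Landmarks],
    teacher_score: Callable[[Landmarks], float],  # hypothetical quality score
    threshold: float,                             # hypothetical cut-off
) -> List[Landmarks]:
    """Keep only the pseudo labels the teacher scores at or above threshold."""
    return [lm for lm in pseudo_labels if teacher_score(lm) >= threshold]


def combine_students(pred_a: Landmarks, pred_b: Landmarks) -> Landmarks:
    """Average two students' landmark predictions point by point (an assumption)."""
    return [((xa + xb) / 2, (ya + yb) / 2) for (xa, ya), (xb, yb) in zip(pred_a, pred_b)]
```

In a real system the teacher would be a learned network and the retained samples would feed the next retraining round; here both are reduced to plain functions.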

ExCL: Extractive Clip Localization Using Natural Language Descriptions


Title	ExCL: Extractive Clip Localization Using Natural Language Descriptions
Authors	Soham Ghosh, Anuva Agarwal, Zarana Parekh, Alexander Hauptmann
Abstract	The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames. Prior approaches such as sliding window classifiers are inefficient, while text-clip similarity driven ranking-based approaches such as segment proposal networks are far more complicated. In order to select the most relevant video clip corresponding to the given text description, we propose a novel extractive approach that predicts the start and end frames by leveraging cross-modal interactions between the text and video - this removes the need to retrieve and re-rank multiple proposal segments. Using recurrent networks we encode the two modalities into a joint representation which is then used in different variants of start-end frame predictor networks. Through extensive experimentation and ablative analysis, we demonstrate that our simple and elegant approach significantly outperforms state of the art on two datasets and has comparable performance on a third.
Tasks
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02755v1
PDF	http://arxiv.org/pdf/1904.02755v1.pdf
PWC	https://paperswithcode.com/paper/excl-extractive-clip-localization-using
Repo	https://github.com/jayleicn/TVRetrieval
Framework	pytorch
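
To make the idea of predicting start and end frames directly more concrete, here is a small sketch of how a clip could be read off from per-frame scores. It assumes the model emits one start score and one end score per frame and that the best clip maximizes their sum subject to start <= end; the abstract does not specify the decoding rule, so `best_span` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def best_span(start_scores: Sequence[float], end_scores: Sequence[float]) -> Tuple[int, int]:
    """Return (start, end) maximizing start_scores[s] + end_scores[e] with s <= e."""
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    best_pair = (0, 0)
    best_score = float("-inf")
    best_start = 0  # index of the highest start score seen so far
    for end in range(len(end_scores)):
        if start_scores[end] > start_scores[best_start]:
            best_start = end
        score = start_scores[best_start] + end_scores[end]
        if score > best_score:
            best_score = score
            best_pair = (best_start, end)
    return best_pair
```

Taking the two argmaxes independently could yield an end frame before the start frame; the single pass above avoids that while staying linear in the number of frames.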

ExtremeC3Net: Extreme Lightweight Portrait Segmentation Networks using Advanced C3-modules


Title	ExtremeC3Net: Extreme Lightweight Portrait Segmentation Networks using Advanced C3-modules
Authors	Hyojin Park, Lars Lowe Sjösund, YoungJoon Yoo, Jihwan Bang, Nojun Kwak
Abstract	Designing a lightweight and robust portrait segmentation algorithm is an important task for a wide range of face applications. However, the problem has been considered as a subset of the object segmentation problem. Obviously, portrait segmentation has its unique requirements. First, because the portrait segmentation is performed in the middle of a whole process of many real-world applications, it requires extremely lightweight models. Second, there have not been any public datasets in this domain that contain a sufficient number of images with unbiased statistics. To solve the problems, we introduce a new extremely lightweight portrait segmentation model consisting of a two-branched architecture based on the concentrated-comprehensive convolutions block. Our method reduces the number of parameters from 2.1M to 37.7K (around 98.2% reduction), while maintaining the accuracy within a 1% margin from the state-of-the-art portrait segmentation method. In our qualitative and quantitative analysis on the EG1800 dataset, we show that our method outperforms various existing lightweight segmentation models. Second, we propose a simple method to create additional portrait segmentation data which can improve accuracy on the EG1800 dataset. Also, we analyze the bias in public datasets by additionally annotating race, gender, and age on our own. The augmented dataset, the additional annotations and code are available at https://github.com/HYOJINPARK/ExtPortraitSeg .
Tasks	Semantic Segmentation
Published	2019-08-08
URL	https://arxiv.org/abs/1908.03093v3
PDF	https://arxiv.org/pdf/1908.03093v3.pdf
PWC	https://paperswithcode.com/paper/extremec3net-extreme-lightweight-portrait
Repo	https://github.com/HYOJINPARK/ExtPortraitSeg
Framework	pytorch
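
The parameter reduction quoted in the abstract can be checked with a couple of lines of arithmetic. The two parameter counts come from the abstract; the helper name `relative_reduction` is just for this example.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original quantity that was removed."""
    return (before - after) / before


baseline_params = 2.1e6    # 2.1M parameters, as stated in the abstract
extremec3_params = 37.7e3  # 37.7K parameters, as stated in the abstract

reduction = relative_reduction(baseline_params, extremec3_params)
print(f"{reduction:.1%}")  # prints "98.2%"
```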

Universal Adversarial Perturbations for CNN Classifiers in EEG-Based BCIs


Title	Universal Adversarial Perturbations for CNN Classifiers in EEG-Based BCIs
Authors	Zihan Liu, Xiao Zhang, Dongrui Wu
Abstract	Multiple convolutional neural network (CNN) classifiers have been proposed for electroencephalogram (EEG) based brain-computer interfaces (BCIs). However, CNN models have been found vulnerable to universal adversarial perturbations (UAPs), which are small and example-independent, yet powerful enough to degrade the performance of a CNN model, when added to a benign example. This paper proposes a novel total loss minimization (TLM) approach to generate UAPs for EEG-based BCIs. Experimental results demonstrate the effectiveness of TLM on three popular CNN classifiers for both target and non-target attacks. We also verify the transferability of UAPs in EEG-based BCI systems. To our knowledge, this is the first study on UAPs of CNN classifiers in EEG-based BCIs, and also the first study on UAPs for target attacks. UAPs are easy to construct, and can attack BCIs in real-time, exposing a critical security concern of BCIs.
Tasks	EEG
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01171v1
PDF	https://arxiv.org/pdf/1912.01171v1.pdf
PWC	https://paperswithcode.com/paper/universal-adversarial-perturbations-for-cnn
Repo	https://github.com/ZihanLiu95/UAP_EEG
Framework	none

Variational Federated Multi-Task Learning


Title	Variational Federated Multi-Task Learning
Authors	Luca Corinzia, Joachim M. Buhmann
Abstract	In classical federated learning a central server coordinates the training of a single model on a massively distributed network of devices. This setting can be naturally extended to a multi-task learning framework, to handle real-world federated datasets that typically show strong non-IID data distributions among devices. Even though federated multi-task learning has been shown to be an effective paradigm for real world datasets, it has been applied only to convex models. In this work we introduce VIRTUAL, an algorithm for federated multi-task learning with non-convex models. In VIRTUAL the federated network of the server and the clients is treated as a star-shaped Bayesian network, and learning is performed on the network using approximated variational inference. We show that this method is effective on real-world federated datasets, outperforming the current state-of-the-art for federated learning.
Tasks	Multi-Task Learning
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06268v1
PDF	https://arxiv.org/pdf/1906.06268v1.pdf
PWC	https://paperswithcode.com/paper/variational-federated-multi-task-learning
Repo	https://github.com/lucori/virtual
Framework	tf

Visualization of Convolutional Neural Networks for Monocular Depth Estimation


Title	Visualization of Convolutional Neural Networks for Monocular Depth Estimation
Authors	Junjie Hu, Yan Zhang, Takayuki Okatani
Abstract	Recently, convolutional neural networks (CNNs) have shown great success on the task of monocular depth estimation. A fundamental yet unanswered question is how CNNs can infer depth from a single image. Toward answering this question, we consider visualization of inference of a CNN by identifying the pixels of an input image that are relevant to depth estimation. We formulate it as an optimization problem of identifying the smallest number of image pixels from which the CNN can estimate a depth map with the minimum difference from the estimate from the entire image. To cope with the difficulty of optimizing through a deep CNN, we propose to use another network to predict those relevant image pixels in a forward computation. In our experiments, we first show the effectiveness of this approach, and then apply it to different depth estimation networks on indoor and outdoor scene datasets. The results provide several findings that help exploration of the above question.
Tasks	Depth Estimation, Interpretable Machine Learning
Published	2019-04-06
URL	http://arxiv.org/abs/1904.03380v1
PDF	http://arxiv.org/pdf/1904.03380v1.pdf
PWC	https://paperswithcode.com/paper/visualization-of-convolutional-neural
Repo	https://github.com/JunjH/Visualizing-CNNs-for-monocular-depth-estimation
Framework	pytorch

Universal Self-Attention Network for Graph Classification


Title	Universal Self-Attention Network for Graph Classification
Authors	Dai Quoc Nguyen, Tu Dinh Nguyen, Dinh Phung
Abstract	Existing graph neural network-based models have mainly been biased towards a supervised training setting, and they often share common limitations in exploiting potential dependencies among nodes. To this end, we present U2GNN, a novel embedding model leveraging the strength of the recently introduced universal self-attention network (Dehghani et al., 2019), to learn low-dimensional embeddings of graphs which can be used for graph classification. In particular, given an input graph, U2GNN first applies a self-attention computation, which is then followed by a recurrent transition to iteratively memorize its attention on vector representations of each node and its neighbors across each iteration. Thus, U2GNN can address the limitations in the existing models to produce plausible node embeddings whose sum is the final embedding of the whole graph. Experimental results in both supervised and unsupervised training settings show that our U2GNN produces new state-of-the-art performances on a range of well-known benchmark datasets for the graph classification task. To the best of our knowledge, this is the first work showing that an unsupervised model can outperform supervised models by a large margin.
Tasks	Graph Classification, Graph Embedding
Published	2019-09-26
URL	https://arxiv.org/abs/1909.11855v4
PDF	https://arxiv.org/pdf/1909.11855v4.pdf
PWC	https://paperswithcode.com/paper/unsupervised-universal-self-attention-network
Repo	https://github.com/daiquocnguyen/U2GNN
Framework	tf

Neural Approximate Dynamic Programming for On-Demand Ride-Pooling


Title	Neural Approximate Dynamic Programming for On-Demand Ride-Pooling
Authors	Sanket Shah, Meghna Lowalekar, Pradeep Varakantham
Abstract	On-demand ride-pooling (e.g., UberPool) has recently become popular because of its ability to lower costs for passengers while simultaneously increasing revenue for drivers and aggregation companies. Unlike in Taxi on Demand (ToD) services – where a vehicle is only assigned one passenger at a time – in on-demand ride-pooling, each (possibly partially filled) vehicle can be assigned a group of passenger requests with multiple different origin and destination pairs. To ensure near real-time response, existing solutions to the real-time ride-pooling problem are myopic in that they optimise the objective (e.g., maximise the number of passengers served) for the current time step without considering its effect on future assignments. This is because even a myopic assignment in ride-pooling involves considering which combinations of passenger requests can be assigned to vehicles, which adds a layer of combinatorial complexity to the ToD problem. A popular approach that addresses the limitations of myopic assignments in ToD problems is Approximate Dynamic Programming (ADP). However, existing ADP methods for ToD can only handle Linear Program (LP) based assignments, while the assignment problem in ride-pooling requires an Integer Linear Program (ILP) with bad LP relaxations. To this end, our key technical contribution is in providing a general ADP method that can learn from ILP-based assignments. Additionally, we handle the extra combinatorial complexity from combinations of passenger requests by using a Neural Network based approximate value function and show a connection to Deep Reinforcement Learning that allows us to learn this value function with increased stability and sample-efficiency. We show that our approach outperforms past approaches on a real-world dataset by up to 16%, a significant improvement in city-scale transportation problems.
Tasks
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08842v1
PDF	https://arxiv.org/pdf/1911.08842v1.pdf
PWC	https://paperswithcode.com/paper/neural-approximate-dynamic-programming-for-on
Repo	https://github.com/sanketkshah/NeurADP-for-Ride-Pooling
Framework	none

Full-Gradient Representation for Neural Network Visualization


Title	Full-Gradient Representation for Neural Network Visualization
Authors	Suraj Srinivas, Francois Fleuret
Abstract	We introduce a new tool for interpreting neural net responses, namely full-gradients, which decomposes the neural net response into input sensitivity and per-neuron sensitivity components. This is the first proposed representation which satisfies two key properties: completeness and weak dependence, which provably cannot be satisfied by any saliency map-based interpretability method. For convolutional nets, we also propose an approximate saliency map representation, called FullGrad, obtained by aggregating the full-gradient components. We experimentally evaluate the usefulness of FullGrad in explaining model behaviour with two quantitative tests: pixel perturbation and remove-and-retrain. Our experiments reveal that our method explains model behaviour correctly, and more comprehensively than other methods in the literature. Visual inspection also reveals that our saliency maps are sharper and more tightly confined to object regions than other methods.
Tasks	Interpretable Machine Learning
Published	2019-05-02
URL	https://arxiv.org/abs/1905.00780v4
PDF	https://arxiv.org/pdf/1905.00780v4.pdf
PWC	https://paperswithcode.com/paper/full-jacobian-representation-of-neural
Repo	https://github.com/idiap/fullgrad-saliency
Framework	pytorch
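
One way to get a feel for the decomposition into input sensitivity and per-neuron (bias) sensitivity is to check it on a toy ReLU network. The sketch below is an illustration under assumptions, not the paper's implementation: it uses a hand-written one-hidden-layer network, analytic gradients, and made-up weights, and verifies that the output equals the input-gradient term plus the bias-gradient terms.

```python
from typing import List, Tuple


def full_gradient_terms(
    x: List[float],
    W1: List[List[float]],
    b1: List[float],
    w2: List[float],
    b2: float,
) -> Tuple[float, float, float]:
    """Return (output, input-gradient term, bias-gradient term) for
    f(x) = w2 . relu(W1 x + b1) + b2."""
    # Forward pass: pre-activations and the ReLU on/off pattern.
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    mask = [1.0 if p > 0 else 0.0 for p in pre]
    output = sum(w * m * p for w, m, p in zip(w2, mask, pre)) + b2

    # Analytic gradients of the output w.r.t. the input and the hidden biases.
    grad_x = [
        sum(w2[j] * mask[j] * W1[j][i] for j in range(len(W1)))
        for i in range(len(x))
    ]
    grad_b1 = [w * m for w, m in zip(w2, mask)]

    input_term = sum(g * xi for g, xi in zip(grad_x, x))
    # The gradient w.r.t. the output bias b2 is 1, so it contributes b2 itself.
    bias_term = sum(g * b for g, b in zip(grad_b1, b1)) + b2
    return output, input_term, bias_term


# Toy weights chosen only for illustration.
out, input_term, bias_term = full_gradient_terms(
    x=[1.0, -2.0],
    W1=[[1.0, 0.5], [-1.0, 2.0]],
    b1=[0.5, 1.0],
    w2=[2.0, -3.0],
    b2=0.25,
)
print(out, input_term + bias_term)  # both values are 1.25
```

In this example the input-gradient term alone is 0 while the output is 1.25, so a map built only from input gradients would miss the whole response; the bias terms account for the rest.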

CAMEL: A Weakly Supervised Learning Framework for Histopathology Image Segmentation


Title	CAMEL: A Weakly Supervised Learning Framework for Histopathology Image Segmentation
Authors	Gang Xu, Zhigang Song, Zhuo Sun, Calvin Ku, Zhe Yang, Cancheng Liu, Shuhao Wang, Jianpeng Ma, Wei Xu
Abstract	Histopathology image analysis plays a critical role in cancer diagnosis and treatment. To automatically segment the cancerous regions, fully supervised segmentation algorithms require labor-intensive and time-consuming labeling at the pixel level. In this research, we propose CAMEL, a weakly supervised learning framework for histopathology image segmentation using only image-level labels. Using multiple instance learning (MIL)-based label enrichment, CAMEL splits the image into latticed instances and automatically generates instance-level labels. After label enrichment, the instance-level labels are further assigned to the corresponding pixels, producing the approximate pixel-level labels and making fully supervised training of segmentation models possible. CAMEL achieves comparable performance with the fully supervised approaches in both instance-level classification and pixel-level segmentation on CAMELYON16 and a colorectal adenoma dataset. Moreover, the generality of the automatic labeling methodology may benefit future weakly supervised learning studies for histopathology image analysis.
Tasks	Multiple Instance Learning, Semantic Segmentation
Published	2019-08-28
URL	https://arxiv.org/abs/1908.10555v1
PDF	https://arxiv.org/pdf/1908.10555v1.pdf
PWC	https://paperswithcode.com/paper/camel-a-weakly-supervised-learning-framework
Repo	https://github.com/ThoroughImages/CAMEL
Framework	none

Action Robust Reinforcement Learning and Applications in Continuous Control


Title	Action Robust Reinforcement Learning and Applications in Continuous Control
Authors	Chen Tessler, Yonathan Efroni, Shie Mannor
Abstract	A policy is said to be robust if it maximizes the reward while considering a bad, or even adversarial, model. In this work we formalize two new criteria of robustness to action uncertainty. Specifically, we consider two scenarios in which the agent attempts to perform an action $a$, and (i) with probability $\alpha$, an alternative adversarial action $\bar a$ is taken, or (ii) an adversary adds a perturbation to the selected action in the case of continuous action space. We show that our criteria are related to common forms of uncertainty in robotics domains, such as the occurrence of abrupt forces, and suggest algorithms in the tabular case. Building on the suggested algorithms, we generalize our approach to deep reinforcement learning (DRL) and provide extensive experiments in the various MuJoCo domains. Our experiments show that not only does our approach produce robust policies, but it also improves the performance in the absence of perturbations. This generalization indicates that action-robustness can be thought of as implicit regularization in RL problems.
Tasks	Continuous Control
Published	2019-01-26
URL	https://arxiv.org/abs/1901.09184v2
PDF	https://arxiv.org/pdf/1901.09184v2.pdf
PWC	https://paperswithcode.com/paper/action-robust-reinforcement-learning-and
Repo	https://github.com/icml2019-anonymous-author/Action-Robust-Reinforcement-Learning
Framework	pytorch
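
The two action-uncertainty scenarios in the abstract can be written down as small helper functions. This is a sketch of the environment-side mechanics only, with hypothetical function names; it does not implement the paper's training algorithms, and the additive form of the perturbation in scenario (ii) is the simplest reading of "adds a perturbation".

```python
import random
from typing import List, Sequence, TypeVar

A = TypeVar("A")


def probabilistic_action(agent_action: A, adversary_action: A, alpha: float,
                         rng: random.Random) -> A:
    """Scenario (i): with probability alpha the adversary's action is executed."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return adversary_action if rng.random() < alpha else agent_action


def perturbed_action(agent_action: Sequence[float],
                     perturbation: Sequence[float]) -> List[float]:
    """Scenario (ii): the adversary adds a perturbation to a continuous action."""
    return [a + d for a, d in zip(agent_action, perturbation)]


rng = random.Random(0)
executed = [probabilistic_action("agent", "adversary", 0.2, rng) for _ in range(10_000)]
print(executed.count("adversary") / len(executed))  # close to alpha = 0.2
```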

Graph WaveNet for Deep Spatial-Temporal Graph Modeling


Title	Graph WaveNet for Deep Spatial-Temporal Graph Modeling
Authors	Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Chengqi Zhang
Abstract	Spatial-temporal graph modeling is an important task to analyze the spatial relations and temporal trends of components in a system. Existing approaches mostly capture the spatial dependency on a fixed graph structure, assuming that the underlying relation between entities is pre-determined. However, the explicit graph structure (relation) does not necessarily reflect the true dependency, and genuine relations may be missing due to incomplete connections in the data. Furthermore, existing methods are ineffective at capturing temporal trends, as the RNNs or CNNs employed in these methods cannot capture long-range temporal sequences. To overcome these limitations, we propose in this paper a novel graph neural network architecture, Graph WaveNet, for spatial-temporal graph modeling. By developing a novel adaptive dependency matrix and learning it through node embedding, our model can precisely capture the hidden spatial dependency in the data. With a stacked dilated 1D convolution component whose receptive field grows exponentially as the number of layers increases, Graph WaveNet is able to handle very long sequences. These two components are integrated seamlessly in a unified framework and the whole framework is learned in an end-to-end manner. Experimental results on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate the superior performance of our algorithm.
Tasks	Traffic Prediction
Published	2019-05-31
URL	https://arxiv.org/abs/1906.00121v1
PDF	https://arxiv.org/pdf/1906.00121v1.pdf
PWC	https://paperswithcode.com/paper/190600121
Repo	https://github.com/nnzhan/Graph-WaveNet
Framework	pytorch
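
The claim that the receptive field of stacked dilated 1D convolutions grows exponentially with depth can be checked with a short calculation. The sketch assumes stride-1 convolutions with kernel size 2 and dilations that double at each layer (1, 2, 4, ...); the abstract does not state the exact schedule, so these values are illustrative.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field (in time steps) of stacked stride-1 dilated 1D convolutions."""
    return 1 + (kernel_size - 1) * sum(dilations)


for num_layers in range(1, 6):
    dilations = [2 ** i for i in range(num_layers)]  # assumed doubling schedule
    print(num_layers, receptive_field(2, dilations))
# prints:
# 1 2
# 2 4
# 3 8
# 4 16
# 5 32
```

With a constant dilation of 1 the same five layers would cover only 6 time steps, which is why dilation is what lets a shallow stack reach long sequences.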

Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction


Title	Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction
Authors	Qinyuan Ye, Liyuan Liu, Maosen Zhang, Xiang Ren
Abstract	In recent years there has been a surge of interest in applying distant supervision (DS) to automatically generate training data for relation extraction (RE). In this paper, we study the problem of what limits the performance of DS-trained neural models, conduct thorough analyses, and identify a factor that can influence the performance greatly: shifted label distribution. Specifically, we found this problem commonly exists in real-world DS datasets, and without special handling, typical DS-RE models cannot automatically adapt to this shift, thus achieving deteriorated performance. To further validate our intuition, we develop a simple yet effective adaptation method for DS-trained models, bias adjustment, which updates models learned over the source domain (i.e., DS training set) with a label distribution estimated on the target domain (i.e., test set). Experiments demonstrate that bias adjustment achieves consistent performance gains on DS-trained models, especially on neural models, with an up to 23% relative F1 improvement, which verifies our assumptions. Our code and data can be found at https://github.com/INK-USC/shifted-label-distribution.
Tasks	Relation Extraction
Published	2019-04-19
URL	https://arxiv.org/abs/1904.09331v2
PDF	https://arxiv.org/pdf/1904.09331v2.pdf
PWC	https://paperswithcode.com/paper/190409331
Repo	https://github.com/INK-USC/shifted-label-distribution
Framework	pytorch
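
The abstract does not spell out how bias adjustment updates a model, so the following is only a sketch of a standard prior-shift correction that fits the description: each class logit is shifted by the log ratio of the target-domain label prior to the source-domain label prior. The function names and the example priors are hypothetical, and the paper's exact procedure may differ.

```python
import math
from typing import List, Sequence


def adjust_logits(logits: Sequence[float], source_prior: Sequence[float],
                  target_prior: Sequence[float]) -> List[float]:
    """Shift each class logit by log(target_prior / source_prior)."""
    return [z + math.log(t / s) for z, s, t in zip(logits, source_prior, target_prior)]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


logits = [1.0, 0.0]        # model trained where class 0 dominates
source_prior = [0.9, 0.1]  # label distribution of the DS training set (made up)
target_prior = [0.5, 0.5]  # label distribution estimated on the test set (made up)

adjusted = adjust_logits(logits, source_prior, target_prior)
print(argmax(logits), argmax(adjusted))  # prints "0 1"
```

Because only the bias terms move, the learned features are untouched; the correction just removes the preference for classes that were over-represented in the training labels.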

Hierarchical Multi-Task Natural Language Understanding for Cross-domain Conversational AI: HERMIT NLU


Title	Hierarchical Multi-Task Natural Language Understanding for Cross-domain Conversational AI: HERMIT NLU
Authors	Andrea Vanzo, Emanuele Bastianelli, Oliver Lemon
Abstract	We present a new neural architecture for wide-coverage Natural Language Understanding in Spoken Dialogue Systems. We develop a hierarchical multi-task architecture, which delivers a multi-layer representation of sentence meaning (i.e., Dialogue Acts and Frame-like structures). The architecture is a hierarchy of self-attention mechanisms and BiLSTM encoders followed by CRF tagging layers. We describe a variety of experiments, showing that our approach obtains promising results on a dataset annotated with Dialogue Acts and Frame Semantics. Moreover, we demonstrate its applicability to a different, publicly available NLU dataset annotated with domain-specific intents and corresponding semantic roles, providing overall performance higher than state-of-the-art tools such as RASA, Dialogflow, LUIS, and Watson. For example, we show an average 4.45% improvement in entity tagging F-score over Rasa, Dialogflow and LUIS.
Tasks	Spoken Dialogue Systems
Published	2019-10-02
URL	https://arxiv.org/abs/1910.00912v1
PDF	https://arxiv.org/pdf/1910.00912v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-multi-task-natural-language
Repo	https://github.com/RasaHQ/rasa
Framework	none