February 2, 2020

2932 words 14 mins read

Paper Group AWR 42

Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement

Title Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement
Authors Ting-En Lin, Hua Xu, Hanlei Zhang
Abstract Identifying new user intents is an essential task in dialogue systems. However, it is hard to get satisfying clustering results since the definition of intents is strongly guided by prior knowledge. Existing methods incorporate prior knowledge through intensive feature engineering, which not only leads to overfitting but also makes the results sensitive to the number of clusters. In this paper, we propose constrained deep adaptive clustering with cluster refinement (CDAC+), an end-to-end clustering method that can naturally incorporate pairwise constraints as prior knowledge to guide the clustering process. Moreover, we refine the clusters by forcing the model to learn from high-confidence assignments. After eliminating low-confidence assignments, our approach is surprisingly insensitive to the number of clusters. Experimental results on three benchmark datasets show that our method yields significant improvements over strong baselines.
Tasks Feature Engineering
Published 2019-11-20
URL https://arxiv.org/abs/1911.08891v1
PDF https://arxiv.org/pdf/1911.08891v1.pdf
PWC https://paperswithcode.com/paper/discovering-new-intents-via-constrained-deep
Repo https://github.com/thuiar/CDAC-plus
Framework pytorch
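
A minimal sketch of the cluster-refinement idea described in the abstract above: soft cluster assignments are sharpened into an auxiliary target distribution (in the style of DEC-like self-training) and only high-confidence samples are kept. The function name and the confidence threshold are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def refine_targets(q, conf_threshold=0.95):
    """Sharpen soft cluster assignments q (n_samples x n_clusters) into an
    auxiliary target distribution and keep only high-confidence rows.
    The squaring-and-renormalising step follows the common DEC-style recipe;
    the confidence threshold is an illustrative assumption."""
    weight = q ** 2 / q.sum(axis=0)                 # emphasise confident assignments
    p = weight / weight.sum(axis=1, keepdims=True)  # renormalise per sample
    high_conf = q.max(axis=1) >= conf_threshold     # mask of confident samples
    return p, high_conf

# toy usage: three samples, two clusters
q = np.array([[0.97, 0.03],
              [0.55, 0.45],
              [0.02, 0.98]])
p, mask = refine_targets(q)
print(p.round(3), mask)  # the ambiguous middle sample is filtered out
```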

3D Dilated Multi-Fiber Network for Real-time Brain Tumor Segmentation in MRI

Title 3D Dilated Multi-Fiber Network for Real-time Brain Tumor Segmentation in MRI
Authors Chen Chen, Xiaopeng Liu, Meng Ding, Junfeng Zheng, Jiangyun Li
Abstract Brain tumor segmentation plays a pivotal role in medical image processing. In this work, we aim to segment brain MRI volumes. 3D convolutional neural networks (CNNs) such as 3D U-Net and V-Net, which employ 3D convolutions to capture the correlation between adjacent slices, have achieved impressive segmentation results. However, these 3D CNN architectures come with high computational overhead due to multiple layers of 3D convolutions, which may make them prohibitive for practical large-scale applications. To this end, we propose a highly efficient 3D CNN to achieve real-time dense volumetric segmentation. The network leverages a 3D multi-fiber unit, which consists of an ensemble of lightweight 3D convolutional networks, to significantly reduce the computational cost. Moreover, 3D dilated convolutions are used to build multi-scale feature representations. Extensive experimental results on the BraTS-2018 challenge dataset show that the proposed architecture greatly reduces the computational cost while maintaining high accuracy for brain tumor segmentation. The source code can be found at https://github.com/China-LiuXiaopeng/BraTS-DMFNet
Tasks Brain Tumor Segmentation
Published 2019-04-06
URL https://arxiv.org/abs/1904.03355v5
PDF https://arxiv.org/pdf/1904.03355v5.pdf
PWC https://paperswithcode.com/paper/3d-dilated-multi-fiber-network-for-real-time
Repo https://github.com/China-LiuXiaopeng/BraTS-DMFNet
Framework pytorch
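
A minimal PyTorch sketch of the two ingredients named in the abstract: splitting channels into lightweight "fibers" via grouped 3D convolutions, and adding a dilated branch for multi-scale context. Channel counts, the group count, the dilation rate, and the residual wiring are illustrative assumptions, not the released DMFNet architecture.

```python
import torch
import torch.nn as nn

class DilatedMultiFiberUnit3D(nn.Module):
    """Grouped (multi-fiber) 3D conv block with a parallel dilated branch.
    All hyperparameters here are illustrative."""
    def __init__(self, channels=32, groups=4, dilation=2):
        super().__init__()
        self.fibers = nn.Conv3d(channels, channels, kernel_size=3,
                                padding=1, groups=groups, bias=False)
        self.dilated = nn.Conv3d(channels, channels, kernel_size=3,
                                 padding=dilation, dilation=dilation,
                                 groups=groups, bias=False)
        self.bn = nn.BatchNorm3d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.act(self.bn(self.fibers(x) + self.dilated(x)))
        return out + x  # residual connection

x = torch.randn(1, 32, 16, 16, 16)   # (batch, channels, depth, height, width)
print(DilatedMultiFiberUnit3D()(x).shape)
```

Grouped convolutions are what keep the FLOP count low: each "fiber" only mixes channels within its own group, while the dilated branch enlarges the receptive field at no extra cost in parameters per voxel.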

A topological data analysis based classification method for multiple measurements

Title A topological data analysis based classification method for multiple measurements
Authors Henri Riihimäki, Wojciech Chachólski, Jakob Theorell, Jan Hillert, Ryan Ramanujam
Abstract Machine learning models for repeated measurements are limited. Using topological data analysis (TDA), we present a classifier for repeated measurements which samples from the data space and builds a network graph based on the data topology. When applying this to two case studies, accuracy exceeds alternative models with additional benefits such as reporting data subsets with high purity along with feature values. For 300 examples of 3 tree species, the accuracy reached 80% after 30 datapoints, which was improved to 90% after increased sampling to 400 datapoints. Using data from 100 examples of each of 6 point processes, the classifier achieved 96.8% accuracy. In both datasets, the TDA classifier outperformed an alternative model. This algorithm and software can be beneficial for repeated measurement data common in biological sciences, as both an accurate classifier and a feature selection tool.
Tasks Feature Selection, Point Processes, Topological Data Analysis
Published 2019-04-05
URL http://arxiv.org/abs/1904.02971v1
PDF http://arxiv.org/pdf/1904.02971v1.pdf
PWC https://paperswithcode.com/paper/a-topological-data-analysis-based
Repo https://github.com/ryaram1/mmTDA
Framework none
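
The authors' classifier samples from the data space and builds a network graph from the data topology. As a loosely related, heavily simplified illustration of classifying repeated measurements through topological summaries, the sketch below computes persistence-based features with the ripser package (an assumption: it is not the tool used in the paper) and feeds them to an off-the-shelf classifier.

```python
import numpy as np
from ripser import ripser                      # assumption: ripser is installed
from sklearn.ensemble import RandomForestClassifier

def topological_features(measurements):
    """Summarise one subject's repeated measurements (n_points x n_features)
    by simple persistence statistics of its H0/H1 diagrams."""
    dgms = ripser(measurements)['dgms']
    feats = []
    for dgm in dgms:
        lifetimes = dgm[:, 1] - dgm[:, 0]
        lifetimes = lifetimes[np.isfinite(lifetimes)]    # drop the infinite H0 bar
        feats += [lifetimes.max(initial=0.0), lifetimes.sum()]
    return np.array(feats)

# toy data: 40 subjects, each with 30 repeated 2-D measurements
rng = np.random.default_rng(0)
X = np.stack([topological_features(rng.normal(loc=label, size=(30, 2)))
              for label in (0, 1) for _ in range(20)])
y = np.repeat([0, 1], 20)
clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.score(X, y))
```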

HopSkipJumpAttack: A Query-Efficient Decision-Based Attack

Title HopSkipJumpAttack: A Query-Efficient Decision-Based Attack
Authors Jianbo Chen, Michael I. Jordan, Martin J. Wainwright
Abstract The goal of a decision-based adversarial attack on a trained model is to generate adversarial examples based solely on observing output labels returned by the targeted model. We develop HopSkipJumpAttack, a family of algorithms based on a novel estimate of the gradient direction using binary information at the decision boundary. The proposed family includes both untargeted and targeted attacks optimized for $\ell_2$ and $\ell_\infty$ similarity metrics respectively. Theoretical analysis is provided for the proposed algorithms and the gradient direction estimate. Experiments show HopSkipJumpAttack requires significantly fewer model queries than Boundary Attack. It also achieves competitive performance in attacking several widely-used defense mechanisms. (HopSkipJumpAttack was named Boundary Attack++ in a previous version of the preprint.)
Tasks Adversarial Attack
Published 2019-04-03
URL https://arxiv.org/abs/1904.02144v4
PDF https://arxiv.org/pdf/1904.02144v4.pdf
PWC https://paperswithcode.com/paper/boundary-attack-query-efficient-decision
Repo https://github.com/Jianbo-Lab/HSJA
Framework tf
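
A minimal sketch of the gradient-direction estimate the abstract refers to: sample random perturbations around a point near the decision boundary, query only the predicted label, and average the perturbation directions weighted by whether each perturbed point is adversarial. The baseline subtraction and step size below follow the general recipe but are simplified assumptions, not the full HopSkipJumpAttack algorithm.

```python
import numpy as np

def estimate_gradient_direction(x_boundary, is_adversarial, n_queries=100, delta=0.1):
    """Monte-Carlo estimate of the gradient direction at a decision boundary,
    using only binary oracle outputs (label-only access).
    `is_adversarial(x)` returns True/False for a single input."""
    d = x_boundary.size
    u = np.random.randn(n_queries, d)
    u /= np.linalg.norm(u, axis=1, keepdims=True)          # unit perturbations
    phi = np.array([1.0 if is_adversarial(x_boundary + delta * ui) else -1.0
                    for ui in u])
    phi -= phi.mean()                                       # baseline to reduce variance
    grad = (phi[:, None] * u).mean(axis=0)
    norm = np.linalg.norm(grad)
    return grad / norm if norm > 0 else grad

# toy oracle: linear classifier, "adversarial" means w.x > 0
w = np.array([1.0, -2.0, 0.5])
direction = estimate_gradient_direction(np.zeros(3), lambda x: w @ x > 0)
print(direction)  # should roughly align with w / ||w||
```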

Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts

Title Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts
Authors Sandro Pezzelle, Raquel Fernández
Abstract This work aims at modeling how the meaning of gradable adjectives of size ('big', 'small') can be learned from visually-grounded contexts. Inspired by cognitive and linguistic evidence showing that the use of these expressions relies on setting a threshold that is dependent on a specific context, we investigate the ability of multi-modal models in assessing whether an object is 'big' or 'small' in a given visual scene. In contrast with the standard computational approach that simplistically treats gradable adjectives as 'fixed' attributes, we pose the problem as relational: to be successful, a model has to consider the full visual context. By means of four main tasks, we show that state-of-the-art models (but not a relatively strong baseline) can learn the function subtending the meaning of size adjectives, though their performance is found to decrease while moving from simple to more complex tasks. Crucially, models fail in developing abstract representations of gradable adjectives that can be used compositionally.
Tasks
Published 2019-08-27
URL https://arxiv.org/abs/1908.10285v1
PDF https://arxiv.org/pdf/1908.10285v1.pdf
PWC https://paperswithcode.com/paper/is-the-red-square-big-malevic-modeling
Repo https://github.com/sandropezzelle/malevic
Framework none
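
As the abstract stresses, whether an object counts as "big" is relational: it depends on a threshold derived from the other objects in the scene. The toy function below only illustrates that idea (a threshold interpolated between the smallest and largest sizes in the visual context); it is not the threshold function used to generate the MALeViC data.

```python
def is_big(target_area, context_areas, k=0.75):
    """Relational 'big': the target is big if its area exceeds a threshold
    set between the smallest and largest areas in the scene.
    The interpolation weight k is an illustrative assumption."""
    threshold = min(context_areas) + k * (max(context_areas) - min(context_areas))
    return target_area > threshold

scene = [4, 9, 25, 36]           # areas of all objects in the visual context
print(is_big(25, scene))         # False: 25 falls below this context's threshold
print(is_big(36, scene))         # True: the largest object clears the threshold
```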

Collaborative Similarity Embedding for Recommender Systems

Title Collaborative Similarity Embedding for Recommender Systems
Authors Chih-Ming Chen, Chuan-Ju Wang, Ming-Feng Tsai, Yi-Hsuan Yang
Abstract We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation. In the proposed framework, we differentiate two types of proximity relations: direct proximity and k-th order neighborhood proximity. While learning from the former exploits direct user-item associations observable from the graph, learning from the latter makes use of implicit associations such as user-user similarities and item-item similarities, which can provide valuable information especially when the graph is sparse. Moreover, for improving scalability and flexibility, we propose a sampling technique that is specifically designed to capture the two types of proximity relations. Extensive experiments on eight benchmark datasets show that CSE yields significantly better performance than state-of-the-art recommendation methods.
Tasks Recommendation Systems, Representation Learning
Published 2019-02-17
URL http://arxiv.org/abs/1902.06188v2
PDF http://arxiv.org/pdf/1902.06188v2.pdf
PWC https://paperswithcode.com/paper/collaborative-similarity-embedding-for
Repo https://github.com/cnclabs/smore
Framework none
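
A minimal sketch of the two proximity relations distinguished in the abstract: direct user-item edges sampled from the bipartite graph, and neighborhood proximity obtained from short walks (here a user-item-user hop that yields user-user pairs). The sampler below is an illustrative assumption, not the CSE sampling scheme or its objective.

```python
import random

def sample_direct_pair(user_items):
    """Sample a (user, item) pair connected by an observed interaction."""
    user = random.choice(list(user_items))
    return user, random.choice(user_items[user])

def sample_second_order_pair(user_items, item_users):
    """Sample a (user, user) pair sharing at least one item: a 2-hop walk
    user -> item -> user that captures neighborhood proximity."""
    user, item = sample_direct_pair(user_items)
    return user, random.choice(item_users[item])

# toy bipartite graph
user_items = {'u1': ['i1', 'i2'], 'u2': ['i2', 'i3'], 'u3': ['i3']}
item_users = {'i1': ['u1'], 'i2': ['u1', 'u2'], 'i3': ['u2', 'u3']}
print(sample_direct_pair(user_items))
print(sample_second_order_pair(user_items, item_users))
```

Pairs of both kinds can then be fed to any embedding loss (e.g. a dot-product score with negative sampling); the second-order pairs are what supply signal when the interaction graph is sparse.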

The Shape of Data: Intrinsic Distance for Data Distributions

Title The Shape of Data: Intrinsic Distance for Data Distributions
Authors Anton Tsitsulin, Marina Munkhoeva, Davide Mottin, Panagiotis Karras, Alex Bronstein, Ivan Oseledets, Emmanuel Müller
Abstract The ability to represent and compare machine learning models is crucial in order to quantify subtle model changes, evaluate generative models, and gather insights on neural network architectures. Existing techniques for comparing data distributions focus on global data properties such as mean and covariance; in that sense, they are extrinsic and uni-scale. We develop a first-of-its-kind intrinsic and multi-scale method for characterizing and comparing data manifolds, using a lower-bound of the spectral variant of the Gromov-Wasserstein inter-manifold distance, which compares all data moments. In a thorough experimental study, we demonstrate that our method effectively discerns the structure of data manifolds even on unaligned data of different dimensionalities; moreover, we showcase its efficacy in evaluating the quality of generative models.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.11141v2
PDF https://arxiv.org/pdf/1905.11141v2.pdf
PWC https://paperswithcode.com/paper/intrinsic-multi-scale-evaluation-of
Repo https://github.com/xgfs/msid
Framework none
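
A heavily simplified sketch of the intrinsic, multi-scale idea: approximate each dataset's manifold by a k-nearest-neighbour graph, summarise it by the heat-kernel trace of the graph Laplacian over a range of scales, and compare the resulting descriptors. The descriptor and the distance below are illustrative assumptions; the paper's score is a lower bound on a spectral Gromov-Wasserstein inter-manifold distance.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.neighbors import kneighbors_graph

def heat_trace_descriptor(X, k=5, ts=np.logspace(-1, 1, 20)):
    """Multi-scale heat-kernel trace of a dataset's kNN-graph Laplacian."""
    A = kneighbors_graph(X, n_neighbors=k, mode='connectivity')
    A = 0.5 * (A + A.T)                                   # symmetrise
    eigvals = np.linalg.eigvalsh(laplacian(A, normed=True).toarray())
    return np.array([np.exp(-t * eigvals).sum() for t in ts])

def intrinsic_distance(X, Y):
    """Compare two datasets (possibly of different dimensionality)
    through their heat-trace descriptors."""
    return np.abs(heat_trace_descriptor(X) - heat_trace_descriptor(Y)).max()

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 3))
B = rng.normal(size=(200, 8))          # a different ambient dimension is fine
print(intrinsic_distance(A, B))
```

Because the descriptor depends only on the spectrum of each dataset's own graph, no alignment between the two datasets is needed, which is what makes the comparison intrinsic.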

FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network

Title FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network
Authors Jonah Philion
Abstract The search for predictive models that generalize to the long tail of sensor inputs is the central difficulty when developing data-driven models for autonomous vehicles. In this paper, we use lane detection to study modeling and training techniques that yield better performance on real world test drives. On the modeling side, we introduce a novel fully convolutional model of lane detection that learns to decode lane structures instead of delegating structure inference to post-processing. In contrast to previous works, our convolutional decoder is able to represent an arbitrary number of lanes per image, preserves the polyline representation of lanes without reducing lanes to polynomials, and draws lanes iteratively without requiring the computational and temporal complexity of recurrent neural networks. Because our model includes an estimate of the joint distribution of neighboring pixels belonging to the same lane, our formulation includes a natural and computationally cheap definition of uncertainty. On the training side, we demonstrate a simple yet effective approach to adapt the model to new environments using unsupervised style transfer. By training FastDraw to make predictions of lane structure that are invariant to low-level stylistic differences between images, we achieve strong performance at test time in weather and lighting conditions that deviate substantially from those of the annotated datasets that are publicly available. We quantitatively evaluate our approach on the CVPR 2017 TuSimple lane marking challenge, the difficult CULane dataset, and a small labeled dataset of our own, and achieve competitive accuracy while running at 90 FPS.
Tasks Autonomous Vehicles, Lane Detection, Style Transfer
Published 2019-05-10
URL https://arxiv.org/abs/1905.04354v2
PDF https://arxiv.org/pdf/1905.04354v2.pdf
PWC https://paperswithcode.com/paper/fastdraw-addressing-the-long-tail-of-lane
Repo https://github.com/jonahthelion/FastDraw
Framework none
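
A toy sketch of the "draw lanes iteratively without an RNN" idea: starting from a lane's bottom pixel, the decoder predicts at each pixel a distribution over the horizontal offset to the lane pixel in the row above, and the polyline is traced greedily. The offset vocabulary, tensor shapes, and greedy decoding are illustrative simplifications, not the FastDraw decoder.

```python
import numpy as np

OFFSETS = np.array([-2, -1, 0, 1, 2])   # candidate horizontal steps per row

def decode_lane(offset_logits, start_col):
    """Greedily trace a lane polyline from the bottom row upward.
    offset_logits: (n_rows, n_cols, len(OFFSETS)) per-pixel scores; illustrative."""
    n_rows, n_cols, _ = offset_logits.shape
    col, polyline = start_col, [(n_rows - 1, start_col)]
    for row in range(n_rows - 2, -1, -1):            # walk from bottom to top
        best = offset_logits[row + 1, col].argmax()  # offset predicted at current pixel
        col = int(np.clip(col + OFFSETS[best], 0, n_cols - 1))
        polyline.append((row, col))
    return polyline

logits = np.random.randn(8, 16, 5)                    # toy predictions
print(decode_lane(logits, start_col=8))
```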

Short-distance commuters in the smart city

Title Short-distance commuters in the smart city
Authors Francisco Benita, Garvit Bansal, Georgios Piliouras, Bige Tunçer
Abstract This study models and examines commuters' preferences for short-distance transportation modes, namely walking, taking a bus, or riding a metro. It uses a unique dataset from a large-scale field experiment in Singapore that provides rich information about the behavior of tens of thousands of commuters. In contrast to the standard approach, this work does not rely on survey data. Instead, the chosen transportation modes are identified by processing raw data (latitude, longitude, timestamp). The approach exploits the information generated by the city's smart transportation system, which makes it possible to obtain granular and nearly real-time data. Novel algorithms are proposed to generate proxies for walkability and public transport attributes. The empirical results of the case study suggest that commuters do not differentiate between public transport choices (bus and metro); therefore, possible nested structures for the public transport modes are rejected.
Tasks
Published 2019-02-16
URL http://arxiv.org/abs/1902.08028v1
PDF http://arxiv.org/pdf/1902.08028v1.pdf
PWC https://paperswithcode.com/paper/short-distance-commuters-in-the-smart-city
Repo https://github.com/Garvit244/Shapefile_to_Network
Framework none
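
The study infers transportation modes from raw (latitude, longitude, timestamp) traces rather than from survey answers. The sketch below is only a crude speed-based heuristic to illustrate what such mode identification can look like; the paper's algorithms additionally use the transit network and walkability proxies, and the thresholds here are assumptions.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def guess_mode(trace):
    """trace: list of (lat, lon, unix_timestamp). Speed thresholds are illustrative."""
    dist = sum(haversine_km(p[:2], q[:2]) for p, q in zip(trace, trace[1:]))
    hours = (trace[-1][2] - trace[0][2]) / 3600
    speed = dist / hours if hours > 0 else 0.0
    if speed < 6:
        return 'walking'
    return 'bus' if speed < 25 else 'metro'

trip = [(1.3000, 103.8000, 0), (1.3050, 103.8080, 900), (1.3110, 103.8150, 1800)]
print(guess_mode(trip))   # ~4 km/h over half an hour -> 'walking'
```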

Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters

Title Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters
Authors Vitor Cerqueira, Luis Torgo, Carlos Soares
Abstract Time series forecasting is one of the most active research topics. Machine learning methods have been increasingly adopted to solve these predictive tasks. However, in a recent work, these methods were shown to systematically present lower predictive performance relative to simple statistical methods. In this work, we counter those results: we show that they are only valid under an extremely low sample size. Using a learning curve method, we find that machine learning methods improve their relative predictive performance as the sample size grows. The code to reproduce the experiments is available at https://github.com/vcerqueira/MLforForecasting.
Tasks Time Series, Time Series Forecasting
Published 2019-09-29
URL https://arxiv.org/abs/1909.13316v1
PDF https://arxiv.org/pdf/1909.13316v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-vs-statistical-methods-for
Repo https://github.com/vcerqueira/MLforForecasting
Framework none
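
A minimal sketch of the learning-curve comparison described above: train a machine learning model on increasingly large slices of a series and track its error against a simple baseline. The synthetic series, the random forest on lagged features, and the last-value baseline are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def embed(series, n_lags=5):
    """Turn a univariate series into a lagged (X, y) supervised problem."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    return X, series[n_lags:]

rng = np.random.default_rng(1)
series = np.sin(np.arange(3000) / 10) + 0.1 * rng.normal(size=3000)
X, y = embed(series)
X_test, y_test = X[-500:], y[-500:]          # held-out tail of the series

# learning curve: grow the training sample and track both approaches
for n in (50, 200, 1000, 2000):
    ml = RandomForestRegressor(n_estimators=50, random_state=0).fit(X[:n], y[:n])
    mae_ml = np.abs(ml.predict(X_test) - y_test).mean()
    mae_naive = np.abs(X_test[:, -1] - y_test).mean()   # last-value baseline
    print(f"n={n:5d}  ML MAE={mae_ml:.3f}  naive MAE={mae_naive:.3f}")
```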

LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking

Title LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking
Authors Bernie Wang, Virginia Wu, Bichen Wu, Kurt Keutzer
Abstract LiDAR (Light Detection And Ranging) is an essential and widely adopted sensor for autonomous vehicles, particularly for those vehicles operating at higher levels (L4-L5) of autonomy. Recent work has demonstrated the promise of deep-learning approaches for LiDAR-based detection. However, deep-learning algorithms are extremely data hungry, requiring large amounts of labeled point-cloud data for training and evaluation. Annotating LiDAR point cloud data is challenging due to the following issues: 1) A LiDAR point cloud is usually sparse and has low resolution, making it difficult for human annotators to recognize objects. 2) Compared to annotation on 2D images, the operation of drawing 3D bounding boxes or even point-wise labels on LiDAR point clouds is more complex and time-consuming. 3) LiDAR data are usually collected in sequences, so consecutive frames are highly correlated, leading to repeated annotations. To tackle these challenges, we propose LATTE, an open-sourced annotation tool for LiDAR point clouds. LATTE features the following innovations: 1) Sensor fusion: We utilize image-based detection algorithms to automatically pre-label a calibrated image, and transfer the labels to the point cloud. 2) One-click annotation: Instead of drawing 3D bounding boxes or point-wise labels, we simplify the annotation to just one click on the target object, and automatically generate the bounding box for the target. 3) Tracking: we integrate tracking into sequence annotation such that we can transfer labels from one frame to subsequent ones and therefore significantly reduce repeated labeling. Experiments show the proposed features accelerate the annotation speed by 6.2x and significantly improve label quality with 23.6% and 2.2% higher instance-level precision and recall, and 2.0% higher bounding box IoU. LATTE is open-sourced at https://github.com/bernwang/latte.
Tasks Autonomous Vehicles, Sensor Fusion
Published 2019-04-19
URL http://arxiv.org/abs/1904.09085v1
PDF http://arxiv.org/pdf/1904.09085v1.pdf
PWC https://paperswithcode.com/paper/latte-accelerating-lidar-point-cloud
Repo https://github.com/bernwang/latte
Framework none
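
A minimal sketch of the sensor-fusion step described in the abstract: project LiDAR points into the calibrated camera and pre-label the points that fall inside a 2D image detection. The calibration matrices, box format, and function name are illustrative assumptions, not LATTE's implementation.

```python
import numpy as np

def label_points_from_2d_box(points_lidar, box_2d, T_cam_lidar, K):
    """Transfer a 2D image detection to LiDAR points via calibration.
    points_lidar: (N, 3) xyz, box_2d: (x_min, y_min, x_max, y_max),
    T_cam_lidar: (4, 4) extrinsics, K: (3, 3) camera intrinsics."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T)[:3]                 # into the camera frame
    in_front = pts_cam[2] > 0                             # keep points ahead of the camera
    uv = (K @ pts_cam)[:2] / pts_cam[2]                   # perspective projection
    x_min, y_min, x_max, y_max = box_2d
    in_box = (uv[0] >= x_min) & (uv[0] <= x_max) & (uv[1] >= y_min) & (uv[1] <= y_max)
    return in_front & in_box                              # mask of pre-labelled points

# toy calibration: identity extrinsics, simple pinhole intrinsics
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
pts = np.array([[0.0, 0.0, 10.0], [5.0, 0.0, 10.0]])
print(label_points_from_2d_box(pts, (300, 220, 340, 260), np.eye(4), K))  # [True, False]
```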

Adaptively Preconditioned Stochastic Gradient Langevin Dynamics

Title Adaptively Preconditioned Stochastic Gradient Langevin Dynamics
Authors Chandrasekaran Anirudh Bhardwaj
Abstract Stochastic Gradient Langevin Dynamics injects isotropic gradient noise into SGD to help navigate pathological curvature in the loss landscape of deep networks. The isotropic nature of the noise leads to poor scaling, and adaptive methods based on higher-order curvature information, such as Fisher scoring, have been proposed to precondition the noise in order to achieve better convergence. In this paper, we describe an adaptive method to estimate the parameters of the noise and conduct experiments on well-known model architectures to show that the adaptively preconditioned SGLD method converges with the speed of adaptive first-order methods such as Adam and AdaGrad, while achieving generalization comparable to SGD on the test set.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.04324v2
PDF https://arxiv.org/pdf/1906.04324v2.pdf
PWC https://paperswithcode.com/paper/adaptively-preconditioned-stochastic-gradient
Repo https://github.com/Anirudhsekar96/Noisy_SGD
Framework pytorch
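
A minimal sketch of a preconditioned SGLD step: both the gradient step and the injected Gaussian noise are scaled by a diagonal preconditioner estimated from recent gradients. The RMSProp-style preconditioner below is an illustrative stand-in for the adaptive noise-parameter estimate the paper describes, and all hyperparameters are assumptions.

```python
import numpy as np

def preconditioned_sgld_step(theta, grad, state, lr=1e-3, beta=0.99, eps=1e-8):
    """One adaptively preconditioned SGLD update (illustrative, RMSProp-style):
    precondition both the gradient step and the injected Langevin noise."""
    state['v'] = beta * state.get('v', np.zeros_like(theta)) + (1 - beta) * grad ** 2
    G = 1.0 / (np.sqrt(state['v']) + eps)                 # diagonal preconditioner
    noise = np.random.randn(*theta.shape) * np.sqrt(lr * G)
    return theta - 0.5 * lr * G * grad + noise, state

# toy usage on the quadratic loss 0.5 * ||theta||^2 (gradient = theta)
theta, state = np.ones(4), {}
for _ in range(100):
    theta, state = preconditioned_sgld_step(theta, grad=theta, state=state)
print(theta)
```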

DropEdge: Towards Deep Graph Convolutional Networks on Node Classification

Title DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
Authors Yu Rong, Wenbing Huang, Tingyang Xu, Junzhou Huang
Abstract \emph{Over-fitting} and \emph{over-smoothing} are two main obstacles to developing deep Graph Convolutional Networks (GCNs) for node classification. In particular, over-fitting weakens the generalization ability on small datasets, while over-smoothing impedes model training by isolating output representations from the input features as network depth increases. This paper proposes DropEdge, a novel and flexible technique to alleviate both issues. At its core, DropEdge randomly removes a certain number of edges from the input graph at each training epoch, acting like a data augmenter and also a message passing reducer. Furthermore, we theoretically demonstrate that DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by it. More importantly, DropEdge is a general technique that can be combined with many backbone models (e.g. GCN, ResGCN, GraphSAGE, and JKNet) for enhanced performance. Extensive experiments on several benchmarks verify that DropEdge consistently improves performance on a variety of both shallow and deep GCNs. The effect of DropEdge on preventing over-smoothing is empirically visualized and validated as well. Code is released at \url{https://github.com/DropEdge/DropEdge}.
Tasks Node Classification
Published 2019-07-25
URL https://arxiv.org/abs/1907.10903v4
PDF https://arxiv.org/pdf/1907.10903v4.pdf
PWC https://paperswithcode.com/paper/the-truly-deep-graph-convolutional-networks
Repo https://github.com/DropEdge/DropEdge
Framework pytorch
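
A minimal sketch of the core DropEdge operation: resample a random subset of edges before each training epoch. The released implementation additionally renormalises the resulting adjacency matrix; the COO edge-list format and the drop rate below are illustrative assumptions.

```python
import torch

def drop_edge(edge_index, drop_rate=0.2):
    """Randomly remove a fraction of edges from the input graph.
    edge_index: (2, num_edges) COO edge list, as used by common GNN libraries."""
    num_edges = edge_index.size(1)
    keep = torch.rand(num_edges) >= drop_rate       # resampled on every call/epoch
    return edge_index[:, keep]

# toy graph with 5 edges
edge_index = torch.tensor([[0, 1, 2, 3, 4],
                           [1, 2, 3, 4, 0]])
for epoch in range(2):
    print(drop_edge(edge_index))                    # a different subgraph each epoch
```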

Learning to Optimize in Swarms

Title Learning to Optimize in Swarms
Authors Yue Cao, Tianlong Chen, Zhangyang Wang, Yang Shen
Abstract Learning to optimize has emerged as a powerful framework for various optimization and machine learning tasks. Current meta-optimizers of this kind often learn in the space of continuous optimization algorithms that are point-based and uncertainty-unaware. To overcome these limitations, we propose a meta-optimizer that learns in the algorithmic space of both point-based and population-based optimization algorithms. The meta-optimizer targets a meta-loss function consisting of both cumulative regret and entropy. Specifically, we learn and interpret the update formula through a population of LSTMs embedded with sample- and feature-level attention. Meanwhile, we estimate the posterior directly over the global optimum and use an uncertainty measure to help guide the learning process. Empirical results on non-convex test functions and the protein-docking application demonstrate that this new meta-optimizer outperforms existing competitors.
Tasks
Published 2019-11-09
URL https://arxiv.org/abs/1911.03787v2
PDF https://arxiv.org/pdf/1911.03787v2.pdf
PWC https://paperswithcode.com/paper/learning-to-optimize-in-swarms
Repo https://github.com/Shen-Lab/LOIS
Framework tf
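
The abstract states that the meta-optimizer is trained against a meta-loss combining cumulative regret with an entropy term. The toy function below only illustrates that combination; the exact regret definition, the entropy estimate, and the weighting are assumptions rather than the paper's formulation.

```python
import numpy as np

def swarm_meta_loss(loss_history, entropy_history, entropy_weight=0.01):
    """Illustrative meta-loss for a learned optimizer: cumulative regret
    (here simply the sum of losses along the optimization trajectory)
    minus an entropy bonus that rewards keeping the population spread out."""
    cumulative_regret = float(np.sum(loss_history))
    return cumulative_regret - entropy_weight * float(np.sum(entropy_history))

# toy usage: losses of the best member over 5 steps, and posterior entropies
print(swarm_meta_loss([3.0, 2.1, 1.4, 1.1, 0.9], [1.6, 1.5, 1.3, 1.2, 1.0]))
```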

Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension

Title Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension
Authors Todor Mihaylov, Anette Frank
Abstract In this work, we propose to use linguistic annotations as a basis for a \textit{Discourse-Aware Semantic Self-Attention} encoder that we employ for reading comprehension on long narrative texts. We extract relations between discourse units, events and their arguments as well as coreferring mentions, using available annotation tools. Our empirical evaluation shows that the investigated structures improve the overall performance, especially intra-sentential and cross-sentential discourse relations, sentence-internal semantic role relations, and long-distance coreference relations. We show that dedicating self-attention heads to intra-sentential relations and relations connecting neighboring sentences is beneficial for finding answers to questions in longer contexts. Our findings encourage the use of discourse-semantic annotations to enhance the generalization capacity of self-attention models for reading comprehension.
Tasks Reading Comprehension
Published 2019-08-28
URL https://arxiv.org/abs/1908.10721v1
PDF https://arxiv.org/pdf/1908.10721v1.pdf
PWC https://paperswithcode.com/paper/discourse-aware-semantic-self-attention-for
Repo https://github.com/Heidelberg-NLP/discourse-aware-semantic-self-attention
Framework none
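
A minimal sketch of what "dedicating self-attention heads to specific relations" can look like: a head whose attention is masked so that each token only attends to tokens it is linked to by one linguistic relation (for example, coreference or a discourse relation). The masking scheme and self-attention fallback are illustrative assumptions, not the paper's encoder.

```python
import torch
import torch.nn.functional as F

def relation_specific_attention(q, k, v, relation_mask):
    """Single attention head restricted to token pairs connected by one relation.
    q, k, v: (seq_len, dim); relation_mask: (seq_len, seq_len) booleans."""
    # always allow self-attention so that no row is fully masked out
    mask = relation_mask | torch.eye(len(q), dtype=torch.bool)
    scores = q @ k.T / (q.size(-1) ** 0.5)
    scores = scores.masked_fill(~mask, float('-inf'))
    return F.softmax(scores, dim=-1) @ v

seq_len, dim = 4, 8
q = k = v = torch.randn(seq_len, dim)
mask = torch.tensor([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 0]]).bool()   # toy relation links between tokens
print(relation_specific_attention(q, k, v, mask).shape)
```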