Paper Group AWR 42
Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement. 3D Dilated Multi-Fiber Network for Real-time Brain Tumor Segmentation in MRI. A topological data analysis based classification method for multiple measurements. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack. Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts. Collaborative Similarity Embedding for Recommender Systems. The Shape of Data: Intrinsic Distance for Data Distributions. FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network. Short-distance commuters in the smart city. Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters. LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking. Adaptively Preconditioned Stochastic Gradient Langevin Dynamics. DropEdge: Towards Deep Graph Convolutional Networks on Node Classification. Learning to Optimize in Swarms. Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension.
Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement
Title | Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement |
Authors | Ting-En Lin, Hua Xu, Hanlei Zhang |
Abstract | Identifying new user intents is an essential task in the dialogue system. However, it is hard to get satisfying clustering results since the definition of intents is strongly guided by prior knowledge. Existing methods incorporate prior knowledge by intensive feature engineering, which not only leads to overfitting but also makes it sensitive to the number of clusters. In this paper, we propose constrained deep adaptive clustering with cluster refinement (CDAC+), an end-to-end clustering method that can naturally incorporate pairwise constraints as prior knowledge to guide the clustering process. Moreover, we refine the clusters by forcing the model to learn from the high confidence assignments. After eliminating low confidence assignments, our approach is surprisingly insensitive to the number of clusters. Experimental results on the three benchmark datasets show that our method can yield significant improvements over strong baselines. |
Tasks | Feature Engineering |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08891v1, https://arxiv.org/pdf/1911.08891v1.pdf |
PWC | https://paperswithcode.com/paper/discovering-new-intents-via-constrained-deep |
Repo | https://github.com/thuiar/CDAC-plus |
Framework | pytorch |
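As a rough illustration of how pairwise constraints can steer clustering, the PyTorch sketch below penalizes must-link pairs with low embedding similarity and cannot-link pairs with high similarity. The names, the cosine-based similarity, and the toy data are illustrative assumptions, not the full CDAC+ objective or its cluster-refinement stage.

```python
# Minimal sketch of a pairwise-constraint loss for intent clustering (illustrative, not CDAC+ itself).
import torch
import torch.nn.functional as F

def pairwise_constraint_loss(embeddings, pair_idx, pair_labels):
    """embeddings: (N, D) sentence features from any encoder.
    pair_idx: (P, 2) indices of constrained pairs.
    pair_labels: (P,) 1.0 for must-link (same intent), 0.0 for cannot-link."""
    z = F.normalize(embeddings, dim=1)
    sim = (z[pair_idx[:, 0]] * z[pair_idx[:, 1]]).sum(dim=1)   # cosine similarity in [-1, 1]
    prob_same = ((sim + 1.0) / 2.0).clamp(1e-6, 1 - 1e-6)      # squash to (0, 1)
    return F.binary_cross_entropy(prob_same, pair_labels)

emb = torch.randn(8, 16, requires_grad=True)                   # toy embeddings
pairs = torch.tensor([[0, 1], [2, 3], [0, 4]])
labels = torch.tensor([1.0, 1.0, 0.0])
pairwise_constraint_loss(emb, pairs, labels).backward()
```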
3D Dilated Multi-Fiber Network for Real-time Brain Tumor Segmentation in MRI
Title | 3D Dilated Multi-Fiber Network for Real-time Brain Tumor Segmentation in MRI |
Authors | Chen Chen, Xiaopeng Liu, Meng Ding, Junfeng Zheng, Jiangyun Li |
Abstract | Brain tumor segmentation plays a pivotal role in medical image processing. In this work, we aim to segment brain MRI volumes. 3D convolution neural networks (CNN) such as 3D U-Net and V-Net employing 3D convolutions to capture the correlation between adjacent slices have achieved impressive segmentation results. However, these 3D CNN architectures come with high computational overheads due to multiple layers of 3D convolutions, which may make these models prohibitive for practical large-scale applications. To this end, we propose a highly efficient 3D CNN to achieve real-time dense volumetric segmentation. The network leverages the 3D multi-fiber unit which consists of an ensemble of lightweight 3D convolutional networks to significantly reduce the computational cost. Moreover, 3D dilated convolutions are used to build multi-scale feature representations. Extensive experimental results on the BraTS-2018 challenge dataset show that the proposed architecture greatly reduces computation cost while maintaining high accuracy for brain tumor segmentation. The source code can be found at https://github.com/China-LiuXiaopeng/BraTS-DMFNet |
Tasks | Brain Tumor Segmentation |
Published | 2019-04-06 |
URL | https://arxiv.org/abs/1904.03355v5, https://arxiv.org/pdf/1904.03355v5.pdf |
PWC | https://paperswithcode.com/paper/3d-dilated-multi-fiber-network-for-real-time |
Repo | https://github.com/China-LiuXiaopeng/BraTS-DMFNet |
Framework | pytorch |
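The multi-fiber idea can be approximated with grouped 3D convolutions: each group acts as a lightweight "fiber", and a 1x1x1 convolution lets fibers exchange information before the dilated 3x3x3 convolution. The unit below is a hedged PyTorch sketch with illustrative channel, group, and dilation choices, not the published DMFNet architecture.

```python
# Sketch of a 3D dilated multi-fiber unit built from grouped convolutions (illustrative settings).
import torch
import torch.nn as nn

class DilatedMultiFiberUnit3D(nn.Module):
    def __init__(self, channels=64, fibers=8, dilation=2):
        super().__init__()
        self.multiplex = nn.Conv3d(channels, channels, kernel_size=1)     # lets fibers exchange information
        self.fiber_conv = nn.Conv3d(channels, channels, kernel_size=3,
                                    padding=dilation, dilation=dilation,
                                    groups=fibers)                        # each group is a lightweight "fiber"
        self.bn = nn.BatchNorm3d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.act(self.bn(self.fiber_conv(self.multiplex(x))))  # residual connection

x = torch.randn(1, 64, 16, 32, 32)        # (batch, channels, depth, height, width)
print(DilatedMultiFiberUnit3D()(x).shape)
```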
A topological data analysis based classification method for multiple measurements
Title | A topological data analysis based classification method for multiple measurements |
Authors | Henri Riihimäki, Wojciech Chachólski, Jakob Theorell, Jan Hillert, Ryan Ramanujam |
Abstract | Machine learning models for repeated measurements are limited. Using topological data analysis (TDA), we present a classifier for repeated measurements which samples from the data space and builds a network graph based on the data topology. When applying this to two case studies, accuracy exceeds alternative models with additional benefits such as reporting data subsets with high purity along with feature values. For 300 examples of 3 tree species, the accuracy reached 80% after 30 datapoints, which was improved to 90% after increased sampling to 400 datapoints. Using data from 100 examples of each of 6 point processes, the classifier achieved 96.8% accuracy. In both datasets, the TDA classifier outperformed an alternative model. This algorithm and software can be beneficial for repeated measurement data common in biological sciences, as both an accurate classifier and a feature selection tool. |
Tasks | Feature Selection, Point Processes, Topological Data Analysis |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.02971v1, http://arxiv.org/pdf/1904.02971v1.pdf |
PWC | https://paperswithcode.com/paper/a-topological-data-analysis-based |
Repo | https://github.com/ryaram1/mmTDA |
Framework | none |
HopSkipJumpAttack: A Query-Efficient Decision-Based Attack
Title | HopSkipJumpAttack: A Query-Efficient Decision-Based Attack |
Authors | Jianbo Chen, Michael I. Jordan, Martin J. Wainwright |
Abstract | The goal of a decision-based adversarial attack on a trained model is to generate adversarial examples based solely on observing output labels returned by the targeted model. We develop HopSkipJumpAttack, a family of algorithms based on a novel estimate of the gradient direction using binary information at the decision boundary. The proposed family includes both untargeted and targeted attacks optimized for $\ell_2$ and $\ell_\infty$ similarity metrics respectively. Theoretical analysis is provided for the proposed algorithms and the gradient direction estimate. Experiments show HopSkipJumpAttack requires significantly fewer model queries than Boundary Attack. It also achieves competitive performance in attacking several widely-used defense mechanisms. (HopSkipJumpAttack was named Boundary Attack++ in a previous version of the preprint.) |
Tasks | Adversarial Attack |
Published | 2019-04-03 |
URL | https://arxiv.org/abs/1904.02144v4, https://arxiv.org/pdf/1904.02144v4.pdf |
PWC | https://paperswithcode.com/paper/boundary-attack-query-efficient-decision |
Repo | https://github.com/Jianbo-Lab/HSJA |
Framework | tf |
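The core gradient-direction estimate can be sketched as a Monte Carlo average of random unit perturbations weighted by the binary outcome (adversarial or not) of each query; the `is_adversarial` oracle, batch size, and step size below are illustrative placeholders, and the full attack wraps step-size search and boundary projection around this estimate.

```python
# Sketch of a decision-based gradient-direction estimate at the boundary (toy oracle, illustrative sizes).
import numpy as np

def estimate_gradient_direction(x, is_adversarial, n_samples=100, delta=0.1):
    """x: flat point near the decision boundary; is_adversarial: callable returning True/False per query."""
    u = np.random.randn(n_samples, x.size)
    u /= np.linalg.norm(u, axis=1, keepdims=True)              # random unit directions
    phi = np.array([1.0 if is_adversarial(x + delta * ui) else -1.0 for ui in u])
    phi -= phi.mean()                                           # baseline subtraction reduces variance
    grad = (phi[:, None] * u).mean(axis=0)
    return grad / (np.linalg.norm(grad) + 1e-12)

# Toy oracle: the "adversarial" region is the half-space w.x > 0
w = np.random.randn(32)
w /= np.linalg.norm(w)
direction = estimate_gradient_direction(np.zeros(32), lambda z: float(w @ z) > 0)
print(direction @ w)                                            # close to 1: estimate aligns with the true normal
```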
Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts
Title | Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts |
Authors | Sandro Pezzelle, Raquel Fernández |
Abstract | This work aims at modeling how the meaning of gradable adjectives of size ('big', 'small') can be learned from visually-grounded contexts. Inspired by cognitive and linguistic evidence showing that the use of these expressions relies on setting a threshold that is dependent on a specific context, we investigate the ability of multi-modal models in assessing whether an object is 'big' or 'small' in a given visual scene. In contrast with the standard computational approach that simplistically treats gradable adjectives as 'fixed' attributes, we pose the problem as relational: to be successful, a model has to consider the full visual context. By means of four main tasks, we show that state-of-the-art models (but not a relatively strong baseline) can learn the function subtending the meaning of size adjectives, though their performance is found to decrease while moving from simple to more complex tasks. Crucially, models fail in developing abstract representations of gradable adjectives that can be used compositionally. |
Tasks | |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10285v1, https://arxiv.org/pdf/1908.10285v1.pdf |
PWC | https://paperswithcode.com/paper/is-the-red-square-big-malevic-modeling |
Repo | https://github.com/sandropezzelle/malevic |
Framework | none |
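To make the context-dependent threshold idea concrete, here is one plausible rule (a hypothetical illustration, not the threshold function used in the paper): an object counts as 'big' if its size exceeds a cut-off interpolated between the mean and maximum sizes in the scene.

```python
# Illustrative context-dependent threshold for "big" (assumed form, not the paper's exact function).
def is_big(target_area, scene_areas, k=0.5):
    mx, mean = max(scene_areas), sum(scene_areas) / len(scene_areas)
    threshold = mean + k * (mx - mean)     # cut-off depends on the other objects in the scene
    return target_area > threshold

scene = [4.0, 6.0, 9.0, 20.0]
print([is_big(a, scene) for a in scene])   # only areas above the contextual threshold count as "big"
```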
Collaborative Similarity Embedding for Recommender Systems
Title | Collaborative Similarity Embedding for Recommender Systems |
Authors | Chih-Ming Chen, Chuan-Ju Wang, Ming-Feng Tsai, Yi-Hsuan Yang |
Abstract | We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation. In the proposed framework, we differentiate two types of proximity relations: direct proximity and k-th order neighborhood proximity. While learning from the former exploits direct user-item associations observable from the graph, learning from the latter makes use of implicit associations such as user-user similarities and item-item similarities, which can provide valuable information especially when the graph is sparse. Moreover, for improving scalability and flexibility, we propose a sampling technique that is specifically designed to capture the two types of proximity relations. Extensive experiments on eight benchmark datasets show that CSE yields significantly better performance than state-of-the-art recommendation methods. |
Tasks | Recommendation Systems, Representation Learning |
Published | 2019-02-17 |
URL | http://arxiv.org/abs/1902.06188v2, http://arxiv.org/pdf/1902.06188v2.pdf |
PWC | https://paperswithcode.com/paper/collaborative-similarity-embedding-for |
Repo | https://github.com/cnclabs/smore |
Framework | none |
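A loose sketch of the two proximity signals: direct proximity comes from observed user-item edges, while higher-order proximity pairs are obtained through shared neighbors on the bipartite graph. The logistic loss, single negative sample, and second-order pair selection below are simplifications, not CSE's actual sampling technique.

```python
# Sketch of learning embeddings from direct and 2nd-order proximity pairs (illustrative simplification).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 50, 80, 16
U = rng.normal(scale=0.1, size=(n_users, dim))   # user embeddings
V = rng.normal(scale=0.1, size=(n_items, dim))   # item embeddings
edges = [(int(rng.integers(n_users)), int(rng.integers(n_items))) for _ in range(500)]

def update(a_vecs, a, b_vecs, b, label, lr=0.05):
    """One SGD step on a logistic loss over the dot-product score of a sampled pair."""
    score = 1.0 / (1.0 + np.exp(-a_vecs[a] @ b_vecs[b]))
    a_vecs[a] += lr * (label - score) * b_vecs[b]
    b_vecs[b] += lr * (label - score) * a_vecs[a]

for u, i in edges:
    update(U, u, V, i, 1.0)                               # direct proximity: observed interaction
    update(U, u, V, int(rng.integers(n_items)), 0.0)      # random negative item
    co_users = [u2 for u2, i2 in edges if i2 == i and u2 != u]
    if co_users:
        update(U, u, U, co_users[0], 1.0)                 # 2nd-order proximity: users sharing item i
```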
The Shape of Data: Intrinsic Distance for Data Distributions
Title | The Shape of Data: Intrinsic Distance for Data Distributions |
Authors | Anton Tsitsulin, Marina Munkhoeva, Davide Mottin, Panagiotis Karras, Alex Bronstein, Ivan Oseledets, Emmanuel Müller |
Abstract | The ability to represent and compare machine learning models is crucial in order to quantify subtle model changes, evaluate generative models, and gather insights on neural network architectures. Existing techniques for comparing data distributions focus on global data properties such as mean and covariance; in that sense, they are extrinsic and uni-scale. We develop a first-of-its-kind intrinsic and multi-scale method for characterizing and comparing data manifolds, using a lower-bound of the spectral variant of the Gromov-Wasserstein inter-manifold distance, which compares all data moments. In a thorough experimental study, we demonstrate that our method effectively discerns the structure of data manifolds even on unaligned data of different dimensionalities; moreover, we showcase its efficacy in evaluating the quality of generative models. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11141v2, https://arxiv.org/pdf/1905.11141v2.pdf |
PWC | https://paperswithcode.com/paper/intrinsic-multi-scale-evaluation-of |
Repo | https://github.com/xgfs/msid |
Framework | none |
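In spirit, the method summarizes a data manifold by a multi-scale spectral descriptor of a neighborhood graph. The sketch below builds a kNN graph, computes the heat trace h(t) = Σ_i exp(-t λ_i) of its normalized Laplacian over a range of scales t, and compares two datasets by their descriptors; the exact eigendecomposition and the plain descriptor difference are illustrative shortcuts for the paper's scalable estimator and normalization.

```python
# Sketch of a multi-scale heat-trace descriptor of a data manifold (illustrative shortcut).
import numpy as np
from sklearn.neighbors import kneighbors_graph

def heat_trace_descriptor(X, k=5, ts=np.logspace(-1, 1, 20)):
    A = kneighbors_graph(X, k, mode='connectivity').toarray()
    A = np.maximum(A, A.T)                                 # symmetrize the kNN graph
    d = A.sum(axis=1)
    L = np.eye(len(X)) - A / np.sqrt(np.outer(d, d))       # normalized graph Laplacian
    lam = np.linalg.eigvalsh(L)
    return np.array([np.exp(-t * lam).sum() for t in ts])  # heat trace at each scale t

X, Y = np.random.randn(200, 8), np.random.randn(200, 32)   # different dimensionalities are fine
print(np.abs(heat_trace_descriptor(X) - heat_trace_descriptor(Y)).max())
```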
FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network
Title | FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network |
Authors | Jonah Philion |
Abstract | The search for predictive models that generalize to the long tail of sensor inputs is the central difficulty when developing data-driven models for autonomous vehicles. In this paper, we use lane detection to study modeling and training techniques that yield better performance on real world test drives. On the modeling side, we introduce a novel fully convolutional model of lane detection that learns to decode lane structures instead of delegating structure inference to post-processing. In contrast to previous works, our convolutional decoder is able to represent an arbitrary number of lanes per image, preserves the polyline representation of lanes without reducing lanes to polynomials, and draws lanes iteratively without requiring the computational and temporal complexity of recurrent neural networks. Because our model includes an estimate of the joint distribution of neighboring pixels belonging to the same lane, our formulation includes a natural and computationally cheap definition of uncertainty. On the training side, we demonstrate a simple yet effective approach to adapt the model to new environments using unsupervised style transfer. By training FastDraw to make predictions of lane structure that are invariant to low-level stylistic differences between images, we achieve strong performance at test time in weather and lighting conditions that deviate substantially from those of the annotated datasets that are publicly available. We quantitatively evaluate our approach on the CVPR 2017 Tusimple lane marking challenge, difficult CULane datasets, and a small labeled dataset of our own and achieve competitive accuracy while running at 90 FPS. |
Tasks | Autonomous Vehicles, Lane Detection, Style Transfer |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04354v2, https://arxiv.org/pdf/1905.04354v2.pdf |
PWC | https://paperswithcode.com/paper/fastdraw-addressing-the-long-tail-of-lane |
Repo | https://github.com/jonahthelion/FastDraw |
Framework | none |
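The iterative drawing can be pictured as following, row by row, a per-pixel prediction of where the lane continues. The sketch below assumes a small categorical offset vocabulary and a random prediction tensor, which are illustrative stand-ins for the model's actual decoder outputs.

```python
# Sketch of iterative lane decoding from per-pixel "next step" predictions (illustrative vocabulary).
import numpy as np

def decode_lane(next_step_logits, start_row, start_col):
    """next_step_logits: (H, W, 4) per-pixel scores for [col-1, col, col+1, end-of-lane]."""
    H, W, _ = next_step_logits.shape
    r, c, lane = start_row, start_col, [(start_row, start_col)]
    while r > 0:
        choice = int(np.argmax(next_step_logits[r, c]))
        if choice == 3:                       # end-of-lane token
            break
        c = int(np.clip(c + choice - 1, 0, W - 1))
        r -= 1                                # move one row up the image
        lane.append((r, c))
    return lane

logits = np.random.randn(64, 96, 4)           # stand-in for decoder output
print(decode_lane(logits, start_row=63, start_col=48)[:5])
```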
Short-distance commuters in the smart city
Title | Short-distance commuters in the smart city |
Authors | Francisco Benita, Garvit Bansal, Georgios Piliouras, Bige Tunçer |
Abstract | This study models and examines commuters' preferences for short-distance transportation modes, namely walking, taking a bus, or riding a metro. It uses a unique dataset from a large-scale field experiment in Singapore that provides rich information about the behavior of tens of thousands of commuters. In contrast to the standard approach, this work does not rely on survey data. Instead, the chosen transportation modes are identified by processing raw data (latitude, longitude, timestamp). The approach exploits the information generated by the smart transportation system in the city, which makes it feasible to obtain granular and nearly real-time data. Novel algorithms are proposed to generate proxies for walkability and public transport attributes. The empirical results of the case study suggest that commuters do not differentiate between public transport choices (bus and metro); possible nested structures for the public transport modes are therefore rejected. |
Tasks | |
Published | 2019-02-16 |
URL | http://arxiv.org/abs/1902.08028v1, http://arxiv.org/pdf/1902.08028v1.pdf |
PWC | https://paperswithcode.com/paper/short-distance-commuters-in-the-smart-city |
Repo | https://github.com/Garvit244/Shapefile_to_Network |
Framework | none |
Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters
Title | Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters |
Authors | Vitor Cerqueira, Luis Torgo, Carlos Soares |
Abstract | Time series forecasting is one of the most active research topics. Machine learning methods have been increasingly adopted to solve these predictive tasks. However, in a recent work, these were shown to systematically present a lower predictive performance relative to simple statistical methods. In this work, we counter these results. We show that these are only valid under an extremely low sample size. Using a learning curve method, our results suggest that machine learning methods improve their relative predictive performance as the sample size grows. The code to reproduce the experiments is available at https://github.com/vcerqueira/MLforForecasting. |
Tasks | Time Series, Time Series Forecasting |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13316v1, https://arxiv.org/pdf/1909.13316v1.pdf |
PWC | https://paperswithcode.com/paper/machine-learning-vs-statistical-methods-for |
Repo | https://github.com/vcerqueira/MLforForecasting |
Framework | none |
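The learning-curve argument can be reproduced in miniature: train a machine learning model and a simple statistical baseline on growing prefixes of a series and track out-of-sample error as the sample size grows. The synthetic series, random forest, and naive last-value baseline below are stand-ins for the paper's experimental setup.

```python
# Sketch of a learning-curve comparison between an ML model and a naive statistical baseline.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
y = np.sin(np.arange(2000) / 20.0) + 0.1 * rng.standard_normal(2000)   # synthetic series
lags = 10
X = np.stack([y[i:i + lags] for i in range(len(y) - lags)])             # lagged feature windows
t = y[lags:]

for n in (50, 200, 1000):                        # growing training sample size
    Xtr, ttr, Xte, tte = X[:n], t[:n], X[1500:], t[1500:]
    ml = RandomForestRegressor(n_estimators=50, random_state=0).fit(Xtr, ttr)
    mae_ml = np.abs(ml.predict(Xte) - tte).mean()
    mae_naive = np.abs(Xte[:, -1] - tte).mean()  # statistical stand-in: last-value (naive) forecast
    print(f"n={n:5d}  ML MAE={mae_ml:.3f}  naive MAE={mae_naive:.3f}")
```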
LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking
Title | LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking |
Authors | Bernie Wang, Virginia Wu, Bichen Wu, Kurt Keutzer |
Abstract | LiDAR (Light Detection And Ranging) is an essential and widely adopted sensor for autonomous vehicles, particularly for those vehicles operating at higher levels (L4-L5) of autonomy. Recent work has demonstrated the promise of deep-learning approaches for LiDAR-based detection. However, deep-learning algorithms are extremely data hungry, requiring large amounts of labeled point-cloud data for training and evaluation. Annotating LiDAR point cloud data is challenging due to the following issues: 1) A LiDAR point cloud is usually sparse and has low resolution, making it difficult for human annotators to recognize objects. 2) Compared to annotation on 2D images, the operation of drawing 3D bounding boxes or even point-wise labels on LiDAR point clouds is more complex and time-consuming. 3) LiDAR data are usually collected in sequences, so consecutive frames are highly correlated, leading to repeated annotations. To tackle these challenges, we propose LATTE, an open-sourced annotation tool for LiDAR point clouds. LATTE features the following innovations: 1) Sensor fusion: We utilize image-based detection algorithms to automatically pre-label a calibrated image, and transfer the labels to the point cloud. 2) One-click annotation: Instead of drawing 3D bounding boxes or point-wise labels, we simplify the annotation to just one click on the target object, and automatically generate the bounding box for the target. 3) Tracking: we integrate tracking into sequence annotation such that we can transfer labels from one frame to subsequent ones and therefore significantly reduce repeated labeling. Experiments show the proposed features accelerate the annotation speed by 6.2x and significantly improve label quality with 23.6% and 2.2% higher instance-level precision and recall, and 2.0% higher bounding box IoU. LATTE is open-sourced at https://github.com/bernwang/latte. |
Tasks | Autonomous Vehicles, Sensor Fusion |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09085v1, http://arxiv.org/pdf/1904.09085v1.pdf |
PWC | https://paperswithcode.com/paper/latte-accelerating-lidar-point-cloud |
Repo | https://github.com/bernwang/latte |
Framework | none |
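The sensor-fusion pre-labeling step reduces to projecting LiDAR points through the camera calibration and keeping those that land inside a 2D detection box. The sketch below assumes the points are already in the camera frame; the intrinsics and the detection box are toy placeholders.

```python
# Sketch of transferring a 2D detection box to LiDAR points via camera projection (toy calibration).
import numpy as np

def points_in_box(points_xyz, P, box):
    """points_xyz: (N, 3) points in the camera frame. P: (3, 4) projection matrix.
    box: (x1, y1, x2, y2) 2D detection in pixels."""
    homog = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])   # (N, 4) homogeneous coordinates
    uvw = homog @ P.T                                                # project onto the image plane
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)
    x1, y1, x2, y2 = box
    keep = (uvw[:, 2] > 0) & (uv[:, 0] >= x1) & (uv[:, 0] <= x2) \
                           & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    return points_xyz[keep]

K = np.array([[700.0, 0.0, 640.0], [0.0, 700.0, 360.0], [0.0, 0.0, 1.0]])  # toy intrinsics
P = np.hstack([K, np.zeros((3, 1))])
pts = np.random.uniform(-10, 10, size=(1000, 3)) + np.array([0.0, 0.0, 15.0])
print(points_in_box(pts, P, box=(600, 300, 700, 420)).shape)
```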
Adaptively Preconditioned Stochastic Gradient Langevin Dynamics
Title | Adaptively Preconditioned Stochastic Gradient Langevin Dynamics |
Authors | Chandrasekaran Anirudh Bhardwaj |
Abstract | Stochastic Gradient Langevin Dynamics infuses isotropic gradient noise into SGD to help navigate pathological curvature in the loss landscape for deep networks. The isotropic nature of the noise leads to poor scaling, and adaptive methods based on higher-order curvature information, such as Fisher scoring, have been proposed to precondition the noise in order to achieve better convergence. In this paper, we describe an adaptive method to estimate the parameters of the noise and conduct experiments on well-known model architectures to show that the adaptively preconditioned SGLD method converges as fast as adaptive first-order methods such as Adam and AdaGrad, while achieving generalization equivalent to that of SGD on the test set. |
Tasks | |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04324v2, https://arxiv.org/pdf/1906.04324v2.pdf |
PWC | https://paperswithcode.com/paper/adaptively-preconditioned-stochastic-gradient |
Repo | https://github.com/Anirudhsekar96/Noisy_SGD |
Framework | pytorch |
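A preconditioned SGLD step can be sketched with an RMSProp-style second-moment estimate that scales both the gradient step and the injected Gaussian noise. This illustrates the general preconditioning idea rather than the paper's specific adaptive noise-parameter estimator.

```python
# Sketch of a diagonally preconditioned SGLD update (illustrative, RMSProp-style preconditioner).
import torch

def psgld_step(params, grads, state, lr=1e-3, alpha=0.99, eps=1e-8):
    for p, g in zip(params, grads):
        v = state.setdefault(id(p), torch.zeros_like(p))
        v.mul_(alpha).addcmul_(g, g, value=1 - alpha)          # running second moment of the gradient
        precond = 1.0 / (v.sqrt() + eps)
        noise = torch.randn_like(p) * torch.sqrt(2 * lr * precond)   # preconditioned Langevin noise
        p.add_(-lr * precond * g + noise)

# Toy quadratic example: minimize ||w||^2 with noisy preconditioned updates
w = torch.tensor([3.0, -2.0])
state = {}
for _ in range(200):
    grad = 2 * w
    psgld_step([w], [grad], state, lr=1e-2)
print(w)
```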
DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
Title | DropEdge: Towards Deep Graph Convolutional Networks on Node Classification |
Authors | Yu Rong, Wenbing Huang, Tingyang Xu, Junzhou Huang |
Abstract | Over-fitting and over-smoothing are two main obstacles to developing deep Graph Convolutional Networks (GCNs) for node classification. In particular, over-fitting weakens the generalization ability on small datasets, while over-smoothing impedes model training by isolating output representations from the input features as network depth increases. This paper proposes DropEdge, a novel and flexible technique to alleviate both issues. At its core, DropEdge randomly removes a certain number of edges from the input graph at each training epoch, acting like a data augmenter and also a message passing reducer. Furthermore, we theoretically demonstrate that DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by it. More importantly, DropEdge is a general technique that can be equipped with many other backbone models (e.g. GCN, ResGCN, GraphSAGE, and JKNet) for enhanced performance. Extensive experiments on several benchmarks verify that DropEdge consistently improves performance on a variety of both shallow and deep GCNs. The effect of DropEdge on preventing over-smoothing is empirically visualized and validated as well. Code is released at https://github.com/DropEdge/DropEdge. |
Tasks | Node Classification |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10903v4, https://arxiv.org/pdf/1907.10903v4.pdf |
PWC | https://paperswithcode.com/paper/the-truly-deep-graph-convolutional-networks |
Repo | https://github.com/DropEdge/DropEdge |
Framework | pytorch |
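The core operation is simple to state in code: at each training epoch, keep only a random subset of edges. A minimal sketch on a COO edge list (the keep rate is a hyperparameter):

```python
# Minimal sketch of DropEdge: resample a random subgraph of the input edges every epoch.
import torch

def drop_edge(edge_index, keep_prob=0.8, training=True):
    """edge_index: (2, E) COO edge list as used by common GCN implementations."""
    if not training or keep_prob >= 1.0:
        return edge_index
    mask = torch.rand(edge_index.size(1)) < keep_prob
    return edge_index[:, mask]

edges = torch.tensor([[0, 1, 2, 3, 4], [1, 2, 3, 4, 0]])
print(drop_edge(edges, keep_prob=0.6))     # a different random subgraph on every call
```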
Learning to Optimize in Swarms
Title | Learning to Optimize in Swarms |
Authors | Yue Cao, Tianlong Chen, Zhangyang Wang, Yang Shen |
Abstract | Learning to optimize has emerged as a powerful framework for various optimization and machine learning tasks. Current such “meta-optimizers” often learn in the space of continuous optimization algorithms that are point-based and uncertainty-unaware. To overcome these limitations, we propose a meta-optimizer that learns in the algorithmic space of both point-based and population-based optimization algorithms. The meta-optimizer targets a meta-loss function consisting of both cumulative regret and entropy. Specifically, we learn and interpret the update formula through a population of LSTMs embedded with sample- and feature-level attentions. Meanwhile, we estimate the posterior directly over the global optimum and use an uncertainty measure to help guide the learning process. Empirical results over non-convex test functions and the protein-docking application demonstrate that this new meta-optimizer outperforms existing competitors. |
Tasks | |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.03787v2, https://arxiv.org/pdf/1911.03787v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-optimize-in-swarms |
Repo | https://github.com/Shen-Lab/LOIS |
Framework | tf |
Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension
Title | Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension |
Authors | Todor Mihaylov, Anette Frank |
Abstract | In this work, we propose to use linguistic annotations as a basis for a Discourse-Aware Semantic Self-Attention encoder that we employ for reading comprehension on long narrative texts. We extract relations between discourse units, events and their arguments as well as coreferring mentions, using available annotation tools. Our empirical evaluation shows that the investigated structures improve the overall performance, especially intra-sentential and cross-sentential discourse relations, sentence-internal semantic role relations, and long-distance coreference relations. We show that dedicating self-attention heads to intra-sentential relations and relations connecting neighboring sentences is beneficial for finding answers to questions in longer contexts. Our findings encourage the use of discourse-semantic annotations to enhance the generalization capacity of self-attention models for reading comprehension. |
Tasks | Reading Comprehension |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10721v1, https://arxiv.org/pdf/1908.10721v1.pdf |
PWC | https://paperswithcode.com/paper/discourse-aware-semantic-self-attention-for |
Repo | https://github.com/Heidelberg-NLP/discourse-aware-semantic-self-attention |
Framework | none |
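Dedicating a self-attention head to one relation type amounts to masking that head's attention logits with an adjacency built from the linguistic annotations. The sketch below assumes the relation mask is constructed upstream from the annotation tools.

```python
# Sketch of a relation-dedicated attention head: logits are masked by an annotated relation adjacency.
import torch
import torch.nn.functional as F

def relation_masked_attention(q, k, v, relation_mask):
    """q, k, v: (T, d) for one head; relation_mask: (T, T) bool, True where attention is allowed."""
    scores = (q @ k.T) / (q.size(-1) ** 0.5)
    scores = scores.masked_fill(~relation_mask, float('-inf'))
    return F.softmax(scores, dim=-1) @ v

T, d = 6, 8
q, k, v = (torch.randn(T, d) for _ in range(3))
mask = torch.eye(T, dtype=torch.bool)    # every token may attend to itself ...
mask[0, 3] = mask[3, 0] = True           # ... plus tokens linked by an annotated relation
print(relation_masked_attention(q, k, v, mask).shape)
```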