Paper Group AWR 254
Sorting out Lipschitz function approximation. Kernel Treelets. Towards Robust and Privacy-preserving Text Representations. SCC: Automatic Classification of Code Snippets. Recurrent Entity Networks with Delayed Memory Update for Targeted Aspect-based Sentiment Analysis. Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and …
Sorting out Lipschitz function approximation
Title | Sorting out Lipschitz function approximation |
Authors | Cem Anil, James Lucas, Roger Grosse |
Abstract | Training neural networks under a strict Lipschitz constraint is useful for provable adversarial robustness, generalization bounds, interpretable gradients, and Wasserstein distance estimation. By the composition property of Lipschitz functions, it suffices to ensure that each individual affine transformation or nonlinear activation is 1-Lipschitz. The challenge is to do this while maintaining the expressive power. We identify a necessary property for such an architecture: each of the layers must preserve the gradient norm during backpropagation. Based on this, we propose to combine a gradient norm preserving activation function, GroupSort, with norm-constrained weight matrices. We show that norm-constrained GroupSort architectures are universal Lipschitz function approximators. Empirically, we show that norm-constrained GroupSort networks achieve tighter estimates of Wasserstein distance than their ReLU counterparts and can achieve provable adversarial robustness guarantees with little cost to accuracy. |
Tasks | |
Published | 2018-11-13 |
URL | https://arxiv.org/abs/1811.05381v2 |
https://arxiv.org/pdf/1811.05381v2.pdf | |
PWC | https://paperswithcode.com/paper/sorting-out-lipschitz-function-approximation |
Repo | https://github.com/cemanil/LNets |
Framework | pytorch |
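The GroupSort activation named in the abstract can be illustrated with a minimal, framework-free sketch. It assumes the usual reading of the name: the pre-activations are split into consecutive groups and each group is sorted. The function names, the ascending sort order, and the example vector are illustrative assumptions and are not taken from the LNets repository. Because the output is only a reordering of the input, the vector norm is unchanged, which is the gradient norm preserving property the abstract refers to.

```python
import math


def group_sort(x, group_size=2):
    """Sort each consecutive group of `group_size` values in ascending order.

    Illustrative sketch only; the ascending order is an assumption.
    """
    if len(x) % group_size != 0:
        raise ValueError("len(x) must be divisible by group_size")
    out = []
    for start in range(0, len(x), group_size):
        out.extend(sorted(x[start:start + group_size]))
    return out


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(a * a for a in v))


x = [3.0, -1.0, 0.5, 2.0]
y = group_sort(x, group_size=2)
print(y)                       # [-1.0, 3.0, 0.5, 2.0]
print(l2_norm(x), l2_norm(y))  # the two norms are equal
```

With `group_size` equal to the full width, the activation sorts the whole vector; with `group_size=2` it swaps each pair into order.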
Kernel Treelets
Title | Kernel Treelets |
Authors | Hedi Xia, Hector D. Ceniceros |
Abstract | A new method for hierarchical clustering is presented. It combines treelets, a particular multiscale decomposition of data, with a projection on a reproducing kernel Hilbert space. The proposed approach, called kernel treelets (KT), effectively substitutes the correlation coefficient matrix used in treelets with a symmetric, positive semi-definite matrix efficiently constructed from a kernel function. Unlike most clustering methods, which require data sets to be numeric, KT can be applied to more general data and yields a multi-resolution sequence of bases on the data directly in feature space. The effectiveness and potential of KT in clustering analysis are illustrated with some examples. |
Tasks | |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.04808v1 |
http://arxiv.org/pdf/1812.04808v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-treelets |
Repo | https://github.com/hedixia/kernel_treelet |
Framework | none |
Towards Robust and Privacy-preserving Text Representations
Title | Towards Robust and Privacy-preserving Text Representations |
Authors | Yitong Li, Timothy Baldwin, Trevor Cohn |
Abstract | Written text often provides sufficient clues to identify the author, their gender, age, and other important attributes. Consequently, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model performance for different user groups, as well as privacy implications. In this paper, we propose an approach to explicitly obscure important author characteristics at training time, such that representations learned are invariant to these attributes. Evaluating on two tasks, we show that this leads to increased privacy in the learned representations, as well as more robust models to varying evaluation conditions, including out-of-domain corpora. |
Tasks | |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06093v1 |
http://arxiv.org/pdf/1805.06093v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-robust-and-privacy-preserving-text |
Repo | https://github.com/lrank/Robust_and_Privacy_preserving_Text_Representations |
Framework | tf |
SCC: Automatic Classification of Code Snippets
Title | SCC: Automatic Classification of Code Snippets |
Authors | Kamel Alreshedy, Dhanush Dharmaretnam, Daniel M. German, Venkatesh Srinivasan, T. Aaron Gulliver |
Abstract | Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) and Natural Language Processing (NLP) algorithms can be effective in identifying the programming language of source code files. However, determining the programming language of a code snippet or a few lines of source code is still a challenging task. Online forums such as Stack Overflow and code repositories such as GitHub contain a large number of code snippets. In this paper, we describe Source Code Classification (SCC), a classifier that can identify the programming language of code snippets written in 21 different programming languages. A Multinomial Naive Bayes (MNB) classifier is employed, which is trained using Stack Overflow posts. It is shown to achieve an accuracy of 75%, which is higher than that of Programming Languages Identification (PLI, a proprietary online classifier of snippets), whose accuracy is only 55.5%. The average scores for precision, recall and F1 with the proposed tool are 0.76, 0.75 and 0.75, respectively. In addition, it can distinguish between code snippets from a family of programming languages such as C, C++ and C#, and can also identify the programming language version such as C# 3.0, C# 4.0 and C# 5.0. |
Tasks | |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.07945v1 |
http://arxiv.org/pdf/1809.07945v1.pdf | |
PWC | https://paperswithcode.com/paper/scc-automatic-classification-of-code-snippets |
Repo | https://github.com/mindscan-de/FluentGenesis-Classifier |
Framework | tf |
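To make the Multinomial Naive Bayes idea from the abstract concrete, here is a small sketch of a token-count classifier with Laplace smoothing. It is not the SCC tool: the tokenizer, the smoothing constant, the class name `MultinomialNB`, and the four-snippet training set are all assumptions chosen to keep the example self-contained, whereas the paper trains on Stack Overflow posts covering 21 languages.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(snippet):
    """Split into identifier-like words and single punctuation characters (an assumed tokenizer)."""
    return re.findall(r"[A-Za-z_]+|[^\sA-Za-z_0-9]", snippet)


class MultinomialNB:
    """Toy multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, snippets, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(snippets, labels):
            tokens = tokenize(text)
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, snippet):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(snippet) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + self.alpha) / denom)  # log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, for illustration only.
TRAIN = [
    ("def add(a, b):\n    return a + b", "python"),
    ("import os\nprint(os.getcwd())", "python"),
    ('#include <stdio.h>\nint main() { printf("hi"); return 0; }', "c"),
    ("int x = 0; for (int i = 0; i < n; i++) { x += i; }", "c"),
]

model = MultinomialNB().fit([s for s, _ in TRAIN], [l for _, l in TRAIN])
print(model.predict("def g(a):\n    return a"))
print(model.predict("int main() { return 0; }"))
```

Scores are accumulated in log space so that long snippets do not underflow to zero probability.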
Recurrent Entity Networks with Delayed Memory Update for Targeted Aspect-based Sentiment Analysis
Title | Recurrent Entity Networks with Delayed Memory Update for Targeted Aspect-based Sentiment Analysis |
Authors | Fei Liu, Trevor Cohn, Timothy Baldwin |
Abstract | While neural networks have been shown to achieve impressive results for sentence-level sentiment analysis, targeted aspect-based sentiment analysis (TABSA) — extraction of fine-grained opinion polarity w.r.t. a pre-defined set of aspects — remains a difficult task. Motivated by recent advances in memory-augmented models for machine reading, we propose a novel architecture, utilising external “memory chains” with a delayed memory update mechanism to track entities. On a TABSA task, the proposed model demonstrates substantial improvements over state-of-the-art approaches, including those using external knowledge bases. |
Tasks | Aspect-Based Sentiment Analysis, Reading Comprehension, Sentiment Analysis |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11019v1 |
http://arxiv.org/pdf/1804.11019v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-entity-networks-with-delayed-memory |
Repo | https://github.com/liufly/delayed-memory-update-entnet |
Framework | tf |
Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera
Title | Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera |
Authors | Fangchang Ma, Guilherme Venturelli Cavalheiro, Sertac Karaman |
Abstract | Depth completion, the technique of estimating a dense depth image from sparse depth measurements, has a variety of applications in robotics and autonomous driving. However, depth completion faces 3 main challenges: the irregularly spaced pattern in the sparse depth input, the difficulty in handling multiple sensor modalities (when color images are available), as well as the lack of dense, pixel-level ground truth depth labels. In this work, we address all these challenges. Specifically, we develop a deep regression model to learn a direct mapping from sparse depth (and color images) to dense depth. We also propose a self-supervised training framework that requires only sequences of color and sparse depth images, without the need for dense depth labels. Our experiments demonstrate that our network, when trained with semi-dense annotations, attains state-of-the-art accuracy and is the winning approach on the KITTI depth completion benchmark at the time of submission. Furthermore, the self-supervised framework outperforms a number of existing solutions trained with semi-dense annotations. |
Tasks | Autonomous Driving, Depth Completion |
Published | 2018-07-01 |
URL | http://arxiv.org/abs/1807.00275v2 |
http://arxiv.org/pdf/1807.00275v2.pdf | |
PWC | https://paperswithcode.com/paper/self-supervised-sparse-to-dense-self |
Repo | https://github.com/fangchangma/self-supervised-depth-completion |
Framework | pytorch |
Revisiting Temporal Modeling for Video-based Person ReID
Title | Revisiting Temporal Modeling for Video-based Person ReID |
Authors | Jiyang Gao, Ram Nevatia |
Abstract | Video-based person reID is an important task, which has received much attention in recent years due to the increasing demand in surveillance and camera networks. A typical video-based person reID system consists of three parts: an image-level feature extractor (e.g. CNN), a temporal modeling method to aggregate temporal features and a loss function. Although many methods on temporal modeling have been proposed, it is hard to directly compare these methods, because the choices of feature extractor and loss function also have a large impact on the final performance. We comprehensively study and compare four different temporal modeling methods (temporal pooling, temporal attention, RNN and 3D convnets) for video-based person reID. We also propose a new attention generation network which adopts temporal convolution to extract temporal information among frames. The evaluation is done on the MARS dataset, and our methods outperform state-of-the-art methods by a large margin. Our source codes are released at https://github.com/jiyanggao/Video-Person-ReID. |
Tasks | |
Published | 2018-05-05 |
URL | http://arxiv.org/abs/1805.02104v2 |
http://arxiv.org/pdf/1805.02104v2.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-temporal-modeling-for-video-based |
Repo | https://github.com/mattcoldwater/Video-ReID |
Framework | pytorch |
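Two of the four temporal modeling methods listed in the abstract, temporal pooling and temporal attention, can be contrasted with a short sketch that aggregates per-frame feature vectors into one clip-level vector. This is a simplified illustration: the linear scoring vector `score_weights` is a hypothetical stand-in, and it does not reproduce the paper's attention generation network, which uses temporal convolution.

```python
import math


def temporal_pooling(frames):
    """Average the per-frame feature vectors into one clip-level vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def temporal_attention(frames, score_weights):
    """Weight frames by a softmax over linear scores, then sum.

    `score_weights` is a hypothetical scoring vector used for illustration.
    """
    scores = [sum(w * f for w, f in zip(score_weights, frame)) for frame in frames]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    attn = [e / z for e in exps]
    dim = len(frames[0])
    clip = [sum(a * frame[d] for a, frame in zip(attn, frames)) for d in range(dim)]
    return clip, attn


frames = [[1.0, 2.0], [3.0, 4.0]]          # two frames, two feature dimensions
print(temporal_pooling(frames))            # [2.0, 3.0]
clip, attn = temporal_attention(frames, [1.0, 0.0])
print(attn)                                # the second frame receives the larger weight
```

With all-zero scoring weights the attention is uniform and the result matches temporal pooling, which shows pooling as a special case of attention in this sketch.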
Talk the Walk: Navigating New York City through Grounded Dialogue
Title | Talk the Walk: Navigating New York City through Grounded Dialogue |
Authors | Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela |
Abstract | We introduce “Talk The Walk”, the first large-scale dialogue dataset grounded in action and perception. The task involves two agents (a “guide” and a “tourist”) that communicate via natural language in order to achieve a common goal: having the tourist navigate to a given target location. The task and dataset, which are described in detail, are challenging and their full solution is an open problem that we pose to the community. We (i) focus on the task of tourist localization and develop the novel Masked Attention for Spatial Convolutions (MASC) mechanism that allows for grounding tourist utterances into the guide’s map, (ii) show it yields significant improvements for both emergent and natural language communication, and (iii) using this method, we establish non-trivial baselines on the full task. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03367v3 |
http://arxiv.org/pdf/1807.03367v3.pdf | |
PWC | https://paperswithcode.com/paper/talk-the-walk-navigating-new-york-city |
Repo | https://github.com/facebookresearch/talkthewalk |
Framework | pytorch |
Real-time self-adaptive deep stereo
Title | Real-time self-adaptive deep stereo |
Authors | Alessio Tonioni, Fabio Tosi, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano |
Abstract | Deep convolutional neural networks trained end-to-end are the state-of-the-art methods to regress dense disparity maps from stereo pairs. These models, however, suffer from a notable decrease in accuracy when exposed to scenarios significantly different from the training set (e.g., real vs. synthetic images). We argue that it is extremely unlikely to gather enough samples to achieve effective training/tuning in any target domain, thus making this setup impractical for many applications. Instead, we propose to perform unsupervised and continuous online adaptation of a deep stereo network, which allows for preserving its accuracy in any environment. However, this strategy is extremely computationally demanding and thus prevents real-time inference. We address this issue by introducing a new lightweight, yet effective, deep stereo architecture, Modularly ADaptive Network (MADNet), and developing a Modular ADaptation (MAD) algorithm, which independently trains sub-portions of the network. By deploying MADNet together with MAD we introduce the first real-time self-adaptive deep stereo system enabling competitive performance on heterogeneous datasets. |
Tasks | Stereo Depth Estimation |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05424v2 |
http://arxiv.org/pdf/1810.05424v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-self-adaptive-deep-stereo |
Repo | https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo |
Framework | tf |
Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy
Title | Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy |
Authors | Stanislav Nikolov, Sam Blackwell, Ruheena Mendes, Jeffrey De Fauw, Clemens Meyer, Cían Hughes, Harry Askham, Bernardino Romera-Paredes, Alan Karthikesalingam, Carlton Chu, Dawn Carnell, Cheng Boon, Derek D’Souza, Syed Ali Moinuddin, Kevin Sullivan, DeepMind Radiographer Consortium, Hugh Montgomery, Geraint Rees, Ricky Sharma, Mustafa Suleyman, Trevor Back, Joseph R. Ledsam, Olaf Ronneberger |
Abstract | Over half a million individuals are diagnosed with head and neck cancer each year worldwide. Radiotherapy is an important curative treatment for this disease, but it requires manually intensive delineation of radiosensitive organs at risk (OARs). This planning process can delay treatment commencement. While auto-segmentation algorithms offer a potentially time-saving solution, the challenges in defining, quantifying and achieving expert performance remain. Adopting a deep learning approach, we demonstrate a 3D U-Net architecture that achieves performance similar to experts in delineating a wide range of head and neck OARs. The model was trained on a dataset of 663 deidentified computed tomography (CT) scans acquired in routine clinical practice and segmented according to consensus OAR definitions. We demonstrate its generalisability through application to an independent test set of 24 CT scans available from The Cancer Imaging Archive collected at multiple international sites previously unseen to the model, each segmented by two independent experts and consisting of 21 OARs commonly segmented in clinical practice. With appropriate validation studies and regulatory approvals, this system could improve the effectiveness of radiotherapy pathways. |
Tasks | Computed Tomography (CT) |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04430v1 |
http://arxiv.org/pdf/1809.04430v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-to-achieve-clinically |
Repo | https://github.com/deepmind/tcia-ct-scan-dataset |
Framework | none |
Adaptive O-CNN: A Patch-based Deep Representation of 3D Shapes
Title | Adaptive O-CNN: A Patch-based Deep Representation of 3D Shapes |
Authors | Peng-Shuai Wang, Chun-Yu Sun, Yang Liu, Xin Tong |
Abstract | We present an Adaptive Octree-based Convolutional Neural Network (Adaptive O-CNN) for efficient 3D shape encoding and decoding. Different from volumetric-based or octree-based CNN methods that represent a 3D shape with voxels in the same resolution, our method represents a 3D shape adaptively with octants at different levels and models the 3D shape within each octant with a planar patch. Based on this adaptive patch-based representation, we propose an Adaptive O-CNN encoder and decoder for encoding and decoding 3D shapes. The Adaptive O-CNN encoder takes the planar patch normal and displacement as input and performs 3D convolutions only at the octants at each level, while the Adaptive O-CNN decoder infers the shape occupancy and subdivision status of octants at each level and estimates the best plane normal and displacement for each leaf octant. As a general framework for 3D shape analysis and generation, the Adaptive O-CNN not only reduces the memory and computational cost, but also offers better shape generation capability than the existing 3D-CNN approaches. We validate Adaptive O-CNN in terms of efficiency and effectiveness on different shape analysis and generation tasks, including shape classification, 3D autoencoding, shape prediction from a single image, and shape completion for noisy and incomplete point clouds. |
Tasks | 3D Shape Analysis |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.07917v1 |
http://arxiv.org/pdf/1809.07917v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-o-cnn-a-patch-based-deep |
Repo | https://github.com/Microsoft/O-CNN |
Framework | tf |
Rule induction for global explanation of trained models
Title | Rule induction for global explanation of trained models |
Authors | Madhumita Sushil, Simon Šuster, Walter Daelemans |
Abstract | Understanding the behavior of a trained network and finding explanations for its outputs is important for improving the network’s performance and generalization ability, and for ensuring trust in automated systems. Several approaches have previously been proposed to identify and visualize the most important features by analyzing a trained network. However, the relations between different features and classes are lost in most cases. We propose a technique to induce sets of if-then-else rules that capture these relations to globally explain the predictions of a network. We first calculate the importance of the features in the trained network. We then weigh the original inputs with these feature importance scores, simplify the transformed input space, and finally fit a rule induction model to explain the model predictions. We find that the output rule-sets can explain the predictions of a neural network trained for 4-class text classification from the 20 newsgroups dataset to a macro-averaged F-score of 0.80. We make the code available at https://github.com/clips/interpret_with_rules. |
Tasks | Feature Importance, Text Classification |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09744v1 |
http://arxiv.org/pdf/1808.09744v1.pdf | |
PWC | https://paperswithcode.com/paper/rule-induction-for-global-explanation-of |
Repo | https://github.com/clips/interpret_with_rules |
Framework | none |
The Impact of Humanoid Affect Expression on Human Behavior in a Game-Theoretic Setting
Title | The Impact of Humanoid Affect Expression on Human Behavior in a Game-Theoretic Setting |
Authors | Aaron M. Roth, Umang Bhatt, Tamara Amin, Afsaneh Doryab, Fei Fang, Manuela Veloso |
Abstract | With the rapid development of robots and other intelligent and autonomous agents, how a human could be influenced by a robot’s expressed mood when making decisions becomes a crucial question in human-robot interaction. In this pilot study, we investigate (1) in what way a robot can express a certain mood to influence a human’s decision-making behavioral model; (2) how and to what extent the human will be influenced in a game-theoretic setting. More specifically, we create an NLP model to generate sentences that adhere to a specific affective expression profile. We use these sentences for a humanoid robot as it plays a Stackelberg security game against a human. We investigate the behavioral model of the human player. |
Tasks | Decision Making |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03671v1 |
http://arxiv.org/pdf/1806.03671v1.pdf | |
PWC | https://paperswithcode.com/paper/the-impact-of-humanoid-affect-expression-on |
Repo | https://github.com/tamamin/Affect-Adjusted-NLP |
Framework | none |
Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms
Title | Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms |
Authors | Mathieu Blondel, André F. T. Martins, Vlad Niculae |
Abstract | This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function. We analyze their properties in depth, showing that they unify many well-known loss functions and make it easy to create useful new ones. Fenchel-Young losses constructed from a generalized entropy, including the Shannon and Tsallis entropies, induce predictive probability distributions. We formulate conditions for a generalized entropy to yield losses with a separation margin, and probability distributions with sparse support. Finally, we derive efficient algorithms, making Fenchel-Young losses appealing both in theory and practice. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09717v4 |
http://arxiv.org/pdf/1805.09717v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-classifiers-with-fenchel-young |
Repo | https://github.com/mblondel/projection-losses |
Framework | none |
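The construction in the abstract can be sketched numerically. A Fenchel-Young loss built from a regularizer Ω is commonly written as L(θ; y) = Ω*(θ) + Ω(y) − ⟨θ, y⟩, where Ω* is the convex conjugate. The sketch below instantiates it for two regularizers over the probability simplex: the Shannon negentropy, which gives the familiar softmax cross-entropy, and half the squared Euclidean norm, which gives the sparsemax loss. The function names and the example scores are illustrative assumptions and do not come from the projection-losses repository.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def logsumexp(theta):
    m = max(theta)
    return m + math.log(sum(math.exp(t - m) for t in theta))


def sparsemax(theta):
    """Euclidean projection of theta onto the probability simplex."""
    cumsum, tau = 0.0, 0.0
    for k, zk in enumerate(sorted(theta, reverse=True), start=1):
        cumsum += zk
        if 1 + k * zk > cumsum:       # position k is still inside the support
            tau = (cumsum - 1) / k
    return [max(t - tau, 0.0) for t in theta]


def fy_loss_shannon(theta, y):
    """Omega(p) = sum p log p, so the conjugate is logsumexp."""
    omega_y = sum(p * math.log(p) for p in y if p > 0)
    return logsumexp(theta) + omega_y - dot(theta, y)


def fy_loss_sparsemax(theta, y):
    """Omega(p) = 0.5 * ||p||^2 on the simplex; the conjugate is attained at sparsemax(theta)."""
    p = sparsemax(theta)
    conjugate = dot(p, theta) - 0.5 * dot(p, p)
    return conjugate + 0.5 * dot(y, y) - dot(theta, y)


theta = [2.0, 0.0, -1.0]   # example scores (hypothetical)
y = [1.0, 0.0, 0.0]        # one-hot target
print(sparsemax(theta))               # [1.0, 0.0, 0.0]
print(fy_loss_shannon(theta, y))      # strictly positive
print(fy_loss_sparsemax(theta, y))    # 0.0
```

In this example the correct class leads by a wide enough gap that the sparsemax loss is exactly zero and the predicted distribution has a single nonzero entry, which illustrates the separation margin and sparse support mentioned in the abstract. The Shannon loss stays strictly positive for any finite scores.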
Learning Representations for Soft Skill Matching
Title | Learning Representations for Soft Skill Matching |
Authors | Luiza Sayfullina, Eric Malmi, Juho Kannala |
Abstract | Employers actively look for candidates who have not only specific hard skills but also various soft skills. To analyze the soft skill demands on the job market, it is important to be able to detect soft skill phrases from job advertisements automatically. However, a naive matching of soft skill phrases can lead to false positive matches when a soft skill phrase, such as friendly, is used to describe a company, a team, or another entity, rather than a desired candidate. In this paper, we propose a phrase-matching-based approach which differentiates between soft skill phrases referring to a candidate vs. something else. The disambiguation is formulated as a binary text classification problem where the prediction is made for the potential soft skill based on the context where it occurs. To inform the model about the soft skill for which the prediction is made, we develop several approaches, including soft skill masking and soft skill tagging. We compare several neural network based approaches, including CNN, LSTM and Hierarchical Attention Model. The proposed tagging-based input representation using LSTM achieved the highest recall of 83.92% on the job dataset when precision is fixed at 95%. |
Tasks | Text Classification |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07741v1 |
http://arxiv.org/pdf/1807.07741v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-representations-for-soft-skill |
Repo | https://github.com/muzaluisa/soft-skill-matching |
Framework | none |
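The two input representations named in the abstract, soft skill masking and soft skill tagging, can be sketched as simple token-level transformations applied before the classifier sees the sentence. The marker strings `<SKILL>`, `<b>` and `</b>`, the function names, and the example sentence are hypothetical choices for illustration; the paper's exact markers may differ.

```python
def mask_skill(tokens, start, end, mask_token="<SKILL>"):
    """Replace the soft skill phrase tokens[start:end] with a single mask token."""
    return tokens[:start] + [mask_token] + tokens[end:]


def tag_skill(tokens, start, end, open_tag="<b>", close_tag="</b>"):
    """Keep the soft skill phrase but wrap it in opening and closing tag tokens."""
    return tokens[:start] + [open_tag] + tokens[start:end] + [close_tag] + tokens[end:]


# Hypothetical job-ad sentence; "friendly" describes the team, not the candidate.
tokens = "we are a friendly team of engineers".split()
print(mask_skill(tokens, 3, 4))
print(tag_skill(tokens, 3, 4))
```

Masking hides the identity of the phrase so the model relies only on context, while tagging keeps the phrase visible and marks its position.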