October 20, 2019

2803 words 14 mins read

Paper Group ANR 89

Gear Training: A new way to implement high-performance model-parallel training. Affine Differential Invariants for Invariant Feature Point Detection. Dual Memory Network Model for Biased Product Review Classification. Learning from Exemplars and Prototypes in Machine Learning and Psychology. Denoising Adversarial Autoencoders: Classifying Skin Lesi …

Gear Training: A new way to implement high-performance model-parallel training

Title Gear Training: A new way to implement high-performance model-parallel training
Authors Hao Dong, Shuai Li, Dongchang Xu, Yi Ren, Di Zhang
Abstract The training of Deep Neural Networks usually requires tremendous computing resources, so many deep models are trained on large clusters instead of a single machine or GPU. While most current research tries to run the whole model on all machines using asynchronous stochastic gradient descent (ASGD), we present a new approach to training deep models in parallel: split the model and then train its different parts separately, at different speeds.
Tasks
Published 2018-06-11
URL http://arxiv.org/abs/1806.03925v1
PDF http://arxiv.org/pdf/1806.03925v1.pdf
PWC https://paperswithcode.com/paper/gear-training-a-new-way-to-implement-high
Repo
Framework
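The abstract describes the core idea only at a high level: split a model and train its parts at different speeds. The toy PyTorch sketch below illustrates that scheduling notion on a single machine; the layer sizes, the every-fourth-step update period, and the synthetic data are assumptions for illustration, not details of the paper's distributed system.

```python
# Hypothetical sketch: split a network into two "gears" trained at different speeds.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two parts of one model, conceptually placed on different workers.
front = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
back = nn.Sequential(nn.Linear(64, 1))

opt_front = torch.optim.SGD(front.parameters(), lr=0.01)
opt_back = torch.optim.SGD(back.parameters(), lr=0.01)

loss_fn = nn.MSELoss()
x = torch.randn(256, 32)
y = torch.randn(256, 1)

for step in range(100):
    pred = back(front(x))
    loss = loss_fn(pred, y)
    opt_front.zero_grad()
    opt_back.zero_grad()
    loss.backward()
    # "Gear" schedule: the back part updates every step, the front part
    # only every fourth step (different speeds for different model parts).
    opt_back.step()
    if step % 4 == 0:
        opt_front.step()
```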

Affine Differential Invariants for Invariant Feature Point Detection

Title Affine Differential Invariants for Invariant Feature Point Detection
Authors Stanley L. Tuznik, Peter J. Olver, Allen Tannenbaum
Abstract Image feature points are detected as pixels which locally maximize a detector function, two commonly used examples of which are the (Euclidean) image gradient and the Harris-Stephens corner detector. A major limitation of these feature detectors is that they are only Euclidean-invariant. In this work we demonstrate the application of a 2D affine-invariant image feature point detector based on differential invariants as derived through the equivariant method of moving frames. The fundamental equi-affine differential invariants for 3D image volumes are also computed.
Tasks
Published 2018-03-05
URL http://arxiv.org/abs/1803.01669v2
PDF http://arxiv.org/pdf/1803.01669v2.pdf
PWC https://paperswithcode.com/paper/affine-differential-invariants-for-invariant
Repo
Framework
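For context on the Euclidean-invariant baseline the abstract contrasts with, here is a brief sketch of Harris-Stephens corner detection, where feature points are the pixels that locally maximize the detector response. The window size, sensitivity constant k, and threshold are conventional illustrative defaults; this is not the paper's affine-invariant detector.

```python
# Sketch of a Euclidean-invariant baseline (Harris-Stephens) the paper contrasts with.
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_corners(img, sigma=1.0, k=0.04, thresh=1e-4):
    """Detect points that locally maximize the Harris-Stephens response."""
    Iy, Ix = np.gradient(img.astype(float))
    # Structure tensor components, smoothed by a Gaussian window.
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    response = det - k * trace ** 2
    # Keep pixels that are local maxima of the response and above a threshold.
    local_max = (response == maximum_filter(response, size=3))
    return np.argwhere(local_max & (response > thresh))

corners = harris_corners(np.random.rand(64, 64))
```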

Dual Memory Network Model for Biased Product Review Classification

Title Dual Memory Network Model for Biased Product Review Classification
Authors Yunfei Long, Mingyu Ma, Qin Lu, Rong Xiang, Chu-Ren Huang
Abstract In sentiment analysis (SA) of product reviews, both user and product information are proven to be useful. Current approaches handle user profiles and product information in a unified model, which may not learn salient features of users and products effectively. In this work, we propose a dual user and product memory network (DUPMN) model to learn user profiles and product reviews using separate memory networks. Then, the two representations are used jointly for sentiment prediction. The use of separate models aims to capture user profiles and product information more effectively. Compared to state-of-the-art unified prediction models, evaluations on three benchmark datasets, IMDB, Yelp13, and Yelp14, show that our dual learning model gives performance gains of 0.6%, 1.2%, and 0.9%, respectively. The improvements are also highly statistically significant as measured by p-values.
Tasks Sentiment Analysis
Published 2018-09-16
URL http://arxiv.org/abs/1809.05807v1
PDF http://arxiv.org/pdf/1809.05807v1.pdf
PWC https://paperswithcode.com/paper/dual-memory-network-model-for-biased-product
Repo
Framework
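A minimal, hypothetical sketch of the dual-memory idea from the abstract: a review representation attends separately over a user memory and a product memory, and the three vectors are fused for sentiment prediction. The dimensions, single-hop attention, and fusion layer are assumptions, not the published DUPMN architecture.

```python
# Minimal, hypothetical sketch of the dual-memory idea: separate user and
# product memories attended by a document vector, then fused for prediction.
import torch
import torch.nn as nn

class DualMemoryClassifier(nn.Module):
    def __init__(self, dim=64, n_classes=5):
        super().__init__()
        self.fuse = nn.Linear(3 * dim, n_classes)

    def attend(self, query, memory):
        # Soft attention of the document query over one memory (user or product).
        scores = torch.softmax(memory @ query, dim=0)     # (slots,)
        return (scores.unsqueeze(1) * memory).sum(dim=0)  # (dim,)

    def forward(self, doc_vec, user_memory, product_memory):
        u = self.attend(doc_vec, user_memory)
        p = self.attend(doc_vec, product_memory)
        return self.fuse(torch.cat([doc_vec, u, p]))

model = DualMemoryClassifier()
doc = torch.randn(64)           # encoded review
user_mem = torch.randn(10, 64)  # past reviews by the same user
prod_mem = torch.randn(10, 64)  # past reviews of the same product
logits = model(doc, user_mem, prod_mem)
```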

Learning from Exemplars and Prototypes in Machine Learning and Psychology

Title Learning from Exemplars and Prototypes in Machine Learning and Psychology
Authors Julian Zubek, Ludmila Kuncheva
Abstract This paper draws a parallel between similarity-based categorisation models developed in cognitive psychology and the nearest neighbour classifier (1-NN) in machine learning. Conceived as a result of the historical rivalry between prototype theories (abstraction) and exemplar theories (memorisation), recent models of human categorisation seek a compromise in-between. Regarding the stimuli (entities to be categorised) as points in a metric space, machine learning offers a large collection of methods to select a small, representative and discriminative point set. These methods are known under various names: instance selection, data editing, prototype selection, prototype generation or prototype replacement. The nearest neighbour classifier is used with the selected reference set. Such a set can be interpreted as a data-driven categorisation model. We juxtapose the models from the two fields to enable cross-referencing. We believe that both machine learning and cognitive psychology can draw inspiration from the comparison and enrich their repertoire of similarity-based models.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.01130v1
PDF http://arxiv.org/pdf/1806.01130v1.pdf
PWC https://paperswithcode.com/paper/learning-from-exemplars-and-prototypes-in
Repo
Framework
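The two extremes the paper bridges can be illustrated directly with scikit-learn: an exemplar-style 1-NN classifier that memorises every training point, and a prototype-style nearest-centroid classifier that abstracts each class to a single point. The synthetic dataset is only for illustration.

```python
# Sketch contrasting the two extremes discussed in the paper: exemplar-style
# 1-NN on all training points vs. a prototype-style nearest-centroid model.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

exemplar = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)  # memorise all points
prototype = NearestCentroid().fit(X_tr, y_tr)                   # one point per class

print("exemplar (1-NN) :", exemplar.score(X_te, y_te))
print("prototype       :", prototype.score(X_te, y_te))
```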

Denoising Adversarial Autoencoders: Classifying Skin Lesions Using Limited Labelled Training Data

Title Denoising Adversarial Autoencoders: Classifying Skin Lesions Using Limited Labelled Training Data
Authors Antonia Creswell, Alison Pouplin, Anil A Bharath
Abstract We propose a novel deep learning model for classifying medical images in the setting where there is a large amount of unlabelled medical data available, but labelled data is in limited supply. We consider the specific case of classifying skin lesions as either malignant or benign. In this setting, the proposed approach – the semi-supervised, denoising adversarial autoencoder – is able to utilise vast amounts of unlabelled data to learn a representation for skin lesions, and small amounts of labelled data to assign class labels based on the learned representation. We analyse the contributions of both the adversarial and denoising components of the model and find that the combination yields superior classification performance in the setting of limited labelled training data.
Tasks Denoising
Published 2018-01-02
URL http://arxiv.org/abs/1801.00693v1
PDF http://arxiv.org/pdf/1801.00693v1.pdf
PWC https://paperswithcode.com/paper/denoising-adversarial-autoencoders-1
Repo
Framework
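A heavily simplified, hypothetical sketch of the semi-supervised setup described in the abstract: a denoising autoencoder learns a latent representation from unlabelled data, an adversarial term shapes that latent code, and a small classifier uses the code for the few labelled examples. Network sizes, loss weights, and the omission of the discriminator's own training step are all simplifications.

```python
# Simplified sketch: denoising autoencoder + adversarial latent code + small classifier.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
dec = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))
disc = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))  # real vs. fake code
clf = nn.Linear(16, 2)                                                # benign vs. malignant

x_unlab = torch.rand(128, 784)                  # unlabelled images (flattened)
x_lab, y_lab = torch.rand(8, 784), torch.randint(0, 2, (8,))

# Denoising reconstruction on unlabelled data.
noisy = x_unlab + 0.1 * torch.randn_like(x_unlab)
z = enc(noisy)
recon_loss = nn.functional.mse_loss(dec(z), x_unlab)

# Adversarial (generator-side) loss pushing codes toward a chosen prior;
# the discriminator's own update against prior samples is omitted for brevity.
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    disc(z), torch.ones(z.size(0), 1))

# Supervised loss on the small labelled subset.
clf_loss = nn.functional.cross_entropy(clf(enc(x_lab)), y_lab)

total = recon_loss + 0.1 * adv_loss + clf_loss  # loss weights are illustrative
total.backward()
```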

Transformative Machine Learning

Title Transformative Machine Learning
Authors Ivan Olier, Oghenejokpeme I. Orhobor, Joaquin Vanschoren, Ross D. King
Abstract The key to success in machine learning (ML) is the use of effective data representations. Traditionally, data representations were hand-crafted. Recently it has been demonstrated that, given sufficient data, deep neural networks can learn effective implicit representations from simple input representations. However, for most scientific problems, the use of deep learning is not appropriate as the amount of available data is limited, and/or the output models must be explainable. Nevertheless, many scientific problems do have significant amounts of data available on related tasks, which makes them amenable to multi-task learning, i.e. learning many related problems simultaneously. Here we propose a novel and general representation learning approach for multi-task learning that works successfully with small amounts of data. The fundamental new idea is to transform an input intrinsic data representation (i.e., handcrafted features), to an extrinsic representation based on what a pre-trained set of models predict about the examples. This transformation has the dual advantages of producing significantly more accurate predictions, and providing explainable models. To demonstrate the utility of this transformative learning approach, we have applied it to three real-world scientific problems: drug-design (quantitative structure activity relationship learning), predicting human gene expression (across different tissue types and drug treatments), and meta-learning for machine learning (predicting which machine learning methods work best for a given problem). In all three problems, transformative machine learning significantly outperforms the best intrinsic representation.
Tasks Meta-Learning, Multi-Task Learning, Representation Learning
Published 2018-11-08
URL http://arxiv.org/abs/1811.03392v1
PDF http://arxiv.org/pdf/1811.03392v1.pdf
PWC https://paperswithcode.com/paper/transformative-machine-learning
Repo
Framework
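The central transformation is easy to sketch: re-describe each example by what a set of models pre-trained on related tasks predict for it, then learn the target task on that extrinsic representation. The ridge models and synthetic related tasks below are stand-ins for illustration only.

```python
# Sketch of the core transformation under assumed data: replace the intrinsic
# features of a new task by the predictions of models trained on related tasks.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X, _ = make_regression(n_samples=300, n_features=20, random_state=0)

# Pretend these were fit on related tasks (here: random related targets).
related_models = [Ridge().fit(X, X @ rng.randn(20)) for _ in range(5)]

def extrinsic(X):
    # Each example is re-described by what the related models predict for it.
    return np.column_stack([m.predict(X) for m in related_models])

y_new = X @ rng.randn(20) + 0.1 * rng.randn(300)       # the target task
model = Ridge().fit(extrinsic(X[:200]), y_new[:200])   # train on transformed features
score = model.score(extrinsic(X[200:]), y_new[200:])
```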

Duelling Bandits with Weak Regret in Adversarial Environments

Title Duelling Bandits with Weak Regret in Adversarial Environments
Authors Lennard Hilgendorf
Abstract Research on the multi-armed bandit problem has studied the trade-off of exploration and exploitation in depth. However, there are numerous applications where the cardinal absolute-valued feedback model (e.g. ratings from one to five) is not suitable. This has motivated the formulation of the duelling bandits problem, where the learner picks a pair of actions and observes a noisy binary feedback, indicating a relative preference between the two. There exist a multitude of different settings and interpretations of the problem for two reasons. First, due to the absence of a total order of actions, there is no natural definition of the best action. Existing work either explicitly assumes the existence of a linear order, or uses a custom definition for the winner. Second, there are multiple reasonable notions of regret to measure the learner’s performance. Most prior work has been focussing on the $\textit{strong regret}$, which averages the quality of the two actions picked. This work focusses on the $\textit{weak regret}$, which is based on the quality of the better of the two actions selected. Weak regret is the more appropriate performance measure when the pair’s inferior action has no significant detrimental effect on the pair’s quality. We study the duelling bandits problem in the adversarial setting. We provide an algorithm which has theoretical guarantees in both the utility-based setting, which implies a total order, and the unrestricted setting. For the latter, we work with the $\textit{Borda winner}$, finding the action maximising the probability of winning against an action sampled uniformly at random. The thesis concludes with experimental results based on both real-world data and synthetic data, showing the algorithm’s performance and limitations.
Tasks
Published 2018-12-10
URL http://arxiv.org/abs/1812.04152v1
PDF http://arxiv.org/pdf/1812.04152v1.pdf
PWC https://paperswithcode.com/paper/duelling-bandits-with-weak-regret-in
Repo
Framework
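A toy sketch of the setting under assumed quantities: a consistent pairwise preference matrix, the Borda winner as the action with the highest average win probability, and a weak-regret tally for randomly chosen pairs based on the better action in each pair. The Borda-score-based regret formula here is one illustrative choice, not necessarily the exact definition used in the thesis.

```python
# Toy duelling-bandit setup: preference matrix, Borda winner, weak regret of random pairs.
import numpy as np

rng = np.random.default_rng(0)
K = 5
A = rng.uniform(0.1, 1.0, (K, K))
P = A / (A + A.T)                        # P[i, j] + P[j, i] = 1, diagonal = 0.5

borda = (P.sum(axis=1) - 0.5) / (K - 1)  # average probability of beating another action
winner = int(np.argmax(borda))           # the Borda winner

def weak_regret(a, b):
    # Weak regret only penalises the better of the two selected actions.
    return borda[winner] - max(borda[a], borda[b])

pairs = rng.integers(0, K, size=(1000, 2))
cumulative = sum(weak_regret(a, b) for a, b in pairs)
print(f"cumulative weak regret of random pairing: {cumulative:.1f}")
```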

DALEX: explainers for complex predictive models

Title DALEX: explainers for complex predictive models
Authors Przemyslaw Biecek
Abstract Predictive modeling is increasingly dominated by flexible yet complex methods such as neural networks or ensembles (model stacking, boosting or bagging). Such methods are usually described by a large number of parameters or hyperparameters - a price that one needs to pay for this flexibility. The sheer number of parameters makes models hard to understand. This paper describes a consistent collection of explainers for predictive models, a.k.a. black boxes. Each explainer is a technique for exploring a black-box model. The presented approaches are model-agnostic, which means that they extract useful information from any predictive method regardless of its internal structure. Each explainer is linked with a specific aspect of a model. Some are useful in decomposing predictions, some serve better in understanding performance, while others are useful in understanding the importance and conditional responses of a particular variable. Every explainer presented in this paper works for a single model or for a collection of models. In the latter case, models can be compared against each other. Such comparison helps to find strengths and weaknesses of different approaches and gives additional possibilities for model validation. The presented explainers are implemented in the DALEX package for R. They are based on a uniform, standardized grammar of model exploration which may be easily extended. The current implementation supports the most popular frameworks for classification and regression.
Tasks
Published 2018-06-23
URL http://arxiv.org/abs/1806.08915v2
PDF http://arxiv.org/pdf/1806.08915v2.pdf
PWC https://paperswithcode.com/paper/dalex-explainers-for-complex-predictive
Repo
Framework
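The DALEX explainers themselves are implemented in R; as a language-agnostic illustration of what a model-agnostic explainer does, the Python sketch below computes permutation-style variable importance for an arbitrary fitted model. It is an analogue of the idea, not the DALEX API.

```python
# Not the DALEX package itself (implemented in R); a minimal Python sketch of
# the model-agnostic idea behind one explainer type: permutation importance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
baseline = model.score(X, y)

rng = np.random.default_rng(0)
for j in range(X.shape[1]):
    Xp = X.copy()
    rng.shuffle(Xp[:, j])                 # break the link between feature j and y
    drop = baseline - model.score(Xp, y)  # larger drop = more important feature
    print(f"feature {j}: importance {drop:.3f}")
```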

A Survey of Safety and Trustworthiness of Deep Neural Networks

Title A Survey of Safety and Trustworthiness of Deep Neural Networks
Authors Xiaowei Huang, Daniel Kroening, Wenjie Ruan, James Sharp, Youcheng Sun, Emese Thamo, Min Wu, Xinping Yi
Abstract In the past few years, significant progress has been made on deep neural networks (DNNs) in achieving human-level performance on several long-standing tasks. With the broader deployment of DNNs in various applications, concerns about their safety and trustworthiness have been raised in public, especially after the widely reported fatal incidents involving self-driving cars. Research to address these concerns is very active, with many papers released in the past few years. This survey paper reviews the current research effort on making DNNs safe and trustworthy, focusing on four aspects: verification, testing, adversarial attack and defence, and interpretability. In total, we surveyed 178 papers, most of which were published after 2017.
Tasks Adversarial Attack, Self-Driving Cars
Published 2018-12-18
URL https://arxiv.org/abs/1812.08342v4
PDF https://arxiv.org/pdf/1812.08342v4.pdf
PWC https://paperswithcode.com/paper/safety-and-trustworthiness-of-deep-neural
Repo
Framework

Diffeomorphic Learning

Title Diffeomorphic Learning
Authors Laurent Younes
Abstract We introduce in this paper a learning paradigm in which the training data is transformed by a diffeomorphic transformation before prediction. The learning algorithm minimizes a cost function evaluating the prediction error on the training set, penalized by the distance between the diffeomorphism and the identity. The approach borrows ideas from shape analysis, where diffeomorphisms are estimated for shape and image alignment, and brings them into a previously unexplored setting, estimating, in particular, diffeomorphisms in much higher dimensions. After introducing the concept and describing a learning algorithm, we present diverse applications, mostly with synthetic examples, demonstrating the potential of the approach, as well as some insight into how it can be improved.
Tasks
Published 2018-06-04
URL https://arxiv.org/abs/1806.01240v3
PDF https://arxiv.org/pdf/1806.01240v3.pdf
PWC https://paperswithcode.com/paper/diffeomorphic-learning
Repo
Framework
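A loose illustration of the cost structure described in the abstract, not the paper's diffeomorphism machinery: a small network warps the inputs, a simple classifier predicts on the warped points, and a penalty keeps the learned transformation close to the identity. All sizes, the toy labelling, and the penalty weight are assumptions.

```python
# Illustrative sketch: prediction loss + penalty on deviation from the identity map.
import torch
import torch.nn as nn

warp = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))  # displacement field
clf = nn.Linear(2, 2)
opt = torch.optim.Adam(list(warp.parameters()) + list(clf.parameters()), lr=1e-2)

x = torch.randn(256, 2)
y = (x[:, 0] * x[:, 1] > 0).long()          # a non-linearly separable toy labelling

for _ in range(200):
    disp = warp(x)                          # transformation = identity + displacement
    loss = nn.functional.cross_entropy(clf(x + disp), y)
    loss = loss + 0.1 * disp.pow(2).mean()  # keep the transformation near the identity
    opt.zero_grad()
    loss.backward()
    opt.step()
```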

Improving CNN classifiers by estimating test-time priors

Title Improving CNN classifiers by estimating test-time priors
Authors Milan Sulc, Jiri Matas
Abstract The problem of different training and test set class priors is addressed in the context of CNN classifiers. We compare two different approaches to estimating the new priors: an existing Maximum Likelihood Estimation approach (optimized by an EM algorithm or by projected gradient descent) and a proposed Maximum a Posteriori approach, which increases the stability of the estimate by introducing a Dirichlet hyper-prior on the class prior probabilities. Experimental results show a significant improvement on fine-grained classification tasks using known evaluation-time priors, increasing the top-1 accuracy by 4.0% on the FGVC iNaturalist 2018 validation set and by 3.9% on the FGVCx Fungi 2018 validation set. Estimation of the unknown test set priors noticeably increases the accuracy on the PlantCLEF dataset, allowing a single CNN model to achieve state-of-the-art results and outperform the competition-winning ensemble of 12 CNNs. The proposed Maximum a Posteriori estimation increases the prediction accuracy by 2.8% on PlantCLEF 2017 and by 1.8% on FGVCx Fungi, where the existing MLE method would lead to a decrease in accuracy.
Tasks Image Classification
Published 2018-05-21
URL http://arxiv.org/abs/1805.08235v2
PDF http://arxiv.org/pdf/1805.08235v2.pdf
PWC https://paperswithcode.com/paper/improving-cnn-classifiers-by-estimating-test
Repo
Framework
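The Maximum Likelihood baseline the paper compares against can be sketched with the classic EM-style prior re-estimation: repeatedly re-weight the classifier's posteriors by the ratio of the current prior estimate to the training priors, then average to obtain a new prior estimate. The Dirichlet random outputs below are a stand-in for real CNN softmax outputs, and the paper's MAP variant (with its Dirichlet hyper-prior) is not shown.

```python
# Sketch of EM-style test-time prior estimation from classifier posteriors.
import numpy as np

def estimate_test_priors(probs, train_priors, n_iter=100):
    """probs: (N, C) softmax outputs on the unlabelled test set."""
    priors = train_priors.copy()
    for _ in range(n_iter):
        # E-step: re-weight posteriors by the ratio of current to training priors.
        adjusted = probs * (priors / train_priors)
        adjusted /= adjusted.sum(axis=1, keepdims=True)
        # M-step: the new prior estimate is the mean adjusted posterior.
        priors = adjusted.mean(axis=0)
    return priors, adjusted

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=1000)     # stand-in for CNN outputs
train_priors = np.array([1 / 3, 1 / 3, 1 / 3])
new_priors, corrected = estimate_test_priors(probs, train_priors)
```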

Geometry and clustering with metrics derived from separable Bregman divergences

Title Geometry and clustering with metrics derived from separable Bregman divergences
Authors Erika Gomes-Gonçalves, Henryk Gzyl, Frank Nielsen
Abstract Separable Bregman divergences induce Riemannian metric spaces that are isometric to the Euclidean space after monotone embeddings. We investigate fixed rate quantization and its codebook Voronoi diagrams, and report on experimental performances of partition-based, hierarchical, and soft clustering algorithms with respect to these Riemann-Bregman distances.
Tasks Quantization
Published 2018-10-25
URL http://arxiv.org/abs/1810.10770v1
PDF http://arxiv.org/pdf/1810.10770v1.pdf
PWC https://paperswithcode.com/paper/geometry-and-clustering-with-metrics-derived
Repo
Framework
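A small sketch of one separable Bregman divergence, the generalised Kullback-Leibler (I-)divergence generated elementwise by F(x) = x log x - x, plugged into a k-means-style partition algorithm; for a Bregman divergence the minimising centroid in the second argument is the arithmetic mean. The data, the number of clusters, and the iteration count are illustrative.

```python
# Sketch: clustering positive data with a separable Bregman divergence (generalised KL).
import numpy as np

def kl_bregman(x, c):
    # D_F(x, c) with F(x) = sum_i (x_i log x_i - x_i), computed elementwise (separable).
    return np.sum(x * np.log(x / c) - x + c, axis=-1)

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 5.0, size=(200, 3))             # positive data points
centers = X[rng.choice(len(X), size=4, replace=False)]

for _ in range(10):
    # Assignment step: nearest center under the Bregman divergence.
    d = np.stack([kl_bregman(X, c) for c in centers], axis=1)
    labels = d.argmin(axis=1)
    # Update step: the Bregman centroid is the arithmetic mean (keep old center if empty).
    centers = np.array([X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
                        for k in range(4)])
```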

User Modeling for Task Oriented Dialogues

Title User Modeling for Task Oriented Dialogues
Authors Izzeddin Gur, Dilek Hakkani-Tur, Gokhan Tur, Pararth Shah
Abstract We introduce end-to-end neural network based models for simulating users of task-oriented dialogue systems. User simulation in dialogue systems is crucial from two different perspectives: (i) automatic evaluation of different dialogue models, and (ii) training task-oriented dialogue systems. We design a hierarchical sequence-to-sequence model that first encodes the initial user goal and system turns into fixed-length representations using Recurrent Neural Networks (RNN). It then encodes the dialogue history using another RNN layer. At each turn, user responses are decoded from the hidden representations of the dialogue-level RNN. This hierarchical user simulator (HUS) approach allows the model to capture undiscovered parts of the user goal without the need for explicit dialogue state tracking. We further develop several variants by utilizing a latent variable model to inject random variations into user responses to promote diversity in simulated user responses, and a novel goal regularization mechanism to penalize divergence of user responses from the initial user goal. We evaluate the proposed models on the movie ticket booking domain by systematically interacting each user simulator with various dialogue system policies trained with different objectives and users.
Tasks Dialogue State Tracking, Task-Oriented Dialogue Systems
Published 2018-11-11
URL http://arxiv.org/abs/1811.04369v1
PDF http://arxiv.org/pdf/1811.04369v1.pdf
PWC https://paperswithcode.com/paper/user-modeling-for-task-oriented-dialogues
Repo
Framework
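A much-simplified, hypothetical sketch of the hierarchical encoding idea: a turn-level GRU encodes each system turn, a dialogue-level GRU runs over the turn encodings, and a user response is decoded greedily from the last dialogue state. Vocabulary size, dimensions, and the BOS token id are assumptions; the goal encoder, latent variable variants, and goal regularization are omitted.

```python
# Simplified sketch of a hierarchical user simulator: turn encoder -> dialogue encoder -> decoder.
import torch
import torch.nn as nn

vocab, dim = 100, 32
embed = nn.Embedding(vocab, dim)
turn_enc = nn.GRU(dim, dim, batch_first=True)   # encodes one system turn
dial_enc = nn.GRU(dim, dim, batch_first=True)   # runs over the turn encodings
decoder = nn.GRU(dim, dim, batch_first=True)
out = nn.Linear(dim, vocab)

# A dialogue of 3 system turns, each 5 tokens (batch size 1).
system_turns = torch.randint(0, vocab, (1, 3, 5))

turn_vecs = []
for t in range(system_turns.size(1)):
    _, h = turn_enc(embed(system_turns[:, t]))  # h: (1, 1, dim)
    turn_vecs.append(h.squeeze(0))
dialogue_state, _ = dial_enc(torch.stack(turn_vecs, dim=1))  # (1, 3, dim)

# Decode a user response after the last turn (greedy, fixed length).
h = dialogue_state[:, -1:].transpose(0, 1).contiguous()      # (1, 1, dim) hidden state
token = torch.zeros(1, 1, dtype=torch.long)                  # assumed BOS id 0
for _ in range(5):
    y, h = decoder(embed(token), h)
    token = out(y).argmax(dim=-1)
```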

Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure

Title Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure
Authors Besmira Nushi, Ece Kamar, Eric Horvitz
Abstract As machine learning systems move from computer-science laboratories into the open world, their accountability becomes a high priority problem. Accountability requires deep understanding of system behavior and its failures. Current evaluation methods such as single-score error metrics and confusion matrices provide aggregate views of system performance that hide important shortcomings. Understanding details about failures is important for identifying pathways for refinement, communicating the reliability of systems in different settings, and for specifying appropriate human oversight and engagement. Characterization of failures and shortcomings is particularly complex for systems composed of multiple machine learned components. For such systems, existing evaluation methods have limited expressiveness in describing and explaining the relationship among input content, the internal states of system components, and final output quality. We present Pandora, a set of hybrid human-machine methods and tools for describing and explaining system failures. Pandora leverages both human and system-generated observations to summarize conditions of system malfunction with respect to the input content and system architecture. We share results of a case study with a machine learning pipeline for image captioning that show how detailed performance views can be beneficial for analysis and debugging.
Tasks Image Captioning
Published 2018-09-19
URL http://arxiv.org/abs/1809.07424v1
PDF http://arxiv.org/pdf/1809.07424v1.pdf
PWC https://paperswithcode.com/paper/towards-accountable-ai-hybrid-human-machine
Repo
Framework

Automatic Generation of Chinese Short Product Titles for Mobile Display

Title Automatic Generation of Chinese Short Product Titles for Mobile Display
Authors Yu Gong, Xusheng Luo, Kenny Q. Zhu, Wenwu Ou, Zhao Li, Lu Duan
Abstract This paper studies the problem of automatically extracting a short title from a manually written longer description of E-commerce products for display on mobile devices. It is a new extractive summarization problem on short text inputs, for which we propose a feature-enriched network model, combining three different categories of features in parallel. Experimental results show that our framework significantly outperforms several baselines by a substantial gain of 4.5%. Moreover, we produce an extractive summarization dataset for E-commerce short texts and will release it to the research community.
Tasks
Published 2018-03-30
URL https://arxiv.org/abs/1803.11359v4
PDF https://arxiv.org/pdf/1803.11359v4.pdf
PWC https://paperswithcode.com/paper/automatic-generation-of-chinese-short-product
Repo
Framework