Paper Group ANR 89
Gear Training: A new way to implement high-performance model-parallel training. Affine Differential Invariants for Invariant Feature Point Detection. Dual Memory Network Model for Biased Product Review Classification. Learning from Exemplars and Prototypes in Machine Learning and Psychology. Denoising Adversarial Autoencoders: Classifying Skin Lesi …
Gear Training: A new way to implement high-performance model-parallel training
Title | Gear Training: A new way to implement high-performance model-parallel training |
Authors | Hao Dong, Shuai Li, Dongchang Xu, Yi Ren, Di Zhang |
Abstract | The training of deep neural networks usually requires tremendous computing resources. Therefore, many deep models are trained on large clusters instead of a single machine or GPU. Although most current research tries to run the whole model on all machines using asynchronous stochastic gradient descent (ASGD), we present a new approach to training deep models in parallel: split the model and then train its different parts separately at different speeds. |
Tasks | |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.03925v1 |
http://arxiv.org/pdf/1806.03925v1.pdf | |
PWC | https://paperswithcode.com/paper/gear-training-a-new-way-to-implement-high |
Repo | |
Framework | |
Affine Differential Invariants for Invariant Feature Point Detection
Title | Affine Differential Invariants for Invariant Feature Point Detection |
Authors | Stanley L. Tuznik, Peter J. Olver, Allen Tannenbaum |
Abstract | Image feature points are detected as pixels which locally maximize a detector function, two commonly used examples of which are the (Euclidean) image gradient and the Harris-Stephens corner detector. A major limitation of these feature detectors is that they are only Euclidean-invariant. In this work we demonstrate the application of a 2D affine-invariant image feature point detector based on differential invariants as derived through the equivariant method of moving frames. The fundamental equi-affine differential invariants for 3D image volumes are also computed. |
Tasks | |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01669v2 |
http://arxiv.org/pdf/1803.01669v2.pdf | |
PWC | https://paperswithcode.com/paper/affine-differential-invariants-for-invariant |
Repo | |
Framework | |
Dual Memory Network Model for Biased Product Review Classification
Title | Dual Memory Network Model for Biased Product Review Classification |
Authors | Yunfei Long, Mingyu Ma, Qin Lu, Rong Xiang, Chu-Ren Huang |
Abstract | In sentiment analysis (SA) of product reviews, both user and product information have proven to be useful. Current approaches handle user profile and product information in a unified model, which may not be able to learn salient features of users and products effectively. In this work, we propose a dual user and product memory network (DUPMN) model to learn user profiles and product reviews using separate memory networks. Then, the two representations are used jointly for sentiment prediction. The use of separate models aims to capture user profiles and product information more effectively. Compared to state-of-the-art unified prediction models, the evaluations on three benchmark datasets, IMDB, Yelp13, and Yelp14, show that our dual learning model gives performance gains of 0.6%, 1.2%, and 0.9%, respectively. The improvements are also deemed very significant as measured by p-values. |
Tasks | Sentiment Analysis |
Published | 2018-09-16 |
URL | http://arxiv.org/abs/1809.05807v1 |
http://arxiv.org/pdf/1809.05807v1.pdf | |
PWC | https://paperswithcode.com/paper/dual-memory-network-model-for-biased-product |
Repo | |
Framework | |
Learning from Exemplars and Prototypes in Machine Learning and Psychology
Title | Learning from Exemplars and Prototypes in Machine Learning and Psychology |
Authors | Julian Zubek, Ludmila Kuncheva |
Abstract | This paper draws a parallel between similarity-based categorisation models developed in cognitive psychology and the nearest neighbour classifier (1-NN) in machine learning. Conceived as a result of the historical rivalry between prototype theories (abstraction) and exemplar theories (memorisation), recent models of human categorisation seek a compromise in-between. Regarding the stimuli (entities to be categorised) as points in a metric space, machine learning offers a large collection of methods to select a small, representative and discriminative point set. These methods are known under various names: instance selection, data editing, prototype selection, prototype generation or prototype replacement. The nearest neighbour classifier is used with the selected reference set. Such a set can be interpreted as a data-driven categorisation model. We juxtapose the models from the two fields to enable cross-referencing. We believe that both machine learning and cognitive psychology can draw inspiration from the comparison and enrich their repertoire of similarity-based models. |
Tasks | |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01130v1 |
http://arxiv.org/pdf/1806.01130v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-exemplars-and-prototypes-in |
Repo | |
Framework | |
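The contrast between exemplar and prototype models described in the abstract above can be illustrated with a small sketch. It is not taken from the paper: the function names, the toy data, and the use of class means as prototypes are illustrative assumptions, and the paper surveys many other ways of selecting a reference set. The same 1-NN rule is applied to two reference sets, the full training set (exemplar-style memorisation) and one mean point per class (prototype-style abstraction).

```python
import math

def nearest_neighbour(reference_set, query):
    """Return the label of the reference point closest to `query` (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for point, label in reference_set:
        d = math.dist(point, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def class_means(training_set):
    """Build a prototype-style reference set: one mean point per class (an assumed choice)."""
    sums, counts = {}, {}
    for point, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return [(tuple(v / counts[lab] for v in acc), lab) for lab, acc in sums.items()]

# Hypothetical toy stimuli as points in a 2D metric space.
training = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((4.0, 4.0), "b"), ((5.0, 4.0), "b")]

exemplar_model = training                 # memorise every stimulus
prototype_model = class_means(training)   # abstract each class to a single point

print(nearest_neighbour(exemplar_model, (0.9, 0.2)))
print(nearest_neighbour(prototype_model, (4.0, 3.0)))
```

Instance-selection methods sit between these two extremes by keeping a small subset of the training points as the reference set.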
Denoising Adversarial Autoencoders: Classifying Skin Lesions Using Limited Labelled Training Data
Title | Denoising Adversarial Autoencoders: Classifying Skin Lesions Using Limited Labelled Training Data |
Authors | Antonia Creswell, Alison Pouplin, Anil A Bharath |
Abstract | We propose a novel deep learning model for classifying medical images in the setting where there is a large amount of unlabelled medical data available, but labelled data is in limited supply. We consider the specific case of classifying skin lesions as either malignant or benign. In this setting, the proposed approach – the semi-supervised, denoising adversarial autoencoder – is able to utilise vast amounts of unlabelled data to learn a representation for skin lesions, and small amounts of labelled data to assign class labels based on the learned representation. We analyse the contributions of both the adversarial and denoising components of the model and find that the combination yields superior classification performance in the setting of limited labelled training data. |
Tasks | Denoising |
Published | 2018-01-02 |
URL | http://arxiv.org/abs/1801.00693v1 |
http://arxiv.org/pdf/1801.00693v1.pdf | |
PWC | https://paperswithcode.com/paper/denoising-adversarial-autoencoders-1 |
Repo | |
Framework | |
Transformative Machine Learning
Title | Transformative Machine Learning |
Authors | Ivan Olier, Oghenejokpeme I. Orhobor, Joaquin Vanschoren, Ross D. King |
Abstract | The key to success in machine learning (ML) is the use of effective data representations. Traditionally, data representations were hand-crafted. Recently it has been demonstrated that, given sufficient data, deep neural networks can learn effective implicit representations from simple input representations. However, for most scientific problems, the use of deep learning is not appropriate as the amount of available data is limited, and/or the output models must be explainable. Nevertheless, many scientific problems do have significant amounts of data available on related tasks, which makes them amenable to multi-task learning, i.e. learning many related problems simultaneously. Here we propose a novel and general representation learning approach for multi-task learning that works successfully with small amounts of data. The fundamental new idea is to transform an input intrinsic data representation (i.e., handcrafted features), to an extrinsic representation based on what a pre-trained set of models predict about the examples. This transformation has the dual advantages of producing significantly more accurate predictions, and providing explainable models. To demonstrate the utility of this transformative learning approach, we have applied it to three real-world scientific problems: drug-design (quantitative structure activity relationship learning), predicting human gene expression (across different tissue types and drug treatments), and meta-learning for machine learning (predicting which machine learning methods work best for a given problem). In all three problems, transformative machine learning significantly outperforms the best intrinsic representation. |
Tasks | Meta-Learning, Multi-Task Learning, Representation Learning |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03392v1 |
http://arxiv.org/pdf/1811.03392v1.pdf | |
PWC | https://paperswithcode.com/paper/transformative-machine-learning |
Repo | |
Framework | |
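The transformation described in the abstract above, from an intrinsic to an extrinsic representation, can be sketched as follows. This is only an illustration under stated assumptions: the two "pre-trained models" are hypothetical hand-written functions standing in for models learned on related tasks, and the names `to_extrinsic`, `related_task_a`, and `related_task_b` do not come from the paper.

```python
def to_extrinsic(features, pretrained_models):
    """Map intrinsic (hand-crafted) features to the predictions of pre-trained models."""
    return [model(features) for model in pretrained_models]

# Hypothetical stand-ins for models already trained on two related tasks.
def related_task_a(x):
    return 2.0 * x[0] + x[1]

def related_task_b(x):
    return x[0] - x[1]

pretrained = [related_task_a, related_task_b]

# Intrinsic representation of three examples from the (small) target task.
intrinsic_dataset = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]]

# Extrinsic representation: one column per pre-trained model.
extrinsic_dataset = [to_extrinsic(x, pretrained) for x in intrinsic_dataset]
print(extrinsic_dataset)
```

A learner for the target task would then be trained on `extrinsic_dataset` instead of `intrinsic_dataset`. Each extrinsic feature is the prediction of a named model, which is one way to read the explainability claim in the abstract.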
Duelling Bandits with Weak Regret in Adversarial Environments
Title | Duelling Bandits with Weak Regret in Adversarial Environments |
Authors | Lennard Hilgendorf |
Abstract | Research on the multi-armed bandit problem has studied the trade-off of exploration and exploitation in depth. However, there are numerous applications where the cardinal absolute-valued feedback model (e.g. ratings from one to five) is not suitable. This has motivated the formulation of the duelling bandits problem, where the learner picks a pair of actions and observes a noisy binary feedback, indicating a relative preference between the two. There exist a multitude of different settings and interpretations of the problem for two reasons. First, due to the absence of a total order of actions, there is no natural definition of the best action. Existing work either explicitly assumes the existence of a linear order, or uses a custom definition for the winner. Second, there are multiple reasonable notions of regret to measure the learner’s performance. Most prior work has been focussing on the $\textit{strong regret}$, which averages the quality of the two actions picked. This work focusses on the $\textit{weak regret}$, which is based on the quality of the better of the two actions selected. Weak regret is the more appropriate performance measure when the pair’s inferior action has no significant detrimental effect on the pair’s quality. We study the duelling bandits problem in the adversarial setting. We provide an algorithm which has theoretical guarantees in both the utility-based setting, which implies a total order, and the unrestricted setting. For the latter, we work with the $\textit{Borda winner}$, finding the action maximising the probability of winning against an action sampled uniformly at random. The thesis concludes with experimental results based on both real-world data and synthetic data, showing the algorithm’s performance and limitations. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.04152v1 |
http://arxiv.org/pdf/1812.04152v1.pdf | |
PWC | https://paperswithcode.com/paper/duelling-bandits-with-weak-regret-in |
Repo | |
Framework | |
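The notions of regret and of the Borda winner mentioned in the abstract above can be made concrete with a short sketch. It is not the thesis's algorithm, only the definitions as the abstract states them, under two assumptions: regret is written for the utility-based setting with a hypothetical utility vector `u`, and the Borda score averages over opponents other than the action itself. Including the action itself with probability 0.5 would not change which action wins.

```python
def borda_winner(P):
    """P[i][j] is the probability that action i beats action j.

    Returns the index of the action with the highest probability of beating
    an opponent drawn uniformly at random, together with all Borda scores.
    """
    k = len(P)
    scores = [sum(P[i][j] for j in range(k) if j != i) / (k - 1) for i in range(k)]
    winner = max(range(k), key=scores.__getitem__)
    return winner, scores

def strong_regret(u, a, b):
    """Per-round strong regret: gap between the best action and the pair's average quality."""
    return max(u) - (u[a] + u[b]) / 2

def weak_regret(u, a, b):
    """Per-round weak regret: gap between the best action and the better one of the pair."""
    return max(u) - max(u[a], u[b])

# Hypothetical preference matrix and utilities for three actions.
P = [[0.5, 0.6, 0.9],
     [0.4, 0.5, 0.7],
     [0.1, 0.3, 0.5]]
u = [1.0, 0.5, 0.0]

winner, scores = borda_winner(P)
print(winner, scores)
print(strong_regret(u, 0, 2), weak_regret(u, 0, 2))
```

Duelling the best action against the worst one incurs no weak regret but positive strong regret. This matches the case the abstract describes, where the inferior action of the pair does no significant harm.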
DALEX: explainers for complex predictive models
Title | DALEX: explainers for complex predictive models |
Authors | Przemyslaw Biecek |
Abstract | Predictive modeling is invaded by elastic, yet complex methods such as neural networks or ensembles (model stacking, boosting or bagging). Such methods are usually described by a large number of parameters or hyperparameters, a price that one needs to pay for elasticity. The sheer number of parameters makes models hard to understand. This paper describes a consistent collection of explainers for predictive models, a.k.a. black boxes. Each explainer is a technique for exploring a black-box model. The presented approaches are model-agnostic, which means that they extract useful information from any predictive method regardless of its internal structure. Each explainer is linked with a specific aspect of a model. Some are useful in decomposing predictions, some serve better in understanding performance, while others are useful in understanding the importance and conditional responses of a particular variable. Every explainer presented in this paper works for a single model or for a collection of models. In the latter case, models can be compared against each other. Such comparison helps to find strengths and weaknesses of different approaches and gives additional possibilities for model validation. The presented explainers are implemented in the DALEX package for R. They are based on a uniform standardized grammar of model exploration which may be easily extended. The current implementation supports the most popular frameworks for classification and regression. |
Tasks | |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.08915v2 |
http://arxiv.org/pdf/1806.08915v2.pdf | |
PWC | https://paperswithcode.com/paper/dalex-explainers-for-complex-predictive |
Repo | |
Framework | |
A Survey of Safety and Trustworthiness of Deep Neural Networks
Title | A Survey of Safety and Trustworthiness of Deep Neural Networks |
Authors | Xiaowei Huang, Daniel Kroening, Wenjie Ruan, James Sharp, Youcheng Sun, Emese Thamo, Min Wu, Xinping Yi |
Abstract | In the past few years, significant progress has been made on deep neural networks (DNNs) in achieving human-level performance on several long-standing tasks. With the broader deployment of DNNs in various applications, concerns about their safety and trustworthiness have been raised in public, especially after the widely reported fatal incidents involving self-driving cars. Research to address these concerns is very active, with many papers released in the past few years. This survey paper reviews the current research effort on making DNNs safe and trustworthy, focusing on four aspects: verification, testing, adversarial attack and defence, and interpretability. In total, we surveyed 178 papers, most of which were published after 2017. |
Tasks | Adversarial Attack, Self-Driving Cars |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.08342v4 |
https://arxiv.org/pdf/1812.08342v4.pdf | |
PWC | https://paperswithcode.com/paper/safety-and-trustworthiness-of-deep-neural |
Repo | |
Framework | |
Diffeomorphic Learning
Title | Diffeomorphic Learning |
Authors | Laurent Younes |
Abstract | In this paper we introduce a learning paradigm in which the training data is transformed by a diffeomorphic transformation before prediction. The learning algorithm minimizes a cost function that evaluates the prediction error on the training set, penalized by the distance between the diffeomorphism and the identity. The approach borrows ideas from shape analysis, where diffeomorphisms are estimated for shape and image alignment, and brings them to a previously unexplored setting, in particular estimating diffeomorphisms in much higher dimensions. After introducing the concept and describing a learning algorithm, we present diverse applications, mostly with synthetic examples, demonstrating the potential of the approach, as well as some insight into how it can be improved. |
Tasks | |
Published | 2018-06-04 |
URL | https://arxiv.org/abs/1806.01240v3 |
https://arxiv.org/pdf/1806.01240v3.pdf | |
PWC | https://paperswithcode.com/paper/diffeomorphic-learning |
Repo | |
Framework | |
Improving CNN classifiers by estimating test-time priors
Title | Improving CNN classifiers by estimating test-time priors |
Authors | Milan Sulc, Jiri Matas |
Abstract | The problem of different training and test set class priors is addressed in the context of CNN classifiers. We compare two different approaches to estimating the new priors: an existing Maximum Likelihood Estimation approach (optimized by an EM algorithm or by projected gradient descent) and a proposed Maximum a Posteriori approach, which increases the stability of the estimate by introducing a Dirichlet hyper-prior on the class prior probabilities. Experimental results show a significant improvement on the fine-grained classification tasks using known evaluation-time priors, increasing the top-1 accuracy by 4.0% on the FGVC iNaturalist 2018 validation set and by 3.9% on the FGVCx Fungi 2018 validation set. Estimation of the unknown test set priors noticeably increases the accuracy on the PlantCLEF dataset, allowing a single CNN model to achieve state-of-the-art results and outperform the competition-winning ensemble of 12 CNNs. The proposed Maximum a Posteriori estimation increases the prediction accuracy by 2.8% on PlantCLEF 2017 and by 1.8% on FGVCx Fungi, where the existing MLE method would lead to a decrease in accuracy. |
Tasks | Image Classification |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08235v2 |
http://arxiv.org/pdf/1805.08235v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-cnn-classifiers-by-estimating-test |
Repo | |
Framework | |
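The idea in the abstract above, correcting classifier outputs when test-time class priors differ from training priors, can be sketched in a few lines. This is a minimal illustration, not the authors' code. It shows the standard re-weighting of posteriors by the prior ratio and a textbook EM loop for the maximum-likelihood estimate of unknown test priors. The proposed MAP variant with a Dirichlet hyper-prior is not reproduced here, and the function names and toy posteriors are assumptions.

```python
def adjust_posteriors(posteriors, train_priors, test_priors):
    """Re-weight one sample's class posteriors by test_prior / train_prior and renormalise."""
    weighted = [p * q / t for p, q, t in zip(posteriors, test_priors, train_priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

def estimate_test_priors(all_posteriors, train_priors, iterations=100):
    """EM estimate of the test-set class priors from classifier outputs on unlabelled test data."""
    priors = list(train_priors)  # start from the training priors
    for _ in range(iterations):
        # E-step: posteriors under the current estimate of the test priors.
        adjusted = [adjust_posteriors(p, train_priors, priors) for p in all_posteriors]
        # M-step: the new prior of each class is its mean adjusted posterior.
        priors = [sum(column) / len(adjusted) for column in zip(*adjusted)]
    return priors

# Hypothetical softmax outputs of a two-class classifier trained with balanced priors.
train_priors = [0.5, 0.5]
test_posteriors = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]

estimated = estimate_test_priors(test_posteriors, train_priors)
print(estimated)
print(adjust_posteriors([0.5, 0.5], train_priors, [0.9, 0.1]))
```

When the evaluation-time priors are known, only `adjust_posteriors` is needed. The EM loop covers the case where they must be estimated from the unlabelled test set.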
Geometry and clustering with metrics derived from separable Bregman divergences
Title | Geometry and clustering with metrics derived from separable Bregman divergences |
Authors | Erika Gomes-Gonçalves, Henryk Gzyl, Frank Nielsen |
Abstract | Separable Bregman divergences induce Riemannian metric spaces that are isometric to the Euclidean space after monotone embeddings. We investigate fixed rate quantization and its codebook Voronoi diagrams, and report on experimental performances of partition-based, hierarchical, and soft clustering algorithms with respect to these Riemann-Bregman distances. |
Tasks | Quantization |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10770v1 |
http://arxiv.org/pdf/1810.10770v1.pdf | |
PWC | https://paperswithcode.com/paper/geometry-and-clustering-with-metrics-derived |
Repo | |
Framework | |
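The construction summarised in the abstract above can be illustrated with one concrete generator. The sketch below assumes the generator f(t) = t log t - t, chosen here for illustration and not necessarily the paper's example. For a separable divergence the induced Riemannian metric is diagonal with entries f''(x_i), so the coordinate-wise monotone embedding h(t) = ∫ sqrt(f''(s)) ds turns the Riemannian distance into a Euclidean one. With f''(t) = 1/t this gives h(t) = 2 sqrt(t).

```python
import math

def separable_bregman(f, df, x, y):
    """Separable Bregman divergence: sum_i f(x_i) - f(y_i) - f'(y_i) * (x_i - y_i)."""
    return sum(f(a) - f(b) - df(b) * (a - b) for a, b in zip(x, y))

# Assumed generator: f(t) = t log t - t, so f'(t) = log t and f''(t) = 1/t.
def f(t):
    return t * math.log(t) - t

def df(t):
    return math.log(t)

def embed(t):
    """Monotone embedding h(t) = 2 * sqrt(t), an antiderivative of sqrt(f''(t)) = t ** -0.5."""
    return 2.0 * math.sqrt(t)

def riemann_bregman_distance(x, y):
    """Distance under the metric diag(f''(x_i)), computed as a Euclidean distance after embedding."""
    return math.dist([embed(a) for a in x], [embed(b) for b in y])

x, y = [1.0, 4.0], [4.0, 1.0]
print(separable_bregman(f, df, x, y))   # asymmetric divergence (generalised KL for this f)
print(riemann_bregman_distance(x, y))   # symmetric distance satisfying the triangle inequality
```

Because the embedded points live in ordinary Euclidean space, clustering methods such as k-means can be run on `embed`-transformed coordinates and the resulting centres mapped back through the inverse embedding.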
User Modeling for Task Oriented Dialogues
Title | User Modeling for Task Oriented Dialogues |
Authors | Izzeddin Gur, Dilek Hakkani-Tur, Gokhan Tur, Pararth Shah |
Abstract | We introduce end-to-end neural network based models for simulating users of task-oriented dialogue systems. User simulation in dialogue systems is crucial from two different perspectives: (i) automatic evaluation of different dialogue models, and (ii) training task-oriented dialogue systems. We design a hierarchical sequence-to-sequence model that first encodes the initial user goal and system turns into fixed-length representations using Recurrent Neural Networks (RNN). It then encodes the dialogue history using another RNN layer. At each turn, user responses are decoded from the hidden representations of the dialogue-level RNN. This hierarchical user simulator (HUS) approach allows the model to capture undiscovered parts of the user goal without the need for explicit dialogue state tracking. We further develop several variants by utilizing a latent variable model to inject random variations into user responses, promoting diversity in simulated user responses, and a novel goal regularization mechanism to penalize divergence of user responses from the initial user goal. We evaluate the proposed models on the movie ticket booking domain by systematically interacting each user simulator with various dialogue system policies trained with different objectives and users. |
Tasks | Dialogue State Tracking, Task-Oriented Dialogue Systems |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04369v1 |
http://arxiv.org/pdf/1811.04369v1.pdf | |
PWC | https://paperswithcode.com/paper/user-modeling-for-task-oriented-dialogues |
Repo | |
Framework | |
Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure
Title | Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure |
Authors | Besmira Nushi, Ece Kamar, Eric Horvitz |
Abstract | As machine learning systems move from computer-science laboratories into the open world, their accountability becomes a high priority problem. Accountability requires deep understanding of system behavior and its failures. Current evaluation methods such as single-score error metrics and confusion matrices provide aggregate views of system performance that hide important shortcomings. Understanding details about failures is important for identifying pathways for refinement, communicating the reliability of systems in different settings, and for specifying appropriate human oversight and engagement. Characterization of failures and shortcomings is particularly complex for systems composed of multiple machine learned components. For such systems, existing evaluation methods have limited expressiveness in describing and explaining the relationship among input content, the internal states of system components, and final output quality. We present Pandora, a set of hybrid human-machine methods and tools for describing and explaining system failures. Pandora leverages both human and system-generated observations to summarize conditions of system malfunction with respect to the input content and system architecture. We share results of a case study with a machine learning pipeline for image captioning that show how detailed performance views can be beneficial for analysis and debugging. |
Tasks | Image Captioning |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07424v1 |
http://arxiv.org/pdf/1809.07424v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-accountable-ai-hybrid-human-machine |
Repo | |
Framework | |
Automatic Generation of Chinese Short Product Titles for Mobile Display
Title | Automatic Generation of Chinese Short Product Titles for Mobile Display |
Authors | Yu Gong, Xusheng Luo, Kenny Q. Zhu, Wenwu Ou, Zhao Li, Lu Duan |
Abstract | This paper studies the problem of automatically extracting a short title from a manually written longer description of E-commerce products for display on mobile devices. It is a new extractive summarization problem on short text inputs, for which we propose a feature-enriched network model, combining three different categories of features in parallel. Experimental results show that our framework significantly outperforms several baselines by a substantial gain of 4.5%. Moreover, we produce an extractive summarization dataset for E-commerce short texts and will release it to the research community. |
Tasks | |
Published | 2018-03-30 |
URL | https://arxiv.org/abs/1803.11359v4 |
https://arxiv.org/pdf/1803.11359v4.pdf | |
PWC | https://paperswithcode.com/paper/automatic-generation-of-chinese-short-product |
Repo | |
Framework | |