Paper Group ANR 260
Contrastive Explanation: A Structural-Model Approach. DeepDrum: An Adaptive Conditional Neural Network. ICADx: Interpretable computer aided diagnosis of breast masses. Learning Tensor Latent Features. Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction. Deep Learning for Digital Text Analytics: Sentiment …
Contrastive Explanation: A Structural-Model Approach
Title | Contrastive Explanation: A Structural-Model Approach |
Authors | Tim Miller |
Abstract | The topic of causal explanation in artificial intelligence has gathered interest in recent years as researchers and practitioners aim to increase trust and understanding of intelligent decision-making and action. While different sub-fields have looked into this problem with a sub-field-specific view, there are few models that aim to capture explanation in AI more generally. One general model is based on structural causal models. It defines an explanation as a fact that, if found to be true, would constitute an actual cause of a specific event. However, research in philosophy and social sciences shows that explanations are contrastive: that is, when people ask for an explanation of an event – the fact – they (sometimes implicitly) are asking for an explanation relative to some contrast case; that is, “Why P rather than Q?”. In this paper, we extend the structural causal model approach to define two complementary notions of contrastive explanation, and demonstrate them on two classical AI problems: classification and planning. We believe that this model can be used to define contrastive explanation of other subfield-specific AI models. |
Tasks | Decision Making |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03163v1 |
PDF | http://arxiv.org/pdf/1811.03163v1.pdf |
PWC | https://paperswithcode.com/paper/contrastive-explanation-a-structural-model |
Repo | |
Framework | |
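The structural-model view above can be made concrete with a toy example. The Python sketch below is written for this listing, not taken from the paper: it uses a made-up loan-approval model, encodes its structural equations directly, and searches for a single-variable intervention that turns the fact P ("denied") into the foil Q ("approved"), which is the flavour of contrastive question the paper formalises.

```python
# A minimal sketch (not the paper's formal definitions): a toy structural causal
# model where we look for an intervention that flips the outcome from the fact P
# ("loan denied") to the foil Q ("loan approved"), illustrating "Why P rather than Q?".

def simulate(exogenous, interventions=None):
    """Evaluate the structural equations, honouring any do()-style interventions."""
    interventions = interventions or {}
    v = dict(exogenous)            # exogenous variables: income, debt
    v.update(interventions)

    def value(name, default_fn):
        return interventions[name] if name in interventions else default_fn()

    v["high_risk"] = value("high_risk", lambda: v["debt"] > 0.5 * v["income"])
    v["approved"] = value("approved", lambda: v["income"] > 30000 and not v["high_risk"])
    return v

actual = simulate({"income": 40000, "debt": 25000})
print("fact P:", "denied" if not actual["approved"] else "approved")

# Contrastive question: why denied (P) rather than approved (Q)?
# Search single-variable interventions that realise the foil Q.
for var, alt in [("debt", 10000), ("income", 60000), ("high_risk", False)]:
    counterfactual = simulate({"income": 40000, "debt": 25000}, {var: alt})
    if counterfactual["approved"]:
        print(f"setting {var}={alt} yields the foil Q -> cites {var} as a contrastive cause")
```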
DeepDrum: An Adaptive Conditional Neural Network
Title | DeepDrum: An Adaptive Conditional Neural Network |
Authors | Dimos Makris, Maximos Kaliakatsos-Papakostas, Katia Lida Kermanidis |
Abstract | Considering music as a sequence of events with multiple complex dependencies, the Long Short-Term Memory (LSTM) architecture has proven very efficient in learning and reproducing musical styles. However, the generation of rhythms requires additional information regarding musical structure and accompanying instruments. In this paper we present DeepDrum, an adaptive Neural Network capable of generating drum rhythms under constraints imposed by Feed-Forward (Conditional) Layers which contain musical parameters along with given instrumentation information (e.g. bass and guitar notes). Results on generated drum sequences are presented indicating that DeepDrum is effective in producing rhythms that resemble the learned style, while at the same time conforming to given constraints that were unknown during the training process. |
Tasks | |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06127v2 |
PDF | http://arxiv.org/pdf/1809.06127v2.pdf |
PWC | https://paperswithcode.com/paper/deepdrum-an-adaptive-conditional-neural |
Repo | |
Framework | |
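As a rough illustration of the conditional architecture described above, the PyTorch sketch below feeds an LSTM with drum-event embeddings that are concatenated at every step with the output of feed-forward (conditional) layers applied to structure/instrumentation features. All layer sizes, the vocabulary size and the conditioning dimension are assumptions made for this sketch, not the authors' configuration.

```python
# Illustrative-only sketch of a conditional recurrent drum generator in the
# spirit of DeepDrum; sizes and names are assumptions, not the paper's model.
import torch
import torch.nn as nn

class ConditionalDrumLSTM(nn.Module):
    def __init__(self, n_drum_tokens=128, cond_dim=32, emb_dim=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(n_drum_tokens, emb_dim)
        self.condition = nn.Sequential(          # feed-forward (conditional) layers
            nn.Linear(cond_dim, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU()
        )
        self.lstm = nn.LSTM(emb_dim + 64, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_drum_tokens)

    def forward(self, drum_tokens, cond_features):
        # drum_tokens: (batch, time) ints; cond_features: (batch, time, cond_dim)
        x = torch.cat([self.embed(drum_tokens), self.condition(cond_features)], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out)                    # next-event logits per step

model = ConditionalDrumLSTM()
logits = model(torch.randint(0, 128, (4, 16)), torch.randn(4, 16, 32))
print(logits.shape)  # torch.Size([4, 16, 128])
```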
ICADx: Interpretable computer aided diagnosis of breast masses
Title | ICADx: Interpretable computer aided diagnosis of breast masses |
Authors | Seong Tae Kim, Hakmin Lee, Hak Gu Kim, Yong Man Ro |
Abstract | In this study, a novel computer aided diagnosis (CADx) framework is devised to investigate interpretability in classifying breast masses. Recently, deep learning technology has been successfully applied to medical image analysis, including CADx. Existing deep learning based CADx approaches, however, have a limitation in explaining the diagnostic decision. In real clinical practice, clinical decisions should be made with reasonable explanation, so current deep learning approaches in CADx are limited for real-world deployment. In this paper, we investigate interpretability in CADx with the proposed interpretable CADx (ICADx) framework. The proposed framework is devised as a generative adversarial network, which consists of an interpretable diagnosis network and a synthetic lesion generative network that learn the relationship between malignancy and a standardized description (BI-RADS). The lesion generative network and the interpretable diagnosis network compete through adversarial learning so that both networks improve. The effectiveness of the proposed method was validated on a public mammogram database. Experimental results showed that the proposed ICADx framework could provide interpretability of masses as well as mass classification. This was mainly attributed to the fact that the proposed method was effectively trained to find the relationship between malignancy and interpretations via adversarial learning. These results imply that the proposed ICADx framework could be a promising approach to developing CADx systems. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08960v1 |
PDF | http://arxiv.org/pdf/1805.08960v1.pdf |
PWC | https://paperswithcode.com/paper/icadx-interpretable-computer-aided-diagnosis |
Repo | |
Framework | |
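The two-network structure described above can be sketched in PyTorch as follows. This is a rough skeleton, not the authors' implementation: the layer sizes, the number of BI-RADS descriptor outputs and the patch size are placeholders, and the adversarial losses are only summarised in comments.

```python
# Rough skeleton: an interpretable diagnosis network predicting malignancy plus
# BI-RADS-style descriptors, and a generator mapping those codes back to a lesion
# patch; in the paper the two are trained adversarially. Sizes are assumptions.
import torch
import torch.nn as nn

class DiagnosisNet(nn.Module):          # "interpretable diagnosis network"
    def __init__(self, n_birads=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.malignancy = nn.Linear(32, 1)      # benign vs. malignant logit
        self.birads = nn.Linear(32, n_birads)   # standardized-description logits

    def forward(self, x):
        h = self.features(x)
        return self.malignancy(h), self.birads(h)

class LesionGenerator(nn.Module):       # descriptors -> synthetic lesion patch
    def __init__(self, n_birads=12, patch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_birads + 1, 128), nn.ReLU(),
            nn.Linear(128, patch * patch), nn.Tanh())
        self.patch = patch

    def forward(self, malignancy, birads):
        code = torch.cat([malignancy, birads], dim=-1)
        return self.net(code).view(-1, 1, self.patch, self.patch)

# Shape check: the generator synthesises lesions from (malignancy, BI-RADS) codes
# that the diagnosis network should not be able to tell apart from real masses.
x = torch.randn(2, 1, 32, 32)
mal, birads = DiagnosisNet()(x)
patch = LesionGenerator()(torch.sigmoid(mal), torch.sigmoid(birads))
print(mal.shape, birads.shape, patch.shape)   # (2, 1) (2, 12) (2, 1, 32, 32)
```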
Learning Tensor Latent Features
Title | Learning Tensor Latent Features |
Authors | Sung-En Chang, Xun Zheng, Ian E. H. Yen, Pradeep Ravikumar, Rose Yu |
Abstract | We study the problem of learning latent feature models (LFMs) for tensor data commonly observed in science and engineering, such as hyperspectral imagery. The problem is challenging not only due to the non-convex formulation and the combinatorial nature of the constraints in LFMs, but also due to the high-order correlations in the data. In this work, we formulate a tensor latent feature learning problem by representing the data as a mixture of high-order latent features and binary codes, which are memory efficient and easy to interpret. To make the learning tractable, we propose a novel optimization procedure, Binary Matching Pursuit (BMP), that iteratively searches for binary bases via a MAXCUT-like boolean quadratic solver. Such a procedure is guaranteed to achieve an $\epsilon$-suboptimal solution in $O(1/\epsilon)$ greedy steps, resulting in a trade-off between accuracy and sparsity. When evaluated on both synthetic and real datasets, our experiments show superior performance over baseline methods. |
Tasks | |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04754v1 |
PDF | http://arxiv.org/pdf/1810.04754v1.pdf |
PWC | https://paperswithcode.com/paper/learning-tensor-latent-features |
Repo | |
Framework | |
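To convey the greedy "one binary basis per step" idea in the abstract, here is a heavily simplified numpy sketch. It is not the paper's BMP algorithm: the MAXCUT-like boolean quadratic step is replaced by a crude sign-thresholded leading-direction heuristic, and the data are a matrix rather than a higher-order tensor.

```python
# Simplified greedy binary factorization X ~ Z @ W with binary codes Z and real
# latent features W; each step adds one binary basis chosen by a crude heuristic
# standing in for the boolean quadratic solver used by BMP.
import numpy as np

def greedy_binary_factorization(X, n_features=5):
    N, D = X.shape
    Z = np.zeros((N, 0))
    W = np.zeros((0, D))
    for _ in range(n_features):
        R = X - Z @ W                                  # current residual
        u = np.linalg.svd(R, full_matrices=False)[0][:, 0]
        best = None
        for z in ((u > 0).astype(float), (u <= 0).astype(float)):
            if z.sum() == 0:
                continue
            w = z @ R / z.sum()                        # least-squares feature for this code
            err = np.linalg.norm(R - np.outer(z, w))
            if best is None or err < best[0]:
                best = (err, z, w)
        Z = np.column_stack([Z, best[1]])
        W = np.vstack([W, best[2]])
    return Z, W

rng = np.random.default_rng(0)
X = rng.integers(0, 2, (50, 3)) @ rng.normal(size=(3, 20))   # planted binary structure
Z, W = greedy_binary_factorization(X, n_features=3)
print(np.linalg.norm(X - Z @ W) / np.linalg.norm(X))         # relative reconstruction error
```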
Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction
Title | Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction |
Authors | Artem Sokolov, Julian Hitschler, Stefan Riezler |
Abstract | Stochastic zeroth-order (SZO), or gradient-free, optimization makes it possible to optimize arbitrary functions by relying only on function evaluations under parameter perturbations; however, the iteration complexity of SZO methods suffers a factor proportional to the dimensionality of the perturbed function. We show that in scenarios with natural sparsity patterns, as in structured prediction applications, this factor can be reduced to the expected number of active features over input-output pairs. We give a general proof that applies sparse SZO optimization to Lipschitz-continuous, nonconvex, stochastic objectives, and present an experimental evaluation on linear bandit structured prediction tasks with sparse word-based feature representations that confirms our theoretical results. |
Tasks | Structured Prediction |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04458v2 |
PDF | http://arxiv.org/pdf/1806.04458v2.pdf |
PWC | https://paperswithcode.com/paper/sparse-stochastic-zeroth-order-optimization |
Repo | |
Framework | |
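The core mechanism behind the dimensionality reduction claimed above is that only the coordinates active for the current example need to be perturbed. A minimal numpy illustration of a two-point zeroth-order gradient estimate restricted to the active coordinates follows; the objective, step sizes and the active set are toy assumptions, not the paper's bandit structured prediction setup.

```python
# Two-point SZO step perturbing only the "active" (sparse) coordinates.
import numpy as np

def sparse_szo_step(w, f, active_idx, mu=1e-2, lr=0.1, rng=np.random.default_rng(0)):
    u = np.zeros_like(w)
    u[active_idx] = rng.standard_normal(len(active_idx))   # perturb active features only
    g_scalar = (f(w + mu * u) - f(w - mu * u)) / (2 * mu)   # directional finite difference
    return w - lr * g_scalar * u                            # gradient estimate is g_scalar * u

# Toy objective where only the first 5 of 1000 features matter.
w = np.zeros(1000)
target = np.zeros(1000); target[:5] = 1.0
f = lambda v: np.sum((v[:5] - target[:5]) ** 2)
for _ in range(200):
    w = sparse_szo_step(w, f, active_idx=np.arange(5), lr=0.05)
print(round(f(w), 4))
```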
Deep Learning for Digital Text Analytics: Sentiment Analysis
Title | Deep Learning for Digital Text Analytics: Sentiment Analysis |
Authors | Reshma U, Barathi Ganesh H B, Mandar Kale, Prachi Mankame, Gouri Kulkarni |
Abstract | In today's scenario, imagining a world without negativity is unrealistic, as bad news spreads more virally than good news. Though it may seem impractical in real life, a system can be built using Machine Learning and Natural Language Processing techniques to identify news items with a negative shade and filter them out, delivering only news with a positive shade (good news) to the end user. In this work, around two lakh (200,000) news items were trained and tested using a combination of rule-based and data-driven approaches. VADER, together with a filtration method, was used as an annotation tool, followed by a statistical Machine Learning approach that used a Document Term Matrix (representation) and a Support Vector Machine (classification). Deep Learning methods were then introduced to make the system more reliable (Doc2Vec), culminating in a Convolutional Neural Network (CNN) that yielded better results than the other experimented modules: a training accuracy of 96% and a test accuracy (on internal and external news data) above 85%. |
Tasks | Sentiment Analysis |
Published | 2018-04-10 |
URL | http://arxiv.org/abs/1804.03673v1 |
PDF | http://arxiv.org/pdf/1804.03673v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-digital-text-analytics |
Repo | |
Framework | |
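The first two stages of the pipeline described above (VADER as a weak annotator, then a document-term matrix with an SVM) can be approximated with standard libraries. The sketch below is a simplified stand-in, not the paper's system: the headlines are invented, the paper's filtration rules are omitted, and the Doc2Vec and CNN stages are left out. It assumes `nltk.download('vader_lexicon')` has been run once.

```python
# Weak labels from VADER, then a document-term matrix + linear SVM.
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

headlines = [
    "Great news as rescue team saves stranded hikers",
    "Wonderful community effort rebuilds local school",
    "Horrible crash kills dozens on the highway",
    "Terrible storm destroys homes across the coast",
]

# Rule-based annotation: VADER compound score as a weak positive/negative label.
vader = SentimentIntensityAnalyzer()
labels = [1 if vader.polarity_scores(h)["compound"] >= 0 else 0 for h in headlines]

# Data-driven stage: document-term matrix + linear SVM trained on the weak labels.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(headlines)
clf = LinearSVC().fit(X, labels)

print(clf.predict(vectorizer.transform(["Good news: volunteers plant thousands of trees"])))
```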
Self-Attention-Based Message-Relevant Response Generation for Neural Conversation Model
Title | Self-Attention-Based Message-Relevant Response Generation for Neural Conversation Model |
Authors | Jonggu Kim, Doyeon Kong, Jong-Hyeok Lee |
Abstract | Using a sequence-to-sequence framework, many neural conversation models for chit-chat produce responses that sound natural. Nevertheless, these models tend to give generic responses that are not specific to the given message, and this remains a challenge. To alleviate this tendency, we propose a method to promote message-relevant and diverse responses in neural conversation models by using self-attention, which is time-efficient as well as effective. Furthermore, we investigate why and how self-attention is effective through a detailed comparison with standard dialogue generation. The experimental results show that the proposed method improves on standard dialogue generation across various evaluation metrics. |
Tasks | Dialogue Generation |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08983v1 |
PDF | http://arxiv.org/pdf/1805.08983v1.pdf |
PWC | https://paperswithcode.com/paper/self-attention-based-message-relevant |
Repo | |
Framework | |
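For readers unfamiliar with the building block named above, here is the generic scaled dot-product self-attention operation in numpy. This only grounds the terminology; the paper's specific way of injecting self-attention into the sequence-to-sequence conversation model is not reproduced, and the projection matrices and dimensions are arbitrary.

```python
# Scaled dot-product self-attention over a sequence of hidden states.
import numpy as np

def self_attention(H, Wq, Wk, Wv):
    """H: (T, d) hidden states; returns (T, d_v) context-aware representations."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (T, T) pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over source positions
    return weights @ V

rng = np.random.default_rng(0)
T, d = 6, 16
H = rng.normal(size=(T, d))
out = self_attention(H, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (6, 16)
```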
Nonparametric Stochastic Contextual Bandits
Title | Nonparametric Stochastic Contextual Bandits |
Authors | Melody Y. Guan, Heinrich Jiang |
Abstract | We analyze the $K$-armed bandit problem where the reward for each arm is a noisy realization based on an observed context under mild nonparametric assumptions. We attain tight results for top-arm identification and a sublinear regret of $\widetilde{O}\Big(T^{\frac{1+D}{2+D}}\Big)$, where $D$ is the context dimension, for a modified UCB algorithm that is simple to implement ($k$NN-UCB). We then give global intrinsic dimension dependent and ambient dimension independent regret bounds. We also discuss recovering topological structures within the context space based on expected bandit performance and provide an extension to infinite-armed contextual bandits. Finally, we experimentally show the improvement of our algorithm over existing multi-armed bandit approaches for both simulated tasks and MNIST image classification. |
Tasks | Image Classification, Multi-Armed Bandits |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01750v1 |
PDF | http://arxiv.org/pdf/1801.01750v1.pdf |
PWC | https://paperswithcode.com/paper/nonparametric-stochastic-contextual-bandits |
Repo | |
Framework | |
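The $k$NN-UCB idea mentioned above (estimate each arm's reward locally from its $k$ nearest observed contexts and add an exploration bonus) can be sketched in a few lines of numpy. The particular bonus used here, based on the $k$-NN radius and the arm's sample count, is an assumption for illustration and not the paper's exact rule, and the toy environment is invented.

```python
# Illustrative kNN-UCB arm selection for a contextual bandit.
import numpy as np

def knn_ucb_choose(context, history, n_arms, k=5, beta=1.0):
    """history: list of (context, arm, reward). Returns the arm to pull."""
    scores = []
    for a in range(n_arms):
        pts = [(c, r) for c, arm, r in history if arm == a]
        if len(pts) < k:
            return a                                   # force initial exploration
        dists = np.array([np.linalg.norm(c - context) for c, _ in pts])
        nn = np.argsort(dists)[:k]
        mean = np.mean([pts[i][1] for i in nn])        # local reward estimate
        bonus = beta * (dists[nn].max() + 1.0 / np.sqrt(len(pts)))
        scores.append(mean + bonus)
    return int(np.argmax(scores))

# Toy run: arm 0 pays off when x[0] > 0, arm 1 when x[0] <= 0.
rng = np.random.default_rng(0)
history, total = [], 0.0
for t in range(500):
    x = rng.uniform(-1, 1, size=2)
    a = knn_ucb_choose(x, history, n_arms=2)
    r = float(x[0] > 0) if a == 0 else float(x[0] <= 0)
    history.append((x, a, r)); total += r
print("average reward:", total / 500)
```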
Estimating the Number of Connected Components in a Graph via Subgraph Sampling
Title | Estimating the Number of Connected Components in a Graph via Subgraph Sampling |
Authors | Jason M. Klusowski, Yihong Wu |
Abstract | Learning properties of large graphs from samples has been an important problem in statistical network analysis since the early work of Goodman \cite{Goodman1949} and Frank \cite{Frank1978}. We revisit a problem formulated by Frank \cite{Frank1978} of estimating the number of connected components in a large graph based on the subgraph sampling model, in which we randomly sample a subset of the vertices and observe the induced subgraph. The key question is whether accurate estimation is achievable in the \emph{sublinear} regime where only a vanishing fraction of the vertices are sampled. We show that it is impossible if the parent graph is allowed to contain high-degree vertices or long induced cycles. For the class of chordal graphs, where induced cycles of length four or above are forbidden, we characterize the optimal sample complexity within constant factors and construct linear-time estimators that provably achieve these bounds. This significantly expands the scope of previous results which have focused on unbiased estimators and special classes of graphs such as forests or cliques. Both the construction and the analysis of the proposed methodology rely on combinatorial properties of chordal graphs and identities of induced subgraph counts. They, in turn, also play a key role in proving minimax lower bounds based on construction of random instances of graphs with matching structures of small subgraphs. |
Tasks | |
Published | 2018-01-12 |
URL | https://arxiv.org/abs/1801.04339v3 |
PDF | https://arxiv.org/pdf/1801.04339v3.pdf |
PWC | https://paperswithcode.com/paper/estimating-the-number-of-connected-components |
Repo | |
Framework | |
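To make the subgraph sampling model concrete, the networkx snippet below builds a forest (a special case of the chordal graphs the paper handles), samples a fraction p of the vertices, and observes the induced subgraph. The naive component count on the sample is shown only to illustrate the model; it is biased (components can be split or missed entirely), and the paper's estimators and sample complexity bounds, which correct for this, are not reproduced here.

```python
# Subgraph sampling model: sample vertices, observe the induced subgraph.
import networkx as nx
import random

random.seed(0)
G = nx.disjoint_union_all([nx.path_graph(8) for _ in range(50)])   # forest with 50 components
p = 0.3
sampled = [v for v in G.nodes if random.random() < p]
observed = G.subgraph(sampled)

print("true components:", nx.number_connected_components(G))
print("components in induced subgraph:", nx.number_connected_components(observed))
```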
A Review on Facial Micro-Expressions Analysis: Datasets, Features and Metrics
Title | A Review on Facial Micro-Expressions Analysis: Datasets, Features and Metrics |
Authors | Walied Merghani, Adrian K. Davison, Moi Hoon Yap |
Abstract | Facial micro-expressions are very brief, spontaneous facial expressions that appear on the face of humans when they either deliberately or unconsciously conceal an emotion. A micro-expression has a shorter duration than a macro-expression, which makes it more challenging for both humans and machines to recognise. Over the past ten years, automatic micro-expression recognition has attracted increasing attention from researchers in psychology, computer science, security, neuroscience and other related disciplines. The aim of this paper is to provide insights into automatic micro-expression analysis and recommendations for future research. Many datasets have been released over the last decade, facilitating rapid growth in this field. However, comparison across different datasets is difficult due to inconsistency in experiment protocols, features used and evaluation methods. To address these issues, we review the datasets, features and performance metrics deployed in the literature. Relevant challenges such as the spatial and temporal settings during data collection, emotional classes versus objective classes in data labelling, face regions in data analysis, standardisation of metrics and the requirements for real-world implementation are discussed. We conclude by proposing some promising future directions to advance micro-expression research. |
Tasks | |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02397v1 |
PDF | http://arxiv.org/pdf/1805.02397v1.pdf |
PWC | https://paperswithcode.com/paper/a-review-on-facial-micro-expressions-analysis |
Repo | |
Framework | |
Collaborative Metric Learning Recommendation System: Application to Theatrical Movie Releases
Title | Collaborative Metric Learning Recommendation System: Application to Theatrical Movie Releases |
Authors | Miguel Campo, JJ Espinoza, Julie Rieger, Abhinav Taliyan |
Abstract | Product recommendation systems are important for major movie studios during the movie greenlight process and as part of machine learning personalization pipelines. Collaborative Filtering (CF) models have proved to be effective at powering recommender systems for online streaming services with explicit customer feedback data. CF models do not perform well in scenarios in which feedback data is not available, in cold start situations like new product launches, and in situations with markedly different customer tiers (e.g., high frequency customers vs. casual customers). Generative natural language models that create useful theme-based representations of an underlying corpus of documents can be used to represent new product descriptions, like new movie plots. When combined with CF, they have been shown to increase performance in cold start situations. Outside of those cases in which explicit customer feedback is available, though, recommender engines must rely on binary purchase data, which materially degrades performance. Fortunately, purchase data can be combined with product descriptions to generate meaningful representations of products and customer trajectories in a convenient product space in which proximity represents similarity. Learning to measure the distance between points in this space can be accomplished with a deep neural network that trains on customer histories and on dense vectorizations of product descriptions. We developed a system based on Collaborative (Deep) Metric Learning (CML) to predict the purchase probabilities of new theatrical releases. We trained and evaluated the model using a large dataset of customer histories, and tested the model for a set of movies that were released outside of the training window. Initial experiments show gains relative to models that do not train on collaborative preferences. |
Tasks | Metric Learning, Product Recommendation, Recommendation Systems |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00202v1 |
PDF | http://arxiv.org/pdf/1803.00202v1.pdf |
PWC | https://paperswithcode.com/paper/collaborative-metric-learning-recommendation |
Repo | |
Framework | |
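The "learning to measure distance" step described above is the standard collaborative metric learning recipe: embed users and items in one space and use a hinge loss to pull purchased items closer to the user than sampled negatives. The PyTorch sketch below shows only that generic loss; the embedding sizes and margin are placeholders, and the paper's additional content branch over dense movie-description vectors is omitted.

```python
# Compact collaborative-metric-learning loss with user/item embeddings.
import torch
import torch.nn as nn

class CML(nn.Module):
    def __init__(self, n_users, n_items, dim=32, margin=0.5):
        super().__init__()
        self.users = nn.Embedding(n_users, dim)
        self.items = nn.Embedding(n_items, dim)
        self.margin = margin

    def forward(self, u, pos, neg):
        du = self.users(u)
        d_pos = ((du - self.items(pos)) ** 2).sum(-1)   # squared distance to purchase
        d_neg = ((du - self.items(neg)) ** 2).sum(-1)   # squared distance to negative
        return torch.relu(self.margin + d_pos - d_neg).mean()

model = CML(n_users=1000, n_items=500)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
u, pos, neg = (torch.randint(0, n, (64,)) for n in (1000, 500, 500))
loss = model(u, pos, neg)
loss.backward(); opt.step()
print(float(loss))
```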
On-chip learning for domain wall synapse based Fully Connected Neural Network
Title | On-chip learning for domain wall synapse based Fully Connected Neural Network |
Authors | Apoorv Dankar, Anand Verma, Utkarsh Saxena, Divya Kaushik, Shouri Chatterjee, Debanjan Bhowmik |
Abstract | Spintronic devices are considered as promising candidates in implementing neuromorphic systems or hardware neural networks, which are expected to perform better than other existing computing systems for certain data classification and regression tasks. In this paper, we have designed a feedforward Fully Connected Neural Network (FCNN) with no hidden layer using spin orbit torque driven domain wall devices as synapses and transistor based analog circuits as neurons. A feedback circuit is also designed using transistors, which at every iteration computes the change in weights of the synapses needed to train the network using Stochastic Gradient Descent (SGD) method. Subsequently it sends write current pulses to the domain wall based synaptic devices which move the domain walls and updates the weights of the synapses. Through a combination of micromagnetic simulations, analog circuit simulations and numerically solving FCNN training equations, we demonstrate “on-chip” training of the designed FCNN on the MNIST database of handwritten digits in this paper. We report the training and test accuracies, energy consumed in the synaptic devices for the training and possible issues with hardware implementation of FCNN that can limit its test accuracy. |
Tasks | |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.09966v1 |
PDF | http://arxiv.org/pdf/1811.09966v1.pdf |
PWC | https://paperswithcode.com/paper/on-chip-learning-for-domain-wall-synapse |
Repo | |
Framework | |
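As a software analogue of the network the paper maps to hardware, the numpy sketch below trains a fully connected layer with no hidden layer by stochastic gradient descent; in the paper, each weight update would be realised as a write-current pulse that moves a domain wall in the corresponding synaptic device. The toy data here merely stand in for MNIST, and the learning rate and update rule are generic assumptions, not the circuit-level behaviour reported by the authors.

```python
# Single-layer (no hidden layer) softmax classifier trained with per-sample SGD.
import numpy as np

rng = np.random.default_rng(0)
W = np.zeros((784, 10))                       # synaptic weight matrix (domain-wall positions)
b = np.zeros(10)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sgd_step(x, y_onehot, lr=0.1):
    global W, b
    p = softmax(x @ W + b)
    grad = np.outer(x, p - y_onehot)          # cross-entropy gradient for one sample
    W -= lr * grad                            # hardware analogue: write pulses adjust weights
    b -= lr * (p - y_onehot)

# Toy data standing in for flattened 28x28 MNIST images.
for _ in range(100):
    label = rng.integers(10)
    x = rng.normal(loc=label / 10.0, scale=0.1, size=784)
    y = np.eye(10)[label]
    sgd_step(x, y)
print("prediction for a 'label 3'-like input:",
      np.argmax(rng.normal(loc=0.3, scale=0.1, size=784) @ W + b))
```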
A Comparative Study on using Principle Component Analysis with Different Text Classifiers
Title | A Comparative Study on using Principle Component Analysis with Different Text Classifiers |
Authors | Ahmed I. Taloba, D. A. Eisa, Safaa S. I. Ismail |
Abstract | Text categorization (TC) is the task of automatically organizing a set of documents into a set of pre-defined categories. Over the last few years, increased attention has been paid to documents in digital form, and this makes text categorization a challenging issue. The most significant problem in text categorization is the huge number of features. Most of these features are redundant, noisy or irrelevant, causing overfitting with most classifiers. Hence, feature extraction is an important step in improving the overall accuracy and performance of text classifiers. In this paper, we provide an overview of using principal component analysis (PCA) for feature extraction with various classifiers. It was observed that the performance of the classifiers improved after using PCA to reduce the dimensionality of the data. Experiments are conducted on three UCI data sets: Classic03, CNAE-9 and DBWorld e-mails. We compare the classification performance results of using PCA with popular and well-known text classifiers. Results show that using PCA encouragingly enhances classification performance for most of the classifiers. |
Tasks | Text Categorization |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.03283v1 |
PDF | http://arxiv.org/pdf/1807.03283v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparative-study-on-using-principle |
Repo | |
Framework | |
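The experimental setup described above (document-term features, a PCA projection, and several classifiers compared with and without it) can be sketched with scikit-learn. This is only an illustration of the workflow: the 20 Newsgroups subset (downloaded on first use), the feature cap and the number of components are placeholders, not the Classic03 / CNAE-9 / DBWorld corpora or settings used in the paper.

```python
# TF-IDF features -> PCA projection -> classifier comparison.
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X = TfidfVectorizer(max_features=2000).fit_transform(data.data).toarray()
X_pca = PCA(n_components=50).fit_transform(X)      # feature extraction step

for name, features in [("raw term features", X), ("PCA features", X_pca)]:
    for clf in (LinearSVC(), GaussianNB()):
        score = cross_val_score(clf, features, data.target, cv=3).mean()
        print(f"{clf.__class__.__name__:10s} on {name}: {score:.3f}")
```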
A Visual Quality Index for Fuzzy C-Means
Title | A Visual Quality Index for Fuzzy C-Means |
Authors | Aybüke Öztürk, Stéphane Lallich, Jérôme Darmont |
Abstract | Cluster analysis is widely used in the areas of machine learning and data mining. Fuzzy clustering is a particular method that considers that a data point can belong to more than one cluster. Fuzzy clustering helps obtain flexible clusters, as needed in such applications as text categorization. The performance of a clustering algorithm critically depends on the number of clusters, and estimating the optimal number of clusters is a challenging task. Quality indices help estimate the optimal number of clusters. However, there is no quality index that can obtain an accurate number of clusters for different datasets. Hence, in this paper, we propose a new cluster quality index associated with a visual, graph-based solution that helps choose the optimal number of clusters in fuzzy partitions. Moreover, we validate our theoretical results through extensive comparison experiments against state-of-the-art quality indices on a variety of numerical real-world and artificial datasets. |
Tasks | Text Categorization |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01552v1 |
PDF | http://arxiv.org/pdf/1806.01552v1.pdf |
PWC | https://paperswithcode.com/paper/a-visual-quality-index-for-fuzzy-c-means |
Repo | |
Framework | |
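For context on the quality-index problem described above, the numpy sketch below runs plain fuzzy c-means over a range of cluster counts and scores each with the classic fuzzy partition coefficient (FPC). This is a generic illustration only: the visual, graph-based index proposed in the paper is not reproduced, and the toy data are invented.

```python
# Plain fuzzy c-means with the fuzzy partition coefficient as a simple quality index.
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c)); U /= U.sum(axis=1, keepdims=True)   # memberships
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in ([0, 0], [3, 0], [0, 3])])

for c in range(2, 6):
    U, _ = fuzzy_cmeans(X, c)
    fpc = (U ** 2).sum() / len(X)          # closer to 1 means a crisper partition
    print(f"c={c}: FPC={fpc:.3f}")
```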
A Context-aware Capsule Network for Multi-label Classification
Title | A Context-aware Capsule Network for Multi-label Classification |
Authors | Sameera Ramasinghe, C. D. Athuraliya, Salman Khan |
Abstract | The recently proposed Capsule Network is a brain-inspired architecture that brings a new paradigm to deep learning by modelling input domain variations through vector-based representations. Despite being a seminal contribution, CapsNet does not explicitly model structured relationships between the detected entities or among the capsule features for related inputs. Motivated by the working of cortical networks in the human visual system, we seek to resolve CapsNet limitations by proposing several intuitive modifications to the CapsNet architecture. We introduce (1) a novel routing weight initialization technique, (2) an improved CapsNet design that exploits semantic relationships between the primary capsule activations using a densely connected Conditional Random Field, and (3) a Cholesky transformation based correlation module to learn a general priority scheme. Our proposed design allows CapsNet to scale better to more complex problems, such as the multi-label classification task, where semantically related categories co-exist with various interdependencies. We present theoretical bases for our extensions and demonstrate significant improvements on the ADE20K scene dataset. |
Tasks | Multi-Label Classification |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06231v2 |
PDF | http://arxiv.org/pdf/1810.06231v2.pdf |
PWC | https://paperswithcode.com/paper/a-context-aware-capsule-network-for-multi |
Repo | |
Framework | |
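For readers unfamiliar with the machinery this paper extends, the numpy sketch below shows the standard CapsNet "squash" non-linearity and one dynamic-routing pass with the usual uniform initialisation of the routing logits. It only grounds the terminology: the paper's contributions (the routing weight initialisation technique, the CRF over primary capsules and the Cholesky-based correlation module) sit on top of this and are not shown, and the capsule dimensions are arbitrary.

```python
# Squash non-linearity and routing-by-agreement between capsule layers.
import numpy as np

def squash(v, axis=-1, eps=1e-9):
    n2 = (v ** 2).sum(axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * v / np.sqrt(n2 + eps)

def routing(u_hat, iterations=3):
    """u_hat: (n_in, n_out, d) prediction vectors from lower to higher capsules."""
    b = np.zeros(u_hat.shape[:2])                              # routing logits (uniform init)
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                 # weighted sum per output capsule
        v = squash(s)
        b = b + (u_hat * v[None]).sum(-1)                      # agreement update
    return v

v = routing(np.random.default_rng(0).normal(size=(32, 10, 16)))
print(v.shape)  # (10, 16)
```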