Paper Group ANR 260
Contrastive Explanation: A Structural-Model Approach. DeepDrum: An Adaptive Conditional Neural Network. ICADx: Interpretable computer aided diagnosis of breast masses. Learning Tensor Latent Features. Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction. Deep Learning for Digital Text Analytics: Sentiment …
Contrastive Explanation: A Structural-Model Approach
Title | Contrastive Explanation: A Structural-Model Approach |
Authors | Tim Miller |
Abstract | The topic of causal explanation in artificial intelligence has gathered interest in recent years as researchers and practitioners aim to increase trust and understanding of intelligent decision-making and action. While different sub-fields have looked into this problem with a sub-field-specific view, there are few models that aim to capture explanation in AI more generally. One general model is based on structural causal models. It defines an explanation as a fact that, if found to be true, would constitute an actual cause of a specific event. However, research in philosophy and social sciences shows that explanations are contrastive: that is, when people ask for an explanation of an event – the fact – they (sometimes implicitly) are asking for an explanation relative to some contrast case; that is, “Why P rather than Q?”. In this paper, we extend the structural causal model approach to define two complementary notions of contrastive explanation, and demonstrate them on two classical AI problems: classification and planning. We believe that this model can be used to define contrastive explanation of other subfield-specific AI models. |
Tasks | Decision Making |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03163v1 |
PDF | http://arxiv.org/pdf/1811.03163v1.pdf |
PWC | https://paperswithcode.com/paper/contrastive-explanation-a-structural-model |
Repo | |
Framework | |
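The structural-model view above can be made concrete with a toy example. The Python sketch below is written for this listing, not taken from the paper: it uses a made-up loan-approval model, encodes its structural equations directly, and searches for a single-variable intervention that turns the fact P ("denied") into the foil Q ("approved"), which is the flavour of contrastive question the paper formalises.

```python
# A minimal sketch (not the paper's formal definitions): a toy structural causal
# model where we look for an intervention that flips the outcome from the fact P
# ("loan denied") to the foil Q ("loan approved"), illustrating "Why P rather than Q?".

def simulate(exogenous, interventions=None):
    """Evaluate the structural equations, honouring any do()-style interventions."""
    interventions = interventions or {}
    v = dict(exogenous)            # exogenous variables: income, debt
    v.update(interventions)

    def value(name, default_fn):
        return interventions[name] if name in interventions else default_fn()

    v["high_risk"] = value("high_risk", lambda: v["debt"] > 0.5 * v["income"])
    v["approved"] = value("approved", lambda: v["income"] > 30000 and not v["high_risk"])
    return v

actual = simulate({"income": 40000, "debt": 25000})
print("fact P:", "denied" if not actual["approved"] else "approved")

# Contrastive question: why denied (P) rather than approved (Q)?
# Search single-variable interventions that realise the foil Q.
for var, alt in [("debt", 10000), ("income", 60000), ("high_risk", False)]:
    counterfactual = simulate({"income": 40000, "debt": 25000}, {var: alt})
    if counterfactual["approved"]:
        print(f"setting {var}={alt} yields the foil Q -> cites {var} as a contrastive cause")
```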
DeepDrum: An Adaptive Conditional Neural Network
Title | DeepDrum: An Adaptive Conditional Neural Network |
Authors | Dimos Makris, Maximos Kaliakatsos-Papakostas, Katia Lida Kermanidis |
Abstract | Considering music as a sequence of events with multiple complex dependencies, the Long Short-Term Memory (LSTM) architecture has proven very efficient in learning and reproducing musical styles. However, the generation of rhythms requires additional information regarding musical structure and accompanying instruments. In this paper we present DeepDrum, an adaptive Neural Network capable of generating drum rhythms under constraints imposed by Feed-Forward (Conditional) Layers which contain musical parameters along with given instrumentation information (e.g. bass and guitar notes). Results on generated drum sequences are presented indicating that DeepDrum is effective in producing rhythms that resemble the learned style, while at the same time conforming to given constraints that were unknown during the training process. |
Tasks | |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06127v2 |
PDF | http://arxiv.org/pdf/1809.06127v2.pdf |
PWC | https://paperswithcode.com/paper/deepdrum-an-adaptive-conditional-neural |
Repo | |
Framework | |
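As a rough illustration of the conditional architecture described above, the PyTorch sketch below feeds an LSTM with drum-event embeddings that are concatenated at every step with the output of feed-forward (conditional) layers applied to structure/instrumentation features. All layer sizes, the vocabulary size and the conditioning dimension are assumptions made for this sketch, not the authors' configuration.

```python
# Illustrative-only sketch of a conditional recurrent drum generator in the
# spirit of DeepDrum; sizes and names are assumptions, not the paper's model.
import torch
import torch.nn as nn

class ConditionalDrumLSTM(nn.Module):
    def __init__(self, n_drum_tokens=128, cond_dim=32, emb_dim=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(n_drum_tokens, emb_dim)
        self.condition = nn.Sequential(          # feed-forward (conditional) layers
            nn.Linear(cond_dim, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU()
        )
        self.lstm = nn.LSTM(emb_dim + 64, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_drum_tokens)

    def forward(self, drum_tokens, cond_features):
        # drum_tokens: (batch, time) ints; cond_features: (batch, time, cond_dim)
        x = torch.cat([self.embed(drum_tokens), self.condition(cond_features)], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out)                    # next-event logits per step

model = ConditionalDrumLSTM()
logits = model(torch.randint(0, 128, (4, 16)), torch.randn(4, 16, 32))
print(logits.shape)  # torch.Size([4, 16, 128])
```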
ICADx: Interpretable computer aided diagnosis of breast masses
Title | ICADx: Interpretable computer aided diagnosis of breast masses |
Authors | Seong Tae Kim, Hakmin Lee, Hak Gu Kim, Yong Man Ro |
Abstract | In this study, a novel computer aided diagnosis (CADx) framework is devised to investigate interpretability in classifying breast masses. Recently, deep learning technology has been successfully applied to medical image analysis, including CADx. Existing deep learning based CADx approaches, however, have a limitation in explaining the diagnostic decision. In real clinical practice, clinical decisions should be made with reasonable explanation, so current deep learning approaches in CADx are limited for real-world deployment. In this paper, we investigate interpretability in CADx with the proposed interpretable CADx (ICADx) framework. The proposed framework is devised as a generative adversarial network, which consists of an interpretable diagnosis network and a synthetic lesion generative network that learn the relationship between malignancy and a standardized description (BI-RADS). The lesion generative network and the interpretable diagnosis network compete through adversarial learning so that both networks improve. The effectiveness of the proposed method was validated on a public mammogram database. Experimental results showed that the proposed ICADx framework could provide interpretability of masses as well as mass classification. This was mainly attributed to the fact that the proposed method was effectively trained to find the relationship between malignancy and interpretations via adversarial learning. These results imply that the proposed ICADx framework could be a promising approach to developing CADx systems. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08960v1 |
PDF | http://arxiv.org/pdf/1805.08960v1.pdf |
PWC | https://paperswithcode.com/paper/icadx-interpretable-computer-aided-diagnosis |
Repo | |
Framework | |
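The two-network structure described above can be sketched in PyTorch as follows. This is a rough skeleton, not the authors' implementation: the layer sizes, the number of BI-RADS descriptor outputs and the patch size are placeholders, and the adversarial losses are only summarised in comments.

```python
# Rough skeleton: an interpretable diagnosis network predicting malignancy plus
# BI-RADS-style descriptors, and a generator mapping those codes back to a lesion
# patch; in the paper the two are trained adversarially. Sizes are assumptions.
import torch
import torch.nn as nn

class DiagnosisNet(nn.Module):          # "interpretable diagnosis network"
    def __init__(self, n_birads=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.malignancy = nn.Linear(32, 1)      # benign vs. malignant logit
        self.birads = nn.Linear(32, n_birads)   # standardized-description logits

    def forward(self, x):
        h = self.features(x)
        return self.malignancy(h), self.birads(h)

class LesionGenerator(nn.Module):       # descriptors -> synthetic lesion patch
    def __init__(self, n_birads=12, patch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_birads + 1, 128), nn.ReLU(),
            nn.Linear(128, patch * patch), nn.Tanh())
        self.patch = patch

    def forward(self, malignancy, birads):
        code = torch.cat([malignancy, birads], dim=-1)
        return self.net(code).view(-1, 1, self.patch, self.patch)

# Shape check: the generator synthesises lesions from (malignancy, BI-RADS) codes
# that the diagnosis network should not be able to tell apart from real masses.
x = torch.randn(2, 1, 32, 32)
mal, birads = DiagnosisNet()(x)
patch = LesionGenerator()(torch.sigmoid(mal), torch.sigmoid(birads))
print(mal.shape, birads.shape, patch.shape)   # (2, 1) (2, 12) (2, 1, 32, 32)
```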
Learning Tensor Latent Features
Title | Learning Tensor Latent Features |
Authors | Sung-En Chang, Xun Zheng, Ian E. H. Yen, Pradeep Ravikumar, Rose Yu |
Abstract | We study the problem of learning latent feature models (LFMs) for tensor data commonly observed in science and engineering, such as hyperspectral imagery. The problem is challenging not only due to the non-convex formulation and the combinatorial nature of the constraints in LFMs, but also due to the high-order correlations in the data. In this work, we formulate a tensor latent feature learning problem by representing the data as a mixture of high-order latent features and binary codes, which are memory efficient and easy to interpret. To make the learning tractable, we propose a novel optimization procedure, Binary Matching Pursuit (BMP), that iteratively searches for binary bases via a MAXCUT-like boolean quadratic solver. Such a procedure is guaranteed to achieve an $\epsilon$-suboptimal solution in $O(1/\epsilon)$ greedy steps, resulting in a trade-off between accuracy and sparsity. When evaluated on both synthetic and real datasets, our experiments show superior performance over baseline methods. |
Tasks | |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04754v1 |
PDF | http://arxiv.org/pdf/1810.04754v1.pdf |
PWC | https://paperswithcode.com/paper/learning-tensor-latent-features |
Repo | |
Framework | |
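To convey the greedy "one binary basis per step" idea in the abstract, here is a heavily simplified numpy sketch. It is not the paper's BMP algorithm: the MAXCUT-like boolean quadratic step is replaced by a crude sign-thresholded leading-direction heuristic, and the data are a matrix rather than a higher-order tensor.

```python
# Simplified greedy binary factorization X ~ Z @ W with binary codes Z and real
# latent features W; each step adds one binary basis chosen by a crude heuristic
# standing in for the boolean quadratic solver used by BMP.
import numpy as np

def greedy_binary_factorization(X, n_features=5):
    N, D = X.shape
    Z = np.zeros((N, 0))
    W = np.zeros((0, D))
    for _ in range(n_features):
        R = X - Z @ W                                  # current residual
        u = np.linalg.svd(R, full_matrices=False)[0][:, 0]
        best = None
        for z in ((u > 0).astype(float), (u <= 0).astype(float)):
            if z.sum() == 0:
                continue
            w = z @ R / z.sum()                        # least-squares feature for this code
            err = np.linalg.norm(R - np.outer(z, w))
            if best is None or err < best[0]:
                best = (err, z, w)
        Z = np.column_stack([Z, best[1]])
        W = np.vstack([W, best[2]])
    return Z, W

rng = np.random.default_rng(0)
X = rng.integers(0, 2, (50, 3)) @ rng.normal(size=(3, 20))   # planted binary structure
Z, W = greedy_binary_factorization(X, n_features=3)
print(np.linalg.norm(X - Z @ W) / np.linalg.norm(X))         # relative reconstruction error
```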
Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction
Title | Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction |
Authors | Artem Sokolov, Julian Hitschler, Stefan Riezler |
Abstract | Stochastic zeroth-order (SZO), or gradient-free, optimization makes it possible to optimize arbitrary functions by relying only on function evaluations under parameter perturbations; however, the iteration complexity of SZO methods suffers a factor proportional to the dimensionality of the perturbed function. We show that in scenarios with natural sparsity patterns, as in structured prediction applications, this factor can be reduced to the expected number of active features over input-output pairs. We give a general proof that applies sparse SZO optimization to Lipschitz-continuous, nonconvex, stochastic objectives, and present an experimental evaluation on linear bandit structured prediction tasks with sparse word-based feature representations that confirms our theoretical results. |
Tasks | Structured Prediction |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04458v2 |
PDF | http://arxiv.org/pdf/1806.04458v2.pdf |
PWC | https://paperswithcode.com/paper/sparse-stochastic-zeroth-order-optimization |
Repo | |
Framework | |
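The core mechanism behind the dimensionality reduction claimed above is that only the coordinates active for the current example need to be perturbed. A minimal numpy illustration of a two-point zeroth-order gradient estimate restricted to the active coordinates follows; the objective, step sizes and the active set are toy assumptions, not the paper's bandit structured prediction setup.

```python
# Two-point SZO step perturbing only the "active" (sparse) coordinates.
import numpy as np

def sparse_szo_step(w, f, active_idx, mu=1e-2, lr=0.1, rng=np.random.default_rng(0)):
    u = np.zeros_like(w)
    u[active_idx] = rng.standard_normal(len(active_idx))   # perturb active features only
    g_scalar = (f(w + mu * u) - f(w - mu * u)) / (2 * mu)   # directional finite difference
    return w - lr * g_scalar * u                            # gradient estimate is g_scalar * u

# Toy objective where only the first 5 of 1000 features matter.
w = np.zeros(1000)
target = np.zeros(1000); target[:5] = 1.0
f = lambda v: np.sum((v[:5] - target[:5]) ** 2)
for _ in range(200):
    w = sparse_szo_step(w, f, active_idx=np.arange(5), lr=0.05)
print(round(f(w), 4))
```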
Deep Learning for Digital Text Analytics: Sentiment Analysis
Title | Deep Learning for Digital Text Analytics: Sentiment Analysis |
Authors | Reshma U, Barathi Ganesh H B, Mandar Kale, Prachi Mankame, Gouri Kulkarni |
Abstract | In today's scenario, imagining a world without negativity is unrealistic, as bad news spreads more virally than good news. Though it may seem impractical in real life, a system can be built using Machine Learning and Natural Language Processing techniques to identify news items with a negative shade and filter them out, delivering only news with a positive shade (good news) to the end user. In this work, around two lakh (200,000) news items were trained and tested using a combination of rule-based and data-driven approaches. VADER, together with a filtration method, was used as an annotation tool, followed by a statistical Machine Learning approach that used a Document Term Matrix (representation) and a Support Vector Machine (classification). Deep Learning methods were then introduced to make the system more reliable (Doc2Vec), culminating in a Convolutional Neural Network (CNN) that yielded better results than the other experimented modules: a training accuracy of 96% and a test accuracy (on internal and external news data) above 85%. |
Tasks | Sentiment Analysis |
Published | 2018-04-10 |
URL | http://arxiv.org/abs/1804.03673v1 |
PDF | http://arxiv.org/pdf/1804.03673v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-digital-text-analytics |
Repo | |
Framework | |
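The first two stages of the pipeline described above (VADER as a weak annotator, then a document-term matrix with an SVM) can be approximated with standard libraries. The sketch below is a simplified stand-in, not the paper's system: the headlines are invented, the paper's filtration rules are omitted, and the Doc2Vec and CNN stages are left out. It assumes `nltk.download('vader_lexicon')` has been run once.

```python
# Weak labels from VADER, then a document-term matrix + linear SVM.
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

headlines = [
    "Great news as rescue team saves stranded hikers",
    "Wonderful community effort rebuilds local school",
    "Horrible crash kills dozens on the highway",
    "Terrible storm destroys homes across the coast",
]

# Rule-based annotation: VADER compound score as a weak positive/negative label.
vader = SentimentIntensityAnalyzer()
labels = [1 if vader.polarity_scores(h)["compound"] >= 0 else 0 for h in headlines]

# Data-driven stage: document-term matrix + linear SVM trained on the weak labels.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(headlines)
clf = LinearSVC().fit(X, labels)

print(clf.predict(vectorizer.transform(["Good news: volunteers plant thousands of trees"])))
```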
Self-Attention-Based Message-Relevant Response Generation for Neural Conversation Model
Title | Self-Attention-Based Message-Relevant Response Generation for Neural Conversation Model |
Authors | Jonggu Kim, Doyeon Kong, Jong-Hyeok Lee |
Abstract | Using a sequence-to-sequence framework, many neural conversation models for chit-chat produce responses that sound natural. Nevertheless, these models tend to give generic responses that are not specific to the given message, and this remains a challenge. To alleviate this tendency, we propose a method to promote message-relevant and diverse responses in neural conversation models by using self-attention, which is time-efficient as well as effective. Furthermore, we investigate why and how self-attention is effective through a detailed comparison with standard dialogue generation. The experimental results show that the proposed method improves on standard dialogue generation across various evaluation metrics. |
Tasks | Dialogue Generation |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08983v1 |
PDF | http://arxiv.org/pdf/1805.08983v1.pdf |
PWC | https://paperswithcode.com/paper/self-attention-based-message-relevant |
Repo | |
Framework | |
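For readers unfamiliar with the building block named above, here is the generic scaled dot-product self-attention operation in numpy. This only grounds the terminology; the paper's specific way of injecting self-attention into the sequence-to-sequence conversation model is not reproduced, and the projection matrices and dimensions are arbitrary.

```python
# Scaled dot-product self-attention over a sequence of hidden states.
import numpy as np

def self_attention(H, Wq, Wk, Wv):
    """H: (T, d) hidden states; returns (T, d_v) context-aware representations."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (T, T) pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over source positions
    return weights @ V

rng = np.random.default_rng(0)
T, d = 6, 16
H = rng.normal(size=(T, d))
out = self_attention(H, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (6, 16)
```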
Nonparametric Stochastic Contextual Bandits
Title | Nonparametric Stochastic Contextual Bandits |
Authors | Melody Y. Guan, Heinrich Jiang |
Abstract | We analyze the $K$-armed bandit problem where the reward for each arm is a noisy realization based on an observed context under mild nonparametric assumptions. We attain tight results for top-arm identification and a sublinear regret of $\widetilde{O}\Big(T^{\frac{1+D}{2+D}}\Big)$, where $D$ is the context dimension, for a modified UCB algorithm that is simple to implement ($k$NN-UCB). We then give global intrinsic dimension dependent and ambient dimension independent regret bounds. We also discuss recovering topological structures within the context space based on expected bandit performance and provide an extension to infinite-armed contextual bandits. Finally, we experimentally show the improvement of our algorithm over existing multi-armed bandit approaches for both simulated tasks and MNIST image classification. |
Tasks | Image Classification, Multi-Armed Bandits |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01750v1 |
PDF | http://arxiv.org/pdf/1801.01750v1.pdf |
PWC | https://paperswithcode.com/paper/nonparametric-stochastic-contextual-bandits |
Repo | |
Framework | |
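The $k$NN-UCB idea mentioned above (estimate each arm's reward locally from its $k$ nearest observed contexts and add an exploration bonus) can be sketched in a few lines of numpy. The particular bonus used here, based on the $k$-NN radius and the arm's sample count, is an assumption for illustration and not the paper's exact rule, and the toy environment is invented.

```python
# Illustrative kNN-UCB arm selection for a contextual bandit.
import numpy as np

def knn_ucb_choose(context, history, n_arms, k=5, beta=1.0):
    """history: list of (context, arm, reward). Returns the arm to pull."""
    scores = []
    for a in range(n_arms):
        pts = [(c, r) for c, arm, r in history if arm == a]
        if len(pts) < k:
            return a                                   # force initial exploration
        dists = np.array([np.linalg.norm(c - context) for c, _ in pts])
        nn = np.argsort(dists)[:k]
        mean = np.mean([pts[i][1] for i in nn])        # local reward estimate
        bonus = beta * (dists[nn].max() + 1.0 / np.sqrt(len(pts)))
        scores.append(mean + bonus)
    return int(np.argmax(scores))

# Toy run: arm 0 pays off when x[0] > 0, arm 1 when x[0] <= 0.
rng = np.random.default_rng(0)
history, total = [], 0.0
for t in range(500):
    x = rng.uniform(-1, 1, size=2)
    a = knn_ucb_choose(x, history, n_arms=2)
    r = float(x[0] > 0) if a == 0 else float(x[0] <= 0)
    history.append((x, a, r)); total += r
print("average reward:", total / 500)
```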
Estimating the Number of Connected Components in a Graph via Subgraph Sampling
Title | Estimating the Number of Connected Components in a Graph via Subgraph Sampling |
Authors | Jason M. Klusowski, Yihong Wu |
Abstract | Learning properties of large graphs from samples has been an important problem in statistical network analysis since the early work of Goodman \cite{Goodman1949} and Frank \cite{Frank1978}. We revisit a problem formulated by Frank \cite{Frank1978} of estimating the number of connected components in a large graph based on the subgraph sampling model, in which we randomly sample a subset of the vertices and observe the induced subgraph. The key question is whether accurate estimation is achievable in the \emph{sublinear} regime where only a vanishing fraction of the vertices are sampled. We show that it is impossible if the parent graph is allowed to contain high-degree vertices or long induced cycles. For the class of chordal graphs, where induced cycles of length four or above are forbidden, we characterize the optimal sample complexity within constant factors and construct linear-time estimators that provably achieve these bounds. This significantly expands the scope of previous results which have focused on unbiased estimators and special classes of graphs such as forests or cliques. Both the construction and the analysis of the proposed methodology rely on combinatorial properties of chordal graphs and identities of induced subgraph counts. They, in turn, also play a key role in proving minimax lower bounds based on construction of random instances of graphs with matching structures of small subgraphs. |
Tasks | |
Published | 2018-01-12 |
URL | https://arxiv.org/abs/1801.04339v3 |
PDF | https://arxiv.org/pdf/1801.04339v3.pdf |
PWC | https://paperswithcode.com/paper/estimating-the-number-of-connected-components |
Repo | |
Framework | |
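To make the subgraph sampling model concrete, the networkx snippet below builds a forest (a special case of the chordal graphs the paper handles), samples a fraction p of the vertices, and observes the induced subgraph. The naive component count on the sample is shown only to illustrate the model; it is biased (components can be split or missed entirely), and the paper's estimators and sample complexity bounds, which correct for this, are not reproduced here.

```python
# Subgraph sampling model: sample vertices, observe the induced subgraph.
import networkx as nx
import random

random.seed(0)
G = nx.disjoint_union_all([nx.path_graph(8) for _ in range(50)])   # forest with 50 components
p = 0.3
sampled = [v for v in G.nodes if random.random() < p]
observed = G.subgraph(sampled)

print("true components:", nx.number_connected_components(G))
print("components in induced subgraph:", nx.number_connected_components(observed))
```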
A Review on Facial Micro-Expressions Analysis: Datasets, Features and Metrics
Title | A Review on Facial Micro-Expressions Analysis: Datasets, Features and Metrics |
Authors | Walied Merghani, Adrian K. Davison, Moi Hoon Yap |
Abstract | Facial micro-expressions are very brief, spontaneous facial expressions that appear on the face of humans when they either deliberately or unconsciously conceal an emotion. A micro-expression has a shorter duration than a macro-expression, which makes it more challenging for both humans and machines to recognise. Over the past ten years, automatic micro-expression recognition has attracted increasing attention from researchers in psychology, computer science, security, neuroscience and other related disciplines. The aim of this paper is to provide insights into automatic micro-expression analysis and recommendations for future research. Many datasets have been released over the last decade, facilitating rapid growth in this field. However, comparison across different datasets is difficult due to inconsistency in experiment protocols, features used and evaluation methods. To address these issues, we review the datasets, features and performance metrics deployed in the literature. Relevant challenges such as the spatial and temporal settings during data collection, emotional classes versus objective classes in data labelling, face regions in data analysis, standardisation of metrics and the requirements for real-world implementation are discussed. We conclude by proposing some promising future directions to advance micro-expression research. |
Tasks | |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02397v1 |
PDF | http://arxiv.org/pdf/1805.02397v1.pdf |
PWC | https://paperswithcode.com/paper/a-review-on-facial-micro-expressions-analysis |
Repo | |
Framework | |
Collaborative Metric Learning Recommendation System: Application to Theatrical Movie Releases
Title | Collaborative Metric Learning Recommendation System: Application to Theatrical Movie Releases |
Authors | Miguel Campo, JJ Espinoza, Julie Rieger, Abhinav Taliyan |
Abstract | Product recommendation systems are important for major movie studios during the movie greenlight process and as part of machine learning personalization pipelines. Collaborative Filtering (CF) models have proved to be effective at powering recommender systems for online streaming services with explicit customer feedback data. CF models do not perform well in scenarios in which feedback data is not available, in cold start situations like new product launches, and in situations with markedly different customer tiers (e.g., high frequency customers vs. casual customers). Generative natural language models that create useful theme-based representations of an underlying corpus of documents can be used to represent new product descriptions, like new movie plots. When combined with CF, they have been shown to increase performance in cold start situations. Outside of those cases in which explicit customer feedback is available, though, recommender engines must rely on binary purchase data, which materially degrades performance. Fortunately, purchase data can be combined with product descriptions to generate meaningful representations of products and customer trajectories in a convenient product space in which proximity represents similarity. Learning to measure the distance between points in this space can be accomplished with a deep neural network that trains on customer histories and on dense vectorizations of product descriptions. We developed a system based on Collaborative (Deep) Metric Learning (CML) to predict the purchase probabilities of new theatrical releases. We trained and evaluated the model using a large dataset of customer histories, and tested the model for a set of movies that were released outside of the training window. Initial experiments show gains relative to models that do not train on collaborative preferences. |
Tasks | Metric Learning, Product Recommendation, Recommendation Systems |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00202v1 |
PDF | http://arxiv.org/pdf/1803.00202v1.pdf |
PWC | https://paperswithcode.com/paper/collaborative-metric-learning-recommendation |
Repo | |
Framework | |
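The "learning to measure distance" step described above is the standard collaborative metric learning recipe: embed users and items in one space and use a hinge loss to pull purchased items closer to the user than sampled negatives. The PyTorch sketch below shows only that generic loss; the embedding sizes and margin are placeholders, and the paper's additional content branch over dense movie-description vectors is omitted.

```python
# Compact collaborative-metric-learning loss with user/item embeddings.
import torch
import torch.nn as nn

class CML(nn.Module):
    def __init__(self, n_users, n_items, dim=32, margin=0.5):
        super().__init__()
        self.users = nn.Embedding(n_users, dim)
        self.items = nn.Embedding(n_items, dim)
        self.margin = margin

    def forward(self, u, pos, neg):
        du = self.users(u)
        d_pos = ((du - self.items(pos)) ** 2).sum(-1)   # squared distance to purchase
        d_neg = ((du - self.items(neg)) ** 2).sum(-1)   # squared distance to negative
        return torch.relu(self.margin + d_pos - d_neg).mean()

model = CML(n_users=1000, n_items=500)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
u, pos, neg = (torch.randint(0, n, (64,)) for n in (1000, 500, 500))
loss = model(u, pos, neg)
loss.backward(); opt.step()
print(float(loss))
```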
On-chip learning for domain wall synapse based Fully Connected Neural Network
Title | On-chip learning for domain wall synapse based Fully Connected Neural Network |
Authors | Apoorv Dankar, Anand Verma, Utkarsh Saxena, Divya Kaushik, Shouri Chatterjee, Debanjan Bhowmik |
Abstract | Spintronic devices are considered as promising candidates in implementing neuromorphic systems or hardware neural networks, which are expected to perform better than other existing computing systems for certain data classification and regression tasks. In this paper, we have designed a feedforward Fully Connected Neural Network (FCNN) with no hidden layer using spin orbit torque driven domain wall devices as synapses and transistor based analog circuits as neurons. A feedback circuit is also designed using transistors, which at every iteration computes the change in weights of the synapses needed to train the network using Stochastic Gradient Descent (SGD) method. Subsequently it sends write current pulses to the domain wall based synaptic devices which move the domain walls and updates the weights of the synapses. Through a combination of micromagnetic simulations, analog circuit simulations and numerically solving FCNN training equations, we demonstrate “on-chip” training of the designed FCNN on the MNIST database of handwritten digits in this paper. We report the training and test accuracies, energy consumed in the synaptic devices for the training and possible issues with hardware implementation of FCNN that can limit its test accuracy. |
Tasks | |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.09966v1 |
PDF | http://arxiv.org/pdf/1811.09966v1.pdf |
PWC | https://paperswithcode.com/paper/on-chip-learning-for-domain-wall-synapse |
Repo | |
Framework | |
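As a software analogue of the network the paper maps to hardware, the numpy sketch below trains a fully connected layer with no hidden layer by stochastic gradient descent; in the paper, each weight update would be realised as a write-current pulse that moves a domain wall in the corresponding synaptic device. The toy data here merely stand in for MNIST, and the learning rate and update rule are generic assumptions, not the circuit-level behaviour reported by the authors.

```python
# Single-layer (no hidden layer) softmax classifier trained with per-sample SGD.
import numpy as np

rng = np.random.default_rng(0)
W = np.zeros((784, 10))                       # synaptic weight matrix (domain-wall positions)
b = np.zeros(10)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sgd_step(x, y_onehot, lr=0.1):
    global W, b
    p = softmax(x @ W + b)
    grad = np.outer(x, p - y_onehot)          # cross-entropy gradient for one sample
    W -= lr * grad                            # hardware analogue: write pulses adjust weights
    b -= lr * (p - y_onehot)

# Toy data standing in for flattened 28x28 MNIST images.
for _ in range(100):
    label = rng.integers(10)
    x = rng.normal(loc=label / 10.0, scale=0.1, size=784)
    y = np.eye(10)[label]
    sgd_step(x, y)
print("prediction for a 'label 3'-like input:",
      np.argmax(rng.normal(loc=0.3, scale=0.1, size=784) @ W + b))
```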
A Comparative Study on using Principle Component Analysis with Different Text Classifiers
Title | A Comparative Study on using Principle Component Analysis with Different Text Classifiers |
Authors | Ahmed I. Taloba, D. A. Eisa, Safaa S. I. Ismail |
Abstract | Text categorization (TC) is the task of automatically organizing a set of documents into a set of pre-defined categories. Over the last few years, increased attention has been paid to documents in digital form, and this makes text categorization a challenging issue. The most significant problem in text categorization is the huge number of features. Most of these features are redundant, noisy or irrelevant, causing overfitting with most classifiers. Hence, feature extraction is an important step in improving the overall accuracy and performance of text classifiers. In this paper, we provide an overview of using principal component analysis (PCA) for feature extraction with various classifiers. It was observed that the performance of the classifiers improved after using PCA to reduce the dimensionality of the data. Experiments are conducted on three UCI data sets: Classic03, CNAE-9 and DBWorld e-mails. We compare the classification performance results of using PCA with popular and well-known text classifiers. Results show that using PCA encouragingly enhances classification performance for most of the classifiers. |
Tasks | Text Categorization |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.03283v1 |
PDF | http://arxiv.org/pdf/1807.03283v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparative-study-on-using-principle |
Repo | |
Framework | |
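The experimental setup described above (document-term features, a PCA projection, and several classifiers compared with and without it) can be sketched with scikit-learn. This is only an illustration of the workflow: the 20 Newsgroups subset (downloaded on first use), the feature cap and the number of components are placeholders, not the Classic03 / CNAE-9 / DBWorld corpora or settings used in the paper.

```python
# TF-IDF features -> PCA projection -> classifier comparison.
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X = TfidfVectorizer(max_features=2000).fit_transform(data.data).toarray()
X_pca = PCA(n_components=50).fit_transform(X)      # feature extraction step

for name, features in [("raw term features", X), ("PCA features", X_pca)]:
    for clf in (LinearSVC(), GaussianNB()):
        score = cross_val_score(clf, features, data.target, cv=3).mean()
        print(f"{clf.__class__.__name__:10s} on {name}: {score:.3f}")
```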
A Visual Quality Index for Fuzzy C-Means
Title | A Visual Quality Index for Fuzzy C-Means |
Authors | Aybüke Öztürk, Stéphane Lallich, Jérôme Darmont |
Abstract | Cluster analysis is widely used in the areas of machine learning and data mining. Fuzzy clustering is a particular method that considers that a data point can belong to more than one cluster. Fuzzy clustering helps obtain flexible clusters, as needed in such applications as text categorization. The performance of a clustering algorithm critically depends on the number of clusters, and estimating the optimal number of clusters is a challenging task. Quality indices help estimate the optimal number of clusters. However, there is no quality index that can obtain an accurate number of clusters for different datasets. Hence, in this paper, we propose a new cluster quality index associated with a visual, graph-based solution that helps choose the optimal number of clusters in fuzzy partitions. Moreover, we validate our theoretical results through extensive comparison experiments against state-of-the-art quality indices on a variety of numerical real-world and artificial datasets. |
Tasks | Text Categorization |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01552v1 |
PDF | http://arxiv.org/pdf/1806.01552v1.pdf |
PWC | https://paperswithcode.com/paper/a-visual-quality-index-for-fuzzy-c-means |
Repo | |
Framework | |
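For context on the quality-index problem described above, the numpy sketch below runs plain fuzzy c-means over a range of cluster counts and scores each with the classic fuzzy partition coefficient (FPC). This is a generic illustration only: the visual, graph-based index proposed in the paper is not reproduced, and the toy data are invented.

```python
# Plain fuzzy c-means with the fuzzy partition coefficient as a simple quality index.
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c)); U /= U.sum(axis=1, keepdims=True)   # memberships
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in ([0, 0], [3, 0], [0, 3])])

for c in range(2, 6):
    U, _ = fuzzy_cmeans(X, c)
    fpc = (U ** 2).sum() / len(X)          # closer to 1 means a crisper partition
    print(f"c={c}: FPC={fpc:.3f}")
```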
A Context-aware Capsule Network for Multi-label Classification
Title | A Context-aware Capsule Network for Multi-label Classification |
Authors | Sameera Ramasinghe, C. D. Athuraliya, Salman Khan |
Abstract | The recently proposed Capsule Network is a brain-inspired architecture that brings a new paradigm to deep learning by modelling input domain variations through vector-based representations. Despite being a seminal contribution, CapsNet does not explicitly model structured relationships between the detected entities or among the capsule features for related inputs. Motivated by the working of cortical networks in the human visual system, we seek to resolve CapsNet limitations by proposing several intuitive modifications to the CapsNet architecture. We introduce (1) a novel routing weight initialization technique, (2) an improved CapsNet design that exploits semantic relationships between the primary capsule activations using a densely connected Conditional Random Field, and (3) a Cholesky transformation based correlation module to learn a general priority scheme. Our proposed design allows CapsNet to scale better to more complex problems, such as the multi-label classification task, where semantically related categories co-exist with various interdependencies. We present theoretical bases for our extensions and demonstrate significant improvements on the ADE20K scene dataset. |
Tasks | Multi-Label Classification |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06231v2 |
PDF | http://arxiv.org/pdf/1810.06231v2.pdf |
PWC | https://paperswithcode.com/paper/a-context-aware-capsule-network-for-multi |
Repo | |
Framework | |
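For readers unfamiliar with the machinery this paper extends, the numpy sketch below shows the standard CapsNet "squash" non-linearity and one dynamic-routing pass with the usual uniform initialisation of the routing logits. It only grounds the terminology: the paper's contributions (the routing weight initialisation technique, the CRF over primary capsules and the Cholesky-based correlation module) sit on top of this and are not shown, and the capsule dimensions are arbitrary.

```python
# Squash non-linearity and routing-by-agreement between capsule layers.
import numpy as np

def squash(v, axis=-1, eps=1e-9):
    n2 = (v ** 2).sum(axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * v / np.sqrt(n2 + eps)

def routing(u_hat, iterations=3):
    """u_hat: (n_in, n_out, d) prediction vectors from lower to higher capsules."""
    b = np.zeros(u_hat.shape[:2])                              # routing logits (uniform init)
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                 # weighted sum per output capsule
        v = squash(s)
        b = b + (u_hat * v[None]).sum(-1)                      # agreement update
    return v

v = routing(np.random.default_rng(0).normal(size=(32, 10, 16)))
print(v.shape)  # (10, 16)
```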