Paper Group ANR 88
Explainable AI: Beware of Inmates Running the Asylum Or: How I Learnt to Stop Worrying and Love the Social and Behavioural Sciences. Image Captioning with Object Detection and Localization. Efficient Reinforcement Learning via Initial Pure Exploration. Discourse Structure in Machine Translation Evaluation. Learning Robust Visual-Semantic Embeddings …
Explainable AI: Beware of Inmates Running the Asylum Or: How I Learnt to Stop Worrying and Love the Social and Behavioural Sciences
Title | Explainable AI: Beware of Inmates Running the Asylum Or: How I Learnt to Stop Worrying and Love the Social and Behavioural Sciences |
Authors | Tim Miller, Piers Howe, Liz Sonenberg |
Abstract | In his seminal book ‘The Inmates are Running the Asylum: Why High-Tech Products Drive Us Crazy And How To Restore The Sanity’ [2004, Sams Indianapolis, IN, USA], Alan Cooper argues that a major reason why software is often poorly designed (from a user perspective) is that programmers are in charge of design decisions, rather than interaction designers. As a result, programmers design software for themselves, rather than for their target audience, a phenomenon he refers to as ‘the inmates running the asylum’. This paper argues that explainable AI risks a similar fate. While the re-emergence of explainable AI is positive, this paper argues that most of us, as AI researchers, are building explanatory agents for ourselves, rather than for the intended users. But explainable AI is more likely to succeed if researchers and practitioners understand, adopt, implement, and improve models from the vast and valuable bodies of research in philosophy, psychology, and cognitive science, and if evaluation of these models is focused more on people than on technology. From a light scan of the literature, we demonstrate that there is considerable scope to infuse more results from the social and behavioural sciences into explainable AI, and present some key results from these fields that are relevant to explainable AI. |
Tasks | |
Published | 2017-12-02 |
URL | http://arxiv.org/abs/1712.00547v2 |
http://arxiv.org/pdf/1712.00547v2.pdf | |
PWC | https://paperswithcode.com/paper/explainable-ai-beware-of-inmates-running-the |
Repo | |
Framework | |
Image Captioning with Object Detection and Localization
Title | Image Captioning with Object Detection and Localization |
Authors | Zhongliang Yang, Yu-Jin Zhang, Sadaqat ur Rehman, Yongfeng Huang |
Abstract | Automatically generating a natural language description of an image is a task close to the heart of image understanding. In this paper, we present a multi-model neural network method closely related to the human visual system that automatically learns to describe the content of images. Our model consists of two sub-models: an object detection and localization model, which extracts the objects and their spatial relationships from the image, and a deep recurrent neural network (RNN) based on long short-term memory (LSTM) units with an attention mechanism for sentence generation. Each word of the description is automatically aligned to different objects of the input image as it is generated, similar to the attention mechanism of the human visual system. Experimental results on the COCO dataset showcase the merit of the proposed method, which outperforms previous benchmark models. |
Tasks | Image Captioning, Object Detection |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02430v1 |
http://arxiv.org/pdf/1706.02430v1.pdf | |
PWC | https://paperswithcode.com/paper/image-captioning-with-object-detection-and |
Repo | |
Framework | |
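The captioning model above pairs region-level object features with an LSTM decoder that attends over those regions at each word. As a rough, hypothetical illustration of that alignment step only (not the authors' implementation; all shapes, weight names, and values below are invented), here is a minimal numpy sketch of additive soft attention over per-object features:

```python
import numpy as np

def soft_attention(object_feats, hidden, W_f, W_h, v):
    """Soft attention over detected-object features (hypothetical shapes).

    object_feats: (num_objects, feat_dim)  -- one vector per detected region
    hidden:       (hid_dim,)               -- previous LSTM decoder state
    Returns the attention weights and the context vector for the next word.
    """
    # Additive (Bahdanau-style) scoring of each object against the decoder state
    scores = np.tanh(object_feats @ W_f + hidden @ W_h) @ v   # (num_objects,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                  # softmax over objects
    context = weights @ object_feats                          # (feat_dim,)
    return weights, context

# Toy usage with random features for 5 detected objects
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 16))
h = rng.normal(size=8)
W_f, W_h, v = rng.normal(size=(16, 32)), rng.normal(size=(8, 32)), rng.normal(size=32)
w, c = soft_attention(feats, h, W_f, W_h, v)
print(w.round(3), c.shape)
```

In a full decoder, the context vector would typically be combined with the current word embedding and fed to the LSTM cell that predicts the next word.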
Efficient Reinforcement Learning via Initial Pure Exploration
Title | Efficient Reinforcement Learning via Initial Pure Exploration |
Authors | Sudeep Raja Putta, Theja Tulabandhula |
Abstract | In several realistic situations, an interactive learning agent can practice and refine its strategy before going on to be evaluated. For instance, consider a student preparing for a series of tests. She would typically take a few practice tests to know which areas she needs to improve upon. Based on the scores she obtains in these practice tests, she would formulate a strategy for maximizing her scores in the actual tests. We treat this scenario in the context of an agent exploring a fixed-horizon episodic Markov Decision Process (MDP), where the agent can practice on the MDP for some number of episodes (not necessarily known in advance) before starting to incur regret for its actions. During practice, the agent’s goal is to maximize the probability of following an optimal policy. This is akin to the problem of Pure Exploration (PE). We extend the PE problem of Multi-Armed Bandits (MAB) to MDPs and propose a Bayesian algorithm called Posterior Sampling for Pure Exploration (PSPE), which is similar to its bandit counterpart. We show that the Bayesian simple regret converges at an optimal exponential rate when using PSPE. When the agent starts being evaluated, its goal is to minimize the cumulative regret incurred. This is akin to the problem of Reinforcement Learning (RL). The agent uses the Posterior Sampling for Reinforcement Learning algorithm (PSRL) initialized with the posteriors of the practice phase. We hypothesize that this PSPE + PSRL combination is an optimal strategy for minimizing regret in RL problems with an initial practice phase. We present empirical results showing that a lower simple regret at the end of the practice phase leads to lower cumulative regret during evaluation. |
Tasks | Multi-Armed Bandits |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02237v1 |
http://arxiv.org/pdf/1706.02237v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-reinforcement-learning-via-initial |
Repo | |
Framework | |
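PSPE transfers the posterior-sampling idea from bandits to episodic MDPs. The sketch below shows only the plain Thompson-sampling bandit loop that the abstract calls the bandit counterpart, plus a Monte Carlo estimate of how concentrated the posterior is on the best arm; it is not the PSPE algorithm itself (which samples a full MDP model and follows an optimal policy for it each practice episode), and all arm means and counts are toy values.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = np.array([0.3, 0.5, 0.7])       # unknown Bernoulli arm means
alpha = np.ones(3)                            # Beta posterior parameters
beta = np.ones(3)

for t in range(500):                          # "practice" phase: refine the posterior
    sampled = rng.beta(alpha, beta)           # one posterior sample per arm
    arm = int(np.argmax(sampled))             # act greedily w.r.t. the sample
    reward = rng.random() < true_means[arm]
    alpha[arm] += reward
    beta[arm] += 1 - reward

# Probability mass the posterior puts on each arm being optimal (Monte Carlo)
draws = rng.beta(alpha, beta, size=(10_000, 3))
p_optimal = np.bincount(draws.argmax(axis=1), minlength=3) / len(draws)
print("posterior P(arm is best):", p_optimal.round(3))
```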
Discourse Structure in Machine Translation Evaluation
Title | Discourse Structure in Machine Translation Evaluation |
Authors | Shafiq Joty, Francisco Guzmán, Lluís Màrquez, Preslav Nakov |
Abstract | In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics in terms of correlation with human judgments, both at the segment and at the system level. This suggests that discourse information is complementary to the information used by many existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTK-party. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that: (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation's RST tree to the reference tree is positively correlated with translation quality. |
Tasks | Machine Translation |
Published | 2017-10-04 |
URL | http://arxiv.org/abs/1710.01504v1 |
http://arxiv.org/pdf/1710.01504v1.pdf | |
PWC | https://paperswithcode.com/paper/discourse-structure-in-machine-translation |
Repo | |
Framework | |
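The key mechanism in the abstract is (i) a tree-kernel similarity between the RST parses of the translation and the reference and (ii) a linear combination of that similarity with an existing metric's score. The sketch below uses a deliberately naive shared-subtree count as a stand-in for the paper's all-subtree kernels; the tree encoding, the mixing weight, and the scores are invented for illustration.

```python
from collections import Counter

def subtrees(tree):
    """Yield every subtree of a tree given as (label, child, child, ...) tuples."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from subtrees(child)

def tree_kernel(t1, t2):
    """Naive all-subtree kernel: number of identical subtrees shared by t1 and t2."""
    c1, c2 = Counter(subtrees(t1)), Counter(subtrees(t2))
    return sum(c1[s] * c2[s] for s in c1 if s in c2)

def combined_score(base_metric_score, discourse_similarity, weight=0.2):
    """Linear combination of an existing MT metric with a discourse similarity."""
    return (1 - weight) * base_metric_score + weight * discourse_similarity

# Toy RST-like trees: (relation, nucleus/satellite subtrees or leaf spans)
ref = ("Elaboration", ("Nucleus", "s1"), ("Satellite", "s2"))
hyp = ("Elaboration", ("Nucleus", "s1"), ("Satellite", "s3"))
sim = tree_kernel(ref, hyp) / (tree_kernel(ref, ref) * tree_kernel(hyp, hyp)) ** 0.5
print(round(sim, 3), round(combined_score(0.35, sim), 3))
```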
Learning Robust Visual-Semantic Embeddings
Title | Learning Robust Visual-Semantic Embeddings |
Authors | Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov |
Abstract | Many of the existing methods for learning joint embeddings of images and text use only supervised information from paired images and their textual attributes. Taking advantage of the recent success of unsupervised learning in deep neural networks, we propose an end-to-end learning framework that is able to extract more robust multi-modal representations across domains. The proposed method combines representation learning models (i.e., auto-encoders) with cross-domain learning criteria (i.e., a Maximum Mean Discrepancy loss) to learn joint embeddings for semantic and visual features. A novel technique of unsupervised-data adaptation inference is introduced to construct more comprehensive embeddings for both labeled and unlabeled data. We evaluate our method on the Animals with Attributes and Caltech-UCSD Birds 200-2011 datasets with a wide range of applications, including zero- and few-shot image recognition and retrieval, from inductive to transductive settings. Empirically, we show that our framework improves over the current state of the art on many of the considered tasks. |
Tasks | Representation Learning |
Published | 2017-03-17 |
URL | http://arxiv.org/abs/1703.05908v2 |
http://arxiv.org/pdf/1703.05908v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-robust-visual-semantic-embeddings |
Repo | |
Framework | |
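One concrete ingredient named in the abstract is the Maximum Mean Discrepancy (MMD) loss used as a cross-domain criterion. A minimal numpy version of the (biased) MMD estimate with an RBF kernel is sketched below; in the actual framework this would be one term of an end-to-end training objective, and the kernel bandwidth, batch sizes, and feature dimensions here are arbitrary.

```python
import numpy as np

def mmd_rbf(x, y, gamma=1.0):
    """Squared Maximum Mean Discrepancy between samples x and y with an RBF kernel.

    x: (n, d) features from one domain (e.g., visual), y: (m, d) from the other.
    """
    def k(a, b):
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
visual = rng.normal(0.0, 1.0, size=(64, 8))      # stand-in visual embeddings
semantic = rng.normal(0.5, 1.0, size=(64, 8))    # stand-in semantic embeddings
print("MMD^2:", round(float(mmd_rbf(visual, semantic, gamma=0.1)), 4))
```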
Optimization of distributions differences for classification
Title | Optimization of distributions differences for classification |
Authors | Mohammad Reza Bonyadi, Quang M. Tieng, David C. Reutens |
Abstract | In this paper we introduce a new classification algorithm called Optimization of Distributions Differences (ODD). The algorithm aims to find a transformation from the feature space to a new space where the instances in the same class are as close as possible to one another while the gravity centers of these classes are as far as possible from one another. This aim is formulated as a multiobjective optimization problem that is solved by a hybrid of an evolutionary strategy and the Quasi-Newton method. The choice of the transformation function is flexible; it could be any continuous space function. We experiment with a linear and a non-linear transformation in this paper. We show that the algorithm can outperform 6 other state-of-the-art classification methods, namely naive Bayes, support vector machines, linear discriminant analysis, multi-layer perceptrons, decision trees, and k-nearest neighbors, on 12 standard classification datasets. Our results show that the method is less sensitive to an imbalanced number of instances compared to these methods. We also show that ODD maintains its performance better than other classification methods on these datasets, and hence offers better generalization ability. |
Tasks | Multiobjective Optimization |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00989v1 |
http://arxiv.org/pdf/1703.00989v1.pdf | |
PWC | https://paperswithcode.com/paper/optimization-of-distributions-differences-for |
Repo | |
Framework | |
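ODD seeks a transformation under which instances of the same class are close while class centers are far apart. The sketch below merely evaluates those two competing criteria for an arbitrary linear map; the paper's actual multiobjective formulation and its hybrid evolutionary/Quasi-Newton solver are not reproduced, and all data and names are synthetic.

```python
import numpy as np

def odd_objectives(W, X, y):
    """The two competing objectives described in the abstract, for a linear map W.

    Returns (within-class spread, between-class center distance) of Z = X @ W.
    A multiobjective optimizer would minimize the first and maximize the second.
    """
    Z = X @ W
    classes = np.unique(y)
    centers = np.array([Z[y == c].mean(axis=0) for c in classes])
    within = np.mean([np.linalg.norm(Z[y == c] - centers[i], axis=1).mean()
                      for i, c in enumerate(classes)])
    between = np.mean([np.linalg.norm(centers[i] - centers[j])
                       for i in range(len(classes)) for j in range(i + 1, len(classes))])
    return within, between

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(2, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
W = rng.normal(size=(4, 2))
print(odd_objectives(W, X, y))
```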
Benchmarking Multimodal Sentiment Analysis
Title | Benchmarking Multimodal Sentiment Analysis |
Authors | Erik Cambria, Devamanyu Hazarika, Soujanya Poria, Amir Hussain, R. B. V. Subramanyam |
Abstract | We propose a framework for multimodal sentiment analysis and emotion recognition using convolutional neural network-based feature extraction from text and visual modalities. We obtain a performance improvement of 10% over the state of the art by combining visual, text and audio features. We also discuss some major issues frequently ignored in multimodal sentiment analysis research: the role of speaker-independent models, the importance of the modalities, and generalizability. The paper thus serves as a new benchmark for further research in multimodal sentiment analysis and also demonstrates the different facets of analysis to be considered while performing such tasks. |
Tasks | Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis |
Published | 2017-07-29 |
URL | http://arxiv.org/abs/1707.09538v1 |
http://arxiv.org/pdf/1707.09538v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-multimodal-sentiment-analysis |
Repo | |
Framework | |
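The benchmark combines features extracted separately from the text, audio and visual streams. As a bare-bones, assumption-laden sketch of feature-level fusion (random features and labels, and a linear classifier instead of the paper's CNN-based pipeline), the snippet below just shows the concatenate-then-classify mechanics; the reported accuracy on random data is meaningless by construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
# Stand-ins for per-utterance features extracted by modality-specific networks
text_feat = rng.normal(size=(n, 32))
audio_feat = rng.normal(size=(n, 16))
visual_feat = rng.normal(size=(n, 24))
labels = rng.integers(0, 2, size=n)          # binary sentiment, random here

# Feature-level fusion: concatenate the modalities and train a single classifier
fused = np.hstack([text_feat, audio_feat, visual_feat])
clf = LogisticRegression(max_iter=1000).fit(fused, labels)
print("training accuracy on random toy data:", round(clf.score(fused, labels), 3))
```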
An effective algorithm for hyperparameter optimization of neural networks
Title | An effective algorithm for hyperparameter optimization of neural networks |
Authors | Gonzalo Diaz, Achille Fokoue, Giacomo Nannicini, Horst Samulowitz |
Abstract | A major challenge in designing neural network (NN) systems is to determine the best structure and parameters for the network given the data for the machine learning problem at hand. Examples of parameters are the number of layers and nodes, the learning rates, and the dropout rates. Typically, these parameters are chosen based on heuristic rules and manually fine-tuned, which may be very time-consuming, because evaluating the performance of a single parametrization of the NN may require several hours. This paper addresses the problem of choosing appropriate parameters for the NN by formulating it as a box-constrained mathematical optimization problem, and applying a derivative-free optimization tool that automatically and effectively searches the parameter space. The optimization tool employs a radial basis function model of the objective function (the prediction accuracy of the NN) to accelerate the discovery of configurations yielding high accuracy. Candidate configurations explored by the algorithm are trained to a small number of epochs, and only the most promising candidates receive full training. The performance of the proposed methodology is assessed on benchmark sets and in the context of predicting drug-drug interactions, showing promising results. The optimization tool used in this paper is open-source. |
Tasks | Hyperparameter Optimization |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08520v1 |
http://arxiv.org/pdf/1705.08520v1.pdf | |
PWC | https://paperswithcode.com/paper/an-effective-algorithm-for-hyperparameter |
Repo | |
Framework | |
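The abstract describes a derivative-free search that fits a radial basis function surrogate of validation accuracy and uses it to pick the next configuration to train fully. The toy sketch below fits a Gaussian RBF interpolant by hand and scores random candidate configurations; it is a caricature of that idea rather than the open-source tool used in the paper, and the two-dimensional configuration space and the fake accuracy function are invented.

```python
import numpy as np

def rbf_surrogate(X, y, eps=1.0):
    """Fit a Gaussian RBF interpolant to observed (configuration, accuracy) pairs."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-eps * d2)
    w = np.linalg.solve(Phi + 1e-8 * np.eye(len(X)), y)   # small ridge for stability
    def predict(Q):
        q2 = ((Q[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return np.exp(-eps * q2) @ w
    return predict

rng = np.random.default_rng(0)
# Hypothetical 2-D configuration space: (log learning rate, dropout rate)
observed = rng.uniform([-4, 0.0], [-1, 0.6], size=(12, 2))
accuracy = 0.9 - 0.1 * (observed[:, 0] + 2.5) ** 2 - 0.2 * observed[:, 1]  # fake scores

surrogate = rbf_surrogate(observed, accuracy, eps=2.0)
candidates = rng.uniform([-4, 0.0], [-1, 0.6], size=(500, 2))
best = candidates[int(np.argmax(surrogate(candidates)))]
print("next configuration to train fully:", best.round(3))
```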
Data-driven Random Fourier Features using Stein Effect
Title | Data-driven Random Fourier Features using Stein Effect |
Authors | Wei-Cheng Chang, Chun-Liang Li, Yiming Yang, Barnabas Poczos |
Abstract | Large-scale kernel approximation is an important problem in machine learning research. Approaches using random Fourier features have become increasingly popular [Rahimi and Recht, 2007], where kernel approximation is treated as empirical mean estimation via Monte Carlo (MC) or Quasi-Monte Carlo (QMC) integration [Yang et al., 2014]. A limitation of the current approaches is that all the features receive an equal weight summing to 1. In this paper, we propose a novel shrinkage estimator based on the “Stein effect”, which provides a data-driven weighting strategy for random features and enjoys theoretical justifications in terms of lowering the empirical risk. We further present an efficient randomized algorithm for large-scale applications of the proposed method. Our empirical results on six benchmark data sets demonstrate the advantageous performance of this approach over representative baselines in both kernel approximation and supervised learning tasks. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08525v1 |
http://arxiv.org/pdf/1705.08525v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-random-fourier-features-using |
Repo | |
Framework | |
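For context, here is the standard random Fourier feature map the paper starts from, where every feature carries the same 1/D weight; the data-driven Stein-effect weighting proposed in the paper replaces those uniform weights and is not reproduced here. Shapes, the kernel bandwidth, and the number of features are arbitrary.

```python
import numpy as np

def random_fourier_features(X, num_features, gamma, rng):
    """Standard RFF map approximating the RBF kernel k(x,y)=exp(-gamma*||x-y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
gamma = 0.5
Z = random_fourier_features(X, 2000, gamma, rng)

# Compare the implied equal-weight approximation Z Z^T against the exact kernel
K_exact = np.exp(-gamma * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
K_approx = Z @ Z.T
print("mean abs error:", round(float(np.abs(K_exact - K_approx).mean()), 4))
```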
Cross-Age LFW: A Database for Studying Cross-Age Face Recognition in Unconstrained Environments
Title | Cross-Age LFW: A Database for Studying Cross-Age Face Recognition in Unconstrained Environments |
Authors | Tianyue Zheng, Weihong Deng, Jiani Hu |
Abstract | The Labeled Faces in the Wild (LFW) database has been widely used as the benchmark for unconstrained face verification, and thanks to big-data-driven machine learning methods, performance on the database approaches nearly 100%. However, we argue that this accuracy may be too optimistic because of some limiting factors. Besides different poses, illuminations, occlusions and expressions, cross-age faces are another challenge in face recognition. Different ages of the same person result in large intra-class variations, and the aging process is unavoidable in real-world face verification; however, LFW does not pay much attention to it. We therefore construct Cross-Age LFW (CALFW), which deliberately searches for and selects 3,000 positive face pairs with age gaps to add intra-class variance caused by the aging process. Negative pairs with the same gender and race are also selected to reduce the influence of attribute differences between positive/negative pairs and to achieve face verification instead of attribute classification. We evaluate several metric learning and deep learning methods on the new database. Compared to the accuracy on LFW, the accuracy drops by about 10%-17% on CALFW. |
Tasks | Face Recognition, Face Verification, Metric Learning |
Published | 2017-08-28 |
URL | http://arxiv.org/abs/1708.08197v1 |
http://arxiv.org/pdf/1708.08197v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-age-lfw-a-database-for-studying-cross |
Repo | |
Framework | |
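CALFW keeps the LFW-style verification protocol: decide whether two face images show the same person by thresholding the similarity of their embeddings. The toy sketch below illustrates only that protocol with synthetic embeddings (noisier "same" pairs loosely standing in for age gaps); it says nothing about how the database itself was constructed, and the threshold and noise levels are made up.

```python
import numpy as np

def verification_accuracy(emb_a, emb_b, same, threshold):
    """LFW-style pair verification: cosine similarity compared against a threshold."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = (a * b).sum(axis=1)
    return ((sim > threshold) == same).mean()

rng = np.random.default_rng(0)
same = rng.integers(0, 2, size=1000).astype(bool)
base = rng.normal(size=(1000, 128))
# Positive pairs get a mildly perturbed copy of the embedding, negatives a heavily
# perturbed one, mimicking smaller vs. larger intra-/inter-class distances.
noise = np.where(same[:, None], 0.8, 2.5)
pair = base + rng.normal(size=(1000, 128)) * noise
print("toy accuracy:", round(float(verification_accuracy(base, pair, same, 0.5)), 3))
```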
Residual Value Forecasting Using Asymmetric Cost Functions
Title | Residual Value Forecasting Using Asymmetric Cost Functions |
Authors | Korbinian Dress, Stefan Lessmann, Hans-Jörg von Mettenheim |
Abstract | Leasing is a popular channel to market new cars. Pricing a leasing contract is complicated because the leasing rate embodies an expectation of the residual value of the car after contract expiration. To aid lessors in their pricing decisions, the paper develops resale price forecasting models. A peculiarity of the leasing business is that forecast errors entail different costs. Identifying effective ways to address this characteristic is the main objective of the paper. More specifically, the paper contributes to the literature through i) consolidating and integrating previous work in forecasting with asymmetric cost of error functions, ii) systematically evaluating previous approaches and comparing them to a new approach, and iii) demonstrating that forecasting with asymmetric cost of error functions enhances the quality of decision support in car leasing. For example, under the assumption that the cost of overestimating resale prices is twice that of the opposite error, incorporating the corresponding cost asymmetry into forecast model development reduces decision costs by about eight percent, compared to a standard forecasting model. Higher asymmetry produces even larger improvements. |
Tasks | |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.02736v1 |
http://arxiv.org/pdf/1707.02736v1.pdf | |
PWC | https://paperswithcode.com/paper/residual-value-forecasting-using-asymmetric |
Repo | |
Framework | |
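The abstract's example, overestimating a resale price costing twice as much as underestimating it, corresponds to a linear asymmetric cost-of-error function. A minimal sketch of such a lin-lin loss is below; the paper evaluates several asymmetric cost functions and forecasting models, none of which are reproduced here, and the prices are made up.

```python
import numpy as np

def asymmetric_loss(y_true, y_pred, over_cost=2.0, under_cost=1.0):
    """Linear-linear asymmetric cost: overestimating the resale price costs more."""
    err = y_pred - y_true
    return np.where(err > 0, over_cost * err, -under_cost * err).mean()

y_true = np.array([12000.0, 15000.0, 9000.0])
y_pred = np.array([12500.0, 14000.0, 9500.0])
print("symmetric MAE :", np.abs(y_pred - y_true).mean())   # 666.67
print("asymmetric    :", asymmetric_loss(y_true, y_pred))  # 1000.0, overestimates penalized 2x
```

A model trained or selected under such a loss is pushed toward slightly conservative (lower) price forecasts, which is exactly the behaviour the leasing application rewards.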
On the Design of LQR Kernels for Efficient Controller Learning
Title | On the Design of LQR Kernels for Efficient Controller Learning |
Authors | Alonso Marco, Philipp Hennig, Stefan Schaal, Sebastian Trimpe |
Abstract | Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance. |
Tasks | |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.07089v1 |
http://arxiv.org/pdf/1709.07089v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-design-of-lqr-kernels-for-efficient |
Repo | |
Framework | |
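For a first-order (scalar) system, the latent objective that Bayesian optimization would query is the closed-loop LQR cost as a function of the feedback gain. The sketch below computes that cost from the scalar Lyapunov equation; the paper's actual contribution, GP kernels that encode this LQR structure, is not reproduced, and the system and cost parameters are arbitrary.

```python
import numpy as np

def lqr_cost(k, a=0.9, b=1.0, q=1.0, r=0.1):
    """Infinite-horizon cost of the scalar system x+ = a*x + b*u with u = -k*x.

    Solves the closed-loop Lyapunov equation P = q + r*k^2 + (a - b*k)^2 * P, which
    only has a bounded solution when the closed loop is stable (|a - b*k| < 1).
    """
    acl = a - b * k
    if abs(acl) >= 1.0:
        return np.inf                      # unstable feedback gain
    return (q + r * k ** 2) / (1.0 - acl ** 2)

gains = np.linspace(-0.5, 2.0, 26)
costs = np.array([lqr_cost(k) for k in gains])
print("best sampled gain:", round(float(gains[int(np.argmin(costs))]), 2))
```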
Zero-Shot Learning via Category-Specific Visual-Semantic Mapping
Title | Zero-Shot Learning via Category-Specific Visual-Semantic Mapping |
Authors | Li Niu, Jianfei Cai, Ashok Veeraraghavan |
Abstract | Zero-Shot Learning (ZSL) aims to classify a test instance from an unseen category based on the training instances from seen categories, in which the gap between seen categories and unseen categories is generally bridged via a visual-semantic mapping between the low-level visual feature space and the intermediate semantic space. However, the visual-semantic mapping learnt from seen categories may not generalize well to unseen categories because the data distributions of seen and unseen categories are considerably different, which is known as the projection domain shift problem in ZSL. To address this domain shift issue, we propose a method named Adaptive Embedding ZSL (AEZSL) to learn an adaptive visual-semantic mapping for each unseen category based on the similarities between each unseen category and all the seen categories. Then, we make two extensions based on our AEZSL method. Firstly, in order to utilize the unlabeled test instances from unseen categories, we extend our AEZSL to a semi-supervised approach named AEZSL with Label Refinement (AEZSL_LR), in which a progressive approach is developed to alternately update the visual classifiers and refine the predicted test labels based on the similarities among test instances and among unseen categories. Secondly, to avoid learning a visual-semantic mapping for each unseen category in the large-scale classification task, we extend our AEZSL to a deep adaptive embedding model named Deep AEZSL (DAEZSL) sharing the same idea (i.e., the visual-semantic mapping should be category-specific and related to the semantic space) as AEZSL, which only needs to be trained once but can be applied to an arbitrary number of unseen categories. Extensive experiments demonstrate that our proposed methods achieve state-of-the-art results for image classification on four benchmark datasets. |
Tasks | Image Classification, Zero-Shot Learning |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06167v2 |
http://arxiv.org/pdf/1711.06167v2.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-learning-via-category-specific |
Repo | |
Framework | |
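The core idea is a category-specific visual-semantic mapping for each unseen class, weighted by its semantic similarity to the seen classes. The sketch below is a deliberately crude caricature of that idea: it simply averages per-seen-class mappings with similarity-based softmax weights, whereas AEZSL learns each unseen-class mapping with similarity-weighted training losses. All matrices and attribute vectors are random stand-ins.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d_vis, d_sem, n_seen = 10, 5, 4

# Stand-ins: one visual->semantic mapping per seen category, plus class attributes
seen_maps = rng.normal(size=(n_seen, d_vis, d_sem))
seen_attrs = rng.normal(size=(n_seen, d_sem))
unseen_attr = rng.normal(size=d_sem)

# Weight the seen-category mappings by semantic similarity to the unseen category
sims = seen_attrs @ unseen_attr
weights = softmax(sims)
adapted_map = np.tensordot(weights, seen_maps, axes=1)   # (d_vis, d_sem)

# Score a test image for the unseen category via the adapted mapping
x = rng.normal(size=d_vis)
score = (x @ adapted_map) @ unseen_attr
print("weights:", weights.round(3), " score:", round(float(score), 3))
```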
Characterisation of speech diversity using self-organising maps
Title | Characterisation of speech diversity using self-organising maps |
Authors | Tom A. F. Anderson, David M. W. Powers |
Abstract | We report investigations into speaker classification of larger quantities of unlabelled speech data using small sets of manually phonemically annotated speech. The Kohonen speech typewriter is a semi-supervised method composed of self-organising maps (SOMs) that achieves low phoneme error rates. A SOM is a 2D array of cells that learn vector representations of the data based on neighbourhoods. In this paper, we report a method to evaluate pronunciation using multilevel SOMs with /hVd/ single-syllable utterances for the study of vowels in Australian pronunciation. |
Tasks | |
Published | 2017-01-23 |
URL | http://arxiv.org/abs/1702.02092v1 |
http://arxiv.org/pdf/1702.02092v1.pdf | |
PWC | https://paperswithcode.com/paper/characterisation-of-speech-diversity-using |
Repo | |
Framework | |
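For readers unfamiliar with the building block, a self-organising map learns a grid of codebook vectors through winner-take-most neighbourhood updates. The minimal sketch below implements that generic SOM training loop on random vectors; it is not the multilevel, semi-supervised setup of the paper, and the grid size, learning rate and "acoustic features" are invented.

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=20, lr=0.5, sigma=2.0, seed=0):
    """Minimal SOM: each grid cell learns a codebook vector via neighbourhood updates."""
    rng = np.random.default_rng(seed)
    h, w = grid
    codebook = rng.normal(size=(h, w, data.shape[1]))
    rows, cols = np.indices(grid)
    for epoch in range(epochs):
        decay = np.exp(-epoch / epochs)
        for x in rng.permutation(data):
            # Best-matching unit: the cell whose codebook vector is closest to x
            d = ((codebook - x) ** 2).sum(-1)
            bi, bj = np.unravel_index(d.argmin(), d.shape)
            # Gaussian neighbourhood around the winner shrinks over time
            dist2 = (rows - bi) ** 2 + (cols - bj) ** 2
            nb = np.exp(-dist2 / (2 * (sigma * decay) ** 2))
            codebook += (lr * decay) * nb[..., None] * (x - codebook)
    return codebook

# Toy "acoustic feature" vectors; in the paper these would come from /hVd/ utterances
data = np.random.default_rng(1).normal(size=(300, 12))
som = train_som(data)
print(som.shape)
```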
Hashtag Healthcare: From Tweets to Mental Health Journals Using Deep Transfer Learning
Title | Hashtag Healthcare: From Tweets to Mental Health Journals Using Deep Transfer Learning |
Authors | Benjamin Shickel, Martin Heesacker, Sherry Benton, Parisa Rashidi |
Abstract | As the popularity of social media platforms continues to rise, an ever-increasing amount of human communication and self-expression takes place online. Most recent research has focused on mining social media for public user opinion about external entities such as product reviews or sentiment towards political news. However, less attention has been paid to analyzing users’ internalized thoughts and emotions from a mental health perspective. In this paper, we quantify the semantic difference between public Tweets and private mental health journals used in online cognitive behavioral therapy. We use deep transfer learning techniques to analyze the semantic gap between the two domains. We show that for the task of emotional valence prediction, social media can be successfully harnessed to create more accurate, robust, and personalized mental health models. Our results suggest that the semantic gap between public and private self-expression is small, and that utilizing the abundance of available social media is one way to overcome the small sample sizes of mental health data, which are commonly limited by availability and privacy concerns. |
Tasks | Transfer Learning |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01372v1 |
http://arxiv.org/pdf/1708.01372v1.pdf | |
PWC | https://paperswithcode.com/paper/hashtag-healthcare-from-tweets-to-mental |
Repo | |
Framework | |
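The transfer-learning recipe in the abstract, pretrain on abundant social-media data and then adapt to a small set of mental-health journals, can be illustrated with a linear stand-in model. The sketch below compares a target-only classifier with one warm-started on a synthetic source domain; it only shows the pretrain-then-continue mechanics, not the paper's deep models, and whether transfer wins on this random toy data carries no significance.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Stand-ins for text features: a large labelled source domain ("tweets") and a
# small related target domain ("journal entries") sharing most of the signal.
w_src = rng.normal(size=50)
w_tgt = w_src + 0.3 * rng.normal(size=50)
X_src = rng.normal(size=(5000, 50))
y_src = (X_src @ w_src > 0).astype(int)
X_tgt = rng.normal(size=(100, 50))
y_tgt = (X_tgt @ w_tgt > 0).astype(int)
X_test = rng.normal(size=(1000, 50))
y_test = (X_test @ w_tgt > 0).astype(int)

# Target-only baseline vs. pretraining on the source and continuing on the target
baseline = SGDClassifier(random_state=0).partial_fit(X_tgt, y_tgt, classes=[0, 1])
transfer = SGDClassifier(random_state=0).partial_fit(X_src, y_src, classes=[0, 1])
transfer.partial_fit(X_tgt, y_tgt)
print("target-only:", round(baseline.score(X_test, y_test), 3),
      "with transfer:", round(transfer.score(X_test, y_test), 3))
```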