Paper Group ANR 352
An Unbiased Approach to Quantification of Gender Inclination using Interpretable Word Representations. Generating Texts with Integer Linear Programming. Global SNR Estimation of Speech Signals using Entropy and Uncertainty Estimates from Dropout Networks. A regression approach for explaining manifold embedding coordinates. Logical Rule Induction an …
An Unbiased Approach to Quantification of Gender Inclination using Interpretable Word Representations
Title | An Unbiased Approach to Quantification of Gender Inclination using Interpretable Word Representations |
Authors | Navid Rekabsaz, Allan Hanbury |
Abstract | Recent advances in word embedding provide significant benefit to various information processing tasks. Yet these dense representations and their estimation of word-to-word relatedness remain difficult to interpret and hard to analyze. As an alternative, explicit word representations, i.e., vectors with clearly defined dimensions, which can be words, windows of words, or documents, are easily interpretable, and recent methods show performance competitive with the dense vectors. In this work, we propose a method to transfer the word2vec SkipGram embedding model to its explicit representation model. The method provides interpretable explicit vectors while keeping the effectiveness of the original model, tested by evaluating the model on several word association collections. Based on the proposed explicit representation, we propose a novel method to quantify the degree of gender bias in the English language (as used in Wikipedia) with regard to a set of occupations. By measuring the bias towards explicit Female and Male factors, the work demonstrates a general tendency of the majority of the occupations towards male and a strong bias in a few specific occupations (e.g. nurse) towards female. |
Tasks | |
Published | 2018-12-13 |
URL | http://arxiv.org/abs/1812.10424v1 |
http://arxiv.org/pdf/1812.10424v1.pdf | |
PWC | https://paperswithcode.com/paper/an-unbiased-approach-to-quantification-of |
Repo | |
Framework | |
Generating Texts with Integer Linear Programming
Title | Generating Texts with Integer Linear Programming |
Authors | Gerasimos Lampouras, Ion Androutsopoulos |
Abstract | Concept-to-text generation typically employs a pipeline architecture, which often leads to suboptimal texts. Content selection, for example, may greedily select the most important facts, which may, however, require too many words to express, and this may be undesirable when space is limited or expensive. Selecting other facts, possibly only slightly less important, may allow the lexicalization stage to use far fewer words, or to report more facts in the same space. Decisions made during content selection and lexicalization may also lead to more or fewer sentence aggregation opportunities, affecting the length and readability of the resulting texts. Building upon a publicly available state-of-the-art natural language generator for Semantic Web ontologies, this article presents an Integer Linear Programming model that, unlike pipeline architectures, jointly considers choices available in content selection, lexicalization, and sentence aggregation to avoid greedy local decisions and produce more compact texts, i.e., texts that report more facts per word. Compact texts are desirable, for example, when generating advertisements to be included in Web search results, or when summarizing structured information in limited space. An extended version of the proposed model also considers a limited form of referring expression generation and avoids redundant sentences. An approximation of the two models can be used when longer texts need to be generated. Experiments with three ontologies confirm that the proposed models lead to more compact texts, compared to pipeline systems, with no deterioration or with improvements in the perceived quality of the generated texts. |
Tasks | Concept-To-Text Generation, Text Generation |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00051v1 |
http://arxiv.org/pdf/1811.00051v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-texts-with-integer-linear |
Repo | |
Framework | |
Global SNR Estimation of Speech Signals using Entropy and Uncertainty Estimates from Dropout Networks
Title | Global SNR Estimation of Speech Signals using Entropy and Uncertainty Estimates from Dropout Networks |
Authors | Rohith Aralikatti, Dilip Margam, Tanay Sharma, Thanda Abhinav, Shankar M Venkatesan |
Abstract | This paper demonstrates two novel methods to estimate the global SNR of speech signals. In both methods, the Deep Neural Network-Hidden Markov Model (DNN-HMM) acoustic model used in speech recognition systems is leveraged for the additional task of SNR estimation. In the first method, the entropy of the DNN-HMM output is computed. Recent work on Bayesian deep learning has shown that a DNN-HMM trained with dropout can be used to estimate model uncertainty by approximating it as a deep Gaussian process. In the second method, this approximation is used to obtain model uncertainty estimates. Noise-specific regressors are used to predict the SNR from the entropy and model uncertainty. The DNN-HMM is trained on the GRID corpus and tested on different noise profiles from the DEMAND noise database at SNR levels ranging from -10 dB to 30 dB. |
Tasks | Speech Recognition |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04353v1 |
http://arxiv.org/pdf/1804.04353v1.pdf | |
PWC | https://paperswithcode.com/paper/global-snr-estimation-of-speech-signals-using |
Repo | |
Framework | |
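The two quantities named in the abstract above, output entropy and dropout-based model uncertainty, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, averaging per-frame entropies over an utterance is an assumed aggregation, and the uncertainty is taken here as the variance across stochastic forward passes, which is one common way to summarize dropout samples.

```python
import math
from statistics import mean, pvariance


def shannon_entropy(probs):
    """Entropy (in nats) of one probability vector; zero entries contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_frame_entropy(frame_posteriors):
    """Average the per-frame entropies over an utterance (an assumed aggregation)."""
    return mean(shannon_entropy(p) for p in frame_posteriors)


def dropout_uncertainty(passes):
    """Variance across stochastic forward passes, averaged over output classes.

    `passes` holds one probability vector per dropout-enabled forward pass
    for the same input frame (hypothetical input layout).
    """
    n_classes = len(passes[0])
    return mean(pvariance([p[k] for p in passes]) for k in range(n_classes))
```

Both values could then serve as input features to a regressor that predicts SNR, as the abstract describes; the regressor itself is not sketched here.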
A regression approach for explaining manifold embedding coordinates
Title | A regression approach for explaining manifold embedding coordinates |
Authors | Marina Meila, Samson Koelle, Hanyu Zhang |
Abstract | Manifold embedding algorithms map high-dimensional data down to coordinates in a much lower-dimensional space. One of the aims of dimension reduction is to find the intrinsic coordinates that describe the data manifold. However, the coordinates returned by the embedding algorithm are abstract coordinates. Finding their physical, domain-related meaning is not formalized and is left to the domain experts. This paper studies the problem of recovering the domain-specific meaning of the new low-dimensional representation in a semi-automatic, principled fashion. We propose a method to explain embedding coordinates on a manifold as non-linear compositions of functions from a user-defined dictionary. We show that this problem can be set up as a sparse linear Group Lasso recovery problem, find sufficient recovery conditions, and demonstrate its effectiveness on data. |
Tasks | Dimensionality Reduction |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.11891v1 |
http://arxiv.org/pdf/1811.11891v1.pdf | |
PWC | https://paperswithcode.com/paper/a-regression-approach-for-explaining-manifold |
Repo | |
Framework | |
Logical Rule Induction and Theory Learning Using Neural Theorem Proving
Title | Logical Rule Induction and Theory Learning Using Neural Theorem Proving |
Authors | Andres Campero, Aldo Pareja, Tim Klinger, Josh Tenenbaum, Sebastian Riedel |
Abstract | A hallmark of human cognition is the ability to continually acquire and distill observations of the world into meaningful, predictive theories. In this paper we present a new mechanism for logical theory acquisition which takes a set of observed facts and learns to extract from them a set of logical rules and a small set of core facts which together entail the observations. Our approach is neuro-symbolic in the sense that the rule predicates and core facts are given dense vector representations. The rules are applied to the core facts using a soft unification procedure to infer additional facts. After k steps of forward inference, the consequences are compared to the initial observations and the rules and core facts are then encouraged towards representations that more faithfully generate the observations through inference. Our approach is based on a novel neural forward-chaining differentiable rule induction network. The rules are interpretable and learned compositionally from their predicates, which may be invented. We demonstrate the efficacy of our approach on a variety of ILP rule induction and domain theory learning datasets. |
Tasks | Automated Theorem Proving |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02193v3 |
http://arxiv.org/pdf/1809.02193v3.pdf | |
PWC | https://paperswithcode.com/paper/logical-rule-induction-and-theory-learning |
Repo | |
Framework | |
AI Benchmark: Running Deep Neural Networks on Android Smartphones
Title | AI Benchmark: Running Deep Neural Networks on Android Smartphones |
Authors | Andrey Ignatov, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, Luc Van Gool |
Abstract | Over the last few years, the computational power of mobile devices such as smartphones and tablets has grown dramatically, reaching the level of desktop computers available not long ago. While standard smartphone apps are no longer a problem for them, there is still a group of tasks that can easily challenge even high-end devices, namely running artificial intelligence algorithms. In this paper, we present a study of the current state of deep learning in the Android ecosystem and describe available frameworks, programming models and the limitations of running AI on smartphones. We give an overview of the hardware acceleration resources available on four main mobile chipset platforms: Qualcomm, HiSilicon, MediaTek and Samsung. Additionally, we present the real-world performance results of different mobile SoCs collected with AI Benchmark, covering all main existing hardware configurations. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01109v2 |
http://arxiv.org/pdf/1810.01109v2.pdf | |
PWC | https://paperswithcode.com/paper/ai-benchmark-running-deep-neural-networks-on |
Repo | |
Framework | |
Towards Neural Theorem Proving at Scale
Title | Towards Neural Theorem Proving at Scale |
Authors | Pasquale Minervini, Matko Bosnjak, Tim Rocktäschel, Sebastian Riedel |
Abstract | Neural models combining representation learning and reasoning in an end-to-end trainable manner are receiving increasing interest. However, their use is severely limited by their computational complexity, which renders them unusable on real-world datasets. We focus on the Neural Theorem Prover (NTP) model proposed by Rocktäschel and Riedel (2017), a continuous relaxation of the Prolog backward chaining algorithm where unification between terms is replaced by the similarity between their embedding representations. For answering a given query, this model needs to consider all possible proof paths, and then aggregate results; this quickly becomes infeasible even for small Knowledge Bases (KBs). We observe that we can accurately approximate the inference process in this model by considering only proof paths associated with the highest proof scores. This enables inference and learning on previously impracticable KBs. |
Tasks | Automated Theorem Proving, Representation Learning |
Published | 2018-07-21 |
URL | http://arxiv.org/abs/1807.08204v1 |
http://arxiv.org/pdf/1807.08204v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-neural-theorem-proving-at-scale |
Repo | |
Framework | |
Developing a machine learning framework for estimating soil moisture with VNIR hyperspectral data
Title | Developing a machine learning framework for estimating soil moisture with VNIR hyperspectral data |
Authors | Sina Keller, Felix M. Riese, Johanna Stötzer, Philipp M. Maier, Stefan Hinz |
Abstract | In this paper, we investigate the potential of estimating the soil-moisture content based on VNIR hyperspectral data combined with LWIR data. Measurements from a multi-sensor field campaign represent the benchmark dataset, which contains hyperspectral, LWIR, and soil-moisture data measured on a grassland site. We introduce a regression framework with three steps consisting of feature selection, preprocessing, and well-chosen regression models. The latter are mainly supervised machine learning models. An exception is the self-organizing maps, which combine unsupervised and supervised learning. We analyze the impact of the distinct preprocessing methods on the regression results. Of all regression models, the extremely randomized trees model without preprocessing provides the best estimation performance. Our results reveal the potential of the respective regression framework combined with the VNIR hyperspectral data to estimate soil moisture measured under real-world conditions. In conclusion, the results of this paper provide a basis for further improvements in different research directions. |
Tasks | Feature Selection |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09046v4 |
http://arxiv.org/pdf/1804.09046v4.pdf | |
PWC | https://paperswithcode.com/paper/developing-a-machine-learning-framework-for |
Repo | |
Framework | |
Reinforcement Learning of Theorem Proving
Title | Reinforcement Learning of Theorem Proving |
Authors | Cezary Kaliszyk, Josef Urban, Henryk Michalewski, Mirek Olšák |
Abstract | We introduce a theorem proving algorithm that uses practically no domain heuristics for guiding its connection-style proof search. Instead, it runs many Monte-Carlo simulations guided by reinforcement learning from previous proof attempts. We produce several versions of the prover, parameterized by different learning and guiding algorithms. The strongest version of the system is trained on a large corpus of mathematical problems and evaluated on previously unseen problems. The trained system solves within the same number of inferences over 40% more problems than a baseline prover, which is an unusually high improvement in this hard AI domain. To our knowledge this is the first time reinforcement learning has been convincingly applied to solving general mathematical problems on a large scale. |
Tasks | Automated Theorem Proving |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07563v1 |
http://arxiv.org/pdf/1805.07563v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-of-theorem-proving |
Repo | |
Framework | |
Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint
Title | Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint |
Authors | Prashanth L. A., Michael Fu |
Abstract | The classic objective in a reinforcement learning (RL) problem is to find a policy that minimizes, in expectation, a long-run objective such as the infinite-horizon discounted or long-run average cost. In many practical applications, optimizing the expected value alone is not sufficient, and it may be necessary to include a risk measure in the optimization process, either as the objective or as a constraint. Various risk measures have been proposed in the literature, e.g., mean-variance tradeoff, exponential utility, the percentile performance, value at risk, conditional value at risk, prospect theory and its later enhancement, cumulative prospect theory. In this article, we focus on the combination of risk criteria and reinforcement learning in a constrained optimization framework, i.e., a setting where the goal is to find a policy that optimizes the usual objective of infinite-horizon discounted/average cost, while ensuring that an explicit risk constraint is satisfied. We introduce the risk-constrained RL framework, cover popular risk measures based on variance, conditional value-at-risk and cumulative prospect theory, and present a template for a risk-sensitive RL algorithm. We survey some of our recent work on this topic, covering problems encompassing discounted cost, average cost, and stochastic shortest path settings, together with the aforementioned risk measures in a constrained framework. This non-exhaustive survey is aimed at giving a flavor of the challenges involved in solving a risk-sensitive RL problem, and outlining some potential future research directions. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09126v1 |
http://arxiv.org/pdf/1810.09126v1.pdf | |
PWC | https://paperswithcode.com/paper/risk-sensitive-reinforcement-learning-a |
Repo | |
Framework | |
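Two of the risk measures listed in the abstract above, value at risk and conditional value at risk, can be estimated from sampled costs. The sketch below is a simple empirical estimator and is not taken from the survey: the function names are hypothetical, and the tail-average form of CVaR used here is one common convention among several.

```python
import math


def value_at_risk(costs, alpha):
    """Empirical alpha-quantile: smallest c such that the fraction of costs <= c is >= alpha."""
    ordered = sorted(costs)
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]


def conditional_value_at_risk(costs, alpha):
    """Mean of the costs at or above the empirical VaR (a simple tail average)."""
    var = value_at_risk(costs, alpha)
    tail = [c for c in costs if c >= var]
    return sum(tail) / len(tail)
```

In a risk-constrained setting like the one the abstract describes, an estimate of this kind could be compared against a threshold while the expected cost is optimized; how that is done is left to the surveyed algorithms.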
Cross-domain Deep Feature Combination for Bird Species Classification with Audio-visual Data
Title | Cross-domain Deep Feature Combination for Bird Species Classification with Audio-visual Data |
Authors | Bold Naranchimeg, Chao Zhang, Takuya Akashi |
Abstract | In the recent decade, many state-of-the-art algorithms for image classification as well as audio classification have achieved noticeable successes with the development of deep convolutional neural networks (CNNs). However, most of the works only exploit a single type of training data. In this paper, we present a study on classifying bird species by exploiting the combination of both visual (images) and audio (sounds) data using CNNs, which has been sparsely treated so far. Specifically, we propose CNN-based multimodal learning models with three types of fusion strategies (early, middle, late) to settle the issues of combining training data across domains. The advantage of our proposed method lies in the fact that we can utilize a CNN not only to extract features from image and audio data (spectrogram) but also to combine the features across modalities. In the experiment, we train and evaluate the network structure on the comprehensive CUB-200-2011 standard data set combined with our originally collected audio data set covering the same species. We observe that a model which utilizes the combination of both types of data outperforms models trained with only one type of data. We also show that transfer learning can significantly increase the classification performance. |
Tasks | Audio Classification, Bird Species Classification With Audio-Visual Data, Image Classification, Transfer Learning |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10199v1 |
http://arxiv.org/pdf/1811.10199v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-deep-feature-combination-for |
Repo | |
Framework | |
Like a Baby: Visually Situated Neural Language Acquisition
Title | Like a Baby: Visually Situated Neural Language Acquisition |
Authors | Alexander G. Ororbia, Ankur Mali, Matthew A. Kelly, David Reitter |
Abstract | We examine the benefits of visual context in training neural language models to perform next-word prediction. A multi-modal neural architecture is introduced that outperforms its equivalent trained on language alone with a 2% decrease in perplexity, even when no visual context is available at test. Fine-tuning the embeddings of a pre-trained state-of-the-art bidirectional language model (BERT) in the language modeling framework yields a 3.5% improvement. The advantage for training with visual context when testing without is robust across different languages (English, German and Spanish) and different models (GRU, LSTM, $\Delta$-RNN, as well as those that use BERT embeddings). Thus, language models perform better when they learn like a baby, i.e., in a multi-modal environment. This finding is compatible with the theory of situated cognition: language is inseparable from its physical context. |
Tasks | Language Acquisition, Language Modelling |
Published | 2018-05-29 |
URL | https://arxiv.org/abs/1805.11546v2 |
https://arxiv.org/pdf/1805.11546v2.pdf | |
PWC | https://paperswithcode.com/paper/visually-grounded-situated-learning-in-neural |
Repo | |
Framework | |
Proceedings 6th International Workshop on Theorem proving components for Educational software
Title | Proceedings 6th International Workshop on Theorem proving components for Educational software |
Authors | Pedro Quaresma, Walther Neuper |
Abstract | The 6th International Workshop on Theorem proving components for Educational software (ThEdu’17) was held in Gothenburg, Sweden, on 6 Aug 2017. It was associated with the conference CADE26. Topics of interest include: methods of automated deduction applied to checking students’ input; methods of automated deduction applied to prove post-conditions for particular problem solutions; combinations of deduction and computation enabling systems to propose next steps; automated provers specific for dynamic geometry systems; proof and proving in mathematics education. ThEdu’17 was a vibrant workshop, with one invited talk and eight contributions. It triggered the post-proceedings at hand. |
Tasks | Automated Theorem Proving |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00722v1 |
http://arxiv.org/pdf/1803.00722v1.pdf | |
PWC | https://paperswithcode.com/paper/proceedings-6th-international-workshop-on |
Repo | |
Framework | |
Importance Weighting and Variational Inference
Title | Importance Weighting and Variational Inference |
Authors | Justin Domke, Daniel Sheldon |
Abstract | Recent work used importance sampling ideas for better variational bounds on likelihoods. We clarify the applicability of these ideas to pure probabilistic inference, by showing the resulting Importance Weighted Variational Inference (IWVI) technique is an instance of augmented variational inference, thus identifying the looseness in previous work. Experiments confirm IWVI’s practicality for probabilistic inference. As a second contribution, we investigate inference with elliptical distributions, which improves accuracy in low dimensions, and convergence in high dimensions. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09034v3 |
http://arxiv.org/pdf/1808.09034v3.pdf | |
PWC | https://paperswithcode.com/paper/importance-weighting-and-variational |
Repo | |
Framework | |
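The importance-sampling bound mentioned in the abstract above averages M importance weights inside the logarithm: log((1/M) Σ_m p(z_m, x) / q(z_m)) with z_m drawn from q, whose expectation is a lower bound on log p(x). A minimal sketch of one Monte Carlo estimate follows. The toy model (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1), Gaussian q) and all function names are assumptions made for illustration, not details from the paper.

```python
import math
import random


def log_normal_pdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (value - mean) ** 2 / (2.0 * std * std)


def iw_bound_estimate(x, q_mean, q_std, num_samples, rng):
    """One Monte Carlo estimate of log((1/M) * sum_m p(z_m, x) / q(z_m)), z_m ~ q."""
    log_weights = []
    for _ in range(num_samples):
        z = rng.gauss(q_mean, q_std)
        # Assumed toy model: z ~ N(0, 1), x | z ~ N(z, 1).
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_weights.append(log_joint - log_normal_pdf(z, q_mean, q_std))
    top = max(log_weights)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(w - top) for w in log_weights) / num_samples)
```

For this toy model the exact posterior is N(x/2, 1/2), so calling `iw_bound_estimate(1.0, 0.5, math.sqrt(0.5), 5, random.Random(0))` returns log p(x) up to floating-point error, because every importance weight then equals p(x). With a poorer q, individual estimates scatter and their average falls below log p(x), with the gap shrinking as M grows.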
Training Faster by Separating Modes of Variation in Batch-normalized Models
Title | Training Faster by Separating Modes of Variation in Batch-normalized Models |
Authors | Mahdi M. Kalayeh, Mubarak Shah |
Abstract | Batch Normalization (BN) is essential to effectively train state-of-the-art deep Convolutional Neural Networks (CNNs). It normalizes inputs to the layers during training using the statistics of each mini-batch. In this work, we study BN from the viewpoint of Fisher kernels. We show that, assuming samples within a mini-batch are from the same probability density function, BN is identical to the Fisher vector of a Gaussian distribution. That means BN can be explained in terms of kernels that naturally emerge from the probability density function of the underlying data distribution. However, given the rectifying non-linearities employed in CNN architectures, the distribution of inputs to the layers shows heavy-tailed and asymmetric characteristics. Therefore, we propose approximating the underlying data distribution not with one, but with a mixture of Gaussian densities. Deriving the Fisher vector for a Gaussian Mixture Model (GMM) reveals that BN can be improved by independently normalizing with respect to the statistics of disentangled sub-populations. We refer to our proposed soft piecewise version of BN as Mixture Normalization (MN). Through an extensive set of experiments on CIFAR-10 and CIFAR-100, we show that MN not only effectively accelerates training of image classification and Generative Adversarial networks, but also reaches higher quality models. |
Tasks | Image Classification |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02892v2 |
http://arxiv.org/pdf/1806.02892v2.pdf | |
PWC | https://paperswithcode.com/paper/training-faster-by-separating-modes-of |
Repo | |
Framework | |
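The baseline operation the abstract above starts from, normalizing layer inputs with mini-batch statistics, can be sketched for a single scalar feature. This shows only plain batch normalization without the learned scale and shift; the function name and the `eps` default are assumptions, and the proposed Mixture Normalization is not reproduced here.

```python
import math


def batch_normalize(batch, eps=1e-5):
    """Normalize one scalar feature across a mini-batch to zero mean and unit variance."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    # eps keeps the division stable when the batch variance is close to zero.
    return [(x - mean) / math.sqrt(variance + eps) for x in batch]
```

The abstract's argument is that a single mean and variance describe the batch poorly when the inputs are heavy-tailed or asymmetric, which motivates normalizing with respect to several mixture components instead.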