Paper Group ANR 23
Sentiment Analysis in Scholarly Book Reviews. Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation. Algorithmic Songwriting with ALYSIA. On Horizontal and Vertical Separation in Hierarchical Text Classification. A universal tradeoff between power, precision and speed in physical communication. Infant directed speech is consistent with teaching. An Information-theoretic Approach to Machine-oriented Music Summarization. Explainable Restricted Boltzmann Machines for Collaborative Filtering. Fast Zero-Shot Image Tagging. A Theory of Interactive Debugging of Knowledge Bases in Monotonic Logics. Localization by Fusing a Group of Fingerprints via Multiple Antennas in Indoor Environment. Modeling the Evolution of Gene-Culture Divergence. Phone-based Metric as a Predictor for Basic Personality Traits. A Comparative Study of Ranking-based Semantics for Abstract Argumentation. Distributed stochastic optimization for deep learning (thesis).
Sentiment Analysis in Scholarly Book Reviews
Title | Sentiment Analysis in Scholarly Book Reviews |
Authors | Hussam Hamdan, Patrice Bellot, Frederic Bechet |
Abstract | So far, sentiment analysis has been tackled in several domains, such as restaurant and movie reviews, but it has not been studied in scholarly book reviews, which differ in review style and size. In this paper, we propose combining different features and feeding them to supervised classifiers that extract opinion target expressions and detect their polarities in scholarly book reviews. We construct a labeled corpus of French book reviews for training and evaluating our methods, and we also evaluate them on English restaurant reviews to measure their robustness across domains and languages. The evaluation shows that our methods are robust enough for both English restaurant reviews and French book reviews. |
Tasks | Sentiment Analysis |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01595v1 |
http://arxiv.org/pdf/1603.01595v1.pdf | |
PWC | https://paperswithcode.com/paper/sentiment-analysis-in-scholarly-book-reviews |
Repo | |
Framework | |
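As a rough illustration of the supervised polarity-detection step described in the abstract above, here is a minimal sketch using generic TF-IDF features and a linear classifier. The paper combines richer hand-engineered features and also extracts opinion target expressions, neither of which is reproduced here; the review snippets below are invented.

```python
# Minimal polarity-classification sketch with generic features (not the paper's feature set).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy review snippets with polarity labels (hypothetical data).
texts = [
    "The argumentation is rigorous and well sourced.",
    "The second chapter is repetitive and poorly edited.",
    "A balanced, carefully documented study.",
]
labels = ["positive", "negative", "positive"]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # word and bigram features
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["A poorly organized but occasionally insightful survey."]))
```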
Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation
Title | Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation |
Authors | Shashank Singh, Barnabás Póczos |
Abstract | Estimating entropy and mutual information consistently is important for many machine learning applications. The Kozachenko-Leonenko (KL) estimator (Kozachenko & Leonenko, 1987) is a widely used nonparametric estimator for the entropy of multivariate continuous random variables, as well as the basis of the mutual information estimator of Kraskov et al. (2004), perhaps the most widely used estimator of mutual information in this setting. Despite the practical importance of these estimators, major theoretical questions regarding their finite-sample behavior remain open. This paper proves finite-sample bounds on the bias and variance of the KL estimator, showing that it achieves the minimax convergence rate for certain classes of smooth functions. In proving these bounds, we analyze finite-sample behavior of k-nearest neighbors (k-NN) distance statistics (on which the KL estimator is based). We derive concentration inequalities for k-NN distances and a general expectation bound for statistics of k-NN distances, which may be useful for other analyses of k-NN methods. |
Tasks | |
Published | 2016-03-28 |
URL | http://arxiv.org/abs/1603.08578v2 |
http://arxiv.org/pdf/1603.08578v2.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-k-nearest-neighbor-distances-with |
Repo | |
Framework | |
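For reference, the estimator analysed above is the standard Kozachenko-Leonenko k-NN entropy estimator, Ĥ = ψ(N) − ψ(k) + log c_d + (d/N) Σ_i log ε_i, where ε_i is the distance from x_i to its k-th nearest neighbour and c_d is the volume of the unit d-ball. The paper's contribution is the finite-sample analysis, not the estimator itself; the implementation below is a plain sketch.

```python
# Sketch of the Kozachenko-Leonenko k-NN entropy estimator (nats).
import numpy as np
from scipy.special import digamma, gammaln
from scipy.spatial import cKDTree

def kl_entropy(x, k=3):
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    tree = cKDTree(x)
    # k+1 because the nearest neighbour of each point is the point itself.
    eps = tree.query(x, k=k + 1)[0][:, -1]
    log_cd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)  # log volume of the unit d-ball
    return digamma(n) - digamma(k) + log_cd + d * np.mean(np.log(eps))

# Sanity check: entropy of a standard 2-D Gaussian is log(2*pi*e) ~ 2.838 nats.
rng = np.random.default_rng(0)
print(kl_entropy(rng.standard_normal((5000, 2)), k=3))
```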
Algorithmic Songwriting with ALYSIA
Title | Algorithmic Songwriting with ALYSIA |
Authors | Margareta Ackerman, David Loker |
Abstract | This paper introduces ALYSIA: Automated LYrical SongwrIting Application. ALYSIA is based on a machine learning model using Random Forests, and we discuss its success at pitch and rhythm prediction. Next, we show how ALYSIA was used to create original pop songs that were subsequently recorded and produced. Finally, we discuss our vision for the future of Automated Songwriting for both co-creative and autonomous systems. |
Tasks | |
Published | 2016-12-04 |
URL | http://arxiv.org/abs/1612.01058v1 |
http://arxiv.org/pdf/1612.01058v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-songwriting-with-alysia |
Repo | |
Framework | |
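A hedged sketch of the kind of model the abstract describes: a Random Forest predicting the next pitch from contextual lyric features. The features, encoding, and toy data below are illustrative assumptions, not the ones used in ALYSIA.

```python
# Toy Random Forest pitch predictor (illustrative features and data, not ALYSIA's).
from sklearn.ensemble import RandomForestClassifier

# Each row: (syllable_stress, syllable_position_in_word, previous_pitch_midi)
X = [
    [1, 0, 60], [0, 1, 62], [1, 0, 64],
    [0, 1, 62], [1, 0, 60], [0, 1, 59],
]
y = [62, 64, 65, 60, 59, 60]  # next pitch (MIDI number), toy labels

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)
print(model.predict([[1, 0, 62]]))  # predicted next pitch for a stressed syllable
```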
On Horizontal and Vertical Separation in Hierarchical Text Classification
Title | On Horizontal and Vertical Separation in Hierarchical Text Classification |
Authors | Mostafa Dehghani, Hosein Azarbonyad, Jaap Kamps, Maarten Marx |
Abstract | Hierarchy is a common and effective way of organizing data and representing their relationships at different levels of abstraction. However, hierarchical data dependencies cause difficulties in the estimation of “separable” models that can distinguish between the entities in the hierarchy. Extracting separable models of hierarchical entities requires us to take their relative position into account and to consider the different types of dependencies in the hierarchy. In this paper, we present an investigation of the effect of separability in text-based entity classification and argue that in hierarchical classification, a separation property should be established between entities not only in the same layer, but also in different layers. Our main findings are the following. First, we analyse the importance of separability of the data representation in the task of classification and, based on that, introduce a “Strong Separation Principle” for optimizing the expected effectiveness of classifier decisions based on the separation property. Second, we present Hierarchical Significant Words Language Models (HSWLM) which capture all, and only, the essential features of hierarchical entities according to their relative position in the hierarchy, resulting in horizontally and vertically separable models. Third, we validate our claims on real-world data and demonstrate how HSWLM improves the accuracy of classification and provides transferable models over time. Although the discussion in this paper focuses on the classification problem, the models are applicable to any information access task on data that has, or can be mapped to, a hierarchical structure. |
Tasks | Text Classification |
Published | 2016-09-02 |
URL | http://arxiv.org/abs/1609.00514v1 |
http://arxiv.org/pdf/1609.00514v1.pdf | |
PWC | https://paperswithcode.com/paper/on-horizontal-and-vertical-separation-in |
Repo | |
Framework | |
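As a loose illustration of the “significant words” idea behind HSWLM, the sketch below keeps only terms whose probability in an entity's language model clearly exceeds both a background model and the sibling entities' model. The paper's actual estimation procedure is iterative and more principled; the margin threshold and toy documents here are invented.

```python
# One-shot "significant words" sketch (not the HSWLM estimation procedure).
from collections import Counter

def unigram_lm(docs):
    counts = Counter(w for d in docs for w in d.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def significant_words(entity_docs, sibling_docs, background_docs, margin=1.5):
    p_e = unigram_lm(entity_docs)
    p_s = unigram_lm(sibling_docs)
    p_b = unigram_lm(background_docs)
    return {
        w: p for w, p in p_e.items()
        if p > margin * p_b.get(w, 1e-9) and p > margin * p_s.get(w, 1e-9)
    }

entity = ["neural networks learn representations", "deep networks need data"]
siblings = ["kernel methods and margins", "decision trees split features"]
background = entity + siblings
print(significant_words(entity, siblings, background))
```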
A universal tradeoff between power, precision and speed in physical communication
Title | A universal tradeoff between power, precision and speed in physical communication |
Authors | Subhaneil Lahiri, Jascha Sohl-Dickstein, Surya Ganguli |
Abstract | Maximizing the speed and precision of communication while minimizing power dissipation is a fundamental engineering design goal. Also, biological systems achieve remarkable speed, precision and power efficiency using poorly understood physical design principles. Powerful theories like information theory and thermodynamics do not provide general limits on power, precision and speed. Here we go beyond these classical theories to prove that the product of precision and speed is universally bounded by power dissipation in any physical communication channel whose dynamics is faster than that of the signal. Moreover, our derivation involves a novel connection between friction and information geometry. These results may yield insight into both the engineering design of communication devices and the structure and function of biological signaling systems. |
Tasks | |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07758v1 |
http://arxiv.org/pdf/1603.07758v1.pdf | |
PWC | https://paperswithcode.com/paper/a-universal-tradeoff-between-power-precision |
Repo | |
Framework | |
Infant directed speech is consistent with teaching
Title | Infant directed speech is consistent with teaching |
Authors | Baxter S. Eaves Jr., Naomi H. Feldman, Thomas L. Griffiths, Patrick Shafto |
Abstract | Infant-directed speech (IDS) has distinctive properties that differ from adult-directed speech (ADS). Why it has these properties – and whether they are intended to facilitate language learning – is a matter of contention. We argue that much of this disagreement stems from the lack of a formal, guiding theory of how phonetic categories should best be taught to infant-like learners. In the absence of such a theory, researchers have relied on intuitions about learning to guide the argument. We use a formal theory of teaching, validated through experiments in other domains, as the basis for a detailed analysis of whether IDS is well-designed for teaching phonetic categories. Using the theory, we generate ideal data for teaching phonetic categories in English. We qualitatively compare the simulated teaching data with human IDS, finding that the teaching data exhibit many features of IDS, including some that have been taken as evidence that IDS is not for teaching. The simulated data reveal potential pitfalls for experimentalists exploring the role of IDS in language learning. Focusing on different formants and phoneme sets leads to different conclusions, and the benefit of the teaching data to learners is not apparent until a sufficient number of examples have been provided. Finally, we investigate transfer of IDS to learning ADS. The teaching data improve classification of ADS data, but only for the learner they were generated to teach, not universally across all classes of learner. This research offers a theoretically grounded framework that empowers experimentalists to systematically evaluate whether IDS is for teaching. |
Tasks | |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.01175v1 |
http://arxiv.org/pdf/1606.01175v1.pdf | |
PWC | https://paperswithcode.com/paper/infant-directed-speech-is-consistent-with |
Repo | |
Framework | |
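A much-simplified, hypothetical sketch of what “ideal teaching data” for Gaussian phonetic categories might look like: prefer tokens that make the intended category most identifiable relative to its competitors. The paper's teaching model is Bayesian and considerably richer; the category means and covariances below are made up.

```python
# Toy "teachability" score over invented vowel categories in (F1, F2) formant space.
import numpy as np
from scipy.stats import multivariate_normal

categories = {  # hypothetical parameters, Hz
    "i": multivariate_normal([300, 2300], np.diag([50**2, 150**2])),
    "a": multivariate_normal([750, 1200], np.diag([80**2, 120**2])),
    "u": multivariate_normal([320, 800],  np.diag([50**2, 100**2])),
}

def teaching_score(x, target):
    """How strongly token x points to `target` rather than the other categories."""
    p_target = categories[target].pdf(x)
    p_others = sum(d.pdf(x) for name, d in categories.items() if name != target)
    return p_target / (p_target + p_others + 1e-300)

candidates = categories["i"].rvs(size=200, random_state=1)
best = max(candidates, key=lambda x: teaching_score(x, "i"))
print("most 'teachable' /i/ token (F1, F2):", best)
```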
An Information-theoretic Approach to Machine-oriented Music Summarization
Title | An Information-theoretic Approach to Machine-oriented Music Summarization |
Authors | Francisco Raposo, David Martins de Matos, Ricardo Ribeiro |
Abstract | Music summarization allows for higher efficiency in processing, storage, and sharing of datasets. Machine-oriented approaches, being agnostic to human consumption, optimize these aspects even further. Such summaries have already been successfully validated in some MIR tasks. We now generalize previous conclusions by evaluating the impact of generic summarization of music from a probabilistic perspective. We estimate Gaussian distributions for original and summarized songs and compute their relative entropy, in order to measure information loss incurred by summarization. Our results suggest that relative entropy is a good predictor of summarization performance in the context of tasks relying on a bag-of-features model. Based on this observation, we further propose a straightforward yet expressive summarizer, which minimizes relative entropy with respect to the original song, that objectively outperforms previous methods and is better suited to avoid potential copyright issues. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02350v6 |
http://arxiv.org/pdf/1612.02350v6.pdf | |
PWC | https://paperswithcode.com/paper/an-information-theoretic-approach-to-machine |
Repo | |
Framework | |
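The core measurement in the abstract above, fitting a Gaussian to the frame-level features of the original song and of its summary and computing their relative entropy, can be sketched as follows. Feature extraction is out of scope; `orig_feats` and `summ_feats` are stand-in arrays, and the “summary” here is a naive prefix rather than the paper's summarizer.

```python
# KL divergence between Gaussians fitted to original and summarized feature frames.
import numpy as np

def gaussian_kl(x0, x1, eps=1e-6):
    """KL( N(mu0, S0) || N(mu1, S1) ) for Gaussians fitted to x0 and x1."""
    mu0, mu1 = x0.mean(axis=0), x1.mean(axis=0)
    d = x0.shape[1]
    s0 = np.cov(x0, rowvar=False) + eps * np.eye(d)  # regularize for stability
    s1 = np.cov(x1, rowvar=False) + eps * np.eye(d)
    s1_inv = np.linalg.inv(s1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(s1_inv @ s0)
                  + diff @ s1_inv @ diff
                  - d
                  + np.log(np.linalg.det(s1) / np.linalg.det(s0)))

# Toy stand-ins for MFCC-like feature matrices of a song and its summary.
rng = np.random.default_rng(0)
orig_feats = rng.standard_normal((2000, 12))
summ_feats = orig_feats[:400]          # a (naive) prefix "summary"
print(gaussian_kl(orig_feats, summ_feats))
```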
Explainable Restricted Boltzmann Machines for Collaborative Filtering
Title | Explainable Restricted Boltzmann Machines for Collaborative Filtering |
Authors | Behnoush Abdollahi, Olfa Nasraoui |
Abstract | The most accurate recommender systems are black-box models, hiding the reasoning behind their recommendations. Yet explanations have been shown to increase the user’s trust in the system, in addition to providing other benefits such as scrutability, meaning the ability to verify the validity of recommendations. This gap between accuracy and transparency or explainability has generated an interest in automated explanation generation methods. Restricted Boltzmann Machines (RBM) are accurate models for collaborative filtering (CF) that also lack interpretability. In this paper, we focus on RBM-based collaborative filtering recommendations, and further assume the absence of any additional data source, such as item content or user attributes. We thus propose a new Explainable RBM technique that computes the top-n recommendation list from items that are explainable. Experimental results show that our method is effective in generating accurate and explainable recommendations. |
Tasks | Recommendation Systems |
Published | 2016-06-22 |
URL | http://arxiv.org/abs/1606.07129v1 |
http://arxiv.org/pdf/1606.07129v1.pdf | |
PWC | https://paperswithcode.com/paper/explainable-restricted-boltzmann-machines-for |
Repo | |
Framework | |
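A hedged sketch of the filtering idea: restrict the top-n list to items with a sufficiently high explainability score computed from the rating matrix alone. The particular score used here (the fraction of a user's nearest neighbours who rated the item), the placeholder RBM scores, and the toy ratings are illustrative assumptions, not the paper's exact formulation.

```python
# Toy explainability-filtered recommendation (RBM scores are a placeholder).
import numpy as np

ratings = np.array([   # users x items, 0 = unrated (toy data)
    [5, 0, 3, 0, 1],
    [4, 0, 0, 1, 0],
    [0, 5, 4, 0, 0],
    [5, 4, 0, 0, 2],
])

def explainability(user, item, k=2):
    """Fraction of the user's k most similar users who rated the item."""
    rated = (ratings > 0).astype(float)
    sims = rated @ rated[user]                 # co-rating counts as a crude similarity
    sims[user] = -1                            # exclude the user themselves
    neighbours = np.argsort(sims)[-k:]
    return rated[neighbours, item].mean()

def recommend(user, scores, top_n=2, threshold=0.5):
    """Keep only items whose explainability exceeds the threshold, then rank by score."""
    unseen = [i for i in range(ratings.shape[1]) if ratings[user, i] == 0]
    explainable = [i for i in unseen if explainability(user, i) >= threshold]
    return sorted(explainable, key=lambda i: scores[i], reverse=True)[:top_n]

rbm_scores = np.array([0.9, 0.8, 0.7, 0.4, 0.6])  # placeholder for RBM predictions
print(recommend(user=1, scores=rbm_scores))
```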
Fast Zero-Shot Image Tagging
Title | Fast Zero-Shot Image Tagging |
Authors | Yang Zhang, Boqing Gong, Mubarak Shah |
Abstract | The well-known word analogy experiments show that the recent word vectors capture fine-grained linguistic regularities in words by linear vector offsets, but it is unclear how well the simple vector offsets can encode visual regularities over words. We study a particular image-word relevance relation in this paper. Our results show that the word vectors of relevant tags for a given image rank ahead of the irrelevant tags, along a principal direction in the word vector space. Inspired by this observation, we propose to solve image tagging by estimating the principal direction for an image. Particularly, we exploit linear mappings and nonlinear deep neural networks to approximate the principal direction from an input image. We arrive at a quite versatile tagging model. It runs fast given a test image, in constant time w.r.t. the training set size. It not only gives superior performance for the conventional tagging task on the NUS-WIDE dataset, but also outperforms competitive baselines on annotating images with previously unseen tags. |
Tasks | |
Published | 2016-05-31 |
URL | http://arxiv.org/abs/1605.09759v1 |
http://arxiv.org/pdf/1605.09759v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-zero-shot-image-tagging |
Repo | |
Framework | |
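A hedged sketch of the tagging scheme described above: learn a linear mapping from image features into the word-vector space, treat its output as the image's principal direction, and rank candidate tag vectors by their projection onto it. The image features, word vectors, and least-squares training target below are random stand-ins, not the paper's setup.

```python
# Linear principal-direction tagging sketch with synthetic features and embeddings.
import numpy as np

rng = np.random.default_rng(0)
d_img, d_word, n_tags, n_train = 128, 50, 200, 500

word_vecs = rng.standard_normal((n_tags, d_word))   # pretrained tag embeddings (stand-in)
X = rng.standard_normal((n_train, d_img))           # image features (stand-in)
Y = rng.standard_normal((n_train, d_word))          # per-image target directions (stand-in)

# Least-squares estimate of the linear mapping W: image feature -> direction.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def tag_image(x, top_k=5):
    direction = x @ W                                # predicted principal direction
    scores = word_vecs @ direction                   # rank tags by projection
    return np.argsort(scores)[::-1][:top_k]          # indices of top-ranked tags

print(tag_image(rng.standard_normal(d_img)))
```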
A Theory of Interactive Debugging of Knowledge Bases in Monotonic Logics
Title | A Theory of Interactive Debugging of Knowledge Bases in Monotonic Logics |
Authors | Patrick Rodler |
Abstract | A broad variety of knowledge-based applications such as recommender, expert, planning or configuration systems usually operate on the basis of knowledge represented by means of some logical language. Such a logical knowledge base (KB) enables intelligent behavior of such systems by allowing them to automatically reason, answer queries of interest or solve complex real-world problems. Nowadays, where information acquisition comes at low cost and often happens automatically, the applied KBs are continuously growing in size, information content and complexity. These developments foster the emergence of errors in these KBs and thus pose a significant challenge to all people and tools involved in KB evolution, maintenance and application. If some minimal quality criteria such as logical consistency are not met by a KB, it becomes useless for knowledge-based applications. To guarantee the compliance of KBs with given requirements, (non-interactive) KB debuggers have been proposed. These, however, often cannot localize all potential faults, suggest too large or incorrect modifications of the faulty KB, or suffer from poor scalability due to the inherent complexity of the KB debugging problem. As a remedy to these issues, and building on a well-founded theoretical basis, this work proposes complete, sound and optimal methods for the interactive debugging of KBs that suggest the one (minimally invasive) error correction of the faulty KB that yields a repaired KB with exactly the intended semantics. Users, e.g. domain experts, are involved in the debugging process by answering automatically generated queries about whether some given statements must or must not hold in the domain that should be modeled by the problematic KB at hand. |
Tasks | |
Published | 2016-09-20 |
URL | http://arxiv.org/abs/1609.06375v1 |
http://arxiv.org/pdf/1609.06375v1.pdf | |
PWC | https://paperswithcode.com/paper/a-theory-of-interactive-debugging-of |
Repo | |
Framework | |
Localization by Fusing a Group of Fingerprints via Multiple Antennas in Indoor Environment
Title | Localization by Fusing a Group of Fingerprints via Multiple Antennas in Indoor Environment |
Authors | Xiansheng Guo, Nirwan Ansari |
Abstract | Most existing fingerprint-based indoor localization approaches rely on a single fingerprint, such as received signal strength (RSS), channel impulse response (CIR), or signal subspace. However, the localization accuracy obtained by single-fingerprint approaches is rather susceptible to the changing environment, multi-path, and non-line-of-sight (NLOS) propagation. Furthermore, building the fingerprints is a very time-consuming process. In this paper, we propose a novel localization framework by Fusing A Group Of fingerprinTs (FAGOT) via multiple antennas for the indoor environment. We first build a GrOup Of Fingerprints (GOOF), which includes five different fingerprints, namely, RSS, covariance matrix, signal subspace, fractional low order moment, and fourth-order cumulant, obtained by different transformations of the received signals from multiple antennas in the offline stage. Then, we design parallel GOOF multiple classifiers based on AdaBoost (GOOF-AdaBoost) to train each of these fingerprints in parallel as five strong classifiers. In the online stage, we feed the corresponding transformations of the real measurements into these strong classifiers to obtain independent decisions. Finally, we propose an efficient combination fusion algorithm, namely, the MUltiple Classifiers mUltiple Samples (MUCUS) fusion algorithm, to improve the accuracy of localization by combining the predictions of multiple classifiers with different samples. Compared with single-fingerprint approaches, the prediction probability of our proposed approach is improved significantly, and the cost of building fingerprints can also be reduced drastically. We demonstrate the feasibility and performance of the proposed algorithm through extensive simulations as well as via real experimental data using a Universal Software Radio Peripheral (USRP) platform with four antennas. |
Tasks | |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00661v2 |
http://arxiv.org/pdf/1609.00661v2.pdf | |
PWC | https://paperswithcode.com/paper/localization-by-fusing-a-group-of |
Repo | |
Framework | |
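A rough sketch of the overall FAGOT structure: train one AdaBoost classifier per fingerprint type and fuse their predictions. The three “fingerprints” below are synthetic stand-ins for the paper's five, and the fusion rule (simple probability averaging) replaces the MUCUS algorithm proposed in the paper.

```python
# Per-fingerprint AdaBoost classifiers with a simple probability-averaging fusion.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
n_samples, n_locations = 300, 4
labels = rng.integers(0, n_locations, n_samples)

# Toy stand-ins for different fingerprints extracted from the same measurements.
fingerprints = {
    "rss":        rng.standard_normal((n_samples, 8)) + labels[:, None],
    "covariance": rng.standard_normal((n_samples, 16)) + 0.5 * labels[:, None],
    "subspace":   rng.standard_normal((n_samples, 12)) + 0.8 * labels[:, None],
}

classifiers = {
    name: AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, labels)
    for name, X in fingerprints.items()
}

def fused_predict(test_fingerprints):
    """Average per-fingerprint class probabilities and pick the best location."""
    probs = np.mean(
        [classifiers[name].predict_proba(X) for name, X in test_fingerprints.items()],
        axis=0,
    )
    return probs.argmax(axis=1)

test = {name: X[:10] for name, X in fingerprints.items()}
print(fused_predict(test))
```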
Modeling the Evolution of Gene-Culture Divergence
Title | Modeling the Evolution of Gene-Culture Divergence |
Authors | Chris Marriott, Jobran Chebib |
Abstract | We present a model for evolving agents using both genetic and cultural inheritance mechanisms. Within each agent our model maintains two distinct information stores we call the genome and the memome. Processes of adaptation are modeled as evolutionary processes at each level of adaptation (phylogenetic, ontogenetic, sociogenetic). We review relevant competing models and we show how our model improves on previous attempts to model genetic and cultural evolutionary processes. In particular we argue our model can achieve divergent gene-culture co-evolution. |
Tasks | |
Published | 2016-04-25 |
URL | http://arxiv.org/abs/1604.07108v1 |
http://arxiv.org/pdf/1604.07108v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-the-evolution-of-gene-culture |
Repo | |
Framework | |
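A minimal, hypothetical sketch of the two-store agent structure described in the abstract: each agent carries a genome (inherited at birth, subject to mutation) and a memome (acquired socially during its lifetime). The update rules below are deliberately simplistic and not the paper's model.

```python
# Toy agent with separate genetic and cultural inheritance channels.
import random

class Agent:
    def __init__(self, genome, memome=None):
        self.genome = genome          # fixed at birth, subject to mutation
        self.memome = memome or []    # accumulated culturally, can change over life

    def reproduce(self, mutation_rate=0.01):
        child_genome = [
            (g if random.random() > mutation_rate else random.random())
            for g in self.genome
        ]
        return Agent(child_genome)    # genes are inherited; memes are not

    def imitate(self, other):
        if other.memome:
            self.memome.append(random.choice(other.memome))  # cultural transmission

parent = Agent(genome=[random.random() for _ in range(5)], memome=["song-A"])
child = parent.reproduce()
child.imitate(parent)
print(child.genome, child.memome)
```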
Phone-based Metric as a Predictor for Basic Personality Traits
Title | Phone-based Metric as a Predictor for Basic Personality Traits |
Authors | Bjarke Mønsted, Anders Mollgaard, Joachim Mathiesen |
Abstract | Basic personality traits are typically assessed through questionnaires. Here we consider phone-based metrics as a way to assess personality traits. We use data from smartphones with custom data-collection software distributed to 730 individuals. The data includes information about location, physical motion, face-to-face contacts, online social network friends, text messages and calls. The data is further complemented by questionnaire-based data on basic personality traits. From the phone-based metrics, we define a set of behavioral variables, which we use to predict basic personality traits. We find that, predominantly, the Big Five personality traits extraversion and, to some degree, neuroticism are strongly expressed in our data. As an alternative to the Big Five, we investigate whether other linear combinations of the 44 questions underlying the Big Five Inventory are more predictable. In a tertile classification problem, basic dimensionality reduction techniques, such as independent component analysis, increase the predictability relative to the baseline from 11% to 23%. Finally, with a supervised linear classifier, we were able to further improve this predictability to 33%. In all cases, the most predictable projections were dominated by questions related to extraversion and neuroticism. In addition, our findings indicate that the scoring system underlying the Big Five Inventory disregards part of the information available in the 44 questions. |
Tasks | Dimensionality Reduction |
Published | 2016-04-16 |
URL | http://arxiv.org/abs/1604.04696v1 |
http://arxiv.org/pdf/1604.04696v1.pdf | |
PWC | https://paperswithcode.com/paper/phone-based-metric-as-a-predictor-for-basic |
Repo | |
Framework | |
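A hedged sketch of the analysis pipeline suggested by the abstract: reduce behavioural variables with independent component analysis, then train a linear classifier on tertile-coded trait scores. All data below is synthetic; the study's phone-derived behavioural variables and questionnaire items are not reproduced.

```python
# ICA + linear classifier for tertile trait prediction on synthetic data.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_people, n_behav = 730, 30
X = rng.standard_normal((n_people, n_behav))                   # behavioural variables (synthetic)
trait = X[:, :3].sum(axis=1) + rng.standard_normal(n_people)   # synthetic trait score
y = np.digitize(trait, np.quantile(trait, [1 / 3, 2 / 3]))     # tertile labels: 0, 1, 2

model = make_pipeline(
    FastICA(n_components=10, random_state=0),
    LogisticRegression(max_iter=1000),
)
print(cross_val_score(model, X, y, cv=5).mean())  # chance level would be ~1/3
```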
A Comparative Study of Ranking-based Semantics for Abstract Argumentation
Title | A Comparative Study of Ranking-based Semantics for Abstract Argumentation |
Authors | Elise Bonzon, Jérôme Delobelle, Sébastien Konieczny, Nicolas Maudet |
Abstract | Argumentation is a process of evaluating and comparing a set of arguments. One way to compare them is to use a ranking-based semantics, which rank-orders arguments from the most to the least acceptable. Recently, a number of such semantics have been proposed independently, often associated with some desirable properties. However, there is no comparative study which takes a broader perspective. This is what we propose in this work. We provide a general comparison of all these semantics with respect to the proposed properties, which allows us to highlight the differences in behavior between the existing semantics. |
Tasks | Abstract Argumentation |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.01059v1 |
http://arxiv.org/pdf/1602.01059v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comparative-study-of-ranking-based |
Repo | |
Framework | |
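As a concrete example of the kind of semantics being compared, here is a sketch of one well-known ranking-based semantics, the h-categoriser, in which an argument's strength is 1 / (1 + the sum of its attackers' strengths), computed to a fixed point. The toy attack graph is invented, and this is only one of the many semantics the paper surveys.

```python
# h-categoriser ranking over a toy abstract argumentation framework.
def h_categoriser(arguments, attacks, iterations=100):
    """attacks: dict mapping each argument to the set of arguments attacking it."""
    strength = {a: 1.0 for a in arguments}
    for _ in range(iterations):
        strength = {
            a: 1.0 / (1.0 + sum(strength[b] for b in attacks.get(a, ())))
            for a in arguments
        }
    return strength

# Toy framework: c attacks b, b attacks a; d is unattacked.
args = ["a", "b", "c", "d"]
attacks = {"a": {"b"}, "b": {"c"}}
ranking = h_categoriser(args, attacks)
print(sorted(ranking, key=ranking.get, reverse=True))  # most to least acceptable
```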
Distributed stochastic optimization for deep learning (thesis)
Title | Distributed stochastic optimization for deep learning (thesis) |
Authors | Sixin Zhang |
Abstract | We study the problem of how to distribute the training of large-scale deep learning models in the parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with the existing ADMM method in the round-robin scheme. An asynchronous and momentum variant of the EASGD method is applied to train deep convolutional neural networks for image classification on the CIFAR and ImageNet datasets. Our approach accelerates the training and furthermore achieves better test accuracy. It also requires a much smaller amount of communication than other common baseline approaches such as the DOWNPOUR method. We then investigate the limit in speedup of the initial and the asymptotic phase of the mini-batch SGD, the momentum SGD, and the EASGD methods. We find that the spread of the input data distribution has a big impact on their initial convergence rate and stability region. We also find a surprising connection between the momentum SGD and the EASGD method with a negative moving average rate. A non-convex case is also studied to understand when EASGD can get trapped by a saddle point. Finally, we scale up the EASGD method by using a tree-structured network topology. We empirically show its advantages and challenges. We also establish a connection between the EASGD and the DOWNPOUR method with the classical Jacobi and the Gauss-Seidel method, thus unifying a class of distributed stochastic optimization methods. |
Tasks | Image Classification, Stochastic Optimization |
Published | 2016-05-07 |
URL | http://arxiv.org/abs/1605.02216v1 |
http://arxiv.org/pdf/1605.02216v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-stochastic-optimization-for-deep |
Repo | |
Framework | |
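For concreteness, here is a sketch of the elastic averaging update at the heart of EASGD in the synchronous, single-process setting: each worker takes a gradient step plus an elastic pull toward a center variable, and the center moves toward the workers. The objective is a toy quadratic; the thesis applies this scheme to deep networks trained across machines.

```python
# Synchronous EASGD iteration on a toy quadratic objective.
import numpy as np

def grad(x):                       # gradient of the toy objective f(x) = 0.5 * ||x - 1||^2
    return x - 1.0

n_workers, dim = 4, 10
eta, rho = 0.1, 0.5                # learning rate and elastic penalty
alpha = eta * rho                  # moving rate used in the center update

rng = np.random.default_rng(0)
workers = [rng.standard_normal(dim) for _ in range(n_workers)]
center = np.zeros(dim)

for _ in range(200):
    diffs = [x - center for x in workers]
    # Worker update: local gradient step plus elastic pull toward the center.
    workers = [x - eta * grad(x) - alpha * d for x, d in zip(workers, diffs)]
    # Center update: move toward the (previous) worker iterates.
    center = center + alpha * sum(diffs)

print(np.round(center[:3], 3))     # should approach the minimizer at 1.0
```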