Paper Group ANR 285
A Gated Self-attention Memory Network for Answer Selection
Title | A Gated Self-attention Memory Network for Answer Selection |
Authors | Tuan Lai, Quan Hung Tran, Trung Bui, Daisuke Kihara |
Abstract | Answer selection is an important research problem, with applications in many areas. Previous deep learning based approaches for the task mainly adopt the Compare-Aggregate architecture that performs word-level comparison followed by aggregation. In this work, we take a departure from the popular Compare-Aggregate architecture, and instead, propose a new gated self-attention memory network for the task. Combined with a simple transfer learning technique from a large-scale online corpus, our model outperforms previous methods by a large margin, achieving new state-of-the-art results on two standard answer selection datasets: TrecQA and WikiQA. |
Tasks | Answer Selection, Transfer Learning |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.09696v1 |
https://arxiv.org/pdf/1909.09696v1.pdf | |
PWC | https://paperswithcode.com/paper/a-gated-self-attention-memory-network-for |
Repo | |
Framework | |
Convergence Rates for Gaussian Mixtures of Experts
Title | Convergence Rates for Gaussian Mixtures of Experts |
Authors | Nhat Ho, Chiao-Yu Yang, Michael I. Jordan |
Abstract | We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks. We establish the convergence rates of the maximum likelihood estimation (MLE) for these models. Our proof technique is based on a novel notion of algebraic independence of the expert functions. Drawing on optimal transport theory, we establish a connection between the algebraic independence and a certain class of partial differential equations (PDEs). Exploiting this connection allows us to derive convergence rates and minimax lower bounds for parameter estimation. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04377v1 |
https://arxiv.org/pdf/1907.04377v1.pdf | |
PWC | https://paperswithcode.com/paper/convergence-rates-for-gaussian-mixtures-of |
Repo | |
Framework | |
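For orientation, the model class named in the abstract above can be written out as follows. This is a sketch of the usual form of a Gaussian mixture of experts with covariate-free gating, not a formula quoted from the paper; the symbols $K$, $\pi_k$, $h_1$, $h_2$, $\theta_{1k}$ and $\theta_{2k}$ are notation assumed here for illustration.

```latex
% Conditional density of the response y given the covariate x.
% Covariate-free gating: the mixing weights \pi_k do not depend on x.
% h_1 and h_2 are the expert functions for the mean and the variance.
p(y \mid x) \;=\; \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}\!\bigl(y \,\bigm|\, h_1(x, \theta_{1k}),\; h_2(x, \theta_{2k})\bigr),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .
```

Under this notation, "over-specified" means that the fitted number of components $K$ exceeds the true number of components that generated the data.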
Semantic categories of artifacts and animals reflect efficient coding
Title | Semantic categories of artifacts and animals reflect efficient coding |
Authors | Noga Zaslavsky, Terry Regier, Naftali Tishby, Charles Kemp |
Abstract | It has been argued that semantic categories across languages reflect pressure for efficient communication. Recently, this idea has been cast in terms of a general information-theoretic principle of efficiency, the Information Bottleneck (IB) principle, and it has been shown that this principle accounts for the emergence and evolution of named color categories across languages, including soft structure and patterns of inconsistent naming. However, it is not yet clear to what extent this account generalizes to semantic domains other than color. Here we show that it generalizes to two qualitatively different semantic domains: names for containers, and for animals. First, we show that container naming in Dutch and French is near-optimal in the IB sense, and that IB broadly accounts for soft categories and inconsistent naming patterns in both languages. Second, we show that a hierarchy of animal categories derived from IB captures cross-linguistic tendencies in the growth of animal taxonomies. Taken together, these findings suggest that fundamental information-theoretic principles of efficient coding may shape semantic categories across languages and across domains. |
Tasks | |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04562v1 |
https://arxiv.org/pdf/1905.04562v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-categories-of-artifacts-and-animals |
Repo | |
Framework | |
AdaCare: Explainable Clinical Health Status Representation Learning via Scale-Adaptive Feature Extraction and Recalibration
Title | AdaCare: Explainable Clinical Health Status Representation Learning via Scale-Adaptive Feature Extraction and Recalibration |
Authors | Liantao Ma, Junyi Gao, Yasha Wang, Chaohe Zhang, Jiangtao Wang, Wenjie Ruan, Wen Tang, Xin Gao, Xinyu Ma |
Abstract | Deep learning-based health status representation learning and clinical prediction have raised much research interest in recent years. Existing models have shown superior performance, but there are still several major issues that have not been fully taken into consideration. First, the historical variation pattern of a biomarker across diverse time scales plays a vital role in indicating the health status, but it has not been explicitly extracted by existing works. Second, the key factors that strongly indicate the health risk differ among patients. It is still challenging to adaptively make use of the features for patients in diverse conditions. Third, using prediction models as black boxes limits their reliability in clinical practice. However, none of the existing works can provide satisfying interpretability while also achieving high prediction performance. In this work, we develop a general health status representation learning model, named AdaCare. It can capture the long and short-term variations of biomarkers as clinical features to depict the health status in multiple time scales. It also models the correlation between clinical features to enhance the ones which strongly indicate the health status and thus can maintain a state-of-the-art performance in terms of prediction accuracy while providing qualitative interpretability. We conduct a health risk prediction experiment on two real-world datasets. Experiment results indicate that AdaCare outperforms state-of-the-art approaches and provides effective interpretability, which is verifiable by clinical experts. |
Tasks | Representation Learning |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12205v1 |
https://arxiv.org/pdf/1911.12205v1.pdf | |
PWC | https://paperswithcode.com/paper/adacare-explainable-clinical-health-status |
Repo | |
Framework | |
Boosted Attention: Leveraging Human Attention for Image Captioning
Title | Boosted Attention: Leveraging Human Attention for Image Captioning |
Authors | Shi Chen, Qi Zhao |
Abstract | Visual attention has shown usefulness in image captioning, with the goal of enabling a caption model to selectively focus on regions of interest. Existing models typically rely on top-down language information and learn attention implicitly by optimizing the captioning objectives. While somewhat effective, the learned top-down attention can fail to focus on correct regions of interest without direct supervision of attention. Inspired by the human visual system, which is driven not only by task-specific top-down signals but also by visual stimuli, we propose in this work to use both types of attention for image captioning. In particular, we highlight the complementary nature of the two types of attention and develop a model (Boosted Attention) to integrate them for image captioning. We validate the proposed approach with state-of-the-art performance across various evaluation metrics. |
Tasks | Image Captioning |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1904.00767v1 |
http://arxiv.org/pdf/1904.00767v1.pdf | |
PWC | https://paperswithcode.com/paper/boosted-attention-leveraging-human-attention-1 |
Repo | |
Framework | |
Personalizing Search Results Using Hierarchical RNN with Query-aware Attention
Title | Personalizing Search Results Using Hierarchical RNN with Query-aware Attention |
Authors | Songwei Ge, Zhicheng Dou, Zhengbao Jiang, Jian-Yun Nie, Ji-Rong Wen |
Abstract | Search results personalization has become an effective way to improve the quality of search engines. Previous studies extracted information such as past clicks, user topical interests, query click entropy and so on to tailor the original ranking. However, few studies have taken into account the sequential information underlying previous queries and sessions. Intuitively, the order of issued queries is important in inferring the real user interests, and more recent sessions should provide more reliable personal signals than older sessions. In addition, the previous search history and user behaviors should influence the personalization of the current query depending on their relatedness. To implement these intuitions, in this paper we employ a hierarchical recurrent neural network to exploit such sequential information and automatically generate a user profile from historical data. We propose a query-aware attention model to generate a dynamic user profile based on the input query. Significant improvement is observed in the experiment with data from a commercial search engine when compared with several traditional personalization models. Our analysis reveals that, after training, the attention model is able to attribute higher weights to more related past sessions. |
Tasks | |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07600v1 |
https://arxiv.org/pdf/1908.07600v1.pdf | |
PWC | https://paperswithcode.com/paper/190807600 |
Repo | |
Framework | |
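The query-aware attention idea in the abstract above (weighting past sessions by their relatedness to the current query) can be illustrated with a minimal sketch. The abstract does not give the scoring function, so the dot-product score, the softmax, and the names `query_aware_profile`, `query` and `sessions` are all assumptions made here for illustration, not the authors' model.

```python
import math


def query_aware_profile(query, sessions):
    """Return (attention weights, weighted profile) for one query.

    Assumption: relatedness is scored with a plain dot product; the
    paper's actual scoring function is not stated in the abstract.
    """
    # Score each past session vector against the current query vector.
    scores = [sum(q * s for q, s in zip(query, sess)) for sess in sessions]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The dynamic profile is the attention-weighted sum of session vectors.
    profile = [
        sum(w * sess[i] for w, sess in zip(weights, sessions))
        for i in range(len(query))
    ]
    return weights, profile


# Toy vectors: the first session is closer to the query than the second.
query = [1.0, 0.0]
sessions = [[0.9, 0.1], [0.0, 1.0]]
weights, profile = query_aware_profile(query, sessions)
```

In this toy input the first session receives the larger weight because it has the larger dot product with the query, which mirrors the behaviour the abstract reports for related past sessions.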
SemEval-2015 Task 10: Sentiment Analysis in Twitter
Title | SemEval-2015 Task 10: Sentiment Analysis in Twitter |
Authors | Sara Rosenthal, Saif M Mohammad, Preslav Nakov, Alan Ritter, Svetlana Kiritchenko, Veselin Stoyanov |
Abstract | In this paper, we describe the 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter. This was the most popular sentiment analysis shared task to date with more than 40 teams participating in each of the last three years. This year’s shared task competition consisted of five sentiment prediction subtasks. Two were reruns from previous years: (A) sentiment expressed by a phrase in the context of a tweet, and (B) overall sentiment of a tweet. We further included three new subtasks asking to predict (C) the sentiment towards a topic in a single tweet, (D) the overall sentiment towards a topic in a set of tweets, and (E) the degree of prior polarity of a phrase. |
Tasks | Sentiment Analysis |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02387v1 |
https://arxiv.org/pdf/1912.02387v1.pdf | |
PWC | https://paperswithcode.com/paper/semeval-2015-task-10-sentiment-analysis-in-1 |
Repo | |
Framework | |
Local Unsupervised Learning for Image Analysis
Title | Local Unsupervised Learning for Image Analysis |
Authors | Leopold Grinberg, John Hopfield, Dmitry Krotov |
Abstract | Local Hebbian learning is believed to be inferior in performance to end-to-end training using a backpropagation algorithm. We question this popular belief by designing a local algorithm that can learn convolutional filters at scale on large image datasets. These filters combined with patch normalization and very steep non-linearities result in a good classification accuracy for shallow networks trained locally, as opposed to end-to-end. The filters learned by our algorithm contain both orientation selective units and unoriented color units, resembling the responses of pyramidal neurons located in the cytochrome oxidase ‘interblob’ and ‘blob’ regions in the primary visual cortex of primates. It is shown that convolutional networks with patch normalization significantly outperform standard convolutional networks on the task of recovering the original classes when shadows are superimposed on top of standard CIFAR-10 images. Patch normalization approximates the retinal adaptation to the mean light intensity, important for human vision. We also demonstrate a successful transfer of learned representations between CIFAR-10 and ImageNet 32x32 datasets. All these results taken together hint at the possibility that local unsupervised training might be a powerful tool for learning general representations (without specifying the task) directly from unlabeled data. |
Tasks | |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.08993v1 |
https://arxiv.org/pdf/1908.08993v1.pdf | |
PWC | https://paperswithcode.com/paper/local-unsupervised-learning-for-image |
Repo | |
Framework | |
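The abstract above credits patch normalization with robustness to superimposed shadows. A minimal sketch of why that could hold, assuming patch normalization means standardizing each patch to zero mean and unit variance (the paper's exact definition may differ; `normalize_patch` and the toy values are hypothetical):

```python
import math


def normalize_patch(pixels, eps=1e-8):
    """Standardize a flattened patch to zero mean and unit variance.

    Assumption: this is one common reading of "patch normalization",
    not necessarily the definition used in the paper.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    # eps guards against division by zero on constant patches.
    return [(p - mean) / (std + eps) for p in pixels]


# A toy patch and the same patch under a dimming "shadow" plus an offset.
patch = [0.2, 0.4, 0.6, 0.8]
shadowed = [0.5 * p + 0.1 for p in patch]

normalized = normalize_patch(patch)
normalized_shadowed = normalize_patch(shadowed)
```

Because the mean is subtracted and the result is divided by the standard deviation, a uniform change in brightness or contrast over the patch leaves the normalized values essentially unchanged.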
Subset Multivariate Collective And Point Anomaly Detection
Title | Subset Multivariate Collective And Point Anomaly Detection |
Authors | Alexander T M Fisch, Idris A Eckley, Paul Fearnhead |
Abstract | In recent years, there has been a growing interest in identifying anomalous structure within multivariate data streams. We consider the problem of detecting collective anomalies, corresponding to intervals where one or more of the data streams behaves anomalously. We first develop a test for a single collective anomaly that has power to simultaneously detect anomalies that are either rare, that is affecting few data streams, or common. We then show how to detect multiple anomalies in a way that is computationally efficient but avoids the approximations inherent in binary segmentation-like approaches. This approach, which we call MVCAPA, is shown to consistently estimate the number and location of the collective anomalies, a property that has not previously been shown for competing methods. MVCAPA can be made robust to point anomalies and can allow for the anomalies to be imperfectly aligned. We show the practical usefulness of allowing for imperfect alignments through a resulting increase in power to detect regions of copy number variation. |
Tasks | Anomaly Detection |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01691v1 |
https://arxiv.org/pdf/1909.01691v1.pdf | |
PWC | https://paperswithcode.com/paper/subset-multivariate-collective-and-point |
Repo | |
Framework | |
Fair Data Adaptation with Quantile Preservation
Title | Fair Data Adaptation with Quantile Preservation |
Authors | Drago Plečko, Nicolai Meinshausen |
Abstract | Fairness of classification and regression has received much attention recently and various, partially incompatible, criteria have been proposed. The fairness criteria can be enforced for a given classifier or, alternatively, the data can be adapted to ensure that every classifier trained on the data will adhere to desired fairness criteria. We present a practical data adaptation method based on quantile preservation in causal structural equation models. The data adaptation is based on a presumed counterfactual model for the data. While the counterfactual model itself cannot be verified experimentally, we show that certain population notions of fairness are still guaranteed even if the counterfactual model is misspecified. The precise nature of the fulfilled non-causal fairness notion (such as demographic parity, separation or sufficiency) depends on the structure of the underlying causal model and the choice of resolving variables. We describe an implementation of the proposed data adaptation procedure based on Random Forests and demonstrate its practical use on simulated and real-world data. |
Tasks | |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.06685v1 |
https://arxiv.org/pdf/1911.06685v1.pdf | |
PWC | https://paperswithcode.com/paper/fair-data-adaptation-with-quantile |
Repo | |
Framework | |
Restless Hidden Markov Bandits with Linear Rewards
Title | Restless Hidden Markov Bandits with Linear Rewards |
Authors | Michal Yemini, Amir Leshem, Anelia Somekh-Baruch |
Abstract | This paper presents an algorithm and regret analysis for the restless hidden Markov bandit problem with linear rewards. In this problem the reward received by the decision maker is a random linear function which depends on the arm selected and a hidden state. In contrast to previous works on Markovian bandits, we do not assume that the decision maker receives information regarding the state of the system; instead, it has to infer the state from its actions and the received reward. Surprisingly, we can still maintain logarithmic regret in the case of a polyhedral action set. Furthermore, the regret does not depend on the number of extreme points in the action space. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10271v1 |
https://arxiv.org/pdf/1910.10271v1.pdf | |
PWC | https://paperswithcode.com/paper/restless-hidden-markov-bandits-with-linear |
Repo | |
Framework | |
Sentiment Analysis of German Twitter
Title | Sentiment Analysis of German Twitter |
Authors | Wladimir Sidorenko |
Abstract | This thesis explores the ways in which people express their opinions on German Twitter, examines current approaches to automatic mining of these feelings, and proposes novel methods, which outperform state-of-the-art techniques. For this purpose, I introduce a new corpus of German tweets that have been manually annotated with sentiments, their targets and holders, as well as polar terms and their contextual modifiers. Using these data, I explore four major areas of sentiment research: (i) generation of sentiment lexicons, (ii) fine-grained opinion mining, (iii) message-level polarity classification, and (iv) discourse-aware sentiment analysis. In the first task, I compare three popular groups of lexicon generation methods: dictionary-, corpus-, and word-embedding-based ones, finding that dictionary-based systems generally yield better lexicons than the last two groups. Apart from this, I propose a linear projection algorithm, whose results surpass many existing automatic lexicons. Afterwards, in the second task, I examine two common approaches to automatic prediction of sentiments, sources, and targets: conditional random fields and recurrent neural networks, obtaining higher scores with the former model and improving these results even further by redefining the structure of CRF graphs. When dealing with message-level polarity classification, I juxtapose three major sentiment paradigms: lexicon-, machine-learning-, and deep-learning-based systems, and try to unite the first and last of these groups by introducing a bidirectional neural network with lexicon-based attention. Finally, in order to make the new classifier aware of discourse structure, I let it separately analyze the elementary discourse units (EDUs) of each microblog and infer the overall polarity of a message from the scores of its EDUs with the help of two new approaches: latent-marginalized CRFs and Recursive Dirichlet Process. |
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13062v1 |
https://arxiv.org/pdf/1911.13062v1.pdf | |
PWC | https://paperswithcode.com/paper/sentiment-analysis-of-german-twitter |
Repo | |
Framework | |
Spatio-temporal Action Recognition: A Survey
Title | Spatio-temporal Action Recognition: A Survey |
Authors | Amlaan Bhoi |
Abstract | The task of action recognition or action detection involves analyzing videos and determining what action or motion is being performed. The subjects of these videos are predominantly humans performing some action. However, this requirement can be relaxed to generalize over other subjects such as animals or robots. The applications range from human-computer interaction to automated video editing proposals. When we consider spatiotemporal action recognition, we deal with action localization. This task not only involves determining what action is being performed but also when and where it is being performed in said video. This paper aims to survey the plethora of approaches and algorithms proposed to solve this task, give a comprehensive comparison between them, explore various datasets available for the problem, and determine the most promising approaches. |
Tasks | Action Detection, Action Localization, Temporal Action Localization |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.09403v1 |
http://arxiv.org/pdf/1901.09403v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-action-recognition-a-survey |
Repo | |
Framework | |
Language-Independent Sentiment Analysis Using Subjectivity and Positional Information
Title | Language-Independent Sentiment Analysis Using Subjectivity and Positional Information |
Authors | Veselin Raychev, Preslav Nakov |
Abstract | We describe a novel language-independent approach to the task of determining the polarity, positive or negative, of the author’s opinion on a specific topic in natural language text. In particular, weights are assigned to attributes, individual words or word bi-grams, based on their position and on their likelihood of being subjective. The subjectivity of each attribute is estimated in a two-step process, where first the probability of being subjective is calculated for each sentence containing the attribute, and then these probabilities are used to alter the attribute’s weights for polarity classification. The evaluation results on a standard dataset of movie reviews show 89.85% classification accuracy, which rivals the best previously published results for this dataset for systems that use neither additional linguistic information nor external resources. |
Tasks | Sentiment Analysis |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12544v1 |
https://arxiv.org/pdf/1911.12544v1.pdf | |
PWC | https://paperswithcode.com/paper/language-independent-sentiment-analysis-using |
Repo | |
Framework | |
A Machine Learning Method for Prediction of Multipath Channels
Title | A Machine Learning Method for Prediction of Multipath Channels |
Authors | Julian Ahrens, Lia Ahrens, Hans D. Schotten |
Abstract | In this paper, a machine learning method for predicting the evolution of a mobile communication channel based on a specific type of convolutional neural network is developed and evaluated in a simulated multipath transmission scenario. The simulation and channel estimation are designed to replicate real-world scenarios and common measurements supported by reference signals in modern cellular networks. The capability of the predictor meets the requirements that a deployment of the developed method in a radio resource scheduler of a base station poses. Possible applications of the method are discussed. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04824v2 |
https://arxiv.org/pdf/1909.04824v2.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-method-for-prediction-of |
Repo | |
Framework | |