Paper Group ANR 1097
Disentangling Aspect and Opinion Words in Target-based Sentiment Analysis using Lifelong Learning
Title | Disentangling Aspect and Opinion Words in Target-based Sentiment Analysis using Lifelong Learning |
Authors | Shuai Wang, Mianwei Zhou, Sahisnu Mazumder, Bing Liu, Yi Chang |
Abstract | Given a target name, which can be a product aspect or entity, identifying its aspect words and opinion words in a given corpus is a fine-grained task in target-based sentiment analysis (TSA). This task is challenging, especially when we have no labeled data and we want to perform it for any given domain. To address it, we propose a general two-stage approach. Stage one extracts/groups the target-related words (called t-words) for a given target. This is relatively easy as we can apply an existing semantics-based learning technique. Stage two separates the aspect and opinion words from the grouped t-words, which is challenging because we often do not have enough word-level aspect and opinion labels. In this work, we formulate this problem in a PU learning setting and incorporate the idea of lifelong learning to solve it. Experimental results show the effectiveness of our approach. |
Tasks | Sentiment Analysis |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05818v1 |
http://arxiv.org/pdf/1802.05818v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-aspect-and-opinion-words-in |
Repo | |
Framework | |
Rank and Rate: Multi-task Learning for Recommender Systems
Title | Rank and Rate: Multi-task Learning for Recommender Systems |
Authors | Guy Hadash, Oren Sar Shalom, Rita Osadchy |
Abstract | The two main tasks in the Recommender Systems domain are the ranking and rating prediction tasks. The rating prediction task aims at predicting to what extent a user would like any given item, which makes it possible to recommend the items with the highest predicted scores. The ranking task, on the other hand, directly aims at recommending the most valuable items for the user. Several previous approaches proposed learning user and item representations to optimize both tasks simultaneously in a multi-task framework. In this work we propose a novel multi-task framework that exploits the fact that a user makes a two-phase decision: first deciding to interact with an item (ranking task) and only afterward rating it (rating prediction task). We evaluated our framework on two benchmark datasets, in two different configurations, and showed its superiority over state-of-the-art methods. |
Tasks | Multi-Task Learning, Recommendation Systems |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11698v1 |
http://arxiv.org/pdf/1807.11698v1.pdf | |
PWC | https://paperswithcode.com/paper/rank-and-rate-multi-task-learning-for |
Repo | |
Framework | |
t-Exponential Memory Networks for Question-Answering Machines
Title | t-Exponential Memory Networks for Question-Answering Machines |
Authors | Kyriakos Tolias, Sotirios Chatzis |
Abstract | Recent advances in deep learning have brought to the fore models that can make multiple computational steps in the service of completing a task; these are capable of describing long-term dependencies in sequential data. Novel recurrent attention models over possibly large external memory modules constitute the core mechanisms that enable these capabilities. Our work addresses learning subtler and more complex underlying temporal dynamics in language modeling tasks that deal with sparse sequential data. To this end, we improve upon these recent advances, by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach consists in treating the network parameters as latent variables with a prior distribution imposed over them. Our statistical assumptions go beyond the standard practice of postulating Gaussian priors. Indeed, to allow for handling outliers, which are prevalent in long observed sequences of multivariate data, multivariate t-exponential distributions are imposed. On this basis, we proceed to infer corresponding posteriors; these can be used for inference and prediction at test time, in a way that accounts for the uncertainty in the available sparse training data. Specifically, to allow for our approach to best exploit the merits of the t-exponential family, our method considers a new t-divergence measure, which generalizes the concept of the Kullback-Leibler divergence. We perform an extensive experimental evaluation of our approach, using challenging language modeling benchmarks, and illustrate its superiority over existing state-of-the-art techniques. |
Tasks | Language Modelling, Question Answering |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.01229v1 |
http://arxiv.org/pdf/1809.01229v1.pdf | |
PWC | https://paperswithcode.com/paper/t-exponential-memory-networks-for-question |
Repo | |
Framework | |
Algebraic Machine Learning
Title | Algebraic Machine Learning |
Authors | Fernando Martin-Maroto, Gonzalo G. de Polavieja |
Abstract | Machine learning algorithms use error function minimization to fit a large set of parameters in a preexisting model. However, error minimization eventually leads to a memorization of the training dataset, losing the ability to generalize to other datasets. To achieve generalization something else is needed, for example a regularization method or stopping the training when error in a validation dataset is minimal. Here we propose a different approach to learning and generalization that is parameter-free, fully discrete and that does not use function minimization. We use the training data to find an algebraic representation with minimal size and maximal freedom, explicitly expressed as a product of irreducible components. This algebraic representation is shown to directly generalize, giving high accuracy in test data, more so the smaller the representation. We prove that the number of generalizing representations can be very large and the algebra only needs to find one. We also derive and test a relationship between compression and error rate. We give results for a simple problem solved step by step, hand-written character recognition, and the Queens Completion problem as an example of unsupervised learning. As an alternative to statistical learning, algebraic learning may offer advantages in combining bottom-up and top-down information, formal concept derivation from data and large-scale parallelization. |
Tasks | |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05252v2 |
http://arxiv.org/pdf/1803.05252v2.pdf | |
PWC | https://paperswithcode.com/paper/algebraic-machine-learning |
Repo | |
Framework | |
From Gene Expression to Drug Response: A Collaborative Filtering Approach
Title | From Gene Expression to Drug Response: A Collaborative Filtering Approach |
Authors | Cheng Qian, Nicholas D. Sidiropoulos, Magda Amiridi, Amin Emad |
Abstract | Predicting the response of cancer cells to drugs is an important problem in pharmacogenomics. Recent efforts in generation of large scale datasets profiling gene expression and drug sensitivity in cell lines have provided a unique opportunity to study this problem. However, one major challenge is the small number of samples (cell lines) compared to the number of features (genes) even in these large datasets. We propose a collaborative filtering (CF) like algorithm for modeling gene-drug relationship to identify patients most likely to benefit from a treatment. Due to the correlation of gene expressions in different cell lines, the gene expression matrix is approximately low-rank, which suggests that drug responses could be estimated from a reduced dimension latent space of the gene expression. Towards this end, we propose a joint low-rank matrix factorization and latent linear regression approach. Experiments with data from the Genomics of Drug Sensitivity in Cancer database are included to show that the proposed method can predict drug-gene associations better than the state-of-the-art methods. |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12758v2 |
http://arxiv.org/pdf/1810.12758v2.pdf | |
PWC | https://paperswithcode.com/paper/from-gene-expression-to-drug-response-a |
Repo | |
Framework | |
VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability
Title | VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability |
Authors | Romain Cohendet, Claire-Hélène Demarty, Ngoc Q. K. Duong, Martin Engilberge |
Abstract | Humans share a strong tendency to memorize/forget some of the visual information they encounter. This paper focuses on providing computational models for the prediction of the intrinsic memorability of visual content. To address this new challenge, we introduce a large scale dataset (VideoMem) composed of 10,000 videos annotated with memorability scores. In contrast to previous work on image memorability – where memorability was measured a few minutes after memorization – memory performance is measured twice: a few minutes after memorization and again 24-72 hours later. Hence, the dataset comes with short-term and long-term memorability annotations. After an in-depth analysis of the dataset, we investigate several deep neural network based models for the prediction of video memorability. Our best model using a ranking loss achieves a Spearman’s rank correlation of 0.494 for short-term memorability prediction, while our proposed model with attention mechanism provides insights of what makes a content memorable. The VideoMem dataset with pre-extracted features is publicly available. |
Tasks | |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01973v1 |
http://arxiv.org/pdf/1812.01973v1.pdf | |
PWC | https://paperswithcode.com/paper/videomem-constructing-analyzing-predicting |
Repo | |
Framework | |
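The VideoMem abstract reports its result as a Spearman's rank correlation between predicted and annotated memorability scores. The following is a minimal sketch of how that metric can be computed, assuming the standard definition (Pearson correlation of average ranks). The function names and the sample scores are hypothetical and are not taken from the paper or its dataset.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(predicted, annotated):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(annotated))


# Hypothetical predicted vs. annotated memorability scores for four videos.
predicted = [0.10, 0.40, 0.35, 0.80]
annotated = [0.20, 0.50, 0.60, 0.90]
rho = spearman(predicted, annotated)
```

Because only the ordering of the scores matters, a model can score well on this metric even when its raw predictions are poorly calibrated, which fits a ranking-loss training objective.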
Noisin: Unbiased Regularization for Recurrent Neural Networks
Title | Noisin: Unbiased Regularization for Recurrent Neural Networks |
Authors | Adji B. Dieng, Rajesh Ranganath, Jaan Altosaar, David M. Blei |
Abstract | Recurrent neural networks (RNNs) are powerful models of sequential data. They have been successfully used in domains such as text and speech. However, RNNs are susceptible to overfitting; regularization is important. In this paper we develop Noisin, a new method for regularizing RNNs. Noisin injects random noise into the hidden states of the RNN and then maximizes the corresponding marginal likelihood of the data. We show how Noisin applies to any RNN and we study many different types of noise. Noisin is unbiased–it preserves the underlying RNN on average. We characterize how Noisin regularizes its RNN both theoretically and empirically. On language modeling benchmarks, Noisin improves over dropout by as much as 12.2% on the Penn Treebank and 9.4% on the Wikitext-2 dataset. We also compared the state-of-the-art language model of Yang et al. 2017, both with and without Noisin. On the Penn Treebank, the method with Noisin more quickly reaches state-of-the-art performance. |
Tasks | Language Modelling |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01500v2 |
http://arxiv.org/pdf/1805.01500v2.pdf | |
PWC | https://paperswithcode.com/paper/noisin-unbiased-regularization-for-recurrent |
Repo | |
Framework | |
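The Noisin abstract states that noise is injected into the RNN hidden states and that the noise is unbiased, meaning the underlying RNN is preserved on average. A toy sketch of that property follows. It assumes a scalar RNN and multiplicative Gaussian noise with mean 1; both are illustrative choices, and the paper's actual noise distributions and marginal-likelihood training objective are not reproduced here.

```python
import math
import random


def rnn_step(h_prev, x, w_h=0.5, w_x=1.0):
    """One step of a toy scalar RNN: h_t = tanh(w_h * h_prev + w_x * x)."""
    return math.tanh(w_h * h_prev + w_x * x)


def noisy_hidden(h, rng, sigma=0.3):
    """Scale h by noise with mean 1 (an assumption), so E[noisy h] = h."""
    return h * rng.gauss(1.0, sigma)


def average_noisy_hidden(h, n_samples=20000, seed=0):
    """Monte Carlo estimate of the expected noisy hidden state."""
    rng = random.Random(seed)
    return sum(noisy_hidden(h, rng) for _ in range(n_samples)) / n_samples


h = rnn_step(0.0, 0.8)          # noise-free hidden state
avg = average_noisy_hidden(h)   # should be close to h
```

Each individual noisy state differs from `h`, which is what provides the regularization, while the average over many draws stays near `h`.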
Towards Autonomous Reinforcement Learning: Automatic Setting of Hyper-parameters using Bayesian Optimization
Title | Towards Autonomous Reinforcement Learning: Automatic Setting of Hyper-parameters using Bayesian Optimization |
Authors | Juan Cruz Barsce, Jorge A. Palombarini, Ernesto C. Martínez |
Abstract | With the increase of machine learning usage by industries and scientific communities in a variety of tasks such as text mining, image recognition and self-driving cars, automatic setting of hyper-parameter in learning algorithms is a key factor for achieving satisfactory performance regardless of user expertise in the inner workings of the techniques and methodologies. In particular, for a reinforcement learning algorithm, the efficiency of an agent learning a control policy in an uncertain environment is heavily dependent on the hyper-parameters used to balance exploration with exploitation. In this work, an autonomous learning framework that integrates Bayesian optimization with Gaussian process regression to optimize the hyper-parameters of a reinforcement learning algorithm, is proposed. Also, a bandits-based approach to achieve a balance between computational costs and decreasing uncertainty about the Q-values, is presented. A gridworld example is used to highlight how hyper-parameter configurations of a learning algorithm (SARSA) are iteratively improved based on two performance functions. |
Tasks | Self-Driving Cars |
Published | 2018-05-12 |
URL | http://arxiv.org/abs/1805.04748v1 |
http://arxiv.org/pdf/1805.04748v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-autonomous-reinforcement-learning |
Repo | |
Framework | |
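The abstract above tunes the hyper-parameters of SARSA on a gridworld. As a rough illustration of what is being tuned, the sketch below implements the standard SARSA update on a hypothetical one-dimensional corridor and returns a performance number for a given setting of `alpha`, `gamma`, and `epsilon`. The environment, the reward values, and the performance measure are assumptions made for this example; the Bayesian optimization and bandit components described in the paper are not shown.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done):
    """On-policy TD update: Q(s,a) += alpha * (target - Q(s,a))."""
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def run_sarsa(alpha, gamma, epsilon, n_states=6, episodes=200,
              max_steps=100, seed=0):
    """Train on a corridor (start at state 0, goal at the last state).

    Actions: 0 = left, 1 = right. Returns the mean number of steps per
    episode over the last 50 episodes (lower is better).
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    steps_log = []
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q[s], epsilon, rng)
        steps = 0
        for steps in range(1, max_steps + 1):
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else -0.01
            a_next = epsilon_greedy(q[s_next], epsilon, rng)
            sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done)
            s, a = s_next, a_next
            if done:
                break
        steps_log.append(steps)
    return sum(steps_log[-50:]) / 50


# One evaluation of the objective an outer optimizer could query.
score = run_sarsa(alpha=0.5, gamma=0.9, epsilon=0.1)
```

An outer loop, such as the Bayesian optimizer described in the abstract, would call a function like `run_sarsa` repeatedly with different hyper-parameter settings and keep the best-performing one.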
Bayesian Transfer Reinforcement Learning with Prior Knowledge Rules
Title | Bayesian Transfer Reinforcement Learning with Prior Knowledge Rules |
Authors | Michalis K. Titsias, Sotirios Nikoloutsopoulos |
Abstract | We propose a probabilistic framework to directly insert prior knowledge in reinforcement learning (RL) algorithms by defining the behaviour policy as a Bayesian posterior distribution. Such a posterior combines task specific information with prior knowledge, thus allowing to achieve transfer learning across tasks. The resulting method is flexible and it can be easily incorporated to any standard off-policy and on-policy algorithms, such as those based on temporal differences and policy gradients. We develop a specific instance of this Bayesian transfer RL framework by expressing prior knowledge as general deterministic rules that can be useful in a large variety of tasks, such as navigation tasks. Also, we elaborate more on recent probabilistic and entropy-regularised RL by developing a novel temporal learning algorithm and show how to combine it with Bayesian transfer RL. Finally, we demonstrate our method for solving mazes and show that significant speed ups can be obtained. |
Tasks | Transfer Learning, Transfer Reinforcement Learning |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00468v1 |
http://arxiv.org/pdf/1810.00468v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-transfer-reinforcement-learning-with |
Repo | |
Framework | |
Generalizability vs. Robustness: Adversarial Examples for Medical Imaging
Title | Generalizability vs. Robustness: Adversarial Examples for Medical Imaging |
Authors | Magdalini Paschali, Sailesh Conjeti, Fernando Navarro, Nassir Navab |
Abstract | In this paper, for the first time, we propose an evaluation method for deep learning models that assesses the performance of a model not only in an unseen test scenario, but also in extreme cases of noise, outliers and ambiguous input data. To this end, we utilize adversarial examples, images that fool machine learning models, while looking imperceptibly different from original data, as a measure to evaluate the robustness of a variety of medical imaging models. Through extensive experiments on skin lesion classification and whole brain segmentation with state-of-the-art networks such as Inception and UNet, we show that models that achieve comparable performance regarding generalizability may have significant variations in their perception of the underlying data manifold, leading to an extensive performance gap in their robustness. |
Tasks | Brain Segmentation, Skin Lesion Classification |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1804.00504v1 |
http://arxiv.org/pdf/1804.00504v1.pdf | |
PWC | https://paperswithcode.com/paper/generalizability-vs-robustness-adversarial |
Repo | |
Framework | |
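The abstract above uses adversarial examples to probe robustness but does not say which attack is applied. As an illustration of the general idea, the sketch below uses the fast gradient sign method (a well-known attack, swapped in here as an assumption) against a tiny logistic-regression classifier. The weights, input, and step size `eps` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """Fast gradient sign method: x_adv = x + eps * sign(dLoss/dx).

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]


# Hypothetical model and a correctly classified input with true label 1.
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.5)

p_clean = predict(w, b, x)      # confidence on the clean input
p_adv = predict(w, b, x_adv)    # confidence after the perturbation
```

Comparing `p_clean` with `p_adv` across models is one simple way to expose robustness gaps between networks that otherwise score similarly on a clean test set.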
Dating Ancient Paintings of Mogao Grottoes Using Deeply Learnt Visual Codes
Title | Dating Ancient Paintings of Mogao Grottoes Using Deeply Learnt Visual Codes |
Authors | Qingquan Li, Qin Zou, De Ma, Qian Wang, Song Wang |
Abstract | Cultural heritage is the asset of all the peoples of the world. The preservation and inheritance of cultural heritage is conducive to the progress of human civilization. In northwestern China, there is a world heritage site – Mogao Grottoes – that has plenty of mural paintings showing the historical cultures of ancient China. To study these historical cultures, one critical procedure is to date the mural paintings, i.e., determining the era when they were created. Until now, most mural paintings at Mogao Grottoes have been dated by directly referring to the mural texts or historical documents. However, some are still left with creation-era undetermined due to the lack of reference materials. Considering that the drawing style of mural paintings changed over the course of history and that the drawing style can be learned and quantified through painting data, we formulate the problem of mural-painting dating as a problem of drawing-style classification. In fact, drawing styles can be expressed not only in color or curvature, but also in some unknown forms – the forms that have not been observed. To this end, besides sophisticated color and shape descriptors, a deep convolutional neural network is designed to encode the implicit drawing styles. 3860 mural paintings collected from 194 different grottoes with determined creation-era labels are used to train the classification model and build the dating method. In experiments, the proposed dating method is applied to seven mural paintings which were previously dated with controversies, and the exciting new dating results are approved by the Dunhuang expert. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09168v1 |
http://arxiv.org/pdf/1810.09168v1.pdf | |
PWC | https://paperswithcode.com/paper/dating-ancient-paintings-of-mogao-grottoes |
Repo | |
Framework | |
Self-Attention Equipped Graph Convolutions for Disease Prediction
Title | Self-Attention Equipped Graph Convolutions for Disease Prediction |
Authors | Anees Kazi, S. Arvind krishna, Shayan Shekarforoush, Karsten Kortuem, Shadi Albarqouni, Nassir Navab |
Abstract | Multi-modal data comprising imaging (MRI, fMRI, PET, etc.) and non-imaging (clinical test, demographics, etc.) data can be collected together and used for disease prediction. Such diverse data gives complementary information about the patient's condition to make an informed diagnosis. A model capable of leveraging the individuality of each multi-modal data is required for better disease prediction. We propose a graph convolution based deep model which takes into account the distinctiveness of each element of the multi-modal data. We incorporate a novel self-attention layer, which weights every element of the demographic data by exploring its relation to the underlying disease. We demonstrate the superiority of our developed technique in terms of computational speed and performance when compared to state-of-the-art methods. Our method outperforms other methods with a significant margin. |
Tasks | Disease Prediction |
Published | 2018-12-24 |
URL | http://arxiv.org/abs/1812.09954v1 |
http://arxiv.org/pdf/1812.09954v1.pdf | |
PWC | https://paperswithcode.com/paper/self-attention-equipped-graph-convolutions |
Repo | |
Framework | |
A CNN-based Spatial Feature Fusion Algorithm for Hyperspectral Imagery Classification
Title | A CNN-based Spatial Feature Fusion Algorithm for Hyperspectral Imagery Classification |
Authors | Alan J. X. Guo, Fei Zhu |
Abstract | The shortage of training samples remains one of the main obstacles in applying artificial neural networks (ANN) to hyperspectral image classification. To fuse the spatial and spectral information, pixel patches are often utilized to train a model, which may further aggravate this problem. In existing work, an ANN model supervised by center-loss (ANNC) was introduced. Training merely with spectral information, the ANNC yields discriminative spectral features suitable for the subsequent classification tasks. In this paper, a CNN-based spatial feature fusion (CSFF) algorithm is proposed, which allows a smart fusion of the spatial information with the spectral features extracted by ANNC. As a critical part of CSFF, a CNN-based discriminant model is introduced to estimate whether two paired pixels belong to the same class. At the testing stage, by applying the discriminant model to the pixel-pairs generated by the test pixel and its neighbors, the local structure is estimated and represented as a customized convolutional kernel. The spectral-spatial feature is obtained by a convolutional operation between the estimated kernel and the corresponding spectral features within a neighborhood. Finally, the label of the test pixel is predicted by classifying the resulting spectral-spatial feature. Without increasing the number of training samples or involving pixel patches at the training stage, the CSFF framework achieves state-of-the-art results, reducing classification failures by 20%-50% in experiments on three well-known hyperspectral images. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10355v2 |
http://arxiv.org/pdf/1801.10355v2.pdf | |
PWC | https://paperswithcode.com/paper/a-cnn-based-spatial-feature-fusion-algorithm |
Repo | |
Framework | |
Sepsis Prediction and Vital Signs Ranking in Intensive Care Unit Patients
Title | Sepsis Prediction and Vital Signs Ranking in Intensive Care Unit Patients |
Authors | Avijit Mitra, Khalid Ashraf |
Abstract | We study multiple rule-based and machine learning (ML) models for sepsis detection. We report the first neural network detection and prediction results on three categories of sepsis. We have used the retrospective Medical Information Mart for Intensive Care (MIMIC)-III dataset, restricted to intensive care unit (ICU) patients. Features for prediction were created from only common vital sign measurements. We show significant improvement of AUC score using neural network based ensemble model compared to single ML and rule-based models. For the detection of sepsis, severe sepsis, and septic shock, our model achieves an AUC of 0.97, 0.96 and 0.91, respectively. Four hours before the positive hours, it predicts the same three categories with an AUC of 0.90, 0.91 and 0.90 respectively. Further, we ranked the features and found that using six vital signs consistently provides higher detection and prediction AUC for all the models tested. Our novel ensemble model achieves highest AUC in detecting and predicting sepsis, severe sepsis, and septic shock in the MIMIC-III ICU patients, and is amenable to deployment in hospital settings. |
Tasks | |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.06686v3 |
http://arxiv.org/pdf/1812.06686v3.pdf | |
PWC | https://paperswithcode.com/paper/sepsis-prediction-and-vital-signs-ranking-in |
Repo | |
Framework | |
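The sepsis abstract reports all of its results as AUC scores. The sketch below computes the area under the ROC curve through its pairwise interpretation, namely the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels and scores are hypothetical and are not drawn from MIMIC-III.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    is scored higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = sepsis) and model risk scores for four patients.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
value = auc(labels, scores)
```

The quadratic pairwise loop is fine for small illustrations; for large cohorts a rank-based formula or a library routine would be the more practical choice.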
Fake Sentence Detection as a Training Task for Sentence Encoding
Title | Fake Sentence Detection as a Training Task for Sentence Encoding |
Authors | Viresh Ranjan, Heeyoung Kwon, Niranjan Balasubramanian, Minh Hoai |
Abstract | Sentence encoders are typically trained on language modeling tasks with large unlabeled datasets. While these encoders achieve state-of-the-art results on many sentence-level tasks, they are difficult to train with long training cycles. We introduce fake sentence detection as a new training task for learning sentence encoders. We automatically generate fake sentences by corrupting original sentences from a source collection and train the encoders to produce representations that are effective at detecting fake sentences. This binary classification task turns out to be quite efficient for training sentence encoders. We compare a basic BiLSTM encoder trained on this task with strong sentence encoding models (Skipthought and FastSent) trained on a language modeling task. We find that the BiLSTM trains much faster on fake sentence detection (20 hours instead of weeks) using smaller amounts of data (1M instead of 64M sentences). Further analysis shows the learned representations capture many syntactic and semantic properties expected from good sentence representations. |
Tasks | Language Modelling |
Published | 2018-08-11 |
URL | http://arxiv.org/abs/1808.03840v4 |
http://arxiv.org/pdf/1808.03840v4.pdf | |
PWC | https://paperswithcode.com/paper/fake-sentence-detection-as-a-training-task |
Repo | |
Framework | |
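The abstract above says fake sentences are generated by corrupting original ones, which yields a binary classification dataset. A minimal sketch of such a data-generation step follows. The two corruption operations used here (swapping two words, dropping one word) are assumptions chosen for illustration, and the function names are hypothetical; the paper's exact corruption scheme may differ.

```python
import random


def make_fake(sentence, rng):
    """Corrupt a sentence by swapping two words or dropping one word.

    Returns None when no distinct corruption results (for example, a
    one-word sentence, or a swap of two identical words).
    """
    original = sentence.split()
    if len(original) < 2:
        return None
    tokens = list(original)
    if rng.random() < 0.5:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    else:
        del tokens[rng.randrange(len(tokens))]
    if tokens == original:
        return None
    return " ".join(tokens)


def build_dataset(sentences, seed=0):
    """Pair each original (label 0) with a corrupted copy (label 1)."""
    rng = random.Random(seed)
    data = []
    for sentence in sentences:
        data.append((sentence, 0))
        fake = make_fake(sentence, rng)
        if fake is not None:
            data.append((fake, 1))
    return data


corpus = ["the cat sat on the mat", "hello"]
dataset = build_dataset(corpus)
```

A sentence encoder followed by a binary classifier would then be trained on `dataset`, with the encoder's output reused as the sentence representation.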