Paper Group AWR 80
Predicting user intent from search queries using both CNNs and RNNs
Title | Predicting user intent from search queries using both CNNs and RNNs |
Authors | Mihai Cristian Pîrvu, Alexandra Anghel, Ciprian Borodescu, Alexandru Constantin |
Abstract | Predicting user behaviour on a website is a difficult task, which requires the integration of multiple sources of information, such as geo-location, user profile or web surfing history. In this paper we tackle the problem of predicting the user intent, based on the queries that were used to access a certain webpage. We make no additional assumptions, such as domain detection, device used or location, and only use the word information embedded in the given query. In order to build competitive classifiers, we label a small fraction of the EDI query intent prediction dataset \cite{edi-challenge-dataset}, which is used as ground truth. Then, using various rule-based approaches, we automatically label the rest of the dataset, train the classifiers and evaluate the quality of the automatic labeling on the ground truth dataset. We used both recurrent and convolutional networks as the models, while representing the words in the query with multiple embedding methods. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07324v1 |
http://arxiv.org/pdf/1812.07324v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-user-intent-from-search-queries |
Repo | https://github.com/Morphl-AI/MorphL-Model-User-Search-Intent |
Framework | pytorch |
A Unified Framework for Domain Adaptation using Metric Learning on Manifolds
Title | A Unified Framework for Domain Adaptation using Metric Learning on Manifolds |
Authors | Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh |
Abstract | We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds. Our approach is based on formulating transfer from source to target as a problem of geometric mean metric learning on manifolds. Specifically, we exploit the curved Riemannian manifold geometry of symmetric positive definite (SPD) covariance matrices. We exploit a simple but important observation that, as the space of covariance matrices is both a Riemannian space and a homogeneous space, the shortest path geodesic between two covariances on the manifold can be computed analytically. Statistics on the SPD matrix manifold, such as the geometric mean of two matrices, can be reduced to solving the well-known Riccati equation. We show how the Riccati-based solution can be constrained to not only reduce the statistical differences between the source and target domains, such as aligning second order covariances and minimizing the maximum mean discrepancy, but also the underlying geometry of the source and target domains using diffusions on the underlying source and target manifolds. A key strength of our proposed approach is that it enables integrating multiple sources of variation between source and target in a unified way, by reducing the combined objective function to a nested set of Riccati equations where the solution can be represented by a cascaded series of geometric mean computations. In addition to showing the theoretical optimality of our solution, we present detailed experiments using standard transfer learning testbeds from computer vision comparing our proposed algorithms to past work in domain adaptation, showing improved results over a large variety of previous methods. |
Tasks | Domain Adaptation, Metric Learning, Transfer Learning |
Published | 2018-04-28 |
URL | http://arxiv.org/abs/1804.10834v1 |
http://arxiv.org/pdf/1804.10834v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-domain-adaptation |
Repo | https://github.com/sridharmahadevan/Geodesic-Covariance-Alignment |
Framework | none |
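The abstract above relies on the geometric mean of two SPD matrices, which solves a Riccati-type equation. A minimal sketch of that single building block follows, restricted to 2x2 matrices so that it needs only the standard library. It uses the standard closed form A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}, which satisfies X A^{-1} X = B. The helper names and the example matrices are illustrative assumptions and are not taken from the authors' repository.

```python
import math

def matmul(x, y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Invert a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def spd_sqrt(m):
    """Principal square root of a 2x2 SPD matrix.

    Closed form from Cayley-Hamilton: sqrt(M) = (M + s*I) / t,
    with s = sqrt(det M) and t = sqrt(trace M + 2*s).
    """
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

def geometric_mean(a, b):
    """Geometric mean A # B of two 2x2 SPD matrices."""
    a_half = spd_sqrt(a)
    a_inv_half = inverse(a_half)
    inner = matmul(matmul(a_inv_half, b), a_inv_half)
    return matmul(matmul(a_half, spd_sqrt(inner)), a_half)

# Hypothetical "source" and "target" covariance matrices.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, 0.2], [0.2, 3.0]]
X = geometric_mean(A, B)

# X should satisfy the Riccati-type equation X A^{-1} X = B.
residual = matmul(matmul(X, inverse(A)), X)
```

For larger matrices the same formula applies, but the square roots would normally come from an eigendecomposition in a numerical library rather than from this 2x2 shortcut.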
TVAE: Triplet-Based Variational Autoencoder using Metric Learning
Title | TVAE: Triplet-Based Variational Autoencoder using Metric Learning |
Authors | Haque Ishfaq, Assaf Hoogi, Daniel Rubin |
Abstract | Deep metric learning has been demonstrated to be highly effective in learning semantic representation and encoding information that can be used to measure data similarity, by relying on the embedding learned from metric learning. At the same time, variational autoencoder (VAE) has widely been used to approximate inference and proved to have a good performance for directed probabilistic models. However, for traditional VAE, the data label or feature information are intractable. Similarly, traditional representation learning approaches fail to represent many salient aspects of the data. In this project, we propose a novel integrated framework to learn latent embedding in VAE by incorporating deep metric learning. The features are learned by optimizing a triplet loss on the mean vectors of VAE in conjunction with standard evidence lower bound (ELBO) of VAE. This approach, which we call Triplet based Variational Autoencoder (TVAE), allows us to capture more fine-grained information in the latent embedding. Our model is tested on MNIST data set and achieves a high triplet accuracy of 95.60% while the traditional VAE (Kingma & Welling, 2013) achieves triplet accuracy of 75.08%. |
Tasks | Metric Learning, Representation Learning |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04403v2 |
http://arxiv.org/pdf/1802.04403v2.pdf | |
PWC | https://paperswithcode.com/paper/tvae-triplet-based-variational-autoencoder |
Repo | https://github.com/hmishfaq/DDSM-TVAE |
Framework | pytorch |
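The TVAE abstract above combines a triplet loss on the VAE mean vectors with the standard ELBO. The sketch below shows one way such an objective could be assembled for a single (anchor, positive, negative) triplet. The squared-Euclidean hinge, the `margin` and `weight` parameters, and the reading of "triplet accuracy" as the fraction of triplets whose positive lies closer than the negative are all assumptions made for illustration. The reconstruction losses are passed in as plain numbers because the decoder is out of scope here.

```python
import math

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) in closed form."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(mu_a, mu_p, mu_n, margin=1.0):
    """Hinge triplet loss on latent mean vectors (assumed form)."""
    return max(0.0, sq_dist(mu_a, mu_p) - sq_dist(mu_a, mu_n) + margin)

def tvae_style_loss(recon_losses, mus, logvars, margin=1.0, weight=1.0):
    """Negative ELBO summed over the triplet plus a weighted triplet term.

    Each argument is a 3-tuple ordered (anchor, positive, negative).
    """
    neg_elbo = sum(r + kl_to_standard_normal(m, lv)
                   for r, m, lv in zip(recon_losses, mus, logvars))
    return neg_elbo + weight * triplet_loss(*mus, margin=margin)

def triplet_accuracy(triplets):
    """Fraction of (anchor, positive, negative) mean triplets ranked correctly."""
    correct = sum(1 for a, p, n in triplets if sq_dist(a, p) < sq_dist(a, n))
    return correct / len(triplets)

# Toy latent means: the positive is near the anchor, the negative is far away.
mus = ([0.0, 0.0], [0.1, 0.0], [2.0, 0.0])
logvars = ([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
recon = (0.0, 0.0, 0.0)
loss = tvae_style_loss(recon, mus, logvars)
accuracy = triplet_accuracy([mus])
```

In this toy case the triplet term is zero because the negative already lies more than the margin farther away than the positive, so the loss comes entirely from the KL terms.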
Safe Triplet Screening for Distance Metric Learning
Title | Safe Triplet Screening for Distance Metric Learning |
Authors | Tomoki Yoshida, Ichiro Takeuchi, Masayuki Karasuyama |
Abstract | We study safe screening for metric learning. Distance metric learning can optimize a metric over a set of triplets, each one of which is defined by a pair of same class instances and an instance in a different class. However, the number of possible triplets is quite huge even for a small dataset. Our safe triplet screening identifies triplets which can be safely removed from the optimization problem without losing the optimality. Compared with existing safe screening studies, triplet screening is particularly significant because of (1) the huge number of possible triplets, and (2) the semi-definite constraint in the optimization. We derive several variants of screening rules, and analyze their relationships. Numerical experiments on benchmark datasets demonstrate the effectiveness of safe triplet screening. |
Tasks | Metric Learning |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.03923v2 |
http://arxiv.org/pdf/1802.03923v2.pdf | |
PWC | https://paperswithcode.com/paper/safe-triplet-screening-for-distance-metric |
Repo | https://github.com/birdwatcherYT/Safe-Triplet-Screening-for-Distance-Metric-Learning |
Framework | none |
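To make the abstract's remark about the number of possible triplets concrete, the sketch below enumerates every triplet for a small label vector. It assumes a triplet is an ordered pair of distinct same-class instances plus one instance from a different class. The paper may count triplets differently (for example, restricting pairs to nearest neighbours), so the function is illustrative only.

```python
from itertools import permutations

def enumerate_triplets(labels):
    """Return all (i, j, l) with labels[i] == labels[j], i != j, labels[l] != labels[i]."""
    n = len(labels)
    return [(i, j, l)
            for i, j in permutations(range(n), 2)
            if labels[i] == labels[j]
            for l in range(n)
            if labels[l] != labels[i]]

# Four instances in two balanced classes already yield 8 triplets.
small = enumerate_triplets([0, 0, 1, 1])

# Forty instances in two balanced classes: 2 classes * (20 * 19 pairs) * 20 others.
larger_count = len(enumerate_triplets([0] * 20 + [1] * 20))
```

The count grows roughly cubically with the number of instances, which is why removing triplets before optimization can matter.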
Estimating Cellular Goals from High-Dimensional Biological Data
Title | Estimating Cellular Goals from High-Dimensional Biological Data |
Authors | Laurence Yang, Michael A. Saunders, Jean-Christophe Lachance, Bernhard O. Palsson, José Bento |
Abstract | Optimization-based models have been used to predict cellular behavior for over 25 years. The constraints in these models are derived from genome annotations, measured macro-molecular composition of cells, and by measuring the cell’s growth rate and metabolism in different conditions. The cellular goal (the optimization problem that the cell is trying to solve) can be challenging to derive experimentally for many organisms, including human or mammalian cells, which have complex metabolic capabilities and are not well understood. Existing approaches to learning goals from data include (a) estimating a linear objective function, or (b) estimating linear constraints that model complex biochemical reactions and constrain the cell’s operation. The latter approach is important because often the known/observed biochemical reactions are not enough to explain observations, and hence there is a need to extend automatically the model complexity by learning new chemical reactions. However, this leads to nonconvex optimization problems, and existing tools cannot scale to realistically large metabolic models. Hence, constraint estimation is still used sparingly despite its benefits for modeling cell metabolism, which is important for developing novel antimicrobials against pathogens, discovering cancer drug targets, and producing value-added chemicals. Here, we develop the first approach to estimating constraint reactions from data that can scale to realistically large metabolic models. Previous tools have been used on problems having less than 75 biochemical reactions and 60 metabolites, which limits real-life-size applications. We perform extensive experiments using 75 large-scale metabolic network models for different organisms (including bacteria, yeasts, and mammals) and show that our algorithm can recover cellular constraint reactions, even when some measurements are missing. |
Tasks | |
Published | 2018-07-11 |
URL | https://arxiv.org/abs/1807.04245v4 |
https://arxiv.org/pdf/1807.04245v4.pdf | |
PWC | https://paperswithcode.com/paper/estimating-cellular-goals-from-high |
Repo | https://github.com/laurenceyang33/cellgoal |
Framework | none |
Feature selection as Monte-Carlo Search in Growing Single Rooted Directed Acyclic Graph by Best Leaf Identification
Title | Feature selection as Monte-Carlo Search in Growing Single Rooted Directed Acyclic Graph by Best Leaf Identification |
Authors | Aurelien Pelissier, Atsuyoshi Nakamura, Koji Tabata |
Abstract | Monte Carlo tree search (MCTS) has received considerable interest due to its spectacular success in the difficult problem of computer Go and has also proved beneficial in a range of other domains. A major issue that has received little attention in the MCTS literature is the fact that, in most games, different actions can lead to the same state, which may lead to a high degree of redundancy in the tree representation and unnecessary additional computational cost. We extend MCTS to single rooted directed acyclic graphs (SR-DAG), and consider the Best Arm Identification (BAI) and the Best Leaf Identification (BLI) problems of an expanding SR-DAG of arbitrary depth. We propose algorithms that are (epsilon, delta)-correct in the fixed confidence setting, and prove asymptotic upper bounds on the sample complexity of our BAI algorithm. As a major application of our BLI algorithm, a novel approach for Feature Selection is proposed by representing the feature set space as an SR-DAG and repeatedly evaluating feature subsets until a candidate for the best leaf is returned. A proof of concept is shown on benchmark data sets. |
Tasks | Feature Selection |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07531v2 |
http://arxiv.org/pdf/1811.07531v2.pdf | |
PWC | https://paperswithcode.com/paper/feature-selection-as-monte-carlo-search-in |
Repo | https://github.com/Aurelien-Pelissier/Feature-Selection-as-Reinforcement-Learning |
Framework | tf |
Adversarial Deep Reinforcement Learning in Portfolio Management
Title | Adversarial Deep Reinforcement Learning in Portfolio Management |
Authors | Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, Yanran Li |
Abstract | In this paper, we implement three state-of-the-art continuous reinforcement learning algorithms, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO) and Policy Gradient (PG), in portfolio management. All of them are widely used in game playing and robot control. Moreover, PPO has appealing theoretical properties that make it potentially promising for portfolio management. We present their performance under different settings, including different learning rates, objective functions and feature combinations, in order to provide insights for parameter tuning, feature selection and data preparation. We also conduct intensive experiments on the China stock market and show that PG is more desirable in financial markets than DDPG and PPO, although both of them are more advanced. In addition, we propose a so-called Adversarial Training method and show that it can greatly improve training efficiency and significantly improve the average daily return and Sharpe ratio in backtests. Based on this modification, our experimental results show that our agent based on Policy Gradient can outperform UCRP. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09940v3 |
http://arxiv.org/pdf/1808.09940v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-deep-reinforcement-learning-in |
Repo | https://github.com/liangzp/Reinforcement-learning-in-portfolio-management- |
Framework | tf |
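The abstract above evaluates agents by average daily return and Sharpe ratio and compares them with UCRP (uniform constant rebalanced portfolio). The sketch below computes both quantities for the UCRP baseline under simple assumptions: returns are simple per-day returns, the portfolio is rebalanced to equal weights every day, the Sharpe ratio is not annualized, and the risk-free rate defaults to zero. The return figures are invented for illustration and are not results from the paper.

```python
import statistics

def ucrp_daily_returns(asset_returns):
    """Daily return of an equally weighted, daily rebalanced portfolio.

    asset_returns: list of days, each a list of per-asset simple returns.
    """
    return [sum(day) / len(day) for day in asset_returns]

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

# Hypothetical returns for two assets over three trading days.
asset_returns = [[0.01, 0.03], [-0.02, 0.00], [0.02, 0.02]]
portfolio = ucrp_daily_returns(asset_returns)
average_daily_return = statistics.mean(portfolio)
sharpe = sharpe_ratio(portfolio)
```

A learned agent would replace `ucrp_daily_returns` with its own weight vector for each day. The two evaluation metrics stay the same.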
RCCNet: An Efficient Convolutional Neural Network for Histological Routine Colon Cancer Nuclei Classification
Title | RCCNet: An Efficient Convolutional Neural Network for Histological Routine Colon Cancer Nuclei Classification |
Authors | S H Shabbeer Basha, Soumen Ghosh, Kancharagunta Kishan Babu, Shiv Ram Dubey, Viswanath Pulabaigari, Snehasis Mukherjee |
Abstract | Efficient and precise classification of histological cell nuclei is of utmost importance due to its potential applications in the field of medical image analysis. It would help medical practitioners to better understand and explore various factors for cancer treatment. The classification of histological cell nuclei is a challenging task due to cellular heterogeneity. This paper proposes an efficient Convolutional Neural Network (CNN) based architecture, named RCCNet, for the classification of histological routine colon cancer nuclei. The main objective of this network is to keep the CNN model as simple as possible. The proposed RCCNet model consists of only 1,512,868 learnable parameters, significantly fewer than popular CNN models such as AlexNet, CIFARVGG, GoogLeNet, and WRN. The experiments are conducted over the publicly available routine colon cancer histological dataset “CRCHistoPhenotypes”. The results of the proposed RCCNet model are compared with five state-of-the-art CNN models in terms of accuracy, weighted average F1 score and training time. The proposed method has achieved a classification accuracy of 80.61% and a weighted average F1 score of 0.7887. The proposed RCCNet is more efficient and generalizes better in terms of training time and data over-fitting, respectively. |
Tasks | Nuclei Classification |
Published | 2018-09-30 |
URL | https://arxiv.org/abs/1810.02797v3 |
https://arxiv.org/pdf/1810.02797v3.pdf | |
PWC | https://paperswithcode.com/paper/rccnet-an-efficient-convolutional-neural |
Repo | https://github.com/shabbeersh/Impact-of-FC-layers |
Framework | tf |
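The RCCNet abstract reports a weighted average F1 score. As a reminder of how that metric is usually defined, the sketch below computes the per-class F1 and averages it with weights proportional to each class's support in the ground truth. The labels "A" and "B" are placeholders and not the dataset's classes. The paper's exact evaluation script may differ.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score

# Placeholder labels: class "A" has three true samples, class "B" has one.
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
score = weighted_f1(y_true, y_pred)
```

Here class "A" has F1 = 0.8 and class "B" has F1 = 2/3, so the weighted score is 0.75 * 0.8 + 0.25 * 2/3.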
Projective Inference in High-dimensional Problems: Prediction and Feature Selection
Title | Projective Inference in High-dimensional Problems: Prediction and Feature Selection |
Authors | Juho Piironen, Markus Paasiniemi, Aki Vehtari |
Abstract | This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the reference model and the operation during the latter step as predictive projection. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information including prior and that coming from the left out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach could be beneficial. The benefits are illustrated via several simulated and real world examples. |
Tasks | Feature Selection |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02406v1 |
http://arxiv.org/pdf/1810.02406v1.pdf | |
PWC | https://paperswithcode.com/paper/projective-inference-in-high-dimensional |
Repo | https://github.com/stan-dev/projpred |
Framework | none |
Deep Relevance Ranking Using Enhanced Document-Query Interactions
Title | Deep Relevance Ranking Using Enhanced Document-Query Interactions |
Authors | Ryan McDonald, Georgios-Ioannis Brokos, Ion Androutsopoulos |
Abstract | We explore several new models for document relevance ranking, building upon the Deep Relevance Matching Model (DRMM) of Guo et al. (2016). Unlike DRMM, which uses context-insensitive encodings of terms and query-document term interactions, we inject rich context-sensitive encodings throughout our models, inspired by PACRR’s (Hui et al., 2017) convolutional n-gram matching features, but extended in several ways including multiple views of query and document inputs. We test our models on datasets from the BIOASQ question answering challenge (Tsatsaronis et al., 2015) and TREC ROBUST 2004 (Voorhees, 2005), showing they outperform BM25-based baselines, DRMM, and PACRR. |
Tasks | Ad-Hoc Information Retrieval, Question Answering |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01682v2 |
http://arxiv.org/pdf/1809.01682v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-relevance-ranking-using-enhanced |
Repo | https://github.com/nlpaueb/deep-relevance-ranking |
Framework | tf |
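The abstract above compares its neural rankers with BM25-based baselines. For orientation, here is a minimal sketch of plain BM25 scoring over tokenized documents. It uses the common defaults `k1=1.2` and `b=0.75` and a smoothed IDF variant, which are assumptions made here. The paper's baselines may use a different implementation and additional features, and the toy documents are invented.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in docs against the tokenized query."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1.0 + (n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores

docs = [
    ["gene", "expression", "in", "yeast"],
    ["protein", "folding", "kinetics"],
    ["yeast", "gene", "regulation", "gene"],
]
scores = bm25_scores(["gene", "yeast"], docs)
```

Documents that share no term with the query receive a score of zero. The others are ranked by term frequency, term rarity and length normalization.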
High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking
Title | High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking |
Authors | Fan Wang, Sach Mukherjee, Sylvia Richardson, Steven M. Hill |
Abstract | Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well-developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper we present a large-scale comparison of penalized regression methods. We distinguish between three related goals: prediction, variable selection and variable ranking. Our results span more than 2,300 data-generating scenarios, including both synthetic and semi-synthetic data (real covariates and simulated responses), allowing us to systematically consider the influence of various factors (sample size, dimensionality, sparsity, signal strength and multicollinearity). We consider several widely-used approaches (Lasso, Adaptive Lasso, Elastic Net, Ridge Regression, SCAD, the Dantzig Selector and Stability Selection). We find considerable variation in performance between methods. Our results support a 'no panacea' view, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods. The study allows us to make some recommendations as to which approaches may be most (or least) suitable given the goal and some data characteristics. Our empirical results complement existing theory and provide a resource to compare methods across a range of scenarios and metrics. |
Tasks | |
Published | 2018-08-02 |
URL | https://arxiv.org/abs/1808.00723v2 |
https://arxiv.org/pdf/1808.00723v2.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-regression-in-practice-an |
Repo | https://github.com/fw307/high_dimensional_regression_comparison |
Framework | none |
A Constraint-Based Algorithm For Causal Discovery with Cycles, Latent Variables and Selection Bias
Title | A Constraint-Based Algorithm For Causal Discovery with Cycles, Latent Variables and Selection Bias |
Authors | Eric V. Strobl |
Abstract | Causal processes in nature may contain cycles, and real datasets may violate causal sufficiency as well as contain selection bias. No constraint-based causal discovery algorithm can currently handle cycles, latent variables and selection bias (CLS) simultaneously. I therefore introduce an algorithm called Cyclic Causal Inference (CCI) that makes sound inferences with a conditional independence oracle under CLS, provided that we can represent the cyclic causal process as a non-recursive linear structural equation model with independent errors. Empirical results show that CCI outperforms CCD in the cyclic case as well as rivals FCI and RFCI in the acyclic case. |
Tasks | Causal Discovery, Causal Inference |
Published | 2018-05-05 |
URL | http://arxiv.org/abs/1805.02087v1 |
http://arxiv.org/pdf/1805.02087v1.pdf | |
PWC | https://paperswithcode.com/paper/a-constraint-based-algorithm-for-causal |
Repo | https://github.com/ericstrobl/CCI |
Framework | none |
A Semi-Supervised Data Augmentation Approach using 3D Graphical Engines
Title | A Semi-Supervised Data Augmentation Approach using 3D Graphical Engines |
Authors | Shuangjun Liu, Sarah Ostadabbas |
Abstract | Deep learning approaches have been rapidly adopted across a wide range of fields because of their accuracy and flexibility, but they require large labeled training datasets. This presents a fundamental problem for applications with limited, expensive, or private data (i.e. small data), such as human pose and behavior estimation/tracking, which can be highly personalized. In this paper, we present a semi-supervised data augmentation approach that can synthesize large-scale labeled training datasets using 3D graphical engines based on a physically valid low-dimensional pose descriptor. To evaluate the performance of our synthesized datasets in training deep learning-based models, we generated a large synthetic human pose dataset, called ScanAva, using 3D scans of only 7 individuals based on our proposed augmentation approach. A state-of-the-art human pose estimation deep learning model was then trained from scratch on our ScanAva dataset and achieved a pose estimation accuracy of 91.2% under the PCK0.5 criterion after applying an efficient domain adaptation on the synthetic images. This accuracy was comparable to that of the same model trained on large-scale pose data from real humans, such as the MPII dataset, and much higher than that of the model trained on other synthetic human datasets such as SURREAL. |
Tasks | Data Augmentation, Domain Adaptation, Pose Estimation |
Published | 2018-08-08 |
URL | http://arxiv.org/abs/1808.02595v2 |
http://arxiv.org/pdf/1808.02595v2.pdf | |
PWC | https://paperswithcode.com/paper/a-semi-supervised-data-augmentation-approach |
Repo | https://github.com/ostadabbas/ScanAvaGenerationToolkit |
Framework | none |
Explainable Reasoning over Knowledge Graphs for Recommendation
Title | Explainable Reasoning over Knowledge Graphs for Recommendation |
Authors | Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, Tat-Seng Chua |
Abstract | Incorporating knowledge graphs into recommender systems has attracted increasing attention in recent years. By exploring the interlinks within a knowledge graph, the connectivity between users and items can be discovered as paths, which provide rich and complementary information to user-item interactions. Such connectivity not only reveals the semantics of entities and relations, but also helps to comprehend a user’s interest. However, existing efforts have not fully explored this connectivity to infer user preferences, especially in terms of modeling the sequential dependencies within and holistic semantics of a path. In this paper, we contribute a new model named Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graphs for recommendation. KPRN can generate path representations by composing the semantics of both entities and relations. By leveraging the sequential dependencies within a path, we allow effective reasoning on paths to infer the underlying rationale of a user-item interaction. Furthermore, we design a new weighted pooling operation to discriminate the strengths of different paths in connecting a user with an item, endowing our model with a certain level of explainability. We conduct extensive experiments on two datasets about movies and music, demonstrating significant improvements over the state-of-the-art solutions Collaborative Knowledge Base Embedding and Neural Factorization Machine. |
Tasks | Knowledge Graphs, Recommendation Systems |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04540v1 |
http://arxiv.org/pdf/1811.04540v1.pdf | |
PWC | https://paperswithcode.com/paper/explainable-reasoning-over-knowledge-graphs |
Repo | https://github.com/BaeSeulki/WhySoMuch |
Framework | none |
Context-Free Transductions with Neural Stacks
Title | Context-Free Transductions with Neural Stacks |
Authors | Yiding Hao, William Merrill, Dana Angluin, Robert Frank, Noah Amsel, Andrew Benz, Simon Mendelsohn |
Abstract | This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex networks often find approximate solutions by using the stack as unstructured memory. |
Tasks | Language Modelling |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02836v1 |
http://arxiv.org/pdf/1809.02836v1.pdf | |
PWC | https://paperswithcode.com/paper/context-free-transductions-with-neural-stacks |
Repo | https://github.com/viking-sudo-rm/StackNN |
Framework | pytorch |
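Two of the tasks named in the abstract above, string reversal and cumulative XOR evaluation, have simple reference solutions that show what a stack-based strategy looks like. The sketch below is a hand-written baseline and not the neural model. Reading "cumulative XOR" as emitting the running parity after each input bit is an assumption about the task definition.

```python
def reverse_with_stack(symbols):
    """Reverse a sequence by pushing every symbol, then popping until empty."""
    stack = []
    for s in symbols:
        stack.append(s)          # push phase: read the input
    out = []
    while stack:
        out.append(stack.pop())  # pop phase: emit in last-in, first-out order
    return out

def cumulative_xor(bits):
    """Emit the XOR of all bits seen so far after each input bit."""
    out, acc = [], 0
    for b in bits:
        acc ^= b
        out.append(acc)
    return out

reversed_symbols = reverse_with_stack(list("abc"))
running_parity = cumulative_xor([1, 0, 1, 1])
```

Reversal needs unbounded last-in, first-out memory, which is what a stack provides. Running parity needs only a single bit of state.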