October 21, 2019


Paper Group AWR 80

Predicting user intent from search queries using both CNNs and RNNs. A Unified Framework for Domain Adaptation using Metric Learning on Manifolds. TVAE: Triplet-Based Variational Autoencoder using Metric Learning. Safe Triplet Screening for Distance Metric Learning. Estimating Cellular Goals from High-Dimensional Biological Data. Feature selection …

Predicting user intent from search queries using both CNNs and RNNs


Title	Predicting user intent from search queries using both CNNs and RNNs
Authors	Mihai Cristian Pîrvu, Alexandra Anghel, Ciprian Borodescu, Alexandru Constantin
Abstract	Predicting user behaviour on a website is a difficult task, which requires the integration of multiple sources of information, such as geo-location, user profile or web surfing history. In this paper we tackle the problem of predicting the user intent, based on the queries that were used to access a certain webpage. We make no additional assumptions, such as domain detection, device used or location, and only use the word information embedded in the given query. In order to build competitive classifiers, we label a small fraction of the EDI query intent prediction dataset \cite{edi-challenge-dataset}, which is used as ground truth. Then, using various rule-based approaches, we automatically label the rest of the dataset, train the classifiers and evaluate the quality of the automatic labeling on the ground truth dataset. We used both recurrent and convolutional networks as the models, while representing the words in the query with multiple embedding methods.
Tasks
Published	2018-12-18
URL	http://arxiv.org/abs/1812.07324v1
PDF	http://arxiv.org/pdf/1812.07324v1.pdf
PWC	https://paperswithcode.com/paper/predicting-user-intent-from-search-queries
Repo	https://github.com/Morphl-AI/MorphL-Model-User-Search-Intent
Framework	pytorch

A Unified Framework for Domain Adaptation using Metric Learning on Manifolds


Title	A Unified Framework for Domain Adaptation using Metric Learning on Manifolds
Authors	Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh
Abstract	We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds. Our approach is based on formulating transfer from source to target as a problem of geometric mean metric learning on manifolds. Specifically, we exploit the curved Riemannian manifold geometry of symmetric positive definite (SPD) covariance matrices. We exploit a simple but important observation that as the space of covariance matrices is both a Riemannian space as well as a homogeneous space, the shortest path geodesic between two covariances on the manifold can be computed analytically. Statistics on the SPD matrix manifold, such as the geometric mean of two matrices, can be reduced to solving the well-known Riccati equation. We show how the Riccati-based solution can be constrained to not only reduce the statistical differences between the source and target domains, such as aligning second order covariances and minimizing the maximum mean discrepancy, but also the underlying geometry of the source and target domains using diffusions on the underlying source and target manifolds. A key strength of our proposed approach is that it enables integrating multiple sources of variation between source and target in a unified way, by reducing the combined objective function to a nested set of Riccati equations where the solution can be represented by a cascaded series of geometric mean computations. In addition to showing the theoretical optimality of our solution, we present detailed experiments using standard transfer learning testbeds from computer vision comparing our proposed algorithms to past work in domain adaptation, showing improved results over a large variety of previous methods.
Tasks	Domain Adaptation, Metric Learning, Transfer Learning
Published	2018-04-28
URL	http://arxiv.org/abs/1804.10834v1
PDF	http://arxiv.org/pdf/1804.10834v1.pdf
PWC	https://paperswithcode.com/paper/a-unified-framework-for-domain-adaptation
Repo	https://github.com/sridharmahadevan/Geodesic-Covariance-Alignment
Framework	none
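
The abstract above says the geometric mean of two SPD matrices reduces to a Riccati equation. A minimal sketch of that single building block follows, assuming the standard definition A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}, which is the SPD solution of X A^{-1} X = B. The function names and the example matrices are illustrative assumptions and are not taken from the paper or its repository.

```python
import numpy as np

def spd_power(mat, p):
    """Raise a symmetric positive definite matrix to a real power
    using its eigendecomposition (hypothetical helper)."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * vals**p) @ vecs.T

def spd_geometric_mean(a, b):
    """Geometric mean A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    a_half = spd_power(a, 0.5)
    a_inv_half = spd_power(a, -0.5)
    middle = spd_power(a_inv_half @ b @ a_inv_half, 0.5)
    return a_half @ middle @ a_half

# Illustrative 2x2 SPD "covariances" (made-up numbers).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, 0.2], [0.2, 3.0]])

X = spd_geometric_mean(A, B)
# Residual of the Riccati equation X A^{-1} X = B; should be close to zero.
residual = X @ np.linalg.inv(A) @ X - B
```

The constrained and nested variants described in the abstract are not covered by this sketch.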

TVAE: Triplet-Based Variational Autoencoder using Metric Learning


Title	TVAE: Triplet-Based Variational Autoencoder using Metric Learning
Authors	Haque Ishfaq, Assaf Hoogi, Daniel Rubin
Abstract	Deep metric learning has been demonstrated to be highly effective in learning semantic representation and encoding information that can be used to measure data similarity, by relying on the embedding learned from metric learning. At the same time, variational autoencoder (VAE) has widely been used to approximate inference and proved to have a good performance for directed probabilistic models. However, for traditional VAE, the data label or feature information are intractable. Similarly, traditional representation learning approaches fail to represent many salient aspects of the data. In this project, we propose a novel integrated framework to learn latent embedding in VAE by incorporating deep metric learning. The features are learned by optimizing a triplet loss on the mean vectors of VAE in conjunction with standard evidence lower bound (ELBO) of VAE. This approach, which we call Triplet based Variational Autoencoder (TVAE), allows us to capture more fine-grained information in the latent embedding. Our model is tested on MNIST data set and achieves a high triplet accuracy of 95.60% while the traditional VAE (Kingma & Welling, 2013) achieves triplet accuracy of 75.08%.
Tasks	Metric Learning, Representation Learning
Published	2018-02-13
URL	http://arxiv.org/abs/1802.04403v2
PDF	http://arxiv.org/pdf/1802.04403v2.pdf
PWC	https://paperswithcode.com/paper/tvae-triplet-based-variational-autoencoder
Repo	https://github.com/hmishfaq/DDSM-TVAE
Framework	pytorch
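
The abstract above combines a triplet loss on the VAE mean vectors with the ELBO and reports a "triplet accuracy". A rough sketch of those pieces in plain Python follows. The squared Euclidean distance, the margin of 1.0, the standard-normal prior and all function names are assumptions made for illustration; the paper's exact loss weighting and the reconstruction term are not reproduced here.

```python
import math

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on latent mean vectors (margin is assumed)."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)

def kl_to_standard_normal(mu, logvar):
    """KL term of the ELBO for a diagonal Gaussian posterior against N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def triplet_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets in which the
    positive lies strictly closer to the anchor than the negative."""
    hits = sum(1 for a, p, n in triplets if sq_dist(a, p) < sq_dist(a, n))
    return hits / len(triplets)

# Made-up latent means for illustration only.
example_triplets = [
    ([0.0], [1.0], [3.0]),  # positive closer: counted as correct
    ([0.0], [2.0], [1.0]),  # negative closer: counted as wrong
]
accuracy = triplet_accuracy(example_triplets)
```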

Safe Triplet Screening for Distance Metric Learning


Title	Safe Triplet Screening for Distance Metric Learning
Authors	Tomoki Yoshida, Ichiro Takeuchi, Masayuki Karasuyama
Abstract	We study safe screening for metric learning. Distance metric learning can optimize a metric over a set of triplets, each one of which is defined by a pair of same class instances and an instance in a different class. However, the number of possible triplets is quite huge even for a small dataset. Our safe triplet screening identifies triplets which can be safely removed from the optimization problem without losing the optimality. Compared with existing safe screening studies, triplet screening is particularly significant because of (1) the huge number of possible triplets, and (2) the semi-definite constraint in the optimization. We derive several variants of screening rules, and analyze their relationships. Numerical experiments on benchmark datasets demonstrate the effectiveness of safe triplet screening.
Tasks	Metric Learning
Published	2018-02-12
URL	http://arxiv.org/abs/1802.03923v2
PDF	http://arxiv.org/pdf/1802.03923v2.pdf
PWC	https://paperswithcode.com/paper/safe-triplet-screening-for-distance-metric
Repo	https://github.com/birdwatcherYT/Safe-Triplet-Screening-for-Distance-Metric-Learning
Framework	none
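
To make the "huge number of possible triplets" in the abstract above concrete, the following sketch counts and enumerates triplets for a labeled dataset. It assumes an ordered triplet (i, j, l) where i and j are distinct instances of the same class and l belongs to a different class; this convention and the function names are illustrative assumptions. The screening rules themselves are not implemented here.

```python
from collections import Counter
from itertools import permutations

def count_triplets(labels):
    """Closed-form count: for each class c with n_c members,
    n_c * (n_c - 1) ordered same-class pairs times (n - n_c) other instances."""
    n = len(labels)
    return sum(n_c * (n_c - 1) * (n - n_c) for n_c in Counter(labels).values())

def enumerate_triplets(labels):
    """Brute-force enumeration of index triplets (i, j, l); small inputs only."""
    return [
        (i, j, l)
        for i, j in permutations(range(len(labels)), 2)
        if labels[i] == labels[j]
        for l in range(len(labels))
        if labels[l] != labels[i]
    ]

tiny = [0, 0, 1, 1]
tiny_triplets = enumerate_triplets(tiny)

# Three balanced classes of 50 instances each (made-up sizes).
medium_count = count_triplets([0] * 50 + [1] * 50 + [2] * 50)
```

Under this convention a 150-instance dataset already yields several hundred thousand triplets, which is the scale problem that motivates screening.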

Estimating Cellular Goals from High-Dimensional Biological Data


Title	Estimating Cellular Goals from High-Dimensional Biological Data
Authors	Laurence Yang, Michael A. Saunders, Jean-Christophe Lachance, Bernhard O. Palsson, José Bento
Abstract	Optimization-based models have been used to predict cellular behavior for over 25 years. The constraints in these models are derived from genome annotations, measured macro-molecular composition of cells, and by measuring the cell’s growth rate and metabolism in different conditions. The cellular goal (the optimization problem that the cell is trying to solve) can be challenging to derive experimentally for many organisms, including human or mammalian cells, which have complex metabolic capabilities and are not well understood. Existing approaches to learning goals from data include (a) estimating a linear objective function, or (b) estimating linear constraints that model complex biochemical reactions and constrain the cell’s operation. The latter approach is important because often the known/observed biochemical reactions are not enough to explain observations, and hence there is a need to extend automatically the model complexity by learning new chemical reactions. However, this leads to nonconvex optimization problems, and existing tools cannot scale to realistically large metabolic models. Hence, constraint estimation is still used sparingly despite its benefits for modeling cell metabolism, which is important for developing novel antimicrobials against pathogens, discovering cancer drug targets, and producing value-added chemicals. Here, we develop the first approach to estimating constraint reactions from data that can scale to realistically large metabolic models. Previous tools have been used on problems having less than 75 biochemical reactions and 60 metabolites, which limits real-life-size applications. We perform extensive experiments using 75 large-scale metabolic network models for different organisms (including bacteria, yeasts, and mammals) and show that our algorithm can recover cellular constraint reactions, even when some measurements are missing.
Tasks
Published	2018-07-11
URL	https://arxiv.org/abs/1807.04245v4
PDF	https://arxiv.org/pdf/1807.04245v4.pdf
PWC	https://paperswithcode.com/paper/estimating-cellular-goals-from-high
Repo	https://github.com/laurenceyang33/cellgoal
Framework	none

Feature selection as Monte-Carlo Search in Growing Single Rooted Directed Acyclic Graph by Best Leaf Identification


Title	Feature selection as Monte-Carlo Search in Growing Single Rooted Directed Acyclic Graph by Best Leaf Identification
Authors	Aurelien Pelissier, Atsuyoshi Nakamura, Koji Tabata
Abstract	Monte Carlo tree search (MCTS) has received considerable interest due to its spectacular success in the difficult problem of computer Go and also proved beneficial in a range of other domains. A major issue that has received little attention in the MCTS literature is the fact that, in most games, different actions can lead to the same state, which may lead to a high degree of redundancy in tree representation and unnecessary additional computational cost. We extend MCTS to single rooted directed acyclic graph (SR-DAG), and consider the Best Arm Identification (BAI) and the Best Leaf Identification (BLI) problem of an expanding SR-DAG of arbitrary depth. We propose algorithms that are (epsilon, delta)-correct in the fixed confidence setting, and prove an asymptotic upper bound on the sample complexity of our BAI algorithm. As a major application for our BLI algorithm, a novel approach for Feature Selection is proposed by representing the feature set space as a SR-DAG and repeatedly evaluating feature subsets until a candidate for the best leaf is returned. A proof of concept is shown on benchmark data sets.
Tasks	Feature Selection
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07531v2
PDF	http://arxiv.org/pdf/1811.07531v2.pdf
PWC	https://paperswithcode.com/paper/feature-selection-as-monte-carlo-search-in
Repo	https://github.com/Aurelien-Pelissier/Feature-Selection-as-Reinforcement-Learning
Framework	tf

Adversarial Deep Reinforcement Learning in Portfolio Management


Title	Adversarial Deep Reinforcement Learning in Portfolio Management
Authors	Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, Yanran Li
Abstract	In this paper, we implement three state-of-the-art continuous reinforcement learning algorithms, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO) and Policy Gradient (PG), in portfolio management. All of them are widely used in game playing and robot control. What's more, PPO has appealing theoretical properties, which make it potentially promising in portfolio management. We present their performance under different settings, including different learning rates, objective functions and feature combinations, in order to provide insights for parameter tuning, feature selection and data preparation. We also conduct intensive experiments in the China stock market and show that PG is more desirable in the financial market than DDPG and PPO, although both of them are more advanced. What's more, we propose a so-called Adversarial Training method and show that it can greatly improve the training efficiency and significantly promote average daily return and Sharpe ratio in back test. Based on this new modification, our experimental results show that our agent based on Policy Gradient can outperform UCRP.
Tasks
Published	2018-08-29
URL	http://arxiv.org/abs/1808.09940v3
PDF	http://arxiv.org/pdf/1808.09940v3.pdf
PWC	https://paperswithcode.com/paper/adversarial-deep-reinforcement-learning-in
Repo	https://github.com/liangzp/Reinforcement-learning-in-portfolio-management-
Framework	tf
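
The abstract above compares against UCRP and reports average daily return and Sharpe ratio. A small sketch of those back-test quantities follows, assuming UCRP means a uniform constant rebalanced portfolio (equal weights restored every day), a zero risk-free rate, and no transaction costs. The function names and the price relatives are made up for illustration and do not come from the paper.

```python
import statistics

def ucrp_daily_returns(price_relatives):
    """Daily portfolio returns when wealth is rebalanced to equal weights
    each day. price_relatives[t][k] is close_t / close_{t-1} for asset k."""
    return [sum(day) / len(day) - 1.0 for day in price_relatives]

def cumulative_wealth(returns, start=1.0):
    """Compound a sequence of daily returns from a starting wealth."""
    wealth = start
    for r in returns:
        wealth *= 1.0 + r
    return wealth

def sharpe_ratio(returns):
    """Mean over sample standard deviation of daily returns
    (risk-free rate assumed zero, no annualisation)."""
    return statistics.mean(returns) / statistics.stdev(returns)

# Three days, two assets (illustrative numbers).
price_relatives = [[1.02, 0.98], [1.00, 1.04], [0.97, 1.01]]
returns = ucrp_daily_returns(price_relatives)
wealth = cumulative_wealth(returns)
```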

RCCNet: An Efficient Convolutional Neural Network for Histological Routine Colon Cancer Nuclei Classification


Title	RCCNet: An Efficient Convolutional Neural Network for Histological Routine Colon Cancer Nuclei Classification
Authors	S H Shabbeer Basha, Soumen Ghosh, Kancharagunta Kishan Babu, Shiv Ram Dubey, Viswanath Pulabaigari, Snehasis Mukherjee
Abstract	Efficient and precise classification of histological cell nuclei is of utmost importance due to its potential applications in the field of medical image analysis. It would help medical practitioners to better understand and explore various factors for cancer treatment. The classification of histological cell nuclei is a challenging task due to the cellular heterogeneity. This paper proposes an efficient Convolutional Neural Network (CNN) based architecture, named RCCNet, for classification of histological routine colon cancer nuclei. The main objective of this network is to keep the CNN model as simple as possible. The proposed RCCNet model consists of only 1,512,868 learnable parameters, which is significantly fewer than popular CNN models such as AlexNet, CIFARVGG, GoogLeNet, and WRN. The experiments are conducted over the publicly available routine colon cancer histological dataset “CRCHistoPhenotypes”. The results of the proposed RCCNet model are compared with five state-of-the-art CNN models in terms of the accuracy, weighted average F1 score and training time. The proposed method has achieved a classification accuracy of 80.61% and 0.7887 weighted average F1 score. The proposed RCCNet is more efficient and generalized in terms of the training time and data over-fitting, respectively.
Tasks	Nuclei Classification
Published	2018-09-30
URL	https://arxiv.org/abs/1810.02797v3
PDF	https://arxiv.org/pdf/1810.02797v3.pdf
PWC	https://paperswithcode.com/paper/rccnet-an-efficient-convolutional-neural
Repo	https://github.com/shabbeersh/Impact-of-FC-layers
Framework	tf

Projective Inference in High-dimensional Problems: Prediction and Feature Selection


Title	Projective Inference in High-dimensional Problems: Prediction and Feature Selection
Authors	Juho Piironen, Markus Paasiniemi, Aki Vehtari
Abstract	This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the reference model and the operation during the latter step as predictive projection. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information including prior and that coming from the left out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach could be beneficial. The benefits are illustrated via several simulated and real world examples.
Tasks	Feature Selection
Published	2018-10-04
URL	http://arxiv.org/abs/1810.02406v1
PDF	http://arxiv.org/pdf/1810.02406v1.pdf
PWC	https://paperswithcode.com/paper/projective-inference-in-high-dimensional
Repo	https://github.com/stan-dev/projpred
Framework	none

Deep Relevance Ranking Using Enhanced Document-Query Interactions


Title	Deep Relevance Ranking Using Enhanced Document-Query Interactions
Authors	Ryan McDonald, Georgios-Ioannis Brokos, Ion Androutsopoulos
Abstract	We explore several new models for document relevance ranking, building upon the Deep Relevance Matching Model (DRMM) of Guo et al. (2016). Unlike DRMM, which uses context-insensitive encodings of terms and query-document term interactions, we inject rich context-sensitive encodings throughout our models, inspired by PACRR’s (Hui et al., 2017) convolutional n-gram matching features, but extended in several ways including multiple views of query and document inputs. We test our models on datasets from the BIOASQ question answering challenge (Tsatsaronis et al., 2015) and TREC ROBUST 2004 (Voorhees, 2005), showing they outperform BM25-based baselines, DRMM, and PACRR.
Tasks	Ad-Hoc Information Retrieval, Question Answering
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01682v2
PDF	http://arxiv.org/pdf/1809.01682v2.pdf
PWC	https://paperswithcode.com/paper/deep-relevance-ranking-using-enhanced
Repo	https://github.com/nlpaueb/deep-relevance-ranking
Framework	tf

High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking


Title	High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking
Authors	Fan Wang, Sach Mukherjee, Sylvia Richardson, Steven M. Hill
Abstract	Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well-developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper we present a large-scale comparison of penalized regression methods. We distinguish between three related goals: prediction, variable selection and variable ranking. Our results span more than 2,300 data-generating scenarios, including both synthetic and semi-synthetic data (real covariates and simulated responses), allowing us to systematically consider the influence of various factors (sample size, dimensionality, sparsity, signal strength and multicollinearity). We consider several widely-used approaches (Lasso, Adaptive Lasso, Elastic Net, Ridge Regression, SCAD, the Dantzig Selector and Stability Selection). We find considerable variation in performance between methods. Our results support a ‘no panacea’ view, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods. The study allows us to make some recommendations as to which approaches may be most (or least) suitable given the goal and some data characteristics. Our empirical results complement existing theory and provide a resource to compare methods across a range of scenarios and metrics.
Tasks
Published	2018-08-02
URL	https://arxiv.org/abs/1808.00723v2
PDF	https://arxiv.org/pdf/1808.00723v2.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-regression-in-practice-an
Repo	https://github.com/fw307/high_dimensional_regression_comparison
Framework	none

A Constraint-Based Algorithm For Causal Discovery with Cycles, Latent Variables and Selection Bias


Title	A Constraint-Based Algorithm For Causal Discovery with Cycles, Latent Variables and Selection Bias
Authors	Eric V. Strobl
Abstract	Causal processes in nature may contain cycles, and real datasets may violate causal sufficiency as well as contain selection bias. No constraint-based causal discovery algorithm can currently handle cycles, latent variables and selection bias (CLS) simultaneously. I therefore introduce an algorithm called Cyclic Causal Inference (CCI) that makes sound inferences with a conditional independence oracle under CLS, provided that we can represent the cyclic causal process as a non-recursive linear structural equation model with independent errors. Empirical results show that CCI outperforms CCD in the cyclic case as well as rivals FCI and RFCI in the acyclic case.
Tasks	Causal Discovery, Causal Inference
Published	2018-05-05
URL	http://arxiv.org/abs/1805.02087v1
PDF	http://arxiv.org/pdf/1805.02087v1.pdf
PWC	https://paperswithcode.com/paper/a-constraint-based-algorithm-for-causal
Repo	https://github.com/ericstrobl/CCI
Framework	none

A Semi-Supervised Data Augmentation Approach using 3D Graphical Engines


Title	A Semi-Supervised Data Augmentation Approach using 3D Graphical Engines
Authors	Shuangjun Liu, Sarah Ostadabbas
Abstract	Deep learning approaches have been rapidly adopted across a wide range of fields because of their accuracy and flexibility, but require large labeled training datasets. This presents a fundamental problem for applications with limited, expensive, or private data (i.e. small data), such as human pose and behavior estimation/tracking which could be highly personalized. In this paper, we present a semi-supervised data augmentation approach that can synthesize large scale labeled training datasets using 3D graphical engines based on a physically-valid low dimensional pose descriptor. To evaluate the performance of our synthesized datasets in training deep learning-based models, we generated a large synthetic human pose dataset, called ScanAva using 3D scans of only 7 individuals based on our proposed augmentation approach. A state-of-the-art human pose estimation deep learning model then was trained from scratch using our ScanAva dataset and could achieve the pose estimation accuracy of 91.2% at PCK0.5 criteria after applying an efficient domain adaptation on the synthetic images, in which its pose estimation accuracy was comparable to the same model trained on large scale pose data from real humans such as MPII dataset and much higher than the model trained on other synthetic human dataset such as SURREAL.
Tasks	Data Augmentation, Domain Adaptation, Pose Estimation
Published	2018-08-08
URL	http://arxiv.org/abs/1808.02595v2
PDF	http://arxiv.org/pdf/1808.02595v2.pdf
PWC	https://paperswithcode.com/paper/a-semi-supervised-data-augmentation-approach
Repo	https://github.com/ostadabbas/ScanAvaGenerationToolkit
Framework	none

Explainable Reasoning over Knowledge Graphs for Recommendation


Title	Explainable Reasoning over Knowledge Graphs for Recommendation
Authors	Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, Tat-Seng Chua
Abstract	Incorporating knowledge graph into recommender systems has attracted increasing attention in recent years. By exploring the interlinks within a knowledge graph, the connectivity between users and items can be discovered as paths, which provide rich and complementary information to user-item interactions. Such connectivity not only reveals the semantics of entities and relations, but also helps to comprehend a user’s interest. However, existing efforts have not fully explored this connectivity to infer user preferences, especially in terms of modeling the sequential dependencies within and holistic semantics of a path. In this paper, we contribute a new model named Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graph for recommendation. KPRN can generate path representations by composing the semantics of both entities and relations. By leveraging the sequential dependencies within a path, we allow effective reasoning on paths to infer the underlying rationale of a user-item interaction. Furthermore, we design a new weighted pooling operation to discriminate the strengths of different paths in connecting a user with an item, endowing our model with a certain level of explainability. We conduct extensive experiments on two datasets about movie and music, demonstrating significant improvements over state-of-the-art solutions Collaborative Knowledge Base Embedding and Neural Factorization Machine.
Tasks	Knowledge Graphs, Recommendation Systems
Published	2018-11-12
URL	http://arxiv.org/abs/1811.04540v1
PDF	http://arxiv.org/pdf/1811.04540v1.pdf
PWC	https://paperswithcode.com/paper/explainable-reasoning-over-knowledge-graphs
Repo	https://github.com/BaeSeulki/WhySoMuch
Framework	none

Context-Free Transductions with Neural Stacks


Title	Context-Free Transductions with Neural Stacks
Authors	Yiding Hao, William Merrill, Dana Angluin, Robert Frank, Noah Amsel, Andrew Benz, Simon Mendelsohn
Abstract	This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex networks often find approximate solutions by using the stack as unstructured memory.
Tasks	Language Modelling
Published	2018-09-08
URL	http://arxiv.org/abs/1809.02836v1
PDF	http://arxiv.org/pdf/1809.02836v1.pdf
PWC	https://paperswithcode.com/paper/context-free-transductions-with-neural-stacks
Repo	https://github.com/viking-sudo-rm/StackNN
Framework	pytorch
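
Two of the tasks named in the abstract above, string reversal and cumulative XOR evaluation, have simple reference solutions. The sketch below shows the kind of discrete, stack-based strategy that the abstract says the networks can discover; it is an ordinary Python illustration with hypothetical function names, not the neural stack model from the paper or its repository.

```python
def reverse_with_stack(symbols):
    """Reverse a sequence the way a pushdown transducer would:
    push every input symbol, then pop until the stack is empty."""
    stack = []
    for symbol in symbols:
        stack.append(symbol)      # push phase
    output = []
    while stack:
        output.append(stack.pop())  # pop phase emits symbols in reverse
    return output

def cumulative_xor(bits):
    """Emit, at each position, the XOR of all bits seen so far."""
    output = []
    running = 0
    for bit in bits:
        running ^= bit
        output.append(running)
    return output

reversed_abc = reverse_with_stack(list("abc"))
xor_prefixes = cumulative_xor([1, 0, 1, 1])
```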