Paper Group AWR 144
Jointly Learning Explainable Rules for Recommendation with Knowledge Graph
Title | Jointly Learning Explainable Rules for Recommendation with Knowledge Graph |
Authors | Weizhi Ma, Min Zhang, Yue Cao, Woojeong Jin, Chenyang Wang, Yiqun Liu, Shaoping Ma, Xiang Ren |
Abstract | Explainability and effectiveness are two key aspects of building recommender systems. Prior efforts mostly focus on incorporating side information to achieve better recommendation performance. However, these methods have some weaknesses: (1) predictions of neural network-based embedding methods are hard to explain and debug; (2) symbolic, graph-based approaches (e.g., meta-path-based models) require manual effort and domain knowledge to define patterns and rules, and ignore item association types (e.g., substitutable and complementary). In this paper, we propose a novel joint learning framework to integrate induction of explainable rules from a knowledge graph with construction of a rule-guided neural recommendation model. The framework encourages the two modules to complement each other in generating effective and explainable recommendations: 1) inductive rules, mined from item-centric knowledge graphs, summarize common multi-hop relational patterns for inferring different item associations and provide human-readable explanations for model predictions; 2) the recommendation module can be augmented by the induced rules and thus generalizes better in cold-start settings. Extensive experiments (code and data can be found at https://github.com/THUIR/RuleRec) show that our proposed method achieves significant improvements in item recommendation over baselines on real-world datasets. Our model demonstrates robust performance over “noisy” item knowledge graphs, generated by linking item names to related entities. |
Tasks | Knowledge Graphs, Recommendation Systems |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03714v1 |
http://arxiv.org/pdf/1903.03714v1.pdf | |
PWC | https://paperswithcode.com/paper/jointly-learning-explainable-rules-for |
Repo | https://github.com/THUIR/RuleRec |
Framework | none |
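The rule-induction side of the framework can be pictured with a toy example. The sketch below is not the authors' RuleRec code; the tiny knowledge graph, the confidence score, and all names are made up for illustration. It enumerates 2-hop relation patterns between item pairs and ranks them by how often they connect items that are known to be associated.

```python
# Illustrative rule mining over a toy item-centric knowledge graph:
# find 2-hop relation patterns that tend to connect associated item pairs.
from collections import defaultdict
from itertools import product

triples = [                                     # (head, relation, tail)
    ("item_a", "brand", "acme"), ("item_b", "brand", "acme"),
    ("item_a", "category", "phone"), ("item_c", "category", "phone"),
    ("item_b", "category", "charger"),
]
associated = {("item_a", "item_b")}             # observed item associations

out_edges = defaultdict(list)                   # entity -> [(relation, neighbour)]
for h, r, t in triples:
    out_edges[h].append((r, t))
    out_edges[t].append((r + "^-1", h))         # inverse edges enable 2-hop paths

def two_hop_rules(src, dst):
    """Relation patterns (r1, r2) such that src -r1-> x -r2-> dst."""
    rules = set()
    for r1, x in out_edges[src]:
        for r2, y in out_edges[x]:
            if y == dst:
                rules.add((r1, r2))
    return rules

items = ["item_a", "item_b", "item_c"]
support = defaultdict(int)                      # rule links an associated pair
coverage = defaultdict(int)                     # rule links any pair
for i, j in product(items, items):
    if i == j:
        continue
    for rule in two_hop_rules(i, j):
        coverage[rule] += 1
        if (i, j) in associated or (j, i) in associated:
            support[rule] += 1

# rank rules by a crude confidence score (support / coverage)
for rule in sorted(coverage, key=lambda r: support[r] / coverage[r], reverse=True):
    print(rule, support[rule], "/", coverage[rule])
```

In the full method, rules mined in this spirit are then used to guide and augment the neural recommendation module, which is what provides the human-readable explanations.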
ToyArchitecture: Unsupervised Learning of Interpretable Models of the World
Title | ToyArchitecture: Unsupervised Learning of Interpretable Models of the World |
Authors | Jaroslav Vítků, Petr Dluhoš, Joseph Davidson, Matěj Nikl, Simon Andersson, Přemysl Paška, Jan Šinkora, Petr Hlubuček, Martin Stránský, Martin Hyben, Martin Poliak, Jan Feyereisl, Marek Rosa |
Abstract | Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are usually uncomputable, incompatible with theories of biological intelligence, or lack practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big picture view, while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, but also form the core of a framework containing all the basic mechanisms, into which it should be easier to integrate additional required functionality. In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one’s own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference. |
Tasks | |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08772v2 |
http://arxiv.org/pdf/1903.08772v2.pdf | |
PWC | https://paperswithcode.com/paper/toyarchitecture-unsupervised-learning-of |
Repo | https://github.com/GoodAI/torchsim |
Framework | pytorch |
Interaction-aware Factorization Machines for Recommender Systems
Title | Interaction-aware Factorization Machines for Recommender Systems |
Authors | Fuxing Hong, Dongbo Huang, Ge Chen |
Abstract | The Factorization Machine (FM) is a widely used supervised learning approach that effectively models feature interactions. Despite the successful application of FM and its many deep learning variants, treating every feature interaction equally may degrade performance. For example, the interactions of a useless feature may introduce noise; the importance of a feature may also differ when it interacts with different features. In this work, we propose a novel model named Interaction-aware Factorization Machine (IFM) by introducing an Interaction-Aware Mechanism (IAM), which comprises a feature aspect and a field aspect, to learn flexible interactions on two levels. The feature aspect learns feature interaction importance via an attention network, while the field aspect learns the feature interaction effect as a parametric similarity between the feature interaction vector and the corresponding field interaction prototype. IFM introduces more structured control and learns feature interaction importance in a stratified manner, which allows for more leverage in tweaking the interactions on both feature-wise and field-wise levels. In addition, we give a more generalized architecture and propose the Interaction-aware Neural Network (INN) and DeepIFM to capture higher-order interactions. To further improve both the performance and efficiency of IFM, a sampling scheme is developed to select interactions based on field-aspect importance. Experimental results on two well-known datasets show the superiority of the proposed models over state-of-the-art methods. |
Tasks | Recommendation Systems |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09757v1 |
http://arxiv.org/pdf/1902.09757v1.pdf | |
PWC | https://paperswithcode.com/paper/interaction-aware-factorization-machines-for |
Repo | https://github.com/cstur4/interaction-aware-factorization-machines |
Framework | tf |
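The feature aspect of IAM is essentially an attention network over pairwise interaction vectors. Below is a rough NumPy forward pass of that idea; the field aspect, DeepIFM, and all training code are omitted, and the embedding sizes, attention MLP, and parameter names are illustrative rather than taken from the paper.

```python
# Rough forward pass of an attention-weighted factorization machine
# (feature-aspect flavour only), on binary features identified by index.
import numpy as np

rng = np.random.default_rng(0)
n_features, k, attn_dim = 6, 8, 4

V = rng.normal(size=(n_features, k))          # feature embeddings
w = rng.normal(size=n_features)               # linear terms
W_attn = rng.normal(size=(k, attn_dim))       # attention MLP weights
h_attn = rng.normal(size=attn_dim)
p = rng.normal(size=k)                        # projection of interaction vectors

def predict(active):
    """Score a sample given the indices of its active (binary) features."""
    pairs = [(i, j) for a, i in enumerate(active) for j in active[a + 1:]]
    inter = np.stack([V[i] * V[j] for i, j in pairs])      # (P, k) interactions
    scores = np.tanh(inter @ W_attn) @ h_attn               # (P,) attention logits
    alpha = np.exp(scores) / np.exp(scores).sum()           # attention weights
    return w[list(active)].sum() + alpha @ (inter @ p)

print(predict([0, 2, 5]))
```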
Exploring TD error as a heuristic for $\sigma$ selection in Q($\sigma$, $\lambda$)
Title | Exploring TD error as a heuristic for $\sigma$ selection in Q($\sigma$, $\lambda$) |
Authors | Abhishek Nan |
Abstract | In the landscape of TD algorithms, Q($\sigma$, $\lambda$) is an algorithm that can perform a multistep backup in an online manner while also unifying sampling with taking the expectation across all actions for a state. $\sigma \in [0, 1]$ indicates the extent to which sampling is used. The value of $\sigma$ can be selected based on characteristics of the current state rather than being held constant or varied purely with time. This report explores the viability of one such scheme, based on the TD error. |
Tasks | |
Published | 2019-12-21 |
URL | https://arxiv.org/abs/1912.10316v1 |
https://arxiv.org/pdf/1912.10316v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-td-error-as-a-heuristic-for |
Repo | https://github.com/abnan/CMPUT_609_Project |
Framework | none |
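To make the setting concrete, here is a simplified one-step tabular Q(σ) update (no eligibility traces, so not the full Q(σ, λ) algorithm) together with one plausible TD-error-based heuristic for σ; the exact mapping from TD error to σ studied in the report may differ.

```python
# One-step tabular Q(sigma) on a toy chain, with sigma chosen per state
# from the magnitude of the last TD error seen there (a hypothetical rule).
import numpy as np

n_states, n_actions = 4, 2
alpha, gamma, eps = 0.1, 0.95, 0.3
Q = np.zeros((n_states, n_actions))
last_abs_td = np.zeros(n_states)                 # per-state record of |TD error|

def epsilon_greedy(s, rng):
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def sigma_from_td(s):
    # hypothetical heuristic: larger recent TD error -> rely more on sampling
    return float(np.clip(last_abs_td[s], 0.0, 1.0))

def q_sigma_update(s, a, r, s_next, a_next, done):
    pi = np.full(n_actions, eps / n_actions)
    pi[np.argmax(Q[s_next])] += 1.0 - eps        # eps-greedy policy at s_next
    sigma = sigma_from_td(s)
    sample_backup = Q[s_next, a_next]            # Sarsa-style (sampled) target
    expected_backup = pi @ Q[s_next]             # Expected-Sarsa target
    target = r if done else r + gamma * (sigma * sample_backup
                                         + (1.0 - sigma) * expected_backup)
    td = target - Q[s, a]
    Q[s, a] += alpha * td
    last_abs_td[s] = abs(td)

def step(s, a):
    """Toy chain: action 1 moves right, 0 moves left; reward 1 at the right end."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0), s2 == n_states - 1

rng = np.random.default_rng(0)
for _ in range(100):                             # short training run
    s, a, done = 0, epsilon_greedy(0, rng), False
    for _ in range(500):                         # cap episode length
        s2, r, done = step(s, a)
        a2 = epsilon_greedy(s2, rng)
        q_sigma_update(s, a, r, s2, a2, done)
        s, a = s2, a2
        if done:
            break
print(np.round(Q, 2))
```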
Learning Triggers for Heterogeneous Treatment Effects
Title | Learning Triggers for Heterogeneous Treatment Effects |
Authors | Christopher Tran, Elena Zheleva |
Abstract | The causal effect of a treatment can vary from person to person based on their individual characteristics and predispositions. Mining for patterns of individual-level effect differences, a problem known as heterogeneous treatment effect estimation, has many important applications, from precision medicine to recommender systems. In this paper we define and study a variant of this problem in which an individual-level threshold in treatment needs to be reached in order to trigger an effect. One of the main contributions of our work is that we not only estimate heterogeneous treatment effects with fixed treatments but also prescribe individualized treatments. We propose a tree-based learning method to find the heterogeneity in the treatment effects. Our experimental results on multiple datasets show that our approach can learn the triggers better than existing approaches. |
Tasks | Recommendation Systems |
Published | 2019-01-31 |
URL | https://arxiv.org/abs/1902.00087v4 |
https://arxiv.org/pdf/1902.00087v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-triggers-for-heterogeneous-treatment |
Repo | https://github.com/chris-tran-16/CTL |
Framework | none |
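The trigger-search step inside a single tree node can be pictured as a scan over candidate treatment thresholds. The toy code below uses synthetic data, a plain difference-in-means effect estimate, and an arbitrary quantile grid; the paper's splitting and validation criteria are more careful, so this only sketches the idea.

```python
# Scan candidate triggers of a continuous treatment and keep the threshold
# with the largest estimated effect (difference in mean outcomes).
import numpy as np

rng = np.random.default_rng(1)
dose = rng.uniform(0, 10, size=500)                   # continuous treatment
# outcome jumps once the dose exceeds a true (hidden) trigger of 6.0
outcome = 0.5 * rng.normal(size=500) + (dose >= 6.0) * 2.0

def best_trigger(dose, outcome, min_group=30):
    best_t, best_effect = None, -np.inf
    for t in np.quantile(dose, np.linspace(0.1, 0.9, 17)):
        treated = dose >= t
        if treated.sum() < min_group or (~treated).sum() < min_group:
            continue                                   # keep both groups sizeable
        effect = outcome[treated].mean() - outcome[~treated].mean()
        if effect > best_effect:
            best_t, best_effect = t, effect
    return best_t, best_effect

t_hat, effect_hat = best_trigger(dose, outcome)
print(f"estimated trigger ~ {t_hat:.2f}, effect ~ {effect_hat:.2f}")
```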
Iterative Spectral Method for Alternative Clustering
Title | Iterative Spectral Method for Alternative Clustering |
Authors | Chieh Wu, Stratis Ioannidis, Mario Sznaier, Xiangyu Li, David Kaeli, Jennifer G. Dy |
Abstract | Given a dataset and an existing clustering as input, alternative clustering aims to find an alternative partition. One of the state-of-the-art approaches is Kernel Dimension Alternative Clustering (KDAC). We propose a novel Iterative Spectral Method (ISM) that greatly improves the scalability of KDAC. Our algorithm is intuitive, relies on easily implementable spectral decompositions, and comes with theoretical guarantees. Its computation time improves upon existing implementations of KDAC by as much as 5 orders of magnitude. |
Tasks | |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03441v1 |
https://arxiv.org/pdf/1909.03441v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-spectral-method-for-alternative |
Repo | https://github.com/neu-spiral/ISM |
Framework | none |
Robust Inference via Generative Classifiers for Handling Noisy Labels
Title | Robust Inference via Generative Classifiers for Handling Noisy Labels |
Authors | Kimin Lee, Sukmin Yun, Kibok Lee, Honglak Lee, Bo Li, Jinwoo Shin |
Abstract | Large-scale datasets may contain significant proportions of noisy (incorrect) class labels, and it is well known that modern deep neural networks (DNNs) generalize poorly from such noisy training datasets. To mitigate the issue, we propose a novel inference method, termed Robust Generative classifier (RoG), applicable to any discriminative (e.g., softmax) neural classifier pre-trained on noisy datasets. In particular, we induce a generative classifier on top of the hidden feature spaces of the pre-trained DNNs to obtain a more robust decision boundary. By estimating the parameters of the generative classifier with the minimum covariance determinant estimator, we significantly improve classification accuracy without re-training the deep model or changing its architecture. Under the assumption of Gaussian-distributed features, we prove that RoG generalizes better than the baselines under noisy labels. Finally, we propose an ensemble version of RoG to improve its performance by investigating the layer-wise characteristics of DNNs. Our extensive experimental results demonstrate the superiority of RoG across different learning models optimized by several training techniques, handling diverse scenarios of noisy labels. |
Tasks | |
Published | 2019-01-31 |
URL | https://arxiv.org/abs/1901.11300v2 |
https://arxiv.org/pdf/1901.11300v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-inference-via-generative-classifiers |
Repo | https://github.com/pokaxpoka/RoGNoisyLabel |
Framework | pytorch |
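A stripped-down version of the inference recipe, with synthetic 2-D features standing in for the hidden activations of a pre-trained network, might look as follows. It uses scikit-learn's MinCovDet and a tied covariance; it is an illustration of the idea, not the paper's implementation, whose estimator details may differ.

```python
# Induce a robust generative (LDA-like) classifier on top of fixed features,
# using the minimum covariance determinant (MCD) estimator per class.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
# two classes with a few "noisy-label" outliers mixed into each
X0 = rng.normal(loc=[0, 0], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[4, 4], scale=1.0, size=(200, 2))
X0[:20] = rng.normal(loc=[4, 4], size=(20, 2))       # mislabelled points
X1[:20] = rng.normal(loc=[0, 0], size=(20, 2))

means, covs = [], []
for Xc in (X0, X1):
    mcd = MinCovDet(random_state=0).fit(Xc)          # robust mean / covariance
    means.append(mcd.location_)
    covs.append(mcd.covariance_)
shared_cov = np.mean(covs, axis=0)                   # tied covariance (LDA-style)
prec = np.linalg.inv(shared_cov)

def predict(x):
    # pick the class whose robust Gaussian gives the highest score
    scores = [-(x - m) @ prec @ (x - m) for m in means]
    return int(np.argmax(scores))

print(predict(np.array([0.5, 0.2])), predict(np.array([3.8, 4.1])))
```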
Discriminative Active Learning
Title | Discriminative Active Learning |
Authors | Daniel Gissin, Shai Shalev-Shwartz |
Abstract | We propose a new batch-mode active learning algorithm designed for neural networks and large query batch sizes. The method, Discriminative Active Learning (DAL), poses active learning as a binary classification task, choosing examples to label in such a way as to make the labeled set and the unlabeled pool indistinguishable. Experimenting on image classification tasks, we empirically show our method to be on par with state-of-the-art methods at medium and large query batch sizes, while being simple to implement and to extend to domains beyond classification. Our experiments also show that none of today's state-of-the-art methods is clearly better than uncertainty sampling when the batch size is relatively large, negating some of the results reported in the recent literature. |
Tasks | Active Learning, Image Classification |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06347v1 |
https://arxiv.org/pdf/1907.06347v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-active-learning-1 |
Repo | https://github.com/dsgissin/DiscriminativeActiveLearning |
Framework | tf |
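The query rule can be sketched in a few lines: train a classifier to distinguish labelled from unlabelled examples and query the unlabelled points it most confidently calls "unlabelled". The sketch below uses a logistic regression on raw features rather than the learned neural representation used in the paper, and the pool sizes are arbitrary.

```python
# Discriminative-active-learning-style query: label-vs-pool classifier,
# then pick the most "unlabelled-looking" points.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                      # feature pool
labeled_idx = list(rng.choice(1000, size=50, replace=False))
unlabeled_idx = [i for i in range(1000) if i not in set(labeled_idx)]

def dal_query(X, labeled_idx, unlabeled_idx, batch_size=10):
    Xd = np.vstack([X[labeled_idx], X[unlabeled_idx]])
    yd = np.concatenate([np.zeros(len(labeled_idx)),      # 0 = labelled
                         np.ones(len(unlabeled_idx))])    # 1 = unlabelled
    clf = LogisticRegression(max_iter=1000).fit(Xd, yd)
    p_unlabeled = clf.predict_proba(X[unlabeled_idx])[:, 1]
    order = np.argsort(-p_unlabeled)                 # most "unlabelled" first
    return [unlabeled_idx[i] for i in order[:batch_size]]

print(dal_query(X, labeled_idx, unlabeled_idx))
```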
Variational Adversarial Active Learning
Title | Variational Adversarial Active Learning |
Authors | Samarth Sinha, Sayna Ebrahimi, Trevor Darrell |
Abstract | Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Unlike conventional active learning algorithms, our approach is task agnostic, i.e., it does not depend on the performance of the task for which we are trying to acquire labeled data. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The mini-max game between the VAE and the adversarial network is played such that while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low dimensional latent space in large-scale settings and provides for a computationally efficient sampling method. Our code is available at https://github.com/sinhasam/vaal. |
Tasks | Active Learning, Image Classification, Semantic Segmentation |
Published | 2019-03-31 |
URL | https://arxiv.org/abs/1904.00370v3 |
https://arxiv.org/pdf/1904.00370v3.pdf | |
PWC | https://paperswithcode.com/paper/variational-adversarial-active-learning |
Repo | https://github.com/sinhasam/vaal |
Framework | pytorch |
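The structure of the objective can be pictured with a loss-level skeleton. Everything below is an assumption-laden sketch (tiny MLP encoder, decoder, and discriminator, MSE reconstruction, a single adversarial weight, and separate optimizers implied but not shown); it is not the released implementation at the URL above.

```python
# VAAL-style losses: a VAE embeds labelled and unlabelled data, a
# discriminator separates the pools in latent space, and the VAE also tries
# to make unlabelled latents look "labelled". Query rule at the bottom.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_in, d_z = 32, 8
enc = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, 2 * d_z))
dec = nn.Sequential(nn.Linear(d_z, 64), nn.ReLU(), nn.Linear(64, d_in))
disc = nn.Sequential(nn.Linear(d_z, 32), nn.ReLU(), nn.Linear(32, 1))

def encode(x):
    mu, logvar = enc(x).chunk(2, dim=-1)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation
    return z, mu, logvar

def vaal_losses(x_lab, x_unlab, adv_weight=1.0):
    z_l, mu_l, lv_l = encode(x_lab)
    z_u, mu_u, lv_u = encode(x_unlab)
    recon = F.mse_loss(dec(z_l), x_lab) + F.mse_loss(dec(z_u), x_unlab)
    kl = -0.5 * torch.mean(1 + lv_l - mu_l.pow(2) - lv_l.exp()) \
         - 0.5 * torch.mean(1 + lv_u - mu_u.pow(2) - lv_u.exp())
    d_l, d_u = disc(z_l), disc(z_u)
    # VAE term: make *both* pools look "labelled" (label 1) to the discriminator
    fool = F.binary_cross_entropy_with_logits(d_l, torch.ones_like(d_l)) + \
           F.binary_cross_entropy_with_logits(d_u, torch.ones_like(d_u))
    vae_loss = recon + kl + adv_weight * fool
    # discriminator term: separate the pools (labelled -> 1, unlabelled -> 0)
    d_loss = F.binary_cross_entropy_with_logits(disc(z_l.detach()), torch.ones_like(d_l)) + \
             F.binary_cross_entropy_with_logits(disc(z_u.detach()), torch.zeros_like(d_u))
    return vae_loss, d_loss

def query(x_unlab_pool, k=10):
    # query the unlabelled points the discriminator is least sure are "labelled"
    with torch.no_grad():
        scores = torch.sigmoid(disc(encode(x_unlab_pool)[0])).squeeze(-1)
    return torch.topk(-scores, k).indices

x_lab, x_unlab = torch.randn(16, d_in), torch.randn(64, d_in)
print([v.item() for v in vaal_losses(x_lab, x_unlab)], query(x_unlab).shape)
```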
Molecular geometry prediction using a deep generative graph neural network
Title | Molecular geometry prediction using a deep generative graph neural network |
Authors | Elman Mansimov, Omar Mahmood, Seokho Kang, Kyunghyun Cho |
Abstract | A molecule’s geometry, also known as its conformation, is one of its most important properties, determining the reactions it participates in, the bonds it forms, and the interactions it has with other molecules. Conventional conformation generation methods minimize hand-designed molecular force-field energy functions that are often not well correlated with the true energy function of a molecule observed in nature. They generate geometrically diverse sets of conformations, some of which are very similar to the lowest-energy conformations and others of which are very different. In this paper, we propose a conditional deep generative graph neural network that learns an energy function by directly learning to generate molecular conformations that are energetically favorable and more likely to be observed experimentally, in a data-driven manner. On three large-scale datasets containing small molecules, we show that our method generates a set of conformations that, on average, is far more likely to be close to the corresponding reference conformations than are those obtained from conventional force-field methods. Our method maintains geometric diversity by generating conformations that are not too similar to each other, and it is also computationally faster. We also show that our method can be used to provide initial coordinates for conventional force-field methods. On one of the evaluated datasets we show that this combination allows us to combine the best of both methods, yielding generated conformations that are on average close to the reference conformations, with some very similar to the reference conformations. |
Tasks | |
Published | 2019-03-31 |
URL | https://arxiv.org/abs/1904.00314v2 |
https://arxiv.org/pdf/1904.00314v2.pdf | |
PWC | https://paperswithcode.com/paper/molecular-geometry-prediction-using-a-deep |
Repo | https://github.com/nyu-dl/dl4chem-geometry |
Framework | tf |
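The abstract measures success by how close generated conformations are to reference conformations; a standard way to quantify that closeness is RMSD after optimal rigid alignment (the Kabsch algorithm). The helper below is a generic sketch of that metric and is not taken from the paper's code.

```python
# RMSD between two conformations after centring and optimal rotation (Kabsch).
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between (N, 3) coordinate sets P and Q after optimal alignment of P onto Q."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(U @ Vt))            # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    return float(np.sqrt(np.mean(np.sum((P @ R - Q) ** 2, axis=1))))

rng = np.random.default_rng(0)
ref = rng.normal(size=(12, 3))                    # "reference" conformation
rot = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
gen = ref @ rot + 0.05 * rng.normal(size=(12, 3)) # rotated, slightly noisy copy
print(kabsch_rmsd(gen, ref))                      # small RMSD expected
```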
Dynamic Spatial-Temporal Representation Learning for Crowd Flow Prediction
Title | Dynamic Spatial-Temporal Representation Learning for Crowd Flow Prediction |
Authors | Lingbo Liu, Jiajie Zhen, Guanbin Li, Geng Zhan, Liang Lin |
Abstract | As a crucial component in intelligent transportation systems, crowd flow prediction has recently attracted widespread research interest in the field of artificial intelligence (AI) with the increasing availability of large-scale traffic mobility data. Its key challenge lies in how to integrate diverse factors (such as temporal laws and spatial dependencies) to infer the evolution trend of crowd flow. To address this problem, we propose a unified neural network called Attentive Crowd Flow Machine (ACFM), which can effectively learn the spatial-temporal feature representations of crowd flow with an attention mechanism. In particular, our ACFM is composed of two progressive Convolutional Long Short-Term Memory (ConvLSTM) units connected with a convolutional layer. Specifically, the first ConvLSTM unit takes normal crowd flow features as input and generates a hidden state at each time-step, which is further fed into the connected convolutional layer for spatial attention map inference. The second ConvLSTM unit aims at learning the dynamic spatial-temporal representations from the attentionally weighted crowd flow features. Further, we develop two deep frameworks based on ACFM to predict citywide short-term/long-term crowd flow by adaptively incorporating the sequential and periodic data as well as other external influences. Extensive experiments on two standard benchmarks well demonstrate the superiority of the proposed method for crowd flow prediction. Moreover, to verify the generalization of our method, we also apply the customized framework to forecast the passenger pickup/dropoff demands and show its superior performance in this traffic prediction task. |
Tasks | Traffic Prediction |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.02902v2 |
https://arxiv.org/pdf/1909.02902v2.pdf | |
PWC | https://paperswithcode.com/paper/acfm-a-dynamic-spatial-temporal-network-for |
Repo | https://github.com/liulingbo918/ACFM |
Framework | pytorch |
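The wiring described above can be sketched in PyTorch: a first ConvLSTM produces hidden states, a 1x1 convolution turns them into a spatial attention map, and a second ConvLSTM consumes the re-weighted inputs. The ConvLSTM cell, channel counts, and output head below are generic placeholders, not the authors' ACFM implementation.

```python
# Minimal ACFM-style wiring: ConvLSTM -> spatial attention -> ConvLSTM -> head.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Generic minimal ConvLSTM cell (PyTorch has no built-in one)."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class ACFMSketch(nn.Module):
    def __init__(self, in_ch=2, hid_ch=16):
        super().__init__()
        self.lstm1 = ConvLSTMCell(in_ch, hid_ch)      # drives the attention map
        self.attn = nn.Conv2d(hid_ch, 1, kernel_size=1)
        self.lstm2 = ConvLSTMCell(in_ch, hid_ch)      # models re-weighted dynamics
        self.head = nn.Conv2d(hid_ch, in_ch, kernel_size=1)

    def forward(self, x_seq):                         # x_seq: (B, T, C, H, W)
        B, T, _, H, W = x_seq.shape
        hid = self.lstm1.hid_ch
        h1 = c1 = x_seq.new_zeros(B, hid, H, W)
        h2 = c2 = x_seq.new_zeros(B, hid, H, W)
        for t in range(T):
            x_t = x_seq[:, t]
            h1, c1 = self.lstm1(x_t, (h1, c1))
            attn = torch.sigmoid(self.attn(h1))       # (B, 1, H, W) spatial weights
            h2, c2 = self.lstm2(x_t * attn, (h2, c2))
        return self.head(h2)                          # predicted next flow map

model = ACFMSketch()
print(model(torch.randn(4, 6, 2, 32, 32)).shape)      # torch.Size([4, 2, 32, 32])
```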
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
Title | Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask |
Authors | Hattie Zhou, Janice Lan, Rosanne Liu, Jason Yosinski |
Abstract | The recent “Lottery Ticket Hypothesis” paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights. The performance of these networks often exceeds the performance of the non-sparse base model, but for reasons that were not well understood. In this paper we study the three critical components of the Lottery Ticket (LT) algorithm, showing that each may be varied significantly without impacting the overall results. Ablating these factors leads to new insights for why LT networks perform as well as they do. We show why setting weights to zero is important, how signs are all you need to make the reinitialized network train, and why masking behaves like training. Finally, we discover the existence of Supermasks, masks that can be applied to an untrained, randomly initialized network to produce a model with performance far better than chance (86% on MNIST, 41% on CIFAR-10). |
Tasks | |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01067v4 |
https://arxiv.org/pdf/1905.01067v4.pdf | |
PWC | https://paperswithcode.com/paper/deconstructing-lottery-tickets-zeros-signs |
Repo | https://github.com/emerali/LottoRBM |
Framework | pytorch |
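One way to make the Supermask idea concrete is a layer whose random weights stay frozen while only a per-weight score is trained; at forward time the scores are binarised into a mask with a straight-through gradient. This is a simplified deterministic variant of the mask-training recipe (the paper samples Bernoulli masks from learned probabilities), with arbitrary sizes and data.

```python
# Train only a binary mask over frozen, randomly initialised weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Binarize(torch.autograd.Function):
    @staticmethod
    def forward(ctx, scores):
        return (scores > 0).float()                  # hard 0/1 mask
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                              # straight-through estimator

class SupermaskLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) * 0.1,
                                   requires_grad=False)       # frozen random init
        self.scores = nn.Parameter(torch.zeros(d_out, d_in))  # learned mask scores

    def forward(self, x):
        mask = Binarize.apply(self.scores)
        return F.linear(x, self.weight * mask)

# tiny demo: the mask is learned while the weights never change
net = nn.Sequential(SupermaskLinear(20, 64), nn.ReLU(), SupermaskLinear(64, 2))
opt = torch.optim.Adam([p for p in net.parameters() if p.requires_grad], lr=1e-2)
x, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
for _ in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(net(x), y)
    loss.backward()
    opt.step()
print(loss.item())
```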
Probabilistic Noise2Void: Unsupervised Content-Aware Denoising
Title | Probabilistic Noise2Void: Unsupervised Content-Aware Denoising |
Authors | Alexander Krull, Tomas Vicar, Florian Jug |
Abstract | Today, Convolutional Neural Networks (CNNs) are the leading method for image denoising. They are traditionally trained on pairs of images, which are often hard to obtain for practical applications. This motivates self-supervised training methods such as Noise2Void (N2V) that operate on single noisy images. Self-supervised methods are, unfortunately, not competitive with models trained on image pairs. Here, we present ‘Probabilistic Noise2Void’ (PN2V), a method to train CNNs to predict per-pixel intensity distributions. Combining these with a suitable description of the noise, we obtain a complete probabilistic model for the noisy observations and true signal in every pixel. We evaluate PN2V on publicly available microscopy datasets, under a broad range of noise regimes, and achieve competitive results with respect to supervised state-of-the-art methods. |
Tasks | Denoising, Image Denoising |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00651v2 |
https://arxiv.org/pdf/1906.00651v2.pdf | |
PWC | https://paperswithcode.com/paper/190600651 |
Repo | https://github.com/juglab/pn2v |
Framework | pytorch |
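The probabilistic step can be illustrated for a single pixel: given samples describing the predicted prior over the clean signal and a noise model p(noisy | signal), the posterior-weighted (MMSE) estimate is a likelihood-weighted average of those samples. The numbers and the Gaussian noise model below are assumptions for illustration; in the real method a CNN predicts such samples for every pixel.

```python
# PN2V-style MMSE estimate for one pixel under an assumed Gaussian noise model.
import numpy as np

def mmse_estimate(noisy_value, signal_samples, noise_sigma):
    """Posterior mean of the clean signal given prior samples and the noise model."""
    # likelihood of the observed noisy value under each prior sample
    lik = np.exp(-0.5 * ((noisy_value - signal_samples) / noise_sigma) ** 2)
    weights = lik / lik.sum()
    return float(np.sum(weights * signal_samples))

rng = np.random.default_rng(0)
prior_samples = rng.normal(loc=100.0, scale=15.0, size=64)   # "network output"
print(mmse_estimate(noisy_value=130.0, signal_samples=prior_samples,
                    noise_sigma=20.0))
```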
AtLoc: Attention Guided Camera Localization
Title | AtLoc: Attention Guided Camera Localization |
Authors | Bing Wang, Changhao Chen, Chris Xiaoxuan Lu, Peijun Zhao, Niki Trigoni, Andrew Markham |
Abstract | Deep learning has achieved impressive results in camera localization, but current single-image techniques typically suffer from a lack of robustness, leading to large outliers. To some extent, this has been tackled by sequential (multi-image) or geometry-constraint approaches, which can learn to reject dynamic objects and illumination conditions to achieve better performance. In this work, we show that attention can be used to force the network to focus on more geometrically robust objects and features, achieving state-of-the-art performance on common benchmarks even when using only a single image as input. Extensive experimental evidence is provided on public indoor and outdoor datasets. Through visualization of the saliency maps, we demonstrate how the network learns to reject dynamic objects, yielding superior global camera pose regression performance. The source code is available at https://github.com/BingCS/AtLoc. |
Tasks | Camera Localization |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03557v2 |
https://arxiv.org/pdf/1909.03557v2.pdf | |
PWC | https://paperswithcode.com/paper/atloc-attention-guided-camera-localization |
Repo | https://github.com/natowi/3D-Reconstruction-with-Neural-Network |
Framework | pytorch |
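A schematic of attention-guided pose regression, not the AtLoc architecture itself (which applies a self-attention block to a ResNet feature vector): a small backbone produces a feature map, a learned spatial attention map re-weights it, and two heads regress position and orientation. All layer sizes and the rotation parameterisation are placeholders.

```python
# Attention-weighted pooling followed by translation and quaternion heads.
import torch
import torch.nn as nn

class AttnPoseNet(nn.Module):
    def __init__(self, feat_ch=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.attn = nn.Conv2d(feat_ch, 1, kernel_size=1)
        self.fc_t = nn.Linear(feat_ch, 3)       # translation
        self.fc_q = nn.Linear(feat_ch, 4)       # rotation (quaternion)

    def forward(self, img):
        f = self.backbone(img)                              # (B, C, H, W)
        w = torch.softmax(self.attn(f).flatten(2), dim=-1)  # (B, 1, H*W)
        pooled = (f.flatten(2) * w).sum(-1)                 # attention-weighted pooling
        q = self.fc_q(pooled)
        return self.fc_t(pooled), q / q.norm(dim=-1, keepdim=True)

t, q = AttnPoseNet()(torch.randn(2, 3, 64, 64))
print(t.shape, q.shape)            # torch.Size([2, 3]) torch.Size([2, 4])
```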
Who’s Afraid of Adversarial Queries? The Impact of Image Modifications on Content-based Image Retrieval
Title | Who’s Afraid of Adversarial Queries? The Impact of Image Modifications on Content-based Image Retrieval |
Authors | Zhuoran Liu, Zhengyu Zhao, Martha Larson |
Abstract | An adversarial query is an image that has been modified to disrupt content-based image retrieval (CBIR) while appearing nearly untouched to the human eye. This paper presents an analysis of adversarial queries for CBIR based on neural, local, and global features. We introduce an innovative neural image perturbation approach, called Perturbations for Image Retrieval Error (PIRE), that is capable of blocking neural-feature-based CBIR. PIRE differs significantly from existing approaches that create images adversarial with respect to CNN classifiers because it is unsupervised, i.e., it needs no labelled data from the data set to which it is applied. Our experimental analysis demonstrates the surprising effectiveness of PIRE in blocking CBIR, and also covers aspects of PIRE that must be taken into account in practical settings, including saving images, image quality and leaking adversarial queries into the background collection. Our experiments also compare PIRE (a neural approach) with existing keypoint removal and injection approaches (which modify local features). Finally, we discuss the challenges that face multimedia researchers in the future study of adversarial queries. |
Tasks | Content-Based Image Retrieval, Image Retrieval |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10332v3 |
http://arxiv.org/pdf/1901.10332v3.pdf | |
PWC | https://paperswithcode.com/paper/whos-afraid-of-adversarial-queries-the-impact |
Repo | https://github.com/liuzrcc/PIRE |
Framework | pytorch |
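The unsupervised flavour of the attack can be conveyed with a toy example: optimise a small perturbation so that the CNN features of the perturbed image move away from those of the original, with no labels involved. The feature extractor below is a random stand-in for a real retrieval network, and the perturbation budget, step size, and iteration count are arbitrary; this is not the PIRE code from the repo above.

```python
# Unsupervised feature-space perturbation: minimise cosine similarity between
# features of the perturbed image and features of the original.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_net = nn.Sequential(                            # placeholder feature extractor
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
for p in feat_net.parameters():
    p.requires_grad_(False)

image = torch.rand(1, 3, 64, 64)
target_feat = feat_net(image).detach()
delta = torch.zeros_like(image, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)

for _ in range(200):
    opt.zero_grad()
    adv_feat = feat_net((image + delta).clamp(0, 1))
    loss = F.cosine_similarity(adv_feat, target_feat).mean()   # push features apart
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-8 / 255, 8 / 255)              # keep the change visually small

with torch.no_grad():
    final_sim = F.cosine_similarity(feat_net((image + delta).clamp(0, 1)),
                                    target_feat).mean()
print("feature similarity after optimisation:", final_sim.item())
```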