Paper Group AWR 92
Split-door criterion: Identification of causal effects through auxiliary outcomes
Title | Split-door criterion: Identification of causal effects through auxiliary outcomes |
Authors | Amit Sharma, Jake M. Hofman, Duncan J. Watts |
Abstract | We present a method for estimating causal effects in time series data when fine-grained information about the outcome of interest is available. Specifically, we examine what we call the split-door setting, where the outcome variable can be split into two parts: one that is potentially affected by the cause being studied and another that is independent of it, with both parts sharing the same (unobserved) confounders. We show that under these conditions, the problem of identification reduces to that of testing for independence among observed variables, and present a method that uses this approach to automatically find subsets of the data that are causally identified. We demonstrate the method by estimating the causal impact of Amazon’s recommender system on traffic to product pages, finding thousands of examples within the dataset that satisfy the split-door criterion. Unlike past studies based on natural experiments that were limited to a single product category, our method applies to a large and representative sample of products viewed on the site. In line with previous work, we find that the widely-used click-through rate (CTR) metric overestimates the causal impact of recommender systems; depending on the product category, we estimate that 50-80% of the traffic attributed to recommender systems would have happened even without any recommendations. We conclude with guidelines for using the split-door criterion as well as a discussion of other contexts where the method can be applied. |
Tasks | Recommendation Systems, Time Series |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09414v2 |
http://arxiv.org/pdf/1611.09414v2.pdf | |
PWC | https://paperswithcode.com/paper/split-door-criterion-identification-of-causal |
Repo | https://github.com/amit-sharma/splitdoor-causal-criterion |
Framework | none |
A Discrete and Bounded Envy-Free Cake Cutting Protocol for Any Number of Agents
Title | A Discrete and Bounded Envy-Free Cake Cutting Protocol for Any Number of Agents |
Authors | Haris Aziz, Simon Mackenzie |
Abstract | We consider the well-studied cake cutting problem in which the goal is to find an envy-free allocation based on queries from $n$ agents. The problem has received attention in computer science, mathematics, and economics. It has been a major open problem whether there exists a discrete and bounded envy-free protocol. We resolve the problem by proposing a discrete and bounded envy-free protocol for any number of agents. The maximum number of queries required by the protocol is $n^{n^{n^{n^{n^n}}}}$. We additionally show that even if we do not run our protocol to completion, it can find in at most $n^3{(n^2)}^n$ queries a partial allocation of the cake that achieves proportionality (each agent gets at least $1/n$ of the value of the whole cake) and envy-freeness. Finally we show that an envy-free partial allocation can be computed in at most $n^3{(n^2)}^n$ queries such that each agent gets a connected piece that gives the agent at least $1/(3n)$ of the value of the whole cake. |
Tasks | |
Published | 2016-04-13 |
URL | http://arxiv.org/abs/1604.03655v12 |
http://arxiv.org/pdf/1604.03655v12.pdf | |
PWC | https://paperswithcode.com/paper/a-discrete-and-bounded-envy-free-cake-cutting |
Repo | https://github.com/cowtrix/kake |
Framework | none |
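The two fairness notions named in the abstract above can be checked mechanically once an allocation is fixed. The sketch below is illustrative only and is not the Aziz-Mackenzie protocol: the valuation matrix and the helper names `is_envy_free` and `is_proportional` are assumptions made for this example, with the whole cake normalised to value 1 for every agent.

```python
from fractions import Fraction


def is_envy_free(values):
    """values[i][j] is agent i's value for the piece given to agent j.

    The allocation is envy-free if no agent values another agent's
    piece strictly more than their own.
    """
    n = len(values)
    return all(values[i][i] >= values[i][j] for i in range(n) for j in range(n))


def is_proportional(values):
    """Each agent values their own piece at >= 1/n of the whole cake.

    Assumes every agent values the whole cake at exactly 1.
    """
    n = len(values)
    return all(values[i][i] >= Fraction(1, n) for i in range(n))


# Hypothetical valuations for three agents (rows) over three pieces (columns).
fair = [
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
]
# Agent 0 prefers the piece held by agent 1, so this allocation has envy.
unfair = [
    [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
]

print(is_envy_free(fair), is_proportional(fair))
print(is_envy_free(unfair), is_proportional(unfair))
```

Exact fractions are used so that the boundary case (an agent valuing every piece equally at 1/3) is not disturbed by floating-point rounding.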
Hybrid Collaborative Filtering with Autoencoders
Title | Hybrid Collaborative Filtering with Autoencoders |
Authors | Florian Strub, Jeremie Mary, Romaric Gaudel |
Abstract | Collaborative Filtering aims at exploiting the feedback of users to provide personalised recommendations. Such algorithms look for latent variables in a large sparse matrix of ratings. They can be enhanced by adding side information to tackle the well-known cold start problem. While Neural Networks have had tremendous success in image and speech recognition, they have received less attention in Collaborative Filtering. This is all the more surprising given that Neural Networks are able to discover latent variables in large and heterogeneous datasets. In this paper, we introduce a Collaborative Filtering Neural network architecture, CFN, which computes a non-linear Matrix Factorization from sparse rating inputs and side information. We show experimentally on the MovieLens and Douban datasets that CFN outperforms the state of the art and benefits from side information. We provide an implementation of the algorithm as a reusable plugin for Torch, a popular Neural Network framework. |
Tasks | |
Published | 2016-03-02 |
URL | http://arxiv.org/abs/1603.00806v3 |
http://arxiv.org/pdf/1603.00806v3.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-collaborative-filtering-with |
Repo | https://github.com/hojinYang/recsys_papers_using_autoencoders |
Framework | none |
TensorLy: Tensor Learning in Python
Title | TensorLy: Tensor Learning in Python |
Authors | Jean Kossaifi, Yannis Panagakis, Anima Anandkumar, Maja Pantic |
Abstract | Tensors are higher-order extensions of matrices. While matrix methods form the cornerstone of machine learning and data analysis, tensor methods have been gaining increasing traction. However, software support for tensor operations is not on the same footing. In order to bridge this gap, we have developed TensorLy, a high-level API for tensor methods and deep tensorized neural networks in Python. TensorLy aims to follow the same standards adopted by the main projects of the Python scientific community, and seamlessly integrates with them. Its BSD license makes it suitable for both academic and commercial applications. TensorLy’s backend system allows users to perform computations with NumPy, MXNet, PyTorch, TensorFlow and CuPy. They can be scaled on multiple CPU or GPU machines. In addition, using the deep-learning frameworks as backend allows users to easily design and train deep tensorized neural networks. TensorLy is available at https://github.com/tensorly/tensorly |
Tasks | |
Published | 2016-10-29 |
URL | http://arxiv.org/abs/1610.09555v2 |
http://arxiv.org/pdf/1610.09555v2.pdf | |
PWC | https://paperswithcode.com/paper/tensorly-tensor-learning-in-python |
Repo | https://github.com/tensorly/tensorly |
Framework | pytorch |
Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia
Title | Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia |
Authors | Alexander Amini, Berthold Horn, Alan Edelman |
Abstract | Convolutions have long been regarded as fundamental to applied mathematics, physics and engineering. Their mathematical elegance allows for common tasks such as numerical differentiation to be computed efficiently on large data sets. Efficient computation of convolutions is critical to artificial intelligence in real-time applications, like machine vision, where convolutions must be continuously and efficiently computed on tens to hundreds of kilobytes per second. In this paper, we explore how convolutions are used in fundamental machine vision applications. We present an accelerated n-dimensional convolution package in the high performance computing language, Julia, and demonstrate its efficacy in solving the time to contact problem for machine vision. Results are measured against synthetically generated videos and quantitatively assessed according to their mean squared error from the ground truth. We achieve over an order of magnitude decrease in compute time and allocated memory for comparable machine vision applications. All code is packaged and integrated into the official Julia Package Manager to be used in various other scenarios. |
Tasks | |
Published | 2016-12-28 |
URL | http://arxiv.org/abs/1612.08825v1 |
http://arxiv.org/pdf/1612.08825v1.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-convolutions-for-efficient-multi |
Repo | https://github.com/aamini/FastConv.jl |
Framework | none |
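The abstract above mentions numerical differentiation as a task that convolutions handle efficiently. A minimal sketch of that idea follows, written in Python with NumPy rather than Julia; it does not use or reproduce the FastConv.jl API, and the function name `central_difference` is an assumption for this example.

```python
import numpy as np


def central_difference(samples, dx):
    """Approximate the first derivative of uniformly spaced samples.

    np.convolve flips the kernel, so [1, 0, -1] yields
    samples[k + 2] - samples[k], which divided by 2 * dx is the central
    difference at index k + 1. mode="valid" drops the two end points,
    where the three-point stencil does not fit.
    """
    kernel = np.array([1.0, 0.0, -1.0]) / (2.0 * dx)
    return np.convolve(samples, kernel, mode="valid")


x = np.linspace(0.0, 1.0, 11)   # spacing dx = 0.1
f = x ** 2                      # exact derivative is 2 * x
df = central_difference(f, dx=0.1)
print(np.allclose(df, 2.0 * x[1:-1]))
```

The central difference is exact for quadratics, which makes `x ** 2` a convenient check; for general functions the error shrinks with the square of `dx`.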
Discriminative Embeddings of Latent Variable Models for Structured Data
Title | Discriminative Embeddings of Latent Variable Models for Structured Data |
Authors | Hanjun Dai, Bo Dai, Le Song |
Abstract | Kernel classifiers and regressors designed for structured data, such as sequences, trees and graphs, have significantly advanced a number of interdisciplinary areas such as computational biology and drug design. Typically, kernels are designed beforehand for a data type which either exploit statistics of the structures or make use of probabilistic generative models, and then a discriminative classifier is learned based on the kernels via convex optimization. However, such an elegant two-stage approach also limits kernel methods from scaling up to millions of data points, and from exploiting discriminative information to learn feature representations. We propose structure2vec, an effective and scalable approach for structured data representation based on the idea of embedding latent variable models into feature spaces, and learning such feature spaces using discriminative information. Interestingly, structure2vec extracts features by performing a sequence of function mappings in a way similar to graphical model inference procedures, such as mean field and belief propagation. In applications involving millions of data points, we show that structure2vec runs 2 times faster and produces models which are 10,000 times smaller, while at the same time achieving state-of-the-art predictive performance. |
Tasks | Latent Variable Models |
Published | 2016-03-17 |
URL | https://arxiv.org/abs/1603.05629v5 |
https://arxiv.org/pdf/1603.05629v5.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-embeddings-of-latent-variable |
Repo | https://github.com/LeeeWee/Note |
Framework | none |
Deep Metric Learning via Facility Location
Title | Deep Metric Learning via Facility Location |
Authors | Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, Kevin Murphy |
Abstract | Learning the representation and the similarity metric in an end-to-end fashion with deep networks has demonstrated outstanding results for clustering and retrieval. However, these recent approaches still suffer from performance degradation stemming from the local metric training procedure, which is unaware of the global structure of the embedding space. We propose a global metric learning scheme for optimizing the deep metric embedding with the learnable clustering function and the clustering metric (NMI) in a novel structured prediction framework. Our experiments on the CUB200-2011, Cars196, and Stanford online products datasets show state-of-the-art performance on both the clustering and retrieval tasks, measured by the NMI and Recall@K evaluation metrics. |
Tasks | Metric Learning, Structured Prediction |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01213v2 |
http://arxiv.org/pdf/1612.01213v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-metric-learning-via-facility-location |
Repo | https://github.com/michaelfiman/face_rec_metric_comparison |
Framework | tf |
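Recall@K, one of the two evaluation metrics named in the abstract above, can be sketched as follows. This is a common formulation and an assumption on my part about the exact variant the paper uses: a query counts as a hit if at least one of its K nearest neighbours (Euclidean distance, the query itself excluded) shares its label. The function name and toy data are hypothetical.

```python
import numpy as np


def recall_at_k(embeddings, labels, k):
    """Fraction of points with a same-label point among their k nearest neighbours."""
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    # A point must not retrieve itself.
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]
    hits = (labels[nearest] == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy 2-D embeddings: two tight clusters plus one point of class 1
# that sits next to the class-0 cluster.
embeddings = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [0.0, 2.5]]
labels = [0, 0, 1, 1, 1]

print(recall_at_k(embeddings, labels, k=1))
print(recall_at_k(embeddings, labels, k=3))
```

The full distance matrix costs quadratic memory, so this form only suits small evaluation sets; larger benchmarks would need a batched or approximate nearest-neighbour search.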
Log-time and Log-space Extreme Classification
Title | Log-time and Log-space Extreme Classification |
Authors | Kalina Jasinska, Nikos Karampatziakis |
Abstract | We present LTLS, a technique for multiclass and multilabel prediction that can perform training and inference in logarithmic time and space. LTLS embeds large classification problems into simple structured prediction problems and relies on efficient dynamic programming algorithms for inference. We train LTLS with stochastic gradient descent on a number of multiclass and multilabel datasets and show that despite its small memory footprint it is often competitive with existing approaches. |
Tasks | Structured Prediction |
Published | 2016-11-07 |
URL | http://arxiv.org/abs/1611.01964v1 |
http://arxiv.org/pdf/1611.01964v1.pdf | |
PWC | https://paperswithcode.com/paper/log-time-and-log-space-extreme-classification |
Repo | https://github.com/ievron/wltls |
Framework | none |
Revisiting Batch Normalization For Practical Domain Adaptation
Title | Revisiting Batch Normalization For Practical Domain Adaptation |
Authors | Yanghao Li, Naiyan Wang, Jianping Shi, Jiaying Liu, Xiaodi Hou |
Abstract | Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. A recent study (Tommasi et al. 2015) shows that a DNN has a strong dependency on the training dataset, and the learned features cannot be easily transferred to a different but relevant task without fine-tuning. In this paper, we propose a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN. By modulating the statistics in all Batch Normalization layers across the network, our approach achieves a deep adaptation effect for domain adaptation tasks. In contrast to other deep learning domain adaptation methods, our method does not require additional components, and is parameter-free. It achieves state-of-the-art performance despite its surprising simplicity. Furthermore, we demonstrate that our method is complementary to other existing methods. Combining AdaBN with existing domain adaptation treatments may further improve model performance. |
Tasks | Domain Adaptation, Image Classification, Object Detection |
Published | 2016-03-15 |
URL | http://arxiv.org/abs/1603.04779v4 |
http://arxiv.org/pdf/1603.04779v4.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-batch-normalization-for-practical |
Repo | https://github.com/erlendd/ddan |
Framework | tf |
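The abstract above describes AdaBN as modulating the statistics in Batch Normalization layers. One plausible reading, sketched here for a single layer, is to re-estimate the per-feature mean and variance on target-domain activations while leaving the learned scale and shift untouched. The function names, the synthetic activations, and the single-layer setup are assumptions made for illustration, not the authors' code.

```python
import numpy as np


def batch_norm(x, mean, var, gamma, beta, eps=1e-5):
    """Standard batch-norm inference transform with fixed statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


def adapt_bn_statistics(target_activations):
    """Per-feature mean and variance estimated on target-domain activations.

    No labels and no gradient steps are involved, which is why this kind
    of adaptation adds no trainable parameters.
    """
    return target_activations.mean(axis=0), target_activations.var(axis=0)


rng = np.random.default_rng(0)
# Hypothetical activations: the target domain is shifted and rescaled
# relative to the source domain the layer was trained on.
source = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))
target = rng.normal(loc=3.0, scale=2.0, size=(1000, 4))

src_mean, src_var = source.mean(axis=0), source.var(axis=0)
tgt_mean, tgt_var = adapt_bn_statistics(target)

# With source statistics the target activations stay shifted;
# with adapted statistics they are re-centred.
stale = batch_norm(target, src_mean, src_var, gamma=1.0, beta=0.0)
adapted = batch_norm(target, tgt_mean, tgt_var, gamma=1.0, beta=0.0)
print(stale.mean(axis=0))
print(adapted.mean(axis=0))
```

In a full network the same re-estimation would be repeated for every normalisation layer, typically by forwarding target-domain batches through the model.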
Directional Statistics in Machine Learning: a Brief Review
Title | Directional Statistics in Machine Learning: a Brief Review |
Authors | Suvrit Sra |
Abstract | The modern data analyst must cope with data encoded in various forms: vectors, matrices, strings, graphs, or more. Consequently, statistical and machine learning models tailored to different data encodings are important. We focus on data encoded as normalized vectors, so that their “direction” is more important than their magnitude. Specifically, we consider high-dimensional vectors that lie either on the surface of the unit hypersphere or on the real projective plane. For such data, we briefly review common mathematical models prevalent in machine learning, while also outlining some technical aspects, software, applications, and open mathematical challenges. |
Tasks | |
Published | 2016-05-01 |
URL | http://arxiv.org/abs/1605.00316v1 |
http://arxiv.org/pdf/1605.00316v1.pdf | |
PWC | https://paperswithcode.com/paper/directional-statistics-in-machine-learning-a |
Repo | https://github.com/clara-labs/spherecluster |
Framework | none |
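To make the notion of data whose direction matters more than its magnitude concrete, the sketch below projects vectors onto the unit hypersphere and computes two standard summary quantities from directional statistics: the mean direction and the mean resultant length. It is a generic illustration rather than code from the review or from the linked repository, and the function names are assumptions.

```python
import numpy as np


def to_unit_sphere(x):
    """Scale each row to unit Euclidean norm (assumes no all-zero rows)."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mean_direction(x):
    """Return (mean direction, mean resultant length) of the rows of x.

    The resultant length lies in [0, 1]; values near 1 mean the
    directions are tightly concentrated.
    """
    resultant = to_unit_sphere(x).mean(axis=0)
    length = float(np.linalg.norm(resultant))
    return resultant / length, length


vectors = [[3.0, 4.0], [0.0, 2.0]]   # magnitudes differ, directions are close
direction, concentration = mean_direction(vectors)
print(direction, concentration)
```

The mean direction is undefined when the normalised vectors cancel exactly, in which case the resultant length is 0 and the division above would fail.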
Protein-Ligand Scoring with Convolutional Neural Networks
Title | Protein-Ligand Scoring with Convolutional Neural Networks |
Authors | Matthew Ragoza, Joshua Hochuli, Elisa Idrobo, Jocelyn Sunseri, David Ryan Koes |
Abstract | Computational approaches to drug discovery can reduce the time and cost associated with experimental assays and enable the screening of novel chemotypes. Structure-based drug design methods rely on scoring functions to rank and predict binding affinities and poses. The ever-expanding amount of protein-ligand binding and structural data enables the use of deep machine learning techniques for protein-ligand scoring. We describe convolutional neural network (CNN) scoring functions that take as input a comprehensive 3D representation of a protein-ligand interaction. A CNN scoring function automatically learns the key features of protein-ligand interactions that correlate with binding. We train and optimize our CNN scoring functions to discriminate between correct and incorrect binding poses and known binders and non-binders. We find that our CNN scoring function outperforms the AutoDock Vina scoring function when ranking poses both for pose prediction and virtual screening. |
Tasks | Drug Discovery, Pose Prediction |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02751v1 |
http://arxiv.org/pdf/1612.02751v1.pdf | |
PWC | https://paperswithcode.com/paper/protein-ligand-scoring-with-convolutional |
Repo | https://github.com/gnina/gnina |
Framework | none |
Fitting a 3D Morphable Model to Edges: A Comparison Between Hard and Soft Correspondences
Title | Fitting a 3D Morphable Model to Edges: A Comparison Between Hard and Soft Correspondences |
Authors | Anil Bas, William A. P. Smith, Timo Bolkart, Stefanie Wuhrer |
Abstract | We propose a fully automatic method for fitting a 3D morphable model to single face images in arbitrary pose and lighting. Our approach relies on geometric features (edges and landmarks) and, inspired by the iterated closest point algorithm, is based on computing hard correspondences between model vertices and edge pixels. We demonstrate that this is superior to previous work that uses soft correspondences to form an edge-derived cost surface that is minimised by nonlinear optimisation. |
Tasks | |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.01125v2 |
http://arxiv.org/pdf/1602.01125v2.pdf | |
PWC | https://paperswithcode.com/paper/fitting-a-3d-morphable-model-to-edges-a |
Repo | https://github.com/waps101/3DMM_edges |
Framework | none |
Visual Dialog
Title | Visual Dialog |
Authors | Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra |
Abstract | We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in the image, infer context from the history, and answer the question accurately. Visual Dialog is disentangled enough from a specific downstream task so as to serve as a general test of machine intelligence, while being grounded in vision enough to allow objective evaluation of individual responses and benchmark progress. We develop a novel two-person chat data-collection protocol to curate a large-scale Visual Dialog dataset (VisDial). VisDial v0.9 has been released and contains 1 dialog with 10 question-answer pairs on ~120k images from COCO, with a total of ~1.2M dialog question-answer pairs. We introduce a family of neural encoder-decoder models for Visual Dialog with 3 encoders – Late Fusion, Hierarchical Recurrent Encoder and Memory Network – and 2 decoders (generative and discriminative), which outperform a number of sophisticated baselines. We propose a retrieval-based evaluation protocol for Visual Dialog where the AI agent is asked to sort a set of candidate answers and is evaluated on metrics such as the mean reciprocal rank of the human response. We quantify the gap between machine and human performance on the Visual Dialog task via human studies. Putting it all together, we demonstrate the first ‘visual chatbot’! Our dataset, code, trained models and visual chatbot are available on https://visualdialog.org |
Tasks | Chatbot, Visual Dialog |
Published | 2016-11-26 |
URL | http://arxiv.org/abs/1611.08669v5 |
http://arxiv.org/pdf/1611.08669v5.pdf | |
PWC | https://paperswithcode.com/paper/visual-dialog |
Repo | https://github.com/batra-mlp-lab/visdial |
Framework | torch |
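The retrieval-based evaluation described in the abstract above ranks a set of candidate answers and scores the position of the human response. A minimal sketch of the mean reciprocal rank computation follows; the tie-handling rule (the human response is ranked ahead of candidates with an equal score), the function names, and the example scores are assumptions for illustration and may differ from the official evaluation code.

```python
def rank_of_truth(scores, truth_index):
    """1-based rank of the ground-truth candidate; a higher score is better."""
    truth_score = scores[truth_index]
    return 1 + sum(1 for s in scores if s > truth_score)


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all evaluated questions."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical model scores for three questions, three candidates each,
# paired with the index of the human response.
examples = [
    ([0.9, 0.1, 0.3], 0),   # human response ranked 1st
    ([0.1, 0.7, 0.2], 2),   # human response ranked 2nd
    ([0.5, 0.6, 0.4], 2),   # human response ranked 3rd
]
ranks = [rank_of_truth(scores, idx) for scores, idx in examples]
print(ranks, mean_reciprocal_rank(ranks))
```

Because reciprocal rank decays quickly, the metric rewards placing the human response at or near the top far more than it penalises differences further down the list.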
STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling
Title | STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling |
Authors | Yang He, Wei-Chen Chiu, Margret Keuper, Mario Fritz |
Abstract | We propose a novel superpixel-based multi-view convolutional neural network for semantic image segmentation. The proposed network produces a high quality segmentation of a single image by leveraging information from additional views of the same scene. Particularly in indoor videos such as those captured by robotic platforms or handheld and bodyworn RGBD cameras, nearby video frames provide diverse viewpoints and additional context of objects and scenes. To leverage such information, we first compute region correspondences by optical flow and image boundary-based superpixels. Given these region correspondences, we propose a novel spatio-temporal pooling layer to aggregate information over space and time. We evaluate our approach on the NYU–Depth–V2 and the SUN3D datasets and compare it to various state-of-the-art single-view and multi-view approaches. Besides a general improvement over the state-of-the-art, we also show the benefits of making use of unlabeled frames during training for multi-view as well as single-view prediction. |
Tasks | Optical Flow Estimation, Semantic Segmentation |
Published | 2016-04-08 |
URL | http://arxiv.org/abs/1604.02388v3 |
http://arxiv.org/pdf/1604.02388v3.pdf | |
PWC | https://paperswithcode.com/paper/std2p-rgbd-semantic-segmentation-using-spatio |
Repo | https://github.com/SSAW14/STD2P |
Framework | none |
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
Title | Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network |
Authors | Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, Zehan Wang |
Abstract | Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods. |
Tasks | Image Super-Resolution, Super-Resolution, Video Super-Resolution |
Published | 2016-09-16 |
URL | http://arxiv.org/abs/1609.05158v2 |
http://arxiv.org/pdf/1609.05158v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-single-image-and-video-super |
Repo | https://github.com/XueweiMeng/derain_filter |
Framework | tf |
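The sub-pixel convolution layer described in the abstract above ends with a rearrangement step that turns low-resolution feature maps into a high-resolution output. A minimal NumPy sketch of that rearrangement follows. The channel ordering is an assumption: it follows the convention used by common deep learning libraries, where channel `c * r * r + i * r + j` supplies the pixel at row offset `i` and column offset `j`, and it may differ from the ordering in the authors' implementation. The learned convolution that produces the input channels is not shown.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange an array of shape (C * r * r, H, W) into (C, H * r, W * r).

    Each group of r * r low-resolution channels fills one r-by-r block
    of high-resolution pixels.
    """
    channels, h, w = x.shape
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c = channels // (r * r)
    x = x.reshape(c, r, r, h, w)        # axes: (c, i, j, h, w)
    x = x.transpose(0, 3, 1, 4, 2)      # axes: (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)   # row = h * r + i, column = w * r + j


# Four 2x2 feature maps become one 4x4 image for an upscaling factor of 2.
low_res = np.arange(16).reshape(4, 2, 2)
high_res = pixel_shuffle(low_res, r=2)
print(high_res.shape)
print(high_res[0])
```

Since the step is a pure reindexing, all arithmetic stays in the low-resolution space, which is the source of the computational saving the abstract refers to.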