May 7, 2019

Paper Group AWR 92

Split-door criterion: Identification of causal effects through auxiliary outcomes

Title Split-door criterion: Identification of causal effects through auxiliary outcomes
Authors Amit Sharma, Jake M. Hofman, Duncan J. Watts
Abstract We present a method for estimating causal effects in time series data when fine-grained information about the outcome of interest is available. Specifically, we examine what we call the split-door setting, where the outcome variable can be split into two parts: one that is potentially affected by the cause being studied and another that is independent of it, with both parts sharing the same (unobserved) confounders. We show that under these conditions, the problem of identification reduces to that of testing for independence among observed variables, and present a method that uses this approach to automatically find subsets of the data that are causally identified. We demonstrate the method by estimating the causal impact of Amazon’s recommender system on traffic to product pages, finding thousands of examples within the dataset that satisfy the split-door criterion. Unlike past studies based on natural experiments that were limited to a single product category, our method applies to a large and representative sample of products viewed on the site. In line with previous work, we find that the widely-used click-through rate (CTR) metric overestimates the causal impact of recommender systems; depending on the product category, we estimate that 50-80% of the traffic attributed to recommender systems would have happened even without any recommendations. We conclude with guidelines for using the split-door criterion as well as a discussion of other contexts where the method can be applied.
Tasks Recommendation Systems, Time Series
Published 2016-11-28
URL http://arxiv.org/abs/1611.09414v2
PDF http://arxiv.org/pdf/1611.09414v2.pdf
PWC https://paperswithcode.com/paper/split-door-criterion-identification-of-causal
Repo https://github.com/amit-sharma/splitdoor-causal-criterion
Framework none
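
The abstract reduces identification to an independence test between observed variables: a time window is usable when the cause is independent of the part of the outcome it cannot affect. A minimal sketch of that screening step follows. It assumes Pearson correlation as a stand-in for the independence test (which only detects linear dependence), and the function names and the `threshold` value are hypothetical, not taken from the authors' repository.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def passes_split_door(x, y_direct, threshold=0.1):
    """Keep a time window only if the cause x looks independent of the
    direct (unaffected) part of the outcome. The threshold is an
    illustrative assumption, not a value from the paper."""
    return abs(pearson(x, y_direct)) < threshold
```

A window where `x` and `y_direct` move together is rejected, since shared confounders are then likely to be driving both series.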

A Discrete and Bounded Envy-Free Cake Cutting Protocol for Any Number of Agents

Title A Discrete and Bounded Envy-Free Cake Cutting Protocol for Any Number of Agents
Authors Haris Aziz, Simon Mackenzie
Abstract We consider the well-studied cake cutting problem in which the goal is to find an envy-free allocation based on queries from $n$ agents. The problem has received attention in computer science, mathematics, and economics. It has been a major open problem whether there exists a discrete and bounded envy-free protocol. We resolve the problem by proposing a discrete and bounded envy-free protocol for any number of agents. The maximum number of queries required by the protocol is $n^{n^{n^{n^{n^n}}}}$. We additionally show that even if we do not run our protocol to completion, it can find in at most $n^3{(n^2)}^n$ queries a partial allocation of the cake that achieves proportionality (each agent gets at least $1/n$ of the value of the whole cake) and envy-freeness. Finally we show that an envy-free partial allocation can be computed in at most $n^3{(n^2)}^n$ queries such that each agent gets a connected piece that gives the agent at least $1/(3n)$ of the value of the whole cake.
Tasks
Published 2016-04-13
URL http://arxiv.org/abs/1604.03655v12
PDF http://arxiv.org/pdf/1604.03655v12.pdf
PWC https://paperswithcode.com/paper/a-discrete-and-bounded-envy-free-cake-cutting
Repo https://github.com/cowtrix/kake
Framework none
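
The two fairness notions in the abstract, and the partial-allocation query bound $n^3{(n^2)}^n$, can be checked mechanically. The sketch below is not the protocol itself. It assumes a hypothetical matrix `values[i][j]` holding agent `i`'s value for the piece given to agent `j`, and it measures proportionality against the sum of each row, which equals the whole cake only for a complete allocation.

```python
def is_envy_free(values):
    """True if no agent values another agent's piece above its own."""
    n = len(values)
    return all(values[i][i] >= values[i][j] for i in range(n) for j in range(n))

def is_proportional(values):
    """True if every agent gets at least 1/n of the total value it assigns
    to all allocated pieces (assumes the whole cake is allocated)."""
    n = len(values)
    return all(values[i][i] >= sum(values[i]) / n for i in range(n))

def partial_allocation_bound(n):
    """Query bound n^3 * (n^2)^n quoted in the abstract."""
    return n ** 3 * (n ** 2) ** n
```

For two agents the bound evaluates to 128 queries, and for three agents to 19683.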

Hybrid Collaborative Filtering with Autoencoders

Title Hybrid Collaborative Filtering with Autoencoders
Authors Florian Strub, Jeremie Mary, Romaric Gaudel
Abstract Collaborative Filtering aims at exploiting the feedback of users to provide personalised recommendations. Such algorithms look for latent variables in a large sparse matrix of ratings. They can be enhanced by adding side information to tackle the well-known cold start problem. While Neural Networks have had tremendous success in image and speech recognition, they have received less attention in Collaborative Filtering. This is all the more surprising given that Neural Networks are able to discover latent variables in large and heterogeneous datasets. In this paper, we introduce a Collaborative Filtering Neural network architecture, aka CFN, which computes a non-linear Matrix Factorization from sparse rating inputs and side information. We show experimentally on the MovieLens and Douban datasets that CFN outperforms the state of the art and benefits from side information. We provide an implementation of the algorithm as a reusable plugin for Torch, a popular Neural Network framework.
Tasks
Published 2016-03-02
URL http://arxiv.org/abs/1603.00806v3
PDF http://arxiv.org/pdf/1603.00806v3.pdf
PWC https://paperswithcode.com/paper/hybrid-collaborative-filtering-with
Repo https://github.com/hojinYang/recsys_papers_using_autoencoders
Framework none

TensorLy: Tensor Learning in Python

Title TensorLy: Tensor Learning in Python
Authors Jean Kossaifi, Yannis Panagakis, Anima Anandkumar, Maja Pantic
Abstract Tensors are higher-order extensions of matrices. While matrix methods form the cornerstone of machine learning and data analysis, tensor methods have been gaining increasing traction. However, software support for tensor operations is not on the same footing. In order to bridge this gap, we have developed TensorLy, a high-level API for tensor methods and deep tensorized neural networks in Python. TensorLy aims to follow the same standards adopted by the main projects of the Python scientific community, and seamlessly integrates with them. Its BSD license makes it suitable for both academic and commercial applications. TensorLy’s backend system allows users to perform computations with NumPy, MXNet, PyTorch, TensorFlow and CuPy. They can be scaled on multiple CPU or GPU machines. In addition, using the deep-learning frameworks as backend allows users to easily design and train deep tensorized neural networks. TensorLy is available at https://github.com/tensorly/tensorly
Tasks
Published 2016-10-29
URL http://arxiv.org/abs/1610.09555v2
PDF http://arxiv.org/pdf/1610.09555v2.pdf
PWC https://paperswithcode.com/paper/tensorly-tensor-learning-in-python
Repo https://github.com/tensorly/tensorly
Framework pytorch

Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia

Title Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia
Authors Alexander Amini, Berthold Horn, Alan Edelman
Abstract Convolutions have long been regarded as fundamental to applied mathematics, physics and engineering. Their mathematical elegance allows for common tasks such as numerical differentiation to be computed efficiently on large data sets. Efficient computation of convolutions is critical to artificial intelligence in real-time applications, like machine vision, where convolutions must be continuously and efficiently computed on tens to hundreds of kilobytes per second. In this paper, we explore how convolutions are used in fundamental machine vision applications. We present an accelerated n-dimensional convolution package in the high performance computing language, Julia, and demonstrate its efficacy in solving the time to contact problem for machine vision. Results are measured against synthetically generated videos and quantitatively assessed according to their mean squared error from the ground truth. We achieve over an order of magnitude decrease in compute time and allocated memory for comparable machine vision applications. All code is packaged and integrated into the official Julia Package Manager to be used in various other scenarios.
Tasks
Published 2016-12-28
URL http://arxiv.org/abs/1612.08825v1
PDF http://arxiv.org/pdf/1612.08825v1.pdf
PWC https://paperswithcode.com/paper/accelerated-convolutions-for-efficient-multi
Repo https://github.com/aamini/FastConv.jl
Framework none
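
The abstract notes that numerical differentiation can be expressed as a convolution. A minimal one-dimensional sketch of that idea follows. It is written in Python rather than Julia, it does not use or mirror the FastConv.jl API, and the names `conv1d_valid` and `CENTRAL_DIFF` are illustrative assumptions.

```python
def conv1d_valid(signal, kernel):
    """Discrete convolution, keeping only positions where the kernel
    fully overlaps the signal ('valid' mode)."""
    k = len(kernel)
    flipped = kernel[::-1]  # convolution flips the kernel before sliding it
    return [
        sum(flipped[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# Central-difference kernel for unit sample spacing:
# output[i] = (signal[i + 2] - signal[i]) / 2, the slope at sample i + 1.
CENTRAL_DIFF = [0.5, 0.0, -0.5]
```

Applied to samples of x squared at x = 0..4, `conv1d_valid([0, 1, 4, 9, 16], CENTRAL_DIFF)` gives the slopes at x = 1, 2, 3.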

Discriminative Embeddings of Latent Variable Models for Structured Data

Title Discriminative Embeddings of Latent Variable Models for Structured Data
Authors Hanjun Dai, Bo Dai, Le Song
Abstract Kernel classifiers and regressors designed for structured data, such as sequences, trees and graphs, have significantly advanced a number of interdisciplinary areas such as computational biology and drug design. Typically, kernels are designed beforehand for a data type which either exploit statistics of the structures or make use of probabilistic generative models, and then a discriminative classifier is learned based on the kernels via convex optimization. However, such an elegant two-stage approach also limits kernel methods from scaling up to millions of data points and from exploiting discriminative information to learn feature representations. We propose structure2vec, an effective and scalable approach for structured data representation based on the idea of embedding latent variable models into feature spaces, and learning such feature spaces using discriminative information. Interestingly, structure2vec extracts features by performing a sequence of function mappings in a way similar to graphical model inference procedures, such as mean field and belief propagation. In applications involving millions of data points, we show that structure2vec runs 2 times faster and produces models which are 10,000 times smaller, while at the same time achieving state-of-the-art predictive performance.
Tasks Latent Variable Models
Published 2016-03-17
URL https://arxiv.org/abs/1603.05629v5
PDF https://arxiv.org/pdf/1603.05629v5.pdf
PWC https://paperswithcode.com/paper/discriminative-embeddings-of-latent-variable
Repo https://github.com/LeeeWee/Note
Framework none

Deep Metric Learning via Facility Location

Title Deep Metric Learning via Facility Location
Authors Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, Kevin Murphy
Abstract Learning the representation and the similarity metric in an end-to-end fashion with deep networks has demonstrated outstanding results for clustering and retrieval. However, these recent approaches still suffer from performance degradation stemming from the local metric training procedure, which is unaware of the global structure of the embedding space. We propose a global metric learning scheme for optimizing the deep metric embedding with the learnable clustering function and the clustering metric (NMI) in a novel structured prediction framework. Our experiments on CUB200-2011, Cars196, and Stanford online products datasets show state of the art performance both on the clustering and retrieval tasks measured in the NMI and Recall@K evaluation metrics.
Tasks Metric Learning, Structured Prediction
Published 2016-12-05
URL http://arxiv.org/abs/1612.01213v2
PDF http://arxiv.org/pdf/1612.01213v2.pdf
PWC https://paperswithcode.com/paper/deep-metric-learning-via-facility-location
Repo https://github.com/michaelfiman/face_rec_metric_comparison
Framework tf
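
Recall@K, one of the two evaluation metrics named in the abstract, counts a query as a hit when at least one of its K nearest neighbours in the embedding space shares its class label. The sketch below assumes squared Euclidean distance and excludes the query itself from its neighbour list. The function name and argument layout are hypothetical and not taken from the paper's code.

```python
def recall_at_k(embeddings, labels, k):
    """Fraction of points whose k nearest neighbours (excluding the point
    itself) contain at least one point with the same label."""
    hits = 0
    for q, (eq, lq) in enumerate(zip(embeddings, labels)):
        # (squared distance, index) pairs for every other point, nearest first
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(eq, e)), idx)
            for idx, e in enumerate(embeddings)
            if idx != q
        )
        if any(labels[idx] == lq for _, idx in dists[:k]):
            hits += 1
    return hits / len(embeddings)
```

For four points forming two tight pairs, the score is 1.0 at K = 1 when each pair shares a label, and drops to 0.0 when the labels within each pair differ.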

Log-time and Log-space Extreme Classification

Title Log-time and Log-space Extreme Classification
Authors Kalina Jasinska, Nikos Karampatziakis
Abstract We present LTLS, a technique for multiclass and multilabel prediction that can perform training and inference in logarithmic time and space. LTLS embeds large classification problems into simple structured prediction problems and relies on efficient dynamic programming algorithms for inference. We train LTLS with stochastic gradient descent on a number of multiclass and multilabel datasets and show that despite its small memory footprint it is often competitive with existing approaches.
Tasks Structured Prediction
Published 2016-11-07
URL http://arxiv.org/abs/1611.01964v1
PDF http://arxiv.org/pdf/1611.01964v1.pdf
PWC https://paperswithcode.com/paper/log-time-and-log-space-extreme-classification
Repo https://github.com/ievron/wltls
Framework none

Revisiting Batch Normalization For Practical Domain Adaptation

Title Revisiting Batch Normalization For Practical Domain Adaptation
Authors Yanghao Li, Naiyan Wang, Jianping Shi, Jiaying Liu, Xiaodi Hou
Abstract Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. A recent study (Tommasi et al. 2015) shows that a DNN has a strong dependency on the training dataset, and the learned features cannot be easily transferred to a different but relevant task without fine-tuning. In this paper, we propose a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN. By modulating the statistics in all Batch Normalization layers across the network, our approach achieves a deep adaptation effect for domain adaptation tasks. In contrast to other deep learning domain adaptation methods, our method does not require additional components, and is parameter-free. It achieves state-of-the-art performance despite its surprising simplicity. Furthermore, we demonstrate that our method is complementary with other existing methods. Combining AdaBN with existing domain adaptation treatments may further improve model performance.
Tasks Domain Adaptation, Image Classification, Object Detection
Published 2016-03-15
URL http://arxiv.org/abs/1603.04779v4
PDF http://arxiv.org/pdf/1603.04779v4.pdf
PWC https://paperswithcode.com/paper/revisiting-batch-normalization-for-practical
Repo https://github.com/erlendd/ddan
Framework tf
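
The core idea of modulating Batch Normalization statistics can be sketched for a single feature channel: recompute the mean and variance on target-domain activations and reuse the learned scale and shift unchanged. This is an illustrative sketch under that reading of the abstract, not the authors' implementation. The function names, the population-variance choice, and the `eps` value are assumptions.

```python
import math

def domain_statistics(activations):
    """Mean and (population) variance of one channel's activations."""
    n = len(activations)
    mean = sum(activations) / n
    var = sum((a - mean) ** 2 for a in activations) / n
    return mean, var

def batch_norm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch-norm transform for a single value."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta

def adapt_and_normalize(target_activations, gamma=1.0, beta=0.0):
    """Normalize target-domain activations with statistics computed on the
    target domain itself; gamma and beta stay as learned on the source."""
    mean, var = domain_statistics(target_activations)
    return [batch_norm(a, mean, var, gamma, beta) for a in target_activations]
```

No new trainable parameters are introduced here, which matches the abstract's description of the method as parameter-free.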

Directional Statistics in Machine Learning: a Brief Review

Title Directional Statistics in Machine Learning: a Brief Review
Authors Suvrit Sra
Abstract The modern data analyst must cope with data encoded in various forms: vectors, matrices, strings, graphs, or more. Consequently, statistical and machine learning models tailored to different data encodings are important. We focus on data encoded as normalized vectors, so that their “direction” is more important than their magnitude. Specifically, we consider high-dimensional vectors that lie either on the surface of the unit hypersphere or on the real projective plane. For such data, we briefly review common mathematical models prevalent in machine learning, while also outlining some technical aspects, software, applications, and open mathematical challenges.
Tasks
Published 2016-05-01
URL http://arxiv.org/abs/1605.00316v1
PDF http://arxiv.org/pdf/1605.00316v1.pdf
PWC https://paperswithcode.com/paper/directional-statistics-in-machine-learning-a
Repo https://github.com/clara-labs/spherecluster
Framework none
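
Data of the kind the abstract describes is obtained by projecting vectors onto the unit hypersphere, after which the inner product (cosine similarity) depends only on direction. A minimal sketch follows. The helper names are hypothetical and unrelated to the linked repository.

```python
import math

def normalize(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]

def cosine_similarity(u, v):
    """Inner product of the two vectors after projection to the unit sphere."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))
```

Vectors of different magnitude but the same direction, such as `[2, 0]` and `[5, 0]`, have similarity 1, while orthogonal vectors have similarity 0.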

Protein-Ligand Scoring with Convolutional Neural Networks

Title Protein-Ligand Scoring with Convolutional Neural Networks
Authors Matthew Ragoza, Joshua Hochuli, Elisa Idrobo, Jocelyn Sunseri, David Ryan Koes
Abstract Computational approaches to drug discovery can reduce the time and cost associated with experimental assays and enable the screening of novel chemotypes. Structure-based drug design methods rely on scoring functions to rank and predict binding affinities and poses. The ever-expanding amount of protein-ligand binding and structural data enables the use of deep machine learning techniques for protein-ligand scoring. We describe convolutional neural network (CNN) scoring functions that take as input a comprehensive 3D representation of a protein-ligand interaction. A CNN scoring function automatically learns the key features of protein-ligand interactions that correlate with binding. We train and optimize our CNN scoring functions to discriminate between correct and incorrect binding poses and known binders and non-binders. We find that our CNN scoring function outperforms the AutoDock Vina scoring function when ranking poses both for pose prediction and virtual screening.
Tasks Drug Discovery, Pose Prediction
Published 2016-12-08
URL http://arxiv.org/abs/1612.02751v1
PDF http://arxiv.org/pdf/1612.02751v1.pdf
PWC https://paperswithcode.com/paper/protein-ligand-scoring-with-convolutional
Repo https://github.com/gnina/gnina
Framework none

Fitting a 3D Morphable Model to Edges: A Comparison Between Hard and Soft Correspondences

Title Fitting a 3D Morphable Model to Edges: A Comparison Between Hard and Soft Correspondences
Authors Anil Bas, William A. P. Smith, Timo Bolkart, Stefanie Wuhrer
Abstract We propose a fully automatic method for fitting a 3D morphable model to single face images in arbitrary pose and lighting. Our approach relies on geometric features (edges and landmarks) and, inspired by the iterated closest point algorithm, is based on computing hard correspondences between model vertices and edge pixels. We demonstrate that this is superior to previous work that uses soft correspondences to form an edge-derived cost surface that is minimised by nonlinear optimisation.
Tasks
Published 2016-02-02
URL http://arxiv.org/abs/1602.01125v2
PDF http://arxiv.org/pdf/1602.01125v2.pdf
PWC https://paperswithcode.com/paper/fitting-a-3d-morphable-model-to-edges-a
Repo https://github.com/waps101/3DMM_edges
Framework none

Visual Dialog

Title Visual Dialog
Authors Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra
Abstract We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in the image, infer context from the history, and answer the question accurately. Visual Dialog is disentangled enough from a specific downstream task so as to serve as a general test of machine intelligence, while being grounded in vision enough to allow objective evaluation of individual responses and benchmark progress. We develop a novel two-person chat data-collection protocol to curate a large-scale Visual Dialog dataset (VisDial). VisDial v0.9 has been released and contains 1 dialog with 10 question-answer pairs on ~120k images from COCO, with a total of ~1.2M dialog question-answer pairs. We introduce a family of neural encoder-decoder models for Visual Dialog with 3 encoders – Late Fusion, Hierarchical Recurrent Encoder and Memory Network – and 2 decoders (generative and discriminative), which outperform a number of sophisticated baselines. We propose a retrieval-based evaluation protocol for Visual Dialog where the AI agent is asked to sort a set of candidate answers and is evaluated on metrics such as the mean reciprocal rank of the human response. We quantify the gap between machine and human performance on the Visual Dialog task via human studies. Putting it all together, we demonstrate the first ‘visual chatbot’! Our dataset, code, trained models and visual chatbot are available on https://visualdialog.org
Tasks Chatbot, Visual Dialog
Published 2016-11-26
URL http://arxiv.org/abs/1611.08669v5
PDF http://arxiv.org/pdf/1611.08669v5.pdf
PWC https://paperswithcode.com/paper/visual-dialog
Repo https://github.com/batra-mlp-lab/visdial
Framework torch
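
The retrieval-based evaluation in the abstract ranks a set of candidate answers and scores the position of the human response. A minimal sketch of mean reciprocal rank follows. It assumes higher scores mean better candidates and 1-indexed ranks. The function names are hypothetical and not from the VisDial codebase.

```python
def rank_of_truth(scores, truth_index):
    """1-indexed rank of the human response among the scored candidates
    (candidates with a strictly higher score are ranked ahead of it)."""
    truth_score = scores[truth_index]
    return 1 + sum(1 for s in scores if s > truth_score)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all evaluated questions."""
    return sum(1.0 / r for r in ranks) / len(ranks)
```

If the human response is ranked 1st, 2nd and 4th on three questions, the score is (1 + 1/2 + 1/4) / 3 = 7/12.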

STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling

Title STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling
Authors Yang He, Wei-Chen Chiu, Margret Keuper, Mario Fritz
Abstract We propose a novel superpixel-based multi-view convolutional neural network for semantic image segmentation. The proposed network produces a high quality segmentation of a single image by leveraging information from additional views of the same scene. Particularly in indoor videos such as captured by robotic platforms or handheld and bodyworn RGBD cameras, nearby video frames provide diverse viewpoints and additional context of objects and scenes. To leverage such information, we first compute region correspondences by optical flow and image boundary-based superpixels. Given these region correspondences, we propose a novel spatio-temporal pooling layer to aggregate information over space and time. We evaluate our approach on the NYU–Depth–V2 and the SUN3D datasets and compare it to various state-of-the-art single-view and multi-view approaches. Besides a general improvement over the state-of-the-art, we also show the benefits of making use of unlabeled frames during training for multi-view as well as single-view prediction.
Tasks Optical Flow Estimation, Semantic Segmentation
Published 2016-04-08
URL http://arxiv.org/abs/1604.02388v3
PDF http://arxiv.org/pdf/1604.02388v3.pdf
PWC https://paperswithcode.com/paper/std2p-rgbd-semantic-segmentation-using-spatio
Repo https://github.com/SSAW14/STD2P
Framework none

Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network

Title Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
Authors Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, Zehan Wang
Abstract Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods.
Tasks Image Super-Resolution, Super-Resolution, Video Super-Resolution
Published 2016-09-16
URL http://arxiv.org/abs/1609.05158v2
PDF http://arxiv.org/pdf/1609.05158v2.pdf
PWC https://paperswithcode.com/paper/real-time-single-image-and-video-super
Repo https://github.com/XueweiMeng/derain_filter
Framework tf
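
The sub-pixel convolution layer ends with a rearrangement step: r² low-resolution feature maps of size H×W are interleaved into one high-resolution map of size rH×rW. The sketch below shows only that rearrangement for a single output channel, on nested Python lists. The channel ordering (`i * r + j` for sub-pixel row `i` and column `j`) is an assumption and may differ from the ordering used in the paper or in a given framework.

```python
def pixel_shuffle(channels, r):
    """Interleave r*r feature maps (each H x W) into one (r*H) x (r*W) map.

    channels[i * r + j][y][x] is written to out[y * r + i][x * r + j].
    """
    if len(channels) != r * r:
        raise ValueError("expected r*r input channels")
    h = len(channels[0])
    w = len(channels[0][0])
    out = [[0] * (w * r) for _ in range(h * r)]
    for i in range(r):          # sub-pixel row offset
        for j in range(r):      # sub-pixel column offset
            plane = channels[i * r + j]
            for y in range(h):
                for x in range(w):
                    out[y * r + i][x * r + j] = plane[y][x]
    return out
```

Because the convolutions that produce the r² maps run at low resolution, the only work done at high resolution is this copy.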