October 20, 2019

Paper Group AWR 174

Empirical Risk Minimization under Fairness Constraints. BirdNet: a 3D Object Detection Framework from LiDAR information. Jointly Learning to Label Sentences and Tokens. GDPP: Learning Diverse Generations Using Determinantal Point Process. Deep-Energy: Unsupervised Training of Deep Neural Networks. Simple, Distributed, and Accelerated Probabilistic …

Empirical Risk Minimization under Fairness Constraints


Title	Empirical Risk Minimization under Fairness Constraints
Authors	Michele Donini, Luca Oneto, Shai Ben-David, John Shawe-Taylor, Massimiliano Pontil
Abstract	We address the problem of algorithmic fairness: ensuring that sensitive variables do not unfairly influence the outcome of a classifier. We present an approach based on empirical risk minimization, which incorporates a fairness constraint into the learning problem. It encourages the conditional risk of the learned classifier to be approximately constant with respect to the sensitive variable. We derive both risk and fairness bounds that support the statistical consistency of our approach. We specify our approach to kernel methods and observe that the fairness requirement implies an orthogonality constraint which can be easily added to these methods. We further observe that for linear models the constraint translates into a simple data preprocessing step. Experiments indicate that the method is empirically effective and performs favorably against state-of-the-art approaches.
Tasks
Published	2018-02-23
URL	https://arxiv.org/abs/1802.08626v3
PDF	https://arxiv.org/pdf/1802.08626v3.pdf
PWC	https://paperswithcode.com/paper/empirical-risk-minimization-under-fairness
Repo	https://github.com/jmikko/fair_ERM
Framework	none
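
The abstract notes that, for linear models, the fairness constraint reduces to an orthogonality condition that can be enforced by preprocessing the data. The sketch below is one possible reading of that idea, not the authors' code: it assumes a binary sensitive attribute encoded as 0/1, takes the direction `u` to be the difference between the two groups' mean feature vectors over positively labelled examples (an assumption of this sketch), and projects `u` out of every input. A linear model trained on the projected features then has an effective weight vector orthogonal to `u`. All function names are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fairness_direction(X, y, s):
    """Difference of group means over positive examples (assumed definition of u)."""
    pos_a = [x for x, yi, si in zip(X, y, s) if yi == 1 and si == 0]
    pos_b = [x for x, yi, si in zip(X, y, s) if yi == 1 and si == 1]
    mu_a, mu_b = mean_vector(pos_a), mean_vector(pos_b)
    return [a - b for a, b in zip(mu_a, mu_b)]


def project_out(X, u):
    """Remove the component along u from every row of X."""
    uu = sum(ui * ui for ui in u)
    if uu == 0.0:
        # The groups already share the same mean; nothing to remove.
        return [list(x) for x in X]
    projected = []
    for x in X:
        coef = sum(xi * ui for xi, ui in zip(x, u)) / uu
        projected.append([xi - coef * ui for xi, ui in zip(x, u)])
    return projected


# Toy data: two features, labels y, sensitive attribute s (0 or 1).
X = [[1.0, 2.0], [3.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 1.0], [4.0, 3.0]]
y = [1, 1, 1, 1, 0, 0]
s = [0, 0, 1, 1, 0, 1]

u = fairness_direction(X, y, s)
X_fair = project_out(X, u)
u_after = fairness_direction(X_fair, y, s)  # close to the zero vector
```

After the projection, the positive examples of both groups share the same mean, so no linear model trained on `X_fair` can separate the groups along `u`.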

BirdNet: a 3D Object Detection Framework from LiDAR information


Title	BirdNet: a 3D Object Detection Framework from LiDAR information
Authors	Jorge Beltran, Carlos Guindel, Francisco Miguel Moreno, Daniel Cruzado, Fernando Garcia, Arturo de la Escalera
Abstract	Understanding driving situations regardless of the conditions of the traffic scene is a cornerstone on the path towards autonomous vehicles; however, although common sensor setups already include complementary devices such as LiDAR or radar, most of the research on perception systems has traditionally focused on computer vision. We present a LiDAR-based 3D object detection pipeline entailing three stages. First, laser information is projected into a novel cell encoding for bird’s eye view projection. Next, both the object location on the plane and its heading are estimated through a convolutional neural network originally designed for image processing. Finally, 3D oriented detections are computed in a post-processing phase. Experiments on the KITTI dataset show that the proposed framework achieves state-of-the-art results among comparable methods. Further tests with different LiDAR sensors in real scenarios assess the multi-device capabilities of the approach.
Tasks	3D Object Detection, Autonomous Vehicles, Object Detection
Published	2018-05-03
URL	http://arxiv.org/abs/1805.01195v1
PDF	http://arxiv.org/pdf/1805.01195v1.pdf
PWC	https://paperswithcode.com/paper/birdnet-a-3d-object-detection-framework-from
Repo	https://github.com/beltransen/lidar_bev
Framework	none

Jointly Learning to Label Sentences and Tokens


Title	Jointly Learning to Label Sentences and Tokens
Authors	Marek Rei, Anders Søgaard
Abstract	Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size. Methods for directly supervising language composition can allow us to guide the models based on existing knowledge, regularizing them towards more robust and interpretable representations. In this paper, we investigate how objectives at different granularities can be used to learn better language representations and we propose an architecture for jointly learning to label sentences and tokens. The predictions at each level are combined together using an attention mechanism, with token-level labels also acting as explicit supervision for composing sentence-level representations. Our experiments show that by learning to perform these tasks jointly on multiple levels, the model achieves substantial improvements for both sentence classification and sequence labeling.
Tasks	Grammatical Error Detection, Sentence Classification
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05949v1
PDF	http://arxiv.org/pdf/1811.05949v1.pdf
PWC	https://paperswithcode.com/paper/jointly-learning-to-label-sentences-and
Repo	https://github.com/marekrei/mltagger
Framework	tf
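
The abstract describes combining token-level predictions into a sentence-level representation through attention. The sketch below illustrates one way this could work and is not taken from the paper's code: it assumes each token has a scalar score (logit), squashes the scores with a sigmoid, normalizes them to sum to one, and uses them as weights for pooling the token vectors. The function name and the toy inputs are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attention_pool(token_vectors, token_logits):
    """Pool token vectors into one sentence vector using token scores as attention."""
    raw = [sigmoid(z) for z in token_logits]
    total = sum(raw)
    weights = [r / total for r in raw]  # normalized attention weights
    dim = len(token_vectors[0])
    sentence = [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors))
        for d in range(dim)
    ]
    return weights, sentence


# Three tokens with 2-dimensional representations and per-token scores.
token_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
token_logits = [-2.0, 3.0, 0.0]

weights, sentence = attention_pool(token_vectors, token_logits)
```

Because the same scores drive both the token labels and the attention weights, supervising the tokens also shapes how the sentence representation is composed, which is the coupling the abstract refers to.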

GDPP: Learning Diverse Generations Using Determinantal Point Process


Title	GDPP: Learning Diverse Generations Using Determinantal Point Process
Authors	Mohamed Elfeki, Camille Couprie, Morgane Riviere, Mohamed Elhoseiny
Abstract	Generative models have proven to be an outstanding tool for representing high-dimensional probability distributions and generating realistic-looking images. An essential characteristic of generative models is their ability to produce multi-modal outputs. However, while training, they are often susceptible to mode collapse, that is, models are limited to mapping input noise to only a few modes of the true data distribution. In this work, we draw inspiration from the Determinantal Point Process (DPP) to propose an unsupervised penalty loss that alleviates mode collapse while producing higher quality samples. DPP is an elegant probabilistic measure used to model negative correlations within a subset and hence quantify its diversity. We use a DPP kernel to model the diversity in real data as well as in synthetic data. Then, we devise an objective term that encourages generators to synthesize data with similar diversity to real data. In contrast to previous state-of-the-art generative models that tend to use additional trainable parameters or complex training paradigms, our method does not change the original training scheme. Embedded in adversarial training and a variational autoencoder, our Generative DPP approach shows a consistent resistance to mode collapse on a wide variety of synthetic data and natural image datasets including MNIST, CIFAR10, and CelebA, while outperforming state-of-the-art methods for data efficiency, generation quality, and convergence time, while being 5.8x faster than its closest competitor.
Tasks
Published	2018-11-30
URL	https://arxiv.org/abs/1812.00068v5
PDF	https://arxiv.org/pdf/1812.00068v5.pdf
PWC	https://paperswithcode.com/paper/gdpp-learning-diverse-generations-using
Repo	https://github.com/M-Elfeki/GDPP
Framework	tf

Deep-Energy: Unsupervised Training of Deep Neural Networks


Title	Deep-Energy: Unsupervised Training of Deep Neural Networks
Authors	Alona Golts, Daniel Freedman, Michael Elad
Abstract	The success of deep learning has been due, in no small part, to the availability of large annotated datasets. Thus, a major bottleneck in current learning pipelines is the time-consuming human annotation of data. In scenarios where such input-output pairs cannot be collected, simulation is often used instead, leading to a domain-shift between synthesized and real-world data. This work offers an unsupervised alternative that relies on the availability of task-specific energy functions, replacing the generic supervised loss. Such energy functions are assumed to lead to the desired label as their minimizer given the input. The proposed approach, termed “Deep Energy”, trains a Deep Neural Network (DNN) to approximate this minimization for any chosen input. Once trained, a simple and fast feed-forward computation provides the inferred label. This approach allows us to perform unsupervised training of DNNs with real-world inputs only, without the need for manually annotated labels or synthetically created data. “Deep Energy” is demonstrated in this paper on three different tasks – seeded segmentation, image matting and single image dehazing – exposing its generality and wide applicability. Our experiments show that the solution provided by the network is often much better in quality than the one obtained by a direct minimization of the energy function, suggesting an added regularization property in our scheme.
Tasks	Image Dehazing, Image Matting, Single Image Dehazing
Published	2018-05-31
URL	https://arxiv.org/abs/1805.12355v2
PDF	https://arxiv.org/pdf/1805.12355v2.pdf
PWC	https://paperswithcode.com/paper/deep-energy-using-energy-functions-for
Repo	https://github.com/AlonaGolts/Deep_Energy
Framework	tf

Simple, Distributed, and Accelerated Probabilistic Programming


Title	Simple, Distributed, and Accelerated Probabilistic Programming
Authors	Dustin Tran, Matthew Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, Alexey Radul, Matthew Johnson, Rif A. Saurous
Abstract	We describe a simple, low-level approach for embedding probabilistic programming in a deep learning ecosystem. In particular, we distill probabilistic programming down to a single abstraction—the random variable. Our lightweight implementation in TensorFlow enables numerous applications: a model-parallel variational auto-encoder (VAE) with 2nd-generation tensor processing units (TPUv2s); a data-parallel autoregressive model (Image Transformer) with TPUv2s; and multi-GPU No-U-Turn Sampler (NUTS). For both a state-of-the-art VAE on 64x64 ImageNet and Image Transformer on 256x256 CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2 chips. With NUTS, we see a 100x speedup on GPUs over Stan and 37x over PyMC3.
Tasks	Probabilistic Programming
Published	2018-11-05
URL	http://arxiv.org/abs/1811.02091v2
PDF	http://arxiv.org/pdf/1811.02091v2.pdf
PWC	https://paperswithcode.com/paper/simple-distributed-and-accelerated
Repo	https://github.com/google/edward2
Framework	tf

Convolutional Sequence to Sequence Model for Human Dynamics


Title	Convolutional Sequence to Sequence Model for Human Dynamics
Authors	Chen Li, Zhen Zhang, Wee Sun Lee, Gim Hee Lee
Abstract	Human motion modeling is a classic problem in computer vision and graphics. Challenges in modeling human motion include high dimensional prediction as well as extremely complicated dynamics. We present a novel approach to human motion modeling based on convolutional neural networks (CNN). The hierarchical structure of CNN makes it capable of capturing both spatial and temporal correlations effectively. In our proposed approach, a convolutional long-term encoder is used to encode the whole given motion sequence into a long-term hidden variable, which is used with a decoder to predict the remainder of the sequence. The decoder itself also has an encoder-decoder structure, in which the short-term encoder encodes a shorter sequence to a short-term hidden variable, and the spatial decoder maps the long and short-term hidden variable to motion predictions. By using such a model, we are able to capture both invariant and dynamic information of human motion, which results in more accurate predictions. Experiments show that our algorithm outperforms the state-of-the-art methods on the Human3.6M and CMU Motion Capture datasets. Our code is available at the project website.
Tasks	Human Dynamics, Motion Capture
Published	2018-05-02
URL	http://arxiv.org/abs/1805.00655v1
PDF	http://arxiv.org/pdf/1805.00655v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-sequence-to-sequence-model-for
Repo	https://github.com/chaneyddtt/Convolutional-Sequence-to-Sequence-Model-for-Human-Dynamics
Framework	tf

Pyro: Deep Universal Probabilistic Programming


Title	Pyro: Deep Universal Probabilistic Programming
Authors	Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, Noah D. Goodman
Abstract	Pyro is a probabilistic programming language built on Python as a platform for developing advanced probabilistic models in AI research. To scale to large datasets and high-dimensional models, Pyro uses stochastic variational inference algorithms and probability distributions built on top of PyTorch, a modern GPU-accelerated deep learning framework. To accommodate complex or model-specific algorithmic behavior, Pyro leverages Poutine, a library of composable building blocks for modifying the behavior of probabilistic programs.
Tasks	Probabilistic Programming
Published	2018-10-18
URL	http://arxiv.org/abs/1810.09538v1
PDF	http://arxiv.org/pdf/1810.09538v1.pdf
PWC	https://paperswithcode.com/paper/pyro-deep-universal-probabilistic-programming
Repo	https://github.com/uber/pyro
Framework	pytorch

Sinkhorn AutoEncoders


Title	Sinkhorn AutoEncoders
Authors	Giorgio Patrini, Rianne van den Berg, Patrick Forré, Marcello Carioni, Samarth Bhargav, Max Welling, Tim Genewein, Frank Nielsen
Abstract	Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the p-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the p-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We also identify the role of its trade-off hyperparameter as the capacity of the generator: its Lipschitz constant. Moreover, we prove that optimizing the encoder over any class of universal approximators, such as deterministic neural networks, is enough to come arbitrarily close to the optimum. We therefore advertise this framework, which holds for any metric space and prior, as a sweet-spot of current generative autoencoding objectives. We then introduce the Sinkhorn auto-encoder (SAE), which approximates and minimizes the p-Wasserstein distance in latent space via backpropagation through the Sinkhorn algorithm. SAE directly works on samples, i.e. it models the aggregated posterior as an implicit distribution, with no need for a reparameterization trick for gradient estimation. SAE is thus able to work with different metric spaces and priors with minimal adaptations. We demonstrate the flexibility of SAE on latent spaces with different geometries and priors and compare with other methods on benchmark data sets.
Tasks	Probabilistic Programming
Published	2018-10-02
URL	https://arxiv.org/abs/1810.01118v3
PDF	https://arxiv.org/pdf/1810.01118v3.pdf
PWC	https://paperswithcode.com/paper/sinkhorn-autoencoders
Repo	https://github.com/jaberkow/TensorFlowSinkhorn
Framework	tf
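
The abstract relies on the Sinkhorn algorithm to approximate a Wasserstein distance between samples of the aggregated posterior and samples of the prior. As a rough illustration, the sketch below runs plain Sinkhorn iterations on a toy one-dimensional problem with a squared-distance cost. It is a generic, dependency-free version of the algorithm, not the authors' implementation; the regularization strength `eps`, the iteration count, and the sample values are arbitrary choices made for this example.

```python
import math


def sinkhorn_plan(a, b, cost, eps=1.0, n_iters=500):
    """Entropy-regularized transport plan between histograms a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(plan, cost):
    """Total cost of moving mass according to the plan."""
    return sum(p * c for prow, crow in zip(plan, cost) for p, c in zip(prow, crow))


# Toy latent codes (stand-in for encoder outputs) and prior samples.
codes = [0.0, 1.0, 2.0]
prior = [0.5, 1.5, 2.5]
a = [1.0 / len(codes)] * len(codes)
b = [1.0 / len(prior)] * len(prior)
cost = [[(x - y) ** 2 for y in prior] for x in codes]

plan = sinkhorn_plan(a, b, cost)
approx_cost = transport_cost(plan, cost)
```

Every step of the loop is a differentiable arithmetic operation, which is what makes it possible to backpropagate through the iterations when the codes come from a trainable encoder.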

A Rule-based Kurdish Text Transliteration System


Title	A Rule-based Kurdish Text Transliteration System
Authors	Sina Ahmadi
Abstract	In this article, we present a rule-based approach for transliterating between the two most widely used orthographies in Sorani Kurdish. Our work consists of detecting a character in a word by removing the possible ambiguities and mapping it into the target orthography. We describe different challenges in Kurdish text mining and propose novel ideas concerning the transliteration task for Sorani Kurdish. Our transliteration system, named Wergor, achieves 82.79% overall precision and more than 99% in detecting the double-usage characters. We also present a manually transliterated corpus for Kurdish.
Tasks	Transliteration
Published	2018-11-26
URL	http://arxiv.org/abs/1811.10278v1
PDF	http://arxiv.org/pdf/1811.10278v1.pdf
PWC	https://paperswithcode.com/paper/a-rule-based-kurdish-text-transliteration
Repo	https://github.com/sinaahmadi/wergor
Framework	none

Unsupervised Meta-Learning For Few-Shot Image Classification


Title	Unsupervised Meta-Learning For Few-Shot Image Classification
Authors	Siavash Khodadadeh, Ladislau Bölöni, Mubarak Shah
Abstract	Few-shot or one-shot learning of classifiers requires a significant inductive bias towards the type of task to be learned. One way to acquire this is by meta-learning on tasks similar to the target task. In this paper, we propose UMTRA, an algorithm that performs unsupervised, model-agnostic meta-learning for classification tasks. The meta-learning step of UMTRA is performed on a flat collection of unlabeled images. While we assume that these images can be grouped into a diverse set of classes and are relevant to the target task, no explicit information about the classes or any labels is needed. UMTRA uses random sampling and augmentation to create synthetic training tasks for the meta-learning phase. Labels are only needed at the final target task learning step, and they can be as few as one sample per class. On the Omniglot and Mini-Imagenet few-shot learning benchmarks, UMTRA outperforms every tested approach based on unsupervised learning of representations, while alternating for the best performance with the recent CACTUs algorithm. Compared to supervised model-agnostic meta-learning approaches, UMTRA trades off some classification accuracy for a reduction in the required labels of several orders of magnitude.
Tasks	Few-Shot Image Classification, Few-Shot Learning, Image Classification, Meta-Learning, Omniglot, One-Shot Learning, Video Classification
Published	2018-11-28
URL	https://arxiv.org/abs/1811.11819v2
PDF	https://arxiv.org/pdf/1811.11819v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-meta-learning-for-few-shot-image
Repo	https://github.com/siavash-khodadadeh/MetaLearning-TF2.0
Framework	tf
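
The abstract says UMTRA builds synthetic training tasks from unlabeled images by random sampling and augmentation. The sketch below shows one plausible construction under stated assumptions: each randomly drawn item is treated as its own pseudo-class, the originals form the task's training set, and augmented copies form its validation set. Images are stood in for by short lists of floats, and `jitter` is a hypothetical placeholder for a real image augmentation.

```python
import random


def make_synthetic_task(unlabeled, n_way, augment, rng):
    """Build one N-way task: sampled items get distinct pseudo-labels."""
    picked = rng.sample(unlabeled, n_way)
    train = [(x, label) for label, x in enumerate(picked)]
    # Augmented copies keep the pseudo-label of the item they came from.
    val = [(augment(x, rng), label) for label, x in enumerate(picked)]
    return train, val


def jitter(x, rng, scale=0.05):
    """Hypothetical augmentation: add small Gaussian noise to each value."""
    return [xi + rng.gauss(0.0, scale) for xi in x]


rng = random.Random(0)
# A flat pool of 100 unlabeled "images", each a 4-dimensional vector.
pool = [[rng.random() for _ in range(4)] for _ in range(100)]

train, val = make_synthetic_task(pool, 5, jitter, rng)
```

The construction can mislabel two items that truly share a class; when the pool contains many more classes than `n_way`, such collisions within a single task become unlikely.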

A Dataset for Building Code-Mixed Goal Oriented Conversation Systems


Title	A Dataset for Building Code-Mixed Goal Oriented Conversation Systems
Authors	Suman Banerjee, Nikita Moghe, Siddhartha Arora, Mitesh M. Khapra
Abstract	There is an increasing demand for goal-oriented conversation systems which can assist users in various day-to-day activities such as booking tickets, restaurant reservations, shopping, etc. Most of the existing datasets for building such conversation systems focus on monolingual conversations and there is hardly any work on multilingual and/or code-mixed conversations. Such datasets and systems thus do not cater to the multilingual regions of the world, such as India, where it is very common for people to speak more than one language and seamlessly switch between them, resulting in code-mixed conversations. For example, a Hindi speaking user looking to book a restaurant would typically ask, “Kya tum is restaurant mein ek table book karne mein meri help karoge?” (“Can you help me in booking a table at this restaurant?”). To facilitate the development of such code-mixed conversation models, we build a goal-oriented dialog dataset containing code-mixed conversations. Specifically, we take the text from the DSTC2 restaurant reservation dataset and create code-mixed versions of it in Hindi-English, Bengali-English, Gujarati-English and Tamil-English. We also establish initial baselines on this dataset using existing state-of-the-art models. This dataset along with our baseline implementations is made publicly available for research purposes.
Tasks	Goal-Oriented Dialog
Published	2018-06-15
URL	http://arxiv.org/abs/1806.05997v1
PDF	http://arxiv.org/pdf/1806.05997v1.pdf
PWC	https://paperswithcode.com/paper/a-dataset-for-building-code-mixed-goal
Repo	https://github.com/sumanbanerjee1/Code-Mixed-Dialog
Framework	tf

Class2Str: End to End Latent Hierarchy Learning


Title	Class2Str: End to End Latent Hierarchy Learning
Authors	Soham Saha, Girish Varma, C. V. Jawahar
Abstract	Deep neural networks for image classification typically consist of a convolutional feature extractor followed by a fully connected classifier network. The predicted and the ground truth labels are represented as one-hot vectors. Such a representation assumes that all classes are equally dissimilar. However, classes have visual similarities and often form a hierarchy. Learning this latent hierarchy explicitly in the architecture could provide invaluable insights. We propose an alternate architecture to the classifier network called the Latent Hierarchy (LH) Classifier and an end-to-end learned Class2Str mapping which discovers a latent hierarchy of the classes. We show that for some of the best performing architectures on CIFAR and Imagenet datasets, the proposed replacement and training by LH classifier recovers the accuracy, with a fraction of the number of parameters in the classifier part. Compared to the previous work of HDCNN, which also learns a 2-level hierarchy, we are able to learn a hierarchy at an arbitrary number of levels as well as obtain an accuracy improvement on the Imagenet classification task over them. We also verify that many visually similar classes are grouped together under the learnt hierarchy.
Tasks	Image Classification
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06675v1
PDF	http://arxiv.org/pdf/1808.06675v1.pdf
PWC	https://paperswithcode.com/paper/class2str-end-to-end-latent-hierarchy
Repo	https://github.com/Soham0/Class2Str
Framework	tf

Outer Product-based Neural Collaborative Filtering


Title	Outer Product-based Neural Collaborative Filtering
Authors	Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang, Tat-Seng Chua
Abstract	In this work, we contribute a new multi-layer neural network architecture named ONCF to perform collaborative filtering. The idea is to use an outer product to explicitly model the pairwise correlations between the dimensions of the embedding space. In contrast to existing neural recommender models that combine user embedding and item embedding via a simple concatenation or element-wise product, our proposal of using outer product above the embedding layer results in a two-dimensional interaction map that is more expressive and semantically plausible. Above the interaction map obtained by outer product, we propose to employ a convolutional neural network to learn high-order correlations among embedding dimensions. Extensive experiments on two public implicit feedback datasets demonstrate the effectiveness of our proposed ONCF framework, in particular, the positive effect of using outer product to model the correlations between embedding dimensions at the low level of a multi-layer neural recommender model. The experiment codes are available at: https://github.com/duxy-me/ConvNCF
Tasks
Published	2018-08-12
URL	http://arxiv.org/abs/1808.03912v1
PDF	http://arxiv.org/pdf/1808.03912v1.pdf
PWC	https://paperswithcode.com/paper/outer-product-based-neural-collaborative
Repo	https://github.com/duxy-me/ConvNCF
Framework	tf
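
The interaction map described in the abstract is the outer product of a user embedding and an item embedding. The short sketch below builds such a map for two toy three-dimensional embeddings (the values are made up) and shows that its diagonal is exactly the element-wise product used by simpler models, while the off-diagonal entries carry the additional pairwise terms.

```python
def outer_product(p, q):
    """Interaction map E with E[i][j] = p[i] * q[j]."""
    return [[pi * qj for qj in q] for pi in p]


# Toy embeddings; in a real model these would be learned parameters.
user_embedding = [0.5, -1.0, 2.0]
item_embedding = [1.0, 0.0, 3.0]

interaction_map = outer_product(user_embedding, item_embedding)

# The diagonal of the map equals the element-wise product.
diagonal = [interaction_map[i][i] for i in range(len(user_embedding))]
elementwise = [p * q for p, q in zip(user_embedding, item_embedding)]
```

For embeddings of size k the map is a k-by-k grid, which is what allows a convolutional network to be applied on top of it as if it were a single-channel image.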

Revisiting Random Binning Features: Fast Convergence and Strong Parallelizability


Title	Revisiting Random Binning Features: Fast Convergence and Strong Parallelizability
Authors	Lingfei Wu, Ian E. H. Yen, Jie Chen, Rui Yan
Abstract	Kernel methods have been developed as one of the standard approaches for nonlinear learning; however, they do not scale to large data sets due to their quadratic complexity in the number of samples. A number of kernel approximation methods have thus been proposed in recent years, among which the random features method has gained much popularity due to its simplicity and its direct reduction of a nonlinear problem to a linear one. The Random Binning (RB) feature, proposed in the first random-feature paper (Rahimi and Recht, 2007), has drawn much less attention than the Random Fourier (RF) feature. In this work, we observe that the RB features, with the right choice of optimization solver, could be orders of magnitude more efficient than other random features and kernel approximation methods under the same requirement of accuracy. We thus propose the first analysis of RB from the perspective of optimization, which, by interpreting RB as a Randomized Block Coordinate Descent in the infinite-dimensional space, gives a faster convergence rate compared to that of other random features. In particular, we show that by drawing $R$ random grids with at least $\kappa$ non-empty bins per grid in expectation, the RB method achieves a convergence rate of $O(1/(\kappa R))$, which not only sharpens its $O(1/\sqrt{R})$ rate from Monte Carlo analysis, but also shows a $\kappa$ times speedup over other random features under the same analysis framework. In addition, we demonstrate another advantage of RB in the L1-regularized setting, where, unlike other random features, an RB-based Coordinate Descent solver can be parallelized with guaranteed speedup proportional to $\kappa$. Our extensive experiments demonstrate the superior performance of the RB features over other random features and kernel approximation methods. Our code and data are available at https://github.com/teddylfwu/RB_GEN.
Tasks
Published	2018-09-14
URL	http://arxiv.org/abs/1809.05247v2
PDF	http://arxiv.org/pdf/1809.05247v2.pdf
PWC	https://paperswithcode.com/paper/revisiting-random-binning-features-fast
Repo	https://github.com/teddylfwu/RB_GEN
Framework	none
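
To make the random binning idea concrete, the sketch below approximates a Laplacian kernel exp(-||x - y||_1 / sigma) by drawing random grids and counting how often two points fall into the same bin. It is a minimal illustration of the basic RB feature map only, not the optimization analysis or the solver from the paper. The choice of the Laplacian kernel, the bandwidth `sigma`, the number of grids, and all function names are assumptions made for this example; the per-dimension bin width is drawn from a Gamma distribution with shape 2 and scale sigma, which is the width distribution that yields this kernel.

```python
import math
import random


def draw_grids(dim, n_grids, sigma, rng):
    """Draw random grids: one bin width and one shift per dimension."""
    grids = []
    for _ in range(n_grids):
        widths = [rng.gammavariate(2.0, sigma) for _ in range(dim)]
        shifts = [rng.uniform(0.0, w) for w in widths]
        grids.append((widths, shifts))
    return grids


def bin_ids(x, grids):
    """Index of the bin that x falls into, for every grid."""
    return [
        tuple(math.floor((xi - u) / w) for xi, w, u in zip(x, widths, shifts))
        for widths, shifts in grids
    ]


def rb_kernel(x, y, grids):
    """Fraction of grids in which x and y share a bin."""
    bx, by = bin_ids(x, grids), bin_ids(y, grids)
    return sum(p == q for p, q in zip(bx, by)) / len(grids)


def laplacian_kernel(x, y, sigma):
    return math.exp(-sum(abs(a - b) for a, b in zip(x, y)) / sigma)


rng = random.Random(0)
sigma = 1.0
grids = draw_grids(dim=2, n_grids=4000, sigma=sigma, rng=rng)

x, y = [0.2, 0.5], [0.6, 0.1]
approx = rb_kernel(x, y, grids)
exact = laplacian_kernel(x, y, sigma)
```

Each grid contributes a sparse indicator feature (one active bin per point), so the resulting linear problem is high-dimensional but very sparse, which is the property a coordinate descent solver can exploit.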