Paper Group AWR 174
Empirical Risk Minimization under Fairness Constraints. BirdNet: a 3D Object Detection Framework from LiDAR information. Jointly Learning to Label Sentences and Tokens. GDPP: Learning Diverse Generations Using Determinantal Point Process. Deep-Energy: Unsupervised Training of Deep Neural Networks. Simple, Distributed, and Accelerated Probabilistic …
Empirical Risk Minimization under Fairness Constraints
Title | Empirical Risk Minimization under Fairness Constraints |
Authors | Michele Donini, Luca Oneto, Shai Ben-David, John Shawe-Taylor, Massimiliano Pontil |
Abstract | We address the problem of algorithmic fairness: ensuring that sensitive variables do not unfairly influence the outcome of a classifier. We present an approach based on empirical risk minimization, which incorporates a fairness constraint into the learning problem. It encourages the conditional risk of the learned classifier to be approximately constant with respect to the sensitive variable. We derive both risk and fairness bounds that support the statistical consistency of our approach. We specify our approach to kernel methods and observe that the fairness requirement implies an orthogonality constraint which can be easily added to these methods. We further observe that for linear models the constraint translates into a simple data preprocessing step. Experiments indicate that the method is empirically effective and performs favorably against state-of-the-art approaches. |
Tasks | |
Published | 2018-02-23 |
URL | https://arxiv.org/abs/1802.08626v3 |
https://arxiv.org/pdf/1802.08626v3.pdf | |
PWC | https://paperswithcode.com/paper/empirical-risk-minimization-under-fairness |
Repo | https://github.com/jmikko/fair_ERM |
Framework | none |
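The abstract's remark that, for linear models, the fairness requirement reduces to a data preprocessing step can be illustrated roughly as follows. This is a minimal sketch and not the authors' code. It assumes the orthogonality constraint has the form w·u = 0, where u is the difference between the mean feature vectors of the two sensitive groups among positively labelled examples, and it enforces that by projecting every input onto the orthogonal complement of u. The projection is a swapped-in technique and may differ from the exact transformation in the paper; all function names are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def group_difference(X, y, s):
    """Assumed form of u: mean of positives in group 0 minus mean of positives in group 1."""
    pos_a = [x for x, yi, si in zip(X, y, s) if yi == 1 and si == 0]
    pos_b = [x for x, yi, si in zip(X, y, s) if yi == 1 and si == 1]
    mu_a, mu_b = mean_vector(pos_a), mean_vector(pos_b)
    return [a - b for a, b in zip(mu_a, mu_b)]


def project_out(x, u):
    """Remove the component of x along u, so a linear score cannot depend on it."""
    uu = sum(ui * ui for ui in u)
    if uu == 0.0:
        return list(x)
    coef = sum(xi * ui for xi, ui in zip(x, u)) / uu
    return [xi - coef * ui for xi, ui in zip(x, u)]


# Toy data: two features, binary label y, binary sensitive attribute s.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [0.0, 1.0]]
y = [1, 1, 1, 1, 0]
s = [0, 0, 1, 1, 0]

u = group_difference(X, y, s)
X_fair = [project_out(x, u) for x in X]
```

After the projection, the two groups' positive-class means coincide, so any linear model trained on `X_fair` satisfies the assumed constraint by construction.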
BirdNet: a 3D Object Detection Framework from LiDAR information
Title | BirdNet: a 3D Object Detection Framework from LiDAR information |
Authors | Jorge Beltran, Carlos Guindel, Francisco Miguel Moreno, Daniel Cruzado, Fernando Garcia, Arturo de la Escalera |
Abstract | Understanding driving situations regardless of the conditions of the traffic scene is a cornerstone on the path towards autonomous vehicles; however, although common sensor setups already include complementary devices such as LiDAR or radar, most of the research on perception systems has traditionally focused on computer vision. We present a LiDAR-based 3D object detection pipeline entailing three stages. First, laser information is projected into a novel cell encoding for bird’s eye view projection. Later, both object location on the plane and its heading are estimated through a convolutional neural network originally designed for image processing. Finally, 3D oriented detections are computed in a post-processing phase. Experiments on the KITTI dataset show that the proposed framework achieves state-of-the-art results among comparable methods. Further tests with different LiDAR sensors in real scenarios assess the multi-device capabilities of the approach. |
Tasks | 3D Object Detection, Autonomous Vehicles, Object Detection |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01195v1 |
http://arxiv.org/pdf/1805.01195v1.pdf | |
PWC | https://paperswithcode.com/paper/birdnet-a-3d-object-detection-framework-from |
Repo | https://github.com/beltransen/lidar_bev |
Framework | none |
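The first stage described in the abstract, projecting laser points into a bird's eye view grid, might look roughly like the sketch below. The grid extent, the cell size, and the two per-cell channels (maximum height and point count) are illustrative assumptions; the abstract does not specify the cell encoding, and the paper's encoding may differ. The function name is hypothetical.

```python
def bev_grid(points, x_range=(0.0, 4.0), y_range=(-2.0, 2.0), cell=1.0):
    """Project (x, y, z) points onto a ground-plane grid.

    Returns two 2D lists indexed [row][col]: the maximum height per cell
    (None for empty cells) and the number of points per cell.
    """
    n_rows = int((x_range[1] - x_range[0]) / cell)
    n_cols = int((y_range[1] - y_range[0]) / cell)
    max_h = [[None] * n_cols for _ in range(n_rows)]
    count = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue  # drop points that fall outside the grid
        r = int((x - x_range[0]) // cell)
        c = int((y - y_range[0]) // cell)
        count[r][c] += 1
        if max_h[r][c] is None or z > max_h[r][c]:
            max_h[r][c] = z
    return max_h, count


# Four toy points; the last one lies outside the grid and is ignored.
points = [(0.5, -1.5, 0.2), (0.7, -1.2, 1.4), (3.9, 1.9, 0.1), (9.0, 0.0, 1.0)]
max_h, count = bev_grid(points)
```

The resulting 2D channels can be stacked like image planes, which is what allows an image-style convolutional network to consume them.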
Jointly Learning to Label Sentences and Tokens
Title | Jointly Learning to Label Sentences and Tokens |
Authors | Marek Rei, Anders Søgaard |
Abstract | Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size. Methods for directly supervising language composition can allow us to guide the models based on existing knowledge, regularizing them towards more robust and interpretable representations. In this paper, we investigate how objectives at different granularities can be used to learn better language representations and we propose an architecture for jointly learning to label sentences and tokens. The predictions at each level are combined together using an attention mechanism, with token-level labels also acting as explicit supervision for composing sentence-level representations. Our experiments show that by learning to perform these tasks jointly on multiple levels, the model achieves substantial improvements for both sentence classification and sequence labeling. |
Tasks | Grammatical Error Detection, Sentence Classification |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05949v1 |
http://arxiv.org/pdf/1811.05949v1.pdf | |
PWC | https://paperswithcode.com/paper/jointly-learning-to-label-sentences-and |
Repo | https://github.com/marekrei/mltagger |
Framework | tf |
GDPP: Learning Diverse Generations Using Determinantal Point Process
Title | GDPP: Learning Diverse Generations Using Determinantal Point Process |
Authors | Mohamed Elfeki, Camille Couprie, Morgane Riviere, Mohamed Elhoseiny |
Abstract | Generative models have proven to be an outstanding tool for representing high-dimensional probability distributions and generating realistic-looking images. An essential characteristic of generative models is their ability to produce multi-modal outputs. However, while training, they are often susceptible to mode collapse, that is, models are limited in mapping input noise to only a few modes of the true data distribution. In this work, we draw inspiration from the Determinantal Point Process (DPP) to propose an unsupervised penalty loss that alleviates mode collapse while producing higher quality samples. DPP is an elegant probabilistic measure used to model negative correlations within a subset and hence quantify its diversity. We use a DPP kernel to model the diversity in real data as well as in synthetic data. Then, we devise an objective term that encourages generators to synthesize data with similar diversity to real data. In contrast to previous state-of-the-art generative models that tend to use additional trainable parameters or complex training paradigms, our method does not change the original training scheme. Embedded in adversarial training and in a variational autoencoder, our Generative DPP approach shows a consistent resistance to mode collapse on a wide variety of synthetic data and natural image datasets including MNIST, CIFAR10, and CelebA, while outperforming state-of-the-art methods for data efficiency, generation quality, and convergence time, and being 5.8x faster than its closest competitor. |
Tasks | |
Published | 2018-11-30 |
URL | https://arxiv.org/abs/1812.00068v5 |
https://arxiv.org/pdf/1812.00068v5.pdf | |
PWC | https://paperswithcode.com/paper/gdpp-learning-diverse-generations-using |
Repo | https://github.com/M-Elfeki/GDPP |
Framework | tf |
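The way a DPP kernel quantifies diversity, as mentioned in the abstract, can be seen in a two-item toy case. A DPP scores a subset by the determinant of its kernel submatrix; for two unit-norm feature vectors with a linear kernel that determinant is 1 - s^2, where s is their cosine similarity. The sketch below illustrates only this diversity measure under those assumptions (linear kernel, unit-normalised features). It is not the GDPP loss, and the function names are hypothetical.

```python
import math


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def pair_diversity(a, b):
    """Determinant of the 2x2 Gram matrix [[1, s], [s, 1]] of two unit vectors."""
    a, b = unit(a), unit(b)
    s = sum(x * y for x, y in zip(a, b))
    return 1.0 - s * s


near_duplicates = pair_diversity([1.0, 0.0], [0.99, 0.05])
distinct = pair_diversity([1.0, 0.0], [0.0, 1.0])
```

Near-identical samples give a determinant close to zero, while orthogonal samples give the maximum value of one, which is why matching this quantity between real and generated batches discourages collapsed outputs.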
Deep-Energy: Unsupervised Training of Deep Neural Networks
Title | Deep-Energy: Unsupervised Training of Deep Neural Networks |
Authors | Alona Golts, Daniel Freedman, Michael Elad |
Abstract | The success of deep learning has been due, in no small part, to the availability of large annotated datasets. Thus, a major bottleneck in current learning pipelines is the time-consuming human annotation of data. In scenarios where such input-output pairs cannot be collected, simulation is often used instead, leading to a domain-shift between synthesized and real-world data. This work offers an unsupervised alternative that relies on the availability of task-specific energy functions, replacing the generic supervised loss. Such energy functions are assumed to lead to the desired label as their minimizer given the input. The proposed approach, termed “Deep Energy”, trains a Deep Neural Network (DNN) to approximate this minimization for any chosen input. Once trained, a simple and fast feed-forward computation provides the inferred label. This approach allows us to perform unsupervised training of DNNs with real-world inputs only, and without the need for manually-annotated labels, nor synthetically created data. “Deep Energy” is demonstrated in this paper on three different tasks – seeded segmentation, image matting and single image dehazing – exposing its generality and wide applicability. Our experiments show that the solution provided by the network is often much better in quality than the one obtained by a direct minimization of the energy function, suggesting an added regularization property in our scheme. |
Tasks | Image Dehazing, Image Matting, Single Image Dehazing |
Published | 2018-05-31 |
URL | https://arxiv.org/abs/1805.12355v2 |
https://arxiv.org/pdf/1805.12355v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-energy-using-energy-functions-for |
Repo | https://github.com/AlonaGolts/Deep_Energy |
Framework | tf |
Simple, Distributed, and Accelerated Probabilistic Programming
Title | Simple, Distributed, and Accelerated Probabilistic Programming |
Authors | Dustin Tran, Matthew Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, Alexey Radul, Matthew Johnson, Rif A. Saurous |
Abstract | We describe a simple, low-level approach for embedding probabilistic programming in a deep learning ecosystem. In particular, we distill probabilistic programming down to a single abstraction—the random variable. Our lightweight implementation in TensorFlow enables numerous applications: a model-parallel variational auto-encoder (VAE) with 2nd-generation tensor processing units (TPUv2s); a data-parallel autoregressive model (Image Transformer) with TPUv2s; and multi-GPU No-U-Turn Sampler (NUTS). For both a state-of-the-art VAE on 64x64 ImageNet and Image Transformer on 256x256 CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2 chips. With NUTS, we see a 100x speedup on GPUs over Stan and 37x over PyMC3. |
Tasks | Probabilistic Programming |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.02091v2 |
http://arxiv.org/pdf/1811.02091v2.pdf | |
PWC | https://paperswithcode.com/paper/simple-distributed-and-accelerated |
Repo | https://github.com/google/edward2 |
Framework | tf |
Convolutional Sequence to Sequence Model for Human Dynamics
Title | Convolutional Sequence to Sequence Model for Human Dynamics |
Authors | Chen Li, Zhen Zhang, Wee Sun Lee, Gim Hee Lee |
Abstract | Human motion modeling is a classic problem in computer vision and graphics. Challenges in modeling human motion include high dimensional prediction as well as extremely complicated dynamics. We present a novel approach to human motion modeling based on convolutional neural networks (CNN). The hierarchical structure of CNN makes it capable of capturing both spatial and temporal correlations effectively. In our proposed approach, a convolutional long-term encoder is used to encode the whole given motion sequence into a long-term hidden variable, which is used with a decoder to predict the remainder of the sequence. The decoder itself also has an encoder-decoder structure, in which the short-term encoder encodes a shorter sequence to a short-term hidden variable, and the spatial decoder maps the long and short-term hidden variable to motion predictions. By using such a model, we are able to capture both invariant and dynamic information of human motion, which results in more accurate predictions. Experiments show that our algorithm outperforms the state-of-the-art methods on the Human3.6M and CMU Motion Capture datasets. Our code is available at the project website. |
Tasks | Human Dynamics, Motion Capture |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.00655v1 |
http://arxiv.org/pdf/1805.00655v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-sequence-to-sequence-model-for |
Repo | https://github.com/chaneyddtt/Convolutional-Sequence-to-Sequence-Model-for-Human-Dynamics |
Framework | tf |
Pyro: Deep Universal Probabilistic Programming
Title | Pyro: Deep Universal Probabilistic Programming |
Authors | Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, Noah D. Goodman |
Abstract | Pyro is a probabilistic programming language built on Python as a platform for developing advanced probabilistic models in AI research. To scale to large datasets and high-dimensional models, Pyro uses stochastic variational inference algorithms and probability distributions built on top of PyTorch, a modern GPU-accelerated deep learning framework. To accommodate complex or model-specific algorithmic behavior, Pyro leverages Poutine, a library of composable building blocks for modifying the behavior of probabilistic programs. |
Tasks | Probabilistic Programming |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.09538v1 |
http://arxiv.org/pdf/1810.09538v1.pdf | |
PWC | https://paperswithcode.com/paper/pyro-deep-universal-probabilistic-programming |
Repo | https://github.com/uber/pyro |
Framework | pytorch |
Sinkhorn AutoEncoders
Title | Sinkhorn AutoEncoders |
Authors | Giorgio Patrini, Rianne van den Berg, Patrick Forré, Marcello Carioni, Samarth Bhargav, Max Welling, Tim Genewein, Frank Nielsen |
Abstract | Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the p-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the p-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We also identify the role of its trade-off hyperparameter as the capacity of the generator: its Lipschitz constant. Moreover, we prove that optimizing the encoder over any class of universal approximators, such as deterministic neural networks, is enough to come arbitrarily close to the optimum. We therefore advertise this framework, which holds for any metric space and prior, as a sweet-spot of current generative autoencoding objectives. We then introduce the Sinkhorn auto-encoder (SAE), which approximates and minimizes the p-Wasserstein distance in latent space via backpropagation through the Sinkhorn algorithm. SAE directly works on samples, i.e. it models the aggregated posterior as an implicit distribution, with no need for a reparameterization trick for gradient estimation. SAE is thus able to work with different metric spaces and priors with minimal adaptations. We demonstrate the flexibility of SAE on latent spaces with different geometries and priors and compare with other methods on benchmark data sets. |
Tasks | Probabilistic Programming |
Published | 2018-10-02 |
URL | https://arxiv.org/abs/1810.01118v3 |
https://arxiv.org/pdf/1810.01118v3.pdf | |
PWC | https://paperswithcode.com/paper/sinkhorn-autoencoders |
Repo | https://github.com/jaberkow/TensorFlowSinkhorn |
Framework | tf |
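The Sinkhorn algorithm referred to in the abstract computes an entropy-regularised optimal transport plan by alternately rescaling the rows and columns of a kernel matrix. The sketch below shows those iterations on two tiny sample sets. It is a plain-Python illustration, not the SAE implementation: the regularisation strength `eps`, the iteration count, the squared-distance cost, and the function name are all assumptions, and no gradients are propagated here.

```python
import math


def sinkhorn(a, b, cost, eps=1.0, n_iters=500):
    """Entropy-regularised transport plan between histograms a and b."""
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows, then columns, so the plan's marginals approach a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Two small sample sets on the line, uniform weights, squared-distance cost.
xs, ys = [0.0, 1.0, 2.0], [0.5, 1.5, 2.5]
a = [1.0 / len(xs)] * len(xs)
b = [1.0 / len(ys)] * len(ys)
cost = [[(x - y) ** 2 for y in ys] for x in xs]
plan = sinkhorn(a, b, cost)
transport_cost = sum(
    plan[i][j] * cost[i][j] for i in range(len(xs)) for j in range(len(ys))
)
```

Because every step is a differentiable elementwise operation, the same loop written in an autodiff framework can be backpropagated through, which is the property the abstract relies on.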
A Rule-based Kurdish Text Transliteration System
Title | A Rule-based Kurdish Text Transliteration System |
Authors | Sina Ahmadi |
Abstract | In this article, we present a rule-based approach for transliterating between the two most widely used orthographies of Sorani Kurdish. Our work consists of detecting a character in a word by removing the possible ambiguities and mapping it into the target orthography. We describe different challenges in Kurdish text mining and propose novel ideas concerning the transliteration task for Sorani Kurdish. Our transliteration system, named Wergor, achieves 82.79% overall precision and more than 99% in detecting the double-usage characters. We also present a manually transliterated corpus for Kurdish. |
Tasks | Transliteration |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10278v1 |
http://arxiv.org/pdf/1811.10278v1.pdf | |
PWC | https://paperswithcode.com/paper/a-rule-based-kurdish-text-transliteration |
Repo | https://github.com/sinaahmadi/wergor |
Framework | none |
Unsupervised Meta-Learning For Few-Shot Image Classification
Title | Unsupervised Meta-Learning For Few-Shot Image Classification |
Authors | Siavash Khodadadeh, Ladislau Bölöni, Mubarak Shah |
Abstract | Few-shot or one-shot learning of classifiers requires a significant inductive bias towards the type of task to be learned. One way to acquire this is by meta-learning on tasks similar to the target task. In this paper, we propose UMTRA, an algorithm that performs unsupervised, model-agnostic meta-learning for classification tasks. The meta-learning step of UMTRA is performed on a flat collection of unlabeled images. While we assume that these images can be grouped into a diverse set of classes and are relevant to the target task, no explicit information about the classes or any labels is needed. UMTRA uses random sampling and augmentation to create synthetic training tasks for the meta-learning phase. Labels are only needed at the final target task learning step, and they can be as few as one sample per class. On the Omniglot and Mini-Imagenet few-shot learning benchmarks, UMTRA outperforms every tested approach based on unsupervised learning of representations, while alternating for the best performance with the recent CACTUs algorithm. Compared to supervised model-agnostic meta-learning approaches, UMTRA trades off some classification accuracy for a reduction in the required labels of several orders of magnitude. |
Tasks | Few-Shot Image Classification, Few-Shot Learning, Image Classification, Meta-Learning, Omniglot, One-Shot Learning, Video Classification |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1811.11819v2 |
https://arxiv.org/pdf/1811.11819v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-meta-learning-for-few-shot-image |
Repo | https://github.com/siavash-khodadadeh/MetaLearning-TF2.0 |
Framework | tf |
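The task-construction step described in the abstract (random sampling plus augmentation over unlabeled images) can be sketched as below. This is a hedged illustration rather than the UMTRA code: it assumes each randomly sampled image is treated as its own pseudo-class, that the originals serve as the task's training set, and that augmented copies serve as its validation set. The noise-based `augment` function and the toy four-pixel "images" are placeholders.

```python
import random


def augment(image, rng):
    """Hypothetical augmentation: add small uniform noise to each pixel value."""
    return [p + rng.uniform(-0.05, 0.05) for p in image]


def make_synthetic_task(unlabeled, n_way, rng):
    """Sample n_way unlabeled images and treat each one as its own class.

    The sampled images form the training (support) set; augmented copies,
    carrying the same pseudo-labels, form the validation (query) set.
    """
    picks = rng.sample(unlabeled, n_way)
    support = [(img, label) for label, img in enumerate(picks)]
    query = [(augment(img, rng), label) for label, img in enumerate(picks)]
    return support, query


rng = random.Random(0)
unlabeled = [[rng.random() for _ in range(4)] for _ in range(100)]  # toy "images"
support, query = make_synthetic_task(unlabeled, n_way=5, rng=rng)
```

The pseudo-labels are only correct if the sampled images really come from different classes, which is why the abstract assumes the unlabeled pool covers a diverse set of classes.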
A Dataset for Building Code-Mixed Goal Oriented Conversation Systems
Title | A Dataset for Building Code-Mixed Goal Oriented Conversation Systems |
Authors | Suman Banerjee, Nikita Moghe, Siddhartha Arora, Mitesh M. Khapra |
Abstract | There is an increasing demand for goal-oriented conversation systems which can assist users in various day-to-day activities such as booking tickets, restaurant reservations, shopping, etc. Most of the existing datasets for building such conversation systems focus on monolingual conversations and there is hardly any work on multilingual and/or code-mixed conversations. Such datasets and systems thus do not cater to the multilingual regions of the world, such as India, where it is very common for people to speak more than one language and seamlessly switch between them resulting in code-mixed conversations. For example, a Hindi speaking user looking to book a restaurant would typically ask, “Kya tum is restaurant mein ek table book karne mein meri help karoge?” (“Can you help me in booking a table at this restaurant?”). To facilitate the development of such code-mixed conversation models, we build a goal-oriented dialog dataset containing code-mixed conversations. Specifically, we take the text from the DSTC2 restaurant reservation dataset and create code-mixed versions of it in Hindi-English, Bengali-English, Gujarati-English and Tamil-English. We also establish initial baselines on this dataset using existing state of the art models. This dataset along with our baseline implementations is made publicly available for research purposes. |
Tasks | Goal-Oriented Dialog |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.05997v1 |
http://arxiv.org/pdf/1806.05997v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dataset-for-building-code-mixed-goal |
Repo | https://github.com/sumanbanerjee1/Code-Mixed-Dialog |
Framework | tf |
Class2Str: End to End Latent Hierarchy Learning
Title | Class2Str: End to End Latent Hierarchy Learning |
Authors | Soham Saha, Girish Varma, C. V. Jawahar |
Abstract | Deep neural networks for image classification typically consist of a convolutional feature extractor followed by a fully connected classifier network. The predicted and the ground truth labels are represented as one-hot vectors. Such a representation assumes that all classes are equally dissimilar. However, classes have visual similarities and often form a hierarchy. Learning this latent hierarchy explicitly in the architecture could provide invaluable insights. We propose an alternate architecture to the classifier network called the Latent Hierarchy (LH) Classifier and an end-to-end learned Class2Str mapping which discovers a latent hierarchy of the classes. We show that for some of the best performing architectures on CIFAR and Imagenet datasets, the proposed replacement and training by LH classifier recovers the accuracy, with a fraction of the number of parameters in the classifier part. Compared to the previous work of HDCNN, which also learns a 2-level hierarchy, we are able to learn a hierarchy at an arbitrary number of levels as well as obtain an accuracy improvement on the Imagenet classification task over them. We also verify that many visually similar classes are grouped together under the learnt hierarchy. |
Tasks | Image Classification |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.06675v1 |
http://arxiv.org/pdf/1808.06675v1.pdf | |
PWC | https://paperswithcode.com/paper/class2str-end-to-end-latent-hierarchy |
Repo | https://github.com/Soham0/Class2Str |
Framework | tf |
Outer Product-based Neural Collaborative Filtering
Title | Outer Product-based Neural Collaborative Filtering |
Authors | Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang, Tat-Seng Chua |
Abstract | In this work, we contribute a new multi-layer neural network architecture named ONCF to perform collaborative filtering. The idea is to use an outer product to explicitly model the pairwise correlations between the dimensions of the embedding space. In contrast to existing neural recommender models that combine user embedding and item embedding via a simple concatenation or element-wise product, our proposal of using an outer product above the embedding layer results in a two-dimensional interaction map that is more expressive and semantically plausible. Above the interaction map obtained by the outer product, we propose to employ a convolutional neural network to learn high-order correlations among embedding dimensions. Extensive experiments on two public implicit feedback datasets demonstrate the effectiveness of our proposed ONCF framework, in particular, the positive effect of using an outer product to model the correlations between embedding dimensions in the low level of a multi-layer neural recommender model. The experiment codes are available at: https://github.com/duxy-me/ConvNCF |
Tasks | |
Published | 2018-08-12 |
URL | http://arxiv.org/abs/1808.03912v1 |
http://arxiv.org/pdf/1808.03912v1.pdf | |
PWC | https://paperswithcode.com/paper/outer-product-based-neural-collaborative |
Repo | https://github.com/duxy-me/ConvNCF |
Framework | tf |
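The interaction map described in the abstract is the outer product of a user embedding and an item embedding. The toy sketch below builds that map in plain Python and shows that its diagonal equals the element-wise product used by simpler models, while the off-diagonal entries carry the extra pairwise terms. The embedding values and the function name are made up for illustration, and the convolutional layers that would sit on top are omitted.

```python
def outer_product(p_u, q_i):
    """Interaction map E with E[a][b] = p_u[a] * q_i[b]."""
    return [[a * b for b in q_i] for a in p_u]


p_u = [0.5, -1.0, 2.0]   # toy user embedding
q_i = [1.0, 0.25, -0.5]  # toy item embedding
E = outer_product(p_u, q_i)

# The diagonal of E recovers the element-wise product of the two embeddings.
diagonal = [E[k][k] for k in range(len(p_u))]
```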
Revisiting Random Binning Features: Fast Convergence and Strong Parallelizability
Title | Revisiting Random Binning Features: Fast Convergence and Strong Parallelizability |
Authors | Lingfei Wu, Ian E. H. Yen, Jie Chen, Rui Yan |
Abstract | Kernel methods have been developed as one of the standard approaches for nonlinear learning; however, they do not scale to large data sets due to their quadratic complexity in the number of samples. A number of kernel approximation methods have thus been proposed in recent years, among which the random features method has gained much popularity due to its simplicity and direct reduction of a nonlinear problem to a linear one. The Random Binning (RB) feature, proposed in the first random-feature paper (Rahimi and Recht, 2007), has drawn much less attention than the Random Fourier (RF) feature. In this work, we observe that the RB features, with the right choice of optimization solver, could be orders of magnitude more efficient than other random features and kernel approximation methods under the same requirement of accuracy. We thus propose the first analysis of RB from the perspective of optimization, which, by interpreting RB as a Randomized Block Coordinate Descent in the infinite-dimensional space, gives a faster convergence rate compared to that of other random features. In particular, we show that by drawing $R$ random grids with at least $\kappa$ non-empty bins per grid in expectation, the RB method achieves a convergence rate of $O(1/(\kappa R))$, which not only sharpens its $O(1/\sqrt{R})$ rate from Monte Carlo analysis, but also shows a $\kappa$ times speedup over other random features under the same analysis framework. In addition, we demonstrate another advantage of RB in the L1-regularized setting, where, unlike other random features, an RB-based Coordinate Descent solver can be parallelized with guaranteed speedup proportional to $\kappa$. Our extensive experiments demonstrate the superior performance of the RB features over other random features and kernel approximation methods. Our code and data are available at https://github.com/teddylfwu/RB_GEN. |
Tasks | |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05247v2 |
http://arxiv.org/pdf/1809.05247v2.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-random-binning-features-fast |
Repo | https://github.com/teddylfwu/RB_GEN |
Framework | none |
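The Random Binning construction referred to in the abstract can be sketched for the Laplacian kernel k(x, y) = exp(-||x - y||_1 / sigma). Each random grid has a per-dimension pitch drawn from a Gamma(2, sigma) distribution and a uniform shift, and the kernel value is estimated as the fraction of grids in which two points land in the same bin. This is a minimal illustration of the features only; the choice of kernel, the bandwidth, the number of grids, and the function names are assumptions, and the paper's solver is not shown.

```python
import math
import random


def draw_grid(dim, sigma, rng):
    """One random grid: per-dimension pitch ~ Gamma(shape=2, scale=sigma), uniform shift."""
    pitches = [rng.gammavariate(2.0, sigma) for _ in range(dim)]
    shifts = [rng.uniform(0.0, p) for p in pitches]
    return pitches, shifts


def bin_index(x, grid):
    """Tuple of integer bin coordinates of point x in the given grid."""
    pitches, shifts = grid
    return tuple(math.floor((xi - s) / p) for xi, p, s in zip(x, pitches, shifts))


def rb_kernel(x, y, grids):
    """Fraction of grids in which x and y share a bin."""
    hits = sum(bin_index(x, g) == bin_index(y, g) for g in grids)
    return hits / len(grids)


rng = random.Random(0)
sigma = 1.0
grids = [draw_grid(2, sigma, rng) for _ in range(4000)]  # R = 4000 grids
x, y = [0.2, 0.4], [0.5, 0.1]
approx = rb_kernel(x, y, grids)
exact = math.exp(-sum(abs(a - b) for a, b in zip(x, y)) / sigma)
```

Each grid maps a point to a single non-empty bin, so the resulting feature vectors are extremely sparse, which is the structure a coordinate-descent style solver can exploit.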