January 27, 2020

Paper Group ANR 1116

Regularized deep learning with non-convex penalties

Title Regularized deep learning with non-convex penalties
Authors Sujit Vettam, Majnu John
Abstract Regularization methods are often employed in deep learning neural networks (DNNs) to prevent overfitting. For penalty-based methods for DNN regularization, typically only convex penalties are considered because of their optimization guarantees. Recent theoretical work has shown that non-convex penalties that satisfy certain regularity conditions are also guaranteed to perform well with standard optimization algorithms. In this paper, we examine new and currently existing non-convex penalties for DNN regularization. We provide theoretical justifications for the new penalties and also assess the performance of all penalties on DNN analysis of real datasets.
Tasks
Published 2019-09-11
URL https://arxiv.org/abs/1909.05142v3
PDF https://arxiv.org/pdf/1909.05142v3.pdf
PWC https://paperswithcode.com/paper/regularized-deep-learning-with-a-non-convex
Repo
Framework
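
To make the convex versus non-convex contrast concrete, here is a minimal sketch comparing the convex L1 penalty with the minimax concave penalty (MCP). The abstract does not name the penalties it studies, so the choice of MCP and the parameter names `lam` and `gamma` are assumptions made purely for illustration.

```python
from typing import Iterable


def l1_penalty(w: float, lam: float) -> float:
    """Convex L1 penalty: grows linearly with |w| without bound."""
    return lam * abs(w)


def mcp_penalty(w: float, lam: float, gamma: float) -> float:
    """Minimax concave penalty (MCP), a well-known non-convex penalty.

    It matches L1 near zero, then flattens out to the constant
    gamma * lam**2 / 2 once |w| exceeds gamma * lam, so large
    weights are not shrunk further.
    """
    a = abs(w)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def total_penalty(weights: Iterable[float], lam: float, gamma: float) -> float:
    """Sum of MCP penalties over all weights, to be added to a training loss."""
    return sum(mcp_penalty(w, lam, gamma) for w in weights)
```

Because the MCP term is capped, it penalizes large weights less than L1 does, which is the usual motivation for non-convex penalties of this kind.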

Semi-supervised learning based on generative adversarial network: a comparison between good GAN and bad GAN approach

Title Semi-supervised learning based on generative adversarial network: a comparison between good GAN and bad GAN approach
Authors Wenyuan Li, Zichen Wang, Jiayun Li, Jennifer Polson, William Speier, Corey Arnold
Abstract Recently, semi-supervised learning methods based on generative adversarial networks (GANs) have received much attention. Among them, two distinct approaches have achieved competitive results on a variety of benchmark datasets. Bad GAN learns a classifier with unrealistic samples distributed on the complement of the support of the input data. Conversely, Triple GAN consists of a three-player game that tries to leverage good generated samples to boost classification results. In this paper, we perform a comprehensive comparison of these two approaches on different benchmark datasets. We demonstrate their different properties on image generation, and sensitivity to the amount of labeled data provided. By comprehensively comparing these two methods, we hope to shed light on the future of GAN-based semi-supervised learning.
Tasks Image Generation
Published 2019-05-16
URL https://arxiv.org/abs/1905.06484v2
PDF https://arxiv.org/pdf/1905.06484v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-based-on-generative
Repo
Framework

Spatiotemporal Pyramid Network for Video Action Recognition

Title Spatiotemporal Pyramid Network for Video Action Recognition
Authors Yunbo Wang, Mingsheng Long, Jianmin Wang, Philip S. Yu
Abstract Two-stream convolutional networks have shown strong performance in video action recognition tasks. The key idea is to learn spatiotemporal features by fusing convolutional networks spatially and temporally. However, it remains unclear how to model the correlations between the spatial and temporal structures at multiple abstraction levels. First, the spatial stream tends to fail if two videos share similar backgrounds. Second, the temporal stream may be fooled if two actions resemble each other in short snippets, though they appear distinct in the long term. We propose a novel spatiotemporal pyramid network to fuse the spatial and temporal features in a pyramid structure such that they can reinforce each other. From the architecture perspective, our network constitutes hierarchical fusion strategies which can be trained as a whole using a unified spatiotemporal loss. A series of ablation experiments support the importance of each fusion strategy. From the technical perspective, we introduce the spatiotemporal compact bilinear operator into video analysis tasks. This operator enables efficient training of bilinear fusion operations which can capture full interactions between the spatial and temporal features. Our final network achieves state-of-the-art results on standard video datasets.
Tasks Temporal Action Localization
Published 2019-03-04
URL http://arxiv.org/abs/1903.01038v1
PDF http://arxiv.org/pdf/1903.01038v1.pdf
PWC https://paperswithcode.com/paper/spatiotemporal-pyramid-network-for-video
Repo
Framework

Towards Expressive Priors for Bayesian Neural Networks: Poisson Process Radial Basis Function Networks

Title Towards Expressive Priors for Bayesian Neural Networks: Poisson Process Radial Basis Function Networks
Authors Beau Coker, Melanie F. Pradier, Finale Doshi-Velez
Abstract While Bayesian neural networks have many appealing characteristics, current priors do not easily allow users to specify basic properties such as expected lengthscale or amplitude variance. In this work, we introduce Poisson Process Radial Basis Function Networks, a novel prior that is able to encode amplitude stationarity and input-dependent lengthscale. We prove that our novel formulation allows for a decoupled specification of these properties, and that the estimated regression function is consistent as the number of observations tends to infinity. We demonstrate its behavior on synthetic and real examples.
Tasks
Published 2019-12-12
URL https://arxiv.org/abs/1912.05779v1
PDF https://arxiv.org/pdf/1912.05779v1.pdf
PWC https://paperswithcode.com/paper/towards-expressive-priors-for-bayesian-neural
Repo
Framework

Exploring Offline Policy Evaluation for the Continuous-Armed Bandit Problem

Title Exploring Offline Policy Evaluation for the Continuous-Armed Bandit Problem
Authors Jules Kruijswijk, Petri Parvinen, Maurits Kaptein
Abstract The (contextual) multi-armed bandit problem (MAB) provides a formalization of sequential decision-making which has many applications. However, validly evaluating MAB policies is challenging; we either resort to simulations which inherently include debatable assumptions, or we resort to expensive field trials. Recently an offline evaluation method has been suggested that is based on empirical data, thus relaxing the assumptions, and can be used to evaluate multiple competing policies in parallel. This method is, however, not directly suited for the continuous-armed bandit (CAB) problem, an often-encountered version of the MAB problem in which the action set is continuous instead of discrete. We propose and evaluate an extension of the existing method such that it can be used to evaluate CAB policies. We empirically demonstrate that our method provides a relatively consistent ranking of policies. Furthermore, we detail how our method can be used to select policies in a real-life CAB problem.
Tasks Decision Making
Published 2019-08-21
URL https://arxiv.org/abs/1908.07808v1
PDF https://arxiv.org/pdf/1908.07808v1.pdf
PWC https://paperswithcode.com/paper/190807808
Repo
Framework
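
One way to picture offline evaluation with a continuous action set is a replay-style evaluator that accepts a logged event when the policy's proposed action falls close enough to the logged one. The sketch below is an illustration only, not the authors' exact procedure: the acceptance window `delta`, the `FixedPolicy` class, and the `(action, reward)` log format are all hypothetical, and it assumes the logged actions were chosen at random.

```python
from typing import List, Protocol, Tuple


class Policy(Protocol):
    def choose(self) -> float: ...
    def update(self, action: float, reward: float) -> None: ...


class FixedPolicy:
    """Hypothetical policy that always proposes the same action."""

    def __init__(self, action: float) -> None:
        self.action = action

    def choose(self) -> float:
        return self.action

    def update(self, action: float, reward: float) -> None:
        pass  # a learning policy would adjust its estimates here


def replay_evaluate(
    policy: Policy, logged: List[Tuple[float, float]], delta: float
) -> Tuple[float, int]:
    """Replay logged (action, reward) pairs against a policy.

    An event counts only if the proposed action lies within `delta`
    of the logged action; other events are skipped.
    Returns (mean reward over accepted events, number accepted).
    """
    accepted = 0
    total_reward = 0.0
    for logged_action, reward in logged:
        proposed = policy.choose()
        if abs(proposed - logged_action) <= delta:
            policy.update(logged_action, reward)
            total_reward += reward
            accepted += 1
    mean = total_reward / accepted if accepted else float("nan")
    return mean, accepted
```

A wider `delta` accepts more events but compares the policy against rewards observed for less similar actions, so the window trades sample size against fidelity.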

Geometric Capsule Autoencoders for 3D Point Clouds

Title Geometric Capsule Autoencoders for 3D Point Clouds
Authors Nitish Srivastava, Hanlin Goh, Ruslan Salakhutdinov
Abstract We propose a method to learn object representations from 3D point clouds using bundles of geometrically interpretable hidden units, which we call geometric capsules. Each geometric capsule represents a visual entity, such as an object or a part, and consists of two components: a pose and a feature. The pose encodes where the entity is, while the feature encodes what it is. We use these capsules to construct a Geometric Capsule Autoencoder that learns to group 3D points into parts (small local surfaces), and these parts into the whole object, in an unsupervised manner. Our novel Multi-View Agreement voting mechanism is used to discover an object’s canonical pose and its pose-invariant feature vector. Using the ShapeNet and ModelNet40 datasets, we analyze the properties of the learned representations and show the benefits of having multiple votes agree. We perform alignment and retrieval of arbitrarily rotated objects – tasks that evaluate our model’s object identification and canonical pose recovery capabilities – and obtain insightful results.
Tasks
Published 2019-12-06
URL https://arxiv.org/abs/1912.03310v1
PDF https://arxiv.org/pdf/1912.03310v1.pdf
PWC https://paperswithcode.com/paper/geometric-capsule-autoencoders-for-3d-point
Repo
Framework

A Simple and Strong Convolutional-Attention Network for Irregular Text Recognition

Title A Simple and Strong Convolutional-Attention Network for Irregular Text Recognition
Authors Lu Yang, Peng Wang, Hui Li, Ye Gao, Linjiang Zhang, Chunhua Shen, Yanning Zhang
Abstract Reading irregular scene text of arbitrary shape in natural images is still a challenging problem, despite the progress made recently. Many existing approaches incorporate sophisticated network structures to handle various shapes, use extra annotations for stronger supervision, or employ hard-to-train recurrent neural networks for sequence modeling. In this work, we propose a simple yet robust approach for scene text recognition. With no need to convert input images to sequence representations, we directly connect two-dimensional CNN features to an attention-based sequence decoder. As no recurrent module is adopted, our model can be trained in parallel. It achieves 1.7x to 10x acceleration of the backward pass and 1.4x to 9x acceleration of the forward pass, compared with its RNN counterparts. The proposed model is trained with only word-level annotations. With this simple design, our method achieves state-of-the-art or competitive recognition performance on the evaluated regular and irregular scene text benchmark datasets.
Tasks Irregular Text Recognition, Scene Text Recognition
Published 2019-04-02
URL https://arxiv.org/abs/1904.01375v3
PDF https://arxiv.org/pdf/1904.01375v3.pdf
PWC https://paperswithcode.com/paper/a-simple-and-robust-convolutional-attention
Repo
Framework

Classifying Norm Conflicts using Learned Semantic Representations

Title Classifying Norm Conflicts using Learned Semantic Representations
Authors João Paulo Aires, Roger Granada, Juarez Monteiro, Rodrigo C. Barros, Felipe Meneguzzi
Abstract While most social norms are informal, they are often formalized by companies in contracts to regulate trades of goods and services. When poorly written, contracts may contain normative conflicts resulting from opposing deontic meanings or contradictory specifications. As contracts tend to be long and contain many norms, manually identifying such conflicts requires human effort, which is time-consuming and error-prone. Automating this task benefits contract makers by increasing productivity and making conflict identification more reliable. To address this problem, we introduce an approach to detect and classify norm conflicts in contracts by converting them into latent representations that preserve both syntactic and semantic information and training a model to classify norm conflicts into four conflict types. Our results set a new state of the art when compared to a previous approach.
Tasks
Published 2019-05-13
URL https://arxiv.org/abs/1906.02121v1
PDF https://arxiv.org/pdf/1906.02121v1.pdf
PWC https://paperswithcode.com/paper/190602121
Repo
Framework

Explaining the Predictions of Any Image Classifier via Decision Trees

Title Explaining the Predictions of Any Image Classifier via Decision Trees
Authors Sheng Shi, Xinfeng Zhang, Wei Fan
Abstract Despite their outstanding contribution to the progress of Artificial Intelligence (AI), deep learning models remain mostly black boxes, which are extremely weak in explaining their reasoning process and prediction results. Explainability is not only a gateway between AI and society but also a powerful tool to detect flaws in the model and biases in the data. Local Interpretable Model-agnostic Explanation (LIME) is a recent approach that uses an interpretable model to form a local explanation for an individual prediction result. The current implementation of LIME adopts linear regression as its interpretable function. However, being so restricted and usually over-simplifying the relationships, linear models fail in situations where nonlinear associations and interactions exist among features and prediction results. This paper implements a decision tree-based LIME approach, which uses a decision tree model to form an interpretable representation that is locally faithful to the original model. The Tree-LIME approach can capture nonlinear interactions among features in the data and create plausible explanations. Various experiments show that the Tree-LIME explanation of multiple black-box models can achieve more reliable performance in terms of understandability, fidelity, and efficiency.
Tasks
Published 2019-11-04
URL https://arxiv.org/abs/1911.01058v2
PDF https://arxiv.org/pdf/1911.01058v2.pdf
PWC https://paperswithcode.com/paper/explaining-the-predictions-of-any-image
Repo
Framework

3D-Rotation-Equivariant Quaternion Neural Networks

Title 3D-Rotation-Equivariant Quaternion Neural Networks
Authors Binbin Zhang, Wen Shen, Shikun Huang, Zhihua Wei, Quanshi Zhang
Abstract This paper proposes a set of rules to revise various neural networks for 3D point cloud processing to rotation-equivariant quaternion neural networks (REQNNs). We find that when a neural network uses quaternion features under certain conditions, the network feature naturally has the rotation-equivariance property. Rotation equivariance means that applying a specific rotation transformation to the input point cloud is equivalent to applying the same rotation transformation to all intermediate-layer quaternion features. Besides, the REQNN also ensures that the intermediate-layer features are invariant to the permutation of input points. Compared with the original neural network, the REQNN exhibits higher rotation robustness.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.09040v1
PDF https://arxiv.org/pdf/1911.09040v1.pdf
PWC https://paperswithcode.com/paper/3d-rotation-equivariant-quaternion-neural
Repo
Framework
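
The rotation-equivariance property described in the abstract can be checked numerically on a toy case. The sketch below encodes 3D vectors as pure quaternions, rotates them by conjugation with a unit quaternion, and confirms that a linear combination with real scalar weights commutes with the rotation. It illustrates the property only and is not the paper's set of revision rules; the helper names (`qmul`, `rotate`, `real_linear`) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Quat = Tuple[float, float, float, float]  # (w, x, y, z)


def qmul(a: Quat, b: Quat) -> Quat:
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def qconj(q: Quat) -> Quat:
    """Conjugate; equals the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def axis_angle(axis: Sequence[float], angle: float) -> Quat:
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def rotate(q: Quat, p: Quat) -> Quat:
    """Rotate the pure quaternion p by the unit quaternion q: q p q*."""
    return qmul(qmul(q, p), qconj(q))


def real_linear(weights: Sequence[float], feats: List[Quat]) -> Quat:
    """Toy 'layer': weighted sum of quaternion features with real weights."""
    return tuple(sum(w * f[i] for w, f in zip(weights, feats)) for i in range(4))


q = axis_angle((0.0, 0.0, 1.0), math.pi / 2)
feats = [(0.0, 1.0, 2.0, 3.0), (0.0, -1.0, 0.5, 2.0)]
weights = [0.3, -1.2]

rotate_after = rotate(q, real_linear(weights, feats))
rotate_before = real_linear(weights, [rotate(q, f) for f in feats])
# Both orders agree up to floating-point error.
print(all(abs(a - b) < 1e-9 for a, b in zip(rotate_after, rotate_before)))
```

Rotating the inputs and then applying the layer gives the same result as applying the layer and then rotating the output, which is the sense of equivariance the abstract states for intermediate-layer features.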

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes

Title Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes
Authors Yunsu Kim, Julian Schamper, Hermann Ney
Abstract We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words. We scale up the expectation-maximization (EM) algorithm to learn a large translation table without any parallel text or seed lexicon. First, we solve the memory bottleneck and enforce the sparsity with a simple thresholding scheme for the lexicon. Second, we initialize the lexicon training with word classes, which efficiently boosts the performance. Our methods produced promising results on two large-scale unsupervised translation tasks.
Tasks
Published 2019-01-06
URL http://arxiv.org/abs/1901.01577v1
PDF http://arxiv.org/pdf/1901.01577v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-training-for-large-vocabulary
Repo
Framework

The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context

Title The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context
Authors B. Hecht, D. Gergle
Abstract This study explores language’s fragmenting effect on user-generated content by examining the diversity of knowledge representations across 25 different Wikipedia language editions. This diversity is measured at two levels: the concepts that are included in each edition and the ways in which these concepts are described. We demonstrate that the diversity present is greater than has been presumed in the literature and has a significant influence on applications that use Wikipedia as a source of world knowledge. We close by explicating how knowledge diversity can be beneficially leveraged to create “culturally-aware applications” and “hyperlingual applications”.
Tasks
Published 2019-04-02
URL http://arxiv.org/abs/1904.01689v1
PDF http://arxiv.org/pdf/1904.01689v1.pdf
PWC https://paperswithcode.com/paper/the-tower-of-babel-meets-web-20-user
Repo
Framework

Automatic cephalometric landmarks detection on frontal faces: an approach based on supervised learning techniques

Title Automatic cephalometric landmarks detection on frontal faces: an approach based on supervised learning techniques
Authors Lucas Faria Porto, Laise Nascimento Correia Lima, Marta Flores, Andrea Valsecchi, Oscar Ibanez, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal
Abstract Facial landmarks are employed in many research areas, with facial recognition, craniofacial identification, and age and sex estimation among the most important. In the forensic field, the focus is on the analysis of a particular set of facial landmarks, defined as cephalometric landmarks. Previous works demonstrated that the descriptive adequacy of these anatomical references for an indirect application (photo-anthropometric description) increased the marking precision of these points, contributing to a greater reliability of these analyses. However, most of these analyses are performed manually, and all of them are subject to the inherent subjectivity of the expert examiners. In this sense, the purpose of this work is the development and validation of automatic techniques to detect cephalometric landmarks from digital images of frontal faces in the forensic field. The presented approach uses a combination of computer vision and image processing techniques within a supervised learning procedure. The proposed methodology achieves precision similar to that of a group of human experts manually marking cephalometric references and proves more accurate than other state-of-the-art facial landmark detection frameworks. It achieves a normalized mean distance (in pixels) error of 0.014, similar to the mean inter-expert dispersion (0.009) and clearly better than the other automatic approaches also analyzed in this work (0.026 and 0.101).
Tasks Facial Landmark Detection
Published 2019-04-24
URL http://arxiv.org/abs/1904.10816v1
PDF http://arxiv.org/pdf/1904.10816v1.pdf
PWC https://paperswithcode.com/paper/automatic-cephalometric-landmarks-detection
Repo
Framework
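
The reported metric, a normalized mean distance error, can be sketched as the average Euclidean distance between predicted and reference landmarks divided by a normalizing length. The abstract does not say which length is used, so the `norm` argument below (for example an image diagonal or an inter-ocular distance) is an assumption, as are the function and parameter names.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def normalized_mean_error(
    predicted: Sequence[Point], reference: Sequence[Point], norm: float
) -> float:
    """Mean Euclidean distance between matching landmarks, divided by `norm`.

    `norm` is a hypothetical normalizing length in pixels; dividing by it
    makes errors comparable across images of different sizes.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("landmark lists must be non-empty and of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances) / norm
```

For instance, a mean pixel distance of 2.5 with a normalizing length of 100 pixels would give an error of 0.025.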

Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

Title Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling
Authors Randell Cotta, Mingyang Hu, Dan Jiang, Peizhou Liao
Abstract We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed “identities”, represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data allows us to generate “identity-based”, rather than “identifier-based”, user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of “finite-sample bias” that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers having little data themselves, but that can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy.
Tasks
Published 2019-01-04
URL http://arxiv.org/abs/1901.05560v1
PDF http://arxiv.org/pdf/1901.05560v1.pdf
PWC https://paperswithcode.com/paper/off-policy-evaluation-of-probabilistic
Repo
Framework
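
Since the abstract refers to inverse propensity weights, a minimal sketch of a generic inverse-propensity-scored estimate may help. This is the textbook estimator, not the authors' analysis pipeline; the optional `clip` threshold is an assumed, commonly used way to limit the influence of very large weights, and all names are hypothetical.

```python
from typing import Optional, Sequence


def ips_estimate(
    rewards: Sequence[float],
    logging_probs: Sequence[float],
    target_probs: Sequence[float],
    clip: Optional[float] = None,
) -> float:
    """Inverse-propensity-scored estimate of a target policy's mean reward.

    Each logged reward is reweighted by target_prob / logging_prob, the
    ratio of how likely the target policy and the logging policy were to
    take the logged action. If `clip` is given, weights are capped at it.
    """
    if not rewards:
        raise ValueError("need at least one logged event")
    total = 0.0
    for reward, p_log, p_tgt in zip(rewards, logging_probs, target_probs):
        weight = p_tgt / p_log
        if clip is not None:
            weight = min(weight, clip)
        total += weight * reward
    return total / len(rewards)
```

With rare events such as conversions, a handful of heavily weighted positives can dominate the sum in small samples, which is one reason the handling of the weights matters in practice.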

Benchmark Dataset for Timetable Optimization of Bus Routes in the City of New Delhi

Title Benchmark Dataset for Timetable Optimization of Bus Routes in the City of New Delhi
Authors Anubhav Jain, Avdesh Kumar, Saumya Balodi, Pravesh Biyani
Abstract Public transport is one of the major forms of transportation in the world. This makes it vital to ensure that public transport is efficient. This research presents a novel real-time GPS bus transit dataset for over 500 bus routes operating in New Delhi. The data can be used for modeling various timetable optimization tasks as well as in other domains such as traffic management, travel time estimation, etc. The paper also presents an approach to reduce the waiting time of Delhi buses by analyzing the traffic behavior and proposing a timetable. This algorithm serves as a benchmark for the dataset. The algorithm uses a constrained clustering algorithm for classification of trips. It further analyses the data statistically to provide a timetable which is efficient in learning the inter- and intra-month variations.
Tasks
Published 2019-10-20
URL https://arxiv.org/abs/1910.08903v1
PDF https://arxiv.org/pdf/1910.08903v1.pdf
PWC https://paperswithcode.com/paper/benchmark-dataset-for-timetable-optimization
Repo
Framework