January 27, 2020

Paper Group ANR 1116

Regularized deep learning with non-convex penalties. Semi-supervised learning based on generative adversarial network: a comparison between good GAN and bad GAN approach. Spatiotemporal Pyramid Network for Video Action Recognition. Towards Expressive Priors for Bayesian Neural Networks: Poisson Process Radial Basis Function Networks. Exploring Offl …

Regularized deep learning with non-convex penalties


Title	Regularized deep learning with non-convex penalties
Authors	Sujit Vettam, Majnu John
Abstract	Regularization methods are often employed in deep learning neural networks (DNNs) to prevent overfitting. For penalty-based methods for DNN regularization, typically only convex penalties are considered because of their optimization guarantees. Recent theoretical work has shown that non-convex penalties that satisfy certain regularity conditions are also guaranteed to perform well with standard optimization algorithms. In this paper, we examine new and currently existing non-convex penalties for DNN regularization. We provide theoretical justifications for the new penalties and also assess the performance of all penalties on DNN analysis of real datasets.
Tasks
Published	2019-09-11
URL	https://arxiv.org/abs/1909.05142v3
PDF	https://arxiv.org/pdf/1909.05142v3.pdf
PWC	https://paperswithcode.com/paper/regularized-deep-learning-with-a-non-convex
Repo
Framework
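
As a rough illustration of what a non-convex penalty looks like next to a convex one, the sketch below implements the minimax concave penalty (MCP), a well-known non-convex penalty from the sparse regression literature. The abstract does not name the penalties the paper examines, so the choice of MCP, the helper names, and the default values of `lam` and `gamma` are all assumptions made for illustration only.

```python
def l1_penalty(w, lam=1.0):
    """Convex L1 penalty on a single weight, shown for comparison."""
    return lam * abs(w)


def mcp_penalty(w, lam=1.0, gamma=3.0):
    """Minimax concave penalty (MCP) on a single weight.

    Illustrative choice only: it behaves like L1 near zero, then
    flattens to the constant gamma * lam**2 / 2 once |w| > gamma * lam.
    """
    a = abs(w)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def regularized_loss(data_loss, weights, lam=1.0, gamma=3.0):
    """Data loss plus the MCP penalty summed over all weights (hypothetical helper)."""
    return data_loss + sum(mcp_penalty(w, lam, gamma) for w in weights)
```

Unlike L1, this penalty stops growing for large weights, so large weights are not shrunk further. That flattening is also what makes it non-convex.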

Semi-supervised learning based on generative adversarial network: a comparison between good GAN and bad GAN approach


Title	Semi-supervised learning based on generative adversarial network: a comparison between good GAN and bad GAN approach
Authors	Wenyuan Li, Zichen Wang, Jiayun Li, Jennifer Polson, William Speier, Corey Arnold
Abstract	Recently, semi-supervised learning methods based on generative adversarial networks (GANs) have received much attention. Among them, two distinct approaches have achieved competitive results on a variety of benchmark datasets. Bad GAN learns a classifier with unrealistic samples distributed on the complement of the support of the input data. Conversely, Triple GAN consists of a three-player game that tries to leverage good generated samples to boost classification results. In this paper, we perform a comprehensive comparison of these two approaches on different benchmark datasets. We demonstrate their different properties on image generation, and sensitivity to the amount of labeled data provided. By comprehensively comparing these two methods, we hope to shed light on the future of GAN-based semi-supervised learning.
Tasks	Image Generation
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06484v2
PDF	https://arxiv.org/pdf/1905.06484v2.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-learning-based-on-generative
Repo
Framework

Spatiotemporal Pyramid Network for Video Action Recognition


Title	Spatiotemporal Pyramid Network for Video Action Recognition
Authors	Yunbo Wang, Mingsheng Long, Jianmin Wang, Philip S. Yu
Abstract	Two-stream convolutional networks have shown strong performance in video action recognition tasks. The key idea is to learn spatiotemporal features by fusing convolutional networks spatially and temporally. However, it remains unclear how to model the correlations between the spatial and temporal structures at multiple abstraction levels. First, the spatial stream tends to fail if two videos share similar backgrounds. Second, the temporal stream may be fooled if two actions resemble each other in short snippets, though they appear distinct in the long term. We propose a novel spatiotemporal pyramid network to fuse the spatial and temporal features in a pyramid structure such that they can reinforce each other. From the architecture perspective, our network constitutes hierarchical fusion strategies which can be trained as a whole using a unified spatiotemporal loss. A series of ablation experiments support the importance of each fusion strategy. From the technical perspective, we introduce the spatiotemporal compact bilinear operator into video analysis tasks. This operator enables efficient training of bilinear fusion operations which can capture full interactions between the spatial and temporal features. Our final network achieves state-of-the-art results on standard video datasets.
Tasks	Temporal Action Localization
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01038v1
PDF	http://arxiv.org/pdf/1903.01038v1.pdf
PWC	https://paperswithcode.com/paper/spatiotemporal-pyramid-network-for-video
Repo
Framework

Towards Expressive Priors for Bayesian Neural Networks: Poisson Process Radial Basis Function Networks


Title	Towards Expressive Priors for Bayesian Neural Networks: Poisson Process Radial Basis Function Networks
Authors	Beau Coker, Melanie F. Pradier, Finale Doshi-Velez
Abstract	While Bayesian neural networks have many appealing characteristics, current priors do not easily allow users to specify basic properties such as expected lengthscale or amplitude variance. In this work, we introduce Poisson Process Radial Basis Function Networks, a novel prior that is able to encode amplitude stationarity and input-dependent lengthscale. We prove that our novel formulation allows for a decoupled specification of these properties, and that the estimated regression function is consistent as the number of observations tends to infinity. We demonstrate its behavior on synthetic and real examples.
Tasks
Published	2019-12-12
URL	https://arxiv.org/abs/1912.05779v1
PDF	https://arxiv.org/pdf/1912.05779v1.pdf
PWC	https://paperswithcode.com/paper/towards-expressive-priors-for-bayesian-neural
Repo
Framework

Exploring Offline Policy Evaluation for the Continuous-Armed Bandit Problem


Title	Exploring Offline Policy Evaluation for the Continuous-Armed Bandit Problem
Authors	Jules Kruijswijk, Petri Parvinen, Maurits Kaptein
Abstract	The (contextual) multi-armed bandit problem (MAB) provides a formalization of sequential decision-making which has many applications. However, validly evaluating MAB policies is challenging; we either resort to simulations which inherently include debatable assumptions, or we resort to expensive field trials. Recently an offline evaluation method has been suggested that is based on empirical data, thus relaxing the assumptions, and can be used to evaluate multiple competing policies in parallel. This method is, however, not directly suited to the continuous-armed bandit (CAB) problem, an often-encountered version of the MAB problem in which the action set is continuous instead of discrete. We propose and evaluate an extension of the existing method such that it can be used to evaluate CAB policies. We empirically demonstrate that our method provides a relatively consistent ranking of policies. Furthermore, we detail how our method can be used to select policies in a real-life CAB problem.
Tasks	Decision Making
Published	2019-08-21
URL	https://arxiv.org/abs/1908.07808v1
PDF	https://arxiv.org/pdf/1908.07808v1.pdf
PWC	https://paperswithcode.com/paper/190807808
Repo
Framework
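
The abstract does not spell out how the offline method is extended to continuous actions. One plausible reading, sketched below purely as an assumption, is a replay-style evaluator that accepts a logged event when the policy's proposed action falls within a tolerance `delta` of the logged action, since exact matches almost never occur on a continuous action set. The `policy` interface (`choose`, `update`), the `FixedPolicy` class, and the `delta` parameter are hypothetical names introduced for this sketch.

```python
class FixedPolicy:
    """Hypothetical toy policy that always proposes the same action."""

    def __init__(self, action):
        self.action = action

    def choose(self, context):
        return self.action

    def update(self, context, action, reward):
        pass  # a learning policy would update its estimates here


def offline_evaluate(policy, logged, delta):
    """Replay logged (context, action, reward) events against a policy.

    Assumes the logged actions were drawn at random, independently of
    the context. An event counts only when the proposed action lies
    within `delta` of the logged action.
    Returns (mean reward over accepted events, number accepted).
    """
    total, accepted = 0.0, 0
    for context, action, reward in logged:
        proposed = policy.choose(context)
        if abs(proposed - action) <= delta:
            policy.update(context, action, reward)
            total += reward
            accepted += 1
    mean = total / accepted if accepted else float("nan")
    return mean, accepted
```

A wider `delta` accepts more events but evaluates the policy on actions it did not exactly propose, so the tolerance trades off sample size against fidelity.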

Geometric Capsule Autoencoders for 3D Point Clouds


Title	Geometric Capsule Autoencoders for 3D Point Clouds
Authors	Nitish Srivastava, Hanlin Goh, Ruslan Salakhutdinov
Abstract	We propose a method to learn object representations from 3D point clouds using bundles of geometrically interpretable hidden units, which we call geometric capsules. Each geometric capsule represents a visual entity, such as an object or a part, and consists of two components: a pose and a feature. The pose encodes where the entity is, while the feature encodes what it is. We use these capsules to construct a Geometric Capsule Autoencoder that learns to group 3D points into parts (small local surfaces), and these parts into the whole object, in an unsupervised manner. Our novel Multi-View Agreement voting mechanism is used to discover an object’s canonical pose and its pose-invariant feature vector. Using the ShapeNet and ModelNet40 datasets, we analyze the properties of the learned representations and show the benefits of having multiple votes agree. We perform alignment and retrieval of arbitrarily rotated objects – tasks that evaluate our model’s object identification and canonical pose recovery capabilities – and obtain insightful results.
Tasks
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03310v1
PDF	https://arxiv.org/pdf/1912.03310v1.pdf
PWC	https://paperswithcode.com/paper/geometric-capsule-autoencoders-for-3d-point
Repo
Framework

A Simple and Strong Convolutional-Attention Network for Irregular Text Recognition


Title	A Simple and Strong Convolutional-Attention Network for Irregular Text Recognition
Authors	Lu Yang, Peng Wang, Hui Li, Ye Gao, Linjiang Zhang, Chunhua Shen, Yanning Zhang
Abstract	Reading irregular scene text of arbitrary shape in natural images is still a challenging problem, despite the progress made recently. Many existing approaches incorporate sophisticated network structures to handle various shapes, use extra annotations for stronger supervision, or employ hard-to-train recurrent neural networks for sequence modeling. In this work, we propose a simple yet robust approach for scene text recognition. With no need to convert input images to sequence representations, we directly connect two-dimensional CNN features to an attention-based sequence decoder. As no recurrent module is adopted, our model can be trained in parallel. It achieves a 1.7x to 10x acceleration of the backward pass and a 1.4x to 9x acceleration of the forward pass, compared with its RNN counterparts. The proposed model is trained with only word-level annotations. With this simple design, our method achieves state-of-the-art or competitive recognition performance on the evaluated regular and irregular scene text benchmark datasets.
Tasks	Irregular Text Recognition, Scene Text Recognition
Published	2019-04-02
URL	https://arxiv.org/abs/1904.01375v3
PDF	https://arxiv.org/pdf/1904.01375v3.pdf
PWC	https://paperswithcode.com/paper/a-simple-and-robust-convolutional-attention
Repo
Framework

Classifying Norm Conflicts using Learned Semantic Representations


Title	Classifying Norm Conflicts using Learned Semantic Representations
Authors	João Paulo Aires, Roger Granada, Juarez Monteiro, Rodrigo C. Barros, Felipe Meneguzzi
Abstract	While most social norms are informal, they are often formalized by companies in contracts to regulate trades of goods and services. When poorly written, contracts may contain normative conflicts resulting from opposing deontic meanings or contradictory specifications. As contracts tend to be long and contain many norms, manually identifying such conflicts requires human effort, which is time-consuming and error-prone. Automating this task benefits contract makers by increasing productivity and making conflict identification more reliable. To address this problem, we introduce an approach to detect and classify norm conflicts in contracts by converting them into latent representations that preserve both syntactic and semantic information and training a model to classify norm conflicts into four conflict types. Our results reach a new state of the art when compared to a previous approach.
Tasks
Published	2019-05-13
URL	https://arxiv.org/abs/1906.02121v1
PDF	https://arxiv.org/pdf/1906.02121v1.pdf
PWC	https://paperswithcode.com/paper/190602121
Repo
Framework

Explaining the Predictions of Any Image Classifier via Decision Trees


Title	Explaining the Predictions of Any Image Classifier via Decision Trees
Authors	Sheng Shi, Xinfeng Zhang, Wei Fan
Abstract	Despite their outstanding contribution to the significant progress of Artificial Intelligence (AI), deep learning models remain mostly black boxes, which are extremely weak in explainability of the reasoning process and prediction results. Explainability is not only a gateway between AI and society but also a powerful tool to detect flaws in the model and biases in the data. Local Interpretable Model-agnostic Explanation (LIME) is a recent approach that uses an interpretable model to form a local explanation for an individual prediction result. The current implementation of LIME adopts linear regression as its interpretable function. However, being so restricted and usually over-simplifying the relationships, linear models fail in situations where nonlinear associations and interactions exist among features and prediction results. This paper implements a decision tree-based LIME approach, which uses a decision tree model to form an interpretable representation that is locally faithful to the original model. The Tree-LIME approach can capture nonlinear interactions among features in the data and create plausible explanations. Various experiments show that the Tree-LIME explanation of multiple black-box models can achieve more reliable performance in terms of understandability, fidelity, and efficiency.
Tasks
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01058v2
PDF	https://arxiv.org/pdf/1911.01058v2.pdf
PWC	https://paperswithcode.com/paper/explaining-the-predictions-of-any-image
Repo
Framework

3D-Rotation-Equivariant Quaternion Neural Networks


Title	3D-Rotation-Equivariant Quaternion Neural Networks
Authors	Binbin Zhang, Wen Shen, Shikun Huang, Zhihua Wei, Quanshi Zhang
Abstract	This paper proposes a set of rules to revise various neural networks for 3D point cloud processing to rotation-equivariant quaternion neural networks (REQNNs). We find that when a neural network uses quaternion features under certain conditions, the network feature naturally has the rotation-equivariance property. Rotation equivariance means that applying a specific rotation transformation to the input point cloud is equivalent to applying the same rotation transformation to all intermediate-layer quaternion features. Besides, the REQNN also ensures that the intermediate-layer features are invariant to the permutation of input points. Compared with the original neural network, the REQNN exhibits higher rotation robustness.
Tasks
Published	2019-11-20
URL	https://arxiv.org/abs/1911.09040v1
PDF	https://arxiv.org/pdf/1911.09040v1.pdf
PWC	https://paperswithcode.com/paper/3d-rotation-equivariant-quaternion-neural
Repo
Framework
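
The equivariance property described in the abstract can be checked numerically on a toy case. The sketch below represents 3D points as pure quaternions, rotates them with a unit quaternion, and confirms that a weighted sum with real scalar weights commutes with the rotation. The real-weighted sum is only one simple operation with this property; it is an assumption chosen for illustration, not the paper's rule set, and all function names here are hypothetical.

```python
import math


def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def qconj(q):
    """Quaternion conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def axis_angle(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def rotate(q, p):
    """Rotate the pure quaternion p = (0, x, y, z) by the unit quaternion q."""
    return qmul(qmul(q, p), qconj(q))


def real_linear(points, weights):
    """Toy 'layer': sum_i w_i * p_i with real scalar weights w_i."""
    return tuple(
        sum(w * p[k] for w, p in zip(weights, points)) for k in range(4)
    )
```

Rotating the inputs and then applying `real_linear` gives the same quaternion as applying `real_linear` first and rotating the output, up to floating-point error.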

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes


Title	Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes
Authors	Yunsu Kim, Julian Schamper, Hermann Ney
Abstract	We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words. We scale up the expectation-maximization (EM) algorithm to learn a large translation table without any parallel text or seed lexicon. First, we solve the memory bottleneck and enforce the sparsity with a simple thresholding scheme for the lexicon. Second, we initialize the lexicon training with word classes, which efficiently boosts the performance. Our methods produced promising results on two large-scale unsupervised translation tasks.
Tasks
Published	2019-01-06
URL	http://arxiv.org/abs/1901.01577v1
PDF	http://arxiv.org/pdf/1901.01577v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-training-for-large-vocabulary
Repo
Framework

The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context


Title	The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context
Authors	B. Hecht, D. Gergle
Abstract	This study explores language’s fragmenting effect on user-generated content by examining the diversity of knowledge representations across 25 different Wikipedia language editions. This diversity is measured at two levels: the concepts that are included in each edition and the ways in which these concepts are described. We demonstrate that the diversity present is greater than has been presumed in the literature and has a significant influence on applications that use Wikipedia as a source of world knowledge. We close by explicating how knowledge diversity can be beneficially leveraged to create “culturally-aware applications” and “hyperlingual applications”.
Tasks
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01689v1
PDF	http://arxiv.org/pdf/1904.01689v1.pdf
PWC	https://paperswithcode.com/paper/the-tower-of-babel-meets-web-20-user
Repo
Framework

Automatic cephalometric landmarks detection on frontal faces: an approach based on supervised learning techniques


Title	Automatic cephalometric landmarks detection on frontal faces: an approach based on supervised learning techniques
Authors	Lucas Faria Porto, Laise Nascimento Correia Lima, Marta Flores, Andrea Valsecchi, Oscar Ibanez, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal
Abstract	Facial landmarks are employed in many research areas, with facial recognition, craniofacial identification, and age and sex estimation among the most important. In the forensic field, the focus is on the analysis of a particular set of facial landmarks, defined as cephalometric landmarks. Previous works demonstrated that the descriptive adequacy of these anatomical references for an indirect application (photo-anthropometric description) increased the marking precision of these points, contributing to a greater reliability of these analyses. However, most of them are performed manually and all of them are subject to the subjectivity inherent to expert examiners. In this sense, the purpose of this work is the development and validation of automatic techniques to detect cephalometric landmarks from digital images of frontal faces in the forensic field. The presented approach uses a combination of computer vision and image processing techniques within a supervised learning procedure. The proposed methodology obtains similar precision to a group of human manual cephalometric reference markers and proves to be more accurate than other state-of-the-art facial landmark detection frameworks. It achieves a normalized mean distance (in pixels) error of 0.014, similar to the mean inter-expert dispersion (0.009) and clearly better than the other automatic approaches analyzed in this work (0.026 and 0.101).
Tasks	Facial Landmark Detection
Published	2019-04-24
URL	http://arxiv.org/abs/1904.10816v1
PDF	http://arxiv.org/pdf/1904.10816v1.pdf
PWC	https://paperswithcode.com/paper/automatic-cephalometric-landmarks-detection
Repo
Framework

Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling


Title	Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling
Authors	Randell Cotta, Mingyang Hu, Dan Jiang, Peizhou Liao
Abstract	We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed “identities”, represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data allows us to generate “identity-based”, rather than “identifier-based”, user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of “finite-sample bias” that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers having little data themselves, but that can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy.
Tasks
Published	2019-01-04
URL	http://arxiv.org/abs/1901.05560v1
PDF	http://arxiv.org/pdf/1901.05560v1.pdf
PWC	https://paperswithcode.com/paper/off-policy-evaluation-of-probabilistic
Repo
Framework
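
The abstract mentions inverse propensity weights without saying how they are handled. As a minimal sketch, the code below implements the textbook inverse propensity scoring (IPS) estimator with optional weight clipping. Clipping is one common way to tame large weights and is an assumption here, not necessarily what the paper does. The log layout `(context, action, reward, logging_prob)` and the function names are hypothetical.

```python
def ips_estimate(logs, target_prob, clip=None):
    """Inverse propensity scoring estimate of a target policy's mean reward.

    logs: list of (context, action, reward, logging_prob), where
        logging_prob is the probability with which the logging policy
        chose `action` in `context` (must be > 0).
    target_prob: function (context, action) -> probability that the
        target policy would choose `action` in `context`.
    clip: optional upper bound on the importance weight.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        weight = target_prob(context, action) / logging_prob
        if clip is not None:
            weight = min(weight, clip)
        total += weight * reward
    return total / len(logs)
```

With few logged events and rare rewards, a single event with a small logging probability can dominate the estimate. In a four-event log where one rewarded event has logging probability 0.1, the unclipped estimate can exceed 1 even though every reward is 0 or 1.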

Benchmark Dataset for Timetable Optimization of Bus Routes in the City of New Delhi


Title	Benchmark Dataset for Timetable Optimization of Bus Routes in the City of New Delhi
Authors	Anubhav Jain, Avdesh Kumar, Saumya Balodi, Pravesh Biyani
Abstract	Public transport is one of the major forms of transportation in the world. This makes it vital to ensure that public transport is efficient. This research presents a novel real-time GPS bus transit dataset covering over 500 bus routes operating in New Delhi. The data can be used for modeling various timetable optimization tasks as well as in other domains such as traffic management, travel time estimation, etc. The paper also presents an approach to reduce the waiting time of Delhi buses by analyzing the traffic behavior and proposing a timetable. This algorithm serves as a benchmark for the dataset. The algorithm uses a constrained clustering algorithm for classification of trips. It further analyses the data statistically to provide a timetable which is efficient in learning the inter- and intra-month variations.
Tasks
Published	2019-10-20
URL	https://arxiv.org/abs/1910.08903v1
PDF	https://arxiv.org/pdf/1910.08903v1.pdf
PWC	https://paperswithcode.com/paper/benchmark-dataset-for-timetable-optimization
Repo
Framework