January 29, 2020

3009 words 15 mins read

Paper Group ANR 542

A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments. Exploring difference in public perceptions on HPV vaccine between gender groups from Twitter using deep learning. ABI Neural Ensemble Model for Gender Prediction Adapt Bar-Ilan Submission for the CLIN29 Shared Task on Gender Prediction. Towards Understanding the Spect …

A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments

Title A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments
Authors Adam Foster, Martin Jankowiak, Matthew O’Meara, Yee Whye Teh, Tom Rainforth
Abstract We introduce a fully stochastic gradient based approach to Bayesian optimal experimental design (BOED). Our approach utilizes variational lower bounds on the expected information gain (EIG) of an experiment that can be simultaneously optimized with respect to both the variational and design parameters. This allows the design process to be carried out through a single unified stochastic gradient ascent procedure, in contrast to existing approaches that typically construct a pointwise EIG estimator, before passing this estimator to a separate optimizer. We provide a number of different variational objectives including the novel adaptive contrastive estimation (ACE) bound. Finally, we show that our gradient-based approaches are able to provide effective design optimization in substantially higher dimensional settings than existing approaches.
Tasks
Published 2019-11-01
URL https://arxiv.org/abs/1911.00294v2
PDF https://arxiv.org/pdf/1911.00294v2.pdf
PWC https://paperswithcode.com/paper/a-unified-stochastic-gradient-approach-to
Repo
Framework
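
The paper's central move is to optimize a variational lower bound on the EIG with respect to the design and the variational parameters in one stochastic gradient loop. The sketch below illustrates that idea on a toy linear-Gaussian model with a Barber-Agakov-style posterior bound; the model, the variational family, and all hyperparameters are illustrative assumptions, not the paper's ACE estimator.

```python
# Hedged sketch: jointly ascend a variational lower bound on expected
# information gain (EIG) in both the design d and the variational
# parameters, in the spirit of the unified SGD approach above.
# Toy model (an assumption): theta ~ N(0, diag(prior_std^2)), y = d.theta + noise.
import torch

torch.manual_seed(0)
prior_std = torch.tensor([1.0, 0.1])
noise_std = 0.3

# Design d (kept on the unit sphere) and variational posterior
# q(theta | y) = N(w * y + b, diag(exp(2 * log_s))).
d = torch.nn.Parameter(torch.tensor([0.1, 1.0]))
w = torch.nn.Parameter(torch.zeros(2))
b = torch.nn.Parameter(torch.zeros(2))
log_s = torch.nn.Parameter(torch.zeros(2))
opt = torch.optim.Adam([d, w, b, log_s], lr=0.05)

for step in range(2000):
    theta = prior_std * torch.randn(256, 2)             # sample the prior
    d_unit = d / d.norm()
    y = theta @ d_unit + noise_std * torch.randn(256)   # simulate the experiment
    mu = w * y.unsqueeze(1) + b
    # E[log q(theta | y)] + const lower-bounds the EIG of design d
    # (Barber-Agakov / "posterior" bound).
    log_q = (-0.5 * ((theta - mu) / log_s.exp()) ** 2
             - log_s - 0.5 * torch.log(torch.tensor(2 * torch.pi))).sum(1)
    loss = -log_q.mean()
    opt.zero_grad(); loss.backward(); opt.step()

# The learned design should align with the high-variance prior component,
# i.e. roughly [1, 0] up to sign.
print("learned design:", (d / d.norm()).detach())
```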

Exploring difference in public perceptions on HPV vaccine between gender groups from Twitter using deep learning

Title Exploring difference in public perceptions on HPV vaccine between gender groups from Twitter using deep learning
Authors Jingcheng Du, Chongliang Luo, Qiang Wei, Yong Chen, Cui Tao
Abstract In this study, we proposed a convolutional neural network model for gender prediction using English Twitter text as input. An ensemble of the proposed models achieved an accuracy of 0.8237 on gender prediction and compared favorably with the state-of-the-art performance in a recent author profiling task. We further leveraged the trained models to predict the gender labels from an HPV vaccine-related corpus and identified gender differences in public perceptions regarding the HPV vaccine. The findings are largely consistent with previous survey-based studies.
Tasks Gender Prediction
Published 2019-07-06
URL https://arxiv.org/abs/1907.03167v1
PDF https://arxiv.org/pdf/1907.03167v1.pdf
PWC https://paperswithcode.com/paper/exploring-difference-in-public-perceptions-on
Repo
Framework
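
For orientation, here is a minimal Kim-style text CNN for binary gender prediction from tokenized tweets. The vocabulary size, filter widths, and layer sizes are assumptions made for the sketch, not the configuration reported in the paper; the reported ensemble would average the outputs of several such independently trained models.

```python
# Hedged sketch: a minimal Kim-style text CNN for binary gender prediction.
# Hyperparameters, vocabulary size, and layer sizes are illustrative
# assumptions, not the architecture reported in the paper.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=128, n_filters=100,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)     # (batch, emb_dim, seq_len)
        # Convolve, ReLU, then max-pool over time for each filter width.
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))    # logits over the two classes

model = TextCNN()
logits = model(torch.randint(1, 20000, (8, 40)))    # 8 fake tweets, 40 tokens each
print(logits.shape)                                 # torch.Size([8, 2])
```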

ABI Neural Ensemble Model for Gender Prediction Adapt Bar-Ilan Submission for the CLIN29 Shared Task on Gender Prediction

Title ABI Neural Ensemble Model for Gender Prediction Adapt Bar-Ilan Submission for the CLIN29 Shared Task on Gender Prediction
Authors Eva Vanmassenhove, Amit Moryossef, Alberto Poncelas, Andy Way, Dimitar Shterionov
Abstract We present our system for the CLIN29 shared task on cross-genre gender detection for Dutch. We experimented with a multitude of neural models (CNN, RNN, LSTM, etc.), more “traditional” models (SVM, RF, LogReg, etc.), different feature sets as well as data pre-processing. The final results suggested that using tokenized, non-lowercased data works best for most of the neural models, while a combination of word clusters, character trigrams and word lists proved most beneficial for the majority of the more “traditional” (that is, non-neural) models, beating features used in previous tasks such as n-grams, character n-grams, part-of-speech tags and combinations thereof. In contrast to the results described in previous comparable shared tasks, our neural models performed better than our best traditional approaches with our best feature set-up. Our final model was a weighted ensemble combining the top 25 models. It won both the in-domain gender prediction task and the cross-genre challenge, achieving an average accuracy of 64.93% on the in-domain gender prediction task and 56.26% on cross-genre gender prediction.
Tasks Gender Prediction
Published 2019-02-23
URL http://arxiv.org/abs/1902.08856v1
PDF http://arxiv.org/pdf/1902.08856v1.pdf
PWC https://paperswithcode.com/paper/abi-neural-ensemble-model-for-gender
Repo
Framework
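
The winning system is described as a weighted ensemble of the top 25 models. A minimal sketch of that combination step is below; the weighting by (hypothetical) validation accuracy is an assumption for illustration.

```python
# Hedged sketch: a weighted ensemble of per-model class probabilities, as a
# stand-in for the paper's top-25 weighted ensemble. The weighting scheme
# (validation accuracy) is an illustrative assumption.
import numpy as np

def weighted_ensemble(prob_list, weights):
    """Combine (n_samples, n_classes) probability arrays with given weights."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(prob_list, axis=0)          # (n_models, n_samples, n_classes)
    combined = np.tensordot(weights, stacked, axes=1)
    return combined.argmax(axis=1)                 # predicted class per sample

# Toy usage: three models, weighted by (hypothetical) validation accuracy.
rng = np.random.default_rng(0)
probs = [rng.dirichlet([1, 1], size=10) for _ in range(3)]
print(weighted_ensemble(probs, weights=[0.62, 0.58, 0.65]))
```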

Towards Understanding the Spectral Bias of Deep Learning

Title Towards Understanding the Spectral Bias of Deep Learning
Authors Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu
Abstract An intriguing phenomenon observed during the training of neural networks is spectral bias, which states that neural networks are biased towards learning less complex functions. This priority for learning functions of low complexity may be at the core of explaining the generalization ability of neural networks, and some efforts have been made to provide a theoretical explanation for spectral bias. However, there is still no satisfactory theoretical result justifying its underlying mechanism. In this paper, we give a comprehensive and rigorous explanation for spectral bias and relate it to the neural tangent kernel function proposed in recent work. We prove that the training process of neural networks can be decomposed along different directions defined by the eigenfunctions of the neural tangent kernel, where each direction has its own convergence rate determined by the corresponding eigenvalue. We then provide a case study where the input data is uniformly distributed over the unit sphere, and show that lower-degree spherical harmonics are learned more easily by over-parameterized neural networks.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01198v2
PDF https://arxiv.org/pdf/1912.01198v2.pdf
PWC https://paperswithcode.com/paper/towards-understanding-the-spectral-bias-of-1
Repo
Framework
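
The decomposition result has a simple kernel-regression analogue: under linearized (NTK) dynamics, the training residual shrinks along each kernel eigendirection at a rate set by the corresponding eigenvalue. The sketch below illustrates this with an RBF kernel standing in for the neural tangent kernel, which is an assumption made for illustration only.

```python
# Hedged sketch: kernel gradient descent fits large-eigenvalue (low-
# complexity) eigendirections first. An RBF kernel stands in for the
# neural tangent kernel here, an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)          # data on the unit sphere
K = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
lam, U = np.linalg.eigh(K)                             # eigenvalues in ascending order
y = rng.standard_normal(200)                           # arbitrary targets
eta = 1e-3

for t in [0, 100, 1000, 10000]:
    # After t steps of kernel gradient descent, the residual coefficient along
    # the i-th eigendirection has shrunk by a factor (1 - eta * lambda_i)^t.
    coeffs = (U.T @ y) * (1 - eta * lam) ** t
    print(f"t={t:6d}  residual along top eigendirection {abs(coeffs[-1]):.4f}, "
          f"along bottom eigendirection {abs(coeffs[0]):.4f}")
```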

Dimension Estimation Using Autoencoders

Title Dimension Estimation Using Autoencoders
Authors Nitish Bahadur, Randy Paffenroth
Abstract Dimension Estimation (DE) and Dimension Reduction (DR) are two closely related topics, but with quite different goals. In DE, one attempts to estimate the intrinsic dimensionality or number of latent variables in a set of measurements of a random vector. However, in DR, one attempts to project a random vector, either linearly or non-linearly, to a lower dimensional space that preserves the information contained in the original higher dimensional space. Of course, these two ideas are quite closely linked since, for example, doing DR to a dimension smaller than suggested by DE will likely lead to information loss. Accordingly, in this paper we will focus on a particular class of deep neural networks called autoencoders which are used extensively for DR but are less well studied for DE. We show that several important questions arise when using autoencoders for DE, above and beyond those that arise for more classic DR/DE techniques such as Principal Component Analysis. We address autoencoder architectural choices and regularization techniques that allow one to transform autoencoder latent layer representations into estimates of intrinsic dimension.
Tasks Dimensionality Reduction
Published 2019-09-24
URL https://arxiv.org/abs/1909.10702v1
PDF https://arxiv.org/pdf/1909.10702v1.pdf
PWC https://paperswithcode.com/paper/dimension-estimation-using-autoencoders
Repo
Framework
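
One concrete recipe in this spirit, sketched below under stated assumptions: train an autoencoder with an over-wide, sparsity-regularized latent layer and count the latent coordinates that retain non-negligible variance. The data, architecture, penalty weight, and threshold are all illustrative choices, not the paper's.

```python
# Hedged sketch: estimate intrinsic dimension by training an autoencoder with
# an over-wide, L1-regularized latent layer and counting the latent
# coordinates that keep non-negligible variance. Data, architecture, penalty
# weight, and threshold are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
Z = torch.randn(2000, 3)                   # 3 latent degrees of freedom ...
A = torch.randn(3, 10)
X = Z @ A + 0.01 * torch.randn(2000, 10)   # ... embedded in 10 ambient dimensions

enc = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 8))
dec = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 10))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)

for step in range(2000):
    h = enc(X)
    loss = ((dec(h) - X) ** 2).mean() + 1e-2 * h.abs().mean()   # reconstruction + sparsity
    opt.zero_grad(); loss.backward(); opt.step()

latent_std = enc(X).std(dim=0)
estimate = int((latent_std > 0.1 * latent_std.max()).sum())
print("estimated intrinsic dimension:", estimate)               # should be close to 3
```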

Recursive Style Breach Detection with Multifaceted Ensemble Learning

Title Recursive Style Breach Detection with Multifaceted Ensemble Learning
Authors Daniel Kopev, Dimitrina Zlatkova, Kristiyan Mitov, Atanas Atanasov, Momchil Hardalov, Ivan Koychev, Preslav Nakov
Abstract We present a supervised approach for style change detection, which aims at predicting whether there are changes in the style in a given text document, as well as at finding the exact positions where such changes occur. In particular, we combine a TF.IDF representation of the document with features specifically engineered for the task, and we make predictions via an ensemble of diverse classifiers including SVM, Random Forest, AdaBoost, MLP, and LightGBM. Whenever the model detects that style change is present, we apply it recursively, looking to find the specific positions of the change. Our approach powered the winning system for the PAN@CLEF 2018 task on Style Change Detection.
Tasks
Published 2019-06-17
URL https://arxiv.org/abs/1906.06917v1
PDF https://arxiv.org/pdf/1906.06917v1.pdf
PWC https://paperswithcode.com/paper/recursive-style-breach-detection-with
Repo
Framework
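
The recursive localization step can be sketched independently of the feature engineering: once the classifier flags a style change in a span, split the span and re-apply the classifier to narrow down the position. `has_style_change` below is a hypothetical stand-in for the paper's TF.IDF-plus-engineered-features ensemble.

```python
# Hedged sketch of the recursive localization step. `has_style_change` is a
# hypothetical placeholder for the trained style-change classifier.
from typing import Callable, List

def locate_breaches(paragraphs: List[str],
                    has_style_change: Callable[[List[str]], bool],
                    offset: int = 0) -> List[int]:
    """Return paragraph-boundary indices where style changes are detected."""
    if len(paragraphs) < 2 or not has_style_change(paragraphs):
        return []
    if len(paragraphs) == 2:
        return [offset + 1]                       # change sits between the two halves
    mid = len(paragraphs) // 2
    left = locate_breaches(paragraphs[:mid], has_style_change, offset)
    right = locate_breaches(paragraphs[mid:], has_style_change, offset + mid)
    # If neither half triggers on its own, the change straddles the split point.
    return (left + right) or [offset + mid]

# Toy usage with a dummy detector that flags spans mixing two "authors".
demo = ["aa"] * 4 + ["bb"] * 3
dummy = lambda span: len(set(span)) > 1
print(locate_breaches(demo, dummy))               # [4]
```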

Data Consistent Artifact Reduction for Limited Angle Tomography with Deep Learning Prior

Title Data Consistent Artifact Reduction for Limited Angle Tomography with Deep Learning Prior
Authors Yixing Huang, Alexander Preuhs, Guenter Lauritsch, Michael Manhart, Xiaolin Huang, Andreas Maier
Abstract The robustness of deep learning methods for limited angle tomography is challenged by two major factors: a) due to insufficient training data the network may not generalize well to unseen data; b) deep learning methods are sensitive to noise. Thus, generating reconstructed images directly from a neural network appears inadequate. We propose to constrain the reconstructed images to be consistent with the measured projection data, while the unmeasured information is complemented by learning-based methods. For this purpose, a data consistent artifact reduction (DCAR) method is introduced: First, a prior image is generated from an initial limited angle reconstruction via deep learning as a substitute for missing information. Afterwards, a conventional iterative reconstruction algorithm is applied, integrating the data consistency in the measured angular range and the prior information in the missing angular range. This ensures data integrity in the measured area, while inaccuracies introduced by the deep learning prior lie only in areas where no information is acquired. The proposed DCAR method achieves significant image quality improvement: for 120-degree cone-beam limited angle tomography, more than a 10% RMSE reduction in the noise-free case and more than a 24% RMSE reduction in the noisy case compared with a state-of-the-art U-Net based method.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.06792v2
PDF https://arxiv.org/pdf/1908.06792v2.pdf
PWC https://paperswithcode.com/paper/data-consistent-artifact-reduction-for
Repo
Framework
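
The data-consistency idea can be illustrated with a toy linear forward model: start from the deep-learning prior image and iterate only on the measured-projection fidelity term, so the measured data overrides the prior where it exists while the prior continues to fill the unmeasured (null-space) directions. The random forward operator below is an assumption, not a cone-beam projector, and the loop is a plain Landweber iteration rather than the paper's reconstruction algorithm.

```python
# Hedged sketch of the data-consistency step with a toy linear operator.
import numpy as np

rng = np.random.default_rng(0)
n = 64
x_true = rng.standard_normal(n)                 # ground-truth "image"
A_meas = rng.standard_normal((40, n))           # measured angular range only (toy operator)
y_meas = A_meas @ x_true                        # acquired projections
x_prior = x_true + 0.3 * rng.standard_normal(n) # imperfect deep-learning prior

x = x_prior.copy()
step = 1.0 / np.linalg.norm(A_meas, 2) ** 2     # safe gradient step size
for _ in range(500):
    # Landweber / gradient step on the measured-data fidelity term only.
    x -= step * A_meas.T @ (A_meas @ x - y_meas)

print("prior residual on measured data:", np.linalg.norm(A_meas @ x_prior - y_meas).round(2))
print("final residual on measured data:", np.linalg.norm(A_meas @ x - y_meas).round(4))
```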

Actions Generation from Captions

Title Actions Generation from Captions
Authors Xuan Liang, Yida Xu
Abstract Sequence transduction models have been widely explored in many natural language processing tasks. However, the target sequence usually consists of discrete tokens which represent word indices in a given vocabulary. We rarely see the case where the target sequence is composed of continuous vectors, each of which is an element of a time series taken successively in a temporal domain. In this work, we introduce a new data set, named the Action Generation Data Set (AGDS), which is specifically designed for the task of caption-to-action generation. This data set contains caption-action pairs: the caption is a sequence of words describing the interactive movement between two people, and the action is a captured sequence of poses representing the movement. The data set is introduced to study the ability of sequence transduction models to generate continuous sequences. We also propose a model that combines Multi-Head Attention (MHA) and a Generative Adversarial Network (GAN). Our model has one generator, which generates actions from captions, and three discriminators, each designed to carry out a unique functionality: a caption-action consistency discriminator, a pose discriminator and a pose transition discriminator. This design allows us to achieve plausible generation performance, as demonstrated in the experiments.
Tasks Time Series
Published 2019-02-14
URL http://arxiv.org/abs/1902.11109v1
PDF http://arxiv.org/pdf/1902.11109v1.pdf
PWC https://paperswithcode.com/paper/actions-generation-from-captions
Repo
Framework
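
A schematic of the described layout, one caption-to-pose generator plus three discriminators, is sketched below. Shapes, layer sizes, and the use of learned frame queries are assumptions for illustration, not the authors' configuration, and the adversarial training loop is omitted.

```python
# Hedged sketch of the described layout: one caption-to-pose-sequence
# generator plus three discriminators. Shapes, sizes, and the learned
# frame queries are illustrative assumptions.
import torch
import torch.nn as nn

class CaptionToActionGenerator(nn.Module):
    def __init__(self, vocab=5000, d_model=128, pose_dim=50, max_len=60):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.frame_queries = nn.Parameter(torch.randn(max_len, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.to_pose = nn.Linear(d_model, pose_dim)

    def forward(self, caption_ids, action_len):
        ctx = self.embed(caption_ids)                              # (B, T_text, d)
        q = self.frame_queries[:action_len].expand(ctx.size(0), -1, -1)
        out, _ = self.attn(q, ctx, ctx)                            # frames attend to the caption
        return self.to_pose(out)                                   # (B, T_action, pose_dim)

# Three discriminators, each scoring a different aspect of the generated action.
pose_disc = nn.Sequential(nn.Linear(50, 64), nn.ReLU(), nn.Linear(64, 1))
transition_disc = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 1))        # pose pairs
consistency_disc = nn.Sequential(nn.Linear(50 + 128, 64), nn.ReLU(), nn.Linear(64, 1))  # pose + caption context

gen = CaptionToActionGenerator()
poses = gen(torch.randint(0, 5000, (2, 12)), action_len=30)        # 2 captions of 12 tokens
print(poses.shape, pose_disc(poses).shape)                         # [2, 30, 50] and [2, 30, 1]
```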

GAN Path Finder: Preliminary results

Title GAN Path Finder: Preliminary results
Authors Natalia Soboleva, Konstantin Yakovlev
Abstract 2D path planning in a static environment is a well-known problem, and one of the common ways to solve it is to 1) represent the environment as a grid and 2) perform a heuristic search for a path on it. At the same time, a 2D grid closely resembles a digital image, which suggests an appealing idea: treat the problem as an image generation task and solve it using recent advances in deep learning. In this work we make an attempt to apply a generative neural network as a path finder and report preliminary results, convincing enough to claim that this direction of research is worth further exploration.
Tasks Image Generation
Published 2019-08-05
URL https://arxiv.org/abs/1908.01499v1
PDF https://arxiv.org/pdf/1908.01499v1.pdf
PWC https://paperswithcode.com/paper/gan-path-finder-preliminary-results
Repo
Framework
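
The problem encoding is the interesting part: stack the occupancy grid and the start/goal markers as image channels and let a convolutional generator output a per-cell path probability map. The tiny network below is an illustrative stand-in for the paper's GAN generator.

```python
# Hedged sketch of the encoding: grid + start + goal as channels in, a
# per-cell path probability map out. The architecture is an illustrative
# assumption, not the authors' generator.
import torch
import torch.nn as nn

generator = nn.Sequential(                      # 3-channel grid "image" in, 1-channel path map out
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 1), nn.Sigmoid(),
)

grid = torch.zeros(1, 3, 32, 32)
grid[0, 0, 10:20, 15] = 1.0                     # channel 0: obstacles
grid[0, 1, 2, 2] = 1.0                          # channel 1: start cell
grid[0, 2, 29, 29] = 1.0                        # channel 2: goal cell
path_map = generator(grid)                      # probability of each cell lying on the path
print(path_map.shape)                           # torch.Size([1, 1, 32, 32])
```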

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation

Title Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation
Authors Hao Tang, Dan Xu, Gaowen Liu, Wei Wang, Nicu Sebe, Yan Yan
Abstract In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C$^2$GAN) for the task of keypoint-guided image generation. The proposed C$^2$GAN is a cross-modal framework exploring the joint exploitation of keypoint and image data in an interactive manner. C$^2$GAN contains two different types of generators, i.e., a keypoint-oriented generator and an image-oriented generator. The two are mutually connected in an end-to-end learnable fashion and explicitly form three cycled sub-networks, i.e., one image generation cycle and two keypoint generation cycles. Each cycle not only aims at reconstructing the input domain but also produces useful output involved in the generation of another cycle. In this way, the cycles constrain each other implicitly, providing complementary information from the two modalities and bringing extra supervision across cycles, thus facilitating more robust optimization of the whole network. Extensive experimental results on two publicly available datasets, i.e., Radboud Faces and Market-1501, demonstrate that our approach is effective in generating more photo-realistic images than state-of-the-art models.
Tasks Image Generation
Published 2019-08-02
URL https://arxiv.org/abs/1908.00999v2
PDF https://arxiv.org/pdf/1908.00999v2.pdf
PWC https://paperswithcode.com/paper/cycle-in-cycle-generative-adversarial
Repo
Framework
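
The cycle structure can be sketched with two tiny stand-in generators: an image-oriented generator conditioned on keypoints and a keypoint-oriented generator, tied together by reconstruction (cycle) losses. Shapes, networks, and losses below are illustrative assumptions; the discriminators and training loop are omitted.

```python
# Hedged sketch of the cycle structure with tiny stand-in generators; shapes
# and losses are illustrative assumptions.
import torch
import torch.nn as nn

img_gen = nn.Sequential(nn.Conv2d(3 + 1, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 3, 3, padding=1))          # (image, keypoint map) -> image
kp_gen = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(16, 1, 3, padding=1))           # image -> keypoint heat map

img_a = torch.rand(2, 3, 64, 64)            # source image
kp_b = torch.rand(2, 1, 64, 64)             # target keypoint heat map
kp_a = kp_gen(img_a)                        # keypoints of the source image

img_b = img_gen(torch.cat([img_a, kp_b], dim=1))       # generate the target-pose image
img_a_rec = img_gen(torch.cat([img_b, kp_a], dim=1))   # image cycle: map it back
kp_b_rec = kp_gen(img_b)                               # keypoint cycle

cycle_loss = (img_a_rec - img_a).abs().mean() + (kp_b_rec - kp_b).abs().mean()
print(float(cycle_loss))
```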

A nonasymptotic law of iterated logarithm for general M-estimators

Title A nonasymptotic law of iterated logarithm for general M-estimators
Authors Victor-Emmanuel Brunel, Arnak S. Dalalyan, Nicolas Schreuder
Abstract M-estimators are ubiquitous in machine learning and statistical learning theory. They are used both for defining prediction strategies and for evaluating their precision. In this paper, we propose the first non-asymptotic “any-time” deviation bounds for general M-estimators, where “any-time” means that the bound holds with a prescribed probability for every sample size. These bounds are nonasymptotic versions of the law of iterated logarithm. They are established under general assumptions such as Lipschitz continuity of the loss function and (local) curvature of the population risk. These conditions are satisfied for most examples used in machine learning, including those ensuring robustness to outliers and to heavy tailed distributions. As an example of application, we consider the problem of best arm identification in a parametric stochastic multi-arm bandit setting. We show that the established bound can be converted into a new algorithm, with provably optimal theoretical guarantees. Numerical experiments illustrating the validity of the algorithm are reported.
Tasks
Published 2019-03-15
URL https://arxiv.org/abs/1903.06576v2
PDF https://arxiv.org/pdf/1903.06576v2.pdf
PWC https://paperswithcode.com/paper/a-nonasymptotic-law-of-iterated-logarithm-for
Repo
Framework
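
To make the best-arm-identification application concrete, the sketch below plugs a simplified any-time, LIL-style confidence radius into successive elimination. The constants in the radius are chosen for illustration and are not the bound derived in the paper.

```python
# Hedged sketch: a law-of-iterated-logarithm style "any-time" confidence
# radius used inside successive elimination. The radius constants are
# simplified assumptions, not the paper's bound.
import math
import numpy as np

def lil_radius(n_pulls, delta=0.05):
    """Any-time confidence radius ~ sqrt(log(log(n)/delta)/n) for sub-Gaussian means."""
    n = max(n_pulls, 2)
    return math.sqrt(2 * math.log(math.log(math.e * n) / delta) / n)

rng = np.random.default_rng(0)
means = np.array([0.3, 0.5, 0.55, 0.9])            # unknown arm means
active = list(range(len(means)))
sums = np.zeros(len(means)); pulls = np.zeros(len(means), dtype=int)

while len(active) > 1:
    for a in active:                               # pull every surviving arm once
        sums[a] += rng.normal(means[a], 1.0); pulls[a] += 1
    est = sums[active] / pulls[active]
    rad = np.array([lil_radius(pulls[a]) for a in active])
    best_lower = (est - rad).max()
    # Eliminate arms whose upper confidence bound falls below the best lower bound.
    active = [a for a, e, r in zip(active, est, rad) if e + r >= best_lower]

print("identified best arm:", active[0], "after", pulls.sum(), "total pulls")
```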

Fractals2019: Combinatorial Optimisation with Dynamic Constraint Annealing

Title Fractals2019: Combinatorial Optimisation with Dynamic Constraint Annealing
Authors Mikhail Prokopenko, Peter Wang
Abstract Fractals2019 started as a new experimental entry in the RoboCup Soccer 2D Simulation League, based on Gliders2d code base, and advanced to become a RoboCup-2019 champion. We employ combinatorial optimisation methods, within the framework of Guided Self-Organisation, with the search guided by local constraints. We present examples of several tactical tasks based on the Gliders2d code (version v2), including the search for an optimal assignment of heterogeneous player types, as well as blocking behaviours, offside trap, and attacking formations. We propose a new method, Dynamic Constraint Annealing, for solving dynamic constraint satisfaction problems, and apply it to optimise thermodynamic potential of collective behaviours, under dynamically induced constraints.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.01788v2
PDF https://arxiv.org/pdf/1909.01788v2.pdf
PWC https://paperswithcode.com/paper/fractals2019-combinatorial-optimisation-with
Repo
Framework
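
At its core this is annealing over a discrete assignment (for example, heterogeneous player types to roles) against a set of constraints that can change during play. The sketch below shows a plain simulated-annealing pass over toy constraints; the cost function and constraint set are assumptions, not the Gliders2d/Fractals2019 implementation.

```python
# Hedged sketch: simulated annealing over a discrete role assignment, with a
# cost given by the number of violated (currently active) constraints. The
# constraints and the cooling schedule are toy assumptions.
import math
import random

random.seed(0)

def violations(assignment, constraints):
    """Count how many active constraints the assignment breaks."""
    return sum(1 for c in constraints if not c(assignment))

def anneal(state, constraints, steps=5000, t0=2.0):
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-3                     # linear cooling
        cand = list(state)
        cand[random.randrange(len(cand))] = random.randrange(4)   # mutate one role (4 player types)
        delta = violations(cand, constraints) - violations(state, constraints)
        if delta <= 0 or random.random() < math.exp(-delta / temp):
            state = cand
    return state

# The active constraint set can be swapped as the game situation changes,
# after which the assignment is re-annealed.
constraints = [lambda a: a[0] != a[1],        # roles 0 and 1 need different player types
               lambda a: a.count(3) <= 2]     # at most two players of type 3
assignment = anneal([0] * 5, constraints)
print(assignment, "violations:", violations(assignment, constraints))
```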

Training CNNs faster with Dynamic Input and Kernel Downsampling

Title Training CNNs faster with Dynamic Input and Kernel Downsampling
Authors Zissis Poulos, Ali Nouri, Andreas Moshovos
Abstract We reduce training time in convolutional networks (CNNs) with a method that, for some of the mini-batches: a) scales down the resolution of input images via downsampling, and b) reduces the forward pass operations via pooling on the convolution filters. Training is performed in an interleaved fashion; some batches undergo the regular forward and backpropagation passes with the original network parameters, whereas others undergo a forward pass with pooled filters and downsampled inputs. Since pooling is differentiable, the gradients of the pooled filters propagate to the original network parameters for a standard parameter update. The latter phase requires fewer floating point operations and less storage due to the reduced spatial dimensions in feature maps and filters. The key idea is that this phase leads to smaller and approximate updates and thus slower learning, but at significantly reduced cost, followed by passes that use the original network parameters as a refinement stage. Deciding how often and for which batches the downsampling occurs can be done either stochastically or deterministically, and can itself be treated as a training hyperparameter. Experiments on residual architectures show that we can achieve up to a 23% reduction in training time with minimal loss in validation accuracy.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06548v1
PDF https://arxiv.org/pdf/1910.06548v1.pdf
PWC https://paperswithcode.com/paper/training-cnns-faster-with-dynamic-input-and
Repo
Framework
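
The mechanism is easy to show in a few lines: for the cheap batches, average-pool both the input images and the convolution kernels, and let autograd carry the gradients back through the pooling to the full-resolution weights. The tiny network and the two-out-of-three schedule below are assumptions for illustration.

```python
# Hedged sketch of the interleaved scheme: some mini-batches run a cheap
# forward pass with average-pooled kernels and downsampled inputs; because
# pooling is differentiable, gradients still update the original
# full-resolution weights. Network size and schedule are illustrative.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
W = torch.randn(16, 3, 4, 4, requires_grad=True)     # full-resolution conv kernels
fc = torch.nn.Linear(16, 10)
opt = torch.optim.SGD([W] + list(fc.parameters()), lr=0.01)

def forward(x, cheap):
    w = F.avg_pool2d(W, 2) if cheap else W           # pooled 2x2 kernels for cheap batches
    if cheap:
        x = F.avg_pool2d(x, 2)                       # downsampled inputs
    h = F.relu(F.conv2d(x, w, stride=2))
    return fc(h.mean(dim=(2, 3)))                    # global average pool + classifier

for step in range(20):
    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    cheap = step % 3 != 0                            # e.g. two cheap batches per full batch
    loss = F.cross_entropy(forward(x, cheap), y)
    opt.zero_grad(); loss.backward(); opt.step()     # gradients reach W in both modes

print("trained; kernel grad norm:", W.grad.norm().item())
```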

Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning

Title Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning
Authors Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu, Song-Chun Zhu
Abstract Learning transferable knowledge across similar but different settings is a fundamental component of generalized intelligence. In this paper, we approach the transfer learning challenge from a causal theory perspective. Our agent is endowed with two basic yet general theories for transfer learning: (i) a task shares a common abstract structure that is invariant across domains, and (ii) the behavior of specific features of the environment remains constant across domains. We adopt a Bayesian perspective of causal theory induction and use these theories to transfer knowledge between environments. Given these general theories, the goal is to train an agent by interactively exploring the problem space to (i) discover, form, and transfer useful abstract and structural knowledge, and (ii) induce useful knowledge from the instance-level attributes observed in the environment. A hierarchy of Bayesian structures is used to model abstract-level structural causal knowledge, and an instance-level associative learning scheme learns which specific objects can be used to induce state changes through interaction. This model-learning scheme is then integrated with a model-based planner to achieve a task in the OpenLock environment, a virtual “escape room” with a complex hierarchy that requires agents to reason about an abstract, generalized causal structure. We compare performance against a set of predominant model-free reinforcement learning (RL) algorithms. The RL agents showed a poor ability to transfer learned knowledge across different trials, whereas the proposed model revealed performance trends similar to those of human learners and, more importantly, demonstrated transfer behavior across trials and learning situations.
Tasks Transfer Learning
Published 2019-11-25
URL https://arxiv.org/abs/1911.11185v1
PDF https://arxiv.org/pdf/1911.11185v1.pdf
PWC https://paperswithcode.com/paper/theory-based-causal-transfer-integrating
Repo
Framework
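
At its most schematic, the abstract level can be read as a posterior over a small set of causal schemas that is updated from trial outcomes and then carried into a new environment as the prior, while instance-level associations are relearned. The schemas and likelihood table below are toy assumptions, not the paper's hierarchical model.

```python
# Hedged, schematic sketch of abstract-level transfer: keep a posterior over
# causal schemas, update it from trial outcomes, and reuse it as the prior in
# a new environment. Schemas and likelihoods are toy assumptions.
import numpy as np

schemas = ["A->B->C", "A->C<-B"]                  # abstract causal structures
prior = np.array([0.5, 0.5])

def likelihood(outcome, schema):
    """Hypothetical probability of a trial outcome under each schema."""
    table = {"A->B->C": {"unlock": 0.8, "fail": 0.2},
             "A->C<-B": {"unlock": 0.3, "fail": 0.7}}
    return table[schema][outcome]

def update(posterior, outcome):
    post = posterior * np.array([likelihood(outcome, s) for s in schemas])
    return post / post.sum()

posterior = prior
for outcome in ["unlock", "unlock", "fail", "unlock"]:   # trials in environment 1
    posterior = update(posterior, outcome)
print("posterior after environment 1:", posterior.round(3))

# Transfer: the abstract-level posterior becomes the prior in environment 2,
# while instance-level associations (which object does what) are relearned.
prior_env2 = posterior
print("prior carried into environment 2:", prior_env2.round(3))
```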

Density-based Clustering with Best-scored Random Forest

Title Density-based Clustering with Best-scored Random Forest
Authors Hanyuan Hang, Yuchao Cai, Hanfang Yang
Abstract The single-level density-based approach has long been widely acknowledged as a conceptually and mathematically convincing clustering method. In this paper, we propose an algorithm called “best-scored clustering forest” that can obtain the optimal level and determine the corresponding clusters. The term “best-scored” refers to selecting the random tree with the best empirical performance out of a certain number of purely random tree candidates. From the theoretical perspective, we first show that the consistency of the proposed algorithm can be guaranteed. Moreover, under certain mild restrictions on the underlying density functions and target clusters, even fast convergence rates can be achieved. Last but not least, comparisons with other state-of-the-art clustering methods in numerical experiments demonstrate the accuracy of our algorithm on both synthetic data and several benchmark real data sets.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.10094v1
PDF https://arxiv.org/pdf/1906.10094v1.pdf
PWC https://paperswithcode.com/paper/density-based-clustering-with-best-scored
Repo
Framework
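
The “best-scored” selection step can be sketched on its own: grow several purely random partition trees as histogram density estimators and keep the one with the best empirical score, here a held-out log-likelihood. The one-dimensional data and the scoring rule are illustrative assumptions, and the clustering step built on top of the selected tree is omitted.

```python
# Hedged sketch of "best-scored" selection: several purely random partition
# trees as density estimators, keep the one with the best held-out score.
# 1-D data and the scoring rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(2, 0.5, 500)])
train, held_out = data[:800], data[800:]

def random_tree_edges(low, high, depth):
    """Recursively split [low, high] at uniformly random points."""
    if depth == 0:
        return [low, high]
    cut = rng.uniform(low, high)
    left = random_tree_edges(low, cut, depth - 1)
    right = random_tree_edges(cut, high, depth - 1)
    return left[:-1] + right                           # merge, dropping the duplicate cut

def log_likelihood(edges, fit, evalpts):
    counts, _ = np.histogram(fit, bins=edges)
    widths = np.diff(edges)
    dens = (counts + 1e-9) / (counts.sum() * widths)   # piecewise-constant density estimate
    idx = np.clip(np.searchsorted(edges, evalpts) - 1, 0, len(dens) - 1)
    return np.log(dens[idx]).mean()

candidates = [np.array(random_tree_edges(data.min() - 1, data.max() + 1, 5))
              for _ in range(20)]
best = max(candidates, key=lambda e: log_likelihood(e, train, held_out))
print("best-scored tree has", len(best) - 1, "leaf cells")
```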