Paper Group AWR 405
Web Links Prediction And Category-Wise Recommendation Based On Browser History
Title | Web Links Prediction And Category-Wise Recommendation Based On Browser History |
Authors | Ashadullah Shawon, Syed Tauhid Zuhori, Firoz Mahmud, Md. Jamil-Ur Rahman |
Abstract | A web browser should not only be for browsing web pages but should also help users find their target websites and recommend similar websites based on their behavior. In this paper, we propose two methods to make a web browser more intelligent: link prediction, which works while the user types in the address bar, and recommendation of websites across several categories. Our proposed link prediction system is in fact frecency prediction, computed from the first visit, the last visit, and the URL visit count. The recommendation system is the more challenging part, since web URLs must be classified by name alone, without visiting the web pages, so we build on an existing model for URL classification. That existing approach alone gives unsatisfactory results and low accuracy, so we add hyperparameter optimization, which finds the best parameters for the existing URL classification model and yields better accuracy. We also propose a category-wise recommendation system that uses the frecency value and the total number of visits for each URL category. |
Tasks | Hyperparameter Optimization, Link Prediction |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.08496v1 |
PDF | http://arxiv.org/pdf/1902.08496v1.pdf |
PWC | https://paperswithcode.com/paper/web-links-prediction-and-category-wise |
Repo | https://github.com/shawon100/Web-Link-Prediction-System |
Framework | none |
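As a loose illustration of the frecency idea described in the abstract above (a score combining first visit, last visit, and visit count), the following Python sketch ranks candidate URLs for address-bar completion. The weighting scheme and the field layout are illustrative assumptions, not the authors' exact formula.

```python
import time

def frecency_score(first_visit_ts, last_visit_ts, visit_count, now=None):
    """Toy frecency score: visit count weighted by recency of the last visit
    and the age of the first visit. Weights are illustrative, not the paper's."""
    now = now or time.time()
    days_since_last = max((now - last_visit_ts) / 86400.0, 1e-6)
    days_since_first = max((now - first_visit_ts) / 86400.0, 1e-6)
    recency_boost = 1.0 / days_since_last            # recent visits count more
    history_weight = visit_count / days_since_first  # sustained interest over time
    return visit_count * recency_boost + history_weight

# Rank candidate URLs for address-bar completion by descending score.
history = {
    "https://example.org": (1_550_000_000, 1_560_000_000, 42),
    "https://example.com": (1_559_000_000, 1_560_900_000, 5),
}
ranked = sorted(history, key=lambda u: frecency_score(*history[u]), reverse=True)
print(ranked)
```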
Passive nonlinear dendritic interactions as a general computational resource in functional spiking neural networks
Title | Passive nonlinear dendritic interactions as a general computational resource in functional spiking neural networks |
Authors | Andreas Stöckel, Chris Eliasmith |
Abstract | Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks tend to assume linear, current-based superposition of post-synaptic currents. We extend the theory underlying the Neural Engineering Framework to systematically exploit nonlinear interactions between the local membrane potential and conductance-based synaptic channels as a computational resource. In particular, we demonstrate that even a single passive distal dendritic compartment with AMPA and GABA-A synapses connected to a leaky integrate-and-fire neuron supports the computation of a wide variety of multivariate, bandlimited functions, including the Euclidean norm, controlled shunting, and non-negative multiplication. Our results demonstrate that, for certain operations, the accuracy of dendritic computation is on a par with or even surpasses the accuracy of an additional layer of neurons in the network. These findings allow modelers to construct large-scale models of neurobiological systems that more closely approximate the network topologies and computational resources available in biology. Our results may inform neuromorphic hardware design and could lead to a better utilization of resources on existing neuromorphic hardware platforms. |
Tasks | |
Published | 2019-04-26 |
URL | http://arxiv.org/abs/1904.11713v1 |
PDF | http://arxiv.org/pdf/1904.11713v1.pdf |
PWC | https://paperswithcode.com/paper/passive-nonlinear-dendritic-interactions-as-a |
Repo | https://github.com/astoeckel/nengo-bio |
Framework | none |
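The nonlinearity the paper exploits can be illustrated with a minimal steady-state sketch: with conductance-based AMPA and GABA-A synapses on a passive dendritic compartment coupled to the soma, the current delivered to the soma is a divisive (hence nonlinear) function of the two synaptic conductances, unlike current-based superposition. All constants below are illustrative and are not taken from the paper or the nengo-bio repo.

```python
import numpy as np

def passive_compartment_current(g_E, g_I, E_E=0.0, E_I=-75e-3, E_L=-65e-3,
                                g_L=50e-9, g_C=100e-9):
    """Steady-state current flowing from a passive dendritic compartment into
    the soma (assumed clamped near E_L), as a function of excitatory and
    inhibitory conductances. The coupling conductance g_C makes the result a
    divisive, hence nonlinear, function of (g_E, g_I). Values are illustrative."""
    g_total = g_L + g_E + g_I + g_C
    v_dendrite = (g_L * E_L + g_E * E_E + g_I * E_I + g_C * E_L) / g_total
    return g_C * (v_dendrite - E_L)

g_E = np.linspace(0, 200e-9, 5)
g_I = np.linspace(0, 200e-9, 5)
GE, GI = np.meshgrid(g_E, g_I)
print(passive_compartment_current(GE, GI))  # not additive in g_E and g_I
```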
Single Headed Attention RNN: Stop Thinking With Your Head
Title | Single Headed Attention RNN: Stop Thinking With Your Head |
Authors | Stephen Merity |
Abstract | The leading approaches in language modeling are all obsessed with TV shows of my youth - namely Transformers and Sesame Street. Transformers this, Transformers that, and over here a bonfire worth of GPU-TPU-neuromorphic wafer scale silicon. We opt for the lazy path of old and proven techniques with a fancy crypto inspired acronym: the Single Headed Attention RNN (SHA-RNN). The author’s lone goal is to show that the entire field might have evolved a different direction if we had instead been obsessed with a slightly different acronym and slightly different result. We take a previously strong language model based only on boring LSTMs and get it to within a stone’s throw of a stone’s throw of state-of-the-art byte level language model results on enwik8. This work has undergone no intensive hyperparameter optimization and lived entirely on a commodity desktop machine that made the author’s small studio apartment far too warm in the midst of a San Franciscan summer. The final results are achievable in plus or minus 24 hours on a single GPU as the author is impatient. The attention mechanism is also readily extended to large contexts with minimal computation. Take that Sesame Street. |
Tasks | Hyperparameter Optimization, Language Modelling |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11423v2 |
PDF | https://arxiv.org/pdf/1911.11423v2.pdf |
PWC | https://paperswithcode.com/paper/single-headed-attention-rnn-stop-thinking |
Repo | https://github.com/alisafaya/SHA-RNN |
Framework | none |
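A minimal PyTorch sketch of the central idea, a single causal attention head layered on top of an LSTM language model, written from the abstract rather than from the released SHA-RNN code; the dimensions and the omission of Merity's boom layer and layer normalization are simplifications.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleHeadAttentionRNN(nn.Module):
    """LSTM language model with one causal, single-head attention layer."""
    def __init__(self, vocab_size, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                     # tokens: (batch, seq)
        h, _ = self.lstm(self.embed(tokens))       # (batch, seq, d_model)
        q, k, v = self.q(h), self.k(h), self.v(h)
        scores = q @ k.transpose(1, 2) / h.size(-1) ** 0.5
        causal = torch.triu(torch.full(scores.shape[-2:], float("-inf")), 1)
        attn = F.softmax(scores + causal, dim=-1)  # mask out future positions
        return self.out(h + attn @ v)              # residual + output logits

logits = SingleHeadAttentionRNN(vocab_size=256)(torch.randint(0, 256, (2, 16)))
print(logits.shape)  # torch.Size([2, 16, 256])
```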
Adversarial Invariant Feature Learning with Accuracy Constraint for Domain Generalization
Title | Adversarial Invariant Feature Learning with Accuracy Constraint for Domain Generalization |
Authors | Kei Akuzawa, Yusuke Iwasawa, Yutaka Matsuo |
Abstract | Learning a domain-invariant representation is a dominant approach for domain generalization (DG), where we need to build a classifier that is robust to domain shifts. However, previous domain-invariance-based methods overlooked the underlying dependency of classes on domains, which is responsible for the trade-off between classification accuracy and domain invariance. Because the primary purpose of DG is to classify unseen domains rather than the invariance itself, improving the invariance can negatively affect DG performance under this trade-off. To overcome the problem, this study first expands the trade-off analysis of Xie et al. and introduces the notion of accuracy-constrained domain invariance, meaning the maximum domain invariance within the range that does not interfere with accuracy. We then propose a novel method, adversarial feature learning with accuracy constraint (AFLAC), which explicitly attains this invariance through adversarial training. Empirical validations show that the performance of AFLAC is superior to that of domain-invariance-based methods on a synthetic dataset and three real-world datasets, supporting the importance of considering the class-domain dependency and the efficacy of the proposed method. |
Tasks | Domain Generalization |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1904.12543v3 |
PDF | https://arxiv.org/pdf/1904.12543v3.pdf |
PWC | https://paperswithcode.com/paper/adversarial-invariant-feature-learning-with |
Repo | https://github.com/akuzeee/AFLAC |
Framework | pytorch |
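A hedged PyTorch sketch of the training signal described in the abstract: rather than pushing the domain discriminator toward a uniform distribution (plain domain-adversarial learning), the encoder is regularized toward the class-conditional domain prior p(d|y), i.e., accuracy-constrained invariance. The network shapes, the optimizer-free presentation, and the way p(d|y) is supplied are illustrative; the linked repo contains the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Linear(20, 64), nn.ReLU())   # feature extractor
clf = nn.Linear(64, 3)                               # label classifier (3 classes)
dom = nn.Linear(64, 4)                               # domain discriminator (4 domains)

def aflac_losses(x, y, d, p_d_given_y):
    """p_d_given_y: (num_classes, num_domains) empirical prior from training data."""
    z = enc(x)
    loss_task = F.cross_entropy(clf(z), y)
    loss_dom = F.cross_entropy(dom(z.detach()), d)   # train the discriminator
    # Encoder is regularized toward p(d|y) rather than toward a uniform distribution:
    log_q = F.log_softmax(dom(z), dim=1)
    target = p_d_given_y[y]                          # (batch, num_domains)
    loss_inv = F.kl_div(log_q, target, reduction="batchmean")
    return loss_task, loss_dom, loss_inv

x = torch.randn(8, 20); y = torch.randint(0, 3, (8,)); d = torch.randint(0, 4, (8,))
prior = torch.full((3, 4), 0.25)                     # toy, class-independent p(d|y)
print([l.item() for l in aflac_losses(x, y, d, prior)])
```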
Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention
Title | Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention |
Authors | Vivien Sainte Fare Garnot, Loic Landrieu, Sebastien Giordano, Nesrine Chehata |
Abstract | Satellite image time series, bolstered by their growing availability, are at the forefront of an extensive effort towards automated Earth monitoring by international institutions. In particular, large-scale control of agricultural parcels is an issue of major political and economic importance. In this regard, hybrid convolutional-recurrent neural architectures have shown promising results for the automated classification of satellite image time series. We propose an alternative approach in which the convolutional layers are advantageously replaced with encoders operating on unordered sets of pixels to exploit the typically coarse resolution of publicly available satellite images. We also propose to extract temporal features using a bespoke neural architecture based on self-attention instead of recurrent networks. We demonstrate experimentally that our method not only outperforms previous state-of-the-art approaches in terms of precision, but also significantly decreases processing time and memory requirements. Lastly, we release a large open-access annotated dataset as a benchmark for future work on satellite image time series. |
Tasks | Time Series, Time Series Classification |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07757v1 |
PDF | https://arxiv.org/pdf/1911.07757v1.pdf |
PWC | https://paperswithcode.com/paper/satellite-image-time-series-classification |
Repo | https://github.com/VSainteuf/pytorch-psetae |
Framework | pytorch |
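A small PyTorch sketch of the pixel-set encoder idea from the abstract: instead of convolutions, a random subset of a parcel's pixels is embedded by a shared MLP and pooled into an order-invariant descriptor for each acquisition date. The sizes and the mean/std pooling choice are assumptions loosely following the paper, and the temporal self-attention stage is omitted.

```python
import torch
import torch.nn as nn

class PixelSetEncoder(nn.Module):
    """Embed an unordered set of pixels with a shared MLP, then pool."""
    def __init__(self, n_channels=10, d_embed=64, n_sample=32):
        super().__init__()
        self.n_sample = n_sample
        self.mlp = nn.Sequential(nn.Linear(n_channels, 64), nn.ReLU(),
                                 nn.Linear(64, d_embed), nn.ReLU())

    def forward(self, pixels):                 # pixels: (batch, n_pixels, channels)
        idx = torch.randint(0, pixels.size(1), (self.n_sample,))  # random pixel subset
        h = self.mlp(pixels[:, idx, :])        # (batch, n_sample, d_embed)
        return torch.cat([h.mean(dim=1), h.std(dim=1)], dim=-1)   # order-invariant pooling

# One parcel observed on one date: 200 pixels, 10 spectral bands.
emb = PixelSetEncoder()(torch.randn(4, 200, 10))
print(emb.shape)  # torch.Size([4, 128])
```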
Light Field Saliency Detection with Deep Convolutional Networks
Title | Light Field Saliency Detection with Deep Convolutional Networks |
Authors | Jun Zhang, Yamei Liu, Shengping Zhang, Ronald Poppe, Meng Wang |
Abstract | Light field imaging presents an attractive alternative to RGB imaging because of the recording of the direction of the incoming light. The detection of salient regions in a light field image benefits from the additional modeling of angular patterns. For RGB imaging, methods using CNNs have achieved excellent results on a range of tasks, including saliency detection. However, it is not trivial to use CNN-based methods for saliency detection on light field images because these methods are not specifically designed for processing light field inputs. In addition, current light field datasets are not sufficiently large to train CNNs. To overcome these issues, we present a new Lytro Illum dataset, which contains 640 light fields and their corresponding ground-truth saliency maps. Compared to current light field saliency datasets [1], [2], our new dataset is larger, of higher quality, contains more variation and more types of light field inputs. This makes our dataset suitable for training deeper networks and benchmarking. Furthermore, we propose a novel end-to-end CNN-based framework for light field saliency detection. Specifically, we propose three novel MAC (Model Angular Changes) blocks to process light field micro-lens images. We systematically study the impact of different architecture variants and compare light field saliency with regular 2D saliency. Our extensive comparisons indicate that our novel network significantly outperforms state-of-the-art methods on the proposed dataset and has desired generalization abilities on other existing datasets. |
Tasks | Saliency Detection |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08331v2 |
PDF | https://arxiv.org/pdf/1906.08331v2.pdf |
PWC | https://paperswithcode.com/paper/light-field-saliency-detection-with-deep |
Repo | https://github.com/pencilzhang/LFNet-light-field-saliency-net |
Framework | none |
Time-Series Anomaly Detection Service at Microsoft
Title | Time-Series Anomaly Detection Service at Microsoft |
Authors | Hansheng Ren, Bixiong Xu, Yujing Wang, Chao Yi, Congrui Huang, Xiaoyu Kou, Tony Xing, Mao Yang, Jie Tong, Qi Zhang |
Abstract | Large companies need to monitor various metrics (for example, Page Views and Revenue) of their applications and services in real time. At Microsoft, we develop a time-series anomaly detection service which helps customers monitor their time series continuously and alerts them to potential incidents in time. In this paper, we introduce the pipeline and algorithm of our anomaly detection service, which is designed to be accurate, efficient and general. The pipeline consists of three major modules, including data ingestion, experimentation platform and online compute. To tackle the problem of time-series anomaly detection, we propose a novel algorithm based on Spectral Residual (SR) and Convolutional Neural Network (CNN). Our work is the first attempt to borrow the SR model from the visual saliency detection domain for time-series anomaly detection. Moreover, we innovatively combine SR and CNN to improve the performance of the SR model. Our approach achieves superior experimental results compared with state-of-the-art baselines on both public datasets and Microsoft production data. |
Tasks | Anomaly Detection, Saliency Detection, Time Series |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.03821v1 |
PDF | https://arxiv.org/pdf/1906.03821v1.pdf |
PWC | https://paperswithcode.com/paper/time-series-anomaly-detection-service-at |
Repo | https://github.com/yoshinaga0106/spectral-residual |
Framework | none |
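The Spectral Residual step borrowed from visual saliency is compact enough to sketch directly in NumPy; the thresholding rule here is illustrative, and the CNN stage that the paper trains on top of the SR output is omitted.

```python
import numpy as np

def spectral_residual_saliency(x, window=3):
    """Saliency map of a 1-D series via Spectral Residual: subtract a local
    average of the log-amplitude spectrum, then invert the FFT."""
    spec = np.fft.fft(x)
    amplitude = np.abs(spec)
    log_amp = np.log(amplitude + 1e-8)
    kernel = np.ones(window) / window
    avg_log_amp = np.convolve(log_amp, kernel, mode="same")
    residual = log_amp - avg_log_amp
    saliency = np.abs(np.fft.ifft(np.exp(residual + 1j * np.angle(spec))))
    return saliency

series = np.sin(np.linspace(0, 20 * np.pi, 400)) + 0.05 * np.random.randn(400)
series[250] += 3.0                                   # injected spike
sal = spectral_residual_saliency(series)
threshold = sal.mean() + 3 * sal.std()               # illustrative decision rule
print(np.nonzero(sal > threshold)[0])                # anomaly candidates near index 250
```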
Appearance and Pose-Conditioned Human Image Generation using Deformable GANs
Title | Appearance and Pose-Conditioned Human Image Generation using Deformable GANs |
Authors | Aliaksandr Siarohin, Stéphane Lathuilière, Enver Sangineto, Nicu Sebe |
Abstract | In this paper, we address the problem of generating person images conditioned on both pose and appearance information. Specifically, given an image $x_a$ of a person and a target pose $P(x_b)$, extracted from a different image $x_b$, we synthesize a new image of that person in pose $P(x_b)$, while preserving the visual details in $x_a$. In order to deal with pixel-to-pixel misalignments caused by the pose differences between $P(x_a)$ and $P(x_b)$, we introduce deformable skip connections in the generator of our Generative Adversarial Network. Moreover, a nearest-neighbour loss is proposed instead of the common $L_1$ and $L_2$ losses in order to match the details of the generated image with the target image. Quantitative and qualitative results, using common datasets and protocols recently proposed for this task, show that our approach is competitive with respect to the state of the art. Moreover, we conduct an extensive evaluation using off-the-shelf person re-identification (Re-ID) systems trained with person-generation based augmented data, which is one of the most important applications of this task. Our experiments show that our Deformable GANs can significantly boost the Re-ID accuracy and are even better than data-augmentation methods specifically trained using Re-ID losses. |
Tasks | Data Augmentation, Image Generation, Person Re-Identification |
Published | 2019-04-30 |
URL | https://arxiv.org/abs/1905.00007v2 |
PDF | https://arxiv.org/pdf/1905.00007v2.pdf |
PWC | https://paperswithcode.com/paper/appearance-and-pose-conditioned-human-image |
Repo | https://github.com/AliaksandrSiarohin/pose-gan |
Framework | tf |
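The nearest-neighbour loss mentioned in the abstract can be sketched as follows: each generated pixel is penalized by its smallest L1 distance to the target pixels within a small spatial window, which makes the loss tolerant to residual misalignments. The window size and the use of raw RGB values (rather than convolutional features) are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def nearest_neighbour_l1(generated, target, radius=1):
    """For every pixel of `generated`, take the minimum L1 distance to the
    target pixels within a (2*radius+1)^2 window, then average."""
    b, c, h, w = target.shape
    k = 2 * radius + 1
    # Unfold the target into all shifted copies within the window.
    patches = F.unfold(target, kernel_size=k, padding=radius)     # (b, c*k*k, h*w)
    patches = patches.view(b, c, k * k, h, w)
    diff = (generated.unsqueeze(2) - patches).abs().sum(dim=1)    # (b, k*k, h, w)
    return diff.min(dim=1).values.mean()

gen = torch.rand(2, 3, 32, 32)
tgt = torch.rand(2, 3, 32, 32)
print(nearest_neighbour_l1(gen, tgt).item())
```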
TreeGen: A Tree-Based Transformer Architecture for Code Generation
Title | TreeGen: A Tree-Based Transformer Architecture for Code Generation |
Authors | Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang |
Abstract | A code generation system generates programming language code based on an input natural language description. State-of-the-art approaches rely on neural networks for code generation. However, these code generators suffer from two problems. One is the long dependency problem, where a code element often depends on another far-away code element. A variable reference, for example, depends on its definition, which may appear quite a few lines before. The other problem is structure modeling, as programs contain rich structural information. In this paper, we propose a novel tree-based neural architecture, TreeGen, for code generation. TreeGen uses the attention mechanism of Transformers to alleviate the long-dependency problem, and introduces a novel AST reader (encoder) to incorporate grammar rules and AST structures into the network. We evaluated TreeGen on a Python benchmark, HearthStone, and two semantic parsing benchmarks, ATIS and GEO. TreeGen outperformed the previous state-of-the-art approach by 4.5 percentage points on HearthStone, and achieved the best accuracy among neural network-based approaches on ATIS (89.1%) and GEO (89.6%). We also conducted an ablation test to better understand each component of our model. |
Tasks | Code Generation, Semantic Parsing |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.09983v2 |
PDF | https://arxiv.org/pdf/1911.09983v2.pdf |
PWC | https://paperswithcode.com/paper/treegen-a-tree-based-transformer-architecture |
Repo | https://github.com/zysszy/TreeGen |
Framework | tf |
LAMOL: LAnguage MOdeling for Lifelong Language Learning
Title | LAMOL: LAnguage MOdeling for Lifelong Language Learning |
Authors | Fan-Keng Sun, Cheng-Hao Ho, Hung-Yi Lee |
Abstract | Most research on lifelong learning applies to images or games, but not language. We present LAMOL, a simple yet effective method for lifelong language learning (LLL) based on language modeling. LAMOL replays pseudo-samples of previous tasks while requiring no extra memory or model capacity. Specifically, LAMOL is a language model that simultaneously learns to solve the tasks and generate training samples. When the model is trained for a new task, it generates pseudo-samples of previous tasks for training alongside data for the new task. The results show that LAMOL prevents catastrophic forgetting without any sign of intransigence and can perform five very different language tasks sequentially with only one model. Overall, LAMOL outperforms previous methods by a considerable margin and is only 2-3% worse than multitasking, which is usually considered the LLL upper bound. The source code is available at https://github.com/jojotenya/LAMOL. |
Tasks | Language Modelling |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03329v2 |
PDF | https://arxiv.org/pdf/1909.03329v2.pdf |
PWC | https://paperswithcode.com/paper/lamal-language-modeling-is-all-you-need-for |
Repo | https://github.com/jojotenya/LAMOL |
Framework | pytorch |
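A schematic sketch of the replay loop described in the abstract: before training on a new task, the same model is sampled from a task-specific token to produce pseudo-examples of earlier tasks, which are mixed into the new task's data. The "model" below is a trivial stand-in (a list plus random resampling) used only to show the data flow; it is not the LAMOL repo's GPT-2-based implementation.

```python
import random

def train_on_task(model_state, examples):
    """Stand-in for one fine-tuning pass; here we simply remember the examples."""
    model_state.extend(examples)
    return model_state

def generate_pseudo_samples(model_state, task_token, n):
    """Stand-in for sampling the language model conditioned on a task token."""
    seen = [e for e in model_state if e.startswith(task_token)]
    return random.choices(seen, k=min(n, len(seen))) if seen else []

model_state = []                                   # toy "model"
tasks = {"[SQUAD]": ["[SQUAD] q1 -> a1", "[SQUAD] q2 -> a2"],
         "[WIKISQL]": ["[WIKISQL] q3 -> sql3"]}

previous_tokens = []
for token, new_data in tasks.items():
    replay = []
    for old in previous_tokens:                    # pseudo-replay of earlier tasks
        replay += generate_pseudo_samples(model_state, old, n=1)
    model_state = train_on_task(model_state, new_data + replay)
    previous_tokens.append(token)

print(model_state)
```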
Periodic Bandits and Wireless Network Selection
Title | Periodic Bandits and Wireless Network Selection |
Authors | Shunhao Oh, Anuja Meetoo Appavoo, Seth Gilbert |
Abstract | Bandit-style algorithms have been studied extensively in stochastic and adversarial settings. Such algorithms have been shown to be useful in multiplayer settings, e.g. to solve the wireless network selection problem, which can be formulated as an adversarial bandit problem. A leading bandit algorithm for the adversarial setting is EXP3. However, network behavior is often repetitive, where user density and network behavior follow regular patterns. Bandit algorithms, like EXP3, fail to provide good guarantees for periodic behaviors. A major reason is that these algorithms compete against fixed-action policies, which is ineffective in a periodic setting. In this paper, we define a periodic bandit setting, and periodic regret as a better performance measure for this type of setting. Instead of comparing an algorithm’s performance to fixed-action policies, we aim to be competitive with policies that play arms under some set of possible periodic patterns $F$ (for example, all possible periodic functions with periods $1,2,\cdots,P$). We propose Periodic EXP4, a computationally efficient variant of the EXP4 algorithm for periodic settings. With $K$ arms, $T$ time steps, and where each periodic pattern in $F$ is of length at most $P$, we show that the periodic regret obtained by Periodic EXP4 is at most $O\big(\sqrt{PKT \log K + KT \log F}\big)$. We also prove a lower bound of $\Omega\big(\sqrt{PKT + KT \frac{\log F}{\log K}} \big)$ for the periodic setting, showing that this is optimal within log-factors. As an example, we focus on the wireless network selection problem. Through simulation, we show that Periodic EXP4 learns the periodic pattern over time, adapts to changes in a dynamic environment, and far outperforms EXP3. |
Tasks | |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12355v1 |
PDF | http://arxiv.org/pdf/1904.12355v1.pdf |
PWC | https://paperswithcode.com/paper/periodic-bandits-and-wireless-network |
Repo | https://github.com/Ohohcakester/PeriodicEXP4-Source |
Framework | none |
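A compact sketch of the idea behind Periodic EXP4: each expert is a candidate periodic pattern mapping t mod p to an arm, and exponential weights are maintained over these experts as in EXP4. The learning rate and the explicit enumeration of all experts are illustrative; the paper's computationally efficient variant avoids enumerating the whole set $F$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K, P, T = 3, 2, 2000
# Experts: every periodic policy with period <= P (here: periods 1 and 2).
experts = [pat for p in range(1, P + 1)
           for pat in itertools.product(range(K), repeat=p)]
weights = np.ones(len(experts))
eta = np.sqrt(np.log(len(experts)) / (K * T))        # illustrative learning rate

def true_reward(t, arm):                             # hidden periodic environment
    best = 0 if t % 2 == 0 else 2
    return 1.0 if arm == best else 0.0

total = 0.0
for t in range(T):
    advice = np.array([pat[t % len(pat)] for pat in experts])
    w = weights / weights.sum()
    probs = np.zeros(K)
    for arm in range(K):
        probs[arm] = w[advice == arm].sum()          # mix the experts' recommendations
    arm = rng.choice(K, p=probs)
    r = true_reward(t, arm)
    total += r
    est = np.zeros(K)
    est[arm] = r / probs[arm]                        # importance-weighted reward estimate
    weights *= np.exp(eta * est[advice])             # EXP4-style weight update
print(total / T)                                     # approaches 1 as the pattern is learned
```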
A Distributed Method for Fitting Laplacian Regularized Stratified Models
Title | A Distributed Method for Fitting Laplacian Regularized Stratified Models |
Authors | Jonathan Tuck, Shane Barratt, Stephen Boyd |
Abstract | Stratified models are models that depend in an arbitrary way on a set of selected categorical features, and depend linearly on the other features. In a basic and traditional formulation a separate model is fit for each value of the categorical feature, using only the data that has the specific categorical value. To this formulation we add Laplacian regularization, which encourages the model parameters for neighboring categorical values to be similar. Laplacian regularization allows us to specify one or more weighted graphs on the stratification feature values. For example, stratifying over the days of the week, we can specify that the Sunday model parameter should be close to the Saturday and Monday model parameters. The regularization improves the performance of the model over the traditional stratified model, since the model for each value of the categorical feature 'borrows strength' from its neighbors. In particular, it produces a model even for categorical values that did not appear in the training data set. We propose an efficient distributed method for fitting stratified models, based on the alternating direction method of multipliers (ADMM). When the fitting loss functions are convex, the stratified model fitting problem is convex, and our method computes the global minimizer of the loss plus regularization; in other cases it computes a local minimizer. The method is very efficient, and naturally scales to large data sets or numbers of stratified feature values. We illustrate our method with a variety of examples. |
Tasks | |
Published | 2019-04-26 |
URL | https://arxiv.org/abs/1904.12017v3 |
PDF | https://arxiv.org/pdf/1904.12017v3.pdf |
PWC | https://paperswithcode.com/paper/a-distributed-method-for-fitting-laplacian |
Repo | https://github.com/cvxgrp/cvxstrat |
Framework | pytorch |
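The objective described in the abstract (one linear model per categorical value plus a Laplacian penalty tying neighbouring values together) can be written directly with CVXPY; the cyclic days-of-the-week graph, the synthetic data, and the regularization weight are illustrative, and this dense solve stands in for, rather than reproduces, the paper's distributed ADMM method.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n_days, n_features, n_samples = 7, 5, 30
theta = cp.Variable((n_days, n_features))            # one linear model per day

loss, laplacian = 0, 0
for day in range(n_days):
    X = rng.standard_normal((n_samples, n_features))
    y = X @ (np.ones(n_features) * (1 + 0.1 * day)) + 0.1 * rng.standard_normal(n_samples)
    loss += cp.sum_squares(X @ theta[day] - y)        # per-day least-squares fit
for day in range(n_days):                             # cyclic graph over days of the week
    nxt = (day + 1) % n_days
    laplacian += cp.sum_squares(theta[day] - theta[nxt])

problem = cp.Problem(cp.Minimize(loss + 5.0 * laplacian))
problem.solve()
print(np.round(theta.value, 2))                       # neighboring days get similar parameters
```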
A Binary Regression Adaptive Goodness-of-fit Test (BAGofT)
Title | A Binary Regression Adaptive Goodness-of-fit Test (BAGofT) |
Authors | Jiawei Zhang, Jie Ding, Yuhong Yang |
Abstract | Pearson’s $\chi^2$ test and the residual deviance test are two classical goodness-of-fit tests for binary regression models such as logistic regression. These two tests cannot be applied when we have one or more continuous covariates in the data, a quite common situation in practice. In that case, the most widely used approach is the Hosmer-Lemeshow test, which partitions the covariate space into groups according to quantiles of the fitted probabilities from all the observations. However, its grouping scheme is not flexible enough to explore how to adversarially partition the data space in order to enhance the power. In this work, we propose a new methodology, named binary regression adaptive grouping goodness-of-fit test (BAGofT), to address the above concern. It is a two-stage solution where the first stage adaptively selects candidate partitions using “training” data, and the second stage performs $\chi^2$ tests with necessary corrections based on “test” data. A proper data splitting ensures that the test has desirable size and power properties. Our experimental results show that BAGofT performs much better than the Hosmer-Lemeshow test in many situations. |
Tasks | |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03063v1 |
PDF | https://arxiv.org/pdf/1911.03063v1.pdf |
PWC | https://paperswithcode.com/paper/a-binary-regression-adaptive-goodness-of-fit |
Repo | https://github.com/JZHANG4362/BAGofT |
Framework | none |
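A hedged sketch of the two-stage procedure described in the abstract: use one half of the data to pick a partition of the covariate space adaptively (here a simple residual-correlation heuristic stands in for the paper's adaptive grouping), then compute a chi-square-style comparison of observed versus expected positives on the held-out half. The synthetic data, grouping rule, and degrees of freedom are illustrative.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 3))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1] ** 2))))

train, test = np.arange(0, 1000), np.arange(1000, 2000)
model = LogisticRegression().fit(X[train], y[train])   # misspecified: omits the quadratic term

# Stage 1 (training half): choose a grouping covariate adaptively; here we
# simply pick the covariate most correlated with the squared residuals.
resid2 = (y[train] - model.predict_proba(X[train])[:, 1]) ** 2
scores = [abs(np.corrcoef(X[train][:, j], resid2)[0, 1]) for j in range(X.shape[1])]
j_star = int(np.argmax(scores))

# Stage 2 (test half): partition by quantiles of the chosen covariate and
# compare observed vs. expected positives per group with a chi-square statistic.
p_hat = model.predict_proba(X[test])[:, 1]
edges = np.quantile(X[test][:, j_star], [0.25, 0.5, 0.75])
groups = np.digitize(X[test][:, j_star], edges)
stat = 0.0
for g in range(4):
    m = groups == g
    obs, exp = y[test][m].sum(), p_hat[m].sum()
    stat += (obs - exp) ** 2 / (exp * (1 - exp / m.sum()))
print("statistic:", round(stat, 2), "p-value:", round(1 - chi2.cdf(stat, df=4), 4))
```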
Deep Generalized Max Pooling
Title | Deep Generalized Max Pooling |
Authors | Vincent Christlein, Lukas Spranger, Mathias Seuret, Anguelos Nicolaou, Pavel Král, Andreas Maier |
Abstract | Global pooling layers are an essential part of Convolutional Neural Networks (CNNs). They are used to aggregate activations of spatial locations to produce a fixed-size vector in several state-of-the-art CNNs. Global average pooling or global max pooling are commonly used for converting convolutional features of variable-size images to a fixed-size embedding. However, both pooling layer types are computed in a spatially independent manner: each individual activation map is pooled on its own, so activations from different locations are simply merged together. In contrast, we propose Deep Generalized Max Pooling, which balances the contribution of all activations of a spatially coherent region by re-weighting all descriptors so that the impact of frequent and rare ones is equalized. We show that this layer is superior to both average and max pooling on the classification of Latin medieval manuscripts (CLAMM’16, CLAMM’17), as well as writer identification (Historical-WI’17). |
Tasks | |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05040v1 |
PDF | https://arxiv.org/pdf/1908.05040v1.pdf |
PWC | https://paperswithcode.com/paper/deep-generalized-max-pooling |
Repo | https://github.com/VChristlein/dgmp |
Framework | pytorch |
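The re-weighting described in the abstract has a closed form in the non-deep setting: generalized max pooling solves a ridge-regression problem so that the pooled vector has (approximately) unit dot product with every local descriptor, which equalizes frequent and rare activations. A NumPy sketch of that solve follows; the regularization weight is illustrative.

```python
import numpy as np

def generalized_max_pooling(descriptors, lam=1e-3):
    """descriptors: (n, d) local activations. Returns the (d,) pooled vector xi
    minimizing ||descriptors @ xi - 1||^2 + lam * ||xi||^2 (generalized max pooling)."""
    n, d = descriptors.shape
    gram_plus = descriptors.T @ descriptors + lam * np.eye(d)
    return np.linalg.solve(gram_plus, descriptors.sum(axis=0))

# A frequent descriptor direction no longer dominates the pooled representation.
frequent = np.tile(np.array([1.0, 0.0]), (50, 1))
rare = np.array([[0.0, 1.0]])
pooled = generalized_max_pooling(np.vstack([frequent, rare]))
print(pooled)          # both directions contribute comparably
```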
DeepIlluminance: Contextual Illuminance Estimation via Deep Neural Networks
Title | DeepIlluminance: Contextual Illuminance Estimation via Deep Neural Networks |
Authors | Jun Zhang, Tong Zheng, Shengping Zhang, Meng Wang |
Abstract | Computational color constancy refers to the estimation of the scene illumination and makes the perceived color relatively stable under varying illumination. In the past few years, deep Convolutional Neural Networks (CNNs) have delivered superior performance in illuminant estimation. Several representative methods formulate it as a multi-label prediction problem by learning the local appearance of image patches using CNNs. However, these approaches inevitably make incorrect estimations for ambiguous patches affected by their neighborhood contexts. Inaccurate local estimates are likely to bring degraded performance when combined into a global prediction. To address the above issues, we propose a contextual deep network for patch-based illuminant estimation equipped with refinement. First, the contextual net with a center-surround architecture extracts local contextual features from image patches, and generates initial illuminant estimates and the corresponding color-corrected patches. The patches are sampled based on the observation that pixels with large color differences describe the illumination well. Then, the refinement net integrates the input patches with the corrected patches in conjunction with the use of intermediate features to improve the performance. To train such a network with numerous parameters, we propose a stage-wise training strategy, in which the features and the predicted illuminant from previous stages are provided to the next learning stage with increasingly finer estimates recovered. Experiments show that our approach obtains competitive performance on two illuminant estimation benchmarks. |
Tasks | Color Constancy |
Published | 2019-05-12 |
URL | https://arxiv.org/abs/1905.04791v2 |
PDF | https://arxiv.org/pdf/1905.04791v2.pdf |
PWC | https://paperswithcode.com/paper/deepilluminance-contextual-illuminance |
Repo | https://github.com/pencilzhang/DeepIlluminance-computational-color-constancy |
Framework | caffe2 |