Paper Group ANR 880
Unsupervised Lemmatization as Embeddings-Based Word Clustering. The Chi-Square Test of Distance Correlation. An Empirical Study of Propagation-based Methods for Video Object Segmentation. Event-based Feature Extraction Using Adaptive Selection Thresholds. Representation Learning on Graphs: A Reinforcement Learning Application. Multispectral and Hyp …
Unsupervised Lemmatization as Embeddings-Based Word Clustering
Title | Unsupervised Lemmatization as Embeddings-Based Word Clustering |
Authors | Rudolf Rosa, Zdeněk Žabokrtský |
Abstract | We focus on the task of unsupervised lemmatization, i.e. grouping together inflected forms of one word under one label (a lemma) without the use of annotated training data. We propose to perform agglomerative clustering of word forms with a novel distance measure. Our distance measure is based on the observation that inflections of the same word tend to be similar both string-wise and in meaning. We therefore combine word embedding cosine similarity, serving as a proxy for meaning similarity, with Jaro-Winkler edit distance. Our experiments on 23 languages show our approach to be promising, surpassing the baseline on 23 of the 28 evaluation datasets. |
Tasks | Lemmatization |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08528v1 |
https://arxiv.org/pdf/1908.08528v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-lemmatization-as-embeddings |
Repo | |
Framework | |
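A minimal sketch of the distance measure described above: embedding cosine dissimilarity mixed with Jaro-Winkler string dissimilarity. The mixing weight `alpha` and the exact combination formula are illustrative assumptions, not the paper's reported configuration.

```python
import math

def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(len1, len2) // 2 - 1
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # transpositions: matched characters appearing out of order
    t, k = 0, 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t /= 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1, s2, p=0.1):
    """Jaro similarity boosted by a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return j + prefix * p * (1.0 - j)

def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def lemma_distance(w1, vec1, w2, vec2, alpha=0.5):
    """Clustering distance: low when two word forms are close both in
    embedding space (meaning) and as strings (inflection)."""
    return alpha * (1.0 - cosine_sim(vec1, vec2)) + (1 - alpha) * (1.0 - jaro_winkler(w1, w2))
```

Forms of one lexeme such as "walking"/"walked" score low distance on both components, so agglomerative clustering tends to merge them before unrelated pairs.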
The Chi-Square Test of Distance Correlation
Title | The Chi-Square Test of Distance Correlation |
Authors | Cencheng Shen, Joshua T. Vogelstein |
Abstract | Distance correlation has gained much recent attention in the data science community: the sample statistic is straightforward to compute and asymptotically equals zero if and only if the variables are independent, making it an ideal choice to test any type of dependency structure given sufficient sample size. One major bottleneck is the testing process: because the null distribution of distance correlation depends on the underlying random variables and metric choice, it typically requires a permutation test to estimate the null and compute the p-value, which is very costly for large amounts of data. To overcome this difficulty, we propose a centered chi-square distribution, demonstrate that it well-approximates the limiting null distribution of unbiased distance correlation, and prove upper tail dominance and a distribution bound. The resulting distance correlation chi-square test is a nonparametric test for independence; it is valid and universally consistent using any strong negative type metric or characteristic kernel, enjoys finite-sample testing power similar to the standard permutation test, is provably the most powerful among all valid tests of distance correlation using known distributions, and is also applicable to K-sample and partial testing. |
Tasks | |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.12150v4 |
https://arxiv.org/pdf/1912.12150v4.pdf | |
PWC | https://paperswithcode.com/paper/the-chi-square-test-of-distance-correlation |
Repo | |
Framework | |
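The test described above can be sketched as follows: compute the unbiased (U-centered) sample distance correlation, then calibrate n·Dcor against a centered chi-square instead of a permutation null. Testing n·Dcor + 1 against the χ²₁ CDF is our reading of the proposal, so the exact calibration below is an assumption.

```python
import math
import numpy as np

def _u_center(D):
    """U-centering of a pairwise distance matrix (unbiased form)."""
    n = D.shape[0]
    row = D.sum(axis=1, keepdims=True) / (n - 2)
    col = D.sum(axis=0, keepdims=True) / (n - 2)
    total = D.sum() / ((n - 1) * (n - 2))
    U = D - row - col + total
    np.fill_diagonal(U, 0.0)
    return U

def unbiased_dcor(x, y):
    """Unbiased sample distance correlation under the Euclidean metric."""
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    n = x.shape[0]
    A = _u_center(np.linalg.norm(x[:, None] - x[None, :], axis=-1))
    B = _u_center(np.linalg.norm(y[:, None] - y[None, :], axis=-1))
    scale = n * (n - 3)
    dcov = (A * B).sum() / scale
    dvx, dvy = (A * A).sum() / scale, (B * B).sum() / scale
    denom = math.sqrt(dvx * dvy)
    return dcov / denom if denom > 0 else 0.0

def chi_square_pvalue(dcor, n):
    """Approximate p-value: n * dcor referred to chi2(1) - 1 (assumed calibration)."""
    t = n * dcor + 1.0
    if t <= 0:
        return 1.0
    return 1.0 - math.erf(math.sqrt(t / 2.0))  # 1 - CDF of chi2 with 1 d.o.f.
```

The appeal is that a single CDF evaluation replaces hundreds of permutation replicates, which is where the claimed speedup over the permutation test comes from.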
An Empirical Study of Propagation-based Methods for Video Object Segmentation
Title | An Empirical Study of Propagation-based Methods for Video Object Segmentation |
Authors | Hengkai Guo, Wenji Wang, Guanjun Guo, Huaxia Li, Jiachen Liu, Qian He, Xuefeng Xiao |
Abstract | While propagation-based approaches have achieved state-of-the-art performance for video object segmentation, the literature lacks a fair comparison of different methods using the same settings. In this paper, we carry out an empirical study of propagation-based methods. We view these approaches from a unified perspective and conduct a detailed ablation study of core methods, input cues, multi-object combination and training strategies. With careful design, our improved end-to-end memory networks achieve a global mean of 76.1 on the DAVIS 2017 val set. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12769v1 |
https://arxiv.org/pdf/1907.12769v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-propagation-based |
Repo | |
Framework | |
Event-based Feature Extraction Using Adaptive Selection Thresholds
Title | Event-based Feature Extraction Using Adaptive Selection Thresholds |
Authors | Saeed Afshar, Ying Xu, Jonathan Tapson, André van Schaik, Gregory Cohen |
Abstract | Unsupervised feature extraction algorithms form one of the most important building blocks in machine learning systems. These algorithms are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, because they were not designed for this purpose, such algorithms typically require significant simplification during implementation to meet hardware constraints, creating trade-offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate useful intermediary signals, which are valuable only in the context of neuromorphic hardware limitations. In this work, a novel event-based feature extraction method is proposed that focuses on these issues. The algorithm operates via simple adaptive selection thresholds, which allow a simpler implementation of network homeostasis than previous works by trading off a small amount of information loss in the form of missed events that fall outside the selection thresholds. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals that indicate network weight convergence without the need to access network weights. A novel heuristic method for network size selection is proposed which makes use of noise events and their feature representations. The use of selection thresholds is shown to produce network activation patterns that predict classification accuracy, allowing rapid evaluation and optimization of system parameters without the need to run back-end classifiers. The feature extraction method is tested on both the N-MNIST benchmarking dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested, with the results quantifying the resultant performance gains at each processing stage. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.07853v2 |
https://arxiv.org/pdf/1907.07853v2.pdf | |
PWC | https://paperswithcode.com/paper/event-based-feature-extraction-using-adaptive |
Repo | |
Framework | |
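A minimal sketch of learning with adaptive selection thresholds, loosely following the scheme described above: each feature neuron keeps its own threshold, which rises when the neuron wins an event and relaxes for all neurons when an event is missed (falls outside every threshold). The learning rate and threshold steps are hypothetical constants, not the paper's settings.

```python
import numpy as np

def feast_step(patch, weights, thresholds, eta=0.01, d_up=0.002, d_down=0.01):
    """One event update: match the event patch to the most similar feature
    whose selection threshold is cleared; adapt weights and thresholds."""
    p = patch / (np.linalg.norm(patch) + 1e-12)   # unit-normalise the event context
    sims = weights @ p                             # cosine similarity to each feature
    eligible = sims >= thresholds
    if eligible.any():
        winner = int(np.argmax(np.where(eligible, sims, -np.inf)))
        # move the winning feature toward the event, keep it unit norm
        weights[winner] = (1 - eta) * weights[winner] + eta * p
        weights[winner] /= np.linalg.norm(weights[winner]) + 1e-12
        thresholds[winner] += d_up                 # winner becomes more selective
    else:
        thresholds -= d_down                       # missed event: all neurons relax
    return weights, thresholds
```

The threshold trajectories themselves are the "intermediary signals" the abstract refers to: their convergence can be monitored without reading out the weights.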
Representation Learning on Graphs: A Reinforcement Learning Application
Title | Representation Learning on Graphs: A Reinforcement Learning Application |
Authors | Sephora Madjiheurem, Laura Toni |
Abstract | In this work, we study value function approximation in reinforcement learning (RL) problems with high dimensional state or action spaces via a generalized version of representation policy iteration (RPI). We consider the limitations of proto-value functions (PVFs) at accurately approximating the value function in low dimensions, and we highlight the importance of feature learning for improved low-dimensional value function approximation. Then, we adopt different representation learning algorithms on graphs to learn the basis functions that best represent the value function. We empirically show that node2vec, an algorithm for scalable feature learning in networks, and the Variational Graph Auto-Encoder consistently outperform the commonly used smooth proto-value functions in low-dimensional feature space. |
Tasks | Representation Learning |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05351v2 |
http://arxiv.org/pdf/1901.05351v2.pdf | |
PWC | https://paperswithcode.com/paper/representation-learning-on-graphs-a |
Repo | |
Framework | |
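The proto-value-function baseline the paper starts from can be sketched directly: PVFs are the smoothest eigenvectors of the graph Laplacian, and the value function is approximated linearly in that basis. Swapping `pvf_basis` for node2vec or VGAE embeddings gives the paper's proposed variants; the plain least-squares fit below is a simplification of representation policy iteration, for illustration only.

```python
import numpy as np

def pvf_basis(adj, k):
    """Proto-value functions: the k smoothest eigenvectors of the
    combinatorial graph Laplacian L = D - A."""
    deg = np.diag(adj.sum(axis=1))
    L = deg - adj
    vals, vecs = np.linalg.eigh(L)   # eigenvalues ascending: smoothest first
    return vecs[:, :k]

def fit_values(Phi, states, targets):
    """Linear value approximation V(s) ~ Phi[s] @ w, fitted by least squares
    on sampled states; returns the approximate value for every state."""
    w, *_ = np.linalg.lstsq(Phi[states], targets, rcond=None)
    return Phi @ w
```

With a full basis the fit is exact; the paper's point is that for small k, learned embeddings span the value function better than the Laplacian eigenvectors do.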
Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net
Title | Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net |
Authors | Qi Xie, Minghao Zhou, Qian Zhao, Deyu Meng, Wangmeng Zuo, Zongben Xu |
Abstract | Hyperspectral imaging can help better understand the characteristics of different materials, compared with traditional image systems. However, only high-resolution multispectral (HrMS) and low-resolution hyperspectral (LrHS) images can generally be captured at video rate in practice. In this paper, we propose a model-based deep learning approach for merging an HrMS image and an LrHS image to generate a high-resolution hyperspectral (HrHS) image. Specifically, we construct a novel MS/HS fusion model which takes into consideration the observation models of the low-resolution images and the low-rankness along the spectral mode of the HrHS image. We then design an iterative algorithm to solve the model by exploiting the proximal gradient method. Then, by unfolding the designed algorithm, we construct a deep network, called MS/HS Fusion Net, learning the proximal operators and model parameters with convolutional neural networks. Experimental results on simulated and real data substantiate the superiority of our method both visually and quantitatively as compared with state-of-the-art methods along this line of research. |
Tasks | |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.03281v1 |
http://arxiv.org/pdf/1901.03281v1.pdf | |
PWC | https://paperswithcode.com/paper/multispectral-and-hyperspectral-image-fusion-1 |
Repo | |
Framework | |
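The algorithmic core being unfolded here is a proximal gradient iteration. A generic sketch, with the learned proximal operator replaced by plain soft-thresholding (the paper instead learns the proximal operators and model parameters with CNNs, so this is only the scaffold of the idea):

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def proximal_gradient(A, b, lam=0.1, step=None, iters=200):
    """ISTA: alternate a gradient step on the data term ||Ax - b||^2 / 2
    with a proximal step. Each iteration corresponds to one network stage
    when the scheme is unfolded."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - step * A.T @ (A @ x - b), step * lam)
    return x
```

Unfolding replaces the fixed `soft_threshold` and `step` with stage-wise learnable modules, which is what turns the iterative solver into MS/HS Fusion Net.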
AutoScale: Learning to Scale for Crowd Counting
Title | AutoScale: Learning to Scale for Crowd Counting |
Authors | Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Masayoshi Tomizuka, Xiang Bai |
Abstract | Crowd counting in images is a widely explored but challenging task. Though recent convolutional neural network (CNN) methods have achieved great progress, it is still difficult to accurately count, and even to precisely localize, people in very dense regions. A major issue is that dense regions usually consist of many instances of small size, and thus exhibit very different density patterns compared with sparse regions. Localizing or detecting dense small objects is also very delicate. In this paper, instead of processing an image pyramid and aggregating multi-scale features, we propose a simple yet effective Learning to Scale (L2S) module to cope with significant scale variations in both regression and localization. Specifically, the L2S module aims to automatically scale dense regions to similar and reasonable scale levels. This alleviates the density pattern shift for density regression methods and facilitates the localization of small instances. Besides, we also introduce a novel distance label map combined with a customized adapted cross-entropy loss for precise person localization. Extensive experiments demonstrate that the proposed method, termed AutoScale, consistently improves upon state-of-the-art methods on both regression and localization benchmarks on three widely used datasets. The proposed AutoScale also demonstrates noteworthy transferability under cross-dataset validation on different datasets. |
Tasks | Crowd Counting |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09632v1 |
https://arxiv.org/pdf/1912.09632v1.pdf | |
PWC | https://paperswithcode.com/paper/autoscale-learning-to-scale-for-crowd |
Repo | |
Framework | |
Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG
Title | Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG |
Authors | Yujia Jin, Aaron Sidford |
Abstract | Given a data matrix $\mathbf{A} \in \mathbb{R}^{n \times d}$, principal component projection (PCP) and principal component regression (PCR), i.e. projection and regression restricted to the top-eigenspace of $\mathbf{A}$, are fundamental problems in machine learning, optimization, and numerical analysis. In this paper we provide the first algorithms that solve these problems in nearly linear time for fixed eigenvalue distribution and large n. This improves upon previous methods which have superlinear running times when both the number of top eigenvalues and inverse gap between eigenspaces is large. We achieve our results by applying rational approximations to reduce PCP and PCR to solving asymmetric linear systems which we solve by a variant of SVRG. We corroborate these findings with preliminary empirical experiments. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06517v1 |
https://arxiv.org/pdf/1910.06517v1.pdf | |
PWC | https://paperswithcode.com/paper/principal-component-projection-and-regression |
Repo | |
Framework | |
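For reference, the problem being accelerated can be stated exactly, though expensively, via an eigendecomposition: principal component projection maps a vector onto the eigenvectors of AᵀA whose eigenvalue exceeds a threshold. The paper's contribution is avoiding this cubic-time computation; the sketch below is only the definition (`lam` is the eigenvalue threshold, an assumed parameterization of the top-eigenspace).

```python
import numpy as np

def pcp_exact(A, b, lam):
    """Exact principal component projection: project b onto the span of the
    eigenvectors of A^T A with eigenvalue above lam. O(d^3) reference, the
    computation the nearly-linear-time SVRG method is designed to avoid."""
    vals, vecs = np.linalg.eigh(A.T @ A)
    top = vecs[:, vals > lam]          # orthonormal basis of the top-eigenspace
    return top @ (top.T @ b)           # orthogonal projection onto that basis
```

PCR is the analogous statement for regression restricted to the same eigenspace; both reduce, in the paper, to asymmetric linear systems solved by an SVRG variant.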
Global Optimality Guarantees for Nonconvex Unsupervised Video Segmentation
Title | Global Optimality Guarantees for Nonconvex Unsupervised Video Segmentation |
Authors | Brendon G. Anderson, Somayeh Sojoudi |
Abstract | In this paper, we consider the problem of unsupervised video object segmentation via background subtraction. Specifically, we pose the nonsemantic extraction of a video’s moving objects as a nonconvex optimization problem via a sum of sparse and low-rank matrices. The resulting formulation, a nonnegative variant of robust principal component analysis, is more computationally tractable than its commonly employed convex relaxation, although not generally solvable to global optimality. In spite of this limitation, we derive intuitive and interpretable conditions on the video data under which the uniqueness and global optimality of the object segmentation are guaranteed using local search methods. We illustrate these novel optimality criteria through example segmentations using real video data. |
Tasks | Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04409v2 |
https://arxiv.org/pdf/1907.04409v2.pdf | |
PWC | https://paperswithcode.com/paper/global-optimality-guarantees-for-nonconvex |
Repo | |
Framework | |
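The sum-of-sparse-and-low-rank model above can be sketched with a naive alternating scheme: a truncated SVD for the low-rank background, soft-thresholding for the sparse foreground, and clipping for nonnegativity. This is a simplification for illustration; the paper analyzes when local search on the nonconvex model is globally optimal, not this particular heuristic. `lam`, the rank, and the iteration count are hypothetical.

```python
import numpy as np

def nonneg_rpca(D, rank=1, lam=0.25, iters=50):
    """Alternating sketch of D ~ L + S with L low-rank (static background)
    and S sparse (moving objects), both kept nonnegative by clipping."""
    S = np.zeros_like(D)
    for _ in range(iters):
        # low-rank step: best rank-r fit to the residual, clipped to >= 0
        U, s, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = np.clip((U[:, :rank] * s[:rank]) @ Vt[:rank], 0.0, None)
        # sparse step: soft-threshold the residual, clipped to >= 0
        R = D - L
        S = np.clip(np.sign(R) * np.maximum(np.abs(R) - lam, 0.0), 0.0, None)
    return L, S
```

On a toy "frame matrix" with a constant background and one bright pixel, the scheme separates the two, which is the nonsemantic object extraction described in the abstract.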
Robust Machine Translation with Domain Sensitive Pseudo-Sources: Baidu-OSU WMT19 MT Robustness Shared Task System Report
Title | Robust Machine Translation with Domain Sensitive Pseudo-Sources: Baidu-OSU WMT19 MT Robustness Shared Task System Report |
Authors | Renjie Zheng, Hairong Liu, Mingbo Ma, Baigong Zheng, Liang Huang |
Abstract | This paper describes the machine translation system developed jointly by Baidu Research and Oregon State University for the WMT 2019 Machine Translation Robustness Shared Task. Translation of social media is a very challenging problem, since its style is very different from normal parallel corpora (e.g. News) and it also includes various types of noise. To make matters worse, the amount of social media parallel corpora is extremely limited. In this paper, we use a domain-sensitive training method which leverages a large amount of parallel data from popular domains together with a small amount of parallel data from social media. Furthermore, we generate a parallel dataset with pseudo noisy source sentences which are back-translated from monolingual data using a model trained in a similarly domain-sensitive way. We achieve an improvement of more than 10 BLEU in both En-Fr and Fr-En translation compared with the baseline methods. |
Tasks | Machine Translation |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08393v2 |
https://arxiv.org/pdf/1906.08393v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-machine-translation-with-domain |
Repo | |
Framework | |
Domain-adaptive Crowd Counting via Inter-domain Features Segregation and Gaussian-prior Reconstruction
Title | Domain-adaptive Crowd Counting via Inter-domain Features Segregation and Gaussian-prior Reconstruction |
Authors | Junyu Gao, Tao Han, Qi Wang, Yuan Yuan |
Abstract | Recently, crowd counting using supervised learning has achieved a remarkable improvement. Nevertheless, most counters rely on a large amount of manually labeled data. With the release of synthetic crowd data, a potential alternative is transferring knowledge from it to real data without any manual label. However, there is no method to effectively suppress domain gaps and output elaborate density maps during the transfer. To remedy these problems, this paper proposes a Domain-Adaptive Crowd Counting (DACC) framework, which consists of Inter-domain Features Segregation (IFS) and Gaussian-prior Reconstruction (GPR). To be specific, IFS translates synthetic data to realistic images, comprising domain-shared feature extraction and domain-independent feature decoration. A coarse counter is then trained on the translated data and applied to the real world. Moreover, according to the coarse predictions, GPR generates pseudo labels to improve the prediction quality on the real data. Finally, we retrain a final counter using these pseudo labels. Adaptation experiments on six real-world datasets demonstrate that the proposed method outperforms the state-of-the-art methods. The code and pre-trained models will be released as soon as possible. |
Tasks | Crowd Counting |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.03677v2 |
https://arxiv.org/pdf/1912.03677v2.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptive-crowd-counting-via-inter |
Repo | |
Framework | |
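The Gaussian-prior reconstruction idea rests on the standard crowd-counting target: a density map built by placing a normalized Gaussian at each (pseudo-)labeled head position, so the map integrates to the person count. A minimal sketch of that target (the paper's GPR operates on coarse predictions; the kernel width here is an assumption):

```python
import numpy as np

def gaussian_density_map(points, shape, sigma=2.0):
    """Place one normalised Gaussian per (row, col) head location so the
    resulting map sums to the number of people: the regression target."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape)
    for r, c in points:
        g = np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()   # each person contributes exactly 1 to the integral
    return dmap
```

Regenerating such maps from coarse peak predictions is what lets GPR turn noisy outputs into cleaner pseudo labels for retraining.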
A Novel Euler’s Elastica based Segmentation Approach for Noisy Images via using the Progressive Hedging Algorithm
Title | A Novel Euler’s Elastica based Segmentation Approach for Noisy Images via using the Progressive Hedging Algorithm |
Authors | Lu Tan, Ling Li, Wanquan Liu, Jie Sun, Min Zhang |
Abstract | Euler’s Elastica based unsupervised segmentation models have a strong capability of completing missing boundaries of existing objects in a clean image, but they do not work well for noisy images. This paper aims to establish an Euler’s Elastica based approach that properly deals with random noise to improve segmentation performance for noisy images. We solve the corresponding optimization problem using the progressive hedging algorithm (PHA) with a step length suggested by the alternating direction method of multipliers (ADMM). Technically, all the simplified convex versions of the subproblems derived from the major framework of PHA can be obtained by using the curvature weighted approach and the convex relaxation method. An alternating optimization strategy is then applied, with the merit of using some powerful accelerating techniques including the fast Fourier transform (FFT) and generalized soft-threshold formulas. Extensive experiments have been conducted on both synthetic and real images, validating significant gains of the proposed segmentation models and demonstrating the advantages of the developed algorithm. |
Tasks | |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07402v1 |
http://arxiv.org/pdf/1902.07402v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-eulers-elastica-based-segmentation |
Repo | |
Framework | |
PIXOR: Real-time 3D Object Detection from Point Clouds
Title | PIXOR: Real-time 3D Object Detection from Point Clouds |
Authors | Bin Yang, Wenjie Luo, Raquel Urtasun |
Abstract | We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. Computation speed is critical, as detection is a necessary component for safety. Existing approaches are, however, computationally expensive due to the high dimensionality of point clouds. We utilize the 3D data more efficiently by representing the scene from the Bird’s Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. The input representation, network architecture, and model optimization are specially designed to balance high accuracy and real-time efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection benchmark, and a large-scale 3D vehicle detection benchmark. On both datasets we show that the proposed detector surpasses other state-of-the-art methods notably in terms of Average Precision (AP), while still running at >28 FPS. |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection |
Published | 2019-02-17 |
URL | http://arxiv.org/abs/1902.06326v3 |
http://arxiv.org/pdf/1902.06326v3.pdf | |
PWC | https://paperswithcode.com/paper/pixor-real-time-3d-object-detection-from |
Repo | |
Framework | |
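The BEV input representation can be sketched as a simple occupancy rasterization of the point cloud: discretize x/y at a fixed resolution and slice z into height channels. The ranges and resolution below are hypothetical placeholders, not PIXOR's exact grid configuration:

```python
import numpy as np

def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                  z_range=(-2.5, 1.0), res=0.1, z_slices=35):
    """Rasterise (x, y, z) lidar points into a binary BEV occupancy volume
    of shape (z_slices, nx, ny); points outside the region are dropped."""
    nx = int(round((x_range[1] - x_range[0]) / res))
    ny = int(round((y_range[1] - y_range[0]) / res))
    dz = (z_range[1] - z_range[0]) / z_slices
    bev = np.zeros((z_slices, nx, ny), dtype=np.float32)
    for x, y, z in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]
                and z_range[0] <= z < z_range[1]):
            continue  # outside the region of interest
        ix = int((x - x_range[0]) / res)
        iy = int((y - y_range[0]) / res)
        iz = int((z - z_range[0]) / dz)
        bev[iz, ix, iy] = 1.0
    return bev
```

The payoff is that a standard 2D convolutional detector can then run over this grid, with height folded into channels, which is what makes the single-stage pixel-wise decoding fast.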
Some observations concerning Off Training Set (OTS) error
Title | Some observations concerning Off Training Set (OTS) error |
Authors | Jonathan Baxter |
Abstract | A form of generalisation error known as Off Training Set (OTS) error was recently introduced in [Wolpert, 1996b], along with a theorem showing that small training set error does not guarantee small OTS error, unless assumptions are made about the target function. Here it is shown that the applicability of this theorem is limited to models in which the distribution generating training data has no overlap with the distribution generating test data. It is argued that such a scenario is of limited relevance to machine learning. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1912.05915v1 |
https://arxiv.org/pdf/1912.05915v1.pdf | |
PWC | https://paperswithcode.com/paper/some-observations-concerning-off-training-set |
Repo | |
Framework | |
Recurring Concept Meta-learning for Evolving Data Streams
Title | Recurring Concept Meta-learning for Evolving Data Streams |
Authors | Robert Anderson, Yun Sing Koh, Gillian Dobbie, Albert Bifet |
Abstract | When concept drift is detected during classification in a data stream, a common remedy is to retrain a framework’s classifier. However, this loses useful information if the classifier has learnt the current concept well and that concept will recur in the future. Some frameworks retain and reuse classifiers, but it can be time-consuming to select an appropriate classifier to reuse, and these frameworks rarely match the accuracy of state-of-the-art ensemble approaches. For many data stream tasks, speed is important: fast, accurate frameworks are needed for time-dependent applications. We propose the Enhanced Concept Profiling Framework (ECPF), which aims to recognise recurring concepts and reuse a previously trained classifier, enabling accurate classification immediately following a drift. The novelty of ECPF is in how it uses the similarity of classifications on new data between a new classifier and existing classifiers to quickly identify the best classifier to reuse. It always trains both a new classifier and a reused classifier, and retains the more accurate classifier when concept drift occurs. Finally, it creates a copy of reused classifiers, so a classifier well-suited for a recurring concept will not be impacted by being trained on a different concept. In our experiments, ECPF classifies significantly more accurately than a state-of-the-art classifier reuse framework (Diversity Pool) and a state-of-the-art ensemble technique (Adaptive Random Forest) on synthetic datasets with recurring concepts. It classifies real-world datasets five times faster than Diversity Pool and six times faster than Adaptive Random Forest, and it is not significantly less accurate than either. |
Tasks | Meta-Learning |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08848v1 |
https://arxiv.org/pdf/1905.08848v1.pdf | |
PWC | https://paperswithcode.com/paper/recurring-concept-meta-learning-for-evolving |
Repo | |
Framework | |
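The classifier-selection idea at the core of ECPF can be sketched in a few lines: score each stored classifier by how often its predictions on recent instances agree with those of a freshly trained classifier, and reuse the best match. This is an illustrative reduction of the framework, not its full logic.

```python
def best_reuse_candidate(new_clf_preds, stored_preds_by_clf):
    """Pick the stored classifier whose predictions on recent instances
    agree most often with the newly trained classifier's predictions."""
    def agreement(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)
    return max(stored_preds_by_clf,
               key=lambda name: agreement(new_clf_preds, stored_preds_by_clf[name]))
```

Because agreement is computed from cached predictions rather than by re-evaluating every stored model, selection stays cheap, which is where the reported speed advantage over pool-based reuse comes from.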