Paper Group ANR 880
Unsupervised Lemmatization as Embeddings-Based Word Clustering. The Chi-Square Test of Distance Correlation. An Empirical Study of Propagation-based Methods for Video Object Segmentation. Event-based Feature Extraction Using Adaptive Selection Thresholds. Representation Learning on Graphs: A Reinforcement Learning Application. Multispectral and Hyp …
Unsupervised Lemmatization as Embeddings-Based Word Clustering
Title | Unsupervised Lemmatization as Embeddings-Based Word Clustering |
Authors | Rudolf Rosa, Zdeněk Žabokrtský |
Abstract | We focus on the task of unsupervised lemmatization, i.e. grouping together inflected forms of one word under one label (a lemma) without the use of annotated training data. We propose to perform agglomerative clustering of word forms with a novel distance measure. Our distance measure is based on the observation that inflections of the same word tend to be similar both string-wise and in meaning. We therefore combine word embedding cosine similarity, serving as a proxy for meaning similarity, with Jaro-Winkler edit distance. Our experiments on 23 languages show our approach to be promising, surpassing the baseline on 23 of the 28 evaluation datasets. |
Tasks | Lemmatization |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08528v1 |
https://arxiv.org/pdf/1908.08528v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-lemmatization-as-embeddings |
Repo | |
Framework | |
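A minimal sketch of the distance measure described above: embedding cosine dissimilarity mixed with Jaro-Winkler string dissimilarity. The mixing weight `alpha` and the exact combination formula are illustrative assumptions, not the paper's reported configuration.

```python
import math

def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(len1, len2) // 2 - 1
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # transpositions: matched characters appearing out of order
    t, k = 0, 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t /= 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1, s2, p=0.1):
    """Jaro similarity boosted by a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return j + prefix * p * (1.0 - j)

def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def lemma_distance(w1, vec1, w2, vec2, alpha=0.5):
    """Clustering distance: low when two word forms are close both in
    embedding space (meaning) and as strings (inflection)."""
    return alpha * (1.0 - cosine_sim(vec1, vec2)) + (1 - alpha) * (1.0 - jaro_winkler(w1, w2))
```

Forms of one lexeme such as "walking"/"walked" score low distance on both components, so agglomerative clustering tends to merge them before unrelated pairs.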
The Chi-Square Test of Distance Correlation
Title | The Chi-Square Test of Distance Correlation |
Authors | Cencheng Shen, Joshua T. Vogelstein |
Abstract | Distance correlation has gained much recent attention in the data science community: the sample statistic is straightforward to compute and asymptotically equals zero if and only if the variables are independent, making it an ideal choice to test any type of dependency structure given sufficient sample size. One major bottleneck is the testing process: because the null distribution of distance correlation depends on the underlying random variables and metric choice, it typically requires a permutation test to estimate the null and compute the p-value, which is very costly for large amounts of data. To overcome this difficulty, we propose a centered chi-square distribution, demonstrate that it well-approximates the limiting null distribution of unbiased distance correlation, and prove upper tail dominance and a distribution bound. The resulting distance correlation chi-square test is a nonparametric test for independence; it is valid and universally consistent using any strong negative type metric or characteristic kernel, enjoys finite-sample testing power similar to the standard permutation test, is provably the most powerful among all valid tests of distance correlation using known distributions, and is also applicable to K-sample and partial testing. |
Tasks | |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.12150v4 |
https://arxiv.org/pdf/1912.12150v4.pdf | |
PWC | https://paperswithcode.com/paper/the-chi-square-test-of-distance-correlation |
Repo | |
Framework | |
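The test described above can be sketched as follows: compute the unbiased (U-centered) sample distance correlation, then calibrate n·Dcor against a centered chi-square instead of a permutation null. Testing n·Dcor + 1 against the χ²₁ CDF is our reading of the proposal, so the exact calibration below is an assumption.

```python
import math
import numpy as np

def _u_center(D):
    """U-centering of a pairwise distance matrix (unbiased form)."""
    n = D.shape[0]
    row = D.sum(axis=1, keepdims=True) / (n - 2)
    col = D.sum(axis=0, keepdims=True) / (n - 2)
    total = D.sum() / ((n - 1) * (n - 2))
    U = D - row - col + total
    np.fill_diagonal(U, 0.0)
    return U

def unbiased_dcor(x, y):
    """Unbiased sample distance correlation under the Euclidean metric."""
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    n = x.shape[0]
    A = _u_center(np.linalg.norm(x[:, None] - x[None, :], axis=-1))
    B = _u_center(np.linalg.norm(y[:, None] - y[None, :], axis=-1))
    scale = n * (n - 3)
    dcov = (A * B).sum() / scale
    dvx, dvy = (A * A).sum() / scale, (B * B).sum() / scale
    denom = math.sqrt(dvx * dvy)
    return dcov / denom if denom > 0 else 0.0

def chi_square_pvalue(dcor, n):
    """Approximate p-value: n * dcor referred to chi2(1) - 1 (assumed calibration)."""
    t = n * dcor + 1.0
    if t <= 0:
        return 1.0
    return 1.0 - math.erf(math.sqrt(t / 2.0))  # 1 - CDF of chi2 with 1 d.o.f.
```

The appeal is that a single CDF evaluation replaces hundreds of permutation replicates, which is where the claimed speedup over the permutation test comes from.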
An Empirical Study of Propagation-based Methods for Video Object Segmentation
Title | An Empirical Study of Propagation-based Methods for Video Object Segmentation |
Authors | Hengkai Guo, Wenji Wang, Guanjun Guo, Huaxia Li, Jiachen Liu, Qian He, Xuefeng Xiao |
Abstract | While propagation-based approaches have achieved state-of-the-art performance for video object segmentation, the literature lacks a fair comparison of different methods using the same settings. In this paper, we carry out an empirical study of propagation-based methods. We view these approaches from a unified perspective and conduct a detailed ablation study of core methods, input cues, multi-object combination and training strategies. With careful design, our improved end-to-end memory networks achieve a global mean of 76.1 on the DAVIS 2017 val set. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12769v1 |
https://arxiv.org/pdf/1907.12769v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-propagation-based |
Repo | |
Framework | |
Event-based Feature Extraction Using Adaptive Selection Thresholds
Title | Event-based Feature Extraction Using Adaptive Selection Thresholds |
Authors | Saeed Afshar, Ying Xu, Jonathan Tapson, André van Schaik, Gregory Cohen |
Abstract | Unsupervised feature extraction algorithms form one of the most important building blocks in machine learning systems. These algorithms are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, because they were not designed for this purpose, such algorithms typically require significant simplification during implementation to meet hardware constraints, creating trade-offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate useful intermediary signals, which are valuable only in the context of neuromorphic hardware limitations. In this work, a novel event-based feature extraction method is proposed that focuses on these issues. The algorithm operates via simple adaptive selection thresholds, which allow a simpler implementation of network homeostasis than previous works by trading off a small amount of information loss in the form of missed events that fall outside the selection thresholds. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals that indicate network weight convergence without the need to access network weights. A novel heuristic method for network size selection is proposed which makes use of noise events and their feature representations. The use of selection thresholds is shown to produce network activation patterns that predict classification accuracy, allowing rapid evaluation and optimization of system parameters without the need to run back-end classifiers. The feature extraction method is tested on both the N-MNIST benchmarking dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested, with the results quantifying the resultant performance gains at each processing stage. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.07853v2 |
https://arxiv.org/pdf/1907.07853v2.pdf | |
PWC | https://paperswithcode.com/paper/event-based-feature-extraction-using-adaptive |
Repo | |
Framework | |
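A minimal sketch of learning with adaptive selection thresholds, loosely following the scheme described above: each feature neuron keeps its own threshold, which rises when the neuron wins an event and relaxes for all neurons when an event is missed (falls outside every threshold). The learning rate and threshold steps are hypothetical constants, not the paper's settings.

```python
import numpy as np

def feast_step(patch, weights, thresholds, eta=0.01, d_up=0.002, d_down=0.01):
    """One event update: match the event patch to the most similar feature
    whose selection threshold is cleared; adapt weights and thresholds."""
    p = patch / (np.linalg.norm(patch) + 1e-12)   # unit-normalise the event context
    sims = weights @ p                             # cosine similarity to each feature
    eligible = sims >= thresholds
    if eligible.any():
        winner = int(np.argmax(np.where(eligible, sims, -np.inf)))
        # move the winning feature toward the event, keep it unit norm
        weights[winner] = (1 - eta) * weights[winner] + eta * p
        weights[winner] /= np.linalg.norm(weights[winner]) + 1e-12
        thresholds[winner] += d_up                 # winner becomes more selective
    else:
        thresholds -= d_down                       # missed event: all neurons relax
    return weights, thresholds
```

The threshold trajectories themselves are the "intermediary signals" the abstract refers to: their convergence can be monitored without reading out the weights.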
Representation Learning on Graphs: A Reinforcement Learning Application
Title | Representation Learning on Graphs: A Reinforcement Learning Application |
Authors | Sephora Madjiheurem, Laura Toni |
Abstract | In this work, we study value function approximation in reinforcement learning (RL) problems with high dimensional state or action spaces via a generalized version of representation policy iteration (RPI). We consider the limitations of proto-value functions (PVFs) at accurately approximating the value function in low dimensions, and we highlight the importance of feature learning for improved low-dimensional value function approximation. Then, we adopt different representation learning algorithms on graphs to learn the basis functions that best represent the value function. We empirically show that node2vec, an algorithm for scalable feature learning in networks, and the Variational Graph Auto-Encoder consistently outperform the commonly used smooth proto-value functions in low-dimensional feature space. |
Tasks | Representation Learning |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05351v2 |
http://arxiv.org/pdf/1901.05351v2.pdf | |
PWC | https://paperswithcode.com/paper/representation-learning-on-graphs-a |
Repo | |
Framework | |
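The proto-value-function baseline the paper starts from can be sketched directly: PVFs are the smoothest eigenvectors of the graph Laplacian, and the value function is approximated linearly in that basis. Swapping `pvf_basis` for node2vec or VGAE embeddings gives the paper's proposed variants; the plain least-squares fit below is a simplification of representation policy iteration, for illustration only.

```python
import numpy as np

def pvf_basis(adj, k):
    """Proto-value functions: the k smoothest eigenvectors of the
    combinatorial graph Laplacian L = D - A."""
    deg = np.diag(adj.sum(axis=1))
    L = deg - adj
    vals, vecs = np.linalg.eigh(L)   # eigenvalues ascending: smoothest first
    return vecs[:, :k]

def fit_values(Phi, states, targets):
    """Linear value approximation V(s) ~ Phi[s] @ w, fitted by least squares
    on sampled states; returns the approximate value for every state."""
    w, *_ = np.linalg.lstsq(Phi[states], targets, rcond=None)
    return Phi @ w
```

With a full basis the fit is exact; the paper's point is that for small k, learned embeddings span the value function better than the Laplacian eigenvectors do.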
Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net
Title | Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net |
Authors | Qi Xie, Minghao Zhou, Qian Zhao, Deyu Meng, Wangmeng Zuo, Zongben Xu |
Abstract | Hyperspectral imaging can help better understand the characteristics of different materials, compared with traditional image systems. However, only high-resolution multispectral (HrMS) and low-resolution hyperspectral (LrHS) images can generally be captured at video rate in practice. In this paper, we propose a model-based deep learning approach for merging an HrMS image and an LrHS image to generate a high-resolution hyperspectral (HrHS) image. Specifically, we construct a novel MS/HS fusion model which takes into consideration the observation models of the low-resolution images and the low-rankness along the spectral mode of the HrHS image. We then design an iterative algorithm to solve the model by exploiting the proximal gradient method. Then, by unfolding the designed algorithm, we construct a deep network, called MS/HS Fusion Net, learning the proximal operators and model parameters with convolutional neural networks. Experimental results on simulated and real data substantiate the superiority of our method both visually and quantitatively as compared with state-of-the-art methods along this line of research. |
Tasks | |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.03281v1 |
http://arxiv.org/pdf/1901.03281v1.pdf | |
PWC | https://paperswithcode.com/paper/multispectral-and-hyperspectral-image-fusion-1 |
Repo | |
Framework | |
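The algorithmic core being unfolded here is a proximal gradient iteration. A generic sketch, with the learned proximal operator replaced by plain soft-thresholding (the paper instead learns the proximal operators and model parameters with CNNs, so this is only the scaffold of the idea):

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def proximal_gradient(A, b, lam=0.1, step=None, iters=200):
    """ISTA: alternate a gradient step on the data term ||Ax - b||^2 / 2
    with a proximal step. Each iteration corresponds to one network stage
    when the scheme is unfolded."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - step * A.T @ (A @ x - b), step * lam)
    return x
```

Unfolding replaces the fixed `soft_threshold` and `step` with stage-wise learnable modules, which is what turns the iterative solver into MS/HS Fusion Net.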
AutoScale: Learning to Scale for Crowd Counting
Title | AutoScale: Learning to Scale for Crowd Counting |
Authors | Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Masayoshi Tomizuka, Xiang Bai |
Abstract | Crowd counting in images is a widely explored but challenging task. Though recent convolutional neural network (CNN) methods have achieved great progress, it is still difficult to accurately count, and even to precisely localize, people in very dense regions. A major issue is that dense regions usually consist of many instances of small size, and thus exhibit very different density patterns compared with sparse regions. Localizing or detecting dense small objects is also very delicate. In this paper, instead of processing an image pyramid and aggregating multi-scale features, we propose a simple yet effective Learning to Scale (L2S) module to cope with significant scale variations in both regression and localization. Specifically, the L2S module aims to automatically scale dense regions to similar and reasonable scale levels. This alleviates the density pattern shift for density regression methods and facilitates the localization of small instances. Besides, we also introduce a novel distance label map combined with a customized adapted cross-entropy loss for precise person localization. Extensive experiments demonstrate that the proposed method, termed AutoScale, consistently improves upon state-of-the-art methods on both regression and localization benchmarks on three widely used datasets. The proposed AutoScale also demonstrates noteworthy transferability under cross-dataset validation on different datasets. |
Tasks | Crowd Counting |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09632v1 |
https://arxiv.org/pdf/1912.09632v1.pdf | |
PWC | https://paperswithcode.com/paper/autoscale-learning-to-scale-for-crowd |
Repo | |
Framework | |
Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG
Title | Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG |
Authors | Yujia Jin, Aaron Sidford |
Abstract | Given a data matrix $\mathbf{A} \in \mathbb{R}^{n \times d}$, principal component projection (PCP) and principal component regression (PCR), i.e. projection and regression restricted to the top-eigenspace of $\mathbf{A}$, are fundamental problems in machine learning, optimization, and numerical analysis. In this paper we provide the first algorithms that solve these problems in nearly linear time for fixed eigenvalue distribution and large n. This improves upon previous methods which have superlinear running times when both the number of top eigenvalues and inverse gap between eigenspaces is large. We achieve our results by applying rational approximations to reduce PCP and PCR to solving asymmetric linear systems which we solve by a variant of SVRG. We corroborate these findings with preliminary empirical experiments. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06517v1 |
https://arxiv.org/pdf/1910.06517v1.pdf | |
PWC | https://paperswithcode.com/paper/principal-component-projection-and-regression |
Repo | |
Framework | |
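For reference, the problem being accelerated can be stated exactly, though expensively, via an eigendecomposition: principal component projection maps a vector onto the eigenvectors of AᵀA whose eigenvalue exceeds a threshold. The paper's contribution is avoiding this cubic-time computation; the sketch below is only the definition (`lam` is the eigenvalue threshold, an assumed parameterization of the top-eigenspace).

```python
import numpy as np

def pcp_exact(A, b, lam):
    """Exact principal component projection: project b onto the span of the
    eigenvectors of A^T A with eigenvalue above lam. O(d^3) reference, the
    computation the nearly-linear-time SVRG method is designed to avoid."""
    vals, vecs = np.linalg.eigh(A.T @ A)
    top = vecs[:, vals > lam]          # orthonormal basis of the top-eigenspace
    return top @ (top.T @ b)           # orthogonal projection onto that basis
```

PCR is the analogous statement for regression restricted to the same eigenspace; both reduce, in the paper, to asymmetric linear systems solved by an SVRG variant.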
Global Optimality Guarantees for Nonconvex Unsupervised Video Segmentation
Title | Global Optimality Guarantees for Nonconvex Unsupervised Video Segmentation |
Authors | Brendon G. Anderson, Somayeh Sojoudi |
Abstract | In this paper, we consider the problem of unsupervised video object segmentation via background subtraction. Specifically, we pose the nonsemantic extraction of a video’s moving objects as a nonconvex optimization problem via a sum of sparse and low-rank matrices. The resulting formulation, a nonnegative variant of robust principal component analysis, is more computationally tractable than its commonly employed convex relaxation, although not generally solvable to global optimality. In spite of this limitation, we derive intuitive and interpretable conditions on the video data under which the uniqueness and global optimality of the object segmentation are guaranteed using local search methods. We illustrate these novel optimality criteria through example segmentations using real video data. |
Tasks | Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04409v2 |
https://arxiv.org/pdf/1907.04409v2.pdf | |
PWC | https://paperswithcode.com/paper/global-optimality-guarantees-for-nonconvex |
Repo | |
Framework | |
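The sum-of-sparse-and-low-rank model above can be sketched with a naive alternating scheme: a truncated SVD for the low-rank background, soft-thresholding for the sparse foreground, and clipping for nonnegativity. This is a simplification for illustration; the paper analyzes when local search on the nonconvex model is globally optimal, not this particular heuristic. `lam`, the rank, and the iteration count are hypothetical.

```python
import numpy as np

def nonneg_rpca(D, rank=1, lam=0.25, iters=50):
    """Alternating sketch of D ~ L + S with L low-rank (static background)
    and S sparse (moving objects), both kept nonnegative by clipping."""
    S = np.zeros_like(D)
    for _ in range(iters):
        # low-rank step: best rank-r fit to the residual, clipped to >= 0
        U, s, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = np.clip((U[:, :rank] * s[:rank]) @ Vt[:rank], 0.0, None)
        # sparse step: soft-threshold the residual, clipped to >= 0
        R = D - L
        S = np.clip(np.sign(R) * np.maximum(np.abs(R) - lam, 0.0), 0.0, None)
    return L, S
```

On a toy "frame matrix" with a constant background and one bright pixel, the scheme separates the two, which is the nonsemantic object extraction described in the abstract.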
Robust Machine Translation with Domain Sensitive Pseudo-Sources: Baidu-OSU WMT19 MT Robustness Shared Task System Report
Title | Robust Machine Translation with Domain Sensitive Pseudo-Sources: Baidu-OSU WMT19 MT Robustness Shared Task System Report |
Authors | Renjie Zheng, Hairong Liu, Mingbo Ma, Baigong Zheng, Liang Huang |
Abstract | This paper describes the machine translation system developed jointly by Baidu Research and Oregon State University for the WMT 2019 Machine Translation Robustness Shared Task. Translation of social media is a very challenging problem, since its style is very different from normal parallel corpora (e.g. News) and it also includes various types of noise. To make matters worse, the amount of social media parallel corpora is extremely limited. In this paper, we use a domain-sensitive training method which leverages a large amount of parallel data from popular domains together with a small amount of parallel data from social media. Furthermore, we generate a parallel dataset with pseudo noisy source sentences which are back-translated from monolingual data using a model trained in a similarly domain-sensitive way. We achieve an improvement of more than 10 BLEU in both En-Fr and Fr-En translation compared with the baseline methods. |
Tasks | Machine Translation |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08393v2 |
https://arxiv.org/pdf/1906.08393v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-machine-translation-with-domain |
Repo | |
Framework | |
Domain-adaptive Crowd Counting via Inter-domain Features Segregation and Gaussian-prior Reconstruction
Title | Domain-adaptive Crowd Counting via Inter-domain Features Segregation and Gaussian-prior Reconstruction |
Authors | Junyu Gao, Tao Han, Qi Wang, Yuan Yuan |
Abstract | Recently, crowd counting using supervised learning has achieved a remarkable improvement. Nevertheless, most counters rely on a large amount of manually labeled data. With the release of synthetic crowd data, a potential alternative is transferring knowledge from it to real data without any manual label. However, there is no method to effectively suppress domain gaps and output elaborate density maps during the transfer. To remedy these problems, this paper proposes a Domain-Adaptive Crowd Counting (DACC) framework, which consists of Inter-domain Features Segregation (IFS) and Gaussian-prior Reconstruction (GPR). To be specific, IFS translates synthetic data to realistic images, comprising domain-shared feature extraction and domain-independent feature decoration. A coarse counter is then trained on the translated data and applied to the real world. Moreover, according to the coarse predictions, GPR generates pseudo labels to improve the prediction quality on the real data. Finally, we retrain a final counter using these pseudo labels. Adaptation experiments on six real-world datasets demonstrate that the proposed method outperforms the state-of-the-art methods. The code and pre-trained models will be released as soon as possible. |
Tasks | Crowd Counting |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.03677v2 |
https://arxiv.org/pdf/1912.03677v2.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptive-crowd-counting-via-inter |
Repo | |
Framework | |
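The Gaussian-prior reconstruction idea rests on the standard crowd-counting target: a density map built by placing a normalized Gaussian at each (pseudo-)labeled head position, so the map integrates to the person count. A minimal sketch of that target (the paper's GPR operates on coarse predictions; the kernel width here is an assumption):

```python
import numpy as np

def gaussian_density_map(points, shape, sigma=2.0):
    """Place one normalised Gaussian per (row, col) head location so the
    resulting map sums to the number of people: the regression target."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape)
    for r, c in points:
        g = np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()   # each person contributes exactly 1 to the integral
    return dmap
```

Regenerating such maps from coarse peak predictions is what lets GPR turn noisy outputs into cleaner pseudo labels for retraining.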
A Novel Euler’s Elastica based Segmentation Approach for Noisy Images via using the Progressive Hedging Algorithm
Title | A Novel Euler’s Elastica based Segmentation Approach for Noisy Images via using the Progressive Hedging Algorithm |
Authors | Lu Tan, Ling Li, Wanquan Liu, Jie Sun, Min Zhang |
Abstract | Euler’s Elastica based unsupervised segmentation models have a strong capability of completing missing boundaries of existing objects in a clean image, but they do not work well for noisy images. This paper aims to establish an Euler’s Elastica based approach that properly deals with random noise to improve segmentation performance for noisy images. We solve the corresponding optimization problem using the progressive hedging algorithm (PHA) with a step length suggested by the alternating direction method of multipliers (ADMM). Technically, all the simplified convex versions of the subproblems derived from the major framework of PHA can be obtained by using the curvature weighted approach and the convex relaxation method. An alternating optimization strategy is then applied, with the merit of using some powerful accelerating techniques including the fast Fourier transform (FFT) and generalized soft-threshold formulas. Extensive experiments have been conducted on both synthetic and real images, validating significant gains of the proposed segmentation models and demonstrating the advantages of the developed algorithm. |
Tasks | |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07402v1 |
http://arxiv.org/pdf/1902.07402v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-eulers-elastica-based-segmentation |
Repo | |
Framework | |
PIXOR: Real-time 3D Object Detection from Point Clouds
Title | PIXOR: Real-time 3D Object Detection from Point Clouds |
Authors | Bin Yang, Wenjie Luo, Raquel Urtasun |
Abstract | We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. Computation speed is critical, as detection is a necessary component for safety. Existing approaches are, however, computationally expensive due to the high dimensionality of point clouds. We utilize the 3D data more efficiently by representing the scene from the Bird’s Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. The input representation, network architecture, and model optimization are specially designed to balance high accuracy and real-time efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection benchmark, and a large-scale 3D vehicle detection benchmark. On both datasets we show that the proposed detector surpasses other state-of-the-art methods notably in terms of Average Precision (AP), while still running at >28 FPS. |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection |
Published | 2019-02-17 |
URL | http://arxiv.org/abs/1902.06326v3 |
http://arxiv.org/pdf/1902.06326v3.pdf | |
PWC | https://paperswithcode.com/paper/pixor-real-time-3d-object-detection-from |
Repo | |
Framework | |
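The BEV input representation can be sketched as a simple occupancy rasterization of the point cloud: discretize x/y at a fixed resolution and slice z into height channels. The ranges and resolution below are hypothetical placeholders, not PIXOR's exact grid configuration:

```python
import numpy as np

def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                  z_range=(-2.5, 1.0), res=0.1, z_slices=35):
    """Rasterise (x, y, z) lidar points into a binary BEV occupancy volume
    of shape (z_slices, nx, ny); points outside the region are dropped."""
    nx = int(round((x_range[1] - x_range[0]) / res))
    ny = int(round((y_range[1] - y_range[0]) / res))
    dz = (z_range[1] - z_range[0]) / z_slices
    bev = np.zeros((z_slices, nx, ny), dtype=np.float32)
    for x, y, z in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]
                and z_range[0] <= z < z_range[1]):
            continue  # outside the region of interest
        ix = int((x - x_range[0]) / res)
        iy = int((y - y_range[0]) / res)
        iz = int((z - z_range[0]) / dz)
        bev[iz, ix, iy] = 1.0
    return bev
```

The payoff is that a standard 2D convolutional detector can then run over this grid, with height folded into channels, which is what makes the single-stage pixel-wise decoding fast.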
Some observations concerning Off Training Set (OTS) error
Title | Some observations concerning Off Training Set (OTS) error |
Authors | Jonathan Baxter |
Abstract | A form of generalisation error known as Off Training Set (OTS) error was recently introduced in [Wolpert, 1996b], along with a theorem showing that small training set error does not guarantee small OTS error, unless assumptions are made about the target function. Here it is shown that the applicability of this theorem is limited to models in which the distribution generating training data has no overlap with the distribution generating test data. It is argued that such a scenario is of limited relevance to machine learning. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1912.05915v1 |
https://arxiv.org/pdf/1912.05915v1.pdf | |
PWC | https://paperswithcode.com/paper/some-observations-concerning-off-training-set |
Repo | |
Framework | |
Recurring Concept Meta-learning for Evolving Data Streams
Title | Recurring Concept Meta-learning for Evolving Data Streams |
Authors | Robert Anderson, Yun Sing Koh, Gillian Dobbie, Albert Bifet |
Abstract | When concept drift is detected during classification in a data stream, a common remedy is to retrain a framework’s classifier. However, this loses useful information if the classifier has learnt the current concept well and that concept will recur in the future. Some frameworks retain and reuse classifiers, but it can be time-consuming to select an appropriate classifier to reuse, and these frameworks rarely match the accuracy of state-of-the-art ensemble approaches. For many data stream tasks, speed is important: fast, accurate frameworks are needed for time-dependent applications. We propose the Enhanced Concept Profiling Framework (ECPF), which aims to recognise recurring concepts and reuse a previously trained classifier, enabling accurate classification immediately following a drift. The novelty of ECPF is in how it uses the similarity of classifications on new data between a new classifier and existing classifiers to quickly identify the best classifier to reuse. It always trains both a new classifier and a reused classifier, and retains the more accurate classifier when concept drift occurs. Finally, it creates a copy of reused classifiers, so a classifier well-suited for a recurring concept will not be impacted by being trained on a different concept. In our experiments, ECPF classifies significantly more accurately than a state-of-the-art classifier reuse framework (Diversity Pool) and a state-of-the-art ensemble technique (Adaptive Random Forest) on synthetic datasets with recurring concepts. It classifies real-world datasets five times faster than Diversity Pool and six times faster than Adaptive Random Forest, and it is not significantly less accurate than either. |
Tasks | Meta-Learning |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08848v1 |
https://arxiv.org/pdf/1905.08848v1.pdf | |
PWC | https://paperswithcode.com/paper/recurring-concept-meta-learning-for-evolving |
Repo | |
Framework | |
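The classifier-selection idea at the core of ECPF can be sketched in a few lines: score each stored classifier by how often its predictions on recent instances agree with those of a freshly trained classifier, and reuse the best match. This is an illustrative reduction of the framework, not its full logic.

```python
def best_reuse_candidate(new_clf_preds, stored_preds_by_clf):
    """Pick the stored classifier whose predictions on recent instances
    agree most often with the newly trained classifier's predictions."""
    def agreement(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)
    return max(stored_preds_by_clf,
               key=lambda name: agreement(new_clf_preds, stored_preds_by_clf[name]))
```

Because agreement is computed from cached predictions rather than by re-evaluating every stored model, selection stays cheap, which is where the reported speed advantage over pool-based reuse comes from.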