July 28, 2019

3048 words 15 mins read

Paper Group ANR 462

Neural Decomposition of Time-Series Data for Effective Generalization. Learning weakly supervised multimodal phoneme embeddings. Position-based Content Attention for Time Series Forecasting with Sequence-to-sequence RNNs. Model compression as constrained optimization, with application to neural nets. Part II: quantization. Super-Trajectory for Vide …

Neural Decomposition of Time-Series Data for Effective Generalization

Title Neural Decomposition of Time-Series Data for Effective Generalization
Authors Luke B. Godfrey, Michael S. Gashler
Abstract We present a neural network technique for the analysis and extrapolation of time-series data called Neural Decomposition (ND). Units with a sinusoidal activation function are used to perform a Fourier-like decomposition of training samples into a sum of sinusoids, augmented by units with nonperiodic activation functions to capture linear trends and other nonperiodic components. We show how careful weight initialization can be combined with regularization to form a simple model that generalizes well. Our method generalizes effectively on the Mackey-Glass series, a dataset of unemployment rates as reported by the U.S. Department of Labor Statistics, a time-series of monthly international airline passengers, the monthly ozone concentration in downtown Los Angeles, and an unevenly sampled time-series of oxygen isotope measurements from a cave in north India. We find that ND outperforms popular time-series forecasting techniques including LSTM, echo state networks, ARIMA, SARIMA, SVR with a radial basis function, and Gashler and Ashmore’s model.
Tasks Time Series, Time Series Forecasting
Published 2017-05-25
URL http://arxiv.org/abs/1705.09137v2
PDF http://arxiv.org/pdf/1705.09137v2.pdf
PWC https://paperswithcode.com/paper/neural-decomposition-of-time-series-data-for
Repo
Framework
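
The decomposition described above (a sinusoidal basis plus a non-periodic trend component) can be illustrated with a much simpler stand-in: fix the frequencies to the Fourier frequencies of the training window and fit the sinusoid amplitudes and a linear trend by least squares. This is only a toy sketch of the idea, not the paper's model, which trains all parameters jointly with careful initialization and regularization.

```python
# Toy sketch of a Fourier-style decomposition with a linear trend
# (illustrative only; Neural Decomposition trains these weights jointly
# with careful initialization and regularization rather than solving a
# single least-squares problem).
import numpy as np

def fit_sinusoids_plus_trend(y, n_freqs=10):
    """Fit y(t) ~ a0 + a1*t + sum_k [b_k sin(w_k t) + c_k cos(w_k t)]."""
    t = np.arange(len(y))
    # Frequencies fixed at the first n_freqs Fourier frequencies of the window.
    ws = 2 * np.pi * np.arange(1, n_freqs + 1) / len(y)
    cols = [np.ones_like(t, dtype=float), t.astype(float)]
    for w in ws:
        cols += [np.sin(w * t), np.cos(w * t)]
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y, rcond=None)

    def predict(t_new):
        t_new = np.asarray(t_new, dtype=float)
        cols = [np.ones_like(t_new), t_new]
        for w in ws:
            cols += [np.sin(w * t_new), np.cos(w * t_new)]
        return np.column_stack(cols) @ coef

    return predict

# Usage: extrapolate a noisy seasonal series with an upward trend.
t = np.arange(120)
y = 0.05 * t + np.sin(2 * np.pi * t / 12) + 0.1 * np.random.randn(120)
model = fit_sinusoids_plus_trend(y, n_freqs=6)
forecast = model(np.arange(120, 144))   # next 24 steps
```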

Learning weakly supervised multimodal phoneme embeddings

Title Learning weakly supervised multimodal phoneme embeddings
Authors Rahma Chaabouni, Ewan Dunbar, Neil Zeghidour, Emmanuel Dupoux
Abstract Recent works have explored deep architectures for learning multimodal speech representation (e.g. audio and images, articulation and audio) in a supervised way. Here we investigate the role of combining different speech modalities, i.e. audio and visual information representing the lip movements, in a weakly supervised way using Siamese networks and lexical same-different side information. In particular, we ask whether one modality can benefit from the other to provide a richer representation for phone recognition in a weakly supervised setting. We introduce mono-task and multi-task methods for merging speech and visual modalities for phone recognition. The mono-task learning consists in applying a Siamese network on the concatenation of the two modalities, while the multi-task learning receives several different combinations of modalities at train time. We show that multi-task learning enhances discriminability for visual and multimodal inputs while minimally impacting auditory inputs. Furthermore, we present a qualitative analysis of the obtained phone embeddings, and show that cross-modal visual input can improve the discriminability of phonological features which are visually discernible (rounding, open/close, labial place of articulation), resulting in representations that are closer to abstract linguistic features than those based on audio only.
Tasks Multi-Task Learning
Published 2017-04-23
URL http://arxiv.org/abs/1704.06913v2
PDF http://arxiv.org/pdf/1704.06913v2.pdf
PWC https://paperswithcode.com/paper/learning-weakly-supervised-multimodal-phoneme
Repo
Framework
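
As a rough illustration of the mono-task setting described above, a hedged PyTorch sketch: a shared encoder over concatenated audio and visual features, trained with lexical same/different side information through a contrastive-style loss. Dimensions, margin and architecture are assumptions, not the paper's configuration.

```python
# Minimal sketch of a Siamese embedding network trained with
# same/different side information (mono-task setting); sizes, the loss
# margin and the architecture are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, audio_dim=40, visual_dim=20, embed_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, audio, visual):
        return self.net(torch.cat([audio, visual], dim=-1))

def same_different_loss(emb_a, emb_b, same, margin=0.5):
    """Pull 'same' pairs together (cosine), push 'different' pairs apart."""
    cos = F.cosine_similarity(emb_a, emb_b, dim=-1)
    loss_same = (1 - cos) * same
    loss_diff = torch.clamp(cos - margin, min=0) * (1 - same)
    return (loss_same + loss_diff).mean()

# One toy training step on random tensors standing in for paired tokens.
enc = SiameseEncoder()
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
a1, v1 = torch.randn(32, 40), torch.randn(32, 20)
a2, v2 = torch.randn(32, 40), torch.randn(32, 20)
same = torch.randint(0, 2, (32,)).float()   # 1 = same word pair
loss = same_different_loss(enc(a1, v1), enc(a2, v2), same)
opt.zero_grad(); loss.backward(); opt.step()
```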

Position-based Content Attention for Time Series Forecasting with Sequence-to-sequence RNNs

Title Position-based Content Attention for Time Series Forecasting with Sequence-to-sequence RNNs
Authors Yagmur G. Cinar, Hamid Mirisaee, Parantapa Goswami, Eric Gaussier, Ali Ait-Bachir, Vadim Strijov
Abstract We propose here an extended attention model for sequence-to-sequence recurrent neural networks (RNNs) designed to capture (pseudo-)periods in time series. This extended attention model can be deployed on top of any RNN and is shown to yield state-of-the-art performance for time series forecasting on several univariate and multivariate time series.
Tasks Time Series, Time Series Forecasting
Published 2017-03-29
URL http://arxiv.org/abs/1703.10089v2
PDF http://arxiv.org/pdf/1703.10089v2.pdf
PWC https://paperswithcode.com/paper/position-based-content-attention-for-time
Repo
Framework
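
A hedged sketch of the general idea: attention scores that mix content with a learned per-lag position term, so the decoder can favour encoder steps a (pseudo-)period away. The exact parameterization used in the paper differs; this only shows the shape of such an extended attention module.

```python
# Sketch of content attention augmented with a position (lag) term for
# pseudo-periodic series; the paper's exact formulation differs, this is
# only the general shape of the idea.
import torch
import torch.nn as nn

class PositionContentAttention(nn.Module):
    def __init__(self, hidden_dim, max_lag):
        super().__init__()
        self.w_enc = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w_dec = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.lag_bias = nn.Parameter(torch.zeros(max_lag))  # learned score per lag
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, enc_states, dec_state, dec_step):
        # enc_states: (T, hidden), dec_state: (hidden,), dec_step: forecast horizon index
        T = enc_states.size(0)
        content = self.v(torch.tanh(self.w_enc(enc_states) + self.w_dec(dec_state))).squeeze(-1)
        # Lag of each encoder step relative to the value being forecast.
        lags = (dec_step + T - torch.arange(T)).clamp(max=self.lag_bias.numel() - 1)
        weights = torch.softmax(content + self.lag_bias[lags], dim=0)
        context = (weights.unsqueeze(-1) * enc_states).sum(dim=0)
        return context, weights

att = PositionContentAttention(hidden_dim=32, max_lag=48)
context, weights = att(torch.randn(24, 32), torch.randn(32), dec_step=3)
```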

Model compression as constrained optimization, with application to neural nets. Part II: quantization

Title Model compression as constrained optimization, with application to neural nets. Part II: quantization
Authors Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev
Abstract We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal. The codebook can be optimally learned jointly with the net, or fixed, as for binarization or ternarization approaches. Previous work has quantized the weights of the reference net, or incorporated rounding operations in the backpropagation algorithm, but this has no guarantee of converging to a loss-optimal, quantized net. We describe a new approach based on the recently proposed framework of model compression as constrained optimization \citep{Carreir17a}. This results in a simple iterative “learning-compression” algorithm, which alternates a step that learns a net of continuous weights with a step that quantizes (or binarizes/ternarizes) the weights, and is guaranteed to converge to a local optimum of the loss for quantized nets. We develop algorithms for an adaptive codebook or a (partially) fixed codebook. The latter includes binarization, ternarization, powers-of-two and other important particular cases. We show experimentally that we can achieve much higher compression rates than previous quantization work (even using just 1 bit per weight) with negligible loss degradation.
Tasks Model Compression, Quantization
Published 2017-07-13
URL http://arxiv.org/abs/1707.04319v1
PDF http://arxiv.org/pdf/1707.04319v1.pdf
PWC https://paperswithcode.com/paper/model-compression-as-constrained-optimization
Repo
Framework
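
The “learning-compression” alternation described above pairs an L step (retrain the net with a penalty pulling weights toward their quantized values) with a C step (re-fit the codebook and assignments). With an adaptive codebook, the C step reduces to 1-D k-means over the weights; below is a minimal sketch of that step alone, with the L step omitted.

```python
# Sketch of the C step (adaptive codebook) from a learning-compression
# style quantization loop: 1-D k-means over the weights. The L step,
# omitted here, would retrain the network with a quadratic penalty
# pulling each weight toward its assigned codebook entry.
import numpy as np

def quantize_weights(w, k=4, iters=50):
    """Return (quantized weights, codebook) via 1-D Lloyd's algorithm."""
    w = np.asarray(w, dtype=float)
    # Initialize the codebook with evenly spaced quantiles of the weights.
    codebook = np.quantile(w, np.linspace(0, 1, k))
    for _ in range(iters):
        assign = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = w[assign == j].mean()
    return codebook[assign], codebook

# Example: quantize a weight vector to 2 bits (4 codebook entries).
weights = np.random.randn(1000)
w_q, codebook = quantize_weights(weights, k=4)
```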

Super-Trajectory for Video Segmentation

Title Super-Trajectory for Video Segmentation
Authors Wenguan Wang, Jianbing Shen, Jianwen Xie, Fatih Porikli
Abstract We introduce a novel semi-supervised video segmentation approach based on an efficient video representation, called “super-trajectory”. Each super-trajectory corresponds to a group of compact trajectories that exhibit consistent motion patterns, similar appearance and close spatiotemporal relationships. We generate trajectories using a probabilistic model, which handles occlusions and drifts in a robust and natural way. To reliably group trajectories, we adopt a modified version of the density peaks based clustering algorithm that allows capturing rich spatiotemporal relations among trajectories in the clustering process. The presented video representation is discriminative enough to accurately propagate the initial annotations in the first frame onto the remaining video frames. Extensive experimental analysis on challenging benchmarks demonstrates that our method is capable of distinguishing the target objects from complex backgrounds and even re-identifying them after long-term occlusions.
Tasks Video Semantic Segmentation
Published 2017-02-28
URL http://arxiv.org/abs/1702.08634v4
PDF http://arxiv.org/pdf/1702.08634v4.pdf
PWC https://paperswithcode.com/paper/super-trajectory-for-video-segmentation
Repo
Framework
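
The trajectory grouping relies on a modified density-peaks clustering. As a point of reference, here is a plain (unmodified) density-peaks sketch on generic feature vectors standing in for trajectory descriptors; the spatiotemporal affinities the paper adds are not modeled.

```python
# Plain density-peaks clustering (Rodriguez & Laio style) as a stand-in
# for the modified version used to group trajectories; trajectory
# features and spatiotemporal affinities are not modeled here.
import numpy as np

def density_peaks(X, d_c, n_clusters):
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    rho = (dist < d_c).sum(axis=1) - 1           # local density
    order = np.argsort(-rho)                     # points by decreasing density
    delta = np.full(len(X), dist.max())
    nearest_denser = np.full(len(X), -1)
    for i_rank, i in enumerate(order[1:], start=1):
        denser = order[:i_rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i], nearest_denser[i] = dist[i, j], j
    centers = np.argsort(-(rho * delta))[:n_clusters]   # high rho and high delta
    labels = np.full(len(X), -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:                              # assign to nearest denser point's cluster
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels

labels = density_peaks(np.random.rand(200, 2), d_c=0.1, n_clusters=3)
```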

Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes

Title Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes
Authors Takuya Akiba, Shuji Suzuki, Keisuke Fukuda
Abstract We demonstrate that training ResNet-50 on ImageNet for 90 epochs can be achieved in 15 minutes with 1024 Tesla P100 GPUs. This was made possible by using a large minibatch size of 32k. To maintain accuracy with this large minibatch size, we employed several techniques such as RMSprop warm-up, batch normalization without moving averages, and a slow-start learning rate schedule. This paper also describes the details of the hardware and software of the system used to achieve the above performance.
Tasks
Published 2017-11-12
URL http://arxiv.org/abs/1711.04325v1
PDF http://arxiv.org/pdf/1711.04325v1.pdf
PWC https://paperswithcode.com/paper/extremely-large-minibatch-sgd-training-resnet
Repo
Framework
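
The slow-start schedule mentioned above can be pictured as a warm-up ramp from a base learning rate to the linearly scaled rate, followed by step decay. The sketch below uses illustrative numbers, not the paper's exact settings.

```python
# Generic slow-start (warm-up) learning-rate schedule of the kind used
# for very large minibatches; the base rate, scaling rule, and warm-up
# length here are illustrative assumptions, not the paper's values.
def lr_at_epoch(epoch, base_lr=0.1, batch_size=32768, ref_batch=256,
                warmup_epochs=5, decay_epochs=(30, 60, 80), decay=0.1):
    scaled_lr = base_lr * batch_size / ref_batch   # linear scaling rule
    if epoch < warmup_epochs:                      # ramp up from the base rate
        return base_lr + (scaled_lr - base_lr) * epoch / warmup_epochs
    lr = scaled_lr
    for d in decay_epochs:                         # step decay afterwards
        if epoch >= d:
            lr *= decay
    return lr

schedule = [lr_at_epoch(e) for e in range(90)]
```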

A Focal Any-Angle Path-finding Algorithm Based on A* on Visibility Graphs

Title A Focal Any-Angle Path-finding Algorithm Based on A* on Visibility Graphs
Authors Pei Cao, Zhaoyan Fan, Robert X. Gao, Jiong Tang
Abstract In this research, we investigate the subject of path-finding. A pruned version of the visibility graph based on Candidate Vertices is formulated, followed by a new visibility check technique. This combination enables us to quickly identify the useful vertices and thus find the optimal path more efficiently. The algorithm proposed is demonstrated on various path-finding cases. The performance of the new technique on visibility graphs is compared to the traditional A* on Grids, Theta* and A* on Visibility Graphs in terms of path length, number of nodes evaluated, as well as computational time. The key algorithmic contribution is that the new approach combines the merits of grid-based and visibility graph-based methods and thus yields better overall performance.
Tasks
Published 2017-06-09
URL http://arxiv.org/abs/1706.03144v1
PDF http://arxiv.org/pdf/1706.03144v1.pdf
PWC https://paperswithcode.com/paper/a-focal-any-angle-path-finding-algorithm
Repo
Framework
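
The method builds on A* run over a (pruned) visibility graph. For orientation, here is a textbook A* over a weighted graph with a Euclidean heuristic; the candidate-vertex pruning and the new visibility check are not reproduced.

```python
# Textbook A* over a weighted graph with a Euclidean heuristic, the
# building block run on the pruned visibility graph; the candidate-vertex
# pruning and visibility checks themselves are omitted.
import heapq
import itertools
import math

def a_star(graph, coords, start, goal):
    """graph: {node: [(neighbor, edge_cost), ...]}; coords: {node: (x, y)} for the heuristic."""
    def h(n):
        (x1, y1), (x2, y2) = coords[n], coords[goal]
        return math.hypot(x1 - x2, y1 - y2)        # straight-line distance, admissible

    tie = itertools.count()                        # tie-breaker so the heap never compares nodes
    open_heap = [(h(start), 0.0, next(tie), start, None)]
    parent, closed = {}, set()
    while open_heap:
        _, g, _, node, prev = heapq.heappop(open_heap)
        if node in closed:
            continue
        closed.add(node)
        parent[node] = prev
        if node == goal:
            path = [node]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1], g
        for nbr, cost in graph.get(node, []):
            if nbr not in closed:
                heapq.heappush(open_heap, (g + cost + h(nbr), g + cost, next(tie), nbr, node))
    return None, math.inf

# Tiny visibility-graph-like example: edge costs are (approximate) Euclidean distances.
coords = {"s": (0, 0), "a": (1, 0), "b": (1, 1), "g": (2, 0)}
graph = {"s": [("a", 1.0), ("b", 1.415)], "a": [("g", 1.0)], "b": [("g", 1.415)]}
print(a_star(graph, coords, "s", "g"))   # (['s', 'a', 'g'], 2.0)
```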

Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries

Title Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Authors Bohan Zhuang, Qi Wu, Chunhua Shen, Ian Reid, Anton van den Hengel
Abstract Recognising objects according to a pre-defined fixed set of class labels has been well studied in Computer Vision. However, there are a great many practical applications where the subjects that may be of interest are not known beforehand, or not so easily delineated. In many of these cases natural language dialog is a natural way to specify the subject of interest, and the task of achieving this capability (a.k.a. Referring Expression Comprehension) has recently attracted attention. To this end we propose a unified framework, the ParalleL AttentioN (PLAN) network, to discover the object in an image that is being referred to in variable-length natural expression descriptions, from short phrase queries to long multi-round dialogs. The PLAN network has two attention mechanisms that relate parts of the expressions to both the global visual content and also directly to object candidates. Furthermore, the attention mechanisms are recurrent, making the referring process visualizable and explainable. The attended information from these dual sources is combined to reason about the referred object. These two attention mechanisms can be trained in parallel and we find the combined system outperforms the state of the art on several benchmarked datasets with different length language input, such as RefCOCO, RefCOCO+ and GuessWhat?!.
Tasks
Published 2017-11-17
URL http://arxiv.org/abs/1711.06370v1
PDF http://arxiv.org/pdf/1711.06370v1.pdf
PWC https://paperswithcode.com/paper/parallel-attention-a-unified-framework-for
Repo
Framework
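
A rough sketch of the “parallel” part of the idea: one attention over global image-region features and one over object-candidate features, both conditioned on the language encoding, with the attended vectors fused to score candidates. Sizes, the fusion rule and the scoring head are stand-ins, not the PLAN architecture (which is also recurrent).

```python
# Rough sketch of parallel attention: one attention over global image
# regions and one over object candidates, fused to score candidates;
# dimensions and the fusion rule are illustrative assumptions.
import torch
import torch.nn as nn

class ParallelAttention(nn.Module):
    def __init__(self, lang_dim=256, vis_dim=512):
        super().__init__()
        self.att_global = nn.Linear(lang_dim + vis_dim, 1)   # attention over image regions
        self.att_cand = nn.Linear(lang_dim + vis_dim, 1)     # attention over candidates
        self.score = nn.Linear(3 * vis_dim + lang_dim, 1)    # per-candidate score

    @staticmethod
    def attend(att, lang, feats):
        q = lang.expand(feats.size(0), -1)
        w = torch.softmax(att(torch.cat([q, feats], dim=-1)).squeeze(-1), dim=0)
        return (w.unsqueeze(-1) * feats).sum(dim=0)

    def forward(self, lang, region_feats, cand_feats):
        g = self.attend(self.att_global, lang, region_feats)   # global visual context
        c = self.attend(self.att_cand, lang, cand_feats)       # candidate context
        ctx = torch.cat([g, c, lang], dim=-1).expand(cand_feats.size(0), -1)
        return self.score(torch.cat([ctx, cand_feats], dim=-1)).squeeze(-1)

# Usage: score 5 candidate boxes for one language query encoding.
net = ParallelAttention()
scores = net(torch.randn(256), torch.randn(49, 512), torch.randn(5, 512))
```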

Speech Map: A Statistical Multimodal Atlas of 4D Tongue Motion During Speech from Tagged and Cine MR Images

Title Speech Map: A Statistical Multimodal Atlas of 4D Tongue Motion During Speech from Tagged and Cine MR Images
Authors Jonghye Woo, Fangxu Xing, Maureen Stone, Jordan Green, Timothy G. Reese, Thomas J. Brady, Van J. Wedeen, Jerry L. Prince, Georges El Fakhri
Abstract Quantitative measurement of functional and anatomical traits of 4D tongue motion in the course of speech or other lingual behaviors remains a major challenge in scientific research and clinical applications. Here, we introduce a statistical multimodal atlas of 4D tongue motion using healthy subjects, which enables a combined quantitative characterization of tongue motion in a reference anatomical configuration. This atlas framework, termed Speech Map, combines cine- and tagged-MRI in order to provide both the anatomic reference and motion information during speech. Our approach involves a series of steps including (1) construction of a common reference anatomical configuration from cine-MRI, (2) motion estimation from tagged-MRI, (3) transformation of the motion estimations to the reference anatomical configuration, and (4) computation of motion quantities such as Lagrangian strain. Using this framework, the anatomic configuration of the tongue appears motionless, while the motion fields and associated strain measurements change over the time course of speech. In addition, to form a succinct representation of the high-dimensional and complex motion fields, principal component analysis is carried out to characterize the central tendencies and variations of motion fields of our speech tasks. Our proposed method provides a platform to quantitatively and objectively explain the differences and variability of tongue motion by illuminating internal motion and strain that have so far been intractable. The findings are used to understand how tongue function for speech is limited by abnormal internal motion and strain in glossectomy patients.
Tasks Motion Estimation
Published 2017-01-24
URL http://arxiv.org/abs/1701.06708v2
PDF http://arxiv.org/pdf/1701.06708v2.pdf
PWC https://paperswithcode.com/paper/speech-map-a-statistical-multimodal-atlas-of
Repo
Framework
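
Step (4) of the pipeline above is followed by PCA over the motion fields to extract principal motion modes. A minimal numpy sketch of that analysis step on flattened displacement fields (the fields themselves would come from tagged-MRI tracking):

```python
# Minimal PCA over flattened motion fields, in the spirit of the atlas
# pipeline's final analysis step; rows are aligned samples, columns are
# stacked displacement components.
import numpy as np

def motion_pca(fields, n_components=3):
    """fields: (n_samples, n_voxels, 3) displacement fields."""
    X = fields.reshape(fields.shape[0], -1)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]                     # principal motion modes
    scores = (X - mean) @ components.T                 # per-sample coordinates
    explained = S[:n_components] ** 2 / (S ** 2).sum() # variance explained
    return components, scores, explained

fields = np.random.randn(20, 5000, 3)   # stand-in for 20 aligned motion fields
modes, scores, var = motion_pca(fields)
```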

Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model

Title Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model
Authors Wenli Zhuang, Ernie Chang
Abstract This paper describes a neural-network model which performed competitively (top 6) at the SemEval 2017 cross-lingual Semantic Textual Similarity (STS) task. Our system employs an attention-based recurrent neural network model that optimizes the sentence similarity. In this paper, we describe our participation in the multilingual STS task which measures similarity across English, Spanish, and Arabic.
Tasks Cross-Lingual Semantic Textual Similarity, Semantic Textual Similarity
Published 2017-03-16
URL http://arxiv.org/abs/1703.05465v1
PDF http://arxiv.org/pdf/1703.05465v1.pdf
PWC https://paperswithcode.com/paper/neobility-at-semeval-2017-task-1-an-attention
Repo
Framework
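
A hedged sketch of an attention-pooled recurrent sentence-similarity model in the spirit of the system description: encode both sentences with a bidirectional GRU, attention-pool the hidden states, and map cosine similarity to the STS score range. Embedding sizes and the similarity head are assumptions.

```python
# Sketch of an attention-pooled recurrent sentence-similarity model;
# sizes, the pooling form and the 0-5 scaling are assumptions, not the
# system's exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveSTS(nn.Module):
    def __init__(self, vocab_size=10000, emb=100, hid=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hid, 1)

    def encode(self, tokens):
        h, _ = self.rnn(self.emb(tokens))            # (B, T, 2*hid)
        w = torch.softmax(self.att(h).squeeze(-1), dim=1)
        return (w.unsqueeze(-1) * h).sum(dim=1)      # attention-pooled sentence vector

    def forward(self, sent_a, sent_b):
        # Scale cosine similarity to the 0-5 STS score range.
        return 2.5 * (F.cosine_similarity(self.encode(sent_a), self.encode(sent_b)) + 1)

model = AttentiveSTS()
score = model(torch.randint(0, 10000, (4, 12)), torch.randint(0, 10000, (4, 15)))
```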

Role of Secondary Attributes to Boost the Prediction Accuracy of Students Employability Via Data Mining

Title Role of Secondary Attributes to Boost the Prediction Accuracy of Students Employability Via Data Mining
Authors Pooja Thakar, Anil Mehta, Manisha
Abstract Data Mining is best known for its analytical and prediction capabilities. It is used in several areas such as fraud detection, predicting client behavior, money market behavior, and bankruptcy prediction. It can also help in establishing an educational ecosystem that discovers useful knowledge and assists educators in taking proactive decisions to boost student performance and employability. This paper presents an empirical study that compares varied classification algorithms on two datasets of MCA (Masters in Computer Applications) students collected from various affiliated colleges of a reputed state university in India. One dataset includes only primary attributes, whereas the other also includes secondary psychometric attributes. The results show that primary academic attributes alone do not lead to good prediction accuracy of students' employability when students are in the initial year of their education. The study analyzes and stresses the role of secondary psychometric attributes for better prediction accuracy and analysis of students' performance. Timely prediction and analysis of students' performance can help management, teachers and students to work on their gray areas for better results and employment opportunities.
Tasks Fraud Detection
Published 2017-08-09
URL http://arxiv.org/abs/1708.02940v1
PDF http://arxiv.org/pdf/1708.02940v1.pdf
PWC https://paperswithcode.com/paper/role-of-secondary-attributes-to-boost-the
Repo
Framework
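
The comparison the study describes, the same classifiers trained once on primary academic attributes and once with secondary psychometric attributes added, looks roughly like the scikit-learn sketch below; the dataset file and column names are placeholders.

```python
# Sketch of the comparison described above: identical classifiers trained
# on primary academic attributes alone vs. primary + secondary
# psychometric attributes; the CSV and feature columns are hypothetical.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

df = pd.read_csv("mca_students.csv")                       # hypothetical dataset
primary = ["tenth_pct", "twelfth_pct", "grad_pct"]         # hypothetical columns
secondary = primary + ["adaptability", "communication", "aptitude"]
y = df["employable"]

for name, cols in [("primary only", primary), ("with psychometric", secondary)]:
    for clf in (DecisionTreeClassifier(), GaussianNB()):
        acc = cross_val_score(clf, df[cols], y, cv=5).mean()
        print(f"{name:18s} {type(clf).__name__:22s} accuracy={acc:.3f}")
```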

Improving Fitness Functions in Genetic Programming for Classification on Unbalanced Credit Card Datasets

Title Improving Fitness Functions in Genetic Programming for Classification on Unbalanced Credit Card Datasets
Authors Van Loi Cao, Nhien-An Le-Khac, Miguel Nicolau, Michael O'Neill, James McDermott
Abstract Credit card fraud detection based on machine learning has recently attracted considerable interest from the research community. One of the most important tasks in this area is the ability of classifiers to handle the imbalance in credit card data. In this scenario, classifiers tend to yield poor accuracy on the fraud class (minority class) despite realizing high overall accuracy. This is due to the influence of the majority class on traditional training criteria. In this paper, we aim to apply genetic programming to address this issue by adapting existing fitness functions. We examine two fitness functions from previous studies and develop two new fitness functions to evolve GP classifiers with superior accuracy on the minority class and overall. Two UCI credit card datasets are used to evaluate the effectiveness of the proposed fitness functions. The results demonstrate that the proposed fitness functions augment GP classifiers, encouraging fitter solutions on both the minority and the majority classes.
Tasks Fraud Detection
Published 2017-04-11
URL http://arxiv.org/abs/1704.03522v1
PDF http://arxiv.org/pdf/1704.03522v1.pdf
PWC https://paperswithcode.com/paper/improving-fitness-functions-in-genetic
Repo
Framework
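
The core idea is a fitness function that rewards accuracy on both classes rather than raw accuracy, so the minority (fraud) class is not swamped by the majority class. One such function, the average of per-class recalls, is sketched below; the paper's actual fitness functions differ in form but share this intent.

```python
# One example of an imbalance-aware fitness function: the average of
# per-class recalls (balanced accuracy) instead of raw accuracy. The
# paper studies several such functions; this only shows the general shape.
def balanced_fitness(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    recall_minority = tp / n_pos if n_pos else 0.0
    recall_majority = tn / n_neg if n_neg else 0.0
    return 0.5 * (recall_minority + recall_majority)

# A GP system would maximize this over evolved classifier programs.
print(balanced_fitness([1, 0, 0, 0, 1], [1, 0, 1, 0, 0]))   # 0.5 * (0.5 + 2/3) ~= 0.583
```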

Block building programming for symbolic regression

Title Block building programming for symbolic regression
Authors Chen Chen, Changtong Luo, Zonglin Jiang
Abstract Symbolic regression that aims to detect underlying data-driven models has become increasingly important for industrial data analysis. For most existing algorithms such as genetic programming (GP), the convergence speed might be too slow for large-scale problems with a large number of variables. This situation may become even worse with increasing problem size. The aforementioned difficulty makes symbolic regression limited in practical applications. Fortunately, in many engineering problems, the independent variables in target models are separable or partially separable. This feature inspires us to develop a new approach, block building programming (BBP). BBP divides the original target function into several blocks, and further into factors. The factors are then modeled by an optimization engine (e.g. GP). Under such circumstances, BBP can make large reductions to the search space. The partition of separability is based on a special method, block and factor detection. Two different optimization engines are applied to test the performance of BBP on a set of symbolic regression problems. Numerical results show that BBP has a good capability of structure and coefficient optimization with high computational efficiency.
Tasks
Published 2017-05-22
URL http://arxiv.org/abs/1705.07877v4
PDF http://arxiv.org/pdf/1705.07877v4.pdf
PWC https://paperswithcode.com/paper/block-building-programming-for-symbolic
Repo
Framework
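
Block building programming hinges on detecting which variables can be separated into different blocks. Below is a toy illustration of one such test for additive separability, checking whether the mixed finite difference between a pair of variables vanishes; the paper's block-and-factor detection is more general (it also covers multiplicative separability).

```python
# Toy illustration of detecting additive separability between variables
# by checking whether the mixed finite difference of f w.r.t. a pair of
# variables vanishes; BBP's actual block-and-factor detection is broader.
import numpy as np

def additively_separable(f, x, i, j, h=1e-4, tol=1e-6):
    """True if d^2 f / dx_i dx_j ~= 0 at x, i.e. x_i and x_j can sit in different additive blocks."""
    def shift(di, dj):
        y = np.array(x, dtype=float)
        y[i] += di * h
        y[j] += dj * h
        return f(y)
    mixed = (shift(1, 1) - shift(1, -1) - shift(-1, 1) + shift(-1, -1)) / (4 * h * h)
    return abs(mixed) < tol

# f(x) = sin(x0 * x1) + x2**2 is additively separable between {x0, x1} and {x2}.
f = lambda x: np.sin(x[0] * x[1]) + x[2] ** 2
print(additively_separable(f, [0.5, 1.0, 2.0], 0, 1))   # False: x0 and x1 interact
print(additively_separable(f, [0.5, 1.0, 2.0], 0, 2))   # True: x0 and x2 are in different blocks
```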

The Many Faces of Link Fraud

Title The Many Faces of Link Fraud
Authors Neil Shah, Hemank Lamba, Alex Beutel, Christos Faloutsos
Abstract Most past work on social network link fraud detection tries to separate genuine users from fraudsters, implicitly assuming that there is only one type of fraudulent behavior. But is this assumption true? And, in either case, what are the characteristics of such fraudulent behaviors? In this work, we set up honeypots (“dummy” social network accounts), and buy fake followers (after careful IRB approval). We report the signs of such behaviors including oddities in local network connectivity, account attributes, and similarities and differences across fraud providers. Most valuably, we discover and characterize several types of fraud behaviors. We discuss how to leverage our insights in practice by engineering strongly performing entropy-based features and demonstrating high classification accuracy. Our contributions are (a) instrumentation: we detail our experimental setup and carefully engineered data collection process to scrape Twitter data while respecting API rate-limits, (b) observations on fraud multimodality: we analyze our honeypot fraudster ecosystem and give surprising insights into the multifaceted behaviors of these fraudster types, and (c) features: we propose novel features that give strong (>0.95 precision/recall) discriminative power on ground-truth Twitter data.
Tasks Fraud Detection
Published 2017-04-05
URL http://arxiv.org/abs/1704.01420v3
PDF http://arxiv.org/pdf/1704.01420v3.pdf
PWC https://paperswithcode.com/paper/the-many-faces-of-link-fraud
Repo
Framework
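
The “entropy-based features” mentioned above measure how concentrated or spread out an account's neighborhood looks over some attribute. A minimal sketch of one such feature (Shannon entropy over a binned attribute of an account's followers); which attributes the paper's features actually use is not reproduced here.

```python
# Minimal sketch of an entropy-based feature: the Shannon entropy of how
# an account's followers are spread over some categorical attribute
# (here, a hypothetical binned follower count).
import math
from collections import Counter

def attribute_entropy(values):
    """Shannon entropy (bits) of a list of categorical attribute values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Bought followers often look homogeneous (low entropy), organic
# audiences more diverse (higher entropy).
organic = ["0-100", "100-1k", "1k-10k", "100-1k", "0-100", "10k+"]
bought = ["0-100"] * 6
print(attribute_entropy(organic), attribute_entropy(bought))   # ~1.92 vs 0.0
```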

Non-negative Matrix Factorization via Archetypal Analysis

Title Non-negative Matrix Factorization via Archetypal Analysis
Authors Hamid Javadi, Andrea Montanari
Abstract Given a collection of data points, non-negative matrix factorization (NMF) suggests to express them as convex combinations of a small set of ‘archetypes’ with non-negative entries. This decomposition is unique only if the true archetypes are non-negative and sufficiently sparse (or the weights are sufficiently sparse), a regime that is captured by the separability condition and its generalizations. In this paper, we study an approach to NMF that can be traced back to the work of Cutler and Breiman (1994) and does not require the data to be separable, while providing a generally unique decomposition. We optimize the trade-off between two objectives: we minimize the distance of the data points from the convex envelope of the archetypes (which can be interpreted as an empirical risk), while minimizing the distance of the archetypes from the convex envelope of the data (which can be interpreted as a data-dependent regularization). The archetypal analysis method of (Cutler, Breiman, 1994) is recovered as the limiting case in which the last term is given infinite weight. We introduce a ‘uniqueness condition’ on the data which is necessary for exactly recovering the archetypes from noiseless data. We prove that, under uniqueness (plus additional regularity conditions on the geometry of the archetypes), our estimator is robust. While our approach requires solving a non-convex optimization problem, we find that standard optimization methods succeed in finding good solutions both for real and synthetic data.
Tasks
Published 2017-05-08
URL http://arxiv.org/abs/1705.02994v1
PDF http://arxiv.org/pdf/1705.02994v1.pdf
PWC https://paperswithcode.com/paper/non-negative-matrix-factorization-via
Repo
Framework
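
The trade-off described above can be sketched as one concrete (assumed) objective: fit data points as convex combinations of archetypes while penalizing archetypes that stray from the convex hull of the data, with simplex-constrained weights on both sides. The projected-gradient sketch below illustrates that objective; it is not the authors' algorithm.

```python
# Projected-gradient sketch of an archetypal-analysis flavour of NMF:
# minimize ||X - W H||^2 + lam * ||H - B X||^2 with rows of W and B on
# the simplex. Objective form, step size and initialization are
# assumptions, not the paper's exact method.
import numpy as np

def project_simplex(v):
    """Project each row of v onto the probability simplex (Duchi et al. style)."""
    u = np.sort(v, axis=1)[:, ::-1]
    css = np.cumsum(u, axis=1) - 1
    idx = np.arange(1, v.shape[1] + 1)
    rho = (u - css / idx > 0).sum(axis=1)
    theta = css[np.arange(len(v)), rho - 1] / rho
    return np.maximum(v - theta[:, None], 0)

def archetypal_nmf(X, k, lam=1.0, lr=1e-3, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = project_simplex(rng.random((n, k)))   # data as convex combos of archetypes
    B = project_simplex(rng.random((k, n)))   # archetypes as convex combos of data
    H = B @ X                                 # archetypes
    for _ in range(iters):
        R, S = X - W @ H, H - B @ X                         # the two residuals
        gW, gB, gH = -2 * R @ H.T, -2 * lam * S @ X.T, -2 * (W.T @ R - lam * S)
        W = project_simplex(W - lr * gW)
        B = project_simplex(B - lr * gB)
        H = H - lr * gH
    return W, H

X = np.abs(np.random.randn(200, 10))
W, H = archetypal_nmf(X, k=4)
```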