January 29, 2020

Paper Group ANR 733

DeepSWIR: A Deep Learning Based Approach for the Synthesis of Short-Wave InfraRed Band using Multi-Sensor Concurrent Datasets

Title DeepSWIR: A Deep Learning Based Approach for the Synthesis of Short-Wave InfraRed Band using Multi-Sensor Concurrent Datasets
Authors Litu Rout, Yatharath Bhateja, Ankur Garg, Indranil Mishra, S Manthira Moorthi, Debjyoti Dhar
Abstract Convolutional Neural Network (CNN) is achieving remarkable progress in various computer vision tasks. In the past few years, the remote sensing community has observed Deep Neural Network (DNN) finally taking off in several challenging fields. In this study, we propose a DNN to generate a predefined High Resolution (HR) synthetic spectral band using an ensemble of concurrent Low Resolution (LR) bands and existing HR bands. Of particular interest, the proposed network, namely DeepSWIR, synthesizes Short-Wave InfraRed (SWIR) band at 5m Ground Sampling Distance (GSD) using Green (G), Red (R) and Near InfraRed (NIR) bands at both 24m and 5m GSD, and SWIR band at 24m GSD. To our knowledge, the highest spatial resolution of commercially deliverable SWIR band is at 7.5m GSD. Also, we propose a Gaussian feathering based image stitching approach in light of processing large satellite imagery. To experimentally validate the synthesized HR SWIR band, we critically analyse the qualitative and quantitative results produced by DeepSWIR using state-of-the-art evaluation metrics. Further, we convert the synthesized DN values to Top Of Atmosphere (TOA) reflectance and compare with the corresponding band of Sentinel-2B. Finally, we show one real world application of the synthesized band by using it to map wetland resources over our region of interest.
Tasks Image Stitching
Published 2019-05-07
URL https://arxiv.org/abs/1905.02749v1
PDF https://arxiv.org/pdf/1905.02749v1.pdf
PWC https://paperswithcode.com/paper/deepswir-a-deep-learning-based-approach-for
Repo
Framework
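
The abstract mentions Gaussian feathering for stitching large imagery but gives no details, so the following is only a generic sketch of the idea, not the authors' implementation. It assumes overlapping 1-D tiles, a Gaussian weight that peaks at each tile centre, and a hypothetical `sigma_frac` parameter; the names `gaussian_weights` and `stitch_rows` are illustrative.

```python
import math


def gaussian_weights(n, sigma_frac=0.25):
    """Gaussian weights across a tile of n pixels, peaking at the tile centre.

    sigma_frac (an assumed parameter) sets sigma as a fraction of the tile width.
    """
    centre = (n - 1) / 2.0
    sigma = max(sigma_frac * n, 1e-9)
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n)]


def stitch_rows(tiles, offsets, width):
    """Blend 1-D tiles placed at the given offsets into one row of `width` pixels.

    Each output pixel is the weighted average of every tile covering it, so
    tile borders (low weight) fade out in favour of tile centres (high weight).
    """
    acc = [0.0] * width   # weighted sum of pixel values
    wsum = [0.0] * width  # sum of weights, used for normalisation
    for tile, off in zip(tiles, offsets):
        weights = gaussian_weights(len(tile))
        for i, value in enumerate(tile):
            acc[off + i] += weights[i] * value
            wsum[off + i] += weights[i]
    return [a / s if s > 0 else 0.0 for a, s in zip(acc, wsum)]


if __name__ == "__main__":
    # Two tiles with different brightness that overlap on pixels 4..7.
    row = stitch_rows([[0.0] * 8, [10.0] * 8], offsets=[0, 4], width=12)
    print([round(v, 2) for v in row])
```

In the overlap the output moves smoothly from the first tile's value to the second's instead of jumping at a seam. The same weighting extends to 2-D tiles by multiplying a row weight with a column weight.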

Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition

Title Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition
Authors Cleison Correia de Amorim, David Macêdo, Cleber Zanchettin
Abstract The recognition of sign language is a challenging task with an important role in society to facilitate the communication of deaf persons. We propose a new Spatial-Temporal Graph Convolutional Network approach to sign language recognition based on human skeletal movements. The method uses graphs to capture the sign dynamics in two dimensions, spatial and temporal, considering the complex aspects of the language. Additionally, we present a new dataset of human skeletons for sign language based on ASLLVD to contribute to future related studies.
Tasks Sign Language Recognition
Published 2019-01-31
URL http://arxiv.org/abs/1901.11164v1
PDF http://arxiv.org/pdf/1901.11164v1.pdf
PWC https://paperswithcode.com/paper/spatial-temporal-graph-convolutional-networks
Repo
Framework

On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization

Title On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization
Authors Hao Yu, Rong Jin
Abstract For SGD based distributed stochastic optimization, computation complexity, measured by the convergence rate in terms of the number of stochastic gradient calls, and communication complexity, measured by the number of inter-node communication rounds, are the two most important performance metrics. The classical data-parallel implementation of SGD over $N$ workers can achieve linear speedup of its convergence rate but incurs an inter-node communication round at each batch. We study the benefit of using dynamically increasing batch sizes in parallel SGD for stochastic non-convex optimization by characterizing the attained convergence rate and the required number of communication rounds. We show that for stochastic non-convex optimization under the P-L condition, the classical data-parallel SGD with exponentially increasing batch sizes can achieve the fastest known $O(1/(NT))$ convergence with linear speedup using only $\log(T)$ communication rounds. For general stochastic non-convex optimization, we propose a Catalyst-like algorithm to achieve the fastest known $O(1/\sqrt{NT})$ convergence with only $O(\sqrt{NT}\log(\frac{T}{N}))$ communication rounds.
Tasks Stochastic Optimization
Published 2019-05-10
URL https://arxiv.org/abs/1905.04346v1
PDF https://arxiv.org/pdf/1905.04346v1.pdf
PWC https://paperswithcode.com/paper/on-the-computation-and-communication
Repo
Framework
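
To see why growing batch sizes reduces communication, it helps to count rounds. The sketch below assumes one inter-node communication round per batch (as the abstract states for data-parallel SGD) and a geometric schedule with a hypothetical `growth` factor of 2; the paper's actual schedule and constants are not given here, so `batch_schedule` is purely illustrative.

```python
def batch_schedule(total_calls, initial_batch=1, growth=2):
    """Return per-worker batch sizes that grow geometrically.

    The schedule stops once `total_calls` stochastic gradient calls are spent.
    With one communication round per batch, len(result) is the round count.
    """
    sizes = []
    used = 0
    batch = initial_batch
    while used < total_calls:
        step = min(batch, total_calls - used)  # last batch may be truncated
        sizes.append(step)
        used += step
        batch *= growth
    return sizes


if __name__ == "__main__":
    T = 1000
    growing = batch_schedule(T, growth=2)  # exponentially increasing batches
    fixed = batch_schedule(T, growth=1)    # constant batch size of 1
    print(len(growing), len(fixed))
```

With a budget of 1000 gradient calls per worker, doubling batches need 10 rounds while a fixed batch size of 1 needs 1000, which matches the logarithmic versus linear scaling described in the abstract.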

Deep Poetry: A Chinese Classical Poetry Generation System

Title Deep Poetry: A Chinese Classical Poetry Generation System
Authors Yusen Liu, Dayiheng Liu, Jiancheng Lv
Abstract In this work, we demonstrate a Chinese classical poetry generation system called Deep Poetry. Existing systems for Chinese classical poetry generation are mostly template-based and very few of them can accept multi-modal input. Unlike previous systems, Deep Poetry uses neural networks that are trained on over 200 thousand poems and 3 million pieces of ancient Chinese prose. Our system can accept plain text, images or artistic conceptions as inputs to generate Chinese classical poetry. More importantly, users are allowed to participate in the process of writing poetry with our system. For the user’s convenience, we deploy the system on the WeChat applet platform, so users can use it on a mobile device whenever and wherever possible. The demo video of this paper is available at https://youtu.be/jD1R_u9TA3M.
Tasks
Published 2019-11-19
URL https://arxiv.org/abs/1911.08212v1
PDF https://arxiv.org/pdf/1911.08212v1.pdf
PWC https://paperswithcode.com/paper/deep-poetry-a-chinese-classical-poetry
Repo
Framework

Natively Interpretable Machine Learning and Artificial Intelligence: Preliminary Results and Future Directions

Title Natively Interpretable Machine Learning and Artificial Intelligence: Preliminary Results and Future Directions
Authors Christopher J. Hazard, Christopher Fusting, Michael Resnick, Michael Auerbach, Michael Meehan, Valeri Korobov
Abstract Machine learning models have become more and more complex in order to better approximate complex functions. Although fruitful in many domains, the added complexity has come at the cost of model interpretability. The once popular k-nearest neighbors (kNN) approach, which finds and uses the most similar data for reasoning, has received much less attention in recent decades due to numerous problems when compared to other techniques. We show that many of these historical problems with kNN can be overcome, and our contribution has applications not only in machine learning but also in online learning, data synthesis, anomaly detection, model compression, and reinforcement learning, without sacrificing interpretability. We introduce a synthesis between kNN and information theory that we hope will provide a clear path towards models that are innately interpretable and auditable. Through this work we hope to gather interest in combining kNN with information theory as a promising path to fully auditable machine learning and artificial intelligence.
Tasks Anomaly Detection, Interpretable Machine Learning, Model Compression
Published 2019-01-02
URL http://arxiv.org/abs/1901.00246v2
PDF http://arxiv.org/pdf/1901.00246v2.pdf
PWC https://paperswithcode.com/paper/natively-interpretable-machine-learning-and
Repo
Framework

Enhancing Underexposed Photos using Perceptually Bidirectional Similarity

Title Enhancing Underexposed Photos using Perceptually Bidirectional Similarity
Authors Qing Zhang, Yongwei Nie, Lei Zhu, Chunxia Xiao, Wei-Shi Zheng
Abstract Although remarkable progress has been made, existing methods for enhancing underexposed photos tend to produce visually unpleasing results due to the existence of visual artifacts (e.g., color distortion, loss of details and uneven exposure). We observed that this is because they fail to ensure the perceptual consistency of visual information between the source underexposed image and its enhanced output. To obtain high-quality results free of these artifacts, we present a novel underexposed photo enhancement approach that is able to maintain the perceptual consistency. We achieve this by proposing an effective criterion, referred to as perceptually bidirectional similarity (PBS), which explicitly describes how to ensure the perceptual consistency when enhancing underexposed images. Particularly, we adopt the Retinex theory and cast the enhancement problem as PBS-constrained illumination estimation, where we formulate PBS as constraints on illumination and solve for the illumination which can recover the desired artifact-free enhancement results. In addition, we describe a video enhancement framework that adopts the presented illumination estimation for handling underexposed videos. To this end, a probabilistic approach is introduced to propagate illuminations of sampled keyframes to the entire video by tackling a Bayesian Maximum A Posteriori (MAP) problem. Extensive experiments demonstrate the superiority of our method over the state-of-the-art methods.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.10992v2
PDF https://arxiv.org/pdf/1907.10992v2.pdf
PWC https://paperswithcode.com/paper/enhancing-underexposed-photos-using
Repo
Framework

Image Quality Assessment for Omnidirectional Cross-reference Stitching

Title Image Quality Assessment for Omnidirectional Cross-reference Stitching
Authors Kaiwen Yu, Jia Li, Yu Zhang, Yifan Zhao, Long Xu
Abstract Along with the development of virtual reality (VR), omnidirectional images play an important role in producing multimedia content with immersive experience. However, despite various existing approaches for omnidirectional image stitching, how to quantitatively assess the quality of stitched images is still insufficiently explored. To address this problem, we establish a novel omnidirectional image dataset containing stitched images as well as dual-fisheye images captured from standard quarters of 0$^\circ$, 90$^\circ$, 180$^\circ$ and 270$^\circ$. In this manner, when evaluating the quality of an image stitched from a pair of fisheye images (e.g., 0$^\circ$ and 180$^\circ$), the other pair of fisheye images (e.g., 90$^\circ$ and 270$^\circ$) can be used as the cross-reference to provide ground-truth observations of the stitching regions. Based on this dataset, we further benchmark six widely used stitching models with seven evaluation metrics for IQA. To the best of our knowledge, it is the first dataset that focuses on assessing the stitching quality of omnidirectional images.
Tasks Image Quality Assessment, Image Stitching
Published 2019-04-10
URL http://arxiv.org/abs/1904.04960v2
PDF http://arxiv.org/pdf/1904.04960v2.pdf
PWC https://paperswithcode.com/paper/image-quality-assessment-for-omnidirectional
Repo
Framework

A Crowdsourced Frame Disambiguation Corpus with Ambiguity

Title A Crowdsourced Frame Disambiguation Corpus with Ambiguity
Authors Anca Dumitrache, Lora Aroyo, Chris Welty
Abstract We present a resource for the task of FrameNet semantic frame disambiguation of over 5,000 word-sentence pairs from the Wikipedia corpus. The annotations were collected using a novel crowdsourcing approach with multiple workers per sentence to capture inter-annotator disagreement. In contrast to the typical approach of attributing the best single frame to each word, we provide a list of frames with disagreement-based scores that express the confidence with which each frame applies to the word. This is based on the idea that inter-annotator disagreement is at least partly caused by ambiguity that is inherent to the text and frames. We have found many examples where the semantics of individual frames overlap sufficiently to make them acceptable alternatives for interpreting a sentence. We have argued that ignoring this ambiguity creates an overly arbitrary target for training and evaluating natural language processing systems - if humans cannot agree, why would we expect the correct answer from a machine to be any different? To process this data we also utilized an expanded lemma-set provided by the Framester system, which merges FN with WordNet to enhance coverage. Our dataset includes annotations of 1,000 sentence-word pairs whose lemmas are not part of FN. Finally we present metrics for evaluating frame disambiguation systems that account for ambiguity.
Tasks
Published 2019-04-12
URL http://arxiv.org/abs/1904.06101v1
PDF http://arxiv.org/pdf/1904.06101v1.pdf
PWC https://paperswithcode.com/paper/a-crowdsourced-frame-disambiguation-corpus
Repo
Framework

ChOracle: A Unified Statistical Framework for Churn Prediction

Title ChOracle: A Unified Statistical Framework for Churn Prediction
Authors Ali Khodadadi, Seyed Abbas Hosseini, Ehsan Pajouheshgar, Farnam Mansouri, Hamid R. Rabiee
Abstract User churn is an important issue in online services that threatens the health and profitability of services. Most of the previous works on churn prediction convert the problem into a binary classification task where the users are labeled as churned and non-churned. More recently, some works have tried to convert the user churn prediction problem into the prediction of user return time. In this approach which is more realistic in real world online services, at each time-step the model predicts the user return time instead of predicting a churn label. However, the previous works in this category suffer from lack of generality and require high computational complexity. In this paper, we introduce ChOracle, an oracle that predicts the user churn by modeling the user return times to service by utilizing a combination of Temporal Point Processes and Recurrent Neural Networks. Moreover, we incorporate latent variables into the proposed recurrent neural network to model the latent user loyalty to the system. We also develop an efficient approximate variational algorithm for learning parameters of the proposed RNN by using back propagation through time. Finally, we demonstrate the superior performance of ChOracle on a wide variety of real world datasets.
Tasks Point Processes
Published 2019-09-15
URL https://arxiv.org/abs/1909.06868v1
PDF https://arxiv.org/pdf/1909.06868v1.pdf
PWC https://paperswithcode.com/paper/choracle-a-unified-statistical-framework-for
Repo
Framework

Looking to Relations for Future Trajectory Forecast

Title Looking to Relations for Future Trajectory Forecast
Authors Chiho Choi, Behzad Dariush
Abstract Inferring relational behavior between road users as well as road users and their surrounding physical space is an important step toward effective modeling and prediction of navigation strategies adopted by participants in road scenes. To this end, we propose a relation-aware framework for future trajectory forecast. Our system aims to infer relational information from the interactions of road users with each other and with the environment. The first module involves visual encoding of spatio-temporal features, which captures human-human and human-space interactions over time. The following module explicitly constructs pair-wise relations from spatio-temporal interactions and identifies more descriptive relations that highly influence future motion of the target road user by considering its past trajectory. The resulting relational features are used to forecast future locations of the target, in the form of heatmaps with an additional guidance of spatial dependencies and consideration of the uncertainty. Extensive evaluations on the public benchmark datasets demonstrate the robustness and efficacy of the proposed framework as observed by performances higher than the state-of-the-art methods.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1905.08855v4
PDF https://arxiv.org/pdf/1905.08855v4.pdf
PWC https://paperswithcode.com/paper/looking-to-relations-for-future-trajectory
Repo
Framework

Effects of padding on LSTMs and CNNs

Title Effects of padding on LSTMs and CNNs
Authors Mahidhar Dwarampudi, N V Subba Reddy
Abstract Long Short-Term Memory (LSTM) Networks and Convolutional Neural Networks (CNN) have become very common and are used in many fields as they were effective in solving many problems where the general neural networks were inefficient. They were applied to various problems mostly related to images and sequences. Since LSTMs and CNNs take inputs of the same length and dimension, input images and sequences are padded to maximum length while testing and training. This padding can affect the way the networks function and can make a great deal of difference to performance and accuracy. This paper studies this effect and suggests the best way to pad an input sequence. This paper uses a simple sentiment analysis task for this purpose. We use the same dataset on both networks with various padding schemes to show the difference. This paper also discusses some preprocessing techniques applied to the data to ensure effective analysis.
Tasks Sentiment Analysis
Published 2019-03-18
URL http://arxiv.org/abs/1903.07288v1
PDF http://arxiv.org/pdf/1903.07288v1.pdf
PWC https://paperswithcode.com/paper/effects-of-padding-on-lstms-and-cnns
Repo
Framework
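
The two padding placements usually compared for sequence models can be shown in a few lines. This is a minimal sketch in plain Python with a hypothetical helper `pad_sequences`; it only illustrates the difference between pre- and post-padding and does not reproduce the paper's experiments or conclusions.

```python
def pad_sequences(seqs, max_len=None, mode="pre", value=0):
    """Pad every sequence to a common length with `value`.

    mode="pre" puts the padding before the tokens, mode="post" after them.
    Sequences longer than max_len are rejected rather than truncated.
    """
    if mode not in ("pre", "post"):
        raise ValueError("mode must be 'pre' or 'post'")
    if max_len is None:
        max_len = max(len(s) for s in seqs)
    padded = []
    for s in seqs:
        s = list(s)
        if len(s) > max_len:
            raise ValueError("sequence longer than max_len")
        fill = [value] * (max_len - len(s))
        padded.append(fill + s if mode == "pre" else s + fill)
    return padded


if __name__ == "__main__":
    tokens = [[1, 2, 3], [4]]
    print(pad_sequences(tokens, mode="pre"))   # [[1, 2, 3], [0, 0, 4]]
    print(pad_sequences(tokens, mode="post"))  # [[1, 2, 3], [4, 0, 0]]
```

With pre-padding a recurrent network reads the real tokens last, while with post-padding it reads the padding last, which is why the placement can matter for an LSTM.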

Sparse residual tree and forest

Title Sparse residual tree and forest
Authors Xin Xu, Xiaopeng Luo
Abstract Sparse residual tree (SRT) is an adaptive exploration method for multivariate scattered data approximation. It leads to sparse and stable approximations in areas where the data is sufficient or redundant, and points out the possible local regions where data refinement is needed. Sparse residual forest (SRF) is a combination of SRT predictors to further improve the approximation accuracy and stability according to the error characteristics of SRTs. The hierarchical parallel SRT algorithm is based on both tree decomposition and adaptive radial basis function (RBF) explorations, whereby for each child a sparse and proper RBF refinement is added to the approximation by minimizing the norm of the residual inherited from its parent. The convergence results are established for both SRTs and SRFs. The worst case time complexity of SRTs is $\mathcal{O}(N\log_2N)$ for the initial work and $\mathcal{O}(\log_2N)$ for each prediction, while the worst case storage requirement is $\mathcal{O}(N\log_2N)$, where the $N$ data points can be arbitrarily distributed. Numerical experiments are performed for several illustrative examples.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06443v1
PDF http://arxiv.org/pdf/1902.06443v1.pdf
PWC https://paperswithcode.com/paper/190206443
Repo
Framework

Large dimensional analysis of general margin based classification methods

Title Large dimensional analysis of general margin based classification methods
Authors Hanwen Huang
Abstract Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Since a large number of classifiers are available, one natural question is which type of classifier should be used for a particular classification task. We aim to answer this question by investigating the asymptotic performance of a family of large-margin classifiers in situations where the data dimension $p$ and the sample size $n$ are both large. This family covers a broad range of classifiers including support vector machine, distance weighted discrimination, penalized logistic regression, and large-margin unified machine as special cases. The asymptotic results are described by a set of nonlinear equations and we observe a close match of them with Monte Carlo simulation on finite data samples. Our analytical studies shed new light on how to select the best classifier among various classification methods as well as on how to choose the optimal tuning parameters for a given method.
Tasks
Published 2019-01-23
URL http://arxiv.org/abs/1901.08057v1
PDF http://arxiv.org/pdf/1901.08057v1.pdf
PWC https://paperswithcode.com/paper/large-dimensional-analysis-of-general-margin
Repo
Framework
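
The classifiers named in the abstract differ mainly in the loss applied to the functional margin u = y f(x). The sketch below writes out three common textbook forms. The DWD loss shown is the generalized form with exponent q = 1, which is an assumption about the parameterization and is not taken from the paper; the function names are illustrative.

```python
import math


def hinge_loss(u):
    """SVM hinge loss on the margin u = y * f(x)."""
    return max(0.0, 1.0 - u)


def logistic_loss(u):
    """Logistic regression loss (negative log-likelihood) on the margin."""
    return math.log1p(math.exp(-u))


def dwd_loss(u):
    """Distance weighted discrimination loss, generalized form with q = 1 (assumed)."""
    return 1.0 - u if u <= 0.5 else 1.0 / (4.0 * u)


if __name__ == "__main__":
    for u in (-1.0, 0.0, 0.5, 1.0, 2.0):
        print(u, hinge_loss(u), round(logistic_loss(u), 4), round(dwd_loss(u), 4))
```

All three decrease as the margin grows. The hinge loss is exactly zero beyond u = 1, whereas the logistic and DWD losses keep a small positive penalty for every correctly classified point.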

Optimal Fusion of Elliptic Extended Target Estimates based on the Wasserstein Distance

Title Optimal Fusion of Elliptic Extended Target Estimates based on the Wasserstein Distance
Authors Kolja Thormann, Marcus Baum
Abstract This paper considers the fusion of multiple estimates of a spatially extended object, where the object extent is modeled as an ellipse parameterized by the orientation and semiaxes lengths. For this purpose, we propose a novel systematic approach that employs a distance measure for ellipses, i.e., the Gaussian Wasserstein distance, as a cost function. We derive an explicit approximate expression for the Minimum Mean Gaussian Wasserstein distance (MMGW) estimate. Based on the concept of a MMGW estimator, we develop efficient methods for the fusion of extended target estimates. The proposed fusion methods are evaluated in a simulated experiment and the benefits of the novel methods are discussed.
Tasks
Published 2019-04-01
URL https://arxiv.org/abs/1904.00708v3
PDF https://arxiv.org/pdf/1904.00708v3.pdf
PWC https://paperswithcode.com/paper/optimal-fusion-of-elliptic-extended-target
Repo
Framework
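
The cost function named in the abstract has a closed form in two dimensions. The sketch below computes the squared Gaussian Wasserstein distance between two ellipses, assuming each ellipse with orientation theta and semi-axes a, b is represented by the shape matrix R(theta) diag(a^2, b^2) R(theta)^T used as a covariance. That mapping and the helper names are assumptions for illustration. The fusion (MMGW) estimator itself is not shown.

```python
import math


def shape_matrix(theta, a, b):
    """Entries (sxx, sxy, syy) of R(theta) diag(a^2, b^2) R(theta)^T."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * c * a * a + s * s * b * b,
            c * s * (a * a - b * b),
            s * s * a * a + c * c * b * b)


def gw_squared(m1, e1, m2, e2):
    """Squared Gaussian Wasserstein distance between two 2-D ellipses.

    m1, m2 are centres (x, y); e1, e2 are (theta, a, b) tuples.
    Uses tr(sqrt(M)) = sqrt(tr(M) + 2 sqrt(det(M))) for 2x2 PSD matrices.
    """
    xx1, xy1, yy1 = shape_matrix(*e1)
    xx2, xy2, yy2 = shape_matrix(*e2)
    tr_prod = xx1 * xx2 + 2.0 * xy1 * xy2 + yy1 * yy2  # tr(S1 S2)
    det1 = xx1 * yy1 - xy1 * xy1
    det2 = xx2 * yy2 - xy2 * xy2
    cross = math.sqrt(max(tr_prod + 2.0 * math.sqrt(max(det1 * det2, 0.0)), 0.0))
    centre_term = (m1[0] - m2[0]) ** 2 + (m1[1] - m2[1]) ** 2
    shape_term = xx1 + yy1 + xx2 + yy2 - 2.0 * cross
    return centre_term + shape_term


if __name__ == "__main__":
    # Same centre and orientation: the distance reduces to (a1-a2)^2 + (b1-b2)^2.
    print(gw_squared((0.0, 0.0), (0.0, 2.0, 1.0), (0.0, 0.0), (0.0, 3.0, 1.0)))
```

One reason to prefer such a distance over comparing (theta, a, b) directly is that it depends only on the shape matrix, so an ellipse rotated by pi, which is the same ellipse, has distance zero to the original.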

Improving interactive reinforcement learning: What makes a good teacher?

Title Improving interactive reinforcement learning: What makes a good teacher?
Authors Francisco Cruz, Sven Magg, Yukie Nagai, Stefan Wermter
Abstract Interactive reinforcement learning has become an important apprenticeship approach to speed up convergence in classic reinforcement learning problems. In this regard, a variant of interactive reinforcement learning is policy shaping, which uses a parent-like trainer to propose the next action to be performed and by doing so reduces the search space by advice. On some occasions, the trainer may be another artificial agent which in turn was trained using reinforcement learning methods to afterward become an advisor for other learner-agents. In this work, we analyze internal representations and characteristics of artificial agents to determine which agent may outperform others to become a better trainer-agent. Using a polymath agent, as compared to a specialist agent, as an advisor leads to a larger reward and faster convergence of the reward signal and also to a more stable behavior in terms of the state visit frequency of the learner-agents. Moreover, we analyze system interaction parameters in order to determine how influential they are in the apprenticeship process, where the consistency of feedback is much more relevant when dealing with different learner obedience parameters.
Tasks
Published 2019-04-15
URL http://arxiv.org/abs/1904.06879v1
PDF http://arxiv.org/pdf/1904.06879v1.pdf
PWC https://paperswithcode.com/paper/improving-interactive-reinforcement-learning
Repo
Framework