Paper Group ANR 733
DeepSWIR: A Deep Learning Based Approach for the Synthesis of Short-Wave InfraRed Band using Multi-Sensor Concurrent Datasets
Title | DeepSWIR: A Deep Learning Based Approach for the Synthesis of Short-Wave InfraRed Band using Multi-Sensor Concurrent Datasets |
Authors | Litu Rout, Yatharath Bhateja, Ankur Garg, Indranil Mishra, S Manthira Moorthi, Debjyoti Dhar |
Abstract | Convolutional Neural Networks (CNNs) are achieving remarkable progress in various computer vision tasks. In the past few years, the remote sensing community has observed Deep Neural Networks (DNNs) finally taking off in several challenging fields. In this study, we propose a DNN to generate a predefined High Resolution (HR) synthetic spectral band using an ensemble of concurrent Low Resolution (LR) bands and existing HR bands. Of particular interest, the proposed network, namely DeepSWIR, synthesizes the Short-Wave InfraRed (SWIR) band at 5m Ground Sampling Distance (GSD) using Green (G), Red (R) and Near InfraRed (NIR) bands at both 24m and 5m GSD, and the SWIR band at 24m GSD. To our knowledge, the highest spatial resolution of a commercially deliverable SWIR band is 7.5m GSD. Also, we propose a Gaussian feathering based image stitching approach for processing large satellite imagery. To experimentally validate the synthesized HR SWIR band, we critically analyse the qualitative and quantitative results produced by DeepSWIR using state-of-the-art evaluation metrics. Further, we convert the synthesized DN values to Top Of Atmosphere (TOA) reflectance and compare them with the corresponding band of Sentinel-2B. Finally, we show one real world application of the synthesized band by using it to map wetland resources over our region of interest. |
Tasks | Image Stitching |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02749v1 |
https://arxiv.org/pdf/1905.02749v1.pdf | |
PWC | https://paperswithcode.com/paper/deepswir-a-deep-learning-based-approach-for |
Repo | |
Framework | |
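The Gaussian feathering based stitching mentioned in the DeepSWIR abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function names, the `sigma_fraction` parameter, and the choice of a Gaussian centred on each tile are assumptions made for illustration. Each tile contributes to the output with a weight that decays towards its edges, and overlapping tiles are combined by a normalized weighted average so that seams fade gradually instead of appearing as hard edges.

```python
import math


def gaussian_weights(length, sigma_fraction=0.25):
    """Gaussian weights centred on the tile (hypothetical helper).

    sigma_fraction is an illustrative parameter: sigma = sigma_fraction * length.
    """
    center = (length - 1) / 2.0
    sigma = max(sigma_fraction * length, 1e-6)
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(length)]


def stitch_tiles(tiles, offsets, total_length):
    """Blend overlapping 1-D tiles into one signal by weighted averaging."""
    acc = [0.0] * total_length    # weighted sum of tile values
    wsum = [0.0] * total_length   # sum of weights at each position
    for tile, off in zip(tiles, offsets):
        weights = gaussian_weights(len(tile))
        for i, (value, w) in enumerate(zip(tile, weights)):
            acc[off + i] += w * value
            wsum[off + i] += w
    # Positions not covered by any tile are left at 0.0.
    return [a / w if w > 0 else 0.0 for a, w in zip(acc, wsum)]


# Two tiles of length 6 that overlap on positions 4 and 5.
dark = [0.0] * 6
bright = [2.0] * 6
blended = stitch_tiles([dark, bright], offsets=[0, 4], total_length=10)
```

In the overlap the blended values move smoothly from the first tile's level to the second's. For two-dimensional imagery the same idea would apply with a separable weight map per tile.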
Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition
Title | Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition |
Authors | Cleison Correia de Amorim, David Macêdo, Cleber Zanchettin |
Abstract | The recognition of sign language is a challenging task with an important role in society to facilitate the communication of deaf persons. We propose a new Spatial-Temporal Graph Convolutional Network approach to sign language recognition based on human skeletal movements. The method uses graphs to capture the dynamics of signs in two dimensions, spatial and temporal, considering the complex aspects of the language. Additionally, we present a new dataset of human skeletons for sign language based on ASLLVD to contribute to future related studies. |
Tasks | Sign Language Recognition |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11164v1 |
http://arxiv.org/pdf/1901.11164v1.pdf | |
PWC | https://paperswithcode.com/paper/spatial-temporal-graph-convolutional-networks |
Repo | |
Framework | |
On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization
Title | On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization |
Authors | Hao Yu, Rong Jin |
Abstract | For SGD based distributed stochastic optimization, computation complexity, measured by the convergence rate in terms of the number of stochastic gradient calls, and communication complexity, measured by the number of inter-node communication rounds, are the two most important performance metrics. The classical data-parallel implementation of SGD over $N$ workers can achieve linear speedup of its convergence rate but incurs an inter-node communication round at each batch. We study the benefit of using dynamically increasing batch sizes in parallel SGD for stochastic non-convex optimization by characterizing the attained convergence rate and the required number of communication rounds. We show that for stochastic non-convex optimization under the P-L condition, the classical data-parallel SGD with exponentially increasing batch sizes can achieve the fastest known $O(1/(NT))$ convergence with linear speedup using only $\log(T)$ communication rounds. For general stochastic non-convex optimization, we propose a Catalyst-like algorithm to achieve the fastest known $O(1/\sqrt{NT})$ convergence with only $O(\sqrt{NT}\log(\frac{T}{N}))$ communication rounds. |
Tasks | Stochastic Optimization |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04346v1 |
https://arxiv.org/pdf/1905.04346v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-computation-and-communication |
Repo | |
Framework | |
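The link between exponentially increasing batch sizes and a logarithmic number of communication rounds, as stated in the parallel SGD abstract, can be made concrete with a small counting sketch. It only builds a batch schedule and does not run SGD. The function name, the doubling factor, and the reading of the budget as per-worker stochastic gradient calls with one communication round per batch are assumptions for illustration, not details taken from the paper.

```python
def batch_schedule(total_samples, initial_batch=1, growth=2.0):
    """Batch sizes that grow geometrically until the sample budget is spent.

    Each returned entry stands for one batch, and therefore (by assumption)
    one inter-node communication round.
    """
    if total_samples < 1 or initial_batch < 1 or growth <= 1.0:
        raise ValueError("need total_samples >= 1, initial_batch >= 1, growth > 1")
    sizes = []
    used = 0
    batch = float(initial_batch)
    while used < total_samples:
        # The last batch is clipped so the budget is not exceeded.
        size = min(int(batch), total_samples - used)
        sizes.append(size)
        used += size
        batch *= growth
    return sizes


budget = 1023
growing = batch_schedule(budget)   # 1, 2, 4, ..., 512
rounds_growing = len(growing)      # 10 rounds for 1023 gradient calls
rounds_fixed = budget              # a fixed batch size of 1 needs one round per call
```

With doubling batches the number of rounds grows like the logarithm of the budget, while a fixed batch size needs a number of rounds proportional to the budget.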
Deep Poetry: A Chinese Classical Poetry Generation System
Title | Deep Poetry: A Chinese Classical Poetry Generation System |
Authors | Yusen Liu, Dayiheng Liu, Jiancheng Lv |
Abstract | In this work, we demonstrate a Chinese classical poetry generation system called Deep Poetry. Existing systems for Chinese classical poetry generation are mostly template-based and very few of them can accept multi-modal input. Unlike previous systems, Deep Poetry uses neural networks that are trained on over 200 thousand poems and 3 million pieces of ancient Chinese prose. Our system can accept plain text, images or artistic conceptions as inputs to generate Chinese classical poetry. More importantly, users are allowed to participate in the process of writing poetry with our system. For the user's convenience, we deploy the system on the WeChat applet platform, so users can use the system on a mobile device whenever and wherever possible. The demo video of this paper is available at https://youtu.be/jD1R_u9TA3M. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08212v1 |
https://arxiv.org/pdf/1911.08212v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-poetry-a-chinese-classical-poetry |
Repo | |
Framework | |
Natively Interpretable Machine Learning and Artificial Intelligence: Preliminary Results and Future Directions
Title | Natively Interpretable Machine Learning and Artificial Intelligence: Preliminary Results and Future Directions |
Authors | Christopher J. Hazard, Christopher Fusting, Michael Resnick, Michael Auerbach, Michael Meehan, Valeri Korobov |
Abstract | Machine learning models have become more and more complex in order to better approximate complex functions. Although fruitful in many domains, the added complexity has come at the cost of model interpretability. The once popular k-nearest neighbors (kNN) approach, which finds and uses the most similar data for reasoning, has received much less attention in recent decades due to numerous problems when compared to other techniques. We show that many of these historical problems with kNN can be overcome, and our contribution has applications not only in machine learning but also in online learning, data synthesis, anomaly detection, model compression, and reinforcement learning, without sacrificing interpretability. We introduce a synthesis between kNN and information theory that we hope will provide a clear path towards models that are innately interpretable and auditable. Through this work we hope to gather interest in combining kNN with information theory as a promising path to fully auditable machine learning and artificial intelligence. |
Tasks | Anomaly Detection, Interpretable Machine Learning, Model Compression |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00246v2 |
http://arxiv.org/pdf/1901.00246v2.pdf | |
PWC | https://paperswithcode.com/paper/natively-interpretable-machine-learning-and |
Repo | |
Framework | |
Enhancing Underexposed Photos using Perceptually Bidirectional Similarity
Title | Enhancing Underexposed Photos using Perceptually Bidirectional Similarity |
Authors | Qing Zhang, Yongwei Nie, Lei Zhu, Chunxia Xiao, Wei-Shi Zheng |
Abstract | Although remarkable progress has been made, existing methods for enhancing underexposed photos tend to produce visually unpleasing results due to the existence of visual artifacts (e.g., color distortion, loss of details and uneven exposure). We observed that this is because they fail to ensure the perceptual consistency of visual information between the source underexposed image and its enhanced output. To obtain high-quality results free of these artifacts, we present a novel underexposed photo enhancement approach that is able to maintain the perceptual consistency. We achieve this by proposing an effective criterion, referred to as perceptually bidirectional similarity (PBS), which explicitly describes how to ensure the perceptual consistency when enhancing underexposed images. Particularly, we adopt the Retinex theory and cast the enhancement problem as PBS-constrained illumination estimation, where we formulate PBS as constraints on illumination and solve for the illumination which can recover the desired artifact-free enhancement results. In addition, we describe a video enhancement framework that adopts the presented illumination estimation for handling underexposed videos. To this end, a probabilistic approach is introduced to propagate illuminations of sampled keyframes to the entire video by tackling a Bayesian Maximum A Posteriori (MAP) problem. Extensive experiments demonstrate the superiority of our method over the state-of-the-art methods. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10992v2 |
https://arxiv.org/pdf/1907.10992v2.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-underexposed-photos-using |
Repo | |
Framework | |
Image Quality Assessment for Omnidirectional Cross-reference Stitching
Title | Image Quality Assessment for Omnidirectional Cross-reference Stitching |
Authors | Kaiwen Yu, Jia Li, Yu Zhang, Yifan Zhao, Long Xu |
Abstract | Along with the development of virtual reality (VR), omnidirectional images play an important role in producing multimedia content with immersive experience. However, despite various existing approaches for omnidirectional image stitching, how to quantitatively assess the quality of stitched images is still insufficiently explored. To address this problem, we establish a novel omnidirectional image dataset containing stitched images as well as dual-fisheye images captured from standard quarters of 0$^\circ$, 90$^\circ$, 180$^\circ$ and 270$^\circ$. In this manner, when evaluating the quality of an image stitched from a pair of fisheye images (e.g., 0$^\circ$ and 180$^\circ$), the other pair of fisheye images (e.g., 90$^\circ$ and 270$^\circ$) can be used as the cross-reference to provide ground-truth observations of the stitching regions. Based on this dataset, we further benchmark six widely used stitching models with seven evaluation metrics for IQA. To the best of our knowledge, it is the first dataset that focuses on assessing the stitching quality of omnidirectional images. |
Tasks | Image Quality Assessment, Image Stitching |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.04960v2 |
http://arxiv.org/pdf/1904.04960v2.pdf | |
PWC | https://paperswithcode.com/paper/image-quality-assessment-for-omnidirectional |
Repo | |
Framework | |
A Crowdsourced Frame Disambiguation Corpus with Ambiguity
Title | A Crowdsourced Frame Disambiguation Corpus with Ambiguity |
Authors | Anca Dumitrache, Lora Aroyo, Chris Welty |
Abstract | We present a resource for the task of FrameNet semantic frame disambiguation of over 5,000 word-sentence pairs from the Wikipedia corpus. The annotations were collected using a novel crowdsourcing approach with multiple workers per sentence to capture inter-annotator disagreement. In contrast to the typical approach of attributing the best single frame to each word, we provide a list of frames with disagreement-based scores that express the confidence with which each frame applies to the word. This is based on the idea that inter-annotator disagreement is at least partly caused by ambiguity that is inherent to the text and frames. We have found many examples where the semantics of individual frames overlap sufficiently to make them acceptable alternatives for interpreting a sentence. We have argued that ignoring this ambiguity creates an overly arbitrary target for training and evaluating natural language processing systems - if humans cannot agree, why would we expect the correct answer from a machine to be any different? To process this data we also utilized an expanded lemma-set provided by the Framester system, which merges FN with WordNet to enhance coverage. Our dataset includes annotations of 1,000 sentence-word pairs whose lemmas are not part of FN. Finally we present metrics for evaluating frame disambiguation systems that account for ambiguity. |
Tasks | |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06101v1 |
http://arxiv.org/pdf/1904.06101v1.pdf | |
PWC | https://paperswithcode.com/paper/a-crowdsourced-frame-disambiguation-corpus |
Repo | |
Framework | |
ChOracle: A Unified Statistical Framework for Churn Prediction
Title | ChOracle: A Unified Statistical Framework for Churn Prediction |
Authors | Ali Khodadadi, Seyed Abbas Hosseini, Ehsan Pajouheshgar, Farnam Mansouri, Hamid R. Rabiee |
Abstract | User churn is an important issue in online services that threatens the health and profitability of services. Most of the previous works on churn prediction convert the problem into a binary classification task where the users are labeled as churned and non-churned. More recently, some works have tried to convert the user churn prediction problem into the prediction of user return time. In this approach which is more realistic in real world online services, at each time-step the model predicts the user return time instead of predicting a churn label. However, the previous works in this category suffer from lack of generality and require high computational complexity. In this paper, we introduce ChOracle, an oracle that predicts the user churn by modeling the user return times to service by utilizing a combination of Temporal Point Processes and Recurrent Neural Networks. Moreover, we incorporate latent variables into the proposed recurrent neural network to model the latent user loyalty to the system. We also develop an efficient approximate variational algorithm for learning parameters of the proposed RNN by using back propagation through time. Finally, we demonstrate the superior performance of ChOracle on a wide variety of real world datasets. |
Tasks | Point Processes |
Published | 2019-09-15 |
URL | https://arxiv.org/abs/1909.06868v1 |
https://arxiv.org/pdf/1909.06868v1.pdf | |
PWC | https://paperswithcode.com/paper/choracle-a-unified-statistical-framework-for |
Repo | |
Framework | |
Looking to Relations for Future Trajectory Forecast
Title | Looking to Relations for Future Trajectory Forecast |
Authors | Chiho Choi, Behzad Dariush |
Abstract | Inferring relational behavior between road users as well as road users and their surrounding physical space is an important step toward effective modeling and prediction of navigation strategies adopted by participants in road scenes. To this end, we propose a relation-aware framework for future trajectory forecast. Our system aims to infer relational information from the interactions of road users with each other and with the environment. The first module involves visual encoding of spatio-temporal features, which captures human-human and human-space interactions over time. The following module explicitly constructs pair-wise relations from spatio-temporal interactions and identifies more descriptive relations that highly influence future motion of the target road user by considering its past trajectory. The resulting relational features are used to forecast future locations of the target, in the form of heatmaps with an additional guidance of spatial dependencies and consideration of the uncertainty. Extensive evaluations on the public benchmark datasets demonstrate the robustness and efficacy of the proposed framework as observed by performances higher than the state-of-the-art methods. |
Tasks | |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08855v4 |
https://arxiv.org/pdf/1905.08855v4.pdf | |
PWC | https://paperswithcode.com/paper/looking-to-relations-for-future-trajectory |
Repo | |
Framework | |
Effects of padding on LSTMs and CNNs
Title | Effects of padding on LSTMs and CNNs |
Authors | Mahidhar Dwarampudi, N V Subba Reddy |
Abstract | Long Short-Term Memory (LSTM) Networks and Convolutional Neural Networks (CNN) have become very common and are used in many fields, as they have been effective in solving many problems where general neural networks were inefficient. They have been applied to various problems, mostly related to images and sequences. Since LSTMs and CNNs take inputs of the same length and dimension, input images and sequences are padded to the maximum length during training and testing. This padding can affect the way the networks function and can make a great deal of difference to performance and accuracy. This paper studies this effect and suggests the best way to pad an input sequence. This paper uses a simple sentiment analysis task for this purpose. We use the same dataset on both networks with various padding schemes to show the difference. This paper also discusses some preprocessing techniques applied to the data to ensure effective analysis. |
Tasks | Sentiment Analysis |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07288v1 |
http://arxiv.org/pdf/1903.07288v1.pdf | |
PWC | https://paperswithcode.com/paper/effects-of-padding-on-lstms-and-cnns |
Repo | |
Framework | |
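The padding choice discussed in the abstract above can be illustrated with a minimal sketch of the two usual options: padding before the tokens (pre-padding) and padding after them (post-padding). The function name, the pad value of 0, and the decision to truncate overlong sequences from the end are assumptions for illustration. The listing does not state which variant the paper recommends, so the sketch only shows the mechanics.

```python
def pad_sequence(tokens, max_len, pad_value=0, mode="pre"):
    """Pad (or truncate) a token list to exactly max_len entries.

    mode="pre" puts the padding before the tokens, mode="post" after them.
    """
    if mode not in ("pre", "post"):
        raise ValueError("mode must be 'pre' or 'post'")
    trimmed = list(tokens[:max_len])  # illustrative choice: drop trailing tokens
    padding = [pad_value] * (max_len - len(trimmed))
    return padding + trimmed if mode == "pre" else trimmed + padding


sentence = [5, 6, 7]  # hypothetical token ids
pre_padded = pad_sequence(sentence, 5, mode="pre")    # [0, 0, 5, 6, 7]
post_padded = pad_sequence(sentence, 5, mode="post")  # [5, 6, 7, 0, 0]
```

For a recurrent network that reads left to right, pre-padding places the real tokens in the final time steps, whereas post-padding makes the network process pad values last. This is one plausible reason the two schemes can behave differently.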
Sparse residual tree and forest
Title | Sparse residual tree and forest |
Authors | Xin Xu, Xiaopeng Luo |
Abstract | Sparse residual tree (SRT) is an adaptive exploration method for multivariate scattered data approximation. It leads to sparse and stable approximations in areas where the data is sufficient or redundant, and points out the possible local regions where data refinement is needed. Sparse residual forest (SRF) is a combination of SRT predictors to further improve the approximation accuracy and stability according to the error characteristics of SRTs. The hierarchical parallel SRT algorithm is based on both tree decomposition and adaptive radial basis function (RBF) explorations, whereby for each child a sparse and proper RBF refinement is added to the approximation by minimizing the norm of the residual inherited from its parent. The convergence results are established for both SRTs and SRFs. The worst case time complexity of SRTs is $\mathcal{O}(N\log_2N)$ for the initial work and $\mathcal{O}(\log_2N)$ for each prediction; meanwhile, the worst case storage requirement is $\mathcal{O}(N\log_2N)$, where the $N$ data points can be arbitrarily distributed. Numerical experiments are performed for several illustrative examples. |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06443v1 |
http://arxiv.org/pdf/1902.06443v1.pdf | |
PWC | https://paperswithcode.com/paper/190206443 |
Repo | |
Framework | |
Large dimensional analysis of general margin based classification methods
Title | Large dimensional analysis of general margin based classification methods |
Authors | Hanwen Huang |
Abstract | Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Since a large number of classifiers are available, one natural question is which type of classifier should be used for a particular classification task. We aim to answer this question by investigating the asymptotic performance of a family of large-margin classifiers in situations where the data dimension $p$ and the sample size $n$ are both large. This family covers a broad range of classifiers including support vector machine, distance weighted discrimination, penalized logistic regression, and large-margin unified machine as special cases. The asymptotic results are described by a set of nonlinear equations, and we observe a close match between them and Monte Carlo simulations on finite data samples. Our analytical studies shed new light on how to select the best classifier among various classification methods as well as on how to choose the optimal tuning parameters for a given method. |
Tasks | |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.08057v1 |
http://arxiv.org/pdf/1901.08057v1.pdf | |
PWC | https://paperswithcode.com/paper/large-dimensional-analysis-of-general-margin |
Repo | |
Framework | |
Optimal Fusion of Elliptic Extended Target Estimates based on the Wasserstein Distance
Title | Optimal Fusion of Elliptic Extended Target Estimates based on the Wasserstein Distance |
Authors | Kolja Thormann, Marcus Baum |
Abstract | This paper considers the fusion of multiple estimates of a spatially extended object, where the object extent is modeled as an ellipse parameterized by the orientation and semi-axis lengths. For this purpose, we propose a novel systematic approach that employs a distance measure for ellipses, i.e., the Gaussian Wasserstein distance, as a cost function. We derive an explicit approximate expression for the Minimum Mean Gaussian Wasserstein distance (MMGW) estimate. Based on the concept of an MMGW estimator, we develop efficient methods for the fusion of extended target estimates. The proposed fusion methods are evaluated in a simulated experiment and the benefits of the novel methods are discussed. |
Tasks | |
Published | 2019-04-01 |
URL | https://arxiv.org/abs/1904.00708v3 |
https://arxiv.org/pdf/1904.00708v3.pdf | |
PWC | https://paperswithcode.com/paper/optimal-fusion-of-elliptic-extended-target |
Repo | |
Framework | |
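The Gaussian Wasserstein distance named in the abstract above has a closed form for two Gaussians, $d^2 = \|m_1 - m_2\|^2 + \mathrm{tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_1^{1/2}\Sigma_2\Sigma_1^{1/2})^{1/2}\big)$. The sketch below evaluates it for ellipses in the plane. It is an illustration only, not the paper's MMGW estimator: the function names are hypothetical, and representing an ellipse by the shape matrix $R\,\mathrm{diag}(a^2, b^2)\,R^\top$ is an assumption about the parameterization. In two dimensions the trace of the matrix square root reduces to $\sqrt{\mathrm{tr}(\Sigma_1\Sigma_2) + 2\sqrt{\det\Sigma_1 \det\Sigma_2}}$, which avoids any matrix library.

```python
import math


def shape_matrix(theta, a, b):
    """Symmetric 2x2 matrix R diag(a^2, b^2) R^T, returned as (sxx, sxy, syy)."""
    c, s = math.cos(theta), math.sin(theta)
    sxx = a * a * c * c + b * b * s * s
    sxy = (a * a - b * b) * c * s
    syy = a * a * s * s + b * b * c * c
    return (sxx, sxy, syy)


def gw_distance_squared(m1, s1, m2, s2):
    """Squared Gaussian Wasserstein distance between two 2-D ellipses."""
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    trace_sum = s1[0] + s1[2] + s2[0] + s2[2]
    det1 = s1[0] * s1[2] - s1[1] ** 2
    det2 = s2[0] * s2[2] - s2[1] ** 2
    # tr(S1 S2) for symmetric 2x2 matrices.
    trace_prod = s1[0] * s2[0] + 2.0 * s1[1] * s2[1] + s1[2] * s2[2]
    # tr((S1^(1/2) S2 S1^(1/2))^(1/2)) in closed form for the 2-D case.
    cross = math.sqrt(max(trace_prod + 2.0 * math.sqrt(max(det1 * det2, 0.0)), 0.0))
    # Clamp tiny negative values caused by floating-point rounding.
    return max(dx * dx + dy * dy + trace_sum - 2.0 * cross, 0.0)


ellipse = shape_matrix(0.3, 2.0, 1.0)
same = gw_distance_squared((0.0, 0.0), ellipse, (0.0, 0.0), ellipse)
shifted = gw_distance_squared((0.0, 0.0), ellipse, (3.0, 4.0), ellipse)
```

Because the distance is computed on shape matrices, an ellipse rotated by 180 degrees, or with its semi-axes swapped and its orientation shifted by 90 degrees, has distance zero to the original. This is one reason a matrix-based distance is convenient for comparing ellipse estimates whose parameters are ambiguous.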
Improving interactive reinforcement learning: What makes a good teacher?
Title | Improving interactive reinforcement learning: What makes a good teacher? |
Authors | Francisco Cruz, Sven Magg, Yukie Nagai, Stefan Wermter |
Abstract | Interactive reinforcement learning has become an important apprenticeship approach to speed up convergence in classic reinforcement learning problems. In this regard, a variant of interactive reinforcement learning is policy shaping, which uses a parent-like trainer to propose the next action to be performed and by doing so reduces the search space by advice. On some occasions, the trainer may be another artificial agent which in turn was trained using reinforcement learning methods to afterward become an advisor for other learner-agents. In this work, we analyze internal representations and characteristics of artificial agents to determine which agent may outperform others to become a better trainer-agent. Using a polymath agent, as compared to a specialist agent, as an advisor leads to a larger reward and faster convergence of the reward signal and also to a more stable behavior in terms of the state visit frequency of the learner-agents. Moreover, we analyze system interaction parameters in order to determine how influential they are in the apprenticeship process, where the consistency of feedback is much more relevant when dealing with different learner obedience parameters. |
Tasks | |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.06879v1 |
http://arxiv.org/pdf/1904.06879v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-interactive-reinforcement-learning |
Repo | |
Framework | |