January 25, 2020


Paper Group ANR 1629

Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities. Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control. Beating SGD Saturation with Tail-Averaging and Minibatching. Parsing All: Syntax and Semantics, Dependencies …

Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities


Title	Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities
Authors	Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury, Kalika Bali
Abstract	In this paper, we examine and analyze the challenges associated with developing and introducing language technologies to low-resource language communities. While doing so, we bring to light the successes and failures of past work in this area, challenges being faced in doing so, and what they have achieved. Throughout this paper, we take a problem-facing approach and describe essential factors which the success of such technologies hinges upon. We present the various aspects in a manner which clarifies and lays out the different tasks involved, which can aid organizations looking to make an impact in this area. We take the example of Gondi, an extremely-low resource Indian language, to reinforce and complement our discussion.
Tasks
Published	2019-12-07
URL	https://arxiv.org/abs/1912.03457v1
PDF	https://arxiv.org/pdf/1912.03457v1.pdf
PWC	https://paperswithcode.com/paper/unsung-challenges-of-building-and-deploying
Repo
Framework

Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control


Title	Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control
Authors	Sisi Li, Nan Li, Anouck Girard, Ilya Kolmanovsky
Abstract	In this paper, we describe an integrated framework for autonomous decision making in a dynamic and interactive environment. We model the interactions between the ego agent and its operating environment as a two-player dynamic game, and integrate cognitive behavioral models, Bayesian inference, and receding-horizon optimal control to define a dynamically-evolving decision strategy for the ego agent. Simulation examples representing autonomous vehicle control in three traffic scenarios where the autonomous ego vehicle interacts with a human-driven vehicle are reported.
Tasks	Autonomous Driving, Bayesian Inference, Decision Making
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04005v3
PDF	https://arxiv.org/pdf/1908.04005v3.pdf
PWC	https://paperswithcode.com/paper/decision-making-in-dynamic-and-interactive
Repo
Framework

Beating SGD Saturation with Tail-Averaging and Minibatching


Title	Beating SGD Saturation with Tail-Averaging and Minibatching
Authors	Nicole Mücke, Gergely Neu, Lorenzo Rosasco
Abstract	While stochastic gradient descent (SGD) is one of the major workhorses in machine learning, the learning properties of many practically used variants are poorly understood. In this paper, we consider least squares learning in a nonparametric setting and contribute to filling this gap by focusing on the effect and interplay of multiple passes, mini-batching and averaging, and in particular tail averaging. Our results show how these different variants of SGD can be combined to achieve optimal learning errors, hence providing practical insights. In particular, we show for the first time in the literature that tail averaging allows faster convergence rates than uniform averaging in the nonparametric setting. Finally, we show that a combination of tail-averaging and minibatching allows more aggressive step-size choices than using any one of said components.
Tasks
Published	2019-02-22
URL	https://arxiv.org/abs/1902.08668v2
PDF	https://arxiv.org/pdf/1902.08668v2.pdf
PWC	https://paperswithcode.com/paper/beating-sgd-saturation-with-tail-averaging
Repo
Framework
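
The abstract above combines mini-batching with tail averaging, that is, averaging only the later SGD iterates instead of all of them. A minimal sketch of that combination follows, assuming plain finite-dimensional linear least squares rather than the paper's nonparametric setting. The function name `tail_averaged_sgd` and its parameters are hypothetical, and this is not the authors' code.

```python
import random


def tail_averaged_sgd(X, y, step_size, n_steps, batch_size, tail_start, seed=0):
    """Minibatch SGD on linear least squares, returning the tail-averaged iterate.

    Illustrative assumption: only the iterates from step `tail_start` onward
    are averaged; earlier iterates are discarded.
    """
    if not 0 <= tail_start < n_steps:
        raise ValueError("tail_start must satisfy 0 <= tail_start < n_steps")
    rng = random.Random(seed)
    dim = len(X[0])
    w = [0.0] * dim
    tail_sum = [0.0] * dim
    for t in range(n_steps):
        # Sample a minibatch of indices uniformly with replacement.
        batch = [rng.randrange(len(X)) for _ in range(batch_size)]
        grad = [0.0] * dim
        for i in batch:
            residual = sum(wj * xj for wj, xj in zip(w, X[i])) - y[i]
            for j in range(dim):
                grad[j] += residual * X[i][j] / batch_size
        # Gradient step with a constant step size.
        w = [wj - step_size * gj for wj, gj in zip(w, grad)]
        # Accumulate only the tail of the trajectory.
        if t >= tail_start:
            tail_sum = [s + wj for s, wj in zip(tail_sum, w)]
    n_tail = n_steps - tail_start
    return [s / n_tail for s in tail_sum]
```

Setting `tail_start=0` gives uniform averaging over all iterates, so the two schemes can be compared by changing one argument.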

Parsing All: Syntax and Semantics, Dependencies and Spans


Title	Parsing All: Syntax and Semantics, Dependencies and Spans
Authors	Junru Zhou, Zuchao Li, Hai Zhao
Abstract	Both syntactic and semantic structures are key linguistic contextual clues, in which parsing the latter has been well shown beneficial from parsing the former. However, few works ever made an attempt to let semantic parsing help syntactic parsing. As linguistic representation formalisms, both syntax and semantics may be represented in either span (constituent/phrase) or dependency, on both of which joint learning was also seldom explored. In this paper, we propose a novel joint model of syntactic and semantic parsing on both span and dependency representations, which incorporates syntactic information effectively in the encoder of neural network and benefits from two representation formalisms in a uniform way. The experiments show that semantics and syntax can benefit each other by optimizing joint objectives. Our single model achieves new state-of-the-art or competitive results on both span and dependency semantic parsing on Propbank benchmarks and both dependency and constituent syntactic parsing on Penn Treebank.
Tasks	Semantic Parsing
Published	2019-08-30
URL	https://arxiv.org/abs/1908.11522v1
PDF	https://arxiv.org/pdf/1908.11522v1.pdf
PWC	https://paperswithcode.com/paper/parsing-all-syntax-and-semantics-dependencies
Repo
Framework

Dual Network Architecture for Few-view CT – Trained on ImageNet Data and Transferred for Medical Imaging


Title	Dual Network Architecture for Few-view CT – Trained on ImageNet Data and Transferred for Medical Imaging
Authors	Huidong Xie, Hongming Shan, Wenxiang Cong, Xiaohua Zhang, Shaohua Liu, Ruola Ning, Ge Wang
Abstract	X-ray computed tomography (CT) reconstructs cross-sectional images from projection data. However, ionizing X-ray radiation associated with CT scanning might induce cancer and genetic damage. Therefore, the reduction of radiation dose has attracted major attention. Few-view CT image reconstruction is an important topic to reduce the radiation dose. Recently, data-driven algorithms have shown great potential to solve the few-view CT problem. In this paper, we develop a dual network architecture (DNA) for reconstructing images directly from sinograms. In the proposed DNA method, a point-based fully-connected layer learns the backprojection process requesting significantly less memory than the prior arts do. Proposed method uses O(C × N × N_c) parameters where N and N_c denote the dimension of reconstructed images and number of projections respectively. C is an adjustable parameter that can be set as low as 1. Our experimental results demonstrate that DNA produces a competitive performance over the other state-of-the-art methods. Interestingly, natural images can be used to pre-train DNA to avoid overfitting when the amount of real patient images is limited.
Tasks	Computed Tomography (CT), Image Reconstruction
Published	2019-07-02
URL	https://arxiv.org/abs/1907.01262v6
PDF	https://arxiv.org/pdf/1907.01262v6.pdf
PWC	https://paperswithcode.com/paper/dual-network-architecture-for-few-view-ct
Repo
Framework

Harvesting Visual Objects from Internet Images via Deep Learning Based Objectness Assessment


Title	Harvesting Visual Objects from Internet Images via Deep Learning Based Objectness Assessment
Authors	Kan Wu, Guanbin Li, Haofeng Li, Jianjun Zhang, Yizhou Yu
Abstract	The collection of internet images has been growing at an astonishing speed. It is undoubted that these images contain rich visual information that can be useful in many applications, such as visual media creation and data-driven image synthesis. In this paper, we focus on the methodologies for building a visual object database from a collection of internet images. Such a database is built to contain a large number of high-quality visual objects that can help with various data-driven image applications. Our method is based on dense proposal generation and objectness-based re-ranking. A novel deep convolutional neural network is designed for the inference of proposal objectness, the probability of a proposal containing optimally-located foreground object. In our work, the objectness is quantitatively measured in regard of completeness and fullness, reflecting two complementary features of an optimal proposal: a complete foreground and relatively small background. Our experiments indicate that object proposals re-ranked according to the output of our network generally achieve higher performance than those produced by other state-of-the-art methods. As a concrete example, a database of over 1.2 million visual objects has been built using the proposed method, and has been successfully used in various data-driven image applications.
Tasks	Image Generation
Published	2019-04-01
URL	http://arxiv.org/abs/1904.00641v1
PDF	http://arxiv.org/pdf/1904.00641v1.pdf
PWC	https://paperswithcode.com/paper/harvesting-visual-objects-from-internet
Repo
Framework

Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic


Title	Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic
Authors	Zhengwei Bai, Baigen Cai, Wei Shangguan, Linguo Chai
Abstract	High-level driving behavior decision-making is an open-challenging problem for connected vehicle technology, especially in heterogeneous traffic scenarios. In this paper, a deep reinforcement learning based high-level driving behavior decision-making approach is proposed for connected vehicle in heterogeneous traffic situations. The model is composed of three main parts: a data preprocessor that maps hybrid data into a data format called hyper-grid matrix, a two-stream deep neural network that extracts the hidden features, and a deep reinforcement learning network that learns the optimal policy. Moreover, a simulation environment, which includes different heterogeneous traffic scenarios, is built to train and test the proposed method. The results demonstrate that the model has the capability to learn the optimal high-level driving policy such as driving fast through heterogeneous traffic without unnecessary lane changes. Furthermore, two separate models are used to compare with the proposed model, and the performances are analyzed in detail.
Tasks	Decision Making
Published	2019-02-15
URL	http://arxiv.org/abs/1902.05772v2
PDF	http://arxiv.org/pdf/1902.05772v2.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-based-high-level
Repo
Framework

Putting Ridesharing to the Test: Efficient and Scalable Solutions and the Power of Dynamic Vehicle Relocation


Title	Putting Ridesharing to the Test: Efficient and Scalable Solutions and the Power of Dynamic Vehicle Relocation
Authors	Panayiotis Danassis, Marija Sakota, Aris Filos-Ratsikas, Boi Faltings
Abstract	We perform a systematic evaluation of a diverse set of algorithms for the ridesharing problem which is, to the best of our knowledge, one of the largest and most comprehensive to date. In particular, we evaluate 12 different algorithms over 12 metrics related to global efficiency, complexity, passenger, driver, and platform incentives. Our evaluation setting is specifically designed to resemble reality as closely as possible. We achieve this by (a) using actual data from the NYC’s yellow taxi trip records, both for modeling customer requests, and taxis (b) following closely the pricing model employed by ridesharing platforms and (c) running our simulations to the scale of the actual problem faced by the ridesharing platforms. Our results provide a clear-cut recommendation to ridesharing platforms on which solutions can be employed in practice and demonstrate the large potential for efficiency gains. Moreover, we show that simple, lightweight relocation schemes – which can be used as independent components to any ridesharing algorithm – can significantly improve Quality of Service metrics by up to 50%. As a highlight of our findings, we identify a scalable, on-device heuristic that offers an efficient, end-to-end solution for the Dynamic Ridesharing and Fleet Relocation problem.
Tasks
Published	2019-12-17
URL	https://arxiv.org/abs/1912.08066v2
PDF	https://arxiv.org/pdf/1912.08066v2.pdf
PWC	https://paperswithcode.com/paper/putting-ridesharing-to-the-test-efficient-and
Repo
Framework

OSVNet: Convolutional Siamese Network for Writer Independent Online Signature Verification


Title	OSVNet: Convolutional Siamese Network for Writer Independent Online Signature Verification
Authors	Chandra Sekhar, Prerana Mukherjee, Devanur S Guru, Viswanath Pulabaigari
Abstract	Online signature verification (OSV) is one of the most challenging tasks in writer identification and digital forensics. Owing to the large intra-individual variability, there is a critical requirement to accurately learn the intra-personal variations of the signature to achieve higher classification accuracy. To achieve this, in this paper, we propose an OSV framework based on deep convolutional Siamese network (DCSN). DCSN automatically extracts robust feature descriptions based on metric-based loss function which decreases intra-writer variability (Genuine-Genuine) and increases inter-individual variability (Genuine-Forgery) and directs the DCSN for effective discriminative representation learning for online signatures and extend it for one shot learning framework. Comprehensive experimentation conducted on three widely accepted benchmark datasets MCYT-100 (DB1), MCYT-330 (DB2) and SVC-2004-Task2 demonstrate the capability of our framework to distinguish the genuine and forgery samples. Experimental results confirm the efficiency of deep convolutional Siamese network based OSV by achieving a lower error rate as compared to many recent and state-of-the-art OSV techniques.
Tasks	One-Shot Learning, Representation Learning
Published	2019-03-30
URL	https://arxiv.org/abs/1904.00240v2
PDF	https://arxiv.org/pdf/1904.00240v2.pdf
PWC	https://paperswithcode.com/paper/osvnet-convolutional-siamese-network-for
Repo
Framework

Cellular Traffic Prediction and Classification: a comparative evaluation of LSTM and ARIMA


Title	Cellular Traffic Prediction and Classification: a comparative evaluation of LSTM and ARIMA
Authors	Amin Azari, Panagiotis Papapetrou, Stojan Denic, Gunnar Peters
Abstract	Prediction of user traffic in cellular networks has attracted profound attention for improving resource utilization. In this paper, we study the problem of network traffic prediction and classification by employing standard machine learning and statistical learning time series prediction methods, including long short-term memory (LSTM) and autoregressive integrated moving average (ARIMA), respectively. We present an extensive experimental evaluation of the designed tools over a real network traffic dataset. Within this analysis, we explore the impact of different parameters to the effectiveness of the predictions. We further extend our analysis to the problem of network traffic classification and prediction of traffic bursts. The results, on the one hand, demonstrate superior performance of LSTM over ARIMA in general, especially when the length of the training time series is high enough, and it is augmented by a wisely-selected set of features. On the other hand, the results shed light on the circumstances in which ARIMA performs close to the optimal with lower complexity.
Tasks	Time Series, Time Series Prediction, Traffic Prediction
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00939v1
PDF	https://arxiv.org/pdf/1906.00939v1.pdf
PWC	https://paperswithcode.com/paper/190600939
Repo
Framework

Towards calibrated and scalable uncertainty representations for neural networks


Title	Towards calibrated and scalable uncertainty representations for neural networks
Authors	Nabeel Seedat, Christopher Kanan
Abstract	For many applications it is critical to know the uncertainty of a neural network’s predictions. While a variety of neural network parameter estimation methods have been proposed for uncertainty estimation, they have not been rigorously compared across uncertainty measures. We assess four of these parameter estimation methods to calibrate uncertainty estimation using four different uncertainty measures: entropy, mutual information, aleatoric uncertainty and epistemic uncertainty. We evaluate the calibration of these parameter estimation methods using expected calibration error. Additionally, we propose a novel method of neural network parameter estimation called RECAST, which combines cosine annealing with warm restarts with Stochastic Gradient Langevin Dynamics, capturing more diverse parameter distributions. When benchmarked against mutilated image data, we show that RECAST is well-calibrated and when combined with predictive entropy and epistemic uncertainty it offers the best calibrated measure of uncertainty when compared to recent methods.
Tasks	Calibration
Published	2019-10-28
URL	https://arxiv.org/abs/1911.00104v3
PDF	https://arxiv.org/pdf/1911.00104v3.pdf
PWC	https://paperswithcode.com/paper/towards-calibrated-and-scalable-uncertainty
Repo
Framework
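
The abstract above evaluates calibration using expected calibration error (ECE). As a rough illustration of that metric, the sketch below bins predictions by confidence and sums the gaps between accuracy and mean confidence, weighted by bin size. The equal-width binning, the default of 10 bins, and the function name `expected_calibration_error` are assumptions for illustration; the abstract does not state the authors' exact setup.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    `confidences` are predicted probabilities in [0, 1] for the chosen class;
    `correct` are booleans saying whether each prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Assumed convention: bins are [k/n_bins, (k+1)/n_bins), with 1.0 in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A model that reports 0.95 confidence but is right only three times out of four has a gap of about 0.2 in that bin, which is the kind of overconfidence a lower ECE rules out.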

A Mobile Robot Generating Video Summaries of Seniors’ Indoor Activities


Title	A Mobile Robot Generating Video Summaries of Seniors’ Indoor Activities
Authors	Chih-Yuan Yang, Heeseung Yun, Srenavis Varadaraj, Jane Yung-jen Hsu
Abstract	We develop a system which generates summaries from seniors’ indoor-activity videos captured by a social robot to help remote family members know their seniors’ daily activities at home. Unlike the traditional video summarization datasets, indoor videos captured from a moving robot pose additional challenges, namely, (i) the video sequences are very long (ii) a significant number of video-frames contain no-subject or with subjects at ill-posed locations and scales (iii) most of the well-posed frames contain highly redundant information. To address this problem, we propose to exploit pose estimation for detecting people in frames. This guides the robot to follow the user and capture effective videos. We use person identification to distinguish a target senior from other people. We also make use of action recognition to analyze seniors’ major activities at different moments, and develop a video summarization method to select diverse and representative keyframes as summaries.
Tasks	Human Detection, Person Identification, Pose Estimation, Video Summarization
Published	2019-01-30
URL	https://arxiv.org/abs/1901.10713v2
PDF	https://arxiv.org/pdf/1901.10713v2.pdf
PWC	https://paperswithcode.com/paper/video-summarization-through-human-detection
Repo
Framework

A New Statistical Approach for Comparing Algorithms for Lexicon Based Sentiment Analysis


Title	A New Statistical Approach for Comparing Algorithms for Lexicon Based Sentiment Analysis
Authors	Mateus Machado, Evandro Ruiz, Kuruvilla Joseph Abraham
Abstract	Lexicon based sentiment analysis usually relies on the identification of various words to which a numerical value corresponding to sentiment can be assigned. In principle, classifiers can be obtained from these algorithms by comparison with human annotation, which is considered the gold standard. In practise this is difficult in languages such as Portuguese where there is a paucity of human annotated texts. Thus in order to compare algorithms, a next best step is to directly compare different algorithms with each other without referring to human annotation. In this paper we develop methods for a statistical comparison of algorithms which does not rely on human annotation or on known class labels. We will motivate the use of marginal homogeneity tests, as well as log linear models within the framework of maximum likelihood estimation. We will also show how some uncertainties present in lexicon based sentiment analysis may be similar to those which occur in human annotated tweets. We will also show how the variability in the output of different algorithms is lexicon dependent, and quantify this variability in the output within the framework of log linear models.
Tasks	Sentiment Analysis
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08717v1
PDF	https://arxiv.org/pdf/1906.08717v1.pdf
PWC	https://paperswithcode.com/paper/a-new-statistical-approach-for-comparing
Repo
Framework

TextScanner: Reading Characters in Order for Robust Scene Text Recognition


Title	TextScanner: Reading Characters in Order for Robust Scene Text Recognition
Authors	Zhaoyi Wan, Minghang He, Haoran Chen, Xiang Bai, Cong Yao
Abstract	Driven by deep learning and the large volume of data, scene text recognition has evolved rapidly in recent years. Formerly, RNN-attention based methods have dominated this field, but suffer from the problem of attention drift in certain situations. Lately, semantic segmentation based algorithms have proven effective at recognizing text of different forms (horizontal, oriented and curved). However, these methods may produce spurious characters or miss genuine characters, as they rely heavily on a thresholding procedure operated on segmentation maps. To tackle these challenges, we propose in this paper an alternative approach, called TextScanner, for scene text recognition. TextScanner bears three characteristics: (1) Basically, it belongs to the semantic segmentation family, as it generates pixel-wise, multi-channel segmentation maps for character class, position and order; (2) Meanwhile, akin to RNN-attention based methods, it also adopts RNN for context modeling; (3) Moreover, it performs paralleled prediction for character position and class, and ensures that characters are transcribed in correct order. The experiments on standard benchmark datasets demonstrate that TextScanner outperforms the state-of-the-art methods. Moreover, TextScanner shows its superiority in recognizing more difficult text such as Chinese transcripts and aligning with target characters.
Tasks	Scene Text Recognition, Semantic Segmentation
Published	2019-12-28
URL	https://arxiv.org/abs/1912.12422v2
PDF	https://arxiv.org/pdf/1912.12422v2.pdf
PWC	https://paperswithcode.com/paper/textscanner-reading-characters-in-order-for
Repo
Framework

Investigating Under and Overfitting in Wasserstein Generative Adversarial Networks


Title	Investigating Under and Overfitting in Wasserstein Generative Adversarial Networks
Authors	Ben Adlam, Charles Weill, Amol Kapoor
Abstract	We investigate under and overfitting in Generative Adversarial Networks (GANs), using discriminators unseen by the generator to measure generalization. We find that the model capacity of the discriminator has a significant effect on the generator’s model quality, and that the generator’s poor performance coincides with the discriminator underfitting. Contrary to our expectations, we find that generators with large model capacities relative to the discriminator do not show evidence of overfitting on CIFAR10, CIFAR100, and CelebA.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14137v1
PDF	https://arxiv.org/pdf/1910.14137v1.pdf
PWC	https://paperswithcode.com/paper/investigating-under-and-overfitting-in
Repo
Framework