Paper Group ANR 1629
Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities. Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control. Beating SGD Saturation with Tail-Averaging and Minibatching. Parsing All: Syntax and Semantics, Dependencies …
Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities
Title | Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities |
Authors | Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury, Kalika Bali |
Abstract | In this paper, we examine and analyze the challenges associated with developing and introducing language technologies to low-resource language communities. While doing so, we bring to light the successes and failures of past work in this area, the challenges faced in that work, and what it has achieved. Throughout this paper, we take a problem-facing approach and describe the essential factors on which the success of such technologies hinges. We present the various aspects in a manner that clarifies and lays out the different tasks involved, which can aid organizations looking to make an impact in this area. We take the example of Gondi, an extremely low-resource Indian language, to reinforce and complement our discussion. |
Tasks | |
Published | 2019-12-07 |
URL | https://arxiv.org/abs/1912.03457v1 |
https://arxiv.org/pdf/1912.03457v1.pdf | |
PWC | https://paperswithcode.com/paper/unsung-challenges-of-building-and-deploying |
Repo | |
Framework | |
Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control
Title | Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control |
Authors | Sisi Li, Nan Li, Anouck Girard, Ilya Kolmanovsky |
Abstract | In this paper, we describe an integrated framework for autonomous decision making in a dynamic and interactive environment. We model the interactions between the ego agent and its operating environment as a two-player dynamic game, and integrate cognitive behavioral models, Bayesian inference, and receding-horizon optimal control to define a dynamically-evolving decision strategy for the ego agent. Simulation examples representing autonomous vehicle control in three traffic scenarios where the autonomous ego vehicle interacts with a human-driven vehicle are reported. |
Tasks | Autonomous Driving, Bayesian Inference, Decision Making |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04005v3 |
https://arxiv.org/pdf/1908.04005v3.pdf | |
PWC | https://paperswithcode.com/paper/decision-making-in-dynamic-and-interactive |
Repo | |
Framework | |
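The Bayesian inference step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the ego agent keeps a belief over a small set of opponent reasoning levels and updates it after each observed action. The function name `update_level_belief`, the dictionary layout, and the example likelihoods are hypothetical and are not taken from the paper.

```python
def update_level_belief(prior, action_likelihoods):
    """One Bayes update of the belief over the other driver's reasoning level.

    prior: dict mapping level k -> P(k) before the observation.
    action_likelihoods: dict mapping level k -> P(observed action | k),
        as predicted by a level-k behavioral model (assumed given here).
    """
    unnormalized = {k: prior[k] * action_likelihoods[k] for k in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every model: keep the prior.
        return dict(prior)
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical example: two candidate levels, and the observed action is
# four times more likely under the level-1 model than under level-0.
belief = update_level_belief({0: 0.5, 1: 0.5}, {0: 0.2, 1: 0.8})
```

In a receding-horizon controller, an updated belief of this kind would weight the predicted opponent behavior at the next planning step.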
Beating SGD Saturation with Tail-Averaging and Minibatching
Title | Beating SGD Saturation with Tail-Averaging and Minibatching |
Authors | Nicole Mücke, Gergely Neu, Lorenzo Rosasco |
Abstract | While stochastic gradient descent (SGD) is one of the major workhorses in machine learning, the learning properties of many practically used variants are poorly understood. In this paper, we consider least squares learning in a nonparametric setting and contribute to filling this gap by focusing on the effect and interplay of multiple passes, mini-batching and averaging, and in particular tail averaging. Our results show how these different variants of SGD can be combined to achieve optimal learning errors, hence providing practical insights. In particular, we show for the first time in the literature that tail averaging allows faster convergence rates than uniform averaging in the nonparametric setting. Finally, we show that a combination of tail-averaging and minibatching allows more aggressive step-size choices than using any one of said components. |
Tasks | |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08668v2 |
https://arxiv.org/pdf/1902.08668v2.pdf | |
PWC | https://paperswithcode.com/paper/beating-sgd-saturation-with-tail-averaging |
Repo | |
Framework | |
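To make the difference between uniform averaging and tail averaging concrete, here is a toy sketch on one-dimensional least squares. The paper studies a nonparametric setting, so this is only an illustration of the averaging schemes. The function name, the `tail_fraction` parameter, and the data are assumptions made for the example.

```python
import random


def tail_averaged_sgd(xs, ys, step_size, n_passes, batch_size,
                      tail_fraction=0.5, seed=0):
    """Minibatch SGD for the model y ~ w * x with squared loss.

    Returns the last iterate, the uniform average of all iterates, and the
    tail average (mean of the final `tail_fraction` share of iterates).
    """
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    iterates = []
    for _ in range(n_passes):
        order = list(range(n))
        rng.shuffle(order)  # one pass visits every sample once
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            # Minibatch gradient of 0.5 * (w * x - y)^2 with respect to w.
            grad = sum((w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= step_size * grad
            iterates.append(w)
    tail_start = int(len(iterates) * (1.0 - tail_fraction))
    tail = iterates[tail_start:]
    return {
        "last": w,
        "uniform": sum(iterates) / len(iterates),
        "tail": sum(tail) / len(tail),
    }


# Noise-free toy data with true slope 2.
xs = [i / 10 for i in range(1, 21)]
ys = [2.0 * x for x in xs]
result = tail_averaged_sgd(xs, ys, step_size=0.1, n_passes=5, batch_size=4)
```

On this toy problem the uniform average is pulled toward the starting point by the early iterates, while the tail average discards them, which is the intuition behind preferring tail averaging.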
Parsing All: Syntax and Semantics, Dependencies and Spans
Title | Parsing All: Syntax and Semantics, Dependencies and Spans |
Authors | Junru Zhou, Zuchao Li, Hai Zhao |
Abstract | Both syntactic and semantic structures are key linguistic contextual clues, and parsing the latter has been shown to benefit from parsing the former. However, few works have attempted to let semantic parsing help syntactic parsing. As linguistic representation formalisms, both syntax and semantics may be represented in either span (constituent/phrase) or dependency form, and joint learning over both has also seldom been explored. In this paper, we propose a novel joint model of syntactic and semantic parsing on both span and dependency representations, which incorporates syntactic information effectively in the encoder of the neural network and benefits from the two representation formalisms in a uniform way. The experiments show that semantics and syntax can benefit each other by optimizing joint objectives. Our single model achieves new state-of-the-art or competitive results on both span and dependency semantic parsing on Propbank benchmarks and both dependency and constituent syntactic parsing on Penn Treebank. |
Tasks | Semantic Parsing |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11522v1 |
https://arxiv.org/pdf/1908.11522v1.pdf | |
PWC | https://paperswithcode.com/paper/parsing-all-syntax-and-semantics-dependencies |
Repo | |
Framework | |
Dual Network Architecture for Few-view CT – Trained on ImageNet Data and Transferred for Medical Imaging
Title | Dual Network Architecture for Few-view CT – Trained on ImageNet Data and Transferred for Medical Imaging |
Authors | Huidong Xie, Hongming Shan, Wenxiang Cong, Xiaohua Zhang, Shaohua Liu, Ruola Ning, Ge Wang |
Abstract | X-ray computed tomography (CT) reconstructs cross-sectional images from projection data. However, ionizing X-ray radiation associated with CT scanning might induce cancer and genetic damage. Therefore, the reduction of radiation dose has attracted major attention. Few-view CT image reconstruction is an important topic to reduce the radiation dose. Recently, data-driven algorithms have shown great potential to solve the few-view CT problem. In this paper, we develop a dual network architecture (DNA) for reconstructing images directly from sinograms. In the proposed DNA method, a point-based fully-connected layer learns the backprojection process, requiring significantly less memory than prior art. The proposed method uses O(C × N × N_c) parameters, where N and N_c denote the dimension of reconstructed images and the number of projections, respectively. C is an adjustable parameter that can be set as low as 1. Our experimental results demonstrate that DNA produces competitive performance compared with other state-of-the-art methods. Interestingly, natural images can be used to pre-train DNA to avoid overfitting when the amount of real patient images is limited. |
Tasks | Computed Tomography (CT), Image Reconstruction |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01262v6 |
https://arxiv.org/pdf/1907.01262v6.pdf | |
PWC | https://paperswithcode.com/paper/dual-network-architecture-for-few-view-ct |
Repo | |
Framework | |
Harvesting Visual Objects from Internet Images via Deep Learning Based Objectness Assessment
Title | Harvesting Visual Objects from Internet Images via Deep Learning Based Objectness Assessment |
Authors | Kan Wu, Guanbin Li, Haofeng Li, Jianjun Zhang, Yizhou Yu |
Abstract | The collection of internet images has been growing at an astonishing speed. There is no doubt that these images contain rich visual information that can be useful in many applications, such as visual media creation and data-driven image synthesis. In this paper, we focus on the methodologies for building a visual object database from a collection of internet images. Such a database is built to contain a large number of high-quality visual objects that can help with various data-driven image applications. Our method is based on dense proposal generation and objectness-based re-ranking. A novel deep convolutional neural network is designed for the inference of proposal objectness, the probability of a proposal containing an optimally-located foreground object. In our work, the objectness is quantitatively measured in terms of completeness and fullness, reflecting two complementary features of an optimal proposal: a complete foreground and a relatively small background. Our experiments indicate that object proposals re-ranked according to the output of our network generally achieve higher performance than those produced by other state-of-the-art methods. As a concrete example, a database of over 1.2 million visual objects has been built using the proposed method, and has been successfully used in various data-driven image applications. |
Tasks | Image Generation |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00641v1 |
http://arxiv.org/pdf/1904.00641v1.pdf | |
PWC | https://paperswithcode.com/paper/harvesting-visual-objects-from-internet |
Repo | |
Framework | |
Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic
Title | Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic |
Authors | Zhengwei Bai, Baigen Cai, Wei Shangguan, Linguo Chai |
Abstract | High-level driving behavior decision-making is an open and challenging problem for connected vehicle technology, especially in heterogeneous traffic scenarios. In this paper, a deep reinforcement learning based high-level driving behavior decision-making approach is proposed for connected vehicles in heterogeneous traffic situations. The model is composed of three main parts: a data preprocessor that maps hybrid data into a data format called hyper-grid matrix, a two-stream deep neural network that extracts the hidden features, and a deep reinforcement learning network that learns the optimal policy. Moreover, a simulation environment, which includes different heterogeneous traffic scenarios, is built to train and test the proposed method. The results demonstrate that the model has the capability to learn the optimal high-level driving policy, such as driving fast through heterogeneous traffic without unnecessary lane changes. Furthermore, two separate models are compared with the proposed model, and their performance is analyzed in detail. |
Tasks | Decision Making |
Published | 2019-02-15 |
URL | http://arxiv.org/abs/1902.05772v2 |
http://arxiv.org/pdf/1902.05772v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-based-high-level |
Repo | |
Framework | |
Putting Ridesharing to the Test: Efficient and Scalable Solutions and the Power of Dynamic Vehicle Relocation
Title | Putting Ridesharing to the Test: Efficient and Scalable Solutions and the Power of Dynamic Vehicle Relocation |
Authors | Panayiotis Danassis, Marija Sakota, Aris Filos-Ratsikas, Boi Faltings |
Abstract | We perform a systematic evaluation of a diverse set of algorithms for the ridesharing problem which is, to the best of our knowledge, one of the largest and most comprehensive to date. In particular, we evaluate 12 different algorithms over 12 metrics related to global efficiency, complexity, passenger, driver, and platform incentives. Our evaluation setting is specifically designed to resemble reality as closely as possible. We achieve this by (a) using actual data from NYC’s yellow taxi trip records, both for modeling customer requests and taxis, (b) following closely the pricing model employed by ridesharing platforms, and (c) running our simulations at the scale of the actual problem faced by the ridesharing platforms. Our results provide a clear-cut recommendation to ridesharing platforms on which solutions can be employed in practice and demonstrate the large potential for efficiency gains. Moreover, we show that simple, lightweight relocation schemes – which can be used as independent components to any ridesharing algorithm – can significantly improve Quality of Service metrics by up to 50%. As a highlight of our findings, we identify a scalable, on-device heuristic that offers an efficient, end-to-end solution for the Dynamic Ridesharing and Fleet Relocation problem. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08066v2 |
https://arxiv.org/pdf/1912.08066v2.pdf | |
PWC | https://paperswithcode.com/paper/putting-ridesharing-to-the-test-efficient-and |
Repo | |
Framework | |
OSVNet: Convolutional Siamese Network for Writer Independent Online Signature Verification
Title | OSVNet: Convolutional Siamese Network for Writer Independent Online Signature Verification |
Authors | Chandra Sekhar, Prerana Mukherjee, Devanur S Guru, Viswanath Pulabaigari |
Abstract | Online signature verification (OSV) is one of the most challenging tasks in writer identification and digital forensics. Owing to the large intra-individual variability, there is a critical requirement to accurately learn the intra-personal variations of the signature to achieve higher classification accuracy. To achieve this, in this paper, we propose an OSV framework based on a deep convolutional Siamese network (DCSN). DCSN automatically extracts robust feature descriptions using a metric-based loss function that decreases intra-writer variability (Genuine-Genuine) and increases inter-individual variability (Genuine-Forgery), directing the DCSN toward effective discriminative representation learning for online signatures; we also extend it to a one-shot learning framework. Comprehensive experimentation conducted on three widely accepted benchmark datasets, MCYT-100 (DB1), MCYT-330 (DB2) and SVC-2004-Task2, demonstrates the capability of our framework to distinguish genuine and forged samples. Experimental results confirm the efficiency of the deep convolutional Siamese network based OSV, which achieves a lower error rate than many recent and state-of-the-art OSV techniques. |
Tasks | One-Shot Learning, Representation Learning |
Published | 2019-03-30 |
URL | https://arxiv.org/abs/1904.00240v2 |
https://arxiv.org/pdf/1904.00240v2.pdf | |
PWC | https://paperswithcode.com/paper/osvnet-convolutional-siamese-network-for |
Repo | |
Framework | |
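The abstract above describes a metric-based loss that pulls genuine pairs together and pushes genuine-forgery pairs apart, without naming the exact loss. The sketch below uses the classic contrastive loss as a stand-in, which is an assumption and may differ from the loss used in OSVNet. The function name and the `margin` parameter are hypothetical.

```python
import math


def contrastive_loss(emb_a, emb_b, same_writer, margin=1.0):
    """Contrastive loss for one pair of signature embeddings.

    emb_a, emb_b: sequences of floats produced by the two Siamese branches.
    same_writer: True for a Genuine-Genuine pair, False for Genuine-Forgery.
    margin: distance beyond which a dissimilar pair incurs no loss.
    """
    distance = math.dist(emb_a, emb_b)  # Euclidean distance (Python 3.8+)
    if same_writer:
        # Genuine pairs are penalized for being far apart.
        return distance ** 2
    # Forgery pairs are penalized only when closer than the margin.
    return max(0.0, margin - distance) ** 2
```

At verification time, a Siamese model of this kind would typically accept a query signature when its embedding distance to a reference falls below a tuned threshold.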
Cellular Traffic Prediction and Classification: a comparative evaluation of LSTM and ARIMA
Title | Cellular Traffic Prediction and Classification: a comparative evaluation of LSTM and ARIMA |
Authors | Amin Azari, Panagiotis Papapetrou, Stojan Denic, Gunnar Peters |
Abstract | Prediction of user traffic in cellular networks has attracted profound attention for improving resource utilization. In this paper, we study the problem of network traffic prediction and classification by employing standard machine learning and statistical learning time series prediction methods, including long short-term memory (LSTM) and autoregressive integrated moving average (ARIMA), respectively. We present an extensive experimental evaluation of the designed tools over a real network traffic dataset. Within this analysis, we explore the impact of different parameters on the effectiveness of the predictions. We further extend our analysis to the problem of network traffic classification and prediction of traffic bursts. The results, on the one hand, demonstrate superior performance of LSTM over ARIMA in general, especially when the training time series is long enough and is augmented by a wisely selected set of features. On the other hand, the results shed light on the circumstances in which ARIMA performs close to optimal with lower complexity. |
Tasks | Time Series, Time Series Prediction, Traffic Prediction |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00939v1 |
https://arxiv.org/pdf/1906.00939v1.pdf | |
PWC | https://paperswithcode.com/paper/190600939 |
Repo | |
Framework | |
Towards calibrated and scalable uncertainty representations for neural networks
Title | Towards calibrated and scalable uncertainty representations for neural networks |
Authors | Nabeel Seedat, Christopher Kanan |
Abstract | For many applications it is critical to know the uncertainty of a neural network’s predictions. While a variety of neural network parameter estimation methods have been proposed for uncertainty estimation, they have not been rigorously compared across uncertainty measures. We assess four of these parameter estimation methods to calibrate uncertainty estimation using four different uncertainty measures: entropy, mutual information, aleatoric uncertainty and epistemic uncertainty. We evaluate the calibration of these parameter estimation methods using expected calibration error. Additionally, we propose a novel method of neural network parameter estimation called RECAST, which combines cosine annealing with warm restarts with Stochastic Gradient Langevin Dynamics, capturing more diverse parameter distributions. When benchmarked against mutilated image data, we show that RECAST is well-calibrated and when combined with predictive entropy and epistemic uncertainty it offers the best calibrated measure of uncertainty when compared to recent methods. |
Tasks | Calibration |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1911.00104v3 |
https://arxiv.org/pdf/1911.00104v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-calibrated-and-scalable-uncertainty |
Repo | |
Framework | |
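Expected calibration error, the metric named in the abstract above, can be sketched in a few lines. This is the common equal-width binning variant, with bins closed on the left. The function name and the default of 10 bins are assumptions, and the paper's exact evaluation protocol may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidences: predicted probabilities of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A model whose 75%-confidence predictions are right three times out of four contributes no error in that bin, while one that is fully confident and always wrong has an error of 1.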
A Mobile Robot Generating Video Summaries of Seniors’ Indoor Activities
Title | A Mobile Robot Generating Video Summaries of Seniors’ Indoor Activities |
Authors | Chih-Yuan Yang, Heeseung Yun, Srenavis Varadaraj, Jane Yung-jen Hsu |
Abstract | We develop a system which generates summaries from seniors’ indoor-activity videos captured by a social robot to help remote family members know their seniors’ daily activities at home. Unlike the traditional video summarization datasets, indoor videos captured from a moving robot pose additional challenges, namely, (i) the video sequences are very long, (ii) a significant number of video frames contain no subject or contain subjects at ill-posed locations and scales, and (iii) most of the well-posed frames contain highly redundant information. To address this problem, we propose to exploit pose estimation for detecting people in frames. This guides the robot to follow the user and capture effective videos. We use person identification to distinguish a target senior from other people. We also make use of action recognition to analyze seniors’ major activities at different moments, and develop a video summarization method to select diverse and representative keyframes as summaries. |
Tasks | Human Detection, Person Identification, Pose Estimation, Video Summarization |
Published | 2019-01-30 |
URL | https://arxiv.org/abs/1901.10713v2 |
https://arxiv.org/pdf/1901.10713v2.pdf | |
PWC | https://paperswithcode.com/paper/video-summarization-through-human-detection |
Repo | |
Framework | |
A New Statistical Approach for Comparing Algorithms for Lexicon Based Sentiment Analysis
Title | A New Statistical Approach for Comparing Algorithms for Lexicon Based Sentiment Analysis |
Authors | Mateus Machado, Evandro Ruiz, Kuruvilla Joseph Abraham |
Abstract | Lexicon based sentiment analysis usually relies on the identification of various words to which a numerical value corresponding to sentiment can be assigned. In principle, classifiers can be obtained from these algorithms by comparison with human annotation, which is considered the gold standard. In practise this is difficult in languages such as Portuguese where there is a paucity of human annotated texts. Thus in order to compare algorithms, a next best step is to directly compare different algorithms with each other without referring to human annotation. In this paper we develop methods for a statistical comparison of algorithms which does not rely on human annotation or on known class labels. We will motivate the use of marginal homogeneity tests, as well as log linear models within the framework of maximum likelihood estimation. We will also show how some uncertainties present in lexicon based sentiment analysis may be similar to those which occur in human annotated tweets. We will also show how the variability in the output of different algorithms is lexicon dependent, and quantify this variability in the output within the framework of log linear models. |
Tasks | Sentiment Analysis |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08717v1 |
https://arxiv.org/pdf/1906.08717v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-statistical-approach-for-comparing |
Repo | |
Framework | |
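As a rough illustration of comparing two sentiment algorithms without gold labels, the sketch below runs McNemar's test, which is the two-category special case of a marginal homogeneity test. The abstract above does not specify which tests the authors use, so this choice, the function name, and the label strings are assumptions.

```python
import math


def mcnemar_test(labels_a, labels_b, positive="pos"):
    """McNemar's chi-square test (no continuity correction) on paired labels.

    labels_a, labels_b: outputs of two algorithms on the same texts.
    Returns (statistic, p_value) for the null hypothesis that both
    algorithms assign the positive label at the same marginal rate.
    """
    # Discordant pairs: the two algorithms disagree on the text.
    only_a = sum(1 for a, b in zip(labels_a, labels_b)
                 if a == positive and b != positive)
    only_b = sum(1 for a, b in zip(labels_a, labels_b)
                 if a != positive and b == positive)
    if only_a + only_b == 0:
        return 0.0, 1.0  # no disagreement, so no evidence of a difference
    statistic = (only_a - only_b) ** 2 / (only_a + only_b)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

With more than two sentiment classes, a multi-category marginal homogeneity test would replace this binary version.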
TextScanner: Reading Characters in Order for Robust Scene Text Recognition
Title | TextScanner: Reading Characters in Order for Robust Scene Text Recognition |
Authors | Zhaoyi Wan, Minghang He, Haoran Chen, Xiang Bai, Cong Yao |
Abstract | Driven by deep learning and the large volume of data, scene text recognition has evolved rapidly in recent years. Previously, RNN-attention based methods dominated this field, but they suffer from the problem of attention drift in certain situations. Lately, semantic segmentation based algorithms have proven effective at recognizing text of different forms (horizontal, oriented and curved). However, these methods may produce spurious characters or miss genuine characters, as they rely heavily on a thresholding procedure operated on segmentation maps. To tackle these challenges, we propose in this paper an alternative approach, called TextScanner, for scene text recognition. TextScanner bears three characteristics: (1) Basically, it belongs to the semantic segmentation family, as it generates pixel-wise, multi-channel segmentation maps for character class, position and order; (2) Meanwhile, akin to RNN-attention based methods, it also adopts RNN for context modeling; (3) Moreover, it performs parallel prediction for character position and class, and ensures that characters are transcribed in the correct order. The experiments on standard benchmark datasets demonstrate that TextScanner outperforms the state-of-the-art methods. Moreover, TextScanner shows its superiority in recognizing more difficult text such as Chinese transcripts and aligning with target characters. |
Tasks | Scene Text Recognition, Semantic Segmentation |
Published | 2019-12-28 |
URL | https://arxiv.org/abs/1912.12422v2 |
https://arxiv.org/pdf/1912.12422v2.pdf | |
PWC | https://paperswithcode.com/paper/textscanner-reading-characters-in-order-for |
Repo | |
Framework | |
Investigating Under and Overfitting in Wasserstein Generative Adversarial Networks
Title | Investigating Under and Overfitting in Wasserstein Generative Adversarial Networks |
Authors | Ben Adlam, Charles Weill, Amol Kapoor |
Abstract | We investigate under and overfitting in Generative Adversarial Networks (GANs), using discriminators unseen by the generator to measure generalization. We find that the model capacity of the discriminator has a significant effect on the generator’s model quality, and that the generator’s poor performance coincides with the discriminator underfitting. Contrary to our expectations, we find that generators with large model capacities relative to the discriminator do not show evidence of overfitting on CIFAR10, CIFAR100, and CelebA. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14137v1 |
https://arxiv.org/pdf/1910.14137v1.pdf | |
PWC | https://paperswithcode.com/paper/investigating-under-and-overfitting-in |
Repo | |
Framework | |