October 16, 2019

Paper Group ANR 1006

Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media. On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions. Emergence of Addictive Behaviors in Reinforcement Learning Agents. Your Actions or Your Associates? Predicting Certification and Dro …

Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media


Title	Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media
Authors	Agnese Chiatti, Mu Jung Cho, Anupriya Gagneja, Xiao Yang, Miriam Brinberg, Katie Roehrick, Sagnik Ray Choudhury, Nilam Ram, Byron Reeves, C. Lee Giles
Abstract	Daily engagement in life experiences is increasingly interwoven with mobile device use. Screen capture at the scale of seconds is being used in behavioral studies and to implement “just-in-time” health interventions. The increasing psychological breadth of digital information will continue to make the actual screens that people view a preferred if not required source of data about life experiences. Effective and efficient Information Extraction and Retrieval from digital screenshots is a crucial prerequisite to successful use of screen data. In this paper, we present the experimental workflow we exploited to: (i) pre-process a unique collection of screen captures, (ii) extract unstructured text embedded in the images, (iii) organize image text and metadata based on a structured schema, (iv) index the resulting document collection, and (v) allow for Image Retrieval through a dedicated vertical search engine application. The adopted procedure integrates different open source libraries for traditional image processing, Optical Character Recognition (OCR), and Image Retrieval. Our aim is to assess whether and how state-of-the-art methodologies can be applied to this novel data set. We show how combining OpenCV-based pre-processing modules with a Long Short-Term Memory (LSTM) based release of Tesseract OCR, without ad hoc training, led to a 74% character-level accuracy of the extracted text. Further, we used the processed repository as a baseline for a dedicated Image Retrieval system, intended for immediate use by behavioral and prevention scientists. We discuss issues of Text Information Extraction and Retrieval that are particular to the screenshot image case and suggest important future work.
Tasks	Image Retrieval, Optical Character Recognition
Published	2018-01-04
URL	http://arxiv.org/abs/1801.01316v1
PDF	http://arxiv.org/pdf/1801.01316v1.pdf
PWC	https://paperswithcode.com/paper/text-extraction-and-retrieval-from-smartphone
Repo
Framework
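
The abstract above reports a 74% character-level accuracy but does not define the metric. As a rough sketch, one common definition (an assumption here, not necessarily the one used in the paper) is one minus the edit distance between the OCR output and the reference text, divided by the reference length. The function names `levenshtein` and `char_accuracy` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    distance = levenshtein(reference, ocr_output)
    return max(0.0, 1.0 - distance / len(reference))
```

Under this definition a single wrong character in a four-character reference gives an accuracy of 0.75.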

On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions


Title	On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions
Authors	Songtao Lu, Mingyi Hong, Zhengdao Wang
Abstract	Alternating gradient descent (AGD) is a simple but popular algorithm that has been applied to problems in optimization, machine learning, data mining, signal processing, and other areas. The algorithm updates two blocks of variables in an alternating manner: a gradient step is taken on one block while the other block is kept fixed. When the objective function is nonconvex, it is well known that AGD converges to a first-order stationary solution at a global sublinear rate. In this paper, we show that a variant of AGD-type algorithms will not be trapped by “bad” stationary solutions such as saddle points and local maxima. In particular, we consider a smooth unconstrained optimization problem and propose a perturbed AGD (PA-GD) that converges (with high probability) to the set of second-order stationary solutions (SS2) at a global sublinear rate. To the best of our knowledge, this is the first alternating-type algorithm that takes $\mathcal{O}(\text{polylog}(d)/\epsilon^{7/3})$ iterations to achieve SS2 with high probability, where $\text{polylog}(d)$ is a polynomial in the logarithm of the problem dimension $d$.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10418v1
PDF	http://arxiv.org/pdf/1802.10418v1.pdf
PWC	https://paperswithcode.com/paper/on-the-sublinear-convergence-of-randomly
Repo
Framework
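
The idea of perturbing an alternating gradient scheme to escape saddle points can be illustrated with a toy example. The sketch below is a simplification and not the paper's PA-GD: the test function f(x, y) = x²/2 + y⁴/4 - y²/2 (saddle at the origin, minima at y = ±1), the perturbation rule, and all thresholds (`eps`, `radius`, `wait`) are assumptions chosen for illustration.

```python
import math
import random


def grad_x(x, y):
    """Partial derivative of f(x, y) = x**2/2 + y**4/4 - y**2/2 w.r.t. x."""
    return x


def grad_y(x, y):
    """Partial derivative of the same toy function w.r.t. y."""
    return y ** 3 - y


def perturbed_agd(x, y, step=0.1, eps=1e-3, radius=1e-2, wait=20,
                  iters=2000, seed=0):
    """Alternating gradient steps with a small random kick near stationarity."""
    rng = random.Random(seed)
    last_perturb = -wait
    for t in range(iters):
        grad_norm = math.hypot(grad_x(x, y), grad_y(x, y))
        # Near a stationary point, add a small uniform perturbation
        # (at most once every `wait` iterations).
        if grad_norm <= eps and t - last_perturb >= wait:
            x += rng.uniform(-radius, radius)
            y += rng.uniform(-radius, radius)
            last_perturb = t
        x -= step * grad_x(x, y)  # update block x with y fixed
        y -= step * grad_y(x, y)  # update block y using the new x
    return x, y
```

Started exactly at the saddle (0, 0), the unperturbed iteration (`radius=0.0`) never moves, while the perturbed version drifts toward one of the minima near y = ±1. Because this simplified rule keeps perturbing near the minimum as well, the final iterate is only accurate to roughly the perturbation radius.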

Emergence of Addictive Behaviors in Reinforcement Learning Agents


Title	Emergence of Addictive Behaviors in Reinforcement Learning Agents
Authors	Vahid Behzadan, Roman V. Yampolskiy, Arslan Munir
Abstract	This paper presents a novel approach to the technical analysis of wireheading in intelligent agents. Inspired by the natural analogues of wireheading and their prevalent manifestations, we propose modeling this phenomenon in Reinforcement Learning (RL) agents as a psychological disorder. As a preliminary step towards evaluating this proposal, we study the feasibility and dynamics of emergent addictive policies in Q-learning agents in the tractable environment of the game of Snake. We consider a slightly modified setting for this game, in which the environment provides a “drug” seed alongside the original “healthy” seed for the snake to consume. We adopt and extend an RL-based model of natural addiction to Q-learning agents in this setting, and derive sufficient parametric conditions for the emergence of addictive behaviors in such agents. Furthermore, we evaluate our theoretical analysis with three sets of simulation-based experiments. The results demonstrate the feasibility of addictive wireheading in RL agents, and provide promising avenues for further research on the psychopathological modeling of complex AI safety problems.
Tasks	Q-Learning
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05590v1
PDF	http://arxiv.org/pdf/1811.05590v1.pdf
PWC	https://paperswithcode.com/paper/emergence-of-addictive-behaviors-in
Repo
Framework
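
For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update, applied to a toy choice between two seeds. It is not the paper's addiction model or its Snake environment: the single-state setup, the reward values, and the names `q_update` and `train` are assumptions made purely for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the TD error."""
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def train(steps=5000, epsilon=0.2, seed=0):
    """Epsilon-greedy agent repeatedly choosing a seed in a single state."""
    rng = random.Random(seed)
    rewards = {"healthy": 1.0, "drug": 2.0}  # hypothetical reward values
    q = {"s": {"healthy": 0.0, "drug": 0.0}}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(rewards))        # explore
        else:
            action = max(q["s"], key=q["s"].get)      # exploit
        q_update(q, "s", action, rewards[action], "s")
    return q
```

With these assumed rewards, the learned value of the "drug" action ends up above that of the "healthy" action, so the greedy policy always picks the drug seed. The paper's analysis concerns when such a preference emerges under its own addiction model, which this sketch does not reproduce.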


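Your Actions or Your Associates? Predicting Certification and Dropout in MOOCs with Behavioral and Social Features


Title	Your Actions or Your Associates? Predicting Certification and Dropout in MOOCs with Behavioral and Social Features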
Authors	Niki Gitinabard, Farzaneh Khoshnevisan, Collin F. Lynch, Elle Yuan Wang
Abstract	The high level of attrition and low rate of certification in Massive Open Online Courses (MOOCs) has prompted a great deal of research. Prior researchers have focused on predicting dropout based upon behavioral features such as student confusion, click-stream patterns, and social interactions. However, few studies have focused on combining student logs with forum data. In this work, we use data from two different offerings of the same MOOC. We conduct a survival analysis to identify likely dropouts. We then examine two classes of features, social and behavioral, and apply a combination of modeling and feature-selection methods to identify the most relevant features to predict both dropout and certification. We examine the utility of three different model types and we consider the impact of different definitions of dropout on the predictors. Finally, we assess the reliability of the models over time by evaluating whether or not models from week 1 can predict dropout in week 2, and so on. The outcomes of this study will help instructors identify students likely to fail or drop out as early as the first two weeks and provide them with more support.
Tasks	Feature Selection, Survival Analysis
Published	2018-08-31
URL	http://arxiv.org/abs/1809.00052v1
PDF	http://arxiv.org/pdf/1809.00052v1.pdf
PWC	https://paperswithcode.com/paper/your-actions-or-your-associates-predicting
Repo
Framework
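
The abstract mentions a survival analysis without naming the estimator. As one illustrative possibility (an assumption, not the authors' stated method), the Kaplan-Meier estimator gives the probability that a student is still enrolled after each week, treating students who were still active when observation ended as censored. The function name and the data layout are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with a dropout.

    durations: last week each student was seen.
    observed:  1 if the student dropped out at that time, 0 if censored.
    """
    records = sorted(zip(durations, observed))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        dropouts = 0
        leaving = 0
        # Group all students whose record ends at the same time t.
        while i < len(records) and records[i][0] == t:
            dropouts += records[i][1]
            leaving += 1
            i += 1
        if dropouts:
            survival *= 1.0 - dropouts / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

For example, with durations `[1, 2, 2, 3, 4]` and dropout flags `[1, 1, 0, 1, 0]`, the estimated survival probability falls to about 0.8, 0.6, and 0.3 after weeks 1, 2, and 3.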

Towards Multifocal Displays with Dense Focal Stacks


Title	Towards Multifocal Displays with Dense Focal Stacks
Authors	Jen-Hao Rick Chang, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan
Abstract	We present a virtual reality display that is capable of generating a dense collection of depth/focal planes. This is achieved by driving a focus-tunable lens to sweep a range of focal lengths at a high frequency and, subsequently, tracking the focal length precisely at microsecond time resolutions using an optical module. Precise tracking of the focal length, coupled with a high-speed display, enables our lab prototype to generate 1600 focal planes per second. This enables a novel first-of-its-kind virtual reality multifocal display that is capable of resolving the vergence-accommodation conflict endemic to today’s displays.
Tasks
Published	2018-05-27
URL	http://arxiv.org/abs/1805.10664v3
PDF	http://arxiv.org/pdf/1805.10664v3.pdf
PWC	https://paperswithcode.com/paper/towards-multifocal-displays-with-dense-focal
Repo
Framework

Deep Dictionary Learning: A PARametric NETwork Approach


Title	Deep Dictionary Learning: A PARametric NETwork Approach
Authors	Shahin Mahdizadehaghdam, Ashkan Panahi, Hamid Krim, Liyi Dai
Abstract	Deep dictionary learning seeks multiple dictionaries at different image scales to capture complementary coherent characteristics. We propose a method for learning a hierarchy of synthesis dictionaries with an image classification goal. The dictionaries and classification parameters are trained by a classification objective, and the sparse features are extracted by reducing a reconstruction loss in each layer. The reconstruction objectives in some sense regularize the classification problem and inject source signal information into the extracted features. The performance of the proposed hierarchical method increases as more layers are added, which consequently makes this model easier to tune and adapt. Furthermore, the proposed algorithm shows a remarkably lower fooling rate in the presence of adversarial perturbations. The proposed approach is validated by its classification performance on four benchmark datasets and is compared to a CNN of similar size.
Tasks	Dictionary Learning, Image Classification
Published	2018-03-11
URL	http://arxiv.org/abs/1803.04022v1
PDF	http://arxiv.org/pdf/1803.04022v1.pdf
PWC	https://paperswithcode.com/paper/deep-dictionary-learning-a-parametric-network
Repo
Framework

Text Classification of the Precursory Accelerating Seismicity Corpus: Inference on some Theoretical Trends in Earthquake Predictability Research from 1988 to 2018


Title	Text Classification of the Precursory Accelerating Seismicity Corpus: Inference on some Theoretical Trends in Earthquake Predictability Research from 1988 to 2018
Authors	Arnaud Mignan
Abstract	Text analytics based on supervised machine learning classifiers has shown great promise in a multitude of domains, but has yet to be applied to Seismology. We test various standard models (Naive Bayes, k-Nearest Neighbors, Support Vector Machines, and Random Forests) on a seismological corpus of 100 articles related to the topic of precursory accelerating seismicity, spanning from 1988 to 2010. This corpus was labelled in Mignan (2011) according to whether the precursor is explained by critical processes (i.e., cascade triggering) or by other processes (such as the signature of main fault loading). We investigate whether the classification process can be automated to help analyze larger corpora in order to better understand trends in earthquake predictability research. We find that the Naive Bayes model performs best, in agreement with the machine learning literature for the case of small datasets, with cross-validation accuracies of 86% for binary classification. For a refined multiclass classification (‘non-critical process’ < ‘agnostic’ < ‘critical process assumed’ < ‘critical process demonstrated’), we obtain up to 78% accuracy. Prediction on a dozen articles published since 2011, however, shows weak generalization with an F1-score of 60%, only slightly better than a random classifier, which can be explained by a change of authorship and the use of different terminologies. Yet, the model shows F1-scores greater than 80% for the two multiclass extremes (‘non-critical process’ versus ‘critical process demonstrated’) while it falls to random classifier results (around 25%) for papers labelled ‘agnostic’ or ‘critical process assumed’. Those results are encouraging in view of the small size of the corpus and of the high degree of abstraction of the labelling. Domain knowledge engineering remains essential but can be made transparent by an investigation of Naive Bayes keyword posterior probabilities.
Tasks	Text Classification
Published	2018-10-05
URL	http://arxiv.org/abs/1810.03480v1
PDF	http://arxiv.org/pdf/1810.03480v1.pdf
PWC	https://paperswithcode.com/paper/text-classification-of-the-precursory
Repo
Framework
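
To make the Naive Bayes approach concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace smoothing. The whitespace tokenizer, the toy documents, and the function names are assumptions for illustration; the paper's corpus, features, and implementation are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit log-priors and smoothed per-class word log-likelihoods."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {"priors": {}, "likelihoods": {}}
    for label, count in class_counts.items():
        model["priors"][label] = math.log(count / len(labels))
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["likelihoods"][label] = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
    return model


def predict_nb(model, text):
    """Return the class with the highest posterior log-score."""
    scores = {}
    for label, prior in model["priors"].items():
        likelihood = model["likelihoods"][label]
        # Words never seen in training are ignored.
        scores[label] = prior + sum(
            likelihood[w] for w in text.lower().split() if w in likelihood
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, not taken from the paper.
docs = [
    "cascade triggering critical point",
    "critical process power law",
    "fault loading stress accumulation",
    "stress shadow loading model",
]
labels = ["critical", "critical", "non-critical", "non-critical"]
model = train_nb(docs, labels)
```

The per-class entries in `model["likelihoods"]` are the quantities one would inspect to see which keywords drive a classification, in the spirit of the keyword posterior analysis mentioned in the abstract.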

Towards Multi-Object Detection and Tracking in Urban Scenario under Uncertainties


Title	Towards Multi-Object Detection and Tracking in Urban Scenario under Uncertainties
Authors	Achim Kampker, Mohsen Sefati, Arya Abdul Rachman, Kai Kreisköther, Pascual Campoy
Abstract	Urban-oriented autonomous vehicles require reliable perception technology to cope with a high degree of uncertainty. The recently introduced compact 3D LIDAR sensor offers surround spatial information that can be exploited to enhance vehicle perception. We present a real-time integrated framework for multi-target object detection and tracking using 3D LIDAR, geared toward urban use. Our approach combines a sensor occlusion-aware detection method with computationally efficient heuristic rule-based filtering and adaptive probabilistic tracking to handle uncertainties arising from the sensing limitations of 3D LIDAR and the complexity of target object movement. Evaluation results using real-world pre-recorded 3D LIDAR data, and a comparison with state-of-the-art works, show that our framework is capable of achieving promising tracking performance in urban situations.
Tasks	Autonomous Vehicles, Object Detection
Published	2018-01-08
URL	http://arxiv.org/abs/1801.02686v2
PDF	http://arxiv.org/pdf/1801.02686v2.pdf
PWC	https://paperswithcode.com/paper/towards-multi-object-detection-and-tracking
Repo
Framework

Fairness-aware Classification: Criterion, Convexity, and Bounds


Title	Fairness-aware Classification: Criterion, Convexity, and Bounds
Authors	Yongkai Wu, Lu Zhang, Xintao Wu
Abstract	Fairness-aware classification is receiving increasing attention in the machine learning field. Recent research proposes formulating fairness-aware classification as a constrained optimization problem. However, several limitations exist in previous works due to the lack of a theoretical framework for guiding the formulation. In this paper, we propose a general framework for learning fair classifiers which addresses previous limitations. The framework formulates various commonly used fairness metrics as convex constraints that can be directly incorporated into classic classification models. Within the framework, we propose a constraint-free criterion on the training data which ensures that any classifier learned from the data is fair. We also derive the constraints which ensure that the real fairness metric is satisfied when surrogate functions are used to achieve convexity. Our framework can be used to formulate fairness-aware classification with a fairness guarantee and computational efficiency. Experiments using real-world datasets support our theoretical results and show the effectiveness of the proposed framework and methods.
Tasks
Published	2018-09-13
URL	http://arxiv.org/abs/1809.04737v1
PDF	http://arxiv.org/pdf/1809.04737v1.pdf
PWC	https://paperswithcode.com/paper/fairness-aware-classification-criterion
Repo
Framework

Spotting Micro-Expressions on Long Videos Sequences


Title	Spotting Micro-Expressions on Long Videos Sequences
Authors	Jingting Li, Catherine Soladie, Renaud Séguier, Sujing Wang, Moi Hoon Yap
Abstract	This paper presents baseline results for the first Micro-Expression Spotting Challenge 2019 by evaluating local temporal patterns (LTP) on SAMM and CAS(ME)$^2$. The proposed LTP patterns are extracted by applying PCA in a temporal window on several local facial regions. The micro-expression sequences are then spotted by a local classification of LTP and a global fusion. The performance is evaluated by Leave-One-Subject-Out cross-validation. Furthermore, we define the criterion for determining true positives in one video by overlap rate and set the F1-score as the metric for spotting performance over the whole database. The baseline F1-scores for SAMM and CAS(ME)$^2$ are 0.0316 and 0.0179, respectively.
Tasks
Published	2018-12-26
URL	https://arxiv.org/abs/1812.10306v2
PDF	https://arxiv.org/pdf/1812.10306v2.pdf
PWC	https://paperswithcode.com/paper/spotting-micro-expressions-on-long-videos
Repo
Framework
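
The overlap-based true-positive criterion and the resulting F1-score can be sketched as follows. The abstract does not give the exact overlap rule, so the intersection-over-union measure, the 0.5 threshold, and the greedy one-to-one matching used here are assumptions; intervals are inclusive (onset frame, offset frame) pairs.

```python
def interval_iou(a, b):
    """Intersection over union of two inclusive frame intervals."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - intersection
    return intersection / union


def spotting_f1(predicted, ground_truth, threshold=0.5):
    """F1-score where a prediction is a true positive if it overlaps an
    unmatched ground-truth interval with IoU >= threshold (assumed rule)."""
    matched = set()
    true_pos = 0
    for pred in predicted:
        for idx, truth in enumerate(ground_truth):
            if idx not in matched and interval_iou(pred, truth) >= threshold:
                matched.add(idx)
                true_pos += 1
                break
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(ground_truth)
    return 2 * precision * recall / (precision + recall)
```

For instance, with ground-truth intervals `[(10, 20), (50, 60)]` and predictions `[(12, 22), (100, 110), (30, 35)]`, only the first prediction counts as a true positive, giving precision 1/3, recall 1/2, and an F1-score of 0.4.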

Dynamic Adaptation on Non-Stationary Visual Domains


Title	Dynamic Adaptation on Non-Stationary Visual Domains
Authors	Sindi Shkodrani, Michael Hofmann, Efstratios Gavves
Abstract	Domain adaptation aims to learn models on a supervised source domain that perform well on an unsupervised target. Prior work has examined domain adaptation in the context of stationary domain shifts, i.e. static data sets. However, with large-scale or dynamic data sources, data from a defined domain is not usually available all at once. For instance, in a streaming data scenario, dataset statistics effectively become a function of time. We introduce a framework for adaptation over non-stationary distribution shifts applicable to large-scale and streaming data scenarios. The model is adapted sequentially over incoming unsupervised streaming data batches. This enables improvements over several batches without the need for any additionally annotated data. To demonstrate the effectiveness of our proposed framework, we modify associative domain adaptation to work well on source and target data batches with unequal class distributions. We apply our method to several adaptation benchmark datasets for classification and show improved classifier accuracy not only for the currently adapted batch, but also when applied on future stream batches. Furthermore, we show the applicability of our associative learning modifications to semantic segmentation, where we achieve competitive results.
Tasks	Domain Adaptation, Semantic Segmentation
Published	2018-08-02
URL	http://arxiv.org/abs/1808.00736v1
PDF	http://arxiv.org/pdf/1808.00736v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-adaptation-on-non-stationary-visual
Repo
Framework

Embedding Logical Queries on Knowledge Graphs


Title	Embedding Logical Queries on Knowledge Graphs
Authors	William L. Hamilton, Payal Bajaj, Marinka Zitnik, Dan Jurafsky, Jure Leskovec
Abstract	Learning low-dimensional embeddings of knowledge graphs is a powerful approach used to predict unobserved or missing edges between entities. However, an open challenge in this area is developing techniques that can go beyond simple edge prediction and handle more complex logical queries, which might involve multiple unobserved edges, entities, and variables. For instance, given an incomplete biological knowledge graph, we might want to predict “what drugs are likely to target proteins involved with both diseases X and Y?” – a query that requires reasoning about all possible proteins that might interact with diseases X and Y. Here we introduce a framework to efficiently make predictions about conjunctive logical queries – a flexible but tractable subset of first-order logic – on incomplete knowledge graphs. In our approach, we embed graph nodes in a low-dimensional space and represent logical operators as learned geometric operations (e.g., translation, rotation) in this embedding space. By performing logical operations within a low-dimensional embedding space, our approach achieves a time complexity that is linear in the number of query variables, compared to the exponential complexity required by a naive enumeration-based approach. We demonstrate the utility of this framework in two application studies on real-world datasets with millions of relations: predicting logical relationships in a network of drug-gene-disease interactions and in a graph-based representation of social interactions derived from a popular web forum.
Tasks	Knowledge Graphs
Published	2018-06-05
URL	https://arxiv.org/abs/1806.01445v4
PDF	https://arxiv.org/pdf/1806.01445v4.pdf
PWC	https://paperswithcode.com/paper/embedding-logical-queries-on-knowledge-graphs
Repo
Framework

On the Convergence and Robustness of Training GANs with Regularized Optimal Transport


Title	On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
Authors	Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, Jason D. Lee
Abstract	Generative Adversarial Networks (GANs) are one of the most practical methods for learning data distributions. A popular GAN formulation is based on the use of the Wasserstein distance as a metric between probability distributions. Unfortunately, minimizing the Wasserstein distance between the data distribution and the generative model distribution is a computationally challenging problem, as its objective is non-convex, non-smooth, and even hard to compute. In this work, we show that obtaining gradient information of the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), is computationally effortless, and hence one can apply first-order optimization methods to minimize this objective. Consequently, we establish a theoretical convergence guarantee to stationarity for a proposed class of GAN optimization algorithms. Unlike the original non-smooth formulation, our algorithm only requires solving the discriminator to approximate optimality. We apply our method to learning MNIST digits as well as CIFAR-10 images. Our experiments show that our method is computationally efficient and generates images comparable to state-of-the-art algorithms given the same architecture and computational power.
Tasks
Published	2018-02-22
URL	http://arxiv.org/abs/1802.08249v2
PDF	http://arxiv.org/pdf/1802.08249v2.pdf
PWC	https://paperswithcode.com/paper/on-the-convergence-and-robustness-of-training
Repo
Framework
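
As background on regularized optimal transport, the sketch below computes an entropy-regularized transport plan between two small discrete distributions with Sinkhorn iterations. Entropic regularization and the Sinkhorn scheme are assumptions chosen for illustration; the abstract does not say which regularizer or solver the paper uses, and this is not its GAN training algorithm.

```python
import math


def sinkhorn(a, b, cost, reg=0.5, iters=200):
    """Entropy-regularized OT plan between histograms a and b.

    a, b: lists of non-negative weights that each sum to 1.
    cost: len(a) x len(b) matrix of transport costs.
    reg:  regularization strength (larger means a smoother plan).
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)]
              for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)]
            for i in range(n)]
```

The returned plan has row sums close to `a` and column sums close to `b`. Every step is a smooth operation on the inputs, which is one informal way to see why regularized formulations are easier to optimize with first-order methods than the unregularized Wasserstein distance.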

Online Learning and Decision-Making under Generalized Linear Model with High-Dimensional Data


Title	Online Learning and Decision-Making under Generalized Linear Model with High-Dimensional Data
Authors	Xue Wang, Mike Mingcheng Wei, Tao Yao
Abstract	We propose a minimax concave penalized multi-armed bandit algorithm under a generalized linear model (G-MCP-Bandit) for a decision-maker facing high-dimensional data in an online learning and decision-making process. We demonstrate that the G-MCP-Bandit algorithm asymptotically achieves the optimal cumulative regret in the sample size dimension T, O(log T), and further attains a tight bound in the covariate dimension d, O(log d). In addition, we develop a linear approximation method, the 2-step weighted Lasso procedure, to identify the MCP estimator for the G-MCP-Bandit algorithm under non-iid samples. Under this procedure, the MCP estimator matches the oracle estimator with high probability and converges to the true parameters at the optimal convergence rate. Finally, through experiments based on synthetic data and two real datasets (a warfarin dosing dataset and a Tencent search advertising dataset), we show that the G-MCP-Bandit algorithm outperforms other benchmark algorithms, especially when there is a high level of data sparsity or the decision set is large.
Tasks	Decision Making
Published	2018-12-07
URL	http://arxiv.org/abs/1812.02962v1
PDF	http://arxiv.org/pdf/1812.02962v1.pdf
PWC	https://paperswithcode.com/paper/online-learning-and-decision-making-under
Repo
Framework
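
For reference, the minimax concave penalty (MCP) that gives the algorithm its name has a simple closed form in its standard definition: it behaves like the Lasso penalty near zero and flattens out to a constant for large coefficients. The sketch below implements that standard form; the parameter names `lam` and `a` and the example values are assumptions, and the bandit algorithm itself is not reproduced.

```python
def mcp_penalty(t, lam, a):
    """Standard minimax concave penalty for a single coefficient t.

    lam: regularization level (>= 0).
    a:   concavity parameter (> 0); larger values behave more like Lasso.
    """
    if lam < 0 or a <= 0:
        raise ValueError("require lam >= 0 and a > 0")
    abs_t = abs(t)
    if abs_t <= a * lam:
        # Lasso-like near zero, with a quadratic correction.
        return lam * abs_t - t * t / (2.0 * a)
    # Constant beyond a * lam, so large coefficients are not shrunk further.
    return 0.5 * a * lam * lam
```

With `lam=1.0` and `a=3.0`, the penalty for `t=1.0` is smaller than the Lasso penalty `lam * |t| = 1.0`, and it stays at 1.5 for every `|t| >= 3.0`.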

Modelling trait dependent speciation with Approximate Bayesian Computation


Title	Modelling trait dependent speciation with Approximate Bayesian Computation
Authors	Krzysztof Bartoszek, Pietro Liò
Abstract	Phylogeny is the field of modelling the temporal discrete dynamics of speciation. Complex models can nowadays be studied using the Approximate Bayesian Computation approach which avoids likelihood calculations. The field’s progression is hampered by the lack of robust software to estimate the numerous parameters of the speciation process. In this work we present an R package, pcmabc, based on Approximate Bayesian Computations, that implements three novel phylogenetic algorithms for trait-dependent speciation modelling. Our phylogenetic comparative methodology takes into account both the simulated traits and phylogeny, attempting to estimate the parameters of the processes generating the phenotype and the trait. The user is not restricted to a predefined set of models and can specify a variety of evolutionary and branching models. We illustrate the software with a simulation-reestimation study focused around the branching Ornstein-Uhlenbeck process, where the branching rate depends non-linearly on the value of the driving Ornstein-Uhlenbeck process. Included in this work is a tutorial on how to use the software.
Tasks
Published	2018-12-10
URL	http://arxiv.org/abs/1812.03715v1
PDF	http://arxiv.org/pdf/1812.03715v1.pdf
PWC	https://paperswithcode.com/paper/modelling-trait-dependent-speciation-with
Repo
Framework
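
The core idea of Approximate Bayesian Computation, replacing likelihood evaluation with simulation and comparison of summary statistics, can be shown with a rejection sampler on a toy problem. This sketch is in Python and is unrelated to the pcmabc R package's interface; the Gaussian simulator, the uniform prior, the tolerance, and all function names are assumptions for illustration.

```python
import random


def simulate_mean(theta, n, rng):
    """Toy simulator: sample mean of n draws from Normal(theta, 1)."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def abc_rejection(observed_mean, n_obs, draws=5000, tol=0.2, seed=0):
    """Keep prior draws whose simulated summary lands near the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from an assumed uniform prior
        simulated = simulate_mean(theta, n_obs, rng)
        if abs(simulated - observed_mean) < tol:
            accepted.append(theta)
    return accepted


# Accepted values approximate the posterior of theta given the summary.
posterior_sample = abc_rejection(observed_mean=2.0, n_obs=20)
```

The accepted draws concentrate around the observed summary value. In a phylogenetic setting the simulator would instead generate a tree and trait values, and the distance would compare summaries of both, but those components are beyond this sketch.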