Paper Group ANR 1006
Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media. On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions. Emergence of Addictive Behaviors in Reinforcement Learning Agents. Your Actions or Your Associates? Predicting Certification and Dro …
Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media
Title | Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media |
Authors | Agnese Chiatti, Mu Jung Cho, Anupriya Gagneja, Xiao Yang, Miriam Brinberg, Katie Roehrick, Sagnik Ray Choudhury, Nilam Ram, Byron Reeves, C. Lee Giles |
Abstract | Daily engagement in life experiences is increasingly interwoven with mobile device use. Screen capture at the scale of seconds is being used in behavioral studies and to implement “just-in-time” health interventions. The increasing psychological breadth of digital information will continue to make the actual screens that people view a preferred if not required source of data about life experiences. Effective and efficient Information Extraction and Retrieval from digital screenshots is a crucial prerequisite to successful use of screen data. In this paper, we present the experimental workflow we exploited to: (i) pre-process a unique collection of screen captures, (ii) extract unstructured text embedded in the images, (iii) organize image text and metadata based on a structured schema, (iv) index the resulting document collection, and (v) allow for Image Retrieval through a dedicated vertical search engine application. The adopted procedure integrates different open source libraries for traditional image processing, Optical Character Recognition (OCR), and Image Retrieval. Our aim is to assess whether and how state-of-the-art methodologies can be applied to this novel data set. We show how combining OpenCV-based pre-processing modules with a Long Short-Term Memory (LSTM) based release of Tesseract OCR, without ad hoc training, led to a 74% character-level accuracy of the extracted text. Further, we used the processed repository as a baseline for a dedicated Image Retrieval system, for immediate use and application by behavioral and prevention scientists. We discuss issues of Text Information Extraction and Retrieval that are particular to the screenshot image case and suggest important future work. |
Tasks | Image Retrieval, Optical Character Recognition |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01316v1 |
http://arxiv.org/pdf/1801.01316v1.pdf | |
PWC | https://paperswithcode.com/paper/text-extraction-and-retrieval-from-smartphone |
Repo | |
Framework | |
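The abstract above reports a 74% character-level accuracy but does not spell out how it is computed. One common definition, assumed here, is one minus the Levenshtein edit distance divided by the length of the reference transcription. The function names `levenshtein` and `char_accuracy` are hypothetical, and this is a sketch rather than the authors' evaluation code.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning the first i chars of ref into "" costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), clipped at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - levenshtein(ref, hyp) / len(ref))
```

Under this definition an OCR output that differs from a four-character reference in one character scores 0.75. Other conventions (for example, normalizing by the longer of the two strings) would give different numbers.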
On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions
Title | On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions |
Authors | Songtao Lu, Mingyi Hong, Zhengdao Wang |
Abstract | The alternating gradient descent (AGD) is a simple but popular algorithm which has been applied to problems in optimization, machine learning, data mining, and signal processing. The algorithm updates two blocks of variables in an alternating manner, in which a gradient step is taken on one block, while keeping the remaining block fixed. When the objective function is nonconvex, it is well known that AGD converges to a first-order stationary solution with a global sublinear rate. In this paper, we show that a variant of AGD-type algorithms will not be trapped by “bad” stationary solutions such as saddle points and local maximum points. In particular, we consider a smooth unconstrained optimization problem, and propose a perturbed AGD (PA-GD) which converges (with high probability) to the set of second-order stationary solutions (SS2) with a global sublinear rate. To the best of our knowledge, this is the first alternating type algorithm which takes $\mathcal{O}(\text{polylog}(d)/\epsilon^{7/3})$ iterations to achieve SS2 with high probability [where polylog$(d)$ is a polynomial of the logarithm of the dimension $d$ of the problem]. |
Tasks | |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10418v1 |
http://arxiv.org/pdf/1802.10418v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-sublinear-convergence-of-randomly |
Repo | |
Framework | |
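The idea in the abstract above (alternate gradient steps over two blocks, and add a random perturbation so the iterates do not stay at a saddle point) can be illustrated on a toy problem. The sketch below is not the paper's PA-GD: the objective `f(x, y) = (x*y - 1)^2`, the perturbation rule, and all parameter names and values are assumptions chosen for illustration. The origin is a strict saddle of this objective, and every point with `x*y = 1` is a global minimum.

```python
import random


def f(x, y):
    return (x * y - 1.0) ** 2


def grad_x(x, y):
    return 2.0 * (x * y - 1.0) * y


def grad_y(x, y):
    return 2.0 * (x * y - 1.0) * x


def perturbed_agd(x, y, step=0.05, iters=2000, g_thresh=1e-3,
                  radius=1e-2, wait=50, seed=0):
    """Alternating gradient descent with occasional random perturbations.

    Assumed rule: when the gradient norm is below g_thresh and no
    perturbation was applied in the last `wait` iterations, add uniform
    noise in [-radius, radius] to both blocks.
    """
    rng = random.Random(seed)
    last_perturb = -wait
    for t in range(iters):
        gnorm = (grad_x(x, y) ** 2 + grad_y(x, y) ** 2) ** 0.5
        if gnorm < g_thresh and t - last_perturb >= wait:
            x += rng.uniform(-radius, radius)
            y += rng.uniform(-radius, radius)
            last_perturb = t
        x -= step * grad_x(x, y)  # update block x with y held fixed
        y -= step * grad_y(x, y)  # update block y using the new x
    return x, y


# Without perturbation (radius=0) the iterates never leave the saddle at (0, 0).
stuck = perturbed_agd(0.0, 0.0, radius=0.0)
escaped = perturbed_agd(0.0, 0.0)
```

Because this simplified rule keeps perturbing near a minimum as well, the final iterate hovers close to the valley `x*y = 1` rather than landing on it exactly; a complete method would need a stopping test, which is omitted here.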
Emergence of Addictive Behaviors in Reinforcement Learning Agents
Title | Emergence of Addictive Behaviors in Reinforcement Learning Agents |
Authors | Vahid Behzadan, Roman V. Yampolskiy, Arslan Munir |
Abstract | This paper presents a novel approach to the technical analysis of wireheading in intelligent agents. Inspired by the natural analogues of wireheading and their prevalent manifestations, we propose the modeling of such phenomena in Reinforcement Learning (RL) agents as psychological disorders. In a preliminary step towards evaluating this proposal, we study the feasibility and dynamics of emergent addictive policies in Q-learning agents in the tractable environment of the game of Snake. We consider a slightly modified setting for this game, in which the environment provides a “drug” seed alongside the original “healthy” seed for the consumption of the snake. We adopt and extend an RL-based model of natural addiction to Q-learning agents in this setting, and derive sufficient parametric conditions for the emergence of addictive behaviors in such agents. Furthermore, we evaluate our theoretical analysis with three sets of simulation-based experiments. The results demonstrate the feasibility of addictive wireheading in RL agents, and provide promising avenues for further research on the psychopathological modeling of complex AI safety problems. |
Tasks | Q-Learning |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05590v1 |
http://arxiv.org/pdf/1811.05590v1.pdf | |
PWC | https://paperswithcode.com/paper/emergence-of-addictive-behaviors-in |
Repo | |
Framework | |
Your Actions or Your Associates? Predicting Certification and Dropout in MOOCs with Behavioral and Social Features
Title | Your Actions or Your Associates? Predicting Certification and Dropout in MOOCs with Behavioral and Social Features |
Authors | Niki Gitinabard, Farzaneh Khoshnevisan, Collin F. Lynch, Elle Yuan Wang |
Abstract | The high level of attrition and low rate of certification in Massive Open Online Courses (MOOCs) has prompted a great deal of research. Prior researchers have focused on predicting dropout based upon behavioral features such as student confusion, click-stream patterns, and social interactions. However, few studies have focused on combining student logs with forum data. In this work, we use data from two different offerings of the same MOOC. We conduct a survival analysis to identify likely dropouts. We then examine two classes of features, social and behavioral, and apply a combination of modeling and feature-selection methods to identify the most relevant features to predict both dropout and certification. We examine the utility of three different model types and we consider the impact of different definitions of dropout on the predictors. Finally, we assess the reliability of the models over time by evaluating whether or not models from week 1 can predict dropout in week 2, and so on. The outcomes of this study will help instructors identify students likely to fail or drop out as early as the first two weeks and provide them with more support. |
Tasks | Feature Selection, Survival Analysis |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1809.00052v1 |
http://arxiv.org/pdf/1809.00052v1.pdf | |
PWC | https://paperswithcode.com/paper/your-actions-or-your-associates-predicting |
Repo | |
Framework | |
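The abstract above mentions a survival analysis of dropout without naming the estimator. As one illustration, the Kaplan-Meier estimator is a standard choice for this kind of data; using it here is an assumption, as are the function name and the toy encoding (weeks of activity as durations, with `observed=False` marking students still enrolled when the data ends).

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until dropout or until the end of observation.
    observed:  True if the dropout was seen, False if the record is censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)  # still enrolled just before t
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Four hypothetical students: dropouts in weeks 1, 2, and 3; one censored in week 2.
curve = kaplan_meier([1, 2, 2, 3], [True, True, False, True])
```

The censored student still counts as at risk in week 2, which is what separates this estimator from simply dividing dropouts by enrolment.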
Towards Multifocal Displays with Dense Focal Stacks
Title | Towards Multifocal Displays with Dense Focal Stacks |
Authors | Jen-Hao Rick Chang, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan |
Abstract | We present a virtual reality display that is capable of generating a dense collection of depth/focal planes. This is achieved by driving a focus-tunable lens to sweep a range of focal lengths at a high frequency and, subsequently, tracking the focal length precisely at microsecond time resolutions using an optical module. Precise tracking of the focal length, coupled with a high-speed display, enables our lab prototype to generate 1600 focal planes per second. This enables a novel first-of-its-kind virtual reality multifocal display that is capable of resolving the vergence-accommodation conflict endemic to today’s displays. |
Tasks | |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10664v3 |
http://arxiv.org/pdf/1805.10664v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-multifocal-displays-with-dense-focal |
Repo | |
Framework | |
Deep Dictionary Learning: A PARametric NETwork Approach
Title | Deep Dictionary Learning: A PARametric NETwork Approach |
Authors | Shahin Mahdizadehaghdam, Ashkan Panahi, Hamid Krim, Liyi Dai |
Abstract | Deep dictionary learning seeks multiple dictionaries at different image scales to capture complementary coherent characteristics. We propose a method for learning a hierarchy of synthesis dictionaries with an image classification goal. The dictionaries and classification parameters are trained by a classification objective, and the sparse features are extracted by reducing a reconstruction loss in each layer. The reconstruction objectives in some sense regularize the classification problem and inject source signal information in the extracted features. The performance of the proposed hierarchical method increases by adding more layers, which consequently makes this model easier to tune and adapt. Furthermore, the proposed algorithm shows a remarkably lower fooling rate in the presence of adversarial perturbation. The validation of the proposed approach is based on its classification performance using four benchmark datasets and is compared to a CNN of similar size. |
Tasks | Dictionary Learning, Image Classification |
Published | 2018-03-11 |
URL | http://arxiv.org/abs/1803.04022v1 |
http://arxiv.org/pdf/1803.04022v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-dictionary-learning-a-parametric-network |
Repo | |
Framework | |
Text Classification of the Precursory Accelerating Seismicity Corpus: Inference on some Theoretical Trends in Earthquake Predictability Research from 1988 to 2018
Title | Text Classification of the Precursory Accelerating Seismicity Corpus: Inference on some Theoretical Trends in Earthquake Predictability Research from 1988 to 2018 |
Authors | Arnaud Mignan |
Abstract | Text analytics based on supervised machine learning classifiers has shown great promise in a multitude of domains, but has yet to be applied to Seismology. We test various standard models (Naive Bayes, k-Nearest Neighbors, Support Vector Machines, and Random Forests) on a seismological corpus of 100 articles related to the topic of precursory accelerating seismicity, spanning from 1988 to 2010. This corpus was labelled in Mignan (2011) according to whether the precursor is explained by critical processes (i.e., cascade triggering) or by other processes (such as signature of main fault loading). We investigate whether the classification process can be automated to help analyze larger corpora in order to better understand trends in earthquake predictability research. We find that the Naive Bayes model performs best, in agreement with the machine learning literature for the case of small datasets, with cross-validation accuracies of 86% for binary classification. For a refined multiclass classification (‘non-critical process’ < ‘agnostic’ < ‘critical process assumed’ < ‘critical process demonstrated’), we obtain up to 78% accuracy. Prediction on a dozen articles published since 2011 shows however a weak generalization with an F1-score of 60%, only slightly better than a random classifier, which can be explained by a change of authorship and use of different terminologies. Yet, the model shows F1-scores greater than 80% for the two multiclass extremes (‘non-critical process’ versus ‘critical process demonstrated’) while it falls to random classifier results (around 25%) for papers labelled ‘agnostic’ or ‘critical process assumed’. Those results are encouraging in view of the small size of the corpus and of the high degree of abstraction of the labelling. Domain knowledge engineering remains essential but can be made transparent by an investigation of Naive Bayes keyword posterior probabilities. |
Tasks | Text Classification |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.03480v1 |
http://arxiv.org/pdf/1810.03480v1.pdf | |
PWC | https://paperswithcode.com/paper/text-classification-of-the-precursory |
Repo | |
Framework | |
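To make the Naive Bayes classification described in the abstract above concrete, here is a minimal multinomial Naive Bayes with Laplace smoothing over bag-of-words counts. It is a sketch under stated assumptions: the four toy documents, the label names, and the function names are invented for illustration and are not drawn from the corpus or from the author's code.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit class log-priors and smoothed per-class word log-likelihoods."""
    vocab = {w for d in docs for w in d.split()}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d.split())
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for y, n_y in class_counts.items():
        model["log_prior"][y] = math.log(n_y / len(docs))
        total = sum(word_counts[y].values()) + alpha * len(vocab)
        model["log_lik"][y] = {
            w: math.log((word_counts[y][w] + alpha) / total) for w in vocab
        }
    return model


def predict_nb(model, doc):
    """Return the class with the highest posterior score; unseen words are skipped."""
    scores = {}
    for y, log_prior in model["log_prior"].items():
        scores[y] = log_prior + sum(
            model["log_lik"][y][w] for w in doc.split() if w in model["vocab"]
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, two documents per class.
docs = [
    "critical point cascade triggering",
    "cascade triggering power law",
    "fault loading stress accumulation",
    "stress loading main fault",
]
labels = ["critical", "critical", "non-critical", "non-critical"]
model = train_nb(docs, labels)
```

The per-class entries in `model["log_lik"]` are the quantities one would inspect to see which keywords drive a decision, in the spirit of the keyword posterior analysis mentioned in the abstract.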
Towards Multi-Object Detection and Tracking in Urban Scenario under Uncertainties
Title | Towards Multi-Object Detection and Tracking in Urban Scenario under Uncertainties |
Authors | Achim Kampker, Mohsen Sefati, Arya Abdul Rachman, Kai Kreisköther, Pascual Campoy |
Abstract | Urban-oriented autonomous vehicles require a reliable perception technology to tackle the high amount of uncertainties. The recently introduced compact 3D LIDAR sensor offers surround spatial information that can be exploited to enhance the vehicle perception. We present a real-time integrated framework of multi-target object detection and tracking using 3D LIDAR geared toward urban use. Our approach combines a sensor occlusion-aware detection method with computationally efficient heuristic rule-based filtering and adaptive probabilistic tracking to handle uncertainties arising from the sensing limitations of 3D LIDAR and the complexity of the target object movement. The evaluation results using real-world pre-recorded 3D LIDAR data and comparison with state-of-the-art works show that our framework is capable of achieving promising tracking performance in the urban situation. |
Tasks | Autonomous Vehicles, Object Detection |
Published | 2018-01-08 |
URL | http://arxiv.org/abs/1801.02686v2 |
http://arxiv.org/pdf/1801.02686v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-multi-object-detection-and-tracking |
Repo | |
Framework | |
Fairness-aware Classification: Criterion, Convexity, and Bounds
Title | Fairness-aware Classification: Criterion, Convexity, and Bounds |
Authors | Yongkai Wu, Lu Zhang, Xintao Wu |
Abstract | Fairness-aware classification is receiving increasing attention in the machine learning field. Recent research proposes formulating fairness-aware classification as a constrained optimization problem. However, several limitations exist in previous works due to the lack of a theoretical framework for guiding the formulation. In this paper, we propose a general framework for learning fair classifiers which addresses previous limitations. The framework formulates various commonly-used fairness metrics as convex constraints that can be directly incorporated into classic classification models. Within the framework, we propose a constraint-free criterion on the training data which ensures that any classifier learned from the data is fair. We also derive the constraints which ensure that the real fairness metric is satisfied when surrogate functions are used to achieve convexity. Our framework can be used for formulating fairness-aware classification with fairness guarantees and computational efficiency. The experiments using real-world datasets demonstrate our theoretical results and show the effectiveness of the proposed framework and methods. |
Tasks | |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.04737v1 |
http://arxiv.org/pdf/1809.04737v1.pdf | |
PWC | https://paperswithcode.com/paper/fairness-aware-classification-criterion |
Repo | |
Framework | |
Spotting Micro-Expressions on Long Videos Sequences
Title | Spotting Micro-Expressions on Long Videos Sequences |
Authors | Jingting Li, Catherine Soladie, Renaud Séguier, Sujing Wang, Moi Hoon Yap |
Abstract | This paper presents baseline results for the first Micro-Expression Spotting Challenge 2019 by evaluating local temporal pattern (LTP) on SAMM and CAS(ME)$^2$. The proposed LTP patterns are extracted by applying PCA in a temporal window on several facial local regions. The micro-expression sequences are then spotted by a local classification of LTP and a global fusion. The performance is evaluated by Leave-One-Subject-Out cross validation. Furthermore, we define the criteria for determining true positives in one video by overlap rate and set the F1-score as the metric for spotting performance on the whole database. The baseline F1-scores for SAMM and CAS(ME)$^2$ are 0.0316 and 0.0179, respectively. |
Tasks | |
Published | 2018-12-26 |
URL | https://arxiv.org/abs/1812.10306v2 |
https://arxiv.org/pdf/1812.10306v2.pdf | |
PWC | https://paperswithcode.com/paper/spotting-micro-expressions-on-long-videos |
Repo | |
Framework | |
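The evaluation described in the abstract above (true positives decided by an overlap rate, then an F1-score over the whole database) can be sketched as follows. The intersection-over-union overlap measure, the 0.5 threshold, the greedy one-to-one matching, and the function names are assumptions; the abstract does not give these details.

```python
def interval_iou(a, b):
    """Intersection over union of two inclusive frame intervals (start, end)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


def spotting_f1(spotted, ground_truth, min_overlap=0.5):
    """F1-score where a spotted interval is a true positive if it overlaps
    an unmatched ground-truth interval by at least min_overlap."""
    matched = set()
    tp = 0
    for s in spotted:
        for i, g in enumerate(ground_truth):
            if i not in matched and interval_iou(s, g) >= min_overlap:
                matched.add(i)
                tp += 1
                break
    if tp == 0:
        return 0.0
    fp = len(spotted) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three spotted intervals, two annotated micro-expressions.
score = spotting_f1([(10, 20), (50, 60), (100, 110)], [(12, 22), (200, 210)])
```

In this example only the first spotted interval matches an annotation (overlap 9/13), giving precision 1/3, recall 1/2, and an F1-score of 0.4. Long videos with few annotated events produce many false positives, which is consistent with the very low baseline scores reported.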
Dynamic Adaptation on Non-Stationary Visual Domains
Title | Dynamic Adaptation on Non-Stationary Visual Domains |
Authors | Sindi Shkodrani, Michael Hofmann, Efstratios Gavves |
Abstract | Domain adaptation aims to learn models on a supervised source domain that perform well on an unsupervised target. Prior work has examined domain adaptation in the context of stationary domain shifts, i.e. static data sets. However, with large-scale or dynamic data sources, data from a defined domain is not usually available all at once. For instance, in a streaming data scenario, dataset statistics effectively become a function of time. We introduce a framework for adaptation over non-stationary distribution shifts applicable to large-scale and streaming data scenarios. The model is adapted sequentially over incoming unsupervised streaming data batches. This enables improvements over several batches without the need for any additionally annotated data. To demonstrate the effectiveness of our proposed framework, we modify associative domain adaptation to work well on source and target data batches with unequal class distributions. We apply our method to several adaptation benchmark datasets for classification and show improved classifier accuracy not only for the currently adapted batch, but also when applied on future stream batches. Furthermore, we show the applicability of our associative learning modifications to semantic segmentation, where we achieve competitive results. |
Tasks | Domain Adaptation, Semantic Segmentation |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.00736v1 |
http://arxiv.org/pdf/1808.00736v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-adaptation-on-non-stationary-visual |
Repo | |
Framework | |
Embedding Logical Queries on Knowledge Graphs
Title | Embedding Logical Queries on Knowledge Graphs |
Authors | William L. Hamilton, Payal Bajaj, Marinka Zitnik, Dan Jurafsky, Jure Leskovec |
Abstract | Learning low-dimensional embeddings of knowledge graphs is a powerful approach used to predict unobserved or missing edges between entities. However, an open challenge in this area is developing techniques that can go beyond simple edge prediction and handle more complex logical queries, which might involve multiple unobserved edges, entities, and variables. For instance, given an incomplete biological knowledge graph, we might want to predict “what drugs are likely to target proteins involved with both diseases X and Y?” – a query that requires reasoning about all possible proteins that might interact with diseases X and Y. Here we introduce a framework to efficiently make predictions about conjunctive logical queries – a flexible but tractable subset of first-order logic – on incomplete knowledge graphs. In our approach, we embed graph nodes in a low-dimensional space and represent logical operators as learned geometric operations (e.g., translation, rotation) in this embedding space. By performing logical operations within a low-dimensional embedding space, our approach achieves a time complexity that is linear in the number of query variables, compared to the exponential complexity required by a naive enumeration-based approach. We demonstrate the utility of this framework in two application studies on real-world datasets with millions of relations: predicting logical relationships in a network of drug-gene-disease interactions and in a graph-based representation of social interactions derived from a popular web forum. |
Tasks | Knowledge Graphs |
Published | 2018-06-05 |
URL | https://arxiv.org/abs/1806.01445v4 |
https://arxiv.org/pdf/1806.01445v4.pdf | |
PWC | https://paperswithcode.com/paper/embedding-logical-queries-on-knowledge-graphs |
Repo | |
Framework | |
On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
Title | On the Convergence and Robustness of Training GANs with Regularized Optimal Transport |
Authors | Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, Jason D. Lee |
Abstract | Generative Adversarial Networks (GANs) are one of the most practical methods for learning data distributions. A popular GAN formulation is based on the use of Wasserstein distance as a metric between probability distributions. Unfortunately, minimizing the Wasserstein distance between the data distribution and the generative model distribution is a computationally challenging problem as its objective is non-convex, non-smooth, and even hard to compute. In this work, we show that obtaining gradient information of the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), is computationally effortless and hence one can apply first order optimization methods to minimize this objective. Consequently, we establish a theoretical convergence guarantee to stationarity for a proposed class of GAN optimization algorithms. Unlike the original non-smooth formulation, our algorithm only requires solving the discriminator to approximate optimality. We apply our method to learning MNIST digits as well as CIFAR-10 images. Our experiments show that our method is computationally efficient and generates images comparable to the state-of-the-art algorithms given the same architecture and computational power. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.08249v2 |
http://arxiv.org/pdf/1802.08249v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-and-robustness-of-training |
Repo | |
Framework | |
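For readers unfamiliar with regularized optimal transport, the sketch below computes an entropy-regularized transport plan between two small discrete distributions using Sinkhorn iterations. Choosing entropic regularization and the Sinkhorn algorithm is an assumption made for illustration; the abstract above only says "regularized Optimal Transport", and this is not the paper's GAN training procedure. The function and parameter names are hypothetical.

```python
import math


def sinkhorn(a, b, cost, eps=0.1, iters=500):
    """Entropy-regularized OT plan between histograms a and b.

    a, b: lists of nonnegative weights, each summing to 1.
    cost: cost[i][j] is the cost of moving mass from bin i of a to bin j of b.
    eps:  regularization strength (larger means a smoother, more diffuse plan).
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


plan = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
```

The resulting plan is a smooth function of the cost matrix, which is the property that makes gradient-based optimization of a regularized objective tractable. Very small values of `eps` can underflow in this plain implementation; log-domain variants avoid that.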
Online Learning and Decision-Making under Generalized Linear Model with High-Dimensional Data
Title | Online Learning and Decision-Making under Generalized Linear Model with High-Dimensional Data |
Authors | Xue Wang, Mike Mingcheng Wei, Tao Yao |
Abstract | We propose a minimax concave penalized multi-armed bandit algorithm under a generalized linear model (G-MCP-Bandit) for a decision-maker facing high-dimensional data in an online learning and decision-making process. We demonstrate that the G-MCP-Bandit algorithm asymptotically achieves the optimal cumulative regret in the sample size dimension T, O(log T), and further attains a tight bound in the covariate dimension d, O(log d). In addition, we develop a linear approximation method, the 2-step weighted Lasso procedure, to identify the MCP estimator for the G-MCP-Bandit algorithm under non-iid samples. Under this procedure, the MCP estimator matches the oracle estimator with high probability and converges to the true parameters with the optimal convergence rate. Finally, through experiments based on synthetic data and two real datasets (warfarin dosing dataset and Tencent search advertising dataset), we show that the G-MCP-Bandit algorithm outperforms other benchmark algorithms, especially when there is a high level of data sparsity or the decision set is large. |
Tasks | Decision Making |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.02962v1 |
http://arxiv.org/pdf/1812.02962v1.pdf | |
PWC | https://paperswithcode.com/paper/online-learning-and-decision-making-under |
Repo | |
Framework | |
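The minimax concave penalty (MCP) named in the abstract above has a simple closed form, shown here next to the Lasso penalty for comparison. The parameter names `lam` (penalty level) and `a` (concavity parameter) are assumptions and may not match the paper's notation; this sketch covers only the penalty function, not the bandit algorithm or the 2-step weighted Lasso procedure.

```python
def mcp_penalty(x, lam, a):
    """Minimax concave penalty for a single coefficient x.

    Behaves like the Lasso penalty lam*|x| near zero, then flattens out:
    for |x| > a*lam the penalty is the constant a*lam^2/2, so large
    coefficients are not shrunk further.
    """
    ax = abs(x)
    if ax <= a * lam:
        return lam * ax - ax * ax / (2.0 * a)
    return 0.5 * a * lam * lam


def lasso_penalty(x, lam):
    """L1 penalty, which keeps growing linearly with |x|."""
    return lam * abs(x)
```

The two penalties agree to first order at zero, which is why both encourage sparsity, while the flat tail of MCP reduces the bias that the Lasso introduces on large coefficients.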
Modelling trait dependent speciation with Approximate Bayesian Computation
Title | Modelling trait dependent speciation with Approximate Bayesian Computation |
Authors | Krzysztof Bartoszek, Pietro Liò |
Abstract | Phylogeny is the field of modelling the temporal discrete dynamics of speciation. Complex models can nowadays be studied using the Approximate Bayesian Computation approach which avoids likelihood calculations. The field’s progression is hampered by the lack of robust software to estimate the numerous parameters of the speciation process. In this work we present an R package, pcmabc, based on Approximate Bayesian Computations, that implements three novel phylogenetic algorithms for trait-dependent speciation modelling. Our phylogenetic comparative methodology takes into account both the simulated traits and phylogeny, attempting to estimate the parameters of the processes generating the phenotype and the trait. The user is not restricted to a predefined set of models and can specify a variety of evolutionary and branching models. We illustrate the software with a simulation-reestimation study focused around the branching Ornstein-Uhlenbeck process, where the branching rate depends non-linearly on the value of the driving Ornstein-Uhlenbeck process. Included in this work is a tutorial on how to use the software. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03715v1 |
http://arxiv.org/pdf/1812.03715v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-trait-dependent-speciation-with |
Repo | |
Framework | |
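Approximate Bayesian Computation, as mentioned in the abstract above, replaces likelihood evaluation with simulation: draw a parameter from the prior, simulate data, and keep the parameter if the simulated data resemble the observations. The sketch below shows generic rejection ABC on a deliberately simple Gaussian toy model. It does not use or imitate the pcmabc package, and the prior, summary statistic, tolerance, and function name are all assumptions.

```python
import random


def abc_rejection(observed_mean, n_obs, n_proposals=5000, tol=0.2, seed=0):
    """Rejection ABC for the mean of a unit-variance Gaussian.

    Prior: theta ~ Uniform(-5, 5). Summary statistic: the sample mean.
    A proposal is accepted when its simulated sample mean lies within
    `tol` of the observed one. No likelihood is ever evaluated.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_proposals):
        theta = rng.uniform(-5.0, 5.0)
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        sim_mean = sum(simulated) / n_obs
        if abs(sim_mean - observed_mean) < tol:
            accepted.append(theta)
    return accepted


posterior_sample = abc_rejection(observed_mean=2.0, n_obs=20)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

In a phylogenetic setting the simulator would generate a tree and trait values, and the summary statistics would compare simulated and observed trees and traits. The trade-off is the same as here: a smaller `tol` gives a better approximation of the posterior but accepts fewer proposals.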