October 16, 2019

3300 words 16 mins read

Paper Group ANR 1145

Predicting Human Trustfulness from Facebook Language

Title Predicting Human Trustfulness from Facebook Language
Authors Mohammadzaman Zamani, Anneke Buffone, H. Andrew Schwartz
Abstract Trustfulness – one’s general tendency to have confidence in unknown people or situations – predicts many important real-world outcomes, such as mental health and the likelihood of cooperating with others, including clinicians. While data-driven measures of interpersonal trust have previously been introduced, here we develop the first language-based assessment of the personality trait of trustfulness by fitting one’s language to an accepted questionnaire-based trust score. Further, using trustfulness as a case study, we explore the role of questionnaire size as well as word count in developing language-based predictive models of users’ psychological traits. We find that leveraging a longer questionnaire can yield greater test-set accuracy, while, for training, it is beneficial to include users who took shorter questionnaires, since this offers more observations for training. Similarly, after noting that individual prediction error decreases as word count increases, we found a word-count-weighted training scheme helpful when only a few users were available.
Tasks
Published 2018-08-16
URL http://arxiv.org/abs/1808.05668v1
PDF http://arxiv.org/pdf/1808.05668v1.pdf
PWC https://paperswithcode.com/paper/predicting-human-trustfulness-from-facebook
Repo
Framework
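
To make the word-count-weighted training idea concrete, here is a minimal sketch using a bag-of-words ridge regression, where each user's contribution to the fit is weighted by the (log) word count of their posts. The feature pipeline, the log weighting rule, and all variable names are illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# One document of concatenated posts per user, with questionnaire scores.
posts = ["first user's concatenated facebook posts ...",
         "second user's concatenated facebook posts ..."]
trust_scores = np.array([3.2, 4.1])

vec = TfidfVectorizer(max_features=5000)
X = vec.fit_transform(posts)

# Weight each training user by log word count: users who wrote more give
# more reliable language estimates, so they count more in the fit.
word_counts = np.array([len(p.split()) for p in posts])
sample_weight = np.log1p(word_counts)

model = Ridge(alpha=1.0)
model.fit(X, trust_scores, sample_weight=sample_weight)
```
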

Automatic Identification of Closely-related Indian Languages: Resources and Experiments

Title Automatic Identification of Closely-related Indian Languages: Resources and Experiments
Authors Ritesh Kumar, Bornini Lahiri, Deepak Alok, Atul Kr. Ojha, Mayank Jain, Abdul Basit, Yogesh Dawer
Abstract In this paper, we discuss an attempt to develop an automatic language identification system for five closely related Indo-Aryan languages of India: Awadhi, Bhojpuri, Braj, Hindi, and Magahi. We have compiled comparable corpora of varying lengths for these languages from various resources and discuss the method of their creation in detail. Using these corpora, we developed a language identification system that currently achieves a state-of-the-art accuracy of 96.48%. We also used the corpora to study the similarity between the five languages at the lexical level, the first data-based study of the extent of their closeness.
Tasks Language Identification
Published 2018-03-26
URL http://arxiv.org/abs/1803.09405v1
PDF http://arxiv.org/pdf/1803.09405v1.pdf
PWC https://paperswithcode.com/paper/automatic-identification-of-closely-related
Repo
Framework
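
A character n-gram classifier is the standard baseline for identifying closely related languages, and a minimal sketch of one is below. This is a generic pipeline, not the authors' system, and the training examples are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training sentences with ISO 639-3 style labels.
train_texts = ["... an awadhi sentence ...", "... a bhojpuri sentence ...",
               "... a braj sentence ...", "... a hindi sentence ...",
               "... a magahi sentence ..."]
train_langs = ["awa", "bho", "bra", "hin", "mag"]

clf = make_pipeline(
    # Character n-grams capture the small orthographic differences that
    # distinguish closely related languages better than word features do.
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_texts, train_langs)
print(clf.predict(["... a new sentence to identify ..."]))
```
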

How to Make Causal Inferences Using Texts

Title How to Make Causal Inferences Using Texts
Authors Naoki Egami, Christian J. Fong, Justin Grimmer, Margaret E. Roberts, Brandon M. Stewart
Abstract New text-as-data techniques offer great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. Our framework enables researchers to discover high-dimensional textual interventions and to estimate the ways that observed treatments affect text-based outcomes. We argue that nearly all text-based causal inferences depend upon a latent representation of the text, and we provide a framework to learn the latent representation. But estimating this latent representation, we show, creates new risks: we may introduce an identification problem or overfit. To address these risks, we describe a split-sample framework and apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic response. Our work provides a rigorous foundation for text-based causal inferences.
Tasks
Published 2018-02-06
URL http://arxiv.org/abs/1802.02163v1
PDF http://arxiv.org/pdf/1802.02163v1.pdf
PWC https://paperswithcode.com/paper/how-to-make-causal-inferences-using-texts
Repo
Framework
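
A minimal sketch of the split-sample idea follows: discover the text-based measure on one half of the data, then estimate the treatment effect on the held-out half, so that discovery cannot contaminate the estimate. Using LDA topics as the "discovered measure" and the synthetic treatment assignment are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
docs = ["text about immigration policy", "text about local schools"] * 50
treatment = rng.binomial(1, 0.5, size=len(docs))

# 1) Split before looking at the text, so discovery cannot overfit
#    the quantity we later estimate.
docs_disc, docs_est, t_disc, t_est = train_test_split(
    docs, treatment, test_size=0.5, random_state=0)

# 2) Discovery half: learn a latent representation (here, LDA topics).
vec = CountVectorizer()
lda = LatentDirichletAllocation(n_components=5, random_state=0)
lda.fit(vec.fit_transform(docs_disc))

# 3) Estimation half: hold the measure fixed and estimate the treatment
#    effect on, e.g., the prevalence of topic 0.
theta = lda.transform(vec.transform(docs_est))
effect = theta[t_est == 1, 0].mean() - theta[t_est == 0, 0].mean()
print(f"estimated effect on topic-0 prevalence: {effect:.3f}")
```
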

A Variational Observation Model of 3D Object for Probabilistic Semantic SLAM

Title A Variational Observation Model of 3D Object for Probabilistic Semantic SLAM
Authors H. W. Yu, B. H. Le
Abstract We present a Bayesian object observation model for complete probabilistic semantic SLAM. Recent studies on object detection and feature extraction have become important for scene understanding and 3D mapping. However, the 3D shape of an object is too complex to formulate a probabilistic observation model for directly; as a result, Bayesian inference over object-oriented features and their poses has received little attention. Moreover, when a robot equipped with only a monocular RGB camera observes the projected single view of an object, a significant amount of 3D shape information is lost. Because of these limitations, semantic SLAM and viewpoint-independent loop closure using volumetric 3D object shape are challenging. To enable a complete formulation of probabilistic semantic SLAM, we approximate the observation model of a 3D object with a tractable distribution. We also estimate the variational likelihood from the 2D image of the object to exploit its observed single view. To evaluate the proposed method, we perform pose and feature estimation and demonstrate that automatic loop closure works seamlessly, without an additional loop detector, in various environments.
Tasks Bayesian Inference, Object Detection, Scene Understanding
Published 2018-09-14
URL http://arxiv.org/abs/1809.05225v1
PDF http://arxiv.org/pdf/1809.05225v1.pdf
PWC https://paperswithcode.com/paper/a-variational-observation-model-of-3d-object
Repo
Framework

Learning to Recognize 3D Human Action from A New Skeleton-based Representation Using Deep Convolutional Neural Networks

Title Learning to Recognize 3D Human Action from A New Skeleton-based Representation Using Deep Convolutional Neural Networks
Authors Huy-Hieu Pham, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio A. Velastin
Abstract Recognizing human actions in untrimmed videos is an important and challenging task. An effective 3D motion representation and a powerful learning model are two key factors influencing recognition performance. In this paper, we introduce a new skeleton-based representation for 3D action recognition in videos. The key idea of the proposed representation is to transform the 3D joint coordinates of the human body carried in skeleton sequences into RGB images via a color-encoding process. By normalizing the 3D joint coordinates and dividing each skeleton frame into five parts, where the joints are concatenated according to the order of their physical connections, the color-coded representation is able to capture the spatio-temporal evolution of complex 3D motions, independently of the length of each sequence. We then design and train different Deep Convolutional Neural Networks (D-CNNs) based on the Residual Network architecture (ResNet) on the resulting image-based representations to learn 3D motion features and classify them into classes. Our method is evaluated on two widely used action recognition benchmarks: MSR Action3D and NTU-RGB+D, a very large-scale dataset for 3D human action recognition. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches while requiring less computation for training and prediction.
Tasks 3D Human Action Recognition, Action Recognition In Videos, Temporal Action Localization
Published 2018-12-26
URL http://arxiv.org/abs/1812.10550v1
PDF http://arxiv.org/pdf/1812.10550v1.pdf
PWC https://paperswithcode.com/paper/learning-to-recognize-3d-human-action-from-a
Repo
Framework
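
A minimal sketch of the color-encoding step is below: per-axis min-max normalization of the joint coordinates over a sequence, with (x, y, z) stored as (R, G, B) and the result resized to a fixed image size for the CNN. The array shapes and the resize step are assumptions; the paper's five-part joint ordering is omitted for brevity.

```python
import cv2
import numpy as np

def skeleton_to_rgb(seq, out_size=(224, 224)):
    """seq: (num_frames, num_joints, 3) array of 3D joint coordinates."""
    lo = seq.min(axis=(0, 1))
    hi = seq.max(axis=(0, 1))
    norm = (seq - lo) / (hi - lo + 1e-8)       # per-axis min-max to [0, 1]
    img = (norm * 255).astype(np.uint8)        # frames -> rows, joints -> columns,
                                               # (x, y, z) -> (R, G, B)
    return cv2.resize(img, out_size)           # fixed size regardless of length

frames = np.random.rand(60, 25, 3)             # e.g. 60 frames, 25 skeleton joints
image = skeleton_to_rgb(frames)                # (224, 224, 3) color-coded motion
```
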

Specifying, Monitoring, and Executing Workflows in Linked Data Environments

Title Specifying, Monitoring, and Executing Workflows in Linked Data Environments
Authors Tobias Käfer, Andreas Harth
Abstract We present an ontology for representing workflows over components with Read-Write Linked Data interfaces and give an operational semantics to the ontology via a rule language. Workflow languages have been successfully applied for modelling behaviour in enterprise information systems, in which the data is often managed in a relational database. Linked Data interfaces have been widely deployed on the web to support data integration in very diverse domains, increasingly also in scenarios involving the Internet of Things, in which application behaviour is often specified using imperative programming languages. With our work we aim to combine workflow languages, which allow for the high-level specification of application behaviour by non-expert users, with Linked Data, which allows for decentralised data publication and integrated data access. We show that our ontology is expressive enough to cover the basic workflow patterns and demonstrate the applicability of our approach with a prototype system that observes pilots carrying out tasks in a mixed-reality aircraft cockpit. On a synthetic benchmark from the building automation domain, the runtime scales linearly with the number of Internet of Things devices.
Tasks
Published 2018-04-13
URL http://arxiv.org/abs/1804.05044v4
PDF http://arxiv.org/pdf/1804.05044v4.pdf
PWC https://paperswithcode.com/paper/specifying-monitoring-and-executing-workflows
Repo
Framework

Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit

Title Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit
Authors Amrith Krishna, Bodhisattwa Prasad Majumder, Rajesh Shreedhar Bhat, Pawan Goyal
Abstract We propose a post-OCR text correction approach for digitising texts in Romanised Sanskrit. Owing to the lack of resources, our approach uses OCR models trained for other languages written in the Roman script. Since no dataset is currently available for Romanised Sanskrit OCR, we bootstrap a dataset of 430 images, scanned in two different settings, along with their corresponding ground truth. For training, we synthetically generate training images for both settings. We find that the use of a copying mechanism (Gu et al., 2016) yields a 7.69 percentage-point increase in Character Recognition Rate (CRR) over the current state-of-the-art model for monotone sequence-to-sequence tasks (Schnober et al., 2016). We find that our system is robust in combating OCR-prone errors, obtaining a CRR of 87.01% from an OCR output with a CRR of 35.76% for one of the dataset settings. A human judgment survey performed on the models shows that our proposed model yields predictions that are faster for a human to comprehend and to improve than those of the other systems.
Tasks Optical Character Recognition
Published 2018-09-06
URL http://arxiv.org/abs/1809.02147v1
PDF http://arxiv.org/pdf/1809.02147v1.pdf
PWC https://paperswithcode.com/paper/upcycle-your-ocr-reusing-ocrs-for-post-ocr
Repo
Framework
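
For reference, here is how the Character Recognition Rate (CRR) metric quoted above is typically computed, as one minus the normalized Levenshtein distance between the system output and the ground truth; the paper may differ in details.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def crr(hypothesis: str, reference: str) -> float:
    """Fraction of reference characters recognized correctly."""
    return 1.0 - levenshtein(hypothesis, reference) / max(len(reference), 1)

print(crr("dharma-ksetre kuru-ksetre", "dharma-kshetre kuru-kshetre"))  # ~0.93
```
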

Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions

Title Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions
Authors M. Wagner, H. Basevi, R. Shetty, W. Li, M. Malinowski, M. Fritz, A. Leonardis
Abstract In-depth scene descriptions and question answering tasks have greatly increased the scope of today’s definition of scene understanding. While such tasks are in principle open-ended, current formulations primarily focus on describing only the current state of the scenes under consideration. In contrast, in this paper, we focus on future states of scenes that are also conditioned on actions. We pose this as a question answering task, where an answer has to be given about a future scene state, given observations of the current scene and a question that includes a hypothetical action. Our solution is a hybrid model that integrates a physics engine into a question answering architecture in order to anticipate future scene states resulting from object-object interactions caused by an action. We demonstrate first results on this challenging new problem and compare to baselines, where we outperform fully data-driven end-to-end learning approaches.
Tasks Question Answering, Scene Understanding
Published 2018-09-11
URL http://arxiv.org/abs/1809.03707v2
PDF http://arxiv.org/pdf/1809.03707v2.pdf
PWC https://paperswithcode.com/paper/answering-visual-what-if-questions-from
Repo
Framework

Implementation of Fuzzy C-Means and Possibilistic C-Means Clustering Algorithms, Cluster Tendency Analysis and Cluster Validation

Title Implementation of Fuzzy C-Means and Possibilistic C-Means Clustering Algorithms, Cluster Tendency Analysis and Cluster Validation
Authors Md. Abu Bakr Siddique, Rezoana Bente Arif, Mohammad Mahmudur Rahman Khan, Zahidun Ashrafi
Abstract In this paper, several two-dimensional clustering scenarios are given. In those scenarios, the soft-partitioning clustering algorithms Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) are applied. Afterward, Visual Assessment of cluster Tendency (VAT) is used to inspect the clustering tendency visually, and then, to check cluster validity, three indices (PC, DI, and DBI) are used. After observing the clustering algorithms, it was evident that each of them has its limitations; however, PCM is more robust to noise than FCM, since in FCM a noise point must be assigned as a member of one of the clusters.
Tasks
Published 2018-09-22
URL https://arxiv.org/abs/1809.08417v3
PDF https://arxiv.org/pdf/1809.08417v3.pdf
PWC https://paperswithcode.com/paper/implementation-of-fuzzy-c-means-and
Repo
Framework
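
A minimal sketch of the FCM updates follows: alternate between recomputing weighted cluster centers and memberships, with fuzzifier m = 2. Initialization, the fixed iteration count, and the synthetic data are simplified assumptions.

```python
import numpy as np

def fcm(X, c=3, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))        # membership rows sum to 1
    p = 2.0 / (m - 1.0)
    for _ in range(iters):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]        # weighted cluster centers
        D = np.linalg.norm(X[:, None, :] - V[None], axis=2) + 1e-10
        U = 1.0 / (D ** p * (1.0 / D ** p).sum(axis=1, keepdims=True))
    return U, V

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(mu, 1.0, (50, 2)) for mu in ([0, 0], [5, 5], [0, 5])])
U, V = fcm(X)
print(V)   # approximate centers of the three blobs
```
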

Automated Image Data Preprocessing with Deep Reinforcement Learning

Title Automated Image Data Preprocessing with Deep Reinforcement Learning
Authors Tran Ngoc Minh, Mathieu Sinn, Hoang Thanh Lam, Martin Wistuba
Abstract Data preparation, i.e. the process of transforming raw data into a format that can be used for training effective machine learning models, is a tedious and time-consuming task. For image data, preprocessing typically involves a sequence of basic transformations such as cropping, filtering, rotating or flipping images. Currently, data scientists decide manually based on their experience which transformations to apply in which particular order to a given image data set. Besides constituting a bottleneck in real-world data science projects, manual image data preprocessing may yield suboptimal results as data scientists need to rely on intuition or trial-and-error approaches when exploring the space of possible image transformations and thus might not be able to discover the most effective ones. To mitigate the inefficiency and potential ineffectiveness of manual data preprocessing, this paper proposes a deep reinforcement learning framework to automatically discover the optimal data preprocessing steps for training an image classifier. The framework takes as input sets of labeled images and predefined preprocessing transformations. It jointly learns the classifier and the optimal preprocessing transformations for individual images. Experimental results show that the proposed approach not only improves the accuracy of image classifiers, but also makes them substantially more robust to noisy inputs at test time.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.05886v1
PDF http://arxiv.org/pdf/1806.05886v1.pdf
PWC https://paperswithcode.com/paper/automated-image-data-preprocessing-with-deep
Repo
Framework
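
As a rough sketch of the search problem being automated, the snippet below scores candidate preprocessing sequences by validation accuracy, using random search as a simplified stand-in for the paper's deep RL policy; the transformation names and train_and_evaluate are hypothetical placeholders.

```python
import random

TRANSFORMS = ["crop", "flip", "rotate", "blur"]   # predefined preprocessing ops

def train_and_evaluate(pipeline):
    # Placeholder: apply `pipeline` to the training images, train the
    # classifier, and return its validation accuracy.
    return random.random()

best_pipeline, best_acc = None, -1.0
for _ in range(50):                               # fixed search budget
    k = random.randint(1, len(TRANSFORMS))
    pipeline = random.sample(TRANSFORMS, k)       # candidate sequence
    acc = train_and_evaluate(pipeline)
    if acc > best_acc:
        best_pipeline, best_acc = pipeline, acc

print(best_pipeline, best_acc)
```
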

Fisher Information and Natural Gradient Learning of Random Deep Networks

Title Fisher Information and Natural Gradient Learning of Random Deep Networks
Authors Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi
Abstract A deep neural network is a hierarchical nonlinear model transforming input signals to output signals. Its input-output relation is considered to be stochastic, being described for a given input by a parameterized conditional probability distribution of outputs. The space of parameters consisting of weights and biases is a Riemannian manifold, where the metric is defined by the Fisher information matrix. The natural gradient method uses the steepest descent direction in a Riemannian manifold, so it is effective in learning, avoiding plateaus. It requires inversion of the Fisher information matrix, however, which is practically impossible when the matrix has a huge number of dimensions. Many methods for approximating the natural gradient have therefore been introduced. The present paper uses a statistical neurodynamical method to reveal the properties of the Fisher information matrix in a net of random connections under the mean-field approximation. We prove that the Fisher information matrix is unit-wise block diagonal, supplemented by small-order off-block-diagonal terms, which provides a justification for the quasi-diagonal natural gradient method of Y. Ollivier. A unit-wise block-diagonal Fisher metric reduces to the tensor product of the Fisher information matrices of single units. We further prove that the Fisher information matrix of a single unit has a simple reduced form, the sum of a diagonal matrix and a rank-2 matrix of weight-bias correlations. We obtain the inverse of the Fisher information explicitly. We then have an explicit form of the natural gradient that does not rely on numerical matrix inversion, which drastically speeds up stochastic gradient learning.
Tasks
Published 2018-08-22
URL http://arxiv.org/abs/1808.07172v1
PDF http://arxiv.org/pdf/1808.07172v1.pdf
PWC https://paperswithcode.com/paper/fisher-information-and-natural-gradient
Repo
Framework
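
To see why the diagonal-plus-rank-2 structure matters, the sketch below inverts a Fisher block of the form F = D + UCU^T with the Woodbury identity, so only a 2x2 system is ever solved. The particular D, U, and C are synthetic placeholders, not the paper's derived quantities.

```python
import numpy as np

def woodbury_inverse(d, U, C):
    """(diag(d) + U C U^T)^-1 via the Woodbury identity."""
    Dinv_U = U / d[:, None]
    small = np.linalg.inv(np.linalg.inv(C) + U.T @ Dinv_U)   # only a 2x2 inverse
    return np.diag(1.0 / d) - Dinv_U @ small @ Dinv_U.T

rng = np.random.default_rng(0)
n = 1000
d = rng.random(n) + 0.5                 # positive diagonal part
U = rng.standard_normal((n, 2))         # rank-2 weight-bias correlation term
C = np.eye(2)

F = np.diag(d) + U @ C @ U.T
assert np.allclose(woodbury_inverse(d, U, C), np.linalg.inv(F), atol=1e-6)

grad = rng.standard_normal(n)
nat_grad = woodbury_inverse(d, U, C) @ grad   # natural gradient direction
```
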

On the Importance of Visual Context for Data Augmentation in Scene Understanding

Title On the Importance of Visual Context for Data Augmentation in Scene Understanding
Authors Nikita Dvornik, Julien Mairal, Cordelia Schmid
Abstract Performing data augmentation for learning deep neural networks is known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. While simple image transformations can already improve predictive performance in most vision tasks, larger gains can be obtained by leveraging task-specific prior knowledge. In this work, we consider object detection and semantic and instance segmentation, and we augment the training images by blending objects into existing scenes using instance segmentation annotations. We observe that randomly pasting objects on images hurts performance unless the object is placed in the right context. To resolve this issue, we propose an explicit context model, implemented as a convolutional neural network, which predicts whether an image region is suitable for placing a given object. In our experiments, we show that our approach is able to improve object detection as well as semantic and instance segmentation on the PASCAL VOC12 and COCO datasets, with significant gains in a limited-annotation scenario, i.e., when only one category is annotated. We also show that the method is not limited to datasets that come with expensive pixel-wise instance annotations: it can be used when only bounding boxes are available, by employing weakly supervised learning to approximate instance masks.
Tasks Data Augmentation, Instance Segmentation, Object Detection, Scene Understanding, Semantic Segmentation
Published 2018-09-06
URL https://arxiv.org/abs/1809.02492v3
PDF https://arxiv.org/pdf/1809.02492v3.pdf
PWC https://paperswithcode.com/paper/on-the-importance-of-visual-context-for-data
Repo
Framework
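
A minimal sketch of the augmentation step is below: candidate locations are scored by a context model and the object is mask-blended into the scene at the most plausible one. The context_score function is a placeholder; the paper trains a convolutional network for this role.

```python
import numpy as np

rng = np.random.default_rng(0)

def paste_object(scene, obj, mask, top_left):
    """Blend the object into the scene at top_left using its instance mask."""
    out = scene.copy()
    y, x = top_left
    h, w = obj.shape[:2]
    out[y:y + h, x:x + w] = np.where(mask[..., None] > 0, obj,
                                     out[y:y + h, x:x + w])
    return out

def context_score(scene, top_left, obj_shape):
    # Placeholder: the paper trains a CNN that predicts whether a region
    # is a plausible location for the object category.
    return rng.random()

scene = np.zeros((256, 256, 3), np.uint8)
obj = np.full((40, 40, 3), 200, np.uint8)       # toy object crop
mask = np.ones((40, 40), np.uint8)              # its instance mask

# Score candidate locations and paste at the most plausible one.
candidates = [(int(rng.integers(0, 216)), int(rng.integers(0, 216)))
              for _ in range(20)]
best = max(candidates, key=lambda tl: context_score(scene, tl, obj.shape))
augmented = paste_object(scene, obj, mask, best)
```
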

Query-Efficient Black-Box Attack Against Sequence-Based Malware Classifiers

Title Query-Efficient Black-Box Attack Against Sequence-Based Malware Classifiers
Authors Ishai Rosenberg, Asaf Shabtai, Yuval Elovici, Lior Rokach
Abstract In this paper, we present a generic, query-efficient black-box attack against API call-based machine learning malware classifiers. We generate adversarial examples by modifying the malware’s API call sequences and non-sequential features (printable strings); the adversarial examples are misclassified by the target malware classifier without affecting the malware’s functionality. In contrast to previous studies, our attack minimizes the number of malware classifier queries required. In addition, the attacker need only know the class predicted by the malware classifier; knowledge of the classifier’s confidence score is optional. We evaluate the attack’s effectiveness against a variety of malware classifier architectures, including recurrent neural network (RNN) variants, deep neural networks, support vector machines, and gradient boosted decision trees. Our attack success rate is about 98% when the classifier’s confidence score is known and 88% when only the predicted class is known. We implement four state-of-the-art query-efficient attacks and show that ours requires fewer queries and less knowledge about the attacked model’s architecture than other existing query-efficient attacks, making it practical for attacking cloud-based malware classifiers at minimal cost.
Tasks
Published 2018-04-23
URL https://arxiv.org/abs/1804.08778v6
PDF https://arxiv.org/pdf/1804.08778v6.pdf
PWC https://paperswithcode.com/paper/query-efficient-gan-based-black-box-attack
Repo
Framework
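
The spirit of the attack can be sketched as a label-only query loop that inserts functionality-preserving no-op API calls until the predicted class flips. The classify function and the no-op call names below are hypothetical placeholders, not the paper's exact procedure.

```python
import random

NO_OPS = ["GetTickCount", "Sleep", "GetCurrentProcessId"]  # functionality-preserving

def classify(api_sequence):
    # Placeholder for the target black-box classifier: the attacker only
    # sees the predicted label, not a confidence score.
    return "malicious"

def evade(api_sequence, max_queries=500):
    seq = list(api_sequence)
    for query in range(max_queries):
        if classify(seq) == "benign":              # label-only feedback
            return seq, query                      # adversarial sequence found
        pos = random.randrange(len(seq) + 1)
        seq.insert(pos, random.choice(NO_OPS))     # perturb, preserving behavior
    return None, max_queries

adv, queries_used = evade(["CreateFile", "WriteFile", "RegSetValue"])
```
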

Learning to Look around Objects for Top-View Representations of Outdoor Scenes

Title Learning to Look around Objects for Top-View Representations of Outdoor Scenes
Authors Samuel Schulter, Menghua Zhai, Nathan Jacobs, Manmohan Chandraker
Abstract Given a single RGB image of a complex outdoor road scene in the perspective view, we address the novel problem of estimating an occlusion-reasoned semantic scene layout in the top-view. This challenging problem not only requires an accurate understanding of both the 3D geometry and the semantics of the visible scene, but also of occluded areas. We propose a convolutional neural network that learns to predict occluded portions of the scene layout by looking around foreground objects like cars or pedestrians. But instead of hallucinating RGB values, we show that directly predicting the semantics and depths in the occluded areas enables a better transformation into the top-view. We further show that this initial top-view representation can be significantly enhanced by learning priors and rules about typical road layouts from simulated or, if available, map data. Crucially, training our model does not require costly or subjective human annotations for occluded areas or the top-view, but rather uses readily available annotations for standard semantic segmentation. We extensively evaluate and analyze our approach on the KITTI and Cityscapes data sets.
Tasks Semantic Segmentation
Published 2018-03-28
URL http://arxiv.org/abs/1803.10870v1
PDF http://arxiv.org/pdf/1803.10870v1.pdf
PWC https://paperswithcode.com/paper/learning-to-look-around-objects-for-top-view
Repo
Framework

Improving Rotated Text Detection with Rotation Region Proposal Networks

Title Improving Rotated Text Detection with Rotation Region Proposal Networks
Authors Jing Huang, Viswanath Sivakumar, Mher Mnatsakanyan, Guan Pang
Abstract A significant number of images shared on social media platforms such as Facebook and Instagram contain text in various forms. It is increasingly commonplace for bad actors to share misinformation, hate speech, or other kinds of harmful content as text overlaid on images on such platforms. A scene-text understanding system should therefore be able to handle text in the various orientations that an adversary might use. Moreover, such a system can be incorporated into screen readers used to aid the visually impaired. In this work, we extend Rosetta, the scene-text extraction system at Facebook, to efficiently handle text in various orientations. Specifically, we incorporate Rotation Region Proposal Networks (RRPN) into our text extraction pipeline and offer practical suggestions for building and deploying a model that detects and recognizes text in arbitrary orientations efficiently. Experimental results show a significant improvement in detecting rotated text.
Tasks
Published 2018-11-16
URL http://arxiv.org/abs/1811.07031v1
PDF http://arxiv.org/pdf/1811.07031v1.pdf
PWC https://paperswithcode.com/paper/improving-rotated-text-detection-with
Repo
Framework
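
For intuition, the sketch below shows the core change RRPN makes to a standard region proposal network: anchors gain an angle dimension, so each proposal is a rotated box (cx, cy, w, h, theta). The scales, ratios, and angle set are illustrative, not the paper's exact configuration.

```python
import numpy as np

def rotated_anchors(cx, cy, scales=(32, 64), ratios=(1.0, 2.0, 5.0),
                    angles=(-60, -30, 0, 30, 60, 90)):
    """Rotated boxes (cx, cy, w, h, angle in degrees) at one feature location."""
    anchors = []
    for s in scales:
        for r in ratios:
            w, h = s * np.sqrt(r), s / np.sqrt(r)   # keep area roughly s**2
            for a in angles:
                anchors.append((cx, cy, w, h, a))
    return np.array(anchors)

print(rotated_anchors(128, 128).shape)   # (36, 5): 2 scales x 3 ratios x 6 angles
```
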