July 28, 2019

3441 words 17 mins read

Paper Group ANR 163

Some Like it Hoax: Automated Fake News Detection in Social Networks. Combining Lexical Features and a Supervised Learning Approach for Arabic Sentiment Analysis. One Shot Joint Colocalization and Cosegmentation. Universal Consistency and Robustness of Localized Support Vector Machines. Strength Factors: An Uncertainty System for a Quantified Modal …

Some Like it Hoax: Automated Fake News Detection in Social Networks

Title Some Like it Hoax: Automated Fake News Detection in Social Networks
Authors Eugenio Tacchini, Gabriele Ballarin, Marco L. Della Vedova, Stefano Moret, Luca de Alfaro
Abstract In recent years, the reliability of information on the Internet has emerged as a crucial issue of modern society. Social network sites (SNSs) have revolutionized the way in which information is spread by allowing users to freely share content. As a consequence, SNSs are also increasingly used as vectors for the diffusion of misinformation and hoaxes. The amount of disseminated information and the rapidity of its diffusion make it practically impossible to assess reliability in a timely manner, highlighting the need for automatic hoax detection systems. As a contribution towards this objective, we show that Facebook posts can be classified with high accuracy as hoaxes or non-hoaxes on the basis of the users who “liked” them. We present two classification techniques, one based on logistic regression, the other on a novel adaptation of Boolean crowdsourcing algorithms. On a dataset consisting of 15,500 Facebook posts and 909,236 users, we obtain classification accuracies exceeding 99% even when the training set contains less than 1% of the posts. We further show that our techniques are robust: they work even when we restrict our attention to the users who like both hoax and non-hoax posts. These results suggest that mapping the diffusion pattern of information can be a useful component of automatic hoax detection systems.
Tasks Fake News Detection
Published 2017-04-25
URL http://arxiv.org/abs/1704.07506v1
PDF http://arxiv.org/pdf/1704.07506v1.pdf
PWC https://paperswithcode.com/paper/some-like-it-hoax-automated-fake-news
Repo
Framework
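
The core idea above reduces to a very small pipeline: represent each post as a sparse indicator vector over the users who liked it, then fit a standard classifier. Below is a minimal sketch of the logistic-regression variant; the like pairs and helper are illustrative toys, not the authors' dataset or code.

```python
# Sketch: classify posts by the users who liked them.
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import LogisticRegression

def build_post_user_matrix(likes, n_posts, n_users):
    """likes: iterable of (post_id, user_id) integer pairs."""
    rows, cols = zip(*likes)
    return csr_matrix((np.ones(len(rows)), (rows, cols)),
                      shape=(n_posts, n_users))

# Hypothetical toy data: posts 0-1 liked mostly by one user group, 2-3 by another.
likes = [(0, 0), (0, 1), (1, 0), (1, 2), (2, 3), (2, 4), (3, 4), (3, 5)]
X = build_post_user_matrix(likes, n_posts=4, n_users=6)
y = np.array([1, 1, 0, 0])  # 1 = hoax, 0 = non-hoax

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))  # per-post hoax predictions
```

The paper's second technique, the Boolean crowdsourcing adaptation, jointly infers user reliability and post labels and is not captured by this sketch.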

Combining Lexical Features and a Supervised Learning Approach for Arabic Sentiment Analysis

Title Combining Lexical Features and a Supervised Learning Approach for Arabic Sentiment Analysis
Authors Samhaa R. El-Beltagy, Talaat Khalil, Amal Halaby, Muhammad Hammad
Abstract The importance of building sentiment analysis tools for Arabic social media has been recognized during the past couple of years, especially with the rapid increase in the number of Arabic social media users. One of the main difficulties in tackling this problem is that text within social media is mostly colloquial, with many dialects being used within social media platforms. In this paper, we present a set of features that were integrated with a machine learning based sentiment analysis model and applied on Egyptian, Saudi, Levantine, and MSA Arabic social media datasets. Many of the proposed features were derived through the use of an Arabic Sentiment Lexicon. The model also uses emoticon-based features, as well as features related to the input text, such as the number of segments within the text, the length of the text, and whether the text ends with a question mark. We show that the presented features increase accuracy across six of the seven benchmarked datasets we experimented with. Since the developed model outperforms all existing Arabic sentiment analysis systems that have publicly available datasets, we can state that this model represents the state of the art in Arabic sentiment analysis.
Tasks Arabic Sentiment Analysis, Sentiment Analysis
Published 2017-10-23
URL http://arxiv.org/abs/1710.08451v1
PDF http://arxiv.org/pdf/1710.08451v1.pdf
PWC https://paperswithcode.com/paper/combining-lexical-features-and-a-supervised
Repo
Framework
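
To make the feature design concrete, here is a hedged sketch of how lexicon, emoticon, and surface features can be combined and fed to a classifier. The three-word lexicon, emoticon sets, and feature list are placeholders, not the paper's actual lexicon or feature set.

```python
# Sketch: lexicon-derived plus text-surface features for sentiment classification.
from sklearn.linear_model import LogisticRegression

SENTIMENT_LEXICON = {"رائع": 1.0, "جميل": 0.8, "سيء": -1.0}  # word -> polarity (toy)
POSITIVE_EMOTICONS = {":)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":'("}

def extract_features(text):
    tokens = text.split()
    return [
        sum(SENTIMENT_LEXICON.get(t, 0.0) for t in tokens),  # lexicon polarity sum
        sum(t in POSITIVE_EMOTICONS for t in tokens),        # positive emoticons
        sum(t in NEGATIVE_EMOTICONS for t in tokens),        # negative emoticons
        len(tokens),                                         # text length in tokens
        1.0 if text.rstrip().endswith("؟") else 0.0,         # ends with a question mark
    ]

texts = ["رائع جميل :)", "سيء :("]   # toy training texts
labels = [1, 0]                       # 1 = positive, 0 = negative
clf = LogisticRegression().fit([extract_features(t) for t in texts], labels)
```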

One Shot Joint Colocalization and Cosegmentation

Title One Shot Joint Colocalization and Cosegmentation
Authors Abhishek Sharma
Abstract This paper presents a novel framework in which image cosegmentation and colocalization are cast as a single optimization problem that integrates information from low-level appearance cues with that of high-level localization cues in a very weakly supervised manner. In contrast to the multi-task learning paradigm, which learns similar tasks using a shared representation, the proposed framework leverages two representations at different levels and simultaneously discriminates between foreground and background at the bounding-box and superpixel level using discriminative clustering. We show empirically that constraining the two problems at different scales enables the transfer of semantic localization cues to improve cosegmentation output, whereas local appearance-based segmentation cues help colocalization. The unified framework outperforms strong baseline approaches that learn the two problems separately by a large margin on four benchmark datasets. Furthermore, it obtains results competitive with the state of the art for cosegmentation on two benchmark datasets and the second-best result for colocalization on Pascal VOC 2007.
Tasks Multi-Task Learning
Published 2017-05-17
URL http://arxiv.org/abs/1705.06000v1
PDF http://arxiv.org/pdf/1705.06000v1.pdf
PWC https://paperswithcode.com/paper/one-shot-joint-colocalization-and
Repo
Framework

Universal Consistency and Robustness of Localized Support Vector Machines

Title Universal Consistency and Robustness of Localized Support Vector Machines
Authors Florian Dumpert
Abstract The massive amount of data potentially available for pattern discovery in machine learning challenges kernel-based algorithms with respect to runtime and storage capacities. Local approaches might help to relieve these issues. From a statistical point of view, local approaches additionally allow dealing with different structures in the data in different ways. This paper analyses properties of localized kernel-based, non-parametric statistical machine learning methods, in particular of support vector machines (SVMs) and methods close to them. We show that locally learnt kernel methods are universally consistent. Furthermore, we give an upper bound for the maxbias in order to show the statistical robustness of the proposed method.
Tasks
Published 2017-03-19
URL http://arxiv.org/abs/1703.06528v1
PDF http://arxiv.org/pdf/1703.06528v1.pdf
PWC https://paperswithcode.com/paper/universal-consistency-and-robustness-of
Repo
Framework
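
The construction the paper analyses can be sketched concretely: partition the input space, train one SVM per cell, and answer queries with the SVM of the nearest cell. This is a minimal sketch assuming a k-means partition and that every cell contains both classes; the paper's theory covers a more general setting.

```python
# Sketch: a localized SVM that trains one model per k-means region.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

class LocalizedSVM:
    def __init__(self, n_regions=5, **svm_kwargs):
        self.partition = KMeans(n_clusters=n_regions, n_init=10)
        self.svm_kwargs = svm_kwargs
        self.models = {}

    def fit(self, X, y):
        regions = self.partition.fit_predict(X)
        for r in np.unique(regions):
            mask = regions == r
            # Assumes each region contains samples from at least two classes.
            self.models[r] = SVC(**self.svm_kwargs).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        regions = self.partition.predict(X)
        return np.array([self.models[r].predict(x[None, :])[0]
                         for r, x in zip(regions, X)])
```

Each local SVM sees only a fraction of the data, which is what relieves the runtime and storage pressure of kernel methods.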

Strength Factors: An Uncertainty System for a Quantified Modal Logic

Title Strength Factors: An Uncertainty System for a Quantified Modal Logic
Authors Naveen Sundar Govindarajulu, Selmer Bringsjord
Abstract We present a new system S for handling uncertainty in a quantified modal logic (first-order modal logic). The system is based on both probability theory and proof theory, and is derived from Chisholm’s epistemology. We concretize Chisholm’s system by grounding his undefined and primitive (i.e. foundational) concept of reasonableness in probability and proof theory. S can be useful in systems that have to interact with humans and provide justifications for their uncertainty. As a demonstration of the system, we apply it to provide a solution to the lottery paradox. Another advantage of the system is that it can be used to provide uncertainty values for counterfactual statements. Counterfactuals are statements that an agent knows for sure are false; among other cases, they are useful when systems have to explain their actions to users. Uncertainties for counterfactuals fall out naturally from our system. Efficient reasoning even in plain first-order logic is a hard problem. Resolution-based first-order reasoning systems have made significant progress over the last several decades in building systems that have solved non-trivial tasks (even unsolved conjectures in mathematics). We present a sketch of a novel algorithm for reasoning that extends first-order resolution. Finally, while there have been many systems of uncertainty for propositional logics, first-order logics, and propositional modal logics, there has been very little work on systems of uncertainty for first-order modal logics. The work described below is in progress and, once finished, will address this gap.
Tasks
Published 2017-05-30
URL http://arxiv.org/abs/1705.10726v2
PDF http://arxiv.org/pdf/1705.10726v2.pdf
PWC https://paperswithcode.com/paper/strength-factors-an-uncertainty-system-for-a
Repo
Framework

MAT: A Multi-strength Adversarial Training Method to Mitigate Adversarial Attacks

Title MAT: A Multi-strength Adversarial Training Method to Mitigate Adversarial Attacks
Authors Chang Song, Hsin-Pai Cheng, Huanrui Yang, Sicheng Li, Chunpeng Wu, Qing Wu, Hai Li, Yiran Chen
Abstract Some recent works revealed that deep neural networks (DNNs) are vulnerable to so-called adversarial attacks, where input examples are intentionally perturbed to fool DNNs. In this work, we revisit the DNN training process that includes adversarial examples in the training dataset so as to improve DNNs’ resilience to adversarial attacks, namely, adversarial training. Our experiments show that different adversarial strengths, i.e., perturbation levels of adversarial examples, have different working zones in which they resist attacks. Based on this observation, we propose a multi-strength adversarial training method (MAT) that combines adversarial training examples with different adversarial strengths to defend against adversarial attacks. Two training structures - mixed MAT and parallel MAT - are developed to facilitate the tradeoffs between training time and memory occupation. Our results show that MAT can substantially reduce the accuracy degradation of deep learning systems under adversarial attacks on MNIST, CIFAR-10, CIFAR-100, and SVHN.
Tasks
Published 2017-05-27
URL http://arxiv.org/abs/1705.09764v2
PDF http://arxiv.org/pdf/1705.09764v2.pdf
PWC https://paperswithcode.com/paper/mat-a-multi-strength-adversarial-training
Repo
Framework
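
A hedged sketch of the mixed-MAT structure, using FGSM as the attack for concreteness: each clean batch is augmented with adversarial copies crafted at several strengths, and the loss averages over all of them. The model, data loader, and epsilon values are placeholders; the paper's exact attack and training structures may differ.

```python
# Sketch: mixed multi-strength adversarial training with FGSM (PyTorch).
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM perturbation of strength eps."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).detach()

def train_epoch_mat(model, loader, optimizer, epsilons=(0.05, 0.1, 0.2)):
    model.train()
    for x, y in loader:
        # Clean batch plus one adversarial batch per strength.
        batches = [x] + [fgsm(model, x, y, eps) for eps in epsilons]
        loss = sum(F.cross_entropy(model(b), y) for b in batches) / len(batches)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The parallel variant trades training time against memory differently; its structure is not shown here.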

Verification & Validation of Agent Based Simulations using the VOMAS (Virtual Overlay Multi-agent System) approach

Title Verification & Validation of Agent Based Simulations using the VOMAS (Virtual Overlay Multi-agent System) approach
Authors Muaz A. Niazi, Amir Hussain, Mario Kolberg
Abstract Agent-based models are very popular in a number of different areas. For example, they have been used in domains ranging from the modeling of tumor growth, immune systems, and molecules to models of social networks, crowds, and self-organizing computer and mobile networks. One reason for their success is their intuitiveness and similarity to human cognition. However, despite this power of abstraction and wide applicability, agent-based models are hard to validate. Moreover, building valid and credible simulations is not just a challenging task but also a crucial exercise to ensure that what we are modeling is, at some level of abstraction, a model of our conceptual system: the system that we have in mind. In this paper, we address this important area of validation of agent-based models by presenting a novel technique with broad applicability across all kinds of agent-based models. We present a framework in which a virtual overlay multi-agent system can be used to validate simulation models. Our technique, which allows for the validation of agent-based simulations, uses VOMAS: a Virtual Overlay Multi-agent System. This overlay multi-agent system can comprise various types of agents, which form an overlay on top of the agent-based simulation model that needs to be validated. Besides being able to watch and log, each of these agents contains clearly defined constraints which, if violated, can be logged in real time. To demonstrate its effectiveness, we show its broad applicability in a wide variety of simulation models ranging from social sciences to computer networks, in both spatial and non-spatial conceptual models.
Tasks
Published 2017-08-08
URL http://arxiv.org/abs/1708.02361v1
PDF http://arxiv.org/pdf/1708.02361v1.pdf
PWC https://paperswithcode.com/paper/verification-validation-of-agent-based
Repo
Framework
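
The VOMAS idea maps naturally onto small monitor objects: each overlay agent holds an explicit constraint over the simulation state and logs violations as the simulation runs. A minimal sketch under assumed names; the actual framework is richer (agent interaction, spatial overlays, etc.).

```python
# Sketch: overlay monitor agents that log constraint violations in real time.
import logging

logging.basicConfig(level=logging.INFO)

class MonitorAgent:
    def __init__(self, name, constraint, message):
        self.name = name
        self.constraint = constraint  # callable: state -> bool (True = satisfied)
        self.message = message

    def observe(self, state):
        if not self.constraint(state):
            logging.warning("[%s] t=%s violation: %s",
                            self.name, state["t"], self.message)

# Hypothetical invariant for a tumor-growth simulation's state dict.
monitors = [MonitorAgent("cell-bound",
                         lambda s: s["cells"] <= 10_000,
                         "cell count left the model's validity range")]

state = {"t": 0, "cells": 0}
for step in range(1, 6):  # toy simulation loop
    state = {"t": step, "cells": state["cells"] + 4_000}
    for m in monitors:
        m.observe(state)
```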

Bayesian $l_0$-regularized Least Squares

Title Bayesian $l_0$-regularized Least Squares
Authors Nicholas G. Polson, Lei Sun
Abstract Bayesian $l_0$-regularized least squares is a variable selection technique for high-dimensional predictors. The challenge is optimizing a non-convex objective function via a search over the model space consisting of all possible predictor combinations. Spike-and-slab (a.k.a. Bernoulli-Gaussian) priors are the gold standard for Bayesian variable selection, with the caveat of computational speed and scalability. Single Best Replacement (SBR) provides a fast, scalable alternative. We provide a link between Bayesian regularization and proximal updating, which establishes an equivalence between finding a posterior mode and finding a posterior mean with a different regularization prior. This allows us to use SBR to find the spike-and-slab estimator. To illustrate our methodology, we provide simulation evidence and a real data example on the statistical properties and computational efficiency of SBR versus direct posterior sampling using spike-and-slab priors. Finally, we conclude with directions for future research.
Tasks
Published 2017-05-31
URL http://arxiv.org/abs/1706.00098v2
PDF http://arxiv.org/pdf/1706.00098v2.pdf
PWC https://paperswithcode.com/paper/bayesian-l_0-regularized-least-squares
Repo
Framework
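
For intuition, the SBR search can be sketched in a few lines: starting from an empty support, repeatedly toggle the single predictor (add or remove) whose flip most decreases the $l_0$-penalized residual sum of squares, and stop when no flip improves it. This naive version re-solves a least-squares problem per candidate flip; practical implementations use efficient updates.

```python
# Sketch: Single Best Replacement for l0-penalized least squares.
import numpy as np

def objective(X, y, support, lam):
    """||y - X_S beta_S||^2 + lam * |S|, with beta_S the LS fit on support S."""
    if not support:
        return float(y @ y)
    cols = sorted(support)
    beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    r = y - X[:, cols] @ beta
    return float(r @ r) + lam * len(support)

def sbr(X, y, lam):
    support = set()
    current = objective(X, y, support, lam)
    while True:
        # Score every single replacement: toggle one index in or out of the support.
        best, j = min((objective(X, y, support ^ {j}, lam), j)
                      for j in range(X.shape[1]))
        if best >= current:
            return support
        support ^= {j}
        current = best
```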

Fast Fine-grained Image Classification via Weakly Supervised Discriminative Localization

Title Fast Fine-grained Image Classification via Weakly Supervised Discriminative Localization
Authors Xiangteng He, Yuxin Peng, Junjie Zhao
Abstract Fine-grained image classification aims to recognize hundreds of subcategories within each basic-level category. Existing methods employ discriminative localization to find the key distinctions among subcategories. However, they generally have two limitations: (1) discriminative localization relies on region proposal methods to hypothesize the locations of discriminative regions, which is time-consuming; (2) the training of discriminative localization depends on object or part annotations, which are highly labor-intensive. It is highly challenging to address these two key limitations simultaneously, and existing methods only focus on one of them. Therefore, we propose a weakly supervised discriminative localization approach (WSDL) for fast fine-grained image classification that addresses both limitations at the same time. Its main advantages are: (1) an n-pathway end-to-end discriminative localization network is designed to improve classification speed; it simultaneously localizes multiple different discriminative regions for one image to boost classification accuracy, and shares full-image convolutional features generated by a region proposal network to accelerate the generation of region proposals and reduce the cost of convolution operations; (2) multi-level attention-guided localization learning is proposed to localize discriminative regions with different focuses automatically, without using object or part annotations, avoiding their labeling cost. Attentions at different levels focus on different characteristics of the image, which are complementary and boost classification accuracy. Both are jointly employed to simultaneously improve classification speed and eliminate the dependence on object and part annotations. Compared with state-of-the-art methods on two widely used fine-grained image classification datasets, our WSDL approach achieves the best performance.
Tasks Fine-Grained Image Classification, Image Classification
Published 2017-09-30
URL http://arxiv.org/abs/1710.01168v1
PDF http://arxiv.org/pdf/1710.01168v1.pdf
PWC https://paperswithcode.com/paper/fast-fine-grained-image-classification-via
Repo
Framework

Guetzli: Perceptually Guided JPEG Encoder

Title Guetzli: Perceptually Guided JPEG Encoder
Authors Jyrki Alakuijala, Robert Obryk, Ostap Stoliarchuk, Zoltan Szabadka, Lode Vandevenne, Jan Wassenberg
Abstract Guetzli is a new JPEG encoder that aims to produce visually indistinguishable images at a lower bit-rate than other common JPEG encoders. It optimizes both the JPEG global quantization tables and the DCT coefficient values in each JPEG block using a closed-loop optimizer. Guetzli uses Butteraugli, our perceptual distance metric, as the source of feedback in its optimization process. We reach a 29-45% reduction in data size for a given perceptual distance, according to Butteraugli, in comparison to other compressors we tried. Guetzli’s computation is currently extremely slow, which limits its applicability to compressing static content and serving as a proof-of-concept that we can achieve significant reductions in size by combining advanced psychovisual models with lossy compression techniques.
Tasks Quantization
Published 2017-03-13
URL http://arxiv.org/abs/1703.04421v1
PDF http://arxiv.org/pdf/1703.04421v1.pdf
PWC https://paperswithcode.com/paper/guetzli-perceptually-guided-jpeg-encoder
Repo
Framework
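
Guetzli's actual search, over quantization tables and per-block DCT coefficients, is far beyond a snippet, but the closed-loop principle is easy to illustrate: re-encode, measure perceptual distance against the original, and push the rate down while the distance stays within budget. In the sketch below a crude RMSE stands in for Butteraugli, and a plain quality knob stands in for Guetzli's finer-grained controls.

```python
# Sketch: closed-loop JPEG encoding against a perceptual-distance budget.
import io
import numpy as np
from PIL import Image

def distance(a, b):
    """Crude RMSE stand-in for the Butteraugli metric."""
    return float(np.sqrt(np.mean(
        (np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2)))

def encode_under_budget(img, budget, lo=30, hi=95):
    """img: PIL RGB image. Returns the smallest acceptable JPEG bytes found."""
    best = None
    while lo <= hi:                   # binary search over the quality knob
        q = (lo + hi) // 2
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=q)
        decoded = Image.open(io.BytesIO(buf.getvalue()))
        if distance(img, decoded) <= budget:
            best, hi = buf.getvalue(), q - 1  # acceptable: try harder compression
        else:
            lo = q + 1                        # too lossy: back off
    return best
```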

Asynchronous Distributed Variational Gaussian Processes for Regression

Title Asynchronous Distributed Variational Gaussian Processes for Regression
Authors Hao Peng, Shandian Zhe, Yuan Qi
Abstract Gaussian processes (GPs) are powerful non-parametric function estimators. However, their applications are largely limited by the expensive computational cost of the inference procedures. Existing stochastic or distributed synchronous variational inference methods, although they have alleviated this issue by scaling up GPs to millions of samples, are still far from satisfactory for real-world large applications, where the data sizes are often orders of magnitude larger, say, billions. To solve this problem, we propose ADVGP, the first Asynchronous Distributed Variational Gaussian Process inference for regression, on the recent large-scale machine learning platform PARAMETERSERVER. ADVGP uses a novel, flexible variational framework based on a weight-space augmentation and implements highly efficient, asynchronous proximal gradient optimization. While maintaining comparable or better predictive performance, ADVGP greatly improves upon the efficiency of existing variational methods. With ADVGP, we effortlessly scale up GP regression to a real-world application with billions of samples and demonstrate prediction accuracy superior to that of popular linear models.
Tasks Gaussian Processes
Published 2017-04-22
URL http://arxiv.org/abs/1704.06735v3
PDF http://arxiv.org/pdf/1704.06735v3.pdf
PWC https://paperswithcode.com/paper/asynchronous-distributed-variational-gaussian
Repo
Framework
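
ADVGP's variational framework and its PARAMETERSERVER implementation are out of scope for a snippet, but the weight-space view it builds on can be sketched: approximate an RBF-kernel GP with random Fourier features, after which inference reduces to Bayesian linear regression with a closed-form Gaussian posterior over the weights. The data, feature count, and hyperparameters below are illustrative.

```python
# Sketch: weight-space GP regression via random Fourier features.
import numpy as np

rng = np.random.default_rng(0)

def rff(X, W, b):
    """Random Fourier features approximating an RBF kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

D, lengthscale, noise = 200, 1.0, 0.1
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + noise * rng.standard_normal(50)

W = rng.standard_normal((1, D)) / lengthscale
b = rng.uniform(0, 2 * np.pi, D)
Phi = rff(X, W, b)

# Gaussian posterior over weights with prior N(0, I):
# Sigma = A^{-1}, mu = Sigma @ Phi.T @ y / noise^2.
A = Phi.T @ Phi / noise**2 + np.eye(D)
Sigma = np.linalg.inv(A)
mu = Sigma @ Phi.T @ y / noise**2

X_test = np.linspace(-3, 3, 5)[:, None]
Phi_t = rff(X_test, W, b)
mean = Phi_t @ mu                                         # predictive mean
var = np.sum((Phi_t @ Sigma) * Phi_t, axis=1) + noise**2  # predictive variance
```

Because the posterior factorizes through a finite weight vector, updates to it can be distributed and applied asynchronously, which is the property ADVGP exploits.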

Clustering for Different Scales of Measurement - the Gap-Ratio Weighted K-means Algorithm

Title Clustering for Different Scales of Measurement - the Gap-Ratio Weighted K-means Algorithm
Authors Joris Guérin, Olivier Gibaru, Stéphane Thiery, Eric Nyiri
Abstract This paper describes a method for clustering data that are spread out over large regions and whose dimensions are on different scales of measurement. The algorithm was developed for a robotics application consisting of sorting and storing objects in an unsupervised way. The toy dataset used to validate the application consists of Lego bricks of different shapes and colors. The uncontrolled lighting conditions and the use of RGB color features respectively yield data with a large spread and with different scales of measurement across dimensions. To overcome the combination of these two characteristics in the data, we have developed a new weighted K-means algorithm, called gap-ratio K-means, which weights each dimension of the feature space before running the K-means algorithm. The weight associated with a feature is proportional to the ratio between the biggest gap separating two consecutive data points and the average of all the other gaps. This method is compared with two other variants of K-means on the Lego bricks clustering problem as well as on two other common classification datasets.
Tasks
Published 2017-03-22
URL http://arxiv.org/abs/1703.07625v1
PDF http://arxiv.org/pdf/1703.07625v1.pdf
PWC https://paperswithcode.com/paper/clustering-for-different-scales-of
Repo
Framework
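
The gap-ratio weighting itself is only a few lines, as sketched below: for each feature, sort its values, take the largest gap between consecutive points, divide by the mean of the remaining gaps, and use that ratio to rescale the feature before ordinary K-means. This assumes at least three samples per dimension; any weight normalization the paper applies is omitted.

```python
# Sketch: gap-ratio weighted K-means.
import numpy as np
from sklearn.cluster import KMeans

def gap_ratio_weights(X):
    weights = np.empty(X.shape[1])
    for d in range(X.shape[1]):
        gaps = np.diff(np.sort(X[:, d]))           # consecutive-point gaps
        others = np.delete(gaps, np.argmax(gaps))  # all gaps but the biggest
        weights[d] = gaps.max() / others.mean()
    return weights

def gap_ratio_kmeans(X, k):
    w = gap_ratio_weights(X)
    return KMeans(n_clusters=k, n_init=10).fit_predict(X * w)
```

A dimension with one dominant gap (a likely cluster boundary) gets boosted, while dimensions whose spread comes from uniform noise are damped.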

Automatic Generation of Constrained Furniture Layouts

Title Automatic Generation of Constrained Furniture Layouts
Authors Paul Henderson, Kartic Subr, Vittorio Ferrari
Abstract Efficient authoring of vast virtual environments hinges on algorithms that are able to automatically generate content while also being controllable. We propose a method to automatically generate furniture layouts for indoor environments. Our method is simple, efficient, human-interpretable and amenable to a wide variety of constraints. We model the composition of rooms into classes of objects and learn joint (co-occurrence) statistics from a database of training layouts. We generate new layouts by performing a sequence of conditional sampling steps, exploiting the statistics learned from the database. The generated layouts are specified as 3D object models, along with their positions and orientations. We show they are of equivalent perceived quality to the training layouts, and compare favorably to a state-of-the-art method. We incorporate constraints using a general mechanism – rejection sampling – which provides great flexibility at the cost of extra computation. We demonstrate the versatility of our method by applying a wide variety of constraints relevant to real-world applications.
Tasks
Published 2017-11-29
URL http://arxiv.org/abs/1711.10939v3
PDF http://arxiv.org/pdf/1711.10939v3.pdf
PWC https://paperswithcode.com/paper/automatic-generation-of-constrained-furniture
Repo
Framework
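
The constraint mechanism is plain rejection sampling, sketched below with stub functions: draw a layout from the unconstrained model and keep it only if every constraint holds. The sampler and the separation constraint are toy stand-ins for the paper's learned conditional sampler and application constraints.

```python
# Sketch: rejection sampling over generated layouts.
import random

def sample_layout():
    # Stand-in for sequential conditional sampling from learned co-occurrence
    # statistics: random (x, y) positions for two object classes in a 4x4 room.
    return {"bed": (random.uniform(0, 4), random.uniform(0, 4)),
            "desk": (random.uniform(0, 4), random.uniform(0, 4))}

def min_separation(layout, d=1.0):
    (x1, y1), (x2, y2) = layout["bed"], layout["desk"]
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 >= d ** 2

def sample_constrained(constraints, max_tries=10_000):
    for _ in range(max_tries):
        layout = sample_layout()
        if all(c(layout) for c in constraints):
            return layout
    raise RuntimeError("constraints too tight for rejection sampling")

layout = sample_constrained([min_separation])
```

As the abstract notes, this buys flexibility at the cost of extra computation: tight constraints raise the rejection rate.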

Quality Resilient Deep Neural Networks

Title Quality Resilient Deep Neural Networks
Authors Samuel Dodge, Lina Karam
Abstract We study deep neural networks for classification of images with quality distortions. We first show that networks fine-tuned on distorted data greatly outperform the original networks when tested on distorted data. However, fine-tuned networks perform poorly on quality distortions that they have not been trained for. We propose a mixture of experts ensemble method that is robust to different types of distortions. The “experts” in our model are trained on a particular type of distortion. The output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The gating network is trained to predict optimal weights for a particular distortion type and level. During testing, the network is blind to the distortion level and type, yet can still assign appropriate weights to the expert models. We additionally investigate weight sharing methods for the mixture model and show that improved performance can be achieved with a large reduction in the number of unique network parameters.
Tasks
Published 2017-03-23
URL http://arxiv.org/abs/1703.08119v1
PDF http://arxiv.org/pdf/1703.08119v1.pdf
PWC https://paperswithcode.com/paper/quality-resilient-deep-neural-networks
Repo
Framework
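
Structurally, the ensemble is a gated weighted sum, sketched below in PyTorch with placeholder modules: each expert is a network fine-tuned on one distortion type, and a gating network predicts per-expert weights from the (possibly distorted) input.

```python
# Sketch: mixture of distortion experts with a learned gating network.
import torch
import torch.nn as nn

class GatedExperts(nn.Module):
    def __init__(self, experts, gate):
        super().__init__()
        self.experts = nn.ModuleList(experts)  # one classifier per distortion type
        self.gate = gate                       # maps image -> one logit per expert

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=1)            # (B, E)
        outputs = torch.stack([e(x) for e in self.experts], 1)  # (B, E, C)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)     # (B, C)
```

At test time the gate, not the user, decides which experts to trust, which is how the model stays blind to the distortion type and level.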

Generating Time-Based Label Refinements to Discover More Precise Process Models

Title Generating Time-Based Label Refinements to Discover More Precise Process Models
Authors Niek Tax, Emin Alasgarov, Natalia Sidorova, Wil M. P. van der Aalst, Reinder Haakma
Abstract Process mining is a research field focused on the analysis of event data with the aim of extracting insights related to dynamic behavior. Applying process mining techniques to data from smart home environments has the potential to provide valuable insights into (un)healthy habits and to contribute to ambient assisted living solutions. Finding the right event labels to enable the application of process mining techniques is, however, far from trivial, as simply using the triggering sensor as the label for sensor events results in uninformative models that allow for too much behavior (overgeneralizing). Refinements of sensor-level event labels suggested by domain experts have been shown to enable the discovery of more precise and insightful process models. However, there exists no automated approach to generate refinements of event labels in the context of process mining. In this paper we propose a framework for the automated generation of label refinements based on the time attribute of events, allowing us to distinguish behaviourally different instances of the same event type based on their time attribute. We show on a case study with real-life smart home event data that using automatically generated refined labels in process discovery yields more specific, and therefore more insightful, process models. We observe that one label refinement can affect the usefulness of other label refinements when used together. Therefore, we explore four strategies to generate useful combinations of multiple label refinements and evaluate them on three real-life smart home event logs.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09359v2
PDF http://arxiv.org/pdf/1705.09359v2.pdf
PWC https://paperswithcode.com/paper/generating-time-based-label-refinements-to
Repo
Framework
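
A single time-based refinement can be sketched directly: cluster the occurrence times of one event label and append the cluster id, so one sensor label splits into behaviourally distinct variants. The sketch below clusters raw hour-of-day with k-means for simplicity; it ignores the circular nature of time and is not the paper's exact procedure.

```python
# Sketch: refining one event label by clustering its time of day.
import numpy as np
from sklearn.cluster import KMeans

def refine_by_time(events, label, k=2):
    """events: list of (label, hour_of_day) pairs; returns a relabeled copy."""
    hours = np.array([[h] for l, h in events if l == label])
    clusters = iter(KMeans(n_clusters=k, n_init=10).fit_predict(hours))
    return [(f"{l}@{next(clusters)}" if l == label else l, h)
            for l, h in events]

events = [("motion-kitchen", 7), ("motion-kitchen", 8),
          ("motion-kitchen", 19), ("motion-kitchen", 20),
          ("door-front", 8)]
print(refine_by_time(events, "motion-kitchen"))
# e.g. morning events become "motion-kitchen@0", evening ones "motion-kitchen@1"
```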