May 7, 2019

2775 words 14 mins read

Paper Group ANR 123

Fast low-level pattern matching algorithm. VHT: Vertical Hoeffding Tree. Store Location Selection via Mining Search Query Logs of Baidu Maps. Vote-boosting ensembles. An Evaluation of Models for Runtime Approximation in Link Discovery. Juxtaposition of System Dynamics and Agent-based Simulation for a Case Study in Immunosenescence. HordeQBF: A Modu …

Fast low-level pattern matching algorithm


Title	Fast low-level pattern matching algorithm
Authors	Janja Paliska Soldo, Ana Sovic Krzic, and Damir Sersic
Abstract	This paper focuses on pattern matching in the DNA sequence. It was inspired by a previously reported method that proposes encoding both pattern and sequence using prime numbers. Although fast, the method is limited to rather small pattern lengths, due to computing precision problem. Our approach successfully deals with large patterns, due to our implementation that uses modular arithmetic. In order to get the results very fast, the code was adapted for multithreading and parallel implementations. The method is reduced to assembly language level instructions, thus the final result shows significant time and memory savings compared to the reference algorithm.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1611.06115v1
PDF	http://arxiv.org/pdf/1611.06115v1.pdf
PWC	https://paperswithcode.com/paper/fast-low-level-pattern-matching-algorithm
Repo
Framework

VHT: Vertical Hoeffding Tree


Title	VHT: Vertical Hoeffding Tree
Authors	Nicolas Kourtellis, Gianmarco De Francisci Morales, Albert Bifet, Arinto Murdopo
Abstract	IoT Big Data requires new machine learning methods able to scale to large size of data arriving at high speed. Decision trees are popular machine learning models since they are very effective, yet easy to interpret and visualize. In the literature, we can find distributed algorithms for learning decision trees, and also streaming algorithms, but not algorithms that combine both features. In this paper we present the Vertical Hoeffding Tree (VHT), the first distributed streaming algorithm for learning decision trees. It features a novel way of distributing decision trees via vertical parallelism. The algorithm is implemented on top of Apache SAMOA, a platform for mining distributed data streams, and thus able to run on real-world clusters. We run several experiments to study the accuracy and throughput performance of our new VHT algorithm, as well as its ability to scale while keeping its superior performance with respect to non-distributed decision trees.
Tasks
Published	2016-07-28
URL	http://arxiv.org/abs/1607.08325v1
PDF	http://arxiv.org/pdf/1607.08325v1.pdf
PWC	https://paperswithcode.com/paper/vht-vertical-hoeffding-tree
Repo
Framework

Store Location Selection via Mining Search Query Logs of Baidu Maps


Title	Store Location Selection via Mining Search Query Logs of Baidu Maps
Authors	Mengwen Xu, Tianyi Wang, Zhengwei Wu, Jingbo Zhou, Jian Li, Haishan Wu
Abstract	Choosing a good location when opening a new store is crucial for the future success of a business. Traditional methods include offline manual survey, which is very time consuming, and analytic models based on census data, which are un- able to adapt to the dynamic market. The rapid increase of the availability of big data from various types of mobile devices, such as online query data and offline positioning data, provides us with the possibility to develop automatic and accurate data-driven prediction models for business store placement. In this paper, we propose a Demand Distribution Driven Store Placement (D3SP) framework for business store placement by mining search query data from Baidu Maps. D3SP first detects the spatial-temporal distributions of customer demands on different business services via query data from Baidu Maps, the largest online map search engine in China, and detects the gaps between demand and sup- ply. Then we determine candidate locations via clustering such gaps. In the final stage, we solve the location optimization problem by predicting and ranking the number of customers. We not only deploy supervised regression models to predict the number of customers, but also learn to rank models to directly rank the locations. We evaluate our framework on various types of businesses in real-world cases, and the experiments results demonstrate the effectiveness of our methods. D3SP as the core function for store placement has already been implemented as a core component of our business analytics platform and could be potentially used by chain store merchants on Baidu Nuomi.
Tasks
Published	2016-06-12
URL	http://arxiv.org/abs/1606.03662v1
PDF	http://arxiv.org/pdf/1606.03662v1.pdf
PWC	https://paperswithcode.com/paper/store-location-selection-via-mining-search
Repo
Framework

Vote-boosting ensembles


Title	Vote-boosting ensembles
Authors	Maryam Sabzevari, Gonzalo Martínez-Muñoz, Alberto Suárez
Abstract	Vote-boosting is a sequential ensemble learning method in which the individual classifiers are built on different weighted versions of the training data. To build a new classifier, the weight of each training instance is determined in terms of the degree of disagreement among the current ensemble predictions for that instance. For low class-label noise levels, especially when simple base learners are used, emphasis should be made on instances for which the disagreement rate is high. When more flexible classifiers are used and as the noise level increases, the emphasis on these uncertain instances should be reduced. In fact, at sufficiently high levels of class-label noise, the focus should be on instances on which the ensemble classifiers agree. The optimal type of emphasis can be automatically determined using cross-validation. An extensive empirical analysis using the beta distribution as emphasis function illustrates that vote-boosting is an effective method to generate ensembles that are both accurate and robust.
Tasks
Published	2016-06-30
URL	http://arxiv.org/abs/1606.09458v2
PDF	http://arxiv.org/pdf/1606.09458v2.pdf
PWC	https://paperswithcode.com/paper/vote-boosting-ensembles
Repo
Framework

An Evaluation of Models for Runtime Approximation in Link Discovery


Title	An Evaluation of Models for Runtime Approximation in Link Discovery
Authors	Kleanthi Georgala, Micheal Hoffmann, Axel-Cyrille Ngonga Ngomo
Abstract	Time-efficient link discovery is of central importance to implement the vision of the Semantic Web. Some of the most rapid Link Discovery approaches rely internally on planning to execute link specifications. In newer works, linear models have been used to estimate the runtime the fastest planners. However, no other category of models has been studied for this purpose so far. In this paper, we study non-linear runtime estimation functions for runtime estimation. In particular, we study exponential and mixed models for the estimation of the runtimes of planners. To this end, we evaluate three different models for runtime on six datasets using 400 link specifications. We show that exponential and mixed models achieve better fits when trained but are only to be preferred in some cases. Our evaluation also shows that the use of better runtime approximation models has a positive impact on the overall execution of link specifications.
Tasks
Published	2016-12-01
URL	http://arxiv.org/abs/1612.00240v1
PDF	http://arxiv.org/pdf/1612.00240v1.pdf
PWC	https://paperswithcode.com/paper/an-evaluation-of-models-for-runtime
Repo
Framework

Juxtaposition of System Dynamics and Agent-based Simulation for a Case Study in Immunosenescence


Title	Juxtaposition of System Dynamics and Agent-based Simulation for a Case Study in Immunosenescence
Authors	Grazziela P. Figueredo, Peer-Olaf Siebers, Uwe Aickelin, Amanda Whitbrook, Jonathan M. Garibaldi
Abstract	Advances in healthcare and in the quality of life significantly increase human life expectancy. With the ageing of populations, new un-faced challenges are brought to science. The human body is naturally selected to be well-functioning until the age of reproduction to keep the species alive. However, as the lifespan extends, unseen problems due to the body deterioration emerge. There are several age-related diseases with no appropriate treatment; therefore, the complex ageing phenomena needs further understanding. Immunosenescence, the ageing of the immune system, is highly correlated to the negative effects of ageing, such as the increase of auto-inflammatory diseases and decrease in responsiveness to new diseases. Besides clinical and mathematical tools, we believe there is opportunity to further exploit simulation tools to understand immunosenescence. Compared to real-world experimentation, benefits include time and cost effectiveness due to the laborious, resource-intensiveness of the biological environment and the possibility of conducting experiments without ethic restrictions. Contrasted with mathematical models, simulation modelling is more suitable for representing complex systems and emergence. In addition, there is the belief that simulation models are easier to communicate in interdisciplinary contexts. Our work investigates the usefulness of simulations to understand immunosenescence by employing two different simulation methods, agent-based and system dynamics simulation, to a case study of immune cells depletion with age.
Tasks
Published	2016-07-20
URL	http://arxiv.org/abs/1607.05888v1
PDF	http://arxiv.org/pdf/1607.05888v1.pdf
PWC	https://paperswithcode.com/paper/juxtaposition-of-system-dynamics-and-agent
Repo
Framework

HordeQBF: A Modular and Massively Parallel QBF Solver


Title	HordeQBF: A Modular and Massively Parallel QBF Solver
Authors	Tomas Balyo, Florian Lonsing
Abstract	The recently developed massively parallel satisfiability (SAT) solver HordeSAT was designed in a modular way to allow the integration of any sequential CDCL-based SAT solver in its core. We integrated the QCDCL-based quantified Boolean formula (QBF) solver DepQBF in HordeSAT to obtain a massively parallel QBF solver—HordeQBF. In this paper we describe the details of this integration and report on results of the experimental evaluation of HordeQBF’s performance. HordeQBF achieves superlinear average and median speedup on the hard application instances of the 2014 QBF Gallery.
Tasks
Published	2016-04-13
URL	http://arxiv.org/abs/1604.03793v1
PDF	http://arxiv.org/pdf/1604.03793v1.pdf
PWC	https://paperswithcode.com/paper/hordeqbf-a-modular-and-massively-parallel-qbf
Repo
Framework

Efficient Algorithms for Adversarial Contextual Learning


Title	Efficient Algorithms for Adversarial Contextual Learning
Authors	Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire
Abstract	We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. In this problem, the learner repeatedly makes an action on the basis of a context and receives reward for the chosen action, with the goal of achieving reward competitive with a large class of policies. We analyze two settings: i) in the transductive setting the learner knows the set of contexts a priori, ii) in the small separator setting, there exists a small set of contexts such that any two policies behave differently in one of the contexts in the set. Our algorithms fall into the follow the perturbed leader family \cite{Kalai2005} and achieve regret $O(T^{3/4}\sqrt{K\log(N)})$ in the transductive setting and $O(T^{2/3} d^{3/4} K\sqrt{\log(N)})$ in the separator setting, where $K$ is the number of actions, $N$ is the number of baseline policies, and $d$ is the size of the separator. We actually solve the more general adversarial contextual semi-bandit linear optimization problem, whilst in the full information setting we address the even more general contextual combinatorial optimization. We provide several extensions and implications of our algorithms, such as switching regret and efficient learning with predictable sequences.
Tasks	Combinatorial Optimization
Published	2016-02-08
URL	http://arxiv.org/abs/1602.02454v1
PDF	http://arxiv.org/pdf/1602.02454v1.pdf
PWC	https://paperswithcode.com/paper/efficient-algorithms-for-adversarial
Repo
Framework

Sense Embedding Learning for Word Sense Induction


Title	Sense Embedding Learning for Word Sense Induction
Authors	Linfeng Song, Zhiguo Wang, Haitao Mi, Daniel Gildea
Abstract	Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or cooccurrence features, and train a model for each polysemous word individually. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embedding) for each polysemous word. In the testing stage, our method represents each instance as a contextual vector, and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are (1) distributed sense vectors are taken as the knowledge representations which are trained discriminatively, and usually have better performance than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids under the mutlitask learning framework. Evaluated on SemEval-2010 WSI dataset, our method outperforms all participants and most of the recent state-of-the-art methods. We further verify the two advantages by comparing with carefully designed baselines.
Tasks	Word Sense Induction
Published	2016-06-17
URL	http://arxiv.org/abs/1606.05409v2
PDF	http://arxiv.org/pdf/1606.05409v2.pdf
PWC	https://paperswithcode.com/paper/sense-embedding-learning-for-word-sense
Repo
Framework

An Application of the Generalized Rectangular Fuzzy Model to Critical Thinking Assessment


Title	An Application of the Generalized Rectangular Fuzzy Model to Critical Thinking Assessment
Authors	Igor Ya. Subbotin, Michael Gr. Voskoglou
Abstract	The authors apply the Generalized Rectangular Model to assessing critical thinking skills and its relations with their language competency.
Tasks
Published	2016-01-08
URL	http://arxiv.org/abs/1601.03065v1
PDF	http://arxiv.org/pdf/1601.03065v1.pdf
PWC	https://paperswithcode.com/paper/an-application-of-the-generalized-rectangular
Repo
Framework

Evolutionary Demographic Algorithms


Title	Evolutionary Demographic Algorithms
Authors	Marco AR Erra, Pedro MM Mitra, Agostinho C Rosa
Abstract	Most of the problems in genetic algorithms are very complex and demand a large amount of resources that current technology can not offer. Our purpose was to develop a Java-JINI distributed library that implements Genetic Algorithms with sub-populations (coarse grain) and a graphical interface in order to configure and follow the evolution of the search. The sub-populations are simulated/evaluated in personal computers connected trough a network, keeping in mind different models of sub-populations, migration policies and network topologies. We show that this model delays the convergence of the population keeping a higher level of genetic diversity and allows a much greater number of evaluations since they are distributed among several computers compared with the traditional Genetic Algorithms.
Tasks
Published	2016-05-22
URL	http://arxiv.org/abs/1605.06714v1
PDF	http://arxiv.org/pdf/1605.06714v1.pdf
PWC	https://paperswithcode.com/paper/evolutionary-demographic-algorithms
Repo
Framework

Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data


Title	Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data
Authors	R. Devon Hjelm, Sergey M. Plis, Vince C. Calhoun
Abstract	Independent component analysis (ICA), as an approach to the blind source-separation (BSS) problem, has become the de-facto standard in many medical imaging settings. Despite successes and a large ongoing research effort, the limitation of ICA to square linear transformations have not been overcome, so that general INFOMAX is still far from being realized. As an alternative, we present feature analysis in medical imaging as a problem solved by Helmholtz machines, which include dimensionality reduction and reconstruction of the raw data under the same objective, and which recently have overcome major difficulties in inference and learning with deep and nonlinear configurations. We demonstrate one approach to training Helmholtz machines, variational auto-encoders (VAE), as a viable approach toward feature extraction with magnetic resonance imaging (MRI) data.
Tasks	Dimensionality Reduction
Published	2016-03-21
URL	http://arxiv.org/abs/1603.06624v1
PDF	http://arxiv.org/pdf/1603.06624v1.pdf
PWC	https://paperswithcode.com/paper/variational-autoencoders-for-feature
Repo
Framework

The Dark Side of Ethical Robots


Title	The Dark Side of Ethical Robots
Authors	Dieter Vanderelst, Alan Winfield
Abstract	Concerns over the risks associated with advances in Artificial Intelligence have prompted calls for greater efforts toward robust and beneficial AI, including machine ethics. Recently, roboticists have responded by initiating the development of so-called ethical robots. These robots would, ideally, evaluate the consequences of their actions and morally justify their choices. This emerging field promises to develop extensively over the next years. However, in this paper, we point out an inherent limitation of the emerging field of ethical robots. We show that building ethical robots also necessarily facilitates the construction of unethical robots. In three experiments, we show that it is remarkably easy to modify an ethical robot so that it behaves competitively, or even aggressively. The reason for this is that the specific AI, required to make an ethical robot, can always be exploited to make unethical robots. Hence, the development of ethical robots will not guarantee the responsible deployment of AI. While advocating for ethical robots, we conclude that preventing the misuse of robots is beyond the scope of engineering, and requires instead governance frameworks underpinned by legislation. Without this, the development of ethical robots will serve to increase the risks of robotic malpractice instead of diminishing it.
Tasks
Published	2016-06-08
URL	http://arxiv.org/abs/1606.02583v1
PDF	http://arxiv.org/pdf/1606.02583v1.pdf
PWC	https://paperswithcode.com/paper/the-dark-side-of-ethical-robots
Repo
Framework

Survey of Expressivity in Deep Neural Networks


Title	Survey of Expressivity in Deep Neural Networks
Authors	Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein
Abstract	We survey results on neural network expressivity described in “On the Expressive Power of Deep Neural Networks”. The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed. These results translate to consequences for networks during and after training. They suggest that parameters earlier in a network have greater influence on its expressive power – in particular, given a layer, its influence on expressivity is determined by the remaining depth of the network after that layer. This is verified with experiments on MNIST and CIFAR-10. We also explore the effect of training on the input-output map, and find that it trades off between the stability and expressivity.
Tasks
Published	2016-11-24
URL	http://arxiv.org/abs/1611.08083v1
PDF	http://arxiv.org/pdf/1611.08083v1.pdf
PWC	https://paperswithcode.com/paper/survey-of-expressivity-in-deep-neural
Repo
Framework

A latent-observed dissimilarity measure


Title	A latent-observed dissimilarity measure
Authors	Yasushi Terazono
Abstract	Quantitatively assessing relationships between latent variables and observed variables is important for understanding and developing generative models and representation learning. In this paper, we propose latent-observed dissimilarity (LOD) to evaluate the dissimilarity between the probabilistic characteristics of latent and observed variables. We also define four essential types of generative models with different independence/conditional independence configurations. Experiments using tractable real-world data show that LOD can effectively capture the differences between models and reflect the capability for higher layer learning. They also show that the conditional independence of latent variables given observed variables contributes to improving the transmission of information and characteristics from lower layers to higher layers.
Tasks	Representation Learning
Published	2016-03-30
URL	http://arxiv.org/abs/1603.09254v1
PDF	http://arxiv.org/pdf/1603.09254v1.pdf
PWC	https://paperswithcode.com/paper/a-latent-observed-dissimilarity-measure
Repo
Framework