Paper Group ANR 123
Fast low-level pattern matching algorithm. VHT: Vertical Hoeffding Tree. Store Location Selection via Mining Search Query Logs of Baidu Maps. Vote-boosting ensembles. An Evaluation of Models for Runtime Approximation in Link Discovery. Juxtaposition of System Dynamics and Agent-based Simulation for a Case Study in Immunosenescence. HordeQBF: A Modu …
Fast low-level pattern matching algorithm
Title | Fast low-level pattern matching algorithm |
Authors | Janja Paliska Soldo, Ana Sovic Krzic, and Damir Sersic |
Abstract | This paper focuses on pattern matching in the DNA sequence. It was inspired by a previously reported method that proposes encoding both pattern and sequence using prime numbers. Although fast, the method is limited to rather small pattern lengths, due to computing precision problem. Our approach successfully deals with large patterns, due to our implementation that uses modular arithmetic. In order to get the results very fast, the code was adapted for multithreading and parallel implementations. The method is reduced to assembly language level instructions, thus the final result shows significant time and memory savings compared to the reference algorithm. |
Tasks | |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.06115v1 |
http://arxiv.org/pdf/1611.06115v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-low-level-pattern-matching-algorithm |
Repo | |
Framework | |
VHT: Vertical Hoeffding Tree
Title | VHT: Vertical Hoeffding Tree |
Authors | Nicolas Kourtellis, Gianmarco De Francisci Morales, Albert Bifet, Arinto Murdopo |
Abstract | IoT Big Data requires new machine learning methods able to scale to large size of data arriving at high speed. Decision trees are popular machine learning models since they are very effective, yet easy to interpret and visualize. In the literature, we can find distributed algorithms for learning decision trees, and also streaming algorithms, but not algorithms that combine both features. In this paper we present the Vertical Hoeffding Tree (VHT), the first distributed streaming algorithm for learning decision trees. It features a novel way of distributing decision trees via vertical parallelism. The algorithm is implemented on top of Apache SAMOA, a platform for mining distributed data streams, and thus able to run on real-world clusters. We run several experiments to study the accuracy and throughput performance of our new VHT algorithm, as well as its ability to scale while keeping its superior performance with respect to non-distributed decision trees. |
Tasks | |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08325v1 |
http://arxiv.org/pdf/1607.08325v1.pdf | |
PWC | https://paperswithcode.com/paper/vht-vertical-hoeffding-tree |
Repo | |
Framework | |
Store Location Selection via Mining Search Query Logs of Baidu Maps
Title | Store Location Selection via Mining Search Query Logs of Baidu Maps |
Authors | Mengwen Xu, Tianyi Wang, Zhengwei Wu, Jingbo Zhou, Jian Li, Haishan Wu |
Abstract | Choosing a good location when opening a new store is crucial for the future success of a business. Traditional methods include offline manual survey, which is very time consuming, and analytic models based on census data, which are un- able to adapt to the dynamic market. The rapid increase of the availability of big data from various types of mobile devices, such as online query data and offline positioning data, provides us with the possibility to develop automatic and accurate data-driven prediction models for business store placement. In this paper, we propose a Demand Distribution Driven Store Placement (D3SP) framework for business store placement by mining search query data from Baidu Maps. D3SP first detects the spatial-temporal distributions of customer demands on different business services via query data from Baidu Maps, the largest online map search engine in China, and detects the gaps between demand and sup- ply. Then we determine candidate locations via clustering such gaps. In the final stage, we solve the location optimization problem by predicting and ranking the number of customers. We not only deploy supervised regression models to predict the number of customers, but also learn to rank models to directly rank the locations. We evaluate our framework on various types of businesses in real-world cases, and the experiments results demonstrate the effectiveness of our methods. D3SP as the core function for store placement has already been implemented as a core component of our business analytics platform and could be potentially used by chain store merchants on Baidu Nuomi. |
Tasks | |
Published | 2016-06-12 |
URL | http://arxiv.org/abs/1606.03662v1 |
http://arxiv.org/pdf/1606.03662v1.pdf | |
PWC | https://paperswithcode.com/paper/store-location-selection-via-mining-search |
Repo | |
Framework | |
Vote-boosting ensembles
Title | Vote-boosting ensembles |
Authors | Maryam Sabzevari, Gonzalo Martínez-Muñoz, Alberto Suárez |
Abstract | Vote-boosting is a sequential ensemble learning method in which the individual classifiers are built on different weighted versions of the training data. To build a new classifier, the weight of each training instance is determined in terms of the degree of disagreement among the current ensemble predictions for that instance. For low class-label noise levels, especially when simple base learners are used, emphasis should be made on instances for which the disagreement rate is high. When more flexible classifiers are used and as the noise level increases, the emphasis on these uncertain instances should be reduced. In fact, at sufficiently high levels of class-label noise, the focus should be on instances on which the ensemble classifiers agree. The optimal type of emphasis can be automatically determined using cross-validation. An extensive empirical analysis using the beta distribution as emphasis function illustrates that vote-boosting is an effective method to generate ensembles that are both accurate and robust. |
Tasks | |
Published | 2016-06-30 |
URL | http://arxiv.org/abs/1606.09458v2 |
http://arxiv.org/pdf/1606.09458v2.pdf | |
PWC | https://paperswithcode.com/paper/vote-boosting-ensembles |
Repo | |
Framework | |
An Evaluation of Models for Runtime Approximation in Link Discovery
Title | An Evaluation of Models for Runtime Approximation in Link Discovery |
Authors | Kleanthi Georgala, Micheal Hoffmann, Axel-Cyrille Ngonga Ngomo |
Abstract | Time-efficient link discovery is of central importance to implement the vision of the Semantic Web. Some of the most rapid Link Discovery approaches rely internally on planning to execute link specifications. In newer works, linear models have been used to estimate the runtime the fastest planners. However, no other category of models has been studied for this purpose so far. In this paper, we study non-linear runtime estimation functions for runtime estimation. In particular, we study exponential and mixed models for the estimation of the runtimes of planners. To this end, we evaluate three different models for runtime on six datasets using 400 link specifications. We show that exponential and mixed models achieve better fits when trained but are only to be preferred in some cases. Our evaluation also shows that the use of better runtime approximation models has a positive impact on the overall execution of link specifications. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00240v1 |
http://arxiv.org/pdf/1612.00240v1.pdf | |
PWC | https://paperswithcode.com/paper/an-evaluation-of-models-for-runtime |
Repo | |
Framework | |
Juxtaposition of System Dynamics and Agent-based Simulation for a Case Study in Immunosenescence
Title | Juxtaposition of System Dynamics and Agent-based Simulation for a Case Study in Immunosenescence |
Authors | Grazziela P. Figueredo, Peer-Olaf Siebers, Uwe Aickelin, Amanda Whitbrook, Jonathan M. Garibaldi |
Abstract | Advances in healthcare and in the quality of life significantly increase human life expectancy. With the ageing of populations, new un-faced challenges are brought to science. The human body is naturally selected to be well-functioning until the age of reproduction to keep the species alive. However, as the lifespan extends, unseen problems due to the body deterioration emerge. There are several age-related diseases with no appropriate treatment; therefore, the complex ageing phenomena needs further understanding. Immunosenescence, the ageing of the immune system, is highly correlated to the negative effects of ageing, such as the increase of auto-inflammatory diseases and decrease in responsiveness to new diseases. Besides clinical and mathematical tools, we believe there is opportunity to further exploit simulation tools to understand immunosenescence. Compared to real-world experimentation, benefits include time and cost effectiveness due to the laborious, resource-intensiveness of the biological environment and the possibility of conducting experiments without ethic restrictions. Contrasted with mathematical models, simulation modelling is more suitable for representing complex systems and emergence. In addition, there is the belief that simulation models are easier to communicate in interdisciplinary contexts. Our work investigates the usefulness of simulations to understand immunosenescence by employing two different simulation methods, agent-based and system dynamics simulation, to a case study of immune cells depletion with age. |
Tasks | |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.05888v1 |
http://arxiv.org/pdf/1607.05888v1.pdf | |
PWC | https://paperswithcode.com/paper/juxtaposition-of-system-dynamics-and-agent |
Repo | |
Framework | |
HordeQBF: A Modular and Massively Parallel QBF Solver
Title | HordeQBF: A Modular and Massively Parallel QBF Solver |
Authors | Tomas Balyo, Florian Lonsing |
Abstract | The recently developed massively parallel satisfiability (SAT) solver HordeSAT was designed in a modular way to allow the integration of any sequential CDCL-based SAT solver in its core. We integrated the QCDCL-based quantified Boolean formula (QBF) solver DepQBF in HordeSAT to obtain a massively parallel QBF solver—HordeQBF. In this paper we describe the details of this integration and report on results of the experimental evaluation of HordeQBF’s performance. HordeQBF achieves superlinear average and median speedup on the hard application instances of the 2014 QBF Gallery. |
Tasks | |
Published | 2016-04-13 |
URL | http://arxiv.org/abs/1604.03793v1 |
http://arxiv.org/pdf/1604.03793v1.pdf | |
PWC | https://paperswithcode.com/paper/hordeqbf-a-modular-and-massively-parallel-qbf |
Repo | |
Framework | |
Efficient Algorithms for Adversarial Contextual Learning
Title | Efficient Algorithms for Adversarial Contextual Learning |
Authors | Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire |
Abstract | We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. In this problem, the learner repeatedly makes an action on the basis of a context and receives reward for the chosen action, with the goal of achieving reward competitive with a large class of policies. We analyze two settings: i) in the transductive setting the learner knows the set of contexts a priori, ii) in the small separator setting, there exists a small set of contexts such that any two policies behave differently in one of the contexts in the set. Our algorithms fall into the follow the perturbed leader family \cite{Kalai2005} and achieve regret $O(T^{3/4}\sqrt{K\log(N)})$ in the transductive setting and $O(T^{2/3} d^{3/4} K\sqrt{\log(N)})$ in the separator setting, where $K$ is the number of actions, $N$ is the number of baseline policies, and $d$ is the size of the separator. We actually solve the more general adversarial contextual semi-bandit linear optimization problem, whilst in the full information setting we address the even more general contextual combinatorial optimization. We provide several extensions and implications of our algorithms, such as switching regret and efficient learning with predictable sequences. |
Tasks | Combinatorial Optimization |
Published | 2016-02-08 |
URL | http://arxiv.org/abs/1602.02454v1 |
http://arxiv.org/pdf/1602.02454v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-algorithms-for-adversarial |
Repo | |
Framework | |
Sense Embedding Learning for Word Sense Induction
Title | Sense Embedding Learning for Word Sense Induction |
Authors | Linfeng Song, Zhiguo Wang, Haitao Mi, Daniel Gildea |
Abstract | Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or cooccurrence features, and train a model for each polysemous word individually. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embedding) for each polysemous word. In the testing stage, our method represents each instance as a contextual vector, and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are (1) distributed sense vectors are taken as the knowledge representations which are trained discriminatively, and usually have better performance than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids under the mutlitask learning framework. Evaluated on SemEval-2010 WSI dataset, our method outperforms all participants and most of the recent state-of-the-art methods. We further verify the two advantages by comparing with carefully designed baselines. |
Tasks | Word Sense Induction |
Published | 2016-06-17 |
URL | http://arxiv.org/abs/1606.05409v2 |
http://arxiv.org/pdf/1606.05409v2.pdf | |
PWC | https://paperswithcode.com/paper/sense-embedding-learning-for-word-sense |
Repo | |
Framework | |
An Application of the Generalized Rectangular Fuzzy Model to Critical Thinking Assessment
Title | An Application of the Generalized Rectangular Fuzzy Model to Critical Thinking Assessment |
Authors | Igor Ya. Subbotin, Michael Gr. Voskoglou |
Abstract | The authors apply the Generalized Rectangular Model to assessing critical thinking skills and its relations with their language competency. |
Tasks | |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.03065v1 |
http://arxiv.org/pdf/1601.03065v1.pdf | |
PWC | https://paperswithcode.com/paper/an-application-of-the-generalized-rectangular |
Repo | |
Framework | |
Evolutionary Demographic Algorithms
Title | Evolutionary Demographic Algorithms |
Authors | Marco AR Erra, Pedro MM Mitra, Agostinho C Rosa |
Abstract | Most of the problems in genetic algorithms are very complex and demand a large amount of resources that current technology can not offer. Our purpose was to develop a Java-JINI distributed library that implements Genetic Algorithms with sub-populations (coarse grain) and a graphical interface in order to configure and follow the evolution of the search. The sub-populations are simulated/evaluated in personal computers connected trough a network, keeping in mind different models of sub-populations, migration policies and network topologies. We show that this model delays the convergence of the population keeping a higher level of genetic diversity and allows a much greater number of evaluations since they are distributed among several computers compared with the traditional Genetic Algorithms. |
Tasks | |
Published | 2016-05-22 |
URL | http://arxiv.org/abs/1605.06714v1 |
http://arxiv.org/pdf/1605.06714v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-demographic-algorithms |
Repo | |
Framework | |
Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data
Title | Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data |
Authors | R. Devon Hjelm, Sergey M. Plis, Vince C. Calhoun |
Abstract | Independent component analysis (ICA), as an approach to the blind source-separation (BSS) problem, has become the de-facto standard in many medical imaging settings. Despite successes and a large ongoing research effort, the limitation of ICA to square linear transformations have not been overcome, so that general INFOMAX is still far from being realized. As an alternative, we present feature analysis in medical imaging as a problem solved by Helmholtz machines, which include dimensionality reduction and reconstruction of the raw data under the same objective, and which recently have overcome major difficulties in inference and learning with deep and nonlinear configurations. We demonstrate one approach to training Helmholtz machines, variational auto-encoders (VAE), as a viable approach toward feature extraction with magnetic resonance imaging (MRI) data. |
Tasks | Dimensionality Reduction |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06624v1 |
http://arxiv.org/pdf/1603.06624v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-autoencoders-for-feature |
Repo | |
Framework | |
The Dark Side of Ethical Robots
Title | The Dark Side of Ethical Robots |
Authors | Dieter Vanderelst, Alan Winfield |
Abstract | Concerns over the risks associated with advances in Artificial Intelligence have prompted calls for greater efforts toward robust and beneficial AI, including machine ethics. Recently, roboticists have responded by initiating the development of so-called ethical robots. These robots would, ideally, evaluate the consequences of their actions and morally justify their choices. This emerging field promises to develop extensively over the next years. However, in this paper, we point out an inherent limitation of the emerging field of ethical robots. We show that building ethical robots also necessarily facilitates the construction of unethical robots. In three experiments, we show that it is remarkably easy to modify an ethical robot so that it behaves competitively, or even aggressively. The reason for this is that the specific AI, required to make an ethical robot, can always be exploited to make unethical robots. Hence, the development of ethical robots will not guarantee the responsible deployment of AI. While advocating for ethical robots, we conclude that preventing the misuse of robots is beyond the scope of engineering, and requires instead governance frameworks underpinned by legislation. Without this, the development of ethical robots will serve to increase the risks of robotic malpractice instead of diminishing it. |
Tasks | |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02583v1 |
http://arxiv.org/pdf/1606.02583v1.pdf | |
PWC | https://paperswithcode.com/paper/the-dark-side-of-ethical-robots |
Repo | |
Framework | |
Survey of Expressivity in Deep Neural Networks
Title | Survey of Expressivity in Deep Neural Networks |
Authors | Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein |
Abstract | We survey results on neural network expressivity described in “On the Expressive Power of Deep Neural Networks”. The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed. These results translate to consequences for networks during and after training. They suggest that parameters earlier in a network have greater influence on its expressive power – in particular, given a layer, its influence on expressivity is determined by the remaining depth of the network after that layer. This is verified with experiments on MNIST and CIFAR-10. We also explore the effect of training on the input-output map, and find that it trades off between the stability and expressivity. |
Tasks | |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08083v1 |
http://arxiv.org/pdf/1611.08083v1.pdf | |
PWC | https://paperswithcode.com/paper/survey-of-expressivity-in-deep-neural |
Repo | |
Framework | |
A latent-observed dissimilarity measure
Title | A latent-observed dissimilarity measure |
Authors | Yasushi Terazono |
Abstract | Quantitatively assessing relationships between latent variables and observed variables is important for understanding and developing generative models and representation learning. In this paper, we propose latent-observed dissimilarity (LOD) to evaluate the dissimilarity between the probabilistic characteristics of latent and observed variables. We also define four essential types of generative models with different independence/conditional independence configurations. Experiments using tractable real-world data show that LOD can effectively capture the differences between models and reflect the capability for higher layer learning. They also show that the conditional independence of latent variables given observed variables contributes to improving the transmission of information and characteristics from lower layers to higher layers. |
Tasks | Representation Learning |
Published | 2016-03-30 |
URL | http://arxiv.org/abs/1603.09254v1 |
http://arxiv.org/pdf/1603.09254v1.pdf | |
PWC | https://paperswithcode.com/paper/a-latent-observed-dissimilarity-measure |
Repo | |
Framework | |