May 6, 2019

Paper Group ANR 167

An Efficient Method of Partitioning High Volumes of Multidimensional Data for Parallel Clustering Algorithms

Title An Efficient Method of Partitioning High Volumes of Multidimensional Data for Parallel Clustering Algorithms
Authors Saraswati Mishra, Avnish Chandra Suman
Abstract An optimal data partitioning in parallel & distributed implementations of clustering algorithms is a necessary computation as it ensures independent task completion, fair distribution, fewer affected points and better & faster merging. Though partitioning using a Kd-tree is conventionally used in academia, it suffers from performance degradation and bias (non-equal distribution) as the dimensionality of data increases, and hence is not suitable for practical use in industry, where dimensionality can be of the order of 100s to 1000s. To address these issues we propose two new partitioning techniques using existing mathematical models & study their feasibility, performance (bias and partitioning speed) & possible variants in choosing initial seeds. The first method uses an n-dimensional hashed-grid-based approach, which maps the points in space to a set of cubes that hash the points. The second method uses a tree of Voronoi planes where each plane corresponds to a partition. We found that the grid-based approach was computationally impractical, while the tree of Voronoi planes (using scalable K-Means++ initial seeds) drastically outperformed the Kd-tree method as dimensionality increased.
Tasks
Published 2016-09-20
URL http://arxiv.org/abs/1609.06221v1
PDF http://arxiv.org/pdf/1609.06221v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-method-of-partitioning-high
Repo
Framework
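The grid-based idea in the abstract above can be sketched briefly: map each n-dimensional point to an axis-aligned cell id by flooring its coordinates, and hash points into per-cell buckets. This is a minimal illustration under assumed names and parameters, not the authors' implementation:

```python
from collections import defaultdict

def grid_partition(points, cell_width):
    """Hash n-dimensional points into axis-aligned grid cells.

    Each point maps to the integer tuple of floored coordinates,
    which identifies the hypercube ("cell") containing it.
    (Illustrative sketch; function name and parameter are assumptions.)
    """
    buckets = defaultdict(list)
    for p in points:
        cell = tuple(int(c // cell_width) for c in p)
        buckets[cell].append(p)
    return dict(buckets)

parts = grid_partition([(0.2, 0.9), (0.3, 0.7), (1.5, 0.1)], cell_width=1.0)
# points sharing a cell land in the same partition
```

As the abstract notes, the number of cells grows exponentially with dimensionality, which is why the authors found this approach computationally impractical at high dimensions.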

Automatically Proving Mathematical Theorems with Evolutionary Algorithms and Proof Assistants

Title Automatically Proving Mathematical Theorems with Evolutionary Algorithms and Proof Assistants
Authors Li-An Yang, Jui-Pin Liu, Chao-Hong Chen, Ying-ping Chen
Abstract Mathematical theorems are human knowledge that can be accumulated in the form of symbolic representation, and proving theorems has been considered intelligent behavior. Based on the BHK interpretation and the Curry-Howard isomorphism, proof assistants, software capable of interacting with humans to construct formal proofs, have been developed over the past several decades. Since proofs can be considered and expressed as programs, proof assistants simplify and verify a proof by computationally evaluating the program corresponding to the proof. Thanks to this transformation from logic to computation, it is now possible to generate or search for formal proofs directly in the realm of computation. Evolutionary algorithms, known to be flexible and versatile, have been successfully applied to a variety of scientific and engineering problems in numerous disciplines, also for several decades. The primary goal of this study is to examine the feasibility of linking evolutionary algorithms, as the program generator, with proof assistants, as the proof verifier, in order to automatically find formal proofs of a given logic sentence. In the article, we describe in detail our first, ad-hoc attempt to fully automatically prove theorems, as well as the preliminary results. Ten simple theorems from various branches of mathematics were proven, and most of these theorems cannot be proven by using the tactic auto alone in Coq, the adopted proof assistant. The implications and potential influence of this study are discussed, and the developed source code with the obtained experimental results is released as open source.
Tasks
Published 2016-02-24
URL http://arxiv.org/abs/1602.07455v2
PDF http://arxiv.org/pdf/1602.07455v2.pdf
PWC https://paperswithcode.com/paper/automatically-proving-mathematical-theorems
Repo
Framework
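The generate-and-verify loop the abstract describes can be sketched with the proof assistant stubbed out. Below, a hidden target tactic script stands in for Coq accepting subgoals; the tactic names, fitness function, and all parameters are illustrative assumptions, not the authors' code:

```python
import random

random.seed(0)

TACTICS = ["intro", "apply", "auto", "split", "reflexivity"]

def verifier_score(candidate, target):
    """Stub for the proof assistant: counts positions where the
    candidate tactic matches a hidden target script. In the paper's
    setting this role is played by Coq checking the proof."""
    return sum(a == b for a, b in zip(candidate, target))

def evolve(target, pop_size=30, generations=200):
    """Toy evolutionary search over tactic sequences: keep the
    fitter half, mutate one position per child, repeat."""
    pop = [[random.choice(TACTICS) for _ in target] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: -verifier_score(c, target))
        if verifier_score(pop[0], target) == len(target):
            return pop[0]  # "verifier" accepts the full script
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            child[random.randrange(len(child))] = random.choice(TACTICS)
            children.append(child)
        pop = survivors + children
    return pop[0]

proof = evolve(["intro", "apply", "auto"])
```

The real system replaces `verifier_score` with calls into Coq, which is what makes the search sound: only scripts the assistant accepts count as proofs.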

Identification of promising research directions using machine learning aided medical literature analysis

Title Identification of promising research directions using machine learning aided medical literature analysis
Authors Victor Andrei, Ognjen Arandjelovic
Abstract The rapidly expanding corpus of medical research literature presents major challenges in the understanding of previous work, the extraction of maximum information from collected data, and the identification of promising research directions. We present a case for the use of advanced machine learning techniques as an aid in this task and introduce a novel methodology that is shown to be capable of extracting meaningful information from large longitudinal corpora and of tracking complex temporal changes within them.
Tasks
Published 2016-05-16
URL http://arxiv.org/abs/1607.04660v1
PDF http://arxiv.org/pdf/1607.04660v1.pdf
PWC https://paperswithcode.com/paper/identification-of-promising-research
Repo
Framework

Sentiment Analysis for Twitter: Going Beyond Tweet Text

Title Sentiment Analysis for Twitter: Going Beyond Tweet Text
Authors Lahari Poddar, Kishaloy Halder, Xianyan Jia
Abstract Analysing the sentiment of tweets is important as it helps to determine users’ opinions. Knowing people’s opinions is crucial for several purposes, from gathering knowledge about a customer base to e-governance, campaigning, and many more. In this report, we aim to develop a system to detect the sentiment of tweets. We employ several linguistic features along with other external sources of information to detect the sentiment of a tweet. We show that augmenting the 140-character tweet with information harvested from external URLs shared in the tweet, as well as social media features, enhances sentiment prediction accuracy significantly.
Tasks Sentiment Analysis
Published 2016-11-29
URL http://arxiv.org/abs/1611.09441v1
PDF http://arxiv.org/pdf/1611.09441v1.pdf
PWC https://paperswithcode.com/paper/sentiment-analysis-for-twitter-going-beyond
Repo
Framework
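A minimal sketch of the kind of feature extraction the abstract describes, combining tweet text with an external URL's content and social metadata. The lexicons, feature names, and signature below are placeholders, not the paper's actual feature set:

```python
import re

# tiny illustrative lexicons; a real system would use full sentiment lexicons
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "sad", "hate", "awful"}

def _count(words, lexicon):
    return sum(w in lexicon for w in words)

def tweet_features(text, url_title="", retweets=0, followers=0):
    """Assemble a feature dict from the tweet text plus external
    sources: the page title of a shared URL and social metadata."""
    tokens = re.findall(r"[a-z']+", text.lower())
    url_tokens = re.findall(r"[a-z']+", url_title.lower())
    return {
        "pos_words": _count(tokens, POSITIVE),
        "neg_words": _count(tokens, NEGATIVE),
        "has_exclaim": int("!" in text),
        "url_pos_words": _count(url_tokens, POSITIVE),
        "url_neg_words": _count(url_tokens, NEGATIVE),
        "retweets": retweets,
        "followers": followers,
    }

feats = tweet_features("Love this, so good!", url_title="Great product review")
```

The point of the paper is exactly the last few features: signals harvested from outside the 140-character text improve the classifier.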

Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation

Title Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation
Authors Artem Sokolov, Stefan Riezler, Tanguy Urvoy
Abstract We present an approach to structured prediction from bandit feedback, called Bandit Structured Prediction, where only the value of a task loss function at a single predicted point, instead of a correct structure, is observed in learning. We present an application to discriminative reranking in Statistical Machine Translation (SMT) where the learning algorithm only has access to a 1-BLEU loss evaluation of a predicted translation instead of obtaining a gold standard reference translation. In our experiment bandit feedback is obtained by evaluating BLEU on reference translations without revealing them to the algorithm. This can be thought of as a simulation of interactive machine translation where an SMT system is personalized by a user who provides single point feedback to predicted translations. Our experiments show that our approach improves translation quality and is comparable to approaches that employ more informative feedback in learning.
Tasks Machine Translation, Structured Prediction
Published 2016-01-18
URL http://arxiv.org/abs/1601.04468v1
PDF http://arxiv.org/pdf/1601.04468v1.pdf
PWC https://paperswithcode.com/paper/bandit-structured-prediction-for-learning
Repo
Framework
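The bandit update the abstract describes, where only the loss of a single sampled structure is observed, can be sketched as stochastic minimization of expected loss over a small candidate set. This is an illustrative reconstruction under assumed names, not the paper's exact algorithm or SMT setup:

```python
import math
import random

random.seed(1)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def bandit_step(w, feats, losses, lr=0.1):
    """One bandit update: sample a structure from the current Gibbs
    distribution over candidates, observe only its loss, and take a
    stochastic-gradient step on expected loss."""
    probs = softmax([sum(wi * fi for wi, fi in zip(w, f)) for f in feats])
    i = random.choices(range(len(feats)), weights=probs)[0]
    mean_feat = [sum(p * f[d] for p, f in zip(probs, feats))
                 for d in range(len(w))]
    return [wi - lr * losses[i] * (feats[i][d] - mean_feat[d])
            for d, wi in enumerate(w)]

# two candidate "translations": feature vectors and their hidden 1-BLEU losses
feats = [[1.0, 0.0], [0.0, 1.0]]
losses = [0.9, 0.1]   # the second candidate is much better
w = [0.0, 0.0]
for _ in range(500):
    w = bandit_step(w, feats, losses)
```

After training, the weight on the low-loss candidate's feature dominates, so the model learns to prefer it without ever seeing a reference translation.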

The Structured Weighted Violations Perceptron Algorithm

Title The Structured Weighted Violations Perceptron Algorithm
Authors Rotem Dror, Roi Reichart
Abstract We present the Structured Weighted Violations Perceptron (SWVP) algorithm, a new structured prediction algorithm that generalizes the Collins Structured Perceptron (CSP). Unlike CSP, the update rule of SWVP explicitly exploits the internal structure of the predicted labels. We prove the convergence of SWVP for linearly separable training sets, provide mistake and generalization bounds, and show that in the general case these bounds are tighter than those of the CSP special case. In synthetic data experiments with data drawn from an HMM, various variants of SWVP substantially outperform its CSP special case. SWVP also provides encouraging initial dependency parsing results.
Tasks Dependency Parsing, Structured Prediction
Published 2016-02-09
URL http://arxiv.org/abs/1602.03040v3
PDF http://arxiv.org/pdf/1602.03040v3.pdf
PWC https://paperswithcode.com/paper/the-structured-weighted-violations-perceptron
Repo
Framework
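For context, the Collins structured perceptron (the CSP special case that SWVP generalizes) can be sketched on a toy tagging problem. The feature map, data, and tie-breaking below are illustrative, not from the paper:

```python
from itertools import product

def phi(x, y):
    """Joint feature map: counts of (token, label) pairs."""
    feats = {}
    for tok, lab in zip(x, y):
        feats[(tok, lab)] = feats.get((tok, lab), 0) + 1
    return feats

def score(w, f):
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def predict(w, x, labels):
    # exhaustive argmax over label sequences (fine at toy sizes)
    return max(product(labels, repeat=len(x)),
               key=lambda y: score(w, phi(x, y)))

def csp_train(data, labels, epochs=5):
    """CSP update: on a mistake, add gold features, subtract
    predicted features. SWVP instead weights individual violations
    inside the predicted structure."""
    w = {}
    for _ in range(epochs):
        for x, gold in data:
            pred = predict(w, x, labels)
            if pred != gold:
                for k, v in phi(x, gold).items():
                    w[k] = w.get(k, 0.0) + v
                for k, v in phi(x, pred).items():
                    w[k] = w.get(k, 0.0) - v
    return w

data = [(("the", "dog"), ("DET", "NOUN")), (("a", "cat"), ("DET", "NOUN"))]
w = csp_train(data, labels=("DET", "NOUN"))
```

The contrast with SWVP is in the update: CSP treats the predicted structure as a single violation, while SWVP exploits the internal structure of the predicted labels.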

Unifying Multi-Domain Multi-Task Learning: Tensor and Neural Network Perspectives

Title Unifying Multi-Domain Multi-Task Learning: Tensor and Neural Network Perspectives
Authors Yongxin Yang, Timothy M. Hospedales
Abstract Multi-domain learning aims to benefit from simultaneously learning across several different but related domains. In this chapter, we propose a single framework that unifies multi-domain learning (MDL) and the related but better studied area of multi-task learning (MTL). By exploiting the concept of a \emph{semantic descriptor} we show how our framework encompasses various classic and recent MDL/MTL algorithms as special cases with different semantic descriptor encodings. As a second contribution, we present a higher order generalisation of this framework, capable of simultaneous multi-task-multi-domain learning. This generalisation has two mathematically equivalent views in multi-linear algebra and gated neural networks respectively. Moreover, by exploiting the semantic descriptor, it provides neural networks the capability of zero-shot learning (ZSL), where a classifier is generated for an unseen class without any training data; as well as zero-shot domain adaptation (ZSDA), where a model is generated for an unseen domain without any training data. In practice, this framework provides a powerful yet easy to implement method that can be flexibly applied to MTL, MDL, ZSL and ZSDA.
Tasks Domain Adaptation, Multi-Task Learning, Zero-Shot Learning
Published 2016-11-28
URL http://arxiv.org/abs/1611.09345v1
PDF http://arxiv.org/pdf/1611.09345v1.pdf
PWC https://paperswithcode.com/paper/unifying-multi-domain-multi-task-learning
Repo
Framework
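The semantic-descriptor idea can be illustrated with the simplest bilinear form: a shared weight matrix contracted with a per-task/domain descriptor vector. The numbers and names below are illustrative, not the chapter's actual parameterization:

```python
def bilinear_score(x, W, d):
    """Score of input x under task/domain descriptor d: x^T W d.
    A one-hot d recovers an independent per-task weight vector
    (a column of W); descriptors with shared components share
    parameters across tasks/domains, which is what enables
    zero-shot generation of a model for an unseen descriptor."""
    return sum(x[i] * W[i][j] * d[j]
               for i in range(len(x)) for j in range(len(d)))

# two tasks encoded as one-hot descriptors over a shared W (toy numbers)
W = [[1.0, 0.5],
     [0.0, 2.0]]
x = [1.0, 1.0]
task_a, task_b = [1, 0], [0, 1]
```

The chapter's higher-order generalisation replaces the matrix with a tensor so that a task descriptor and a domain descriptor can be contracted simultaneously.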

DeLight-Net: Decomposing Reflectance Maps into Specular Materials and Natural Illumination

Title DeLight-Net: Decomposing Reflectance Maps into Specular Materials and Natural Illumination
Authors Stamatios Georgoulis, Konstantinos Rematas, Tobias Ritschel, Mario Fritz, Luc Van Gool, Tinne Tuytelaars
Abstract In this paper we extract surface reflectance and natural environmental illumination from a reflectance map, i.e. from a single 2D image of a sphere of one material under one illumination. This is a notoriously difficult problem, yet key to various re-rendering applications. With the recent advances in estimating reflectance maps from 2D images, their further decomposition has become increasingly relevant. To this end, we propose a Convolutional Neural Network (CNN) architecture to reconstruct both material parameters (i.e. Phong) and illumination (i.e. high-resolution spherical illumination maps), trained solely on synthetic data. We demonstrate the decomposition of synthetic as well as real photographs of reflectance maps, both in High Dynamic Range (HDR) and, for the first time, in Low Dynamic Range (LDR). Results are compared to previous approaches quantitatively as well as qualitatively in terms of re-renderings where illumination, material, view or shape are changed.
Tasks
Published 2016-03-27
URL http://arxiv.org/abs/1603.08240v1
PDF http://arxiv.org/pdf/1603.08240v1.pdf
PWC https://paperswithcode.com/paper/delight-net-decomposing-reflectance-maps-into
Repo
Framework

A New Approach to Dimensionality Reduction for Anomaly Detection in Data Traffic

Title A New Approach to Dimensionality Reduction for Anomaly Detection in Data Traffic
Authors Tingshan Huang, Harish Sethu, Nagarajan Kandasamy
Abstract The monitoring and management of high-volume feature-rich traffic in large networks offers significant challenges in storage, transmission and computational costs. The predominant approach to reducing these costs is based on performing a linear mapping of the data to a low-dimensional subspace such that a certain large percentage of the variance in the data is preserved in the low-dimensional representation. This variance-based subspace approach to dimensionality reduction forces a fixed choice of the number of dimensions, is not responsive to real-time shifts in observed traffic patterns, and is vulnerable to normal traffic spoofing. Based on theoretical insights proved in this paper, we propose a new distance-based approach to dimensionality reduction motivated by the fact that the real-time structural differences between the covariance matrices of the observed and the normal traffic are more relevant to anomaly detection than the structure of the training data alone. Our approach, called the distance-based subspace method, allows a different number of reduced dimensions in different time windows and arrives at only the number of dimensions necessary for effective anomaly detection. We present centralized and distributed versions of our algorithm and, using simulation on real traffic traces, demonstrate the qualitative and quantitative advantages of the distance-based subspace approach.
Tasks Anomaly Detection, Dimensionality Reduction
Published 2016-06-14
URL http://arxiv.org/abs/1606.04552v1
PDF http://arxiv.org/pdf/1606.04552v1.pdf
PWC https://paperswithcode.com/paper/a-new-approach-to-dimensionality-reduction
Repo
Framework
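To contrast the two strategies in the abstract, here is an illustrative sketch (not the paper's exact criterion): the variance-based rule fixes the number of dimensions once from training data, while a distance-based rule adapts per time window to where the observed spectrum departs from normal traffic:

```python
def variance_dims(eigvals, frac=0.9):
    """Classical variance-based choice: smallest k whose leading
    eigenvalues capture `frac` of the total variance."""
    total = sum(eigvals)
    acc = 0.0
    for k, v in enumerate(sorted(eigvals, reverse=True), start=1):
        acc += v
        if acc >= frac * total:
            return k
    return len(eigvals)

def distance_dims(normal_eigs, observed_eigs, tol=0.5):
    """Illustrative distance-based choice: keep only the dimensions
    where the observed window's spectrum departs from the
    normal-traffic spectrum by more than `tol`, so the count
    adapts per window. The threshold form is an assumption."""
    return [i for i, (a, b) in enumerate(zip(normal_eigs, observed_eigs))
            if abs(a - b) > tol]

normal = [5.0, 3.0, 1.0, 0.5]   # eigenvalues of normal-traffic covariance
window = [5.1, 4.6, 1.0, 0.4]   # eigenvalues in the current window
k_fixed = variance_dims(normal)          # fixed choice from training data
changed = distance_dims(normal, window)  # adapts to the current window
```

Here the variance rule would always retain three dimensions, while the distance rule flags only the single dimension where the window actually differs from normal.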

Kernel classification of connectomes based on earth mover’s distance between graph spectra

Title Kernel classification of connectomes based on earth mover’s distance between graph spectra
Authors Yulia Dodonova, Mikhail Belyaev, Anna Tkachev, Dmitry Petrov, Leonid Zhukov
Abstract In this paper, we tackle the problem of predicting phenotypes from structural connectomes. We propose that normalized Laplacian spectra can capture structural properties of brain networks, and hence graph spectral distributions are useful for the task of connectome-based classification. We introduce a kernel that is based on earth mover’s distance (EMD) between spectral distributions of brain networks. We assess the performance of an SVM classifier with the proposed kernel on the task of classifying autism spectrum disorder versus typical development based on a publicly available dataset. Classification quality (area under the ROC-curve) obtained with the EMD-based kernel on spectral distributions is 0.71, which is higher than that based on simpler graph embedding methods.
Tasks Graph Embedding
Published 2016-11-27
URL http://arxiv.org/abs/1611.08812v1
PDF http://arxiv.org/pdf/1611.08812v1.pdf
PWC https://paperswithcode.com/paper/kernel-classification-of-connectomes-based-on
Repo
Framework
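For equal-length 1-D spectra, EMD has a simple closed form after sorting, which makes the kernel in the abstract easy to sketch. The exponential form and the `gamma` parameter below are assumptions for illustration, not necessarily the paper's exact kernel:

```python
import math

def spectrum_emd(spec_a, spec_b):
    """Earth mover's distance between two 1-D spectral distributions
    with equal numbers of eigenvalues: after sorting, EMD reduces to
    the mean absolute difference of matched eigenvalues."""
    a, b = sorted(spec_a), sorted(spec_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def emd_kernel(spec_a, spec_b, gamma=1.0):
    """Kernel on normalized-Laplacian spectra in the spirit of the
    paper: similarity decays with EMD (functional form assumed)."""
    return math.exp(-gamma * spectrum_emd(spec_a, spec_b))
```

Identical spectra give kernel value 1, and the value decays toward 0 as the spectral distributions of the two brain networks diverge; such a kernel can then be plugged into an SVM.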

Distributed learning with regularized least squares

Title Distributed learning with regularized least squares
Authors Shao-Bo Lin, Xin Guo, Ding-Xuan Zhou
Abstract We study distributed learning with the least squares regularization scheme in a reproducing kernel Hilbert space (RKHS). By a divide-and-conquer approach, the algorithm partitions a data set into disjoint data subsets, applies the least squares regularization scheme to each data subset to produce an output function, and then takes an average of the individual output functions as a final global estimator or predictor. We show with error bounds in expectation in both the $L^2$-metric and RKHS-metric that the global output function of this distributed learning is a good approximation to the algorithm processing the whole data in one single machine. Our error bounds are sharp and stated in a general setting without any eigenfunction assumption. The analysis is achieved by a novel second order decomposition of operator differences in our integral operator approach. Even for the classical least squares regularization scheme in the RKHS associated with a general kernel, we give the best learning rate in the literature.
Tasks
Published 2016-08-11
URL http://arxiv.org/abs/1608.03339v2
PDF http://arxiv.org/pdf/1608.03339v2.pdf
PWC https://paperswithcode.com/paper/distributed-learning-with-regularized-least
Repo
Framework
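The divide-and-conquer scheme is easy to sketch in the scalar-feature case, where regularized least squares has a one-line closed form. This toy (illustrative names; `lam` set to 0 so the check against the single-machine solution is exact) shows the averaged estimator matching the global one:

```python
def ridge_1d(xs, ys, lam):
    """Closed-form regularized least squares for one scalar feature:
    w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def distributed_ridge_1d(xs, ys, lam, m):
    """Divide-and-conquer: split the data into m disjoint subsets,
    solve ridge regression on each, and average the local estimators
    into a single global predictor."""
    size = len(xs) // m
    ws = [ridge_1d(xs[i * size:(i + 1) * size],
                   ys[i * size:(i + 1) * size], lam)
          for i in range(m)]
    return sum(ws) / m

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # y = 2x exactly
w_global = ridge_1d(xs, ys, lam=0.0)
w_avg = distributed_ridge_1d(xs, ys, lam=0.0, m=2)
```

The paper's contribution is proving, in the general RKHS setting with `lam > 0`, that the averaged estimator is close to the single-machine solution, with sharp error bounds and no eigenfunction assumption.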

Efficient Globally Optimal 2D-to-3D Deformable Shape Matching

Title Efficient Globally Optimal 2D-to-3D Deformable Shape Matching
Authors Zorah Lähner, Emanuele Rodolà, Frank R. Schmidt, Michael M. Bronstein, Daniel Cremers
Abstract We propose the first algorithm for non-rigid 2D-to-3D shape matching, where the input is a 2D shape represented as a planar curve and a 3D shape represented as a surface; the output is a continuous curve on the surface. We cast the problem as finding the shortest circular path on the product 3-manifold of the surface and the curve. We prove that the optimal matching can be computed in polynomial time with a (worst-case) complexity of $O(mn^2\log(n))$, where $m$ and $n$ denote the number of vertices on the template curve and the 3D shape respectively. We also demonstrate that in practice the runtime is essentially linear in $m \cdot n$, making it an efficient method for shape analysis and shape retrieval. Quantitative evaluation confirms that the method provides excellent results for sketch-based deformable 3D shape retrieval.
Tasks
Published 2016-01-22
URL http://arxiv.org/abs/1601.06070v2
PDF http://arxiv.org/pdf/1601.06070v2.pdf
PWC https://paperswithcode.com/paper/efficient-globally-optimal-2d-to-3d
Repo
Framework

Classifying and sorting cluttered piles of unknown objects with robots: a learning approach

Title Classifying and sorting cluttered piles of unknown objects with robots: a learning approach
Authors Janne V. Kujala, Tuomas J. Lukka, Harri Holopainen
Abstract We consider the problem of sorting a densely cluttered pile of unknown objects using a robot. This yet unsolved problem is relevant in the robotic waste sorting business. By extending previous active learning approaches to grasping, we show a system that learns the task autonomously. Instead of predicting just whether a grasp succeeds, we predict the classes of the objects that end up being picked and thrown onto the target conveyor. Segmenting and identifying objects from the uncluttered target conveyor, as opposed to the working area, is easier due to the added structure since the thrown objects will be the only ones present. Instead of trying to segment or otherwise understand the cluttered working area in any way, we simply allow the controller to learn a mapping from an RGBD image in the neighborhood of the grasp to a predicted result—all segmentation etc. in the working area is implicit in the learned function. The grasp selection operates in two stages: The first stage is hardcoded and outputs a distribution of possible grasps that sometimes succeed. The second stage uses a purely learned criterion to choose the grasp to make from the proposal distribution created by the first stage. In an experiment, the system quickly learned to make good pickups and predict correctly, in advance, which class of object it was going to pick up and was able to sort the objects from a densely cluttered pile by color.
Tasks Active Learning
Published 2016-09-05
URL http://arxiv.org/abs/1609.01044v1
PDF http://arxiv.org/pdf/1609.01044v1.pdf
PWC https://paperswithcode.com/paper/classifying-and-sorting-cluttered-piles-of
Repo
Framework

The Data Complexity of Description Logic Ontologies

Title The Data Complexity of Description Logic Ontologies
Authors Carsten Lutz, Frank Wolter
Abstract We analyze the data complexity of ontology-mediated querying where the ontologies are formulated in a description logic (DL) of the ALC family and queries are conjunctive queries, positive existential queries, or acyclic conjunctive queries. Our approach is non-uniform in the sense that we aim to understand the complexity of each single ontology instead of for all ontologies formulated in a certain language. While doing so, we quantify over the queries and are interested, for example, in the question whether all queries can be evaluated in polynomial time w.r.t. a given ontology. Our results include a PTime/coNP-dichotomy for ontologies of depth one in the description logic ALCFI, the same dichotomy for ALC- and ALCI-ontologies of unrestricted depth, and the non-existence of such a dichotomy for ALCF-ontologies. For the latter DL, we additionally show that it is undecidable whether a given ontology admits PTime query evaluation. We also consider the connection between PTime query evaluation and rewritability into (monadic) Datalog.
Tasks
Published 2016-11-08
URL http://arxiv.org/abs/1611.02453v3
PDF http://arxiv.org/pdf/1611.02453v3.pdf
PWC https://paperswithcode.com/paper/the-data-complexity-of-description-logic
Repo
Framework

Personalized Risk Scoring for Critical Care Patients using Mixtures of Gaussian Process Experts

Title Personalized Risk Scoring for Critical Care Patients using Mixtures of Gaussian Process Experts
Authors Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar
Abstract We develop a personalized real-time risk scoring algorithm that provides timely and granular assessments of the clinical acuity of ward patients based on their (temporal) lab tests and vital signs. Heterogeneity of the patient population is captured via a hierarchical latent class model. The proposed algorithm aims to discover the number of latent classes in the patient population, and train a mixture of Gaussian Process (GP) experts, where each expert models the physiological data streams associated with a specific class. Self-taught transfer learning is used to transfer the knowledge of latent classes learned from the domain of clinically stable patients to the domain of clinically deteriorating patients. For new patients, the posterior beliefs of all GP experts about the patient’s clinical status given her physiological data stream are computed, and a personalized risk score is evaluated as a weighted average of those beliefs, where the weights are learned from the patient’s hospital admission information. Experiments on a heterogeneous cohort of 6,313 patients admitted to Ronald Reagan UCLA Medical Center show that our risk score outperforms the currently deployed risk scores, such as MEWS and Rothman scores.
Tasks Transfer Learning
Published 2016-05-03
URL http://arxiv.org/abs/1605.00959v1
PDF http://arxiv.org/pdf/1605.00959v1.pdf
PWC https://paperswithcode.com/paper/personalized-risk-scoring-for-critical-care
Repo
Framework
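The final scoring step the abstract describes, a weighted average of expert beliefs, is simple to sketch. The numbers are illustrative, and the admission-based weights are assumed to be given directly rather than learned:

```python
def personalized_risk(beliefs, weights):
    """Risk score as the weighted average of each GP expert's
    posterior belief that the patient is deteriorating. In the
    paper the weights are learned from hospital admission
    information; here they are supplied directly."""
    z = sum(weights)
    return sum(w * b for w, b in zip(weights, beliefs)) / z

# three latent-class experts scoring one patient (toy numbers)
risk = personalized_risk(beliefs=[0.8, 0.2, 0.1], weights=[0.6, 0.3, 0.1])
```

A patient whose admission profile weights her toward the first expert inherits most of that expert's high belief, which is how the score personalizes across latent classes.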