October 19, 2019


Paper Group ANR 382


Agreement-based Learning


Title	Agreement-based Learning
Authors	Emmanouil Antonios Platanios
Abstract	Model selection is a problem that has occupied machine learning researchers for a long time. Recently, its importance has become evident through applications in deep learning. We propose an agreement-based learning framework that prevents many of the pitfalls associated with model selection. It relies on coupling the training of multiple models by encouraging them to agree on their predictions while training. In contrast with other model selection and combination approaches used in machine learning, the proposed framework is inspired by human learning. We also propose a learning algorithm defined within this framework which manages to significantly outperform alternatives in practice, and whose performance improves further with the availability of unlabeled data. Finally, we describe a number of potential directions for developing more flexible agreement-based learning algorithms.
Tasks	Model Selection
Published	2018-06-04
URL	http://arxiv.org/abs/1806.01258v1
PDF	http://arxiv.org/pdf/1806.01258v1.pdf
PWC	https://paperswithcode.com/paper/agreement-based-learning
Repo
Framework

Using Detailed Access Trajectories for Learning Behavior Analysis


Title	Using Detailed Access Trajectories for Learning Behavior Analysis
Authors	Yanbang Wang, Nancy Law, Erik Hemberg, Una-May O’Reilly
Abstract	Student learning activity in MOOCs can be viewed from multiple perspectives. We present a new organization of MOOC learner activity data at a resolution that lies between the fine granularity of the clickstream and coarse organizations that count activities, aggregate students, or use long-duration time units. A detailed access trajectory (DAT) consists of binary values and is two-dimensional: one axis is a time series (e.g. days) and the other is a chronologically ordered list of the instances of one MOOC component type (e.g. videos in instructional order). Most popular MOOC platforms generate data that can be organized as DATs. We explore the value of DATs by conducting four empirical mini-studies. Our studies suggest that DATs contain rich information about students’ learning behaviors and facilitate MOOC learning analyses.
Tasks	Time Series
Published	2018-12-14
URL	http://arxiv.org/abs/1812.05767v1
PDF	http://arxiv.org/pdf/1812.05767v1.pdf
PWC	https://paperswithcode.com/paper/using-detailed-access-trajectories-for
Repo
Framework
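
The DAT layout described in the abstract above can be illustrated with a small sketch. This is not the authors' code: the event format (component id, access date), the function name `build_dat`, and the choice of rows for components and columns for days are all assumptions made for illustration.

```python
from datetime import date


def build_dat(events, component_order, start_day, num_days):
    """Build a binary matrix: one row per component (in instructional
    order), one column per day. A cell is 1 if the learner accessed
    that component on that day, else 0. (Hypothetical helper.)"""
    row_of = {component: i for i, component in enumerate(component_order)}
    dat = [[0] * num_days for _ in component_order]
    for component, day in events:
        col = (day - start_day).days
        # Ignore events for unknown components or outside the time window.
        if component in row_of and 0 <= col < num_days:
            dat[row_of[component]][col] = 1
    return dat


# Toy access log for one learner (made-up data).
events = [
    ("video_1", date(2018, 1, 1)),
    ("video_2", date(2018, 1, 1)),
    ("video_1", date(2018, 1, 3)),
]
dat = build_dat(events, ["video_1", "video_2", "video_3"], date(2018, 1, 1), 3)
for row in dat:
    print(row)
```

Repeated accesses on the same day collapse to a single 1, which is what keeps this representation coarser than the raw clickstream.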

Computable Variants of AIXI which are More Powerful than AIXItl


Title	Computable Variants of AIXI which are More Powerful than AIXItl
Authors	Susumu Katayama
Abstract	This paper presents Unlimited Computable AI, or UCAI, which is a family of computable variants of AIXI. UCAI is more powerful than AIXItl, the conventional family of computable variants of AIXI, in the following ways: 1) UCAI supports models of terminating computation, including typed lambda calculus, while AIXItl only supports Turing machines with timeout t, which can be simulated by typed lambda calculus for any t; 2) unlike UCAI, AIXItl limits the program length to l.
Tasks
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08592v3
PDF	http://arxiv.org/pdf/1805.08592v3.pdf
PWC	https://paperswithcode.com/paper/computable-variants-of-aixi-which-are-more
Repo
Framework

Probabilistic Clustering Using Maximal Matrix Norm Couplings


Title	Probabilistic Clustering Using Maximal Matrix Norm Couplings
Authors	David Qiu, Anuran Makur, Lizhong Zheng
Abstract	In this paper, we present a local information theoretic approach to explicitly learn probabilistic clustering of a discrete random variable. Our formulation yields a convex maximization problem for which it is NP-hard to find the global optimum. In order to algorithmically solve this optimization problem, we propose two relaxations that are solved via gradient ascent and alternating maximization. Experiments on the MSR Sentence Completion Challenge, MovieLens 100K, and Reuters21578 datasets demonstrate that our approach is competitive with existing techniques and worthy of further investigation.
Tasks
Published	2018-10-10
URL	http://arxiv.org/abs/1810.04738v1
PDF	http://arxiv.org/pdf/1810.04738v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-clustering-using-maximal-matrix
Repo
Framework

Fast Construction of Correcting Ensembles for Legacy Artificial Intelligence Systems: Algorithms and a Case Study


Title	Fast Construction of Correcting Ensembles for Legacy Artificial Intelligence Systems: Algorithms and a Case Study
Authors	Ivan Y. Tyukin, Alexander N. Gorban, Stephen Green, Danil Prokhorov
Abstract	This paper presents a technology for simple and computationally efficient improvements of a generic Artificial Intelligence (AI) system, including Multilayer and Deep Learning neural networks. The improvements are, in essence, small network ensembles constructed on top of the existing AI architectures. Theoretical foundations of the technology are based on Stochastic Separation Theorems and the ideas of the concentration of measure. We show that, subject to mild technical assumptions on statistical properties of internal signals in the original AI system, the technology enables instantaneous and computationally efficient removal of spurious and systematic errors with probability close to one on the datasets which are exponentially large in dimension. The method is illustrated with numerical examples and a case study of ten digits recognition from American Sign Language.
Tasks
Published	2018-10-12
URL	http://arxiv.org/abs/1810.05593v2
PDF	http://arxiv.org/pdf/1810.05593v2.pdf
PWC	https://paperswithcode.com/paper/fast-construction-of-correcting-ensembles-for
Repo
Framework

On preserving non-discrimination when combining expert advice


Title	On preserving non-discrimination when combining expert advice
Authors	Avrim Blum, Suriya Gunasekar, Thodoris Lykouris, Nathan Srebro
Abstract	We study the interplay between sequential decision making and avoiding discrimination against protected groups, when examples arrive online and do not follow distributional assumptions. We consider the most basic extension of classical online learning: “Given a class of predictors that are individually non-discriminatory with respect to a particular metric, how can we combine them to perform as well as the best predictor, while preserving non-discrimination?” Surprisingly, we show that this task is unachievable for the prevalent notion of “equalized odds”, which requires equal false negative rates and equal false positive rates across groups. On the positive side, for another notion of non-discrimination, “equalized error rates”, we show that running separate instances of the classical multiplicative weights algorithm for each group achieves this guarantee. Interestingly, even for this notion, we show that algorithms with stronger performance guarantees than multiplicative weights cannot preserve non-discrimination.
Tasks	Decision Making
Published	2018-10-28
URL	http://arxiv.org/abs/1810.11829v2
PDF	http://arxiv.org/pdf/1810.11829v2.pdf
PWC	https://paperswithcode.com/paper/on-preserving-non-discrimination-when
Repo
Framework
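
The positive result in the abstract above, one multiplicative weights instance per group, can be sketched as follows. This is a minimal illustration, not the paper's algorithm as specified: the exponential form of the update, the learning rate `eta`, and the class names are assumptions, and losses are assumed to lie in [0, 1].

```python
import math


class MultiplicativeWeights:
    """Exponential-weights learner over a fixed set of experts."""

    def __init__(self, num_experts, eta=0.5):
        self.weights = [1.0] * num_experts
        self.eta = eta  # learning rate (assumed value)

    def distribution(self):
        total = sum(self.weights)
        return [w / total for w in self.weights]

    def update(self, losses):
        # Shrink each expert's weight according to its loss in [0, 1].
        self.weights = [
            w * math.exp(-self.eta * loss)
            for w, loss in zip(self.weights, losses)
        ]


class PerGroupMultiplicativeWeights:
    """Keeps an independent learner for every protected group, so an
    example from one group never changes another group's weights."""

    def __init__(self, groups, num_experts, eta=0.5):
        self.learners = {g: MultiplicativeWeights(num_experts, eta) for g in groups}

    def distribution(self, group):
        return self.learners[group].distribution()

    def update(self, group, losses):
        self.learners[group].update(losses)


# Toy run: expert 0 is always right on group "A", expert 1 on group "B".
combiner = PerGroupMultiplicativeWeights(["A", "B"], num_experts=2)
for _ in range(5):
    combiner.update("A", [0.0, 1.0])
    combiner.update("B", [1.0, 0.0])
dist_a = combiner.distribution("A")
dist_b = combiner.distribution("B")
print(dist_a, dist_b)
```

Keeping the learners separate means each group's weights track that group's own best expert, which is the structure the abstract credits with preserving equalized error rates.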

Aligning Manifolds of Double Pendulum Dynamics Under the Influence of Noise


Title	Aligning Manifolds of Double Pendulum Dynamics Under the Influence of Noise
Authors	Fayeem Aziz, Aaron S. W. Wong, James S. Welsh, Stephan K. Chalup
Abstract	This study presents the results of a series of simulation experiments that evaluate and compare four different manifold alignment methods under the influence of noise. The data was created by simulating the dynamics of two slightly different double pendulums in three-dimensional space. The method of semi-supervised feature-level manifold alignment using global distance resulted in the most convincing visualisations. However, the semi-supervised feature-level local alignment methods resulted in smaller alignment errors. These local alignment methods were also more robust to noise and faster than the other methods.
Tasks
Published	2018-09-19
URL	http://arxiv.org/abs/1809.06992v2
PDF	http://arxiv.org/pdf/1809.06992v2.pdf
PWC	https://paperswithcode.com/paper/aligning-manifolds-of-double-pendulum
Repo
Framework

Semantically-informed distance and similarity measures for paraphrase plagiarism identification


Title	Semantically-informed distance and similarity measures for paraphrase plagiarism identification
Authors	Miguel A. Álvarez-Carmona, Marc Franco-Salvador, Esaú Villatoro-Tello, Manuel Montes-y-Gómez, Paolo Rosso, Luis Villaseñor-Pineda
Abstract	Paraphrase plagiarism identification represents a very complex task given that plagiarized texts are intentionally modified through several rewording techniques. Accordingly, this paper introduces two new measures for evaluating the relatedness of two given texts: a semantically-informed similarity measure and a semantically-informed edit distance. Both measures are able to extract semantic information from either an external resource or a distributed representation of words, resulting in informative features for training a supervised classifier for detecting paraphrase plagiarism. The results indicate that the proposed metrics are consistently good at detecting different types of paraphrase plagiarism. In addition, the results are very competitive with state-of-the-art methods, with the advantage of being a much simpler but equally effective solution.
Tasks
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11611v1
PDF	http://arxiv.org/pdf/1805.11611v1.pdf
PWC	https://paperswithcode.com/paper/semantically-informed-distance-and-similarity
Repo
Framework
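
One way to picture a semantically-informed edit distance is a word-level Levenshtein distance whose substitution cost shrinks as two words become more similar. The sketch below is an assumption about the general idea, not the paper's definition: the cost `1 - similarity`, the unit insertion and deletion costs, and the toy similarity table are all hypothetical. In practice the similarity would come from an external resource or from word embeddings.

```python
def semantic_edit_distance(a, b, similarity):
    """Word-level edit distance between token lists a and b.
    Substituting word u by word v costs 1 - similarity(u, v), so
    near-synonyms are cheap to swap. (Illustrative cost model.)"""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete all of a[:i]
    for j in range(1, m + 1):
        d[0][j] = float(j)  # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 1.0 - similarity(a[i - 1], b[j - 1])
            d[i][j] = min(
                d[i - 1][j] + 1.0,           # deletion
                d[i][j - 1] + 1.0,           # insertion
                d[i - 1][j - 1] + substitution,
            )
    return d[n][m]


# Toy similarity table standing in for a real semantic resource.
TOY_SIMILARITY = {frozenset(("car", "automobile")): 0.9}


def toy_similarity(u, v):
    if u == v:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((u, v)), 0.0)


original = "the car stopped".split()
paraphrase = "the automobile stopped".split()
unrelated = "the banana stopped".split()
d_paraphrase = semantic_edit_distance(original, paraphrase, toy_similarity)
d_unrelated = semantic_edit_distance(original, unrelated, toy_similarity)
print(d_paraphrase, d_unrelated)
```

With a plain edit distance both rewrites would cost one substitution; here the synonym swap costs far less, which is the kind of signal a downstream classifier could use as a feature.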

Detecting cities in aerial night-time images by learning structural invariants using single reference augmentation


Title	Detecting cities in aerial night-time images by learning structural invariants using single reference augmentation
Authors	Philipp Sadler
Abstract	This paper examines whether it is possible to learn structural invariants of city images by using only a single reference picture when producing transformations along the variants in the dataset. Previous work explored the problem of learning from only a few examples and showed that data augmentation techniques benefit performance and generalization for machine learning approaches. First, a principal component analysis in conjunction with a Fourier transform is trained on a single reference augmentation training dataset using the city images. Second, a convolutional neural network is trained on a similar dataset with more samples. The findings are that the convolutional neural network is capable of finding images of the same category, whereas the applied principal component analysis in conjunction with a Fourier transform failed to solve this task.
Tasks	Data Augmentation
Published	2018-10-19
URL	http://arxiv.org/abs/1810.08597v1
PDF	http://arxiv.org/pdf/1810.08597v1.pdf
PWC	https://paperswithcode.com/paper/detecting-cities-in-aerial-night-time-images
Repo
Framework

Automatic segmentation of prostate zones


Title	Automatic segmentation of prostate zones
Authors	Germonda Mooij, Ines Bagulho, Henkjan Huisman
Abstract	Convolutional networks have become state-of-the-art techniques for automatic medical image analysis, with the U-net architecture being the most popular at this moment. In this article we report the application of a 3D version of U-net to the automatic segmentation of prostate peripheral and transition zones in 3D MRI images. Our results are slightly better than recent studies that used 2D U-net and handcrafted feature approaches. In addition, we test ideas for improving the 3D U-net setup, by 1) letting the network segment surrounding tissues, making use of the fixed anatomy, and 2) adjusting the network architecture to reflect the anisotropy in the dimensions of the MRI image volumes. While the latter adjustment gave a marginal improvement, the former adjustment showed a significant deterioration of the network performance. We were able to explain this deterioration by inspecting feature map activations in all layers of the network. We show that to segment more tissues the network replaces feature maps that were dedicated to detecting prostate peripheral zones, by feature maps detecting the surrounding tissues.
Tasks
Published	2018-06-19
URL	http://arxiv.org/abs/1806.07146v1
PDF	http://arxiv.org/pdf/1806.07146v1.pdf
PWC	https://paperswithcode.com/paper/automatic-segmentation-of-prostate-zones
Repo
Framework

The eigenvalues of stochastic blockmodel graphs


Title	The eigenvalues of stochastic blockmodel graphs
Authors	Minh Tang
Abstract	We derive the limiting distribution for the largest eigenvalues of the adjacency matrix for a stochastic blockmodel graph when the number of vertices tends to infinity. We show that, in the limit, these eigenvalues are jointly multivariate normal with bounded covariances. Our result extends the classic result of Füredi and Komlós on the fluctuation of the largest eigenvalue for Erdős-Rényi graphs.
Tasks
Published	2018-03-30
URL	http://arxiv.org/abs/1803.11551v1
PDF	http://arxiv.org/pdf/1803.11551v1.pdf
PWC	https://paperswithcode.com/paper/the-eigenvalues-of-stochastic-blockmodel
Repo
Framework

Exact Sampling of Determinantal Point Processes without Eigendecomposition


Title	Exact Sampling of Determinantal Point Processes without Eigendecomposition
Authors	Agnès Desolneux, Claire Launay, Bruno Galerne
Abstract	Determinantal point processes (DPPs) enable the modeling of repulsion: they provide diverse sets of points. The repulsion is encoded in a kernel $K$ that can be seen as a matrix storing the similarity between points. The diversity comes from the fact that the inclusion probability of a subset is equal to the determinant of a submatrix of $K$. The exact algorithm to sample DPPs uses the spectral decomposition of $K$, a computation that becomes costly when dealing with a high number of points. Here, we present an alternative exact algorithm in the discrete setting that avoids computing the eigenvalues and eigenvectors. Instead, it relies on Cholesky decompositions. This is a two-step strategy: first, it samples a Bernoulli point process with an appropriate distribution, then it samples the target DPP distribution through a thinning procedure. Not only is the method used here innovative, but this algorithm can be competitive with the original algorithm or even faster for some applications specified below.
Tasks	Point Processes
Published	2018-02-23
URL	https://arxiv.org/abs/1802.08429v4
PDF	https://arxiv.org/pdf/1802.08429v4.pdf
PWC	https://paperswithcode.com/paper/exact-sampling-of-determinantal-point
Repo
Framework
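
The determinant property mentioned in the abstract above can be written out explicitly. The notation here ($Y$ for the random subset, $K_A$ for the submatrix of $K$ indexed by $A$) is assumed for illustration and may differ from the paper's, and $K$ is assumed symmetric in the two-point case.

```latex
% Inclusion probability of a subset A under a discrete DPP Y with kernel K
\mathbb{P}(A \subseteq Y) = \det(K_A), \qquad K_A = (K_{ij})_{i,j \in A}.

% Two-point case, assuming K is symmetric:
\mathbb{P}(\{i,j\} \subseteq Y)
  = K_{ii} K_{jj} - K_{ij}^{2}
  \le K_{ii} K_{jj}
  = \mathbb{P}(i \in Y)\,\mathbb{P}(j \in Y).
```

The two-point case shows where the repulsion comes from: the more similar points $i$ and $j$ are (larger $K_{ij}^2$), the less likely they are to be selected together compared with independent selection.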

Discovering Signals from Web Sources to Predict Cyber Attacks


Title	Discovering Signals from Web Sources to Predict Cyber Attacks
Authors	Palash Goyal, KSM Tozammel Hossain, Ashok Deb, Nazgol Tavabi, Nathan Bartley, Andrés Abeliuk, Emilio Ferrara, Kristina Lerman
Abstract	Cyber attacks are growing in frequency and severity. Over the past year alone we have witnessed massive data breaches that stole personal information of millions of people and wide-scale ransomware attacks that paralyzed critical infrastructure of several countries. Combating the rising cyber threat calls for a multi-pronged strategy, which includes predicting when these attacks will occur. The intuition driving our approach is this: during the planning and preparation stages, hackers leave digital traces of their activities on both the surface web and dark web in the form of discussions on platforms like hacker forums, social media, blogs and the like. These data provide predictive signals that allow anticipating cyber attacks. In this paper, we describe machine learning techniques based on deep neural networks and autoregressive time series models that leverage external signals from publicly available Web sources to forecast cyber attacks. Performance of our framework across ground truth data over real-world forecasting tasks shows that our methods yield a significant lift or increase of F1 for the top signals on predicted cyber attacks. Our results suggest that, when deployed, our system will be able to provide an effective line of defense against various types of targeted cyber attacks.
Tasks	Time Series
Published	2018-06-08
URL	http://arxiv.org/abs/1806.03342v1
PDF	http://arxiv.org/pdf/1806.03342v1.pdf
PWC	https://paperswithcode.com/paper/discovering-signals-from-web-sources-to
Repo
Framework

Image Based Fashion Product Recommendation with Deep Learning


Title	Image Based Fashion Product Recommendation with Deep Learning
Authors	Hessel Tuinhof, Clemens Pirker, Markus Haltmeier
Abstract	We develop a two-stage deep learning framework that recommends fashion images based on other input images of similar style. For that purpose, a neural network classifier is used as a data-driven, visually-aware feature extractor. The latter then serves as input for similarity-based recommendations using a ranking algorithm. Our approach is tested on the publicly available Fashion dataset. Initialization strategies using transfer learning from larger product databases are presented. Combined with more traditional content-based recommendation systems, our framework can help to increase robustness and performance, for example, by better matching a particular customer style.
Tasks	Product Recommendation, Recommendation Systems, Transfer Learning
Published	2018-05-06
URL	http://arxiv.org/abs/1805.08694v2
PDF	http://arxiv.org/pdf/1805.08694v2.pdf
PWC	https://paperswithcode.com/paper/image-based-fashion-product-recommendation
Repo
Framework
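
The second stage described in the abstract above, ranking catalog items by similarity to a query in feature space, can be sketched as below. Everything specific here is an assumption: cosine similarity as the ranking score, the function names, and the three-dimensional toy feature vectors, which stand in for features that would come from the trained classifier network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_features, catalog, k=2):
    """Return the names of the k catalog items most similar to the query."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query_features, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up feature vectors; a real system would extract these from images.
catalog = {
    "red_dress": [0.9, 0.1, 0.0],
    "blue_jeans": [0.0, 0.2, 0.9],
    "pink_dress": [0.8, 0.3, 0.1],
}
top = recommend([1.0, 0.2, 0.0], catalog, k=2)
print(top)
```

Separating feature extraction from ranking means the catalog features can be computed once and reused for every query, with only the query image passing through the network at recommendation time.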

Factorization of Dempster-Shafer Belief Functions Based on Data


Title	Factorization of Dempster-Shafer Belief Functions Based on Data
Authors	Andrzej Matuszewski, Mieczysław A. Kłopotek
Abstract	One important obstacle in applying Dempster-Shafer Theory (DST) is its relationship to frequencies. In particular, there exist serious difficulties in finding factorizations of belief functions from data. In probability theory, factorizations are usually related to the notion of (conditional) independence, and their possibility is tested accordingly. However, in DST conditional belief distributions prove to be non-proper belief functions (that is, ones connected with negative “frequencies”). This makes statistical testing of potential conditional independencies practically impossible, as no coherent interpretation could be found so far for negative belief function values. In this paper a novel attempt is made to overcome this difficulty. In the proposal no conditional beliefs are calculated; instead, a new measure F is introduced within the framework of DST, closely related to conditional independence, which allows conventional statistical tests to be applied for the detection of dependence/independence.
Tasks
Published	2018-12-14
URL	http://arxiv.org/abs/1812.06028v1
PDF	http://arxiv.org/pdf/1812.06028v1.pdf
PWC	https://paperswithcode.com/paper/factorization-of-dempster-shafer-belief
Repo
Framework