Paper Group ANR 440
Modeling Scalability of Distributed Machine Learning. A Stackelberg Game Perspective on the Conflict Between Machine Learning and Data Obfuscation. Deep neural heart rate variability analysis. Some Advances in Role Discovery in Graphs. Generative Adversarial Parallelization. Dueling Bandits with Dependent Arms. YodaNN: An Architecture for Ultra-Low …
Modeling Scalability of Distributed Machine Learning
Title | Modeling Scalability of Distributed Machine Learning |
Authors | Alexander Ulanov, Andrey Simanovsky, Manish Marwah |
Abstract | Present day machine learning is computationally intensive and processes large amounts of data. It is implemented in a distributed fashion in order to address these scalability issues. The work is parallelized across a number of computing nodes. It is usually hard to estimate in advance how many nodes to use for a particular workload. We propose a simple framework for estimating the scalability of distributed machine learning algorithms. We measure the scalability by means of the speedup an algorithm achieves with more nodes. We propose time complexity models for gradient descent and graphical model inference. We validate our models with experiments on deep learning training and belief propagation. This framework was used to study the scalability of machine learning algorithms in Apache Spark. |
Tasks | |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06276v2 |
http://arxiv.org/pdf/1610.06276v2.pdf | |
PWC | https://paperswithcode.com/paper/modeling-scalability-of-distributed-machine |
Repo | |
Framework | |
A Stackelberg Game Perspective on the Conflict Between Machine Learning and Data Obfuscation
Title | A Stackelberg Game Perspective on the Conflict Between Machine Learning and Data Obfuscation |
Authors | Jeffrey Pawlick, Quanyan Zhu |
Abstract | Data is the new oil; this refrain is repeated extensively in the age of internet tracking, machine learning, and data analytics. Social network analysis, cookie-based advertising, and government surveillance are all evidence of the use of data for commercial and national interests. Public pressure, however, is mounting for the protection of privacy. Frameworks such as differential privacy offer machine learning algorithms methods to guarantee limits to information disclosure, but they are seldom implemented. Recently, however, developers have made significant efforts to undermine tracking through obfuscation tools that hide user characteristics in a sea of noise. These services highlight an emerging clash between tracking and data obfuscation. In this paper, we conceptualize this conflict through a dynamic game between users and a machine learning algorithm that uses empirical risk minimization. First, a machine learner declares a privacy protection level, and then users respond by choosing their own perturbation amounts. We study the interaction between the users and the learner using a Stackelberg game. The utility functions quantify accuracy using expected loss and privacy in terms of the bounds of differential privacy. In equilibrium, we find selfish users tend to cause significant utility loss to trackers by perturbing heavily, in a phenomenon reminiscent of public good games. Trackers, however, can improve the balance by proactively perturbing the data themselves. While other work in this area has studied privacy markets and mechanism design for truthful reporting of user information, we take a different viewpoint by considering both user and learner perturbation. |
Tasks | |
Published | 2016-06-21 |
URL | http://arxiv.org/abs/1606.06771v2 |
http://arxiv.org/pdf/1606.06771v2.pdf | |
PWC | https://paperswithcode.com/paper/a-stackelberg-game-perspective-on-the |
Repo | |
Framework | |
Deep neural heart rate variability analysis
Title | Deep neural heart rate variability analysis |
Authors | Tamas Madl |
Abstract | Despite of the pain and limited accuracy of blood tests for early recognition of cardiovascular disease, they dominate risk screening and triage. On the other hand, heart rate variability is non-invasive and cheap, but not considered accurate enough for clinical practice. Here, we tackle heart beat interval based classification with deep learning. We introduce an end to end differentiable hybrid architecture, consisting of a layer of biological neuron models of cardiac dynamics (modified FitzHugh Nagumo neurons) and several layers of a standard feed-forward neural network. The proposed model is evaluated on ECGs from 474 stable at-risk (coronary artery disease) patients, and 1172 chest pain patients of an emergency department. We show that it can significantly outperform models based on traditional heart rate variability predictors, as well as approaching or in some cases outperforming clinical blood tests, based only on 60 seconds of inter-beat intervals. |
Tasks | Electrocardiography (ECG), Heart Rate Variability |
Published | 2016-12-29 |
URL | http://arxiv.org/abs/1612.09205v1 |
http://arxiv.org/pdf/1612.09205v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-heart-rate-variability-analysis |
Repo | |
Framework | |
Some Advances in Role Discovery in Graphs
Title | Some Advances in Role Discovery in Graphs |
Authors | Sean Gilpin, Chia-Tung Kuo, Tina Eliassi-Rad, Ian Davidson |
Abstract | Role discovery in graphs is an emerging area that allows analysis of complex graphs in an intuitive way. In contrast to other graph prob- lems such as community discovery, which finds groups of highly connected nodes, the role discovery problem finds groups of nodes that share similar graph topological structure. However, existing work so far has two severe limitations that prevent its use in some domains. Firstly, it is completely unsupervised which is undesirable for a number of reasons. Secondly, most work is limited to a single relational graph. We address both these lim- itations in an intuitive and easy to implement alternating least squares framework. Our framework allows convex constraints to be placed on the role discovery problem which can provide useful supervision. In par- ticular we explore supervision to enforce i) sparsity, ii) diversity and iii) alternativeness. We then show how to lift this work for multi-relational graphs. A natural representation of a multi-relational graph is an order 3 tensor (rather than a matrix) and that a Tucker decomposition allows us to find complex interactions between collections of entities (E-groups) and the roles they play for a combination of relations (R-groups). Existing Tucker decomposition methods in tensor toolboxes are not suited for our purpose, so we create our own algorithm that we demonstrate is pragmatically useful. |
Tasks | |
Published | 2016-09-09 |
URL | http://arxiv.org/abs/1609.02646v1 |
http://arxiv.org/pdf/1609.02646v1.pdf | |
PWC | https://paperswithcode.com/paper/some-advances-in-role-discovery-in-graphs |
Repo | |
Framework | |
Generative Adversarial Parallelization
Title | Generative Adversarial Parallelization |
Authors | Daniel Jiwoong Im, He Ma, Chris Dongjoo Kim, Graham Taylor |
Abstract | Generative Adversarial Networks have become one of the most studied frameworks for unsupervised learning due to their intuitive formulation. They have also been shown to be capable of generating convincing examples in limited domains, such as low-resolution images. However, they still prove difficult to train in practice and tend to ignore modes of the data generating distribution. Quantitatively capturing effects such as mode coverage and more generally the quality of the generative model still remain elusive. We propose Generative Adversarial Parallelization, a framework in which many GANs or their variants are trained simultaneously, exchanging their discriminators. This eliminates the tight coupling between a generator and discriminator, leading to improved convergence and improved coverage of modes. We also propose an improved variant of the recently proposed Generative Adversarial Metric and show how it can score individual GANs or their collections under the GAP model. |
Tasks | |
Published | 2016-12-13 |
URL | http://arxiv.org/abs/1612.04021v1 |
http://arxiv.org/pdf/1612.04021v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-parallelization |
Repo | |
Framework | |
Dueling Bandits with Dependent Arms
Title | Dueling Bandits with Dependent Arms |
Authors | Bangrui Chen, Peter I. Frazier |
Abstract | We study dueling bandits with weak utility-based regret when preferences over arms have a total order and carry observable feature vectors. The order is assumed to be determined by these feature vectors, an unknown preference vector, and a known utility function. This structure introduces dependence between preferences for pairs of arms, and allows learning about the preference over one pair of arms from the preference over another pair of arms. We propose an algorithm for this setting called Comparing The Best (CTB), which we show has constant expected cumulative weak utility-based regret. We provide a Bayesian interpretation for CTB, an implementation appropriate for a small number of arms, and an alternate implementation for many arms that can be used when the input parameters satisfy a decomposability condition. We demonstrate through numerical experiments that CTB with appropriate input parameters outperforms all benchmarks considered. |
Tasks | |
Published | 2016-05-28 |
URL | http://arxiv.org/abs/1605.08838v2 |
http://arxiv.org/pdf/1605.08838v2.pdf | |
PWC | https://paperswithcode.com/paper/dueling-bandits-with-dependent-arms |
Repo | |
Framework | |
YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration
Title | YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration |
Authors | Renzo Andri, Lukas Cavigelli, Davide Rossi, Luca Benini |
Abstract | Convolutional neural networks (CNNs) have revolutionized the world of computer vision over the last few years, pushing image classification beyond human accuracy. The computational effort of today’s CNNs requires power-hungry parallel processors or GP-GPUs. Recent developments in CNN accelerators for system-on-chip integration have reduced energy consumption significantly. Unfortunately, even these highly optimized devices are above the power envelope imposed by mobile and deeply embedded applications and face hard limitations caused by CNN weight I/O and storage. This prevents the adoption of CNNs in future ultra-low power Internet of Things end-nodes for near-sensor analytics. Recent algorithmic and theoretical advancements enable competitive classification accuracy even when limiting CNNs to binary (+1/-1) weights during training. These new findings bring major optimization opportunities in the arithmetic core by removing the need for expensive multiplications, as well as reducing I/O bandwidth and storage. In this work, we present an accelerator optimized for binary-weight CNNs that achieves 1510 GOp/s at 1.2 V on a core area of only 1.33 MGE (Million Gate Equivalent) or 0.19 mm$^2$ and with a power dissipation of 895 {\mu}W in UMC 65 nm technology at 0.6 V. Our accelerator significantly outperforms the state-of-the-art in terms of energy and area efficiency achieving 61.2 TOp/s/W@0.6 V and 1135 GOp/s/MGE@1.2 V, respectively. |
Tasks | Image Classification |
Published | 2016-06-17 |
URL | http://arxiv.org/abs/1606.05487v4 |
http://arxiv.org/pdf/1606.05487v4.pdf | |
PWC | https://paperswithcode.com/paper/yodann-an-architecture-for-ultra-low-power |
Repo | |
Framework | |
On Unbounded Delays in Asynchronous Parallel Fixed-Point Algorithms
Title | On Unbounded Delays in Asynchronous Parallel Fixed-Point Algorithms |
Authors | Robert Hannah, Wotao Yin |
Abstract | The need for scalable numerical solutions has motivated the development of asynchronous parallel algorithms, where a set of nodes run in parallel with little or no synchronization, thus computing with delayed information. This paper studies the convergence of the asynchronous parallel algorithm ARock under potentially unbounded delays. ARock is a general asynchronous algorithm that has many applications. It parallelizes fixed-point iterations by letting a set of nodes randomly choose solution coordinates and update them in an asynchronous parallel fashion. ARock takes some recent asynchronous coordinate descent algorithms as special cases and gives rise to new asynchronous operator-splitting algorithms. Existing analysis of ARock assumes the delays to be bounded, and uses this bound to set a step size that is important to both convergence and efficiency. Other work, though allowing unbounded delays, imposes strict conditions on the underlying fixed-point operator, resulting in limited applications. In this paper, convergence is established under unbounded delays, which can be either stochastic or deterministic. The proposed step sizes are more practical and generally larger than those in the existing work. The step size adapts to the delay distribution or the current delay being experienced in the system. New Lyapunov functions, which are the key to analyzing asynchronous algorithms, are generated to obtain our results. A set of applicable optimization algorithms with large-scale applications are given, including machine learning and scientific computing algorithms. |
Tasks | |
Published | 2016-09-15 |
URL | http://arxiv.org/abs/1609.04746v2 |
http://arxiv.org/pdf/1609.04746v2.pdf | |
PWC | https://paperswithcode.com/paper/on-unbounded-delays-in-asynchronous-parallel |
Repo | |
Framework | |
On Bochner’s and Polya’s Characterizations of Positive-Definite Kernels and the Respective Random Feature Maps
Title | On Bochner’s and Polya’s Characterizations of Positive-Definite Kernels and the Respective Random Feature Maps |
Authors | Jie Chen, Dehua Cheng, Yan Liu |
Abstract | Positive-definite kernel functions are fundamental elements of kernel methods and Gaussian processes. A well-known construction of such functions comes from Bochner’s characterization, which connects a positive-definite function with a probability distribution. Another construction, which appears to have attracted less attention, is Polya’s criterion that characterizes a subset of these functions. In this paper, we study the latter characterization and derive a number of novel kernels little known previously. In the context of large-scale kernel machines, Rahimi and Recht (2007) proposed a random feature map (random Fourier) that approximates a kernel function, through independent sampling of the probability distribution in Bochner’s characterization. The authors also suggested another feature map (random binning), which, although not explicitly stated, comes from Polya’s characterization. We show that with the same number of random samples, the random binning map results in an Euclidean inner product closer to the kernel than does the random Fourier map. The superiority of the random binning map is confirmed empirically through regressions and classifications in the reproducing kernel Hilbert space. |
Tasks | Gaussian Processes |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08861v1 |
http://arxiv.org/pdf/1610.08861v1.pdf | |
PWC | https://paperswithcode.com/paper/on-bochners-and-polyas-characterizations-of |
Repo | |
Framework | |
Seeing the Forest from the Trees in Two Looks: Matrix Sketching by Cascaded Bilateral Sampling
Title | Seeing the Forest from the Trees in Two Looks: Matrix Sketching by Cascaded Bilateral Sampling |
Authors | Kai Zhang, Chuanren Liu, Jie Zhang, Hui Xiong, Eric Xing, Jieping Ye |
Abstract | Matrix sketching is aimed at finding close approximations of a matrix by factors of much smaller dimensions, which has important applications in optimization and machine learning. Given a matrix A of size m by n, state-of-the-art randomized algorithms take O(m * n) time and space to obtain its low-rank decomposition. Although quite useful, the need to store or manipulate the entire matrix makes it a computational bottleneck for truly large and dense inputs. Can we sketch an m-by-n matrix in O(m + n) cost by accessing only a small fraction of its rows and columns, without knowing anything about the remaining data? In this paper, we propose the cascaded bilateral sampling (CABS) framework to solve this problem. We start from demonstrating how the approximation quality of bilateral matrix sketching depends on the encoding powers of sampling. In particular, the sampled rows and columns should correspond to the code-vectors in the ground truth decompositions. Motivated by this analysis, we propose to first generate a pilot-sketch using simple random sampling, and then pursue more advanced, “follow-up” sampling on the pilot-sketch factors seeking maximal encoding powers. In this cascading process, the rise of approximation quality is shown to be lower-bounded by the improvement of encoding powers in the follow-up sampling step, thus theoretically guarantees the algorithmic boosting property. Computationally, our framework only takes linear time and space, and at the same time its performance rivals the quality of state-of-the-art algorithms consuming a quadratic amount of resources. Empirical evaluations on benchmark data fully demonstrate the potential of our methods in large scale matrix sketching and related areas. |
Tasks | |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07395v3 |
http://arxiv.org/pdf/1607.07395v3.pdf | |
PWC | https://paperswithcode.com/paper/seeing-the-forest-from-the-trees-in-two-looks |
Repo | |
Framework | |
EmoFit: Affect Monitoring System for Sedentary Jobs
Title | EmoFit: Affect Monitoring System for Sedentary Jobs |
Authors | Amol Patwardhan, Gerald Knapp |
Abstract | Emotional and physical well-being at workplace is important for a positive work environment and higher productivity. Jobs such as software programming lead to a sedentary lifestyle and require high interaction with computers. Working at the same job for years can cause a feeling of intellectual stagnation and lack of drive. Many employees experience lack of motivation, mild to extreme depression due to reasons such as aversion towards job responsibilities and incompatibility with coworkers or boss. This research proposed an affect monitoring system EmoFit that would play the role of psychological and physical health trainer. The day to day computer activity and body language was analyzed to detect the physical and emotional well-being of the user. Keystrokes, activity interruptions, eye tracking, facial expressions, body posture and speech were monitored to gauge the users health. The system also provided activities such as at-desk exercise and stress relief game and motivational quotes in an attempt to promote users well-being. The experimental results and positive feedback from test subjects showed that EmoFit would help improve emotional and physical well-being at jobs that involve significant computer usage. |
Tasks | Eye Tracking |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01077v1 |
http://arxiv.org/pdf/1607.01077v1.pdf | |
PWC | https://paperswithcode.com/paper/emofit-affect-monitoring-system-for-sedentary |
Repo | |
Framework | |
Inter-Battery Topic Representation Learning
Title | Inter-Battery Topic Representation Learning |
Authors | Cheng Zhang, Hedvig Kjellstrom, Carl Henrik Ek |
Abstract | In this paper, we present the Inter-Battery Topic Model (IBTM). Our approach extends traditional topic models by learning a factorized latent variable representation. The structured representation leads to a model that marries benefits traditionally associated with a discriminative approach, such as feature selection, with those of a generative model, such as principled regularization and ability to handle missing data. The factorization is provided by representing data in terms of aligned pairs of observations as different views. This provides means for selecting a representation that separately models topics that exist in both views from the topics that are unique to a single view. This structured consolidation allows for efficient and robust inference and provides a compact and efficient representation. Learning is performed in a Bayesian fashion by maximizing a rigorous bound on the log-likelihood. Firstly, we illustrate the benefits of the model on a synthetic dataset,. The model is then evaluated in both uni- and multi-modality settings on two different classification tasks with off-the-shelf convolutional neural network (CNN) features which generate state-of-the-art results with extremely compact representations. |
Tasks | Feature Selection, Representation Learning, Topic Models |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.06155v2 |
http://arxiv.org/pdf/1605.06155v2.pdf | |
PWC | https://paperswithcode.com/paper/inter-battery-topic-representation-learning |
Repo | |
Framework | |
Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification
Title | Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification |
Authors | Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie |
Abstract | Mammogram classification is directly related to computer-aided diagnosis of breast cancer. Traditional methods requires great effort to annotate the training data by costly manual labeling and specialized computational models to detect these annotations during test. Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on whole mammogram without the aforementioned costly need to annotate the training data. We explore three different schemes to construct deep multi-instance networks for whole mammogram classification. Experimental results on the INbreast dataset demonstrate the robustness of proposed deep networks compared to previous work using segmentation and detection annotations in the training. |
Tasks | Whole Mammogram Classification |
Published | 2016-12-18 |
URL | http://arxiv.org/abs/1612.05968v1 |
http://arxiv.org/pdf/1612.05968v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-multi-instance-networks-with-sparse |
Repo | |
Framework | |
DeepQA: Improving the estimation of single protein model quality with deep belief networks
Title | DeepQA: Improving the estimation of single protein model quality with deep belief networks |
Authors | Renzhi Cao, Debswapna Bhattacharya, Jie Hou, Jianlin Cheng |
Abstract | Protein quality assessment (QA) by ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model pool consisting of mostly low-quality models, is still a largely unsolved problem. We introduce a novel single-model quality assessment method DeepQA based on deep belief network that utilizes a number of selected features describing the quality of a model from different perspectives, such as energy, physio-chemical characteristics, and structural information. The deep belief network is trained on several large datasets consisting of models from the Critical Assessment of Protein Structure Prediction (CASP) experiments, several publicly available datasets, and models generated by our in-house ab initio method. Our experiment demonstrate that deep belief network has better performance compared to Support Vector Machines and Neural Networks on the protein model quality assessment problem, and our method DeepQA achieves the state-of-the-art performance on CASP11 dataset. It also outperformed two well-established methods in selecting good outlier models from a large set of models of mostly low quality generated by ab initio modeling methods. DeepQA is a useful tool for protein single model quality assessment and protein structure prediction. The source code, executable, document and training/test datasets of DeepQA for Linux is freely available to non-commercial users at http://cactus.rnet.missouri.edu/DeepQA/. |
Tasks | |
Published | 2016-07-15 |
URL | http://arxiv.org/abs/1607.04379v1 |
http://arxiv.org/pdf/1607.04379v1.pdf | |
PWC | https://paperswithcode.com/paper/deepqa-improving-the-estimation-of-single |
Repo | |
Framework | |
Deep Multi-Modal Image Correspondence Learning
Title | Deep Multi-Modal Image Correspondence Learning |
Authors | Chen Liu, Jiajun Wu, Pushmeet Kohli, Yasutaka Furukawa |
Abstract | Inference of correspondences between images from different modalities is an extremely important perceptual ability that enables humans to understand and recognize cross-modal concepts. In this paper, we consider an instance of this problem that involves matching photographs of building interiors with their corresponding floorplan. This is a particularly challenging problem because a floorplan, as a stylized architectural drawing, is very different in appearance from a color photograph. Furthermore, individual photographs by themselves depict only a part of a floorplan (e.g., kitchen, bathroom, and living room). We propose the use of a number of different neural network architectures for this task, which are trained and evaluated on a novel large-scale dataset of 5 million floorplan images and 80 million associated photographs. Experimental evaluation reveals that our neural network architectures are able to identify visual cues that result in reliable matches across these two quite different modalities. In fact, the trained networks are able to even outperform human subjects in several challenging image matching problems. Our result implies that neural networks are effective at perceptual tasks that require long periods of reasoning even for humans to solve. |
Tasks | |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01225v1 |
http://arxiv.org/pdf/1612.01225v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-multi-modal-image-correspondence |
Repo | |
Framework | |