Paper Group ANR 565
Learning Gaussian Graphical Models Using Discriminated Hub Graphical Lasso
Title | Learning Gaussian Graphical Models Using Discriminated Hub Graphical Lasso |
Authors | Zhen Li, Jingtian Bai, Weilian Zhou |
Abstract | We develop a new method called Discriminated Hub Graphical Lasso (DHGL), based on Hub Graphical Lasso (HGL), that incorporates prior information about hubs. We apply this new method in two situations: with known hubs and without known hubs. Then we compare DHGL with HGL using several measures of performance. When some hubs are known, we can always estimate the precision matrix better via DHGL than HGL. When no hubs are known, we use Graphical Lasso (GL) to provide information about hubs and find that the performance of DHGL is always better than that of HGL when correct prior information is given, and seldom degenerates when the prior information is wrong. |
Tasks | |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06364v1 |
http://arxiv.org/pdf/1705.06364v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-gaussian-graphical-models-using |
Repo | |
Framework | |
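The abstract mentions falling back on plain Graphical Lasso to supply hub information when none is known. A minimal sketch of that screening step, using scikit-learn's `GraphicalLasso` (the DHGL penalty itself is not implemented here, and the degree threshold below is an illustrative assumption):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))          # n samples, p variables

# Step 1: plain Graphical Lasso estimate of the precision matrix.
gl = GraphicalLasso(alpha=0.1).fit(X)
Theta = gl.precision_

# Step 2: flag hub candidates as nodes with unusually many nonzero
# partial correlations (the 2-sigma degree cutoff is an assumption).
off_diag = Theta - np.diag(np.diag(Theta))
degree = (np.abs(off_diag) > 1e-4).sum(axis=0)
hub_candidates = np.where(degree > degree.mean() + 2 * degree.std())[0]
print("candidate hubs:", hub_candidates)
```

In DHGL these candidates would then receive a discriminated (weaker) hub penalty; here they are simply printed.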
Traffic Optimization For a Mixture of Self-interested and Compliant Agents
Title | Traffic Optimization For a Mixture of Self-interested and Compliant Agents |
Authors | Guni Sharon, Michael Albert, Tarun Rambha, Stephen Boyles, Peter Stone |
Abstract | This paper focuses on two commonly used path assignment policies for agents traversing a congested network: self-interested routing, and system-optimum routing. In the self-interested routing policy each agent selects a path that optimizes its own utility, while in the system-optimum routing policy agents are assigned paths with the goal of maximizing system performance. This paper considers a scenario where a centralized network manager wishes to optimize utilities over all agents, i.e., implement a system-optimum routing policy. In many real-life scenarios, however, the system manager is unable to influence the route assignment of all agents due to limited influence on route choice decisions. Motivated by such scenarios, a computationally tractable method is presented that computes the minimal number of agents that the system manager needs to influence (compliant agents) in order to achieve system-optimal performance. Moreover, this methodology can also determine whether a given set of compliant agents is sufficient to achieve system optimum, and compute the optimal route assignment for the compliant agents to do so. Experimental results are presented showing that in several large-scale, realistic traffic networks, optimal flow can be achieved with as little as 13% of the agents being compliant, and at most 54%, depending on the network. |
Tasks | |
Published | 2017-09-27 |
URL | http://arxiv.org/abs/1709.09569v1 |
http://arxiv.org/pdf/1709.09569v1.pdf | |
PWC | https://paperswithcode.com/paper/traffic-optimization-for-a-mixture-of-self |
Repo | |
Framework | |
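To make the gap between self-interested and system-optimum routing concrete, here is a hedged toy example (Pigou's classic two-link network, not one of the paper's benchmark networks): link 1 has constant latency 1, link 2 has latency equal to the fraction of agents on it. Self-interested agents all pile onto link 2; the system optimum splits traffic to minimize average latency.

```python
from scipy.optimize import minimize_scalar

# Pigou network: fraction x of agents on link 2 (latency x),
# the rest on link 1 (constant latency 1).
avg_latency = lambda x: x * x + (1 - x) * 1.0

# Self-interested (user equilibrium): link 2 is never worse than link 1
# until x = 1, so all agents take it and everyone experiences latency 1.
print("user equilibrium avg latency:", avg_latency(1.0))   # 1.0

# System optimum: minimize average latency over the split x.
res = minimize_scalar(avg_latency, bounds=(0, 1), method="bounded")
print("system optimum split x =", round(res.x, 3))          # ~0.5
print("system optimum avg latency:", round(res.fun, 3))     # 0.75
```

The paper's contribution is, roughly, determining how few agents a manager must control to close this kind of gap in realistic networks; this sketch only illustrates why the gap exists.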
Invariant components of synergy, redundancy, and unique information among three variables
Title | Invariant components of synergy, redundancy, and unique information among three variables |
Authors | Giuseppe Pica, Eugenio Piasini, Daniel Chicharro, Stefano Panzeri |
Abstract | In a system of three stochastic variables, the Partial Information Decomposition (PID) of Williams and Beer dissects the information that two variables (sources) carry about a third variable (target) into nonnegative information atoms that describe redundant, unique, and synergistic modes of dependencies among the variables. However, the classification of the three variables into two sources and one target limits the dependency modes that can be quantitatively resolved, and does not naturally suit all systems. Here, we extend the PID to describe trivariate modes of dependencies in full generality, without introducing additional decomposition axioms or making assumptions about the target/source nature of the variables. By comparing different PID lattices of the same system, we unveil a finer PID structure made of seven nonnegative information subatoms that are invariant to different target/source classifications and that are sufficient to construct any PID lattice. This finer structure naturally splits redundant information into two nonnegative components: the source redundancy, which arises from the pairwise correlations between the source variables, and the non-source redundancy, which does not, and which relates to the synergistic information the sources carry about the target. The invariant structure is also sufficient to construct the system’s entropy; hence it completely characterizes all of the interdependencies in the system. |
Tasks | |
Published | 2017-06-27 |
URL | http://arxiv.org/abs/1706.08921v1 |
http://arxiv.org/pdf/1706.08921v1.pdf | |
PWC | https://paperswithcode.com/paper/invariant-components-of-synergy-redundancy |
Repo | |
Framework | |
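As a hedged toy illustration of the synergy that PID-style decompositions quantify (not the paper's new subatom structure): for Z = X XOR Y with independent fair-bit sources, neither source alone carries information about the target, yet jointly they determine it completely.

```python
import numpy as np
from itertools import product

# Joint distribution of (X, Y, Z) with X, Y fair coins and Z = X XOR Y.
p = {(x, y, x ^ y): 0.25 for x, y in product([0, 1], repeat=2)}

def H(var_idx):
    """Entropy (bits) of the marginal over the given variable indices."""
    marg = {}
    for outcome, pr in p.items():
        key = tuple(outcome[i] for i in var_idx)
        marg[key] = marg.get(key, 0.0) + pr
    return -sum(pr * np.log2(pr) for pr in marg.values() if pr > 0)

# Mutual informations via I(A;B) = H(A) + H(B) - H(A,B).
I_xz  = H([0]) + H([2]) - H([0, 2])
I_yz  = H([1]) + H([2]) - H([1, 2])
I_xyz = H([0, 1]) + H([2]) - H([0, 1, 2])
print(I_xz, I_yz, I_xyz)   # 0.0 0.0 1.0 -- the full bit is synergistic
```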
Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems
Title | Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems |
Authors | Jure Sokolic, Qiang Qiu, Miguel R. D. Rodrigues, Guillermo Sapiro |
Abstract | Security, privacy, and fairness have become critical in the era of data science and machine learning. More and more we see that achieving universally secure, private, and fair systems is practically impossible. We have seen, for example, how generative adversarial networks can be used to learn about the expected private training data; how the exploitation of additional data can reveal private information in the original data; and how seemingly unrelated features can teach us about each other. Confronted with this challenge, in this paper we open a new line of research, where security, privacy, and fairness are learned and used in a closed environment. The goal is to ensure that a given entity (e.g., a company or the government), trusted to infer certain information with our data, is blocked from inferring protected information from it. For example, a hospital might be allowed to produce a diagnosis for the patient (the positive task) without being able to infer the gender of the subject (the negative task). Similarly, a company can guarantee that internally it is not using the provided data for any undesired task, an important goal that does not contradict the virtually impossible challenge of blocking everybody from the undesired task. We design a system that learns to succeed on the positive task while simultaneously failing at the negative one, and illustrate this with challenging cases where the positive task is actually harder than the negative one being blocked. Fairness with respect to the information in the negative task is often obtained automatically as a result of the proposed approach. The particular framework and examples open the door to security, privacy, and fairness in very important closed scenarios, ranging from private data accumulation companies like social networks to law enforcement and hospitals. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08197v1 |
http://arxiv.org/pdf/1705.08197v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-succeed-while-teaching-to-fail |
Repo | |
Framework | |
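The paper's exact training scheme is not reproduced here; below is a hedged sketch of one standard way to realize "succeed on the positive task while failing on the negative one": a gradient-reversal adversarial setup in the style of Ganin & Lempitsky, with hypothetical layer sizes and task heads.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on backward."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

encoder  = nn.Sequential(nn.Linear(64, 32), nn.ReLU())  # shared representation
pos_head = nn.Linear(32, 10)   # positive task (e.g., diagnosis)
neg_head = nn.Linear(32, 2)    # negative task to block (e.g., gender)

x = torch.randn(8, 64)
y_pos, y_neg = torch.randint(0, 10, (8,)), torch.randint(0, 2, (8,))
z = encoder(x)
loss = nn.functional.cross_entropy(pos_head(z), y_pos) \
     + nn.functional.cross_entropy(neg_head(GradReverse.apply(z)), y_neg)
loss.backward()  # encoder descends the positive loss, ascends the negative one
```

The reversal makes the encoder actively destroy the features the negative head needs, while the positive head still trains normally.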
A Voting-Based System for Ethical Decision Making
Title | A Voting-Based System for Ethical Decision Making |
Authors | Ritesh Noothigattu, Snehalkumar ‘Neil’ S. Gaikwad, Edmond Awad, Sohan Dsouza, Iyad Rahwan, Pradeep Ravikumar, Ariel D. Procaccia |
Abstract | We present a general approach to automating ethical decisions, drawing on machine learning and computational social choice. In a nutshell, we propose to learn a model of societal preferences, and, when faced with a specific ethical dilemma at runtime, efficiently aggregate those preferences to identify a desirable choice. We provide a concrete algorithm that instantiates our approach; some of its crucial steps are informed by a new theory of swap-dominance efficient voting rules. Finally, we implement and evaluate a system for ethical decision making in the autonomous vehicle domain, using preference data collected from 1.3 million people through the Moral Machine website. |
Tasks | Decision Making |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.06692v2 |
http://arxiv.org/pdf/1709.06692v2.pdf | |
PWC | https://paperswithcode.com/paper/a-voting-based-system-for-ethical-decision |
Repo | |
Framework | |
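A hedged toy stand-in for the aggregation step (the paper's actual algorithm relies on learned preference models and swap-dominance-efficient rules, which this sketch does not implement): Borda aggregation of individual rankings over the alternatives in a dilemma.

```python
from collections import defaultdict

def borda_winner(rankings):
    """Aggregate best-first rankings over the same alternatives by Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        for pos, alt in enumerate(ranking):
            scores[alt] += len(ranking) - 1 - pos  # top choice scores highest
    return max(scores, key=scores.get)

# Three voters ranking outcomes of a hypothetical driving dilemma.
rankings = [["swerve", "brake", "stay"],
            ["brake", "swerve", "stay"],
            ["swerve", "stay", "brake"]]
print(borda_winner(rankings))   # "swerve"
```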
Dialectical Rough Sets, Parthood and Figures of Opposition-1
Title | Dialectical Rough Sets, Parthood and Figures of Opposition-1 |
Authors | A. Mani |
Abstract | In one perspective, the main theme of this research revolves around the inverse problem in the context of general rough sets, which concerns the existence of a rough basis for given approximations in a context. Granular operator spaces and variants were recently introduced by the present author as an optimal framework for antichain-based algebraic semantics of general rough sets and the inverse problem. In this framework, various sub-types of crisp and non-crisp objects are identifiable that may be missed in more restrictive formalisms. This is also because in the latter cases concepts of complementation and negation are taken for granted, while in reality they have a complicated dialectical basis. This motivates a general approach to dialectical rough sets building on previous work of the present author and figures of opposition. In this paper, dialectical rough logics are invented from a semantic perspective, a concept of dialectical predicates is formalised, connections with dialetheias and glutty negation are established, and parthood is analyzed and studied from the viewpoint of classical and dialectical figures of opposition. Her methods become more geometrical and encompass parthood as a primary relation (as opposed to roughly equivalent objects) for algebraic semantics. |
Tasks | |
Published | 2017-03-29 |
URL | http://arxiv.org/abs/1703.10251v2 |
http://arxiv.org/pdf/1703.10251v2.pdf | |
PWC | https://paperswithcode.com/paper/dialectical-rough-sets-parthood-and-figures |
Repo | |
Framework | |
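For readers new to the rough-set machinery the paper generalizes, a hedged refresher of the classical Pawlak approximations (the paper itself works in the far more general setting of granular operator spaces, not reproduced here):

```python
def rough_approximations(equiv_classes, target):
    """Classical Pawlak lower/upper approximations of `target` w.r.t. a partition."""
    lower, upper = set(), set()
    for c in equiv_classes:
        if c <= target:   # granule entirely inside the target set
            lower |= c
        if c & target:    # granule overlapping the target set
            upper |= c
    return lower, upper

# Partition of {1..6} into granules; approximate the non-crisp set {1, 2, 3}.
classes = [{1, 2}, {3, 4}, {5, 6}]
lower, upper = rough_approximations(classes, {1, 2, 3})
print(lower, upper)   # {1, 2} and {1, 2, 3, 4}: non-empty boundary {3, 4}
```

A set is crisp when lower and upper approximations coincide; the inverse problem asks, roughly, whether given approximation operators can arise from such an underlying granulation.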
Learning to Singulate Objects using a Push Proposal Network
Title | Learning to Singulate Objects using a Push Proposal Network |
Authors | Andreas Eitel, Nico Hauff, Wolfram Burgard |
Abstract | Learning to act in unstructured environments, such as cluttered piles of objects, poses a substantial challenge for manipulation robots. We present a novel neural network-based approach that separates unknown objects in clutter by selecting favourable push actions. Our network is trained from data collected through autonomous interaction of a PR2 robot with randomly organized tabletop scenes. The model is designed to propose meaningful push actions based on over-segmented RGB-D images. We evaluate our approach by singulating up to 8 unknown objects in clutter. We demonstrate that our method enables the robot to perform the task with a high success rate and a low number of required push actions. Our results based on real-world experiments show that our network is able to generalize to novel objects of various sizes and shapes, as well as to arbitrary object configurations. Videos of our experiments can be viewed at http://robotpush.cs.uni-freiburg.de |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08101v2 |
http://arxiv.org/pdf/1707.08101v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-singulate-objects-using-a-push |
Repo | |
Framework | |
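A hedged architectural sketch (hypothetical sizes and output head; the trained PR2 pipeline and the over-segmentation step are not reproduced): a small convolutional network that scores candidate push actions from an RGB-D crop, in the spirit of a push proposal network.

```python
import torch
import torch.nn as nn

class PushProposalNet(nn.Module):
    """Scores candidate push directions from a crop around the push start point."""
    def __init__(self, n_directions=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, 5, stride=2), nn.ReLU(),   # 4 channels: RGB + depth
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.score = nn.Linear(32, n_directions)        # one score per direction

    def forward(self, crop):
        return self.score(self.features(crop))

net = PushProposalNet()
scores = net(torch.randn(1, 4, 64, 64))
print(scores.argmax(dim=1))   # execute the highest-scoring push
```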
Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection
Title | Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection |
Authors | Taku Kato, Takahiro Shinozaki |
Abstract | Speech recognition systems have achieved high recognition performance for several tasks. However, the performance of such systems depends on tremendously costly development work: preparing vast amounts of task-matched transcribed speech data for supervised training. The key problem here is the cost of transcribing speech data, which is incurred repeatedly to support new languages and new tasks. Assuming broad network services for transcribing speech data for many users, a system would become more self-sufficient and more useful if it possessed the ability to learn from very light feedback from the users without annoying them. In this paper, we propose a general reinforcement learning framework for speech recognition systems based on the policy gradient method. As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method. The proposed framework provides a new view of several existing training and adaptation methods. The experimental results show that the proposed method improves the recognition performance compared to unsupervised adaptation. |
Tasks | Speech Recognition |
Published | 2017-11-10 |
URL | http://arxiv.org/abs/1711.03689v1 |
http://arxiv.org/pdf/1711.03689v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-of-speech-recognition |
Repo | |
Framework | |
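A hedged numpy sketch of the hypothesis-selection idea (the reward signal, features, and learning rate are hypothetical): a softmax policy over an N-best list, updated with the REINFORCE policy-gradient rule from light binary feedback.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(3)                       # weights over per-hypothesis features

def policy(features):                     # features: (n_best, n_features)
    logits = features @ theta
    e = np.exp(logits - logits.max())
    return e / e.sum()

for step in range(1000):
    feats = rng.standard_normal((5, 3))   # hypothetical 5-best hypothesis features
    probs = policy(feats)
    k = rng.choice(5, p=probs)            # present hypothesis k to the user
    reward = 1.0 if k == feats[:, 0].argmax() else 0.0  # stand-in user feedback
    # REINFORCE: grad of log pi(k) = phi_k - E_pi[phi]
    grad_log = feats[k] - probs @ feats
    theta += 0.1 * reward * grad_log

print(theta)   # the weight on feature 0 dominates after training
```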
Biologically inspired protection of deep networks from adversarial attacks
Title | Biologically inspired protection of deep networks from adversarial attacks |
Authors | Aran Nayebi, Surya Ganguli |
Abstract | Inspired by biophysical principles underlying nonlinear dendritic computation in neural circuits, we develop a scheme to train deep neural networks to make them robust to adversarial attacks. Our scheme generates highly nonlinear, saturated neural networks that achieve state of the art performance on gradient based adversarial examples on MNIST, despite never being exposed to adversarially chosen examples during training. Moreover, these networks exhibit unprecedented robustness to targeted, iterative schemes for generating adversarial examples, including second-order methods. We further identify principles governing how these networks achieve their robustness, drawing on methods from information geometry. We find these networks progressively create highly flat and compressed internal representations that are sensitive to very few input dimensions, while still solving the task. Moreover, they employ highly kurtotic weight distributions, also found in the brain, and we demonstrate how such kurtosis can protect even linear classifiers from adversarial attack. |
Tasks | Adversarial Attack |
Published | 2017-03-27 |
URL | http://arxiv.org/abs/1703.09202v1 |
http://arxiv.org/pdf/1703.09202v1.pdf | |
PWC | https://paperswithcode.com/paper/biologically-inspired-protection-of-deep |
Repo | |
Framework | |
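The attack these saturated networks are evaluated against is gradient-based; below is a hedged sketch of the standard fast gradient sign method (FGSM, Goodfellow et al.) with a stand-in classifier. This is the attack, not the paper's biologically inspired defense.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))  # stand-in MNIST classifier
x = torch.rand(1, 1, 28, 28, requires_grad=True)         # input image
y = torch.tensor([3])                                    # true label

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

eps = 0.1                                                # perturbation budget (assumed)
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()   # one-step adversarial example
```

The paper's claim is that strongly saturated networks blunt exactly this gradient signal, since flat, compressed representations leave the attacker little usable gradient.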
Fine-Grained Categorization via CNN-Based Automatic Extraction and Integration of Object-Level and Part-Level Features
Title | Fine-Grained Categorization via CNN-Based Automatic Extraction and Integration of Object-Level and Part-Level Features |
Authors | Ting Sun, Lin Sun, Dit-Yan Yeung |
Abstract | Fine-grained categorization can benefit from part-based features which reveal subtle visual differences between object categories. Handcrafted features have been widely used for part detection and classification. Although a recent trend seeks to learn such features automatically using powerful deep learning models such as convolutional neural networks (CNN), their training and possibly also testing require manually provided annotations which are costly to obtain. To relax these requirements, we assume in this study a general problem setting in which the raw images are only provided with object-level class labels for model training with no other side information needed. Specifically, by extracting and interpreting the hierarchical hidden layer features learned by a CNN, we propose an elaborate CNN-based system for fine-grained categorization. When evaluated on the Caltech-UCSD Birds-200-2011, FGVC-Aircraft, Cars and Stanford dogs datasets under the setting that only object-level class labels are used for training and no other annotations are available for both training and testing, our method achieves impressive performance that is superior or comparable to the state of the art. Moreover, it sheds some light on ingenious use of the hierarchical features learned by CNN which has wide applicability well beyond the current fine-grained categorization task. |
Tasks | |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07397v1 |
http://arxiv.org/pdf/1706.07397v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-categorization-via-cnn-based |
Repo | |
Framework | |
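The method builds on interpreting a CNN's hierarchical hidden-layer features. A hedged sketch of extracting such features with forward hooks on a pretrained torchvision model (the layer choice is an assumption for illustration, not the paper's configuration):

```python
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
features = {}

def save(name):
    return lambda module, inp, out: features.update({name: out.detach()})

# Tap an early layer (part-level cues) and a late layer (object-level cues).
model.layer1.register_forward_hook(save("early"))
model.layer4.register_forward_hook(save("late"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))
print({k: tuple(v.shape) for k, v in features.items()})
```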
Unbiased estimates for linear regression via volume sampling
Title | Unbiased estimates for linear regression via volume sampling |
Authors | Michał Dereziński, Manfred K. Warmuth |
Abstract | Given a full-rank matrix $X$ with more columns than rows, consider the task of estimating the pseudo-inverse $X^+$ based on the pseudo-inverse of a sampled subset of columns (of size at least the number of rows). We show that this is possible if the subset of columns is chosen proportional to the squared volume spanned by the rows of the chosen submatrix (i.e., volume sampling). The resulting estimator is unbiased and, surprisingly, the covariance of the estimator also has a closed form: it equals a specific factor times $X^{+\top}X^+$. The pseudo-inverse plays an important part in solving the linear least squares problem, where we try to predict a label for each column of $X$. We assume labels are expensive and we are only given the labels for the small subset of columns we sample from $X$. Using our methods, we show that the weight vector of the solution for the subproblem is an unbiased estimator of the optimal solution for the whole problem based on all column labels. We believe that these new formulas establish a fundamental connection between linear least squares and volume sampling. We use our methods to obtain an algorithm for volume sampling that is faster than the state of the art, and to obtain bounds for the total loss of the estimated least-squares solution on all labeled columns. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.06908v5 |
http://arxiv.org/pdf/1705.06908v5.pdf | |
PWC | https://paperswithcode.com/paper/unbiased-estimates-for-linear-regression-via |
Repo | |
Framework | |
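A hedged numerical check of the headline unbiasedness claim on a tiny matrix: enumerate all column subsets of size equal to the number of rows, weight each by its squared volume, and compare the weighted average of the zero-padded subset pseudo-inverses with $X^+$.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
d, n = 2, 5
X = rng.standard_normal((d, n))          # full rank, more columns than rows

subsets = list(combinations(range(n), d))
weights = np.array([np.linalg.det(X[:, s]) ** 2 for s in subsets])
weights /= weights.sum()                  # the volume-sampling distribution

expectation = np.zeros((n, d))
for s, w in zip(subsets, weights):
    padded = np.zeros((n, d))
    padded[list(s), :] = np.linalg.pinv(X[:, s])  # embed rows at positions s
    expectation += w * padded

print(np.allclose(expectation, np.linalg.pinv(X)))   # True: the estimator is unbiased
```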
Integrated Model, Batch and Domain Parallelism in Training Neural Networks
Title | Integrated Model, Batch and Domain Parallelism in Training Neural Networks |
Authors | Amir Gholami, Ariful Azad, Peter Jin, Kurt Keutzer, Aydin Buluc |
Abstract | We propose a new integrated method of exploiting model, batch and domain parallelism for the training of deep neural networks (DNNs) on large distributed-memory computers using minibatch stochastic gradient descent (SGD). Our goal is to find an efficient parallelization strategy for a fixed batch size using $P$ processes. Our method is inspired by the communication-avoiding algorithms in numerical linear algebra. We see $P$ processes as logically divided into a $P_r \times P_c$ grid where the $P_r$ dimension is implicitly responsible for model/domain parallelism and the $P_c$ dimension is implicitly responsible for batch parallelism. In practice, the integrated matrix-based parallel algorithm encapsulates these types of parallelism automatically. We analyze the communication complexity and analytically demonstrate that the lowest communication costs are often achieved neither with pure model nor with pure data parallelism. We also show how the domain parallel approach can help in extending the theoretical scaling limit of the typical batch parallel method. |
Tasks | |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04432v4 |
http://arxiv.org/pdf/1712.04432v4.pdf | |
PWC | https://paperswithcode.com/paper/integrated-model-batch-and-domain-parallelism |
Repo | |
Framework | |
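A hedged sketch of the process-grid bookkeeping the abstract describes (pure indexing, no communication library; the group semantics are an assumption): $P$ ranks arranged in a $P_r \times P_c$ grid, with columns acting as model/domain-parallel groups and rows as batch-parallel groups.

```python
def grid_groups(P, Pr, Pc):
    """Split P ranks into a Pr x Pc grid; return model- and batch-parallel groups."""
    assert Pr * Pc == P
    coords = {r: (r // Pc, r % Pc) for r in range(P)}
    # Model/domain parallelism along the Pr dimension: fixed column, size Pr.
    model_groups = [[r for r in range(P) if coords[r][1] == j] for j in range(Pc)]
    # Batch parallelism along the Pc dimension: fixed row, size Pc.
    batch_groups = [[r for r in range(P) if coords[r][0] == i] for i in range(Pr)]
    return model_groups, batch_groups

# 8 processes as a 2 x 4 grid: each column shards the model across 2 ranks,
# each row averages gradients over its 4 minibatch shards.
model_g, batch_g = grid_groups(8, 2, 4)
print(model_g)   # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(batch_g)   # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

The paper's analysis is about which grid shape minimizes communication; this sketch only shows the rank-to-group mapping such an analysis would choose between.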
A Bottom Up Procedure for Text Line Segmentation of Latin Script
Title | A Bottom Up Procedure for Text Line Segmentation of Latin Script |
Authors | Himanshu Jain, Archana Praveen Kumar |
Abstract | In this paper we present a bottom up procedure for segmentation of text lines written or printed in the Latin script. The proposed method uses a combination of image morphology, feature extraction and Gaussian mixture model to perform this task. The experimental results show the validity of the procedure. |
Tasks | |
Published | 2017-10-09 |
URL | http://arxiv.org/abs/1710.03027v1 |
http://arxiv.org/pdf/1710.03027v1.pdf | |
PWC | https://paperswithcode.com/paper/a-bottom-up-procedure-for-text-line |
Repo | |
Framework | |
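A hedged reconstruction of a generic bottom-up pipeline of the kind the abstract outlines (the structuring-element size and line count are assumptions, not the paper's settings): morphological smearing to merge characters, connected components, and a Gaussian mixture over component centroids to group blobs into lines.

```python
import numpy as np
from scipy import ndimage
from sklearn.mixture import GaussianMixture

def segment_lines(binary, n_lines, smear_width=15):
    """binary: 2-D array with 1 = ink. Returns a line label per text blob."""
    # Horizontal dilation merges neighbouring characters into word blobs.
    smeared = ndimage.binary_dilation(binary, structure=np.ones((1, smear_width)))
    labels, n = ndimage.label(smeared)
    # One feature per blob: its vertical centroid; a 1-D GMM groups blobs
    # into text lines (assumes at least n_lines blobs were found).
    centroids = ndimage.center_of_mass(smeared, labels, range(1, n + 1))
    ys = np.array([c[0] for c in centroids]).reshape(-1, 1)
    return GaussianMixture(n_components=n_lines, random_state=0).fit_predict(ys)
```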
Exploring Generalization in Deep Learning
Title | Exploring Generalization in Deep Learning |
Authors | Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester, Nathan Srebro |
Abstract | With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization, highlighting the importance of scale normalization, and making a connection between sharpness and PAC-Bayes theory. We then investigate how well the measures explain different observed phenomena. |
Tasks | |
Published | 2017-06-27 |
URL | http://arxiv.org/abs/1706.08947v2 |
http://arxiv.org/pdf/1706.08947v2.pdf | |
PWC | https://paperswithcode.com/paper/exploring-generalization-in-deep-learning |
Repo | |
Framework | |
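A hedged sketch of the norm-based capacity measures this line of work examines (the paper's specific scale normalizations, e.g. margin scaling, are not reproduced): products of per-layer norms over a network's weight matrices, here on random stand-in weights.

```python
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.standard_normal((100, 100)) * 0.1 for _ in range(4)]  # stand-in weights

# Two norm-based measures commonly compared across trained networks:
frob_product     = np.prod([np.linalg.norm(W, "fro") for W in layers])
spectral_product = np.prod([np.linalg.norm(W, 2) for W in layers])  # largest singular value
print(f"prod ||W||_F = {frob_product:.3g}, prod ||W||_2 = {spectral_product:.3g}")
```

The paper's point is that such measures must be scale-normalized to meaningfully track generalization, which this raw computation deliberately omits.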
Bootstrapping the Out-of-sample Predictions for Efficient and Accurate Cross-Validation
Title | Bootstrapping the Out-of-sample Predictions for Efficient and Accurate Cross-Validation |
Authors | Ioannis Tsamardinos, Elissavet Greasidou, Michalis Tsagris, Giorgos Borboudakis |
Abstract | Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best configuration is optimistically biased. We present an efficient bootstrap method that corrects for the bias, called Bootstrap Bias Corrected CV (BBC-CV). BBC-CV’s main idea is to bootstrap the whole process of selecting the best-performing configuration on the out-of-sample predictions of each configuration, without additional training of models. In comparison to the alternatives, namely nested cross-validation and a method by Tibshirani and Tibshirani, BBC-CV is computationally more efficient, has smaller variance and bias, and is applicable to any metric of performance (accuracy, AUC, concordance index, mean squared error). Subsequently, we employ the idea of bootstrapping the out-of-sample predictions again, this time to speed up the CV process: using a bootstrap-based hypothesis test, we stop training models on new folds for configurations that are statistically significantly inferior. We name this method Bootstrap Corrected with Early Dropping CV (BCED-CV); it is both efficient and provides accurate performance estimates. |
Tasks | |
Published | 2017-08-23 |
URL | http://arxiv.org/abs/1708.07180v2 |
http://arxiv.org/pdf/1708.07180v2.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-the-out-of-sample-predictions |
Repo | |
Framework | |
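A hedged sketch of the BBC-CV correction as described above (the pooled out-of-sample predictions are assumed to be available as a matrix, and accuracy is used as the metric): bootstrap the configuration-selection step on the out-of-sample predictions, then score the winner on the samples left out of each bootstrap.

```python
import numpy as np

def bbc_cv(oos_pred, y, n_boot=1000, seed=0):
    """oos_pred: (n_samples, n_configs) pooled out-of-sample predictions.
    Returns a bias-corrected accuracy estimate for the selected configuration."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        b = rng.integers(0, n, n)                     # bootstrap sample of rows
        out = np.setdiff1d(np.arange(n), b)           # rows not drawn
        if out.size == 0:
            continue
        acc_in = (oos_pred[b] == y[b, None]).mean(axis=0)
        best = acc_in.argmax()                        # select config on the bootstrap
        estimates.append((oos_pred[out, best] == y[out]).mean())  # score held out
    return float(np.mean(estimates))
```

No model is retrained inside the loop, which is exactly what makes the correction cheap relative to nested cross-validation.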