April 1, 2020

3081 words 15 mins read

Paper Group ANR 444

Progressive Identification of True Labels for Partial-Label Learning. LoCEC: Local Community-based Edge Classification in Large Online Social Networks. Preferential Batch Bayesian Optimization. An Ontology-based Context Model in Intelligent Environments. Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization. A Diffusion Theory for De …

Progressive Identification of True Labels for Partial-Label Learning

Title Progressive Identification of True Labels for Partial-Label Learning
Authors Jiaqi Lv, Miao Xu, Lei Feng, Gang Niu, Xin Geng, Masashi Sugiyama
Abstract Partial-label learning is one of the important weakly supervised learning problems, where each training example is equipped with a set of candidate labels that contains the true label. Most existing methods elaborately design learning objectives as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data. The goal of this paper is to propose a novel framework of partial-label learning without implicit assumptions on the model or optimization algorithm. More specifically, we propose a general estimator of the classification risk, theoretically analyze the classifier-consistency, and establish an estimation error bound. We then explore a progressive identification method for approximately minimizing the proposed risk estimator, where the update of the model and the identification of true labels are conducted in a seamless manner. The resulting algorithm is model-independent and loss-independent, and compatible with stochastic optimization. Thorough experiments demonstrate that it sets the new state of the art.
Tasks Stochastic Optimization
Published 2020-02-19
URL https://arxiv.org/abs/2002.08053v1
PDF https://arxiv.org/pdf/2002.08053v1.pdf
PWC https://paperswithcode.com/paper/progressive-identification-of-true-labels-for
Repo
Framework
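
The progressive identification idea in this abstract is simple enough to sketch. Below is a minimal, hypothetical PyTorch rendering, assuming a standard multi-class model: candidate-label weights are re-estimated from the current model's (detached) probabilities and plugged into a weighted cross-entropy, so label identification and model updates proceed in lockstep with stochastic optimization. Function and variable names are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def partial_label_loss(logits, candidate_mask):
    """logits: (batch, num_classes); candidate_mask: 0/1 tensor of the same
    shape marking each example's candidate label set."""
    # Restrict the model's probabilities to the candidate set.
    probs = F.softmax(logits, dim=1) * candidate_mask
    # Progressive identification: weight each candidate by the model's
    # renormalized confidence; detach so the weights act as soft targets.
    weights = (probs / probs.sum(dim=1, keepdim=True)).detach()
    log_probs = F.log_softmax(logits, dim=1)
    return -(weights * log_probs).sum(dim=1).mean()
```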

LoCEC: Local Community-based Edge Classification in Large Online Social Networks

Title LoCEC: Local Community-based Edge Classification in Large Online Social Networks
Authors Chonggang Song, Qian Lin, Guohui Ling, Zongyi Zhang, Hongzhao Chen, Jun Liao, Chuan Chen
Abstract Relationships in online social networks often imply social connections in the real world. An accurate understanding of relationship types benefits many applications, e.g., social advertising and recommendation. Some recent approaches attempt to classify user relationships into predefined types with the help of pre-labeled relationships or abundant interaction features on relationships. Unfortunately, both relationship feature data and label data are very sparse in real social platforms like WeChat, rendering existing methods inapplicable. In this paper, we present an in-depth analysis of WeChat relationships to identify the major challenges for the relationship classification task. To tackle these challenges, we propose a Local Community-based Edge Classification (LoCEC) framework that classifies user relationships in a social network into real-world social connection types. LoCEC follows a three-phase process, namely local community detection, community classification, and relationship classification, to address the sparsity of relationship features and relationship labels. Moreover, LoCEC is designed to handle large-scale networks by allowing parallel and distributed processing. We conduct extensive experiments on the real-world WeChat network with hundreds of billions of edges to validate the effectiveness and efficiency of LoCEC.
Tasks Community Detection, Local Community Detection
Published 2020-02-11
URL https://arxiv.org/abs/2002.04180v2
PDF https://arxiv.org/pdf/2002.04180v2.pdf
PWC https://paperswithcode.com/paper/locec-local-community-based-edge
Repo
Framework
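
The three-phase pipeline reads like a recipe, so here is a highly simplified skeleton of how it might look, assuming networkx for graph handling. Global Louvain stands in for the paper's local community detection, and the two classifiers are user-supplied placeholders rather than anything from the paper.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

def classify_edges(G, community_classifier, edge_classifier):
    # Phase 1: community detection (global Louvain as a stand-in for
    # the paper's *local* community detection around each user).
    communities = louvain_communities(G, seed=0)
    node2comm = {v: i for i, comm in enumerate(communities) for v in comm}
    # Phase 2: classify each community into a social-connection type.
    comm_type = {i: community_classifier(G.subgraph(c))
                 for i, c in enumerate(communities)}
    # Phase 3: label each edge from its endpoints' community types.
    return {(u, v): edge_classifier(comm_type[node2comm[u]],
                                    comm_type[node2comm[v]])
            for u, v in G.edges()}
```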

Preferential Batch Bayesian Optimization

Title Preferential Batch Bayesian Optimization
Authors Eero Siivola, Akash Kumar Dhaka, Michael Riis Andersen, Javier Gonzalez, Pablo Garcia Moreno, Aki Vehtari
Abstract Most research in Bayesian optimization (BO) has focused on direct feedback scenarios, where one has access to exact, or perturbed, values of some expensive-to-evaluate objective. This direction has been mainly driven by the use of BO in machine learning hyper-parameter configuration problems. However, in domains such as modelling human preferences, A/B tests, or recommender systems, there is a need for methods that are able to replace direct feedback with preferential feedback, obtained via rankings or pairwise comparisons. In this work, we present Preferential Batch Bayesian Optimization (PBBO), a new framework that finds the optimum of a latent function of interest given any type of parallel preferential feedback for a group of two or more points. We do so by using a Gaussian process model with a likelihood specially designed to enable parallel and efficient data collection mechanisms, which are key in modern machine learning. We show how the acquisitions developed under this framework generalize and augment previous approaches in Bayesian optimization, expanding the use of these techniques to a wider range of domains. An extensive simulation study shows the benefits of this approach, both with simulated functions and four real data sets.
Tasks Recommendation Systems
Published 2020-03-25
URL https://arxiv.org/abs/2003.11435v1
PDF https://arxiv.org/pdf/2003.11435v1.pdf
PWC https://paperswithcode.com/paper/preferential-batch-bayesian-optimization
Repo
Framework
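
The abstract's "likelihood specially designed" for preferential feedback is not spelled out here, but preference-based GP methods commonly use a probit likelihood on latent utility differences. A minimal sketch of that standard building block follows; it is an assumption for illustration, not necessarily PBBO's exact likelihood.

```python
import numpy as np
from scipy.stats import norm

def preference_likelihood(f_x1, f_x2, noise_std=0.1):
    """P(x1 preferred over x2) under a probit model: the latent utilities
    f(x1), f(x2) are compared after adding Gaussian observation noise."""
    return norm.cdf((f_x1 - f_x2) / (np.sqrt(2.0) * noise_std))
```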

An Ontology-based Context Model in Intelligent Environments

Title An Ontology-based Context Model in Intelligent Environments
Authors Tao Gu, Xiao Hang Wang, Hung Keng Pung, Da Qing Zhang
Abstract Computing is becoming increasingly mobile and pervasive; these changes imply that applications and services must be aware of and adapt to their changing contexts in highly dynamic environments. Today, building context-aware systems is a complex task due to the lack of appropriate infrastructure support in intelligent environments. A context-aware infrastructure requires an appropriate context model to represent, manipulate, and access context information. In this paper, we propose a formal context model based on an OWL ontology to address issues including semantic context representation, context reasoning and knowledge sharing, context classification, context dependency, and quality of context. The main benefit of this model is the ability to reason about various contexts. Based on our context model, we also present a Service-Oriented Context-Aware Middleware (SOCAM) architecture for building context-aware services.
Tasks
Published 2020-03-06
URL https://arxiv.org/abs/2003.05055v1
PDF https://arxiv.org/pdf/2003.05055v1.pdf
PWC https://paperswithcode.com/paper/an-ontology-based-context-model-in
Repo
Framework
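
To make the OWL-based modelling concrete, here is a toy context ontology built with rdflib; the vocabulary (Person, Location, locatedIn) is illustrative and not the paper's actual ontology.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

CTX = Namespace("http://example.org/context#")
g = Graph()

# Schema: people and locations, linked by a locatedIn property.
g.add((CTX.Person, RDF.type, OWL.Class))
g.add((CTX.Location, RDF.type, OWL.Class))
g.add((CTX.locatedIn, RDF.type, OWL.ObjectProperty))
g.add((CTX.locatedIn, RDFS.domain, CTX.Person))
g.add((CTX.locatedIn, RDFS.range, CTX.Location))

# A concrete context fact: Alice is located in the bedroom.
g.add((CTX.Alice, RDF.type, CTX.Person))
g.add((CTX.Bedroom, RDF.type, CTX.Location))
g.add((CTX.Alice, CTX.locatedIn, CTX.Bedroom))

print(g.serialize(format="turtle"))
```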

Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization

Title Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization
Authors Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan
Abstract Adaptivity is an important yet under-studied property in modern optimization theory. The gap between state-of-the-art theory and current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners to select algorithms that work broadly without tweaking the hyperparameters. In this work, blending the “geometrization” technique introduced by Lei & Jordan (2016) and the SARAH algorithm of Nguyen et al. (2017), we propose the Geometrized SARAH algorithm for non-convex finite-sum and stochastic optimization. Our algorithm provably achieves adaptivity to both the magnitude of the target accuracy and the Polyak-Łojasiewicz (PL) constant, if present. In addition, it simultaneously achieves the best-available convergence rate for non-PL objectives while outperforming existing algorithms for PL objectives.
Tasks Stochastic Optimization
Published 2020-02-13
URL https://arxiv.org/abs/2002.05359v1
PDF https://arxiv.org/pdf/2002.05359v1.pdf
PWC https://paperswithcode.com/paper/adaptivity-of-stochastic-gradient-methods-for
Repo
Framework
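
For readers unfamiliar with SARAH, the recursive gradient estimator at its core is compact; below is a minimal numpy sketch of one inner epoch. The geometrized outer loop with randomized epoch lengths, which is the paper's contribution, is omitted, and the function names are illustrative.

```python
import numpy as np

def sarah_epoch(grad_i, full_grad, x0, n, step_size, num_steps, seed=0):
    """One SARAH inner epoch.
    grad_i(x, i): gradient of the i-th component function at x.
    full_grad(x): full gradient at x (the epoch's anchor)."""
    rng = np.random.default_rng(seed)
    v = full_grad(x0)                       # anchor estimate
    x_prev, x = x0, x0 - step_size * v
    for _ in range(num_steps):
        i = rng.integers(n)
        # Recursive estimator: correct the previous estimate by the
        # change in a single component gradient.
        v = grad_i(x, i) - grad_i(x_prev, i) + v
        x_prev, x = x, x - step_size * v
    return x
```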

A Diffusion Theory for Deep Learning Dynamics: Stochastic Gradient Descent Escapes From Sharp Minima Exponentially Fast

Title A Diffusion Theory for Deep Learning Dynamics: Stochastic Gradient Descent Escapes From Sharp Minima Exponentially Fast
Authors Zeke Xie, Issei Sato, Masashi Sugiyama
Abstract Stochastic optimization algorithms, such as Stochastic Gradient Descent (SGD) and its variants, are the mainstream methods for training deep networks in practice. However, the theoretical mechanism behind gradient noise remains to be further investigated. Deep learning is known to find flat minima with a large neighboring region in parameter space, from which each weight vector has similar small error. In this paper, we focus on a fundamental problem in deep learning: how can deep learning usually find flat minima among so many minima? To answer this question, we develop a density diffusion theory (DDT) to reveal the fundamental dynamical mechanism of SGD and deep learning. More specifically, we study how the time to escape from loss valleys depends on the sharpness of minima, gradient noise, and hyperparameters. One of the most interesting findings is that stochastic gradient noise from SGD can help escape from sharp minima exponentially faster than from flat minima, while white noise can only help escape from sharp minima polynomially faster than from flat minima. We also find that large-batch training requires exponentially many iterations to pass through sharp minima and find flat minima. We present direct empirical evidence supporting the proposed theoretical results.
Tasks Stochastic Optimization
Published 2020-02-10
URL https://arxiv.org/abs/2002.03495v5
PDF https://arxiv.org/pdf/2002.03495v5.pdf
PWC https://paperswithcode.com/paper/a-diffusion-theory-for-deep-learning-dynamics
Repo
Framework
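
A toy one-dimensional experiment conveys the flavor of the claim (this is an illustration, not the paper's derivation): compare first-exit times from a quadratic well when the injected noise scales with curvature, a common approximation of stochastic gradient noise, versus constant white noise.

```python
import numpy as np

def mean_exit_time(sharpness, sgd_like_noise, lr=0.05, sigma=2.5,
                   trials=200, max_steps=20_000, seed=0):
    """Mean first-exit time from the well U(x) = sharpness * x**2 / 2,
    exiting when |x| > 1 (capped at max_steps)."""
    rng = np.random.default_rng(seed)
    times = []
    for _ in range(trials):
        x, t = 0.0, 0
        while abs(x) <= 1.0 and t < max_steps:
            # Curvature-scaled noise mimics the Hessian-proportional
            # covariance of stochastic gradient noise; the alternative
            # is constant (white) noise.
            std = sigma * np.sqrt(sharpness) if sgd_like_noise else sigma
            x += -lr * sharpness * x + lr * std * rng.standard_normal()
            t += 1
        times.append(t)
    return float(np.mean(times))

# Flat (sharpness=1) vs. sharp (sharpness=16) wells: in this toy, the
# sharp well is escaped quickly under curvature-scaled noise but is far
# stickier under white noise.
for sgd_like in (True, False):
    print(sgd_like, mean_exit_time(1.0, sgd_like), mean_exit_time(16.0, sgd_like))
```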

A Kernel Mean Embedding Approach to Reducing Conservativeness in Stochastic Programming and Control

Title A Kernel Mean Embedding Approach to Reducing Conservativeness in Stochastic Programming and Control
Authors Jia-Jie Zhu, Bernhard Schölkopf, Moritz Diehl
Abstract We apply kernel mean embedding methods to sample-based stochastic optimization and control. Specifically, we use the reduced-set expansion method as a way to discard sampled scenarios. The effect of such constraint removal is improved optimality and decreased conservativeness. This is achieved by solving a distributional-distance-regularized optimization problem. We demonstrate that this optimization formulation is well motivated in theory, computationally tractable, and effective in numerical algorithms.
Tasks Stochastic Optimization
Published 2020-01-28
URL https://arxiv.org/abs/2001.10398v1
PDF https://arxiv.org/pdf/2001.10398v1.pdf
PWC https://paperswithcode.com/paper/a-kernel-mean-embedding-approach-to-reducing
Repo
Framework
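
The distributional distance underlying this kind of regularizer is the kernel mean embedding (MMD) distance; a small numpy sketch of its biased sample estimate is below (the reduced-set expansion step itself is omitted).

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between sample sets X (n, d) and Y (m, d)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def mmd_squared(X, Y, gamma=1.0):
    """Biased sample estimate of the squared MMD between X and Y."""
    return (rbf_kernel(X, X, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())
```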

Symplectic networks: Intrinsic structure-preserving networks for identifying Hamiltonian systems

Title Symplectic networks: Intrinsic structure-preserving networks for identifying Hamiltonian systems
Authors Pengzhan Jin, Aiqing Zhu, George Em Karniadakis, Yifa Tang
Abstract This work presents a framework for constructing neural networks that preserve the symplectic structure, called symplectic networks (SympNets). With symplectic networks, we show numerical results for (i) solving Hamiltonian systems by learning abundant data points over the phase space, and (ii) predicting phase flows by learning a series of points depending on time. All the experiments show that symplectic networks perform much better than fully-connected networks without any prior information, especially in the prediction task, which conventional numerical methods are unable to perform.
Tasks
Published 2020-01-11
URL https://arxiv.org/abs/2001.03750v1
PDF https://arxiv.org/pdf/2001.03750v1.pdf
PWC https://paperswithcode.com/paper/symplectic-networks-intrinsic-structure
Repo
Framework
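
A sketch of what a symplectic building block can look like, assuming the shear-style modules common in this literature: each module updates only p or only q, so its Jacobian is a symplectic shear, and compositions of such maps remain symplectic. This is an illustration, not necessarily the paper's exact parameterization.

```python
import numpy as np

def up_module(p, q, a):
    """Shear acting on momenta: (p, q) -> (p + a * tanh(q), q).
    Its q-derivative, diag(a * (1 - tanh(q)**2)), is symmetric,
    so the map is symplectic."""
    return p + a * np.tanh(q), q

def low_module(p, q, a):
    """Mirrored shear acting on coordinates q."""
    return p, q + a * np.tanh(p)

# Alternating up/low modules compose into a deeper map that is still
# symplectic, since compositions of symplectic maps are symplectic.
p, q = np.zeros(2), np.array([0.3, -0.5])
p, q = up_module(p, q, a=np.array([0.1, 0.2]))
p, q = low_module(p, q, a=np.array([0.05, 0.05]))
```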

A^2-GCN: An Attribute-aware Attentive GCN Model for Recommendation

Title A^2-GCN: An Attribute-aware Attentive GCN Model for Recommendation
Authors Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie
Abstract As important side information, attributes have been widely exploited in existing recommender systems for better performance. In real-world scenarios, it is common for some attributes of items/users to be missing (e.g., some movies miss the genre data). Prior studies usually use a default value (i.e., “other”) to represent the missing attribute, resulting in sub-optimal performance. To address this problem, in this paper, we present an attribute-aware attentive graph convolution network (A^2-GCN). In particular, we first construct a graph whereby users, items, and attributes are three types of nodes and their associations are edges. Thereafter, we leverage the graph convolution network to characterize the complicated interactions among <users, items, attributes>. To learn the node representations, we adopt a message-passing strategy to aggregate the messages passed from the other directly linked types of nodes (e.g., a user or an attribute). In this way, we are capable of incorporating associated attributes to strengthen the user and item representations, and thus naturally solve the attribute missing problem. Considering that for different users the attributes of an item have different influence on their preference for this item, we design a novel attention mechanism to filter the messages passed from an item to a target user by considering the attribute information. Extensive experiments have been conducted on several publicly accessible datasets to justify our model. Results show that our model outperforms several state-of-the-art methods and demonstrate the effectiveness of our attention method.
Tasks Recommendation Systems
Published 2020-03-20
URL https://arxiv.org/abs/2003.09086v1
PDF https://arxiv.org/pdf/2003.09086v1.pdf
PWC https://paperswithcode.com/paper/a2-gcn-an-attribute-aware-attentive-gcn-model
Repo
Framework
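
A schematic sketch of the attentive message aggregation described above, in PyTorch; shapes and the scoring function are illustrative assumptions rather than the paper's exact layer, with `att_linear` being, e.g., `torch.nn.Linear(2 * d, 1)`.

```python
import torch
import torch.nn.functional as F

def attentive_aggregate(user_vec, neighbor_vecs, att_linear):
    """user_vec: (d,); neighbor_vecs: (n, d) messages from directly
    linked items/attributes; att_linear: torch.nn.Linear(2 * d, 1)."""
    pairs = torch.cat([user_vec.expand_as(neighbor_vecs), neighbor_vecs], dim=-1)
    alpha = F.softmax(att_linear(pairs).squeeze(-1), dim=0)  # attention weights
    return (alpha.unsqueeze(-1) * neighbor_vecs).sum(dim=0)
```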

TensorShield: Tensor-based Defense Against Adversarial Attacks on Images

Title TensorShield: Tensor-based Defense Against Adversarial Attacks on Images
Authors Negin Entezari, Evangelos E. Papalexakis
Abstract Recent studies have demonstrated that machine learning approaches like deep neural networks (DNNs) are easily fooled by adversarial attacks. Subtle and imperceptible perturbations of the data are able to change the result of deep neural networks. This vulnerability of machine learning methods raises many concerns, especially in domains where security is an important factor. Therefore, it is crucial to design defense mechanisms against adversarial attacks. For the task of image classification, unnoticeable perturbations mostly occur in the high-frequency spectrum of the image. In this paper, we utilize tensor decomposition techniques as a preprocessing step to find a low-rank approximation of images which can significantly discard high-frequency perturbations. Recently a defense framework called Shield could “vaccinate” Convolutional Neural Networks (CNN) against adversarial examples by performing random-quality JPEG compressions on local patches of images on the ImageNet dataset. Our tensor-based defense mechanism outperforms the SLQ method from Shield by 14% against Fast Gradient Sign Method (FGSM) adversarial attacks, while maintaining comparable speed.
Tasks Image Classification
Published 2020-02-18
URL https://arxiv.org/abs/2002.10252v1
PDF https://arxiv.org/pdf/2002.10252v1.pdf
PWC https://paperswithcode.com/paper/tensorshield-tensor-based-defense-against
Repo
Framework
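
A simplified stand-in for the defense: replace each image channel by a rank-k approximation to discard high-frequency perturbations. The paper uses tensor decompositions; the per-channel SVD below is a cheaper matrix analogue for illustration.

```python
import numpy as np

def low_rank_denoise(image, k=20):
    """image: (H, W, C) float array; returns a per-channel rank-k
    SVD approximation, discarding the high-frequency residual."""
    out = np.empty_like(image)
    for c in range(image.shape[-1]):
        U, s, Vt = np.linalg.svd(image[..., c], full_matrices=False)
        out[..., c] = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return out
```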

What the [MASK]? Making Sense of Language-Specific BERT Models

Title What the [MASK]? Making Sense of Language-Specific BERT Models
Authors Debora Nozza, Federico Bianchi, Dirk Hovy
Abstract Recently, Natural Language Processing (NLP) has witnessed impressive progress in many areas, due to the advent of novel, pretrained contextual representation models. In particular, Devlin et al. (2019) proposed a model, called BERT (Bidirectional Encoder Representations from Transformers), which enables researchers to obtain state-of-the-art performance on numerous NLP tasks by fine-tuning the representations on their data set and task, without the need for developing and training highly-specific architectures. The authors also released multilingual BERT (mBERT), a model trained on a corpus of 104 languages, which can serve as a universal language model. This model obtained impressive results on a zero-shot cross-lingual natural language inference task. Driven by the potential of BERT models, the NLP community has started to investigate and generate an abundant number of BERT models that are trained on a particular language and tested on a specific data domain and task. This allows us to evaluate the true potential of mBERT as a universal language model, by comparing it to the performance of these more specific models. This paper presents the current state of the art in language-specific BERT models, providing an overall picture with respect to different dimensions (i.e., architectures, data domains, and tasks). Our aim is to provide an immediate and straightforward overview of the commonalities and differences between language-specific BERT models and mBERT. We also provide an interactive and constantly updated website that can be used to explore the information we have collected, at https://bertlang.unibocconi.it.
Tasks Language Modelling
Published 2020-03-05
URL https://arxiv.org/abs/2003.02912v1
PDF https://arxiv.org/pdf/2003.02912v1.pdf
PWC https://paperswithcode.com/paper/what-the-mask-making-sense-of-language
Repo
Framework
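
Many of the models surveyed are published on the Hugging Face hub, so comparing mBERT with a language-specific model takes only a few lines with the transformers library (model identifiers below are the published hub names):

```python
from transformers import AutoModel, AutoTokenizer

# Multilingual BERT, trained on 104 languages.
mbert = AutoModel.from_pretrained("bert-base-multilingual-cased")
mbert_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# A language-specific counterpart, e.g. German BERT.
gbert = AutoModel.from_pretrained("bert-base-german-cased")
gbert_tok = AutoTokenizer.from_pretrained("bert-base-german-cased")
```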

I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents

Title I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents
Authors Shrimai Prabhumoye, Margaret Li, Jack Urbanek, Emily Dinan, Douwe Kiela, Jason Weston, Arthur Szlam
Abstract Dialogue research tends to distinguish between chit-chat and goal-oriented tasks. While the former is arguably more naturalistic and has a wider use of language, the latter has clearer metrics and a straightforward learning signal. Humans effortlessly combine the two, for example engaging in chit-chat with the goal of exchanging information or eliciting a specific response. Here, we bridge the divide between these two domains in the setting of a rich multi-player text-based fantasy environment where agents and humans engage in both actions and dialogue. Specifically, we train a goal-oriented model with reinforcement learning against an imitation-learned “chit-chat” model with two approaches: the policy either learns to pick a topic or learns to pick an utterance given the top-K utterances from the chit-chat model. We show that both models outperform an inverse model baseline and can converse naturally with their dialogue partner in order to achieve goals.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02878v2
PDF https://arxiv.org/pdf/2002.02878v2.pdf
PWC https://paperswithcode.com/paper/i-love-your-chain-mail-making-knights-smile-1
Repo
Framework

Modality-Balanced Models for Visual Dialogue

Title Modality-Balanced Models for Visual Dialogue
Authors Hyounghun Kim, Hao Tan, Mohit Bansal
Abstract The Visual Dialog task requires a model to exploit both image and conversational context information to generate the next response in the dialogue. However, via manual analysis, we find that a large number of conversational questions can be answered by only looking at the image, without any access to the context history, while others still need the conversation context to predict the correct answers. We demonstrate that, for this reason, previous joint-modality (history and image) models over-rely on and are more prone to memorizing the dialogue history (e.g., by extracting certain keywords or patterns in the context information), whereas image-only models are more generalizable (because they cannot memorize or extract keywords from history) and perform substantially better on the primary normalized discounted cumulative gain (NDCG) task metric, which allows multiple correct answers. This observation encourages us to explicitly maintain two models, i.e., an image-only model and an image-history joint model, and combine their complementary abilities for a more balanced multimodal model. We present multiple methods for this integration of the two models, via ensemble and via consensus dropout fusion with shared parameters. Empirically, our models achieve strong results on the Visual Dialog challenge 2019 (rank 3 on NDCG and high balance across metrics), and substantially outperform the winner of the Visual Dialog challenge 2018 on most metrics.
Tasks Visual Dialog
Published 2020-01-17
URL https://arxiv.org/abs/2001.06354v1
PDF https://arxiv.org/pdf/2001.06354v1.pdf
PWC https://paperswithcode.com/paper/modality-balanced-models-for-visual-dialogue
Repo
Framework
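
A rough sketch of the two integration strategies named in the abstract, in PyTorch. Here `head` is a user-supplied callable over (image, history) features, and all details are schematic assumptions rather than the paper's exact modules.

```python
import torch
import torch.nn.functional as F

def ensemble_scores(img_only_logits, joint_logits):
    """Ensemble: average the two models' log-probabilities."""
    return (F.log_softmax(img_only_logits, -1)
            + F.log_softmax(joint_logits, -1)) / 2

def consensus_dropout_fusion(img_feat, hist_feat, head, p=0.3, training=True):
    """Consensus dropout fusion: average a prediction made from image
    features alone with one that also sees dropout-masked history."""
    img_logits = head(img_feat, torch.zeros_like(hist_feat))
    joint_logits = head(img_feat, F.dropout(hist_feat, p=p, training=training))
    return (img_logits + joint_logits) / 2
```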

Ensemble based discriminative models for Visual Dialog Challenge 2018

Title Ensemble based discriminative models for Visual Dialog Challenge 2018
Authors Shubham Agarwal, Raghav Goyal
Abstract This manuscript describes our approach for the Visual Dialog Challenge 2018. We use an ensemble of three discriminative models with different encoders and decoders for our final submission. Our best performing model on the ‘test-std’ split achieves an NDCG score of 55.46 and an MRR of 63.77, securing third position in the challenge.
Tasks Visual Dialog
Published 2020-01-15
URL https://arxiv.org/abs/2001.05865v1
PDF https://arxiv.org/pdf/2001.05865v1.pdf
PWC https://paperswithcode.com/paper/ensemble-based-discriminative-models-for
Repo
Framework

Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing

Title Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing
Authors Zhe Zeng, Paolo Morettin, Fanqi Yan, Antonio Vergari, Guy Van den Broeck
Abstract Weighted model integration (WMI) is a very appealing framework for probabilistic inference: it allows one to express the complex dependencies of real-world problems, where variables are both continuous and discrete, via the language of Satisfiability Modulo Theories (SMT), as well as to compute probabilistic queries with complex logical and arithmetic constraints. Yet, existing WMI solvers are not ready to scale to these problems. They either ignore the intrinsic dependency structure of the problem altogether, or they are limited to overly restrictive structures. To narrow this gap, we derive a factorized formalism of WMI enabling us to devise a scalable WMI solver based on message passing, MP-WMI. Namely, MP-WMI is the first WMI solver that can: 1) perform exact inference on the full class of tree-structured WMI problems; 2) compute all marginal densities in linear time; and 3) amortize inference across queries. Experimental results show that our solver dramatically outperforms existing WMI solvers on a large set of benchmarks.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2003.00126v1
PDF https://arxiv.org/pdf/2003.00126v1.pdf
PWC https://paperswithcode.com/paper/scaling-up-hybrid-probabilistic-inference
Repo
Framework