April 1, 2020

Paper Group ANR 485

On Newton Screening

Title On Newton Screening
Authors Jian Huang, Yuling Jiao, Lican Kang, Jin Liu, Yanyan Liu, Xiliang Lu, Yuanyuan Yang
Abstract Screening and working set techniques are important approaches to reducing the size of an optimization problem. They have been widely used in accelerating first-order methods for solving large-scale sparse learning problems. In this paper, we develop a new screening method called Newton screening (NS) which is a generalized Newton method with a built-in screening mechanism. We derive an equivalent KKT system for the Lasso and utilize a generalized Newton method to solve the KKT equations. Based on this KKT system, a built-in working set with a relatively small size is first determined using the sum of primal and dual variables generated from the previous iteration, then the primal variable is updated by solving a least-squares problem on the working set and the dual variable updated based on a closed-form expression. Moreover, we consider a sequential version of Newton screening (SNS) with a warm-start strategy. We show that NS possesses an optimal convergence property in the sense that it achieves one-step local convergence. Under certain regularity conditions on the feature matrix, we show that SNS hits a solution with the same signs as the underlying true target and achieves a sharp estimation error bound with high probability. Simulation studies and real data analysis support our theoretical results and demonstrate that SNS is faster and more accurate than several state-of-the-art methods in our comparative studies.
Tasks Sparse Learning
Published 2020-01-27
URL https://arxiv.org/abs/2001.10616v2
PDF https://arxiv.org/pdf/2001.10616v2.pdf
PWC https://paperswithcode.com/paper/on-newton-screening
Repo
Framework
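
The iteration described in the abstract (pick a working set from the sum of primal and dual variables, solve least squares on it, update the dual in closed form) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `1/(2n)` loss scaling, the threshold `|beta + d| > lam`, the starting point, and the function name `newton_screening_sketch` are all assumptions made for the example.

```python
import numpy as np

def newton_screening_sketch(X, y, lam, n_iter=20):
    """Illustrative active-set iteration for the Lasso
    min_beta (1/2n)||y - X beta||^2 + lam * ||beta||_1  (assumed scaling)."""
    n, p = X.shape
    beta = np.zeros(p)
    d = X.T @ (y - X @ beta) / n           # dual variable at the start
    for _ in range(n_iter):
        z = beta + d                       # sum of primal and dual variables
        active = np.abs(z) > lam           # working set for this iteration
        sign = np.sign(z[active])
        beta_new = np.zeros(p)
        if active.any():
            XA = X[:, active]
            gram = XA.T @ XA / n
            rhs = XA.T @ y / n - lam * sign
            # least-squares solve restricted to the working set
            beta_new[active] = np.linalg.lstsq(gram, rhs, rcond=None)[0]
        # closed-form dual update: residual correlation off the working set,
        # lam * sign on it
        d = X.T @ (y - X @ beta_new) / n
        d[active] = lam * sign
        if np.allclose(beta_new, beta):
            break
        beta = beta_new
    return beta_new
```

With an orthonormal-like design (`X.T @ X / n` equal to the identity) the sketch reduces to soft-thresholding of `X.T @ y / n`, which gives a quick sanity check.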

Learning High Order Feature Interactions with Fine Control Kernels

Title Learning High Order Feature Interactions with Fine Control Kernels
Authors Hristo Paskov, Alex Paskov, Robert West
Abstract We provide a methodology for learning sparse statistical models that use as features all possible multiplicative interactions among an underlying atomic set of features. While the resulting optimization problems are exponentially sized, our methodology leads to algorithms that can often solve these problems exactly or provide approximate solutions based on combining highly correlated features. We also introduce an algorithmic paradigm, the Fine Control Kernel framework, so named because it is based on Fenchel Duality and is reminiscent of kernel methods. Its theory is tailored to large sparse learning problems, and it leads to efficient feature screening rules for interactions. These rules are inspired by the Apriori algorithm for market basket analysis, which also falls under the purview of Fine Control Kernels, and can be applied to a plurality of learning problems including the Lasso and sparse matrix estimation. Experiments on biomedical datasets demonstrate the efficacy of our methodology in deriving algorithms that efficiently produce interaction models which achieve state-of-the-art accuracy and are interpretable.
Tasks Sparse Learning
Published 2020-02-09
URL https://arxiv.org/abs/2002.03298v1
PDF https://arxiv.org/pdf/2002.03298v1.pdf
PWC https://paperswithcode.com/paper/learning-high-order-feature-interactions-with
Repo
Framework

Referencing Source Code Artifacts: a Separate Concern in Software Citation

Title Referencing Source Code Artifacts: a Separate Concern in Software Citation
Authors Roberto Di Cosmo, Morane Gruenpeter, Stefano Zacchiroli
Abstract Among the entities involved in software citation, software source code requires special attention, due to the role it plays in ensuring scientific reproducibility. To reference source code we need identifiers that are not only unique and persistent, but also support integrity checking intrinsically. Suitable identifiers must guarantee that denoted objects will always stay the same, without relying on external third parties and administrative processes. We analyze the role of identifiers for digital objects (IDOs), whose properties are different from, and complementary to, those of the various digital identifiers of objects (DIOs) that are today popular building blocks of software and data citation toolchains. We argue that both kinds of identifiers are needed and detail the syntax, semantics, and practical implementation of the persistent identifiers (PIDs) adopted by the Software Heritage project to reference billions of software source code artifacts such as source code files, directories, and commits.
Tasks
Published 2020-01-23
URL https://arxiv.org/abs/2001.08647v1
PDF https://arxiv.org/pdf/2001.08647v1.pdf
PWC https://paperswithcode.com/paper/referencing-source-code-artifacts-a-separate
Repo
Framework
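
The abstract does not spell out the identifier syntax. As an illustration only, the sketch below assumes the publicly documented `swh:1:cnt:` form, in which a file's content is identified by the same SHA-1 that Git computes for a blob, so anyone holding the bytes can recompute and verify the identifier without a third party. The function name `swhid_for_content` is hypothetical.

```python
import hashlib

def swhid_for_content(data: bytes) -> str:
    """Compute an intrinsic identifier for raw file content.

    Assumption: the content identifier is the Git blob hash, i.e.
    SHA-1 over the header "blob <length>\\0" followed by the bytes.
    """
    header = b"blob %d\0" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"

def verify(data: bytes, identifier: str) -> bool:
    # Integrity check: recompute from the bytes and compare.
    return swhid_for_content(data) == identifier
```

Because the identifier is derived from the content itself, changing a single byte yields a different identifier, which is the integrity property the abstract asks for.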

The DIDI dataset: Digital Ink Diagram data

Title The DIDI dataset: Digital Ink Diagram data
Authors Philippe Gervais, Thomas Deselaers, Emre Aksan, Otmar Hilliges
Abstract We are releasing a dataset of diagram drawings with dynamic drawing information. The dataset aims to foster research in interactive graphical symbolic understanding. The dataset was obtained using a prompted data collection effort.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09303v2
PDF https://arxiv.org/pdf/2002.09303v2.pdf
PWC https://paperswithcode.com/paper/the-didi-dataset-digital-ink-diagram-data
Repo
Framework

Dynamic Incentive-aware Learning: Robust Pricing in Contextual Auctions

Title Dynamic Incentive-aware Learning: Robust Pricing in Contextual Auctions
Authors Negin Golrezaei, Adel Javanmard, Vahab Mirrokni
Abstract Motivated by pricing in ad exchange markets, we consider the problem of robust learning of reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers’ valuations for an item depend on the context that describes the item. However, the seller is not aware of the relationship between the context and buyers’ valuations, i.e., buyers’ preferences. The seller’s goal is to design a learning policy to set reserve prices via observing the past sales data, and her objective is to minimize her regret for revenue, where the regret is computed against a clairvoyant policy that knows buyers’ heterogeneous preferences. Given the seller’s goal, utility-maximizing buyers have the incentive to bid untruthfully in order to manipulate the seller’s learning policy. We propose learning policies that are robust to such strategic behavior. These policies use the outcomes of the auctions, rather than the submitted bids, to estimate the preferences while controlling the long-term effect of the outcome of each auction on the future reserve prices. When the market noise distribution is known to the seller, we propose a policy called Contextual Robust Pricing (CORP) that achieves a $T$-period regret of $O(d\log(Td) \log (T))$, where $d$ is the dimension of the contextual information. When the market noise distribution is unknown to the seller, we propose two policies whose regrets are sublinear in $T$.
Tasks
Published 2020-02-25
URL https://arxiv.org/abs/2002.11137v1
PDF https://arxiv.org/pdf/2002.11137v1.pdf
PWC https://paperswithcode.com/paper/dynamic-incentive-aware-learning-robust-1
Repo
Framework

A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction

Title A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction
Authors Beihao Xia, Conghao Wang, Qinmu Peng, Xinge You, Dacheng Tao
Abstract It remains challenging to automatically predict multi-agent trajectories due to multiple interactions, including agent-to-agent interaction and scene-to-agent interaction. Although recent methods have achieved promising performance, most of them consider only the spatial influence of the interactions and ignore the fact that temporal influence always accompanies spatial influence. Moreover, methods based on scene information always require extra segmented scene images to generate multiple socially acceptable trajectories. To address these limitations, we propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC). First, a spatial-temporal attention mechanism is presented to explore the most useful and important information. Second, we construct a joint feature sequence based on the sequence and instant state information so that the generated trajectories keep spatial continuity. Experiments are performed on the two widely used ETH-UCY datasets and demonstrate that the proposed model achieves state-of-the-art prediction accuracy and handles more complex scenarios.
Tasks Trajectory Prediction
Published 2020-03-13
URL https://arxiv.org/abs/2003.06107v2
PDF https://arxiv.org/pdf/2003.06107v2.pdf
PWC https://paperswithcode.com/paper/a-spatial-temporal-attentive-network-with
Repo
Framework

A general framework for causal classification

Title A general framework for causal classification
Authors Jiuyong Li, Weijia Zhang, Lin Liu, Kui Yu, Thuc Duy Le, Jixue Liu
Abstract In many applications, there is a need to predict the effect of an intervention on different individuals from data. For example, which customers are persuadable by a product promotion? Which groups would benefit from a new policy? These are typical causal classification questions involving the effect or the change in outcomes made by an intervention. The questions cannot be answered with traditional classification methods as they only deal with static outcomes. In marketing research these questions are often answered with uplift modelling, using experimental data. Some machine learning methods have been proposed for heterogeneous causal effect estimation using either experimental or observational data. In principle these methods can be used for causal classification, but the limited number of methods for causal heterogeneity modelling, mainly tree based, are inadequate for various real-world applications. In this paper, we propose a general framework for causal classification, as a generalisation of both uplift modelling and causal heterogeneity modelling. When developing the framework, we have identified the conditions where causal classification in both observational and experimental data can be resolved by a naive solution using off-the-shelf classification methods, which supports flexible implementations for various applications. This result not only enables a practical way to solve the causal classification problem by using any existing classification method in the proposed framework, but also makes it possible to cross-use the methods developed in both the uplift modelling and causal heterogeneity modelling areas when the conditions are satisfied. Experiments have shown that our framework with off-the-shelf classification methods is as competitive as the tailor-designed uplift modelling and heterogeneous causal effect modelling methods.
Tasks
Published 2020-03-25
URL https://arxiv.org/abs/2003.11940v2
PDF https://arxiv.org/pdf/2003.11940v2.pdf
PWC https://paperswithcode.com/paper/a-general-framework-for-causal-classification
Repo
Framework

Adaptation of Engineering Wake Models using Gaussian Process Regression and High-Fidelity Simulation Data

Title Adaptation of Engineering Wake Models using Gaussian Process Regression and High-Fidelity Simulation Data
Authors Leif Erik Andersson, Bart Doekemeijer, Daan van der Hoek, Jan-Willem van Wingerden, Lars Imsland
Abstract This article investigates the optimization of yaw control inputs of a nine-turbine wind farm. The wind farm is simulated using the high-fidelity simulator SOWFA. The optimization is performed with a modifier adaptation scheme based on Gaussian processes. Modifier adaptation corrects for the mismatch between plant and model and helps to converge to the actual plant optimum. In the case study the modifier adaptation approach is compared with the Bayesian optimization approach. Moreover, the use of two different covariance functions in the Gaussian process regression is discussed. Practical recommendations concerning the data preparation and application of the approach are given. It is shown that both the modifier adaptation and the Bayesian optimization approach can improve the power production with overall smaller yaw misalignments in comparison to the Gaussian wake model.
Tasks Gaussian Processes
Published 2020-03-30
URL https://arxiv.org/abs/2003.13323v1
PDF https://arxiv.org/pdf/2003.13323v1.pdf
PWC https://paperswithcode.com/paper/adaptation-of-engineering-wake-models-using
Repo
Framework

Operator inference for non-intrusive model reduction of systems with non-polynomial nonlinear terms

Title Operator inference for non-intrusive model reduction of systems with non-polynomial nonlinear terms
Authors Peter Benner, Pawan Goyal, Boris Kramer, Benjamin Peherstorfer, Karen Willcox
Abstract This work presents a non-intrusive model reduction method to learn low-dimensional models of dynamical systems with non-polynomial nonlinear terms that are spatially local and that are given in analytic form. In contrast to state-of-the-art model reduction methods that are intrusive and thus require full knowledge of the governing equations and the operators of a full model of the discretized dynamical system, the proposed approach requires only the non-polynomial terms in analytic form and learns the rest of the dynamics from snapshots computed with a potentially black-box full-model solver. The proposed method learns operators for the linear and polynomially nonlinear dynamics via a least-squares problem, where the given non-polynomial terms are incorporated in the right-hand side. The least-squares problem is linear and thus can be solved efficiently in practice. The proposed method is demonstrated on three problems governed by partial differential equations, namely the diffusion-reaction Chafee-Infante model, a tubular reactor model for reactive flows, and a batch-chromatography model that describes a chemical separation process. The numerical results provide evidence that the proposed approach learns reduced models whose accuracy is comparable to that of models constructed with state-of-the-art intrusive model reduction methods that require full knowledge of the governing equations.
Tasks
Published 2020-02-22
URL https://arxiv.org/abs/2002.09726v1
PDF https://arxiv.org/pdf/2002.09726v1.pdf
PWC https://paperswithcode.com/paper/operator-inference-for-non-intrusive-model
Repo
Framework
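
The least-squares step mentioned in the abstract can be illustrated with a deliberately simplified case: only a linear operator is learned, and the known non-polynomial term is moved to the right-hand side. The sketch assumes already-reduced snapshot matrices of shape `(r, k)`; the function name `infer_linear_operator` and its arguments are hypothetical, and the polynomial operators and the projection step from the paper are omitted.

```python
import numpy as np

def infer_linear_operator(states, derivatives, nonpoly_values):
    """Fit A in  dx/dt = A x + f(x)  from snapshots.

    states, derivatives, nonpoly_values: arrays of shape (r, k), where
    column j holds x(t_j), dx/dt(t_j) and the known term f(x(t_j)).
    """
    # Move the analytically known term to the right-hand side.
    rhs = derivatives - nonpoly_values
    # Solve min_A ||A @ states - rhs||_F via the transposed linear problem.
    A_T, *_ = np.linalg.lstsq(states.T, rhs.T, rcond=None)
    return A_T.T
```

Because the unknown operator enters linearly, a single call to a standard least-squares solver is enough; no iterative nonlinear fit is needed in this simplified setting.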

Faster Transformer Decoding: N-gram Masked Self-Attention

Title Faster Transformer Decoding: N-gram Masked Self-Attention
Authors Ciprian Chelba, Mia Chen, Ankur Bapna, Noam Shazeer
Abstract Motivated by the fact that most of the information relevant to the prediction of target tokens is drawn from the source sentence $S=s_1, \ldots, s_S$, we propose truncating the target-side window used for computing self-attention by making an $N$-gram assumption. Experiments on WMT EnDe and EnFr data sets show that the $N$-gram masked self-attention model loses very little in BLEU score for $N$ values in the range $4, \ldots, 8$, depending on the task.
Tasks
Published 2020-01-14
URL https://arxiv.org/abs/2001.04589v1
PDF https://arxiv.org/pdf/2001.04589v1.pdf
PWC https://paperswithcode.com/paper/faster-transformer-decoding-n-gram-masked
Repo
Framework
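
As a rough illustration of the truncated target-side window, the mask below lets each target position attend only to itself and the preceding `n - 1` positions. Whether the window of size N counts the current position is an assumption here, and `ngram_causal_mask` is a hypothetical helper, not code from the paper.

```python
import numpy as np

def ngram_causal_mask(seq_len: int, n: int) -> np.ndarray:
    """Boolean mask of shape (seq_len, seq_len).

    mask[i, j] is True when query position i may attend to key position j,
    i.e. j is not in the future and lies within the last n positions.
    """
    i = np.arange(seq_len)[:, None]   # query positions (rows)
    j = np.arange(seq_len)[None, :]   # key positions (columns)
    return (j <= i) & (j > i - n)

def apply_mask(scores: np.ndarray, mask: np.ndarray) -> np.ndarray:
    # Disallowed positions get -inf so that softmax assigns them zero weight.
    return np.where(mask, scores, -np.inf)
```

Setting `n` to at least `seq_len` recovers the ordinary causal mask, so the window size acts as the only extra knob in this sketch.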

Variable fusion for Bayesian linear regression via spike-and-slab priors

Title Variable fusion for Bayesian linear regression via spike-and-slab priors
Authors Shengyi Wu, Kaito Shimamura, Kohei Yoshikawa, Kazuaki Murayama, Shuichi Kawano
Abstract In linear regression models, a fusion of the coefficients is used to identify the predictors having similar relationships with the response. This is called variable fusion. This paper presents a novel variable fusion method in terms of Bayesian linear regression models. We focus on hierarchical Bayesian models based on a spike-and-slab prior approach. A spike-and-slab prior is designed to perform variable fusion. To obtain estimates of parameters, we develop a Gibbs sampler for the parameters. Simulation studies and a real data analysis show that our proposed method performs better than previous methods.
Tasks
Published 2020-03-30
URL https://arxiv.org/abs/2003.13299v1
PDF https://arxiv.org/pdf/2003.13299v1.pdf
PWC https://paperswithcode.com/paper/variable-fusion-for-bayesian-linear
Repo
Framework

Towards Palmprint Verification On Smartphones

Title Towards Palmprint Verification On Smartphones
Authors Yingyi Zhang, Lin Zhang, Ruixin Zhang, Shaoxin Li, Jilin Li, Feiyue Huang
Abstract With the rapid development of mobile devices, smartphones have gradually become an indispensable part of people’s lives. Meanwhile, biometric authentication has been corroborated to be an effective method for establishing a person’s identity with high confidence. Hence, recently, biometric technologies for smartphones have also become increasingly sophisticated and popular. But it is noteworthy that the application potential of palmprints for smartphones is seriously underestimated. Studies in the past two decades have shown that palmprints have outstanding merits in uniqueness and permanence, and have high user acceptance. However, currently, studies specializing in palmprint verification for smartphones are still quite sporadic, especially when compared to face- or fingerprint-oriented ones. In this paper, aiming to fill the aforementioned research gap, we conducted a thorough study of palmprint verification on smartphones and our contributions are twofold. First, to facilitate the study of palmprint verification on smartphones, we established an annotated palmprint dataset named MPD, which was collected by multi-brand smartphones in two separate sessions with various backgrounds and illumination conditions. As the largest dataset in this field, MPD contains 16,000 palm images collected from 200 subjects. Second, we built a DCNN-based palmprint verification system named DeepMPV+ for smartphones. In DeepMPV+, two key steps, ROI extraction and ROI matching, are both formulated as learning problems and then solved naturally by modern DCNN models. The efficiency and efficacy of DeepMPV+ have been corroborated by extensive experiments. To make our results fully reproducible, the labeled dataset and the relevant source codes have been made publicly available at https://cslinzhang.github.io/MobilePalmPrint/.
Tasks
Published 2020-03-30
URL https://arxiv.org/abs/2003.13266v1
PDF https://arxiv.org/pdf/2003.13266v1.pdf
PWC https://paperswithcode.com/paper/towards-palmprint-verification-on-smartphones
Repo
Framework

Machine Learning Approaches for Amharic Parts-of-speech Tagging

Title Machine Learning Approaches for Amharic Parts-of-speech Tagging
Authors Ibrahim Gashaw, H L. Shashirekha
Abstract Part-of-speech (POS) tagging is considered one of the basic but necessary tools required for many Natural Language Processing (NLP) applications such as word sense disambiguation, information retrieval, information processing, parsing, question answering, and machine translation. The performance of current POS taggers for Amharic is not as good as that of the contemporary POS taggers available for English and other European languages. The aim of this work is to improve POS tagging performance for the Amharic language, which was never above 91%. We examine the use of morphological knowledge, an extension of the existing annotated data, feature extraction, parameter tuning by grid search, and the tagging algorithms, and obtain a significant performance difference from previous work. We have used three different datasets for POS experiments.
Tasks Information Retrieval, Machine Translation, Part-Of-Speech Tagging, Question Answering, Word Sense Disambiguation
Published 2020-01-10
URL https://arxiv.org/abs/2001.03324v1
PDF https://arxiv.org/pdf/2001.03324v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-approaches-for-amharic-parts
Repo
Framework

Topical Result Caching in Web Search Engines

Title Topical Result Caching in Web Search Engines
Authors Ida Mele, Nicola Tonellotto, Ophir Frieder, Raffaele Perego
Abstract Caching search results is employed in information retrieval systems to expedite query processing and reduce back-end server workload. Motivated by the observation that queries belonging to different topics have different temporal-locality patterns, we investigate a novel caching model called STD (Static-Topic-Dynamic cache). It improves on the traditional SDC (Static-Dynamic Cache), which stores the results of popular queries in a static cache and manages a dynamic cache with a replacement policy to capture temporal variations in the query stream. Our proposed caching scheme includes another layer for topic-based caching, where the entries are allocated to different topics (e.g., weather, education). The results of queries characterized by a topic are kept in the fraction of the cache dedicated to it. This makes it possible to adapt the cache-space utilization to the temporal locality of the various topics and reduces cache misses due to those queries that are neither sufficiently popular to be in the static portion nor requested within short enough time intervals to be in the dynamic portion. We simulate different configurations for STD using two real-world query streams. Experiments demonstrate that our approach outperforms SDC, with an increase of up to 3% in hit rate and a reduction of up to 36% in the gap between SDC and the theoretically optimal caching algorithm.
Tasks Information Retrieval
Published 2020-01-09
URL https://arxiv.org/abs/2001.03010v1
PDF https://arxiv.org/pdf/2001.03010v1.pdf
PWC https://paperswithcode.com/paper/topical-result-caching-in-web-search-engines
Repo
Framework
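
The three-layer layout described in the abstract can be sketched as a read-only static set, one small cache per topic, and a shared dynamic cache for everything else. The sketch below assumes LRU replacement in the topic and dynamic layers and takes the topic label as an input; the class names, capacities, and lookup order are illustrative choices, not details from the paper.

```python
from collections import OrderedDict

class LRU:
    """Fixed-capacity cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def lookup(self, key):
        """Return True on a hit; on a miss, admit the key and evict if full."""
        if key in self.items:
            self.items.move_to_end(key)      # mark as most recently used
            return True
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # drop the least recently used
        return False

class STDCache:
    """Static layer, per-topic layers, and a dynamic layer (hypothetical sketch)."""
    def __init__(self, static_queries, topic_capacities, dynamic_capacity):
        self.static = frozenset(static_queries)
        self.topics = {t: LRU(c) for t, c in topic_capacities.items()}
        self.dynamic = LRU(dynamic_capacity)

    def lookup(self, query, topic=None):
        if query in self.static:
            return True
        # Queries with a known topic use that topic's share of the cache;
        # everything else falls through to the dynamic layer.
        layer = self.topics.get(topic, self.dynamic)
        return layer.lookup(query)
```

Keeping a separate partition per topic means a burst of unrelated queries in the dynamic layer cannot evict entries for a topic whose queries recur on a slower cycle.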

HintPose

Title HintPose
Authors Sanghoon Hong, Hunchul Park, Jonghyuk Park, Sukhyun Cho, Heewoong Park
Abstract Most of the top-down pose estimation models assume that there exists only one person in a bounding box. However, the assumption is not always correct. In this technical report, we introduce two ideas, instance cue and recurrent refinement, to an existing pose estimator so that the model is able to handle detection boxes with multiple persons properly. When we evaluated our model on the COCO17 keypoints dataset, it showed non-negligible improvement compared to its baseline model. Our model achieved 76.2 mAP as a single model and 77.3 mAP as an ensemble on the test-dev set without additional training data. After additional post-processing with a separate refinement network, our final predictions achieved 77.8 mAP on the COCO test-dev set.
Tasks Pose Estimation
Published 2020-03-04
URL https://arxiv.org/abs/2003.02170v1
PDF https://arxiv.org/pdf/2003.02170v1.pdf
PWC https://paperswithcode.com/paper/hintpose
Repo
Framework