October 15, 2019

2078 words 10 mins read

Paper Group NANR 253


以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network ) [In Chinese]. Fast Bellman Updates for Robust MDPs. Geodesic Convolutional Shape Optimization. Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution. Inductive Two-Layer Modeling with Parametric Bregman Transfer. …

以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network ) [In Chinese]

Title 以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network ) [In Chinese]
Authors Wei-Chieh Chou, Yih-Ru Wang
Abstract
Tasks Multi-Label Classification
Published 2018-10-01
URL https://www.aclweb.org/anthology/O18-1016/
PDF https://www.aclweb.org/anthology/O18-1016
PWC https://paperswithcode.com/paper/aaeccc2e-e-a-eaa14a-ceaa-hierarchical-multi
Repo
Framework

Fast Bellman Updates for Robust MDPs

Title Fast Bellman Updates for Robust MDPs
Authors Chin Pang Ho, Marek Petrik, Wolfram Wiesemann
Abstract We describe two efficient, and exact, algorithms for computing Bellman updates in robust Markov decision processes (MDPs). The first algorithm uses a homotopy continuation method to compute updates for L1-constrained s,a-rectangular ambiguity sets. It runs in quasi-linear time for plain L1-norms and also generalizes to weighted L1-norms. The second algorithm uses bisection to compute updates for robust MDPs with s-rectangular ambiguity sets. This algorithm, when combined with the homotopy method, also has a quasi-linear runtime. Unlike previous methods, our algorithms compute the primal solution in addition to the optimal objective value, which makes them useful in policy iteration methods. Our experimental results indicate that the proposed methods are over 1,000 times faster than Gurobi, a state-of-the-art commercial optimization package, for small instances, and the performance gap grows considerably with problem size.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2318
PDF http://proceedings.mlr.press/v80/ho18a/ho18a.pdf
PWC https://paperswithcode.com/paper/fast-bellman-updates-for-robust-mdps
Repo
Framework
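
The inner problem in such robust Bellman updates is the worst-case expected value over an L1 ambiguity ball around a nominal transition distribution. Below is a minimal pure-Python sketch of that subproblem only (a sort-based computation; it is not the paper's homotopy or bisection algorithm, and the function name is illustrative):

```python
def worst_case_value(p_bar, v, eps):
    """Approximate min_p p.v over {p : ||p - p_bar||_1 <= eps, p a distribution}.

    The adversary moves up to eps/2 of probability mass from the
    highest-value states onto the lowest-value state, which changes
    the distribution by at most eps in L1 norm.
    """
    n = len(v)
    i_min = min(range(n), key=lambda i: v[i])   # cheapest state for the adversary
    p = list(p_bar)
    budget = eps / 2.0
    # strip mass from the most expensive states first
    for i in sorted(range(n), key=lambda i: v[i], reverse=True):
        if budget <= 0:
            break
        if i == i_min:
            continue
        take = min(p[i], budget)
        p[i] -= take
        p[i_min] += take
        budget -= take
    return sum(pi * vi for pi, vi in zip(p, v))
```

With `p_bar = [0.5, 0.5]`, `v = [1.0, 0.0]` and `eps = 0.4`, the adversary shifts 0.2 of mass to the zero-value state, giving a worst-case value of 0.3.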

Geodesic Convolutional Shape Optimization

Title Geodesic Convolutional Shape Optimization
Authors Pierre Baque, Edoardo Remelli, Francois Fleuret, Pascal Fua
Abstract Aerodynamic shape optimization has many industrial applications. Existing methods, however, are so computationally demanding that typical engineering practice is either to try a limited number of hand-designed shapes or to restrict oneself to shapes that can be parameterized using only a few degrees of freedom. In this work, we introduce a new way to optimize complex shapes quickly and accurately. To this end, we train Geodesic Convolutional Neural Networks to emulate a fluid-dynamics simulator. The key to making this approach practical is remeshing the original shape using a poly-cube map, which makes it possible to perform the computations on GPUs instead of CPUs. The neural net is then used to formulate an objective function that is differentiable with respect to the shape parameters and can be optimized using a gradient-based technique. Our approach outperforms state-of-the-art methods by 5 to 20% on standard problems and, even more importantly, applies to cases that previous methods cannot handle.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1944
PDF http://proceedings.mlr.press/v80/baque18a/baque18a.pdf
PWC https://paperswithcode.com/paper/geodesic-convolutional-shape-optimization
Repo
Framework
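
Once a differentiable surrogate of the simulator is in hand, the shape parameters can be updated by plain gradient descent through it. A minimal sketch under the assumption that `surrogate_grad` returns the gradient of the learned objective with respect to the shape parameters (the geodesic-CNN emulator itself is omitted, and all names here are illustrative):

```python
def optimize_shape(surrogate_grad, x0, lr=0.1, steps=100):
    """Gradient descent on a differentiable surrogate objective.

    x0: initial shape parameters (list of floats).
    surrogate_grad: callable returning the gradient of the surrogate
    loss at x -- a stand-in for the emulator's backward pass.
    """
    x = list(x0)
    for _ in range(steps):
        g = surrogate_grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x
```

For a toy quadratic surrogate f(x) = (x - 3)^2, the iterate converges to the minimizer x = 3.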

Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution

Title Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution
Authors Liat Ein Dor, Alon Halfon, Yoav Kantor, Ran Levy, Yosi Mass, Ruty Rinott, Eyal Shnarch, Noam Slonim
Abstract
Tasks Entity Linking, Learning-To-Rank
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1408/
PDF https://www.aclweb.org/anthology/L18-1408
PWC https://paperswithcode.com/paper/semantic-relatedness-of-wikipedia-concepts
Repo
Framework

Inductive Two-Layer Modeling with Parametric Bregman Transfer

Title Inductive Two-Layer Modeling with Parametric Bregman Transfer
Authors Vignesh Ganapathiraman, Zhan Shi, Xinhua Zhang, Yaoliang Yu
Abstract Latent prediction models, exemplified by multi-layer networks, employ hidden variables that automate abstract feature discovery. They typically pose nonconvex optimization problems, and effective semi-definite programming (SDP) relaxations have been developed to enable global solutions (Aslan et al., 2014). However, these models rely on nonparametric training of layer-wise kernel representations and are therefore restricted to transductive learning, which slows down test prediction. In this paper, we develop a new inductive learning framework for parametric transfer functions using matching losses. The result for ReLU utilizes completely positive matrices, and the inductive learner not only delivers superior accuracy but also offers an order-of-magnitude speedup over SDP with constant approximation guarantees.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2104
PDF http://proceedings.mlr.press/v80/ganapathiraman18a/ganapathiraman18a.pdf
PWC https://paperswithcode.com/paper/inductive-two-layer-modeling-with-parametric
Repo
Framework

SANTO: A Web-based Annotation Tool for Ontology-driven Slot Filling

Title SANTO: A Web-based Annotation Tool for Ontology-driven Slot Filling
Authors Matthias Hartung, Hendrik ter Horst, Frank Grimm, Tim Diekmann, Roman Klinger, Philipp Cimiano
Abstract Supervised machine learning algorithms require training data whose generation for complex relation extraction tasks tends to be difficult. Being optimized for relation extraction at the sentence level, many annotation tools offer little support for annotating relational structures that are spread widely across the text. This leads to non-intuitive and cumbersome visualizations, making the annotation process unnecessarily time-consuming. We propose SANTO, an easy-to-use, domain-adaptive annotation tool specialized for complex slot filling tasks which may involve problems of cardinality and referential grounding. The web-based architecture enables fast and clearly structured annotation for multiple users in parallel. Relational structures are formulated as templates following the conceptualization of an underlying ontology. Further, import and export procedures of standard formats enable interoperability with external sources and tools.
Tasks Knowledge Base Population, Reading Comprehension, Relation Extraction, Slot Filling
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-4012/
PDF https://www.aclweb.org/anthology/P18-4012
PWC https://paperswithcode.com/paper/santo-a-web-based-annotation-tool-for
Repo
Framework

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems

Title Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems
Authors Yair Carmon, John C. Duchi
Abstract We provide convergence rates for Krylov subspace solutions to the trust-region and cubic-regularized (nonconvex) quadratic problems. Such solutions may be efficiently computed by the Lanczos method and have long been used in practice. We prove error bounds of the form $1/t^2$ and $e^{-4t/\sqrt{\kappa}}$, where $\kappa$ is a condition number for the problem, and $t$ is the Krylov subspace order (number of Lanczos iterations). We also provide lower bounds showing that our analysis is sharp.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/8269-analysis-of-krylov-subspace-solutions-of-regularized-non-convex-quadratic-problems
PDF http://papers.nips.cc/paper/8269-analysis-of-krylov-subspace-solutions-of-regularized-non-convex-quadratic-problems.pdf
PWC https://paperswithcode.com/paper/analysis-of-krylov-subspace-solutions-of
Repo
Framework
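
The Krylov subspace in question is the one generated by the Lanczos recurrence; the trust-region and cubic-regularized subproblems are then solved over the span of the Lanczos basis. A minimal sketch of the recurrence itself, in pure Python with no reorthogonalization, so it is only suitable for tiny examples (this illustrates the basis construction, not the papers' error analysis):

```python
import math

def lanczos(matvec, q0, t):
    """t steps of the Lanczos recurrence for a symmetric operator.

    Returns the orthonormal basis Q and the diagonals (alphas, betas)
    of the tridiagonal matrix T = Q^T A Q; Krylov solutions to the
    regularized quadratic problems live in span(Q).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    norm0 = math.sqrt(dot(q0, q0))
    Q = [[x / norm0 for x in q0]]
    alphas, betas = [], []
    q_prev, beta = [0.0] * len(q0), 0.0
    for _ in range(t):
        w = matvec(Q[-1])
        alpha = dot(w, Q[-1])
        # three-term recurrence: w <- A q - alpha q - beta q_prev
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, Q[-1], q_prev)]
        beta_next = math.sqrt(dot(w, w))
        alphas.append(alpha)
        betas.append(beta_next)
        if beta_next < 1e-12:
            break                      # Krylov space is exhausted
        q_prev, beta = Q[-1], beta_next
        Q.append([wi / beta_next for wi in w])
    return Q, alphas, betas
```

For the diagonal operator diag(2, 3) started from [1, 1], two steps produce the tridiagonal T = [[2.5, 0.5], [0.5, 2.5]], whose eigenvalues 2 and 3 recover the spectrum exactly.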

Generalized Graph Embedding Models

Title Generalized Graph Embedding Models
Authors Qiao Liu, Xiaohui Yang, Rui Wan, Shouzhong Tu, Zufeng Wu
Abstract Many types of relations in physical, biological, social and information systems can be modeled as homogeneous or heterogeneous concept graphs. Hence, learning from and with graph embeddings has drawn a great deal of research interest recently, but only ad hoc solutions have been obtained thus far. In this paper, we conjecture that the one-shot supervised learning mechanism is a bottleneck in improving the performance of graph embedding learning algorithms, and propose to extend this by introducing a multi-shot unsupervised learning framework. Empirical results on several real-world data sets show that the proposed model consistently and significantly outperforms existing state-of-the-art approaches on knowledge base completion and graph based multi-label classification tasks.
Tasks Graph Embedding, Knowledge Base Completion, Multi-Label Classification
Published 2018-01-01
URL https://openreview.net/forum?id=SJd0EAy0b
PDF https://openreview.net/pdf?id=SJd0EAy0b
PWC https://paperswithcode.com/paper/generalized-graph-embedding-models
Repo
Framework

Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences

Title Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences
Authors Morgan Ulinski, Bob Coyne, Julia Hirschberg
Abstract
Tasks Coreference Resolution, Image Retrieval
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1237/
PDF https://www.aclweb.org/anthology/L18-1237
PWC https://paperswithcode.com/paper/evaluating-the-wordseye-text-to-scene-system
Repo
Framework

Diffusing Policies : Towards Wasserstein Policy Gradient Flows

Title Diffusing Policies : Towards Wasserstein Policy Gradient Flows
Authors Pierre H. Richemond, Brendan Maginnis
Abstract Policy gradient methods often achieve better performance when the change in policy is limited to a small Kullback-Leibler divergence. We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region). This is done in the discrete and continuous multi-armed bandit settings with entropy regularisation. We show that in the small steps limit with respect to the Wasserstein distance $W_2$, policy dynamics are governed by the heat equation, following the Jordan-Kinderlehrer-Otto result. This means that policies undergo diffusion and advection, concentrating near actions with high reward. This helps elucidate the nature of convergence in the probability matching setup, and provides justification for empirical practices such as Gaussian policy priors and additive gradient noise.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=rk3mjYRp-
PDF https://openreview.net/pdf?id=rk3mjYRp-
PWC https://paperswithcode.com/paper/diffusing-policies-towards-wasserstein-policy
Repo
Framework
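
In the entropy-regularised bandit setting, the stationary point of such dynamics is the Gibbs (softmax) policy, which is what "concentrating near actions with high reward" amounts to as the temperature falls. A small illustration of that fixed point only (the names and the temperature parameter are mine, not the paper's):

```python
import math

def gibbs_policy(rewards, tau):
    """Softmax policy pi(a) proportional to exp(r(a) / tau).

    tau plays the role of the diffusion temperature: small tau
    concentrates the policy on the highest-reward action.
    """
    m = max(rewards)                          # subtract max for numerical stability
    z = [math.exp((r - m) / tau) for r in rewards]
    s = sum(z)
    return [x / s for x in z]
```

At tau = 1 the policy merely favours the better arm; at tau = 0.01 it is nearly deterministic.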

We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed

Title We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed
Authors António Branco
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1022/
PDF https://www.aclweb.org/anthology/L18-1022
PWC https://paperswithcode.com/paper/we-are-depleting-our-research-subject-as-we
Repo
Framework

Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions

Title Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions
Authors Wenruo Bai, Jeff Bilmes
Abstract We analyze the performance of the greedy algorithm, and also a discrete semi-gradient based algorithm, for maximizing the sum of a suBmodular and suPermodular (BP) function (both of which are non-negative monotone non-decreasing) under two types of constraints, either a cardinality constraint or $p\geq 1$ matroid independence constraints. These problems occur naturally in several real-world applications in data science, machine learning, and artificial intelligence. The problems are ordinarily inapproximable to any factor. Using the curvature $\kappa_f$ of the submodular term, and introducing $\kappa^g$ for the supermodular term (a natural dual curvature for supermodular functions), however, both of which are computable in linear time, we show that BP maximization can be efficiently approximated by both the greedy and the semi-gradient based algorithm. The algorithms yield multiplicative guarantees of $\frac{1}{\kappa_f}\left[1-e^{-(1-\kappa^g)\kappa_f}\right]$ and $\frac{1-\kappa^g}{(1-\kappa^g)\kappa_f + p}$ for the two types of constraints respectively. For pure monotone supermodular constrained maximization, these yield $1-\kappa^g$ and $(1-\kappa^g)/p$ for the two types of constraints respectively. We also analyze the hardness of BP maximization and show that our guarantees match hardness by a constant factor and by $O(\ln(p))$ respectively. Computational experiments are also provided supporting our analysis.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2350
PDF http://proceedings.mlr.press/v80/bai18a/bai18a.pdf
PWC https://paperswithcode.com/paper/greed-is-still-good-maximizing-monotone
Repo
Framework
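
The greedy algorithm analyzed here is the standard one: repeatedly add the element with the largest marginal gain until the cardinality budget is spent. A minimal sketch for the cardinality-constrained case (the approximation guarantee above, in terms of the two curvatures, is the paper's contribution, not something this sketch reproduces):

```python
def greedy_max(f, ground, k):
    """Greedy maximization of a monotone set function f under |S| <= k.

    f: callable from a set to a float; ground: the ground set;
    k: cardinality budget.  Each round adds the element with the
    largest marginal gain f(S + e) - f(S).
    """
    S = set()
    for _ in range(k):
        remaining = ground - S
        if not remaining:
            break
        base = f(S)
        best = max(remaining, key=lambda e: f(S | {e}) - base)
        S.add(best)
    return S
```

For a toy coverage objective over the sets {1,2}, {2,3}, {4}, a budget of two picks the large set first and then either of the others, covering three items.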

A Unified Framework for Structured Low-rank Matrix Learning

Title A Unified Framework for Structured Low-rank Matrix Learning
Authors Pratik Jawanpuria, Bamdev Mishra
Abstract We consider the problem of learning a low-rank matrix, constrained to lie in a linear subspace, and introduce a novel factorization for modeling such matrices. A salient feature of the proposed factorization scheme is that it decouples the low-rank and the structural constraints onto separate factors. We formulate the optimization problem on the Riemannian spectrahedron manifold, where the Riemannian framework allows us to develop computationally efficient conjugate gradient and trust-region algorithms. Experiments on problems such as standard/robust/non-negative matrix completion, Hankel matrix learning and multi-task learning demonstrate the efficacy of our approach.
Tasks Matrix Completion, Multi-Task Learning
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1930
PDF http://proceedings.mlr.press/v80/jawanpuria18a/jawanpuria18a.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-structured-low-rank
Repo
Framework

Using Universal Dependencies in cross-linguistic complexity research

Title Using Universal Dependencies in cross-linguistic complexity research
Authors Aleksandrs Berdicevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama, Christian Bentz
Abstract We evaluate corpus-based measures of linguistic complexity obtained using Universal Dependencies (UD) treebanks. We propose a method of estimating robustness of the complexity values obtained using a given measure and a given treebank. The results indicate that measures of syntactic complexity might be on average less robust than those of morphological complexity. We also estimate the validity of complexity measures by comparing the results for very similar languages and checking for unexpected differences. We show that some of those differences that arise can be diminished by using parallel treebanks and, more importantly from the practical point of view, by harmonizing the language-specific solutions in the UD annotation.
Tasks
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6002/
PDF https://www.aclweb.org/anthology/W18-6002
PWC https://paperswithcode.com/paper/using-universal-dependencies-in-cross
Repo
Framework

Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Title Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling
Authors Shannon Mccurdy
Abstract Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results. We provide provable guarantees for deterministic column sampling using ridge leverage scores. The matrix sketch returned by our algorithm is a column subset of the original matrix, yielding additional interpretability. Like the randomized counterparts, the deterministic algorithm provides $(1+\epsilon)$ error column subset selection, $(1+\epsilon)$ error projection-cost preservation, and an additive-multiplicative spectral bound. We also show that under the assumption of power-law decay of ridge leverage scores, this deterministic algorithm is provably as accurate as randomized algorithms. Lastly, ridge regression is frequently used to regularize ill-posed linear least-squares problems. While ridge regression provides shrinkage for the regression coefficients, many of the coefficients remain small but non-zero. Performing ridge regression with the matrix sketch returned by our algorithm and a particular regularization parameter forces coefficients to zero and has a provable $(1+\epsilon)$ bound on the statistical risk. As such, it is an interesting alternative to elastic net regularization.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7513-ridge-regression-and-provable-deterministic-ridge-leverage-score-sampling
PDF http://papers.nips.cc/paper/7513-ridge-regression-and-provable-deterministic-ridge-leverage-score-sampling.pdf
PWC https://paperswithcode.com/paper/ridge-regression-and-provable-deterministic
Repo
Framework
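
The scores themselves come straight from their definition: the ridge leverage score of column $a_i$ of $A$ is $\tau_i(\lambda) = a_i^T (AA^T + \lambda I)^{-1} a_i$, and the deterministic algorithm keeps the columns with the largest scores. A direct dense pure-Python computation for tiny matrices (this is the definition only, not the paper's provably accurate selection procedure):

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    M = [row[:] + [bi] for row, bi in zip(M, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ridge_leverage_scores(A, lam):
    """tau_i(lam) = a_i^T (A A^T + lam I)^{-1} a_i for each column a_i of A.

    A is a list of rows; larger scores mark columns that matter more
    for a regularized low-rank approximation.
    """
    m, n = len(A), len(A[0])
    # Gram-style matrix A A^T + lam I
    G = [[sum(A[i][k] * A[j][k] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    scores = []
    for i in range(n):
        a = [A[r][i] for r in range(m)]
        x = solve(G, a)
        scores.append(sum(ai * xi for ai, xi in zip(a, x)))
    return scores
```

For A = [[1, 0], [0, 2]] and lambda = 1, the two columns get scores 0.5 and 0.8, so deterministic selection would keep the second column first.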