Paper Group NANR 253
以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network) [In Chinese]
Title | 以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network) [In Chinese] |
Authors | Wei-Chieh Chou, Yih-Ru Wang |
Abstract | |
Tasks | Multi-Label Classification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1016/ |
https://www.aclweb.org/anthology/O18-1016 | |
PWC | https://paperswithcode.com/paper/aaeccc2e-e-a-eaa14a-ceaa-hierarchical-multi |
Repo | |
Framework | |
Fast Bellman Updates for Robust MDPs
Title | Fast Bellman Updates for Robust MDPs |
Authors | Chin Pang Ho, Marek Petrik, Wolfram Wiesemann |
Abstract | We describe two efficient and exact algorithms for computing Bellman updates in robust Markov decision processes (MDPs). The first algorithm uses a homotopy continuation method to compute updates for L1-constrained s,a-rectangular ambiguity sets. It runs in quasi-linear time for plain L1-norms and also generalizes to weighted L1-norms. The second algorithm uses bisection to compute updates for robust MDPs with s-rectangular ambiguity sets. This algorithm, when combined with the homotopy method, also has a quasi-linear runtime. Unlike previous methods, our algorithms compute the primal solution in addition to the optimal objective value, which makes them useful in policy iteration methods. Our experimental results indicate that the proposed methods are over 1,000 times faster than Gurobi, a state-of-the-art commercial optimization package, for small instances, and the performance gap grows considerably with problem size. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2318 |
http://proceedings.mlr.press/v80/ho18a/ho18a.pdf | |
PWC | https://paperswithcode.com/paper/fast-bellman-updates-for-robust-mdps |
Repo | |
Framework | |
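The L1-constrained inner problem that the first algorithm accelerates can be illustrated with the standard sort-based routine for a single worst-case expectation. This is a sketch of the subproblem only, not the paper's homotopy method, and all names are hypothetical:

```python
def worst_case_expectation(p_nominal, v, kappa):
    """Minimize p·v over the probability simplex subject to
    ||p - p_nominal||_1 <= kappa (the s,a-rectangular L1 inner problem).
    Sort-based O(n log n) sketch."""
    n = len(v)
    order = sorted(range(n), key=lambda i: v[i])  # states by ascending value
    i_min = order[0]
    q = list(p_nominal)
    # shift mass onto the lowest-value state, capped by budget and simplex
    eps = min(kappa / 2.0, 1.0 - q[i_min])
    q[i_min] += eps
    # pay for it by draining the highest-value states first
    to_drain = eps
    for i in reversed(order):
        if i == i_min:
            continue
        take = min(q[i], to_drain)
        q[i] -= take
        to_drain -= take
        if to_drain <= 1e-12:
            break
    return sum(qi * vi for qi, vi in zip(q, v)), q
```

With nominal distribution (0.5, 0.5), values (1, 0) and budget 0.2, the adversary moves 0.1 of mass onto the zero-value state, giving a robust value of 0.4.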
Geodesic Convolutional Shape Optimization
Title | Geodesic Convolutional Shape Optimization |
Authors | Pierre Baqué, Edoardo Remelli, François Fleuret, Pascal Fua |
Abstract | Aerodynamic shape optimization has many industrial applications. Existing methods, however, are so computationally demanding that typical engineering practices are to either simply try a limited number of hand-designed shapes or restrict oneself to shapes that can be parameterized using only a few degrees of freedom. In this work, we introduce a new way to optimize complex shapes fast and accurately. To this end, we train Geodesic Convolutional Neural Networks to emulate a fluid-dynamics simulator. The key to making this approach practical is remeshing the original shape using a poly-cube map, which makes it possible to perform the computations on GPUs instead of CPUs. The neural net is then used to formulate an objective function that is differentiable with respect to the shape parameters, which can then be optimized using a gradient-based technique. This outperforms state-of-the-art methods by 5 to 20% for standard problems and, even more importantly, our approach applies to cases that previous methods cannot handle. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1944 |
http://proceedings.mlr.press/v80/baque18a/baque18a.pdf | |
PWC | https://paperswithcode.com/paper/geodesic-convolutional-shape-optimization |
Repo | |
Framework | |
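The optimization step described above, gradient descent on shape parameters through a differentiable surrogate, can be sketched as follows; the quadratic "drag" surrogate below merely stands in for the trained geodesic CNN and is purely illustrative:

```python
def optimize_shape(grad_fn, x0, lr=0.1, steps=100):
    """Plain gradient descent on shape parameters, where grad_fn is the
    gradient of a differentiable surrogate objective (e.g. a neural
    emulator of the fluid simulator)."""
    x = list(x0)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Toy surrogate: "drag" = sum((x_i - t_i)^2), so the gradient is 2 (x - t).
target = [0.3, -0.7]
grad = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]
optimum = optimize_shape(grad, [0.0, 0.0])
```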
Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution
Title | Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution |
Authors | Liat Ein Dor, Alon Halfon, Yoav Kantor, Ran Levy, Yosi Mass, Ruty Rinott, Eyal Shnarch, Noam Slonim |
Abstract | |
Tasks | Entity Linking, Learning-To-Rank |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1408/ |
https://www.aclweb.org/anthology/L18-1408 | |
PWC | https://paperswithcode.com/paper/semantic-relatedness-of-wikipedia-concepts |
Repo | |
Framework | |
Inductive Two-Layer Modeling with Parametric Bregman Transfer
Title | Inductive Two-Layer Modeling with Parametric Bregman Transfer |
Authors | Vignesh Ganapathiraman, Zhan Shi, Xinhua Zhang, Yaoliang Yu |
Abstract | Latent prediction models, exemplified by multi-layer networks, employ hidden variables that automate abstract feature discovery. They typically pose nonconvex optimization problems and effective semi-definite programming (SDP) relaxations have been developed to enable global solutions (Aslan et al., 2014). However, these models rely on nonparametric training of layer-wise kernel representations, and are therefore restricted to transductive learning which slows down test prediction. In this paper, we develop a new inductive learning framework for parametric transfer functions using matching losses. The result for ReLU utilizes completely positive matrices, and the inductive learner not only delivers superior accuracy but also offers an order of magnitude speedup over SDP with constant approximation guarantees. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2104 |
http://proceedings.mlr.press/v80/ganapathiraman18a/ganapathiraman18a.pdf | |
PWC | https://paperswithcode.com/paper/inductive-two-layer-modeling-with-parametric |
Repo | |
Framework | |
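The matching losses mentioned in the abstract pair a transfer function f with the Bregman divergence of its antiderivative F; a minimal sketch (for the identity transfer this reduces to squared error, and for the sigmoid it yields a logistic-type loss):

```python
def matching_loss(F, f, a_hat, a):
    """Bregman matching loss D_F(a_hat, a) = F(a_hat) - F(a) - f(a)(a_hat - a),
    where f is the transfer function and F its antiderivative."""
    return F(a_hat) - F(a) - f(a) * (a_hat - a)

# Identity transfer: F(x) = x^2 / 2, so the matching loss is (a_hat - a)^2 / 2.
loss = matching_loss(lambda x: x * x / 2.0, lambda x: x, 3.0, 1.0)
```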
SANTO: A Web-based Annotation Tool for Ontology-driven Slot Filling
Title | SANTO: A Web-based Annotation Tool for Ontology-driven Slot Filling |
Authors | Matthias Hartung, Hendrik ter Horst, Frank Grimm, Tim Diekmann, Roman Klinger, Philipp Cimiano |
Abstract | Supervised machine learning algorithms require training data whose generation for complex relation extraction tasks tends to be difficult. Being optimized for relation extraction at sentence level, many annotation tools offer little support for annotating relational structures that are widely spread across the text. This leads to non-intuitive and cumbersome visualizations, making the annotation process unnecessarily time-consuming. We propose SANTO, an easy-to-use, domain-adaptive annotation tool specialized for complex slot filling tasks which may involve problems of cardinality and referential grounding. The web-based architecture enables fast and clearly structured annotation for multiple users in parallel. Relational structures are formulated as templates following the conceptualization of an underlying ontology. Further, import and export procedures of standard formats enable interoperability with external sources and tools. |
Tasks | Knowledge Base Population, Reading Comprehension, Relation Extraction, Slot Filling |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-4012/ |
https://www.aclweb.org/anthology/P18-4012 | |
PWC | https://paperswithcode.com/paper/santo-a-web-based-annotation-tool-for |
Repo | |
Framework | |
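Cardinality-constrained slot filling of the kind SANTO supports can be illustrated with a tiny validator over an ontology-style template (a hypothetical schema format, not SANTO's actual data model):

```python
def validate_annotation(template, filled):
    """Check a filled template against per-slot cardinality bounds
    declared as {slot: (min_count, max_count)}."""
    errors = []
    for slot, (lo, hi) in template.items():
        n = len(filled.get(slot, []))
        if not lo <= n <= hi:
            errors.append(slot)
    return errors

# A treatment must be stated exactly once; up to two dosages are allowed.
template = {"treatment": (1, 1), "dosage": (0, 2)}
```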
Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems
Title | Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems |
Authors | Yair Carmon, John C. Duchi |
Abstract | We provide convergence rates for Krylov subspace solutions to the trust-region and cubic-regularized (nonconvex) quadratic problems. Such solutions may be efficiently computed by the Lanczos method and have long been used in practice. We prove error bounds of the form $1/t^2$ and $e^{-4t/\sqrt{\kappa}}$, where $\kappa$ is a condition number for the problem, and $t$ is the Krylov subspace order (number of Lanczos iterations). We also provide lower bounds showing that our analysis is sharp. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8269-analysis-of-krylov-subspace-solutions-of-regularized-non-convex-quadratic-problems |
http://papers.nips.cc/paper/8269-analysis-of-krylov-subspace-solutions-of-regularized-non-convex-quadratic-problems.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-krylov-subspace-solutions-of |
Repo | |
Framework | |
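The Krylov subspace of order t is built by t Lanczos iterations; a minimal tridiagonalization sketch (solving the trust-region or cubic subproblem inside the resulting small subspace is omitted here):

```python
import math

def lanczos(matvec, b, t):
    """Run t Lanczos iterations for a symmetric operator given as a
    matrix-vector product; returns the tridiagonal coefficients
    (diagonal alphas, off-diagonal betas)."""
    n = len(b)
    norm = math.sqrt(sum(x * x for x in b))
    q = [x / norm for x in b]
    q_prev = [0.0] * n
    alphas, betas = [], []
    beta = 0.0
    for _ in range(t):
        w = matvec(q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        beta = math.sqrt(sum(x * x for x in w))
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:  # invariant subspace found; stop early
            break
        q_prev, q = q, [x / beta for x in w]
    return alphas, betas
```

After a full run (t equal to the dimension, no breakdown) the tridiagonal matrix is orthogonally similar to the original, so its trace matches.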
Generalized Graph Embedding Models
Title | Generalized Graph Embedding Models |
Authors | Qiao Liu, Xiaohui Yang, Rui Wan, Shouzhong Tu, Zufeng Wu |
Abstract | Many types of relations in physical, biological, social and information systems can be modeled as homogeneous or heterogeneous concept graphs. Hence, learning from and with graph embeddings has drawn a great deal of research interest recently, but only ad hoc solutions have been obtained thus far. In this paper, we conjecture that the one-shot supervised learning mechanism is a bottleneck in improving the performance of the graph embedding learning algorithms, and propose to extend this by introducing a multi-shot unsupervised learning framework. Empirical results on several real-world data sets show that the proposed model consistently and significantly outperforms existing state-of-the-art approaches on knowledge base completion and graph based multi-label classification tasks. |
Tasks | Graph Embedding, Knowledge Base Completion, Multi-Label Classification |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJd0EAy0b |
https://openreview.net/pdf?id=SJd0EAy0b | |
PWC | https://paperswithcode.com/paper/generalized-graph-embedding-models |
Repo | |
Framework | |
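A typical one-shot supervised objective of the kind this multi-shot framework generalizes is translation-based scoring; a TransE-style sketch (a standard baseline, not the paper's model):

```python
import math

def transe_score(h, r, t):
    """Score a (head, relation, tail) triple as -||h + r - t||_2;
    scores closer to zero mean the triple is more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))
```

A triple fits perfectly when the tail embedding equals head plus relation, giving a score of exactly zero.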
Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences
Title | Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences |
Authors | Morgan Ulinski, Bob Coyne, Julia Hirschberg |
Abstract | |
Tasks | Coreference Resolution, Image Retrieval |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1237/ |
https://www.aclweb.org/anthology/L18-1237 | |
PWC | https://paperswithcode.com/paper/evaluating-the-wordseye-text-to-scene-system |
Repo | |
Framework | |
Diffusing Policies: Towards Wasserstein Policy Gradient Flows
Title | Diffusing Policies: Towards Wasserstein Policy Gradient Flows |
Authors | Pierre H. Richemond, Brendan Maginnis |
Abstract | Policy gradient methods often achieve better performance when the change in policy is limited to a small Kullback-Leibler divergence. We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region). This is done in the discrete and continuous multi-armed bandit settings with entropy regularisation. We show that in the small steps limit with respect to the Wasserstein distance $W_2$, policy dynamics are governed by the heat equation, following the Jordan-Kinderlehrer-Otto result. This means that policies undergo diffusion and advection, concentrating near actions with high reward. This helps elucidate the nature of convergence in the probability matching setup, and provides justification for empirical practices such as Gaussian policy priors and additive gradient noise. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rk3mjYRp- |
https://openreview.net/pdf?id=rk3mjYRp- | |
PWC | https://paperswithcode.com/paper/diffusing-policies-towards-wasserstein-policy |
Repo | |
Framework | |
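In the small-step limit the policy density evolves by the heat equation; a one-step explicit finite-difference sketch on a discretized 1D action space (reflecting boundaries; purely illustrative):

```python
def heat_step(p, dt, dx):
    """One explicit Euler step of the 1D heat equation with reflecting
    (Neumann) boundaries; total probability mass is conserved."""
    n = len(p)
    r = dt / (dx * dx)
    q = [0.0] * n
    for i in range(n):
        left = p[i - 1] if i > 0 else p[0]
        right = p[i + 1] if i < n - 1 else p[-1]
        q[i] = p[i] + r * (left - 2.0 * p[i] + right)
    return q

# Diffusion flattens the peaked policy while conserving probability.
policy = heat_step([0.1, 0.6, 0.3], dt=0.1, dx=1.0)
```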
We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed
Title | We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed |
Authors | Ant{'o}nio Branco |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1022/ |
https://www.aclweb.org/anthology/L18-1022 | |
PWC | https://paperswithcode.com/paper/we-are-depleting-our-research-subject-as-we |
Repo | |
Framework | |
Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions
Title | Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions |
Authors | Wenruo Bai, Jeff Bilmes |
Abstract | We analyze the performance of the greedy algorithm, and also a discrete semi-gradient based algorithm, for maximizing the sum of a suBmodular and suPermodular (BP) function (both of which are non-negative monotone non-decreasing) under two types of constraints, either a cardinality constraint or $p\geq 1$ matroid independence constraints. These problems occur naturally in several real-world applications in data science, machine learning, and artificial intelligence. The problems are ordinarily inapproximable to any factor. Using the curvature $\kappa_f$ of the submodular term, and introducing $\kappa^g$ for the supermodular term (a natural dual curvature for supermodular functions), however, both of which are computable in linear time, we show that BP maximization can be efficiently approximated by both the greedy and the semi-gradient based algorithm. The algorithms yield multiplicative guarantees of $\frac{1}{\kappa_f}\left[1-e^{-(1-\kappa^g)\kappa_f}\right]$ and $\frac{1-\kappa^g}{(1-\kappa^g)\kappa_f + p}$ for the two types of constraints respectively. For pure monotone supermodular constrained maximization, these yield $1-\kappa^g$ and $(1-\kappa^g)/p$ for the two types of constraints respectively. We also analyze the hardness of BP maximization and show that our guarantees match hardness by a constant factor and by $O(\ln(p))$ respectively. Computational experiments are also provided supporting our analysis. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2350 |
http://proceedings.mlr.press/v80/bai18a/bai18a.pdf | |
PWC | https://paperswithcode.com/paper/greed-is-still-good-maximizing-monotone |
Repo | |
Framework | |
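The greedy algorithm under a cardinality constraint, in minimal form; it is shown here on a plain monotone coverage function, with a BP objective f simply substituted in practice:

```python
def greedy_max(f, ground, k):
    """Greedily pick k elements, each maximizing the marginal gain
    f(S + e) - f(S)."""
    S = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in sorted(ground - S):
            gain = f(S | {e}) - f(S)
            if gain > best_gain:
                best, best_gain = e, gain
        S.add(best)
    return S

# Coverage function: number of distinct items covered by the chosen sets.
cover = {0: {0, 1}, 1: {1, 2}, 2: {2}}
f = lambda S: len(set().union(*(cover[i] for i in S))) if S else 0
chosen = greedy_max(f, {0, 1, 2}, 2)
```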
A Unified Framework for Structured Low-rank Matrix Learning
Title | A Unified Framework for Structured Low-rank Matrix Learning |
Authors | Pratik Jawanpuria, Bamdev Mishra |
Abstract | We consider the problem of learning a low-rank matrix, constrained to lie in a linear subspace, and introduce a novel factorization for modeling such matrices. A salient feature of the proposed factorization scheme is that it decouples the low-rank and the structural constraints onto separate factors. We formulate the optimization problem on the Riemannian spectrahedron manifold, where the Riemannian framework allows us to develop computationally efficient conjugate gradient and trust-region algorithms. Experiments on problems such as standard/robust/non-negative matrix completion, Hankel matrix learning and multi-task learning demonstrate the efficacy of our approach. |
Tasks | Matrix Completion, Multi-Task Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1930 |
http://proceedings.mlr.press/v80/jawanpuria18a/jawanpuria18a.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-structured-low-rank |
Repo | |
Framework | |
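The paper optimizes over a Riemannian spectrahedron manifold; as a far simpler Euclidean stand-in, here is a rank-1 factored stochastic-gradient sketch for matrix completion (illustrative only, not the authors' algorithm):

```python
import random

def rank1_complete(observed, shape, lr=0.05, steps=5000):
    """Fit M ≈ u vᵀ to the observed entries {(i, j): value} by
    stochastic gradient descent on the squared residuals."""
    random.seed(0)
    m, n = shape
    u = [random.random() for _ in range(m)]
    v = [random.random() for _ in range(n)]
    for _ in range(steps):
        for (i, j), x in observed.items():
            e = u[i] * v[j] - x
            u[i] -= lr * e * v[j]
            v[j] -= lr * e * u[i]
    return u, v

# Three observed entries of the rank-1 matrix [[1, 2], [2, 4]];
# the missing (1, 1) entry is predicted as u[1] * v[1].
u, v = rank1_complete({(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0}, (2, 2))
```

Any exact rank-1 fit of the three observed entries must predict 4 for the missing one, which the gradient iteration approaches.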
Using Universal Dependencies in cross-linguistic complexity research
Title | Using Universal Dependencies in cross-linguistic complexity research |
Authors | Aleksandrs Berdicevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama, Christian Bentz |
Abstract | We evaluate corpus-based measures of linguistic complexity obtained using Universal Dependencies (UD) treebanks. We propose a method of estimating robustness of the complexity values obtained using a given measure and a given treebank. The results indicate that measures of syntactic complexity might be on average less robust than those of morphological complexity. We also estimate the validity of complexity measures by comparing the results for very similar languages and checking for unexpected differences. We show that some of those differences that arise can be diminished by using parallel treebanks and, more importantly from the practical point of view, by harmonizing the language-specific solutions in the UD annotation. |
Tasks | |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6002/ |
https://www.aclweb.org/anthology/W18-6002 | |
PWC | https://paperswithcode.com/paper/using-universal-dependencies-in-cross |
Repo | |
Framework | |
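One toy instance of a corpus-based morphological measure of the kind being evaluated: the ratio of distinct FEATS bundles to tokens in a UD treebank column (the paper's actual measures differ):

```python
def morph_complexity(feats_column):
    """Ratio of distinct morphological feature bundles to tokens,
    skipping UD's '_' (no features) placeholder."""
    bundles = [f for f in feats_column if f != "_"]
    return len(set(bundles)) / len(bundles) if bundles else 0.0
```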
Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling
Title | Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling |
Authors | Shannon Mccurdy |
Abstract | Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results. We provide provable guarantees for deterministic column sampling using ridge leverage scores. The matrix sketch returned by our algorithm is a column subset of the original matrix, yielding additional interpretability. Like the randomized counterparts, the deterministic algorithm provides $(1+\epsilon)$ error column subset selection, $(1+\epsilon)$ error projection-cost preservation, and an additive-multiplicative spectral bound. We also show that under the assumption of power-law decay of ridge leverage scores, this deterministic algorithm is provably as accurate as randomized algorithms. Lastly, ridge regression is frequently used to regularize ill-posed linear least-squares problems. While ridge regression provides shrinkage for the regression coefficients, many of the coefficients remain small but non-zero. Performing ridge regression with the matrix sketch returned by our algorithm and a particular regularization parameter forces coefficients to zero and has a provable $(1+\epsilon)$ bound on the statistical risk. As such, it is an interesting alternative to elastic net regularization. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7513-ridge-regression-and-provable-deterministic-ridge-leverage-score-sampling |
http://papers.nips.cc/paper/7513-ridge-regression-and-provable-deterministic-ridge-leverage-score-sampling.pdf | |
PWC | https://paperswithcode.com/paper/ridge-regression-and-provable-deterministic |
Repo | |
Framework | |
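When the column spectrum is known (e.g. orthogonal columns with singular values s_i), ridge leverage scores take the closed form s_i²/(s_i² + λ); a sketch of deterministic thresholding on such scores (a simplification of the paper's column-subset algorithm):

```python
def ridge_leverage_scores(singular_values, lam):
    """Ridge leverage scores from a known spectrum: s^2 / (s^2 + lam).
    As lam -> 0 these approach the ordinary leverage scores."""
    return [s * s / (s * s + lam) for s in singular_values]

def deterministic_sample(scores, theta):
    """Keep exactly the columns whose ridge leverage score clears the
    threshold -- no randomness, so the output is always the same."""
    return [i for i, t in enumerate(scores) if t >= theta]
```

Determinism is the point: repeated runs return the same column subset, with no failure probability.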