Paper Group NANR 253
以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network) [In Chinese]
Title | 以深層類神經網路標記中文階層式多標籤語意概念 (Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network) [In Chinese] |
Authors | Wei-Chieh Chou, Yih-Ru Wang |
Abstract | |
Tasks | Multi-Label Classification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1016/ |
https://www.aclweb.org/anthology/O18-1016 | |
PWC | https://paperswithcode.com/paper/aaeccc2e-e-a-eaa14a-ceaa-hierarchical-multi |
Repo | |
Framework | |
Fast Bellman Updates for Robust MDPs
Title | Fast Bellman Updates for Robust MDPs |
Authors | Chin Pang Ho, Marek Petrik, Wolfram Wiesemann |
Abstract | We describe two efficient and exact algorithms for computing Bellman updates in robust Markov decision processes (MDPs). The first algorithm uses a homotopy continuation method to compute updates for L1-constrained s,a-rectangular ambiguity sets. It runs in quasi-linear time for plain L1-norms and also generalizes to weighted L1-norms. The second algorithm uses bisection to compute updates for robust MDPs with s-rectangular ambiguity sets. This algorithm, when combined with the homotopy method, also has a quasi-linear runtime. Unlike previous methods, our algorithms compute the primal solution in addition to the optimal objective value, which makes them useful in policy iteration methods. Our experimental results indicate that the proposed methods are over 1,000 times faster than Gurobi, a state-of-the-art commercial optimization package, for small instances, and the performance gap grows considerably with problem size. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2318 |
http://proceedings.mlr.press/v80/ho18a/ho18a.pdf | |
PWC | https://paperswithcode.com/paper/fast-bellman-updates-for-robust-mdps |
Repo | |
Framework | |
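The L1-constrained inner problem that the first algorithm accelerates can be illustrated with the standard sort-based routine for a single worst-case expectation. This is a sketch of the subproblem only, not the paper's homotopy method, and all names are hypothetical:

```python
def worst_case_expectation(p_nominal, v, kappa):
    """Minimize p·v over the probability simplex subject to
    ||p - p_nominal||_1 <= kappa (the s,a-rectangular L1 inner problem).
    Sort-based O(n log n) sketch."""
    n = len(v)
    order = sorted(range(n), key=lambda i: v[i])  # states by ascending value
    i_min = order[0]
    q = list(p_nominal)
    # shift mass onto the lowest-value state, capped by budget and simplex
    eps = min(kappa / 2.0, 1.0 - q[i_min])
    q[i_min] += eps
    # pay for it by draining the highest-value states first
    to_drain = eps
    for i in reversed(order):
        if i == i_min:
            continue
        take = min(q[i], to_drain)
        q[i] -= take
        to_drain -= take
        if to_drain <= 1e-12:
            break
    return sum(qi * vi for qi, vi in zip(q, v)), q
```

With nominal distribution (0.5, 0.5), values (1, 0) and budget 0.2, the adversary moves 0.1 of mass onto the zero-value state, giving a robust value of 0.4.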
Geodesic Convolutional Shape Optimization
Title | Geodesic Convolutional Shape Optimization |
Authors | Pierre Baqué, Edoardo Remelli, François Fleuret, Pascal Fua |
Abstract | Aerodynamic shape optimization has many industrial applications. Existing methods, however, are so computationally demanding that typical engineering practices are to either simply try a limited number of hand-designed shapes or restrict oneself to shapes that can be parameterized using only a few degrees of freedom. In this work, we introduce a new way to optimize complex shapes fast and accurately. To this end, we train Geodesic Convolutional Neural Networks to emulate a fluid-dynamics simulator. The key to making this approach practical is remeshing the original shape using a poly-cube map, which makes it possible to perform the computations on GPUs instead of CPUs. The neural net is then used to formulate an objective function that is differentiable with respect to the shape parameters, which can then be optimized using a gradient-based technique. This outperforms state-of-the-art methods by 5 to 20% for standard problems and, even more importantly, our approach applies to cases that previous methods cannot handle. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1944 |
http://proceedings.mlr.press/v80/baque18a/baque18a.pdf | |
PWC | https://paperswithcode.com/paper/geodesic-convolutional-shape-optimization |
Repo | |
Framework | |
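The optimization step described above, gradient descent on shape parameters through a differentiable surrogate, can be sketched as follows; the quadratic "drag" surrogate below merely stands in for the trained geodesic CNN and is purely illustrative:

```python
def optimize_shape(grad_fn, x0, lr=0.1, steps=100):
    """Plain gradient descent on shape parameters, where grad_fn is the
    gradient of a differentiable surrogate objective (e.g. a neural
    emulator of the fluid simulator)."""
    x = list(x0)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Toy surrogate: "drag" = sum((x_i - t_i)^2), so the gradient is 2 (x - t).
target = [0.3, -0.7]
grad = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]
optimum = optimize_shape(grad, [0.0, 0.0])
```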
Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution
Title | Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution |
Authors | Liat Ein Dor, Alon Halfon, Yoav Kantor, Ran Levy, Yosi Mass, Ruty Rinott, Eyal Shnarch, Noam Slonim |
Abstract | |
Tasks | Entity Linking, Learning-To-Rank |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1408/ |
https://www.aclweb.org/anthology/L18-1408 | |
PWC | https://paperswithcode.com/paper/semantic-relatedness-of-wikipedia-concepts |
Repo | |
Framework | |
Inductive Two-Layer Modeling with Parametric Bregman Transfer
Title | Inductive Two-Layer Modeling with Parametric Bregman Transfer |
Authors | Vignesh Ganapathiraman, Zhan Shi, Xinhua Zhang, Yaoliang Yu |
Abstract | Latent prediction models, exemplified by multi-layer networks, employ hidden variables that automate abstract feature discovery. They typically pose nonconvex optimization problems and effective semi-definite programming (SDP) relaxations have been developed to enable global solutions (Aslan et al., 2014). However, these models rely on nonparametric training of layer-wise kernel representations, and are therefore restricted to transductive learning which slows down test prediction. In this paper, we develop a new inductive learning framework for parametric transfer functions using matching losses. The result for ReLU utilizes completely positive matrices, and the inductive learner not only delivers superior accuracy but also offers an order of magnitude speedup over SDP with constant approximation guarantees. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2104 |
http://proceedings.mlr.press/v80/ganapathiraman18a/ganapathiraman18a.pdf | |
PWC | https://paperswithcode.com/paper/inductive-two-layer-modeling-with-parametric |
Repo | |
Framework | |
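The matching losses mentioned in the abstract pair a transfer function f with the Bregman divergence of its antiderivative F; a minimal sketch (for the identity transfer this reduces to squared error, and for the sigmoid it yields a logistic-type loss):

```python
def matching_loss(F, f, a_hat, a):
    """Bregman matching loss D_F(a_hat, a) = F(a_hat) - F(a) - f(a)(a_hat - a),
    where f is the transfer function and F its antiderivative."""
    return F(a_hat) - F(a) - f(a) * (a_hat - a)

# Identity transfer: F(x) = x^2 / 2, so the matching loss is (a_hat - a)^2 / 2.
loss = matching_loss(lambda x: x * x / 2.0, lambda x: x, 3.0, 1.0)
```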
SANTO: A Web-based Annotation Tool for Ontology-driven Slot Filling
Title | SANTO: A Web-based Annotation Tool for Ontology-driven Slot Filling |
Authors | Matthias Hartung, Hendrik ter Horst, Frank Grimm, Tim Diekmann, Roman Klinger, Philipp Cimiano |
Abstract | Supervised machine learning algorithms require training data whose generation for complex relation extraction tasks tends to be difficult. Being optimized for relation extraction at sentence level, many annotation tools offer little support for annotating relational structures that are widely spread across the text. This leads to non-intuitive and cumbersome visualizations, making the annotation process unnecessarily time-consuming. We propose SANTO, an easy-to-use, domain-adaptive annotation tool specialized for complex slot filling tasks which may involve problems of cardinality and referential grounding. The web-based architecture enables fast and clearly structured annotation for multiple users in parallel. Relational structures are formulated as templates following the conceptualization of an underlying ontology. Further, import and export procedures of standard formats enable interoperability with external sources and tools. |
Tasks | Knowledge Base Population, Reading Comprehension, Relation Extraction, Slot Filling |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-4012/ |
https://www.aclweb.org/anthology/P18-4012 | |
PWC | https://paperswithcode.com/paper/santo-a-web-based-annotation-tool-for |
Repo | |
Framework | |
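Cardinality-constrained slot filling of the kind SANTO supports can be illustrated with a tiny validator over an ontology-style template (a hypothetical schema format, not SANTO's actual data model):

```python
def validate_annotation(template, filled):
    """Check a filled template against per-slot cardinality bounds
    declared as {slot: (min_count, max_count)}."""
    errors = []
    for slot, (lo, hi) in template.items():
        n = len(filled.get(slot, []))
        if not lo <= n <= hi:
            errors.append(slot)
    return errors

# A treatment must be stated exactly once; up to two dosages are allowed.
template = {"treatment": (1, 1), "dosage": (0, 2)}
```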
Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems
Title | Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems |
Authors | Yair Carmon, John C. Duchi |
Abstract | We provide convergence rates for Krylov subspace solutions to the trust-region and cubic-regularized (nonconvex) quadratic problems. Such solutions may be efficiently computed by the Lanczos method and have long been used in practice. We prove error bounds of the form $1/t^2$ and $e^{-4t/\sqrt{\kappa}}$, where $\kappa$ is a condition number for the problem, and $t$ is the Krylov subspace order (number of Lanczos iterations). We also provide lower bounds showing that our analysis is sharp. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8269-analysis-of-krylov-subspace-solutions-of-regularized-non-convex-quadratic-problems |
http://papers.nips.cc/paper/8269-analysis-of-krylov-subspace-solutions-of-regularized-non-convex-quadratic-problems.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-krylov-subspace-solutions-of |
Repo | |
Framework | |
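The Krylov subspace of order t is built by t Lanczos iterations; a minimal tridiagonalization sketch (solving the trust-region or cubic subproblem inside the resulting small subspace is omitted here):

```python
import math

def lanczos(matvec, b, t):
    """Run t Lanczos iterations for a symmetric operator given as a
    matrix-vector product; returns the tridiagonal coefficients
    (diagonal alphas, off-diagonal betas)."""
    n = len(b)
    norm = math.sqrt(sum(x * x for x in b))
    q = [x / norm for x in b]
    q_prev = [0.0] * n
    alphas, betas = [], []
    beta = 0.0
    for _ in range(t):
        w = matvec(q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        beta = math.sqrt(sum(x * x for x in w))
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:  # invariant subspace found; stop early
            break
        q_prev, q = q, [x / beta for x in w]
    return alphas, betas
```

After a full run (t equal to the dimension, no breakdown) the tridiagonal matrix is orthogonally similar to the original, so its trace matches.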
Generalized Graph Embedding Models
Title | Generalized Graph Embedding Models |
Authors | Qiao Liu, Xiaohui Yang, Rui Wan, Shouzhong Tu, Zufeng Wu |
Abstract | Many types of relations in physical, biological, social and information systems can be modeled as homogeneous or heterogeneous concept graphs. Hence, learning from and with graph embeddings has drawn a great deal of research interest recently, but only ad hoc solutions have been obtained thus far. In this paper, we conjecture that the one-shot supervised learning mechanism is a bottleneck in improving the performance of the graph embedding learning algorithms, and propose to extend this by introducing a multi-shot unsupervised learning framework. Empirical results on several real-world data sets show that the proposed model consistently and significantly outperforms existing state-of-the-art approaches on knowledge base completion and graph based multi-label classification tasks. |
Tasks | Graph Embedding, Knowledge Base Completion, Multi-Label Classification |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJd0EAy0b |
https://openreview.net/pdf?id=SJd0EAy0b | |
PWC | https://paperswithcode.com/paper/generalized-graph-embedding-models |
Repo | |
Framework | |
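A typical one-shot supervised objective of the kind this multi-shot framework generalizes is translation-based scoring; a TransE-style sketch (a standard baseline, not the paper's model):

```python
import math

def transe_score(h, r, t):
    """Score a (head, relation, tail) triple as -||h + r - t||_2;
    scores closer to zero mean the triple is more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))
```

A triple fits perfectly when the tail embedding equals head plus relation, giving a score of exactly zero.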
Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences
Title | Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences |
Authors | Morgan Ulinski, Bob Coyne, Julia Hirschberg |
Abstract | |
Tasks | Coreference Resolution, Image Retrieval |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1237/ |
https://www.aclweb.org/anthology/L18-1237 | |
PWC | https://paperswithcode.com/paper/evaluating-the-wordseye-text-to-scene-system |
Repo | |
Framework | |
Diffusing Policies: Towards Wasserstein Policy Gradient Flows
Title | Diffusing Policies: Towards Wasserstein Policy Gradient Flows |
Authors | Pierre H. Richemond, Brendan Maginnis |
Abstract | Policy gradient methods often achieve better performance when the change in policy is limited to a small Kullback-Leibler divergence. We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region). This is done in the discrete and continuous multi-armed bandit settings with entropy regularisation. We show that in the small steps limit with respect to the Wasserstein distance $W_2$, policy dynamics are governed by the heat equation, following the Jordan-Kinderlehrer-Otto result. This means that policies undergo diffusion and advection, concentrating near actions with high reward. This helps elucidate the nature of convergence in the probability matching setup, and provides justification for empirical practices such as Gaussian policy priors and additive gradient noise. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rk3mjYRp- |
https://openreview.net/pdf?id=rk3mjYRp- | |
PWC | https://paperswithcode.com/paper/diffusing-policies-towards-wasserstein-policy |
Repo | |
Framework | |
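In the small-step limit the policy density evolves by the heat equation; a one-step explicit finite-difference sketch on a discretized 1D action space (reflecting boundaries; purely illustrative):

```python
def heat_step(p, dt, dx):
    """One explicit Euler step of the 1D heat equation with reflecting
    (Neumann) boundaries; total probability mass is conserved."""
    n = len(p)
    r = dt / (dx * dx)
    q = [0.0] * n
    for i in range(n):
        left = p[i - 1] if i > 0 else p[0]
        right = p[i + 1] if i < n - 1 else p[-1]
        q[i] = p[i] + r * (left - 2.0 * p[i] + right)
    return q

# Diffusion flattens the peaked policy while conserving probability.
policy = heat_step([0.1, 0.6, 0.3], dt=0.1, dx=1.0)
```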
We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed
Title | We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed |
Authors | Ant{'o}nio Branco |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1022/ |
https://www.aclweb.org/anthology/L18-1022 | |
PWC | https://paperswithcode.com/paper/we-are-depleting-our-research-subject-as-we |
Repo | |
Framework | |
Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions
Title | Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions |
Authors | Wenruo Bai, Jeff Bilmes |
Abstract | We analyze the performance of the greedy algorithm, and also a discrete semi-gradient based algorithm, for maximizing the sum of a suBmodular and suPermodular (BP) function (both of which are non-negative monotone non-decreasing) under two types of constraints, either a cardinality constraint or $p\geq 1$ matroid independence constraints. These problems occur naturally in several real-world applications in data science, machine learning, and artificial intelligence. The problems are ordinarily inapproximable to any factor. Using the curvature $\kappa_f$ of the submodular term, and introducing $\kappa^g$ for the supermodular term (a natural dual curvature for supermodular functions), however, both of which are computable in linear time, we show that BP maximization can be efficiently approximated by both the greedy and the semi-gradient based algorithm. The algorithms yield multiplicative guarantees of $\frac{1}{\kappa_f}\left[1-e^{-(1-\kappa^g)\kappa_f}\right]$ and $\frac{1-\kappa^g}{(1-\kappa^g)\kappa_f + p}$ for the two types of constraints respectively. For pure monotone supermodular constrained maximization, these yield $1-\kappa^g$ and $(1-\kappa^g)/p$ for the two types of constraints respectively. We also analyze the hardness of BP maximization and show that our guarantees match hardness by a constant factor and by $O(\ln(p))$ respectively. Computational experiments are also provided supporting our analysis. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2350 |
http://proceedings.mlr.press/v80/bai18a/bai18a.pdf | |
PWC | https://paperswithcode.com/paper/greed-is-still-good-maximizing-monotone |
Repo | |
Framework | |
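The greedy algorithm under a cardinality constraint, in minimal form; it is shown here on a plain monotone coverage function, with a BP objective f simply substituted in practice:

```python
def greedy_max(f, ground, k):
    """Greedily pick k elements, each maximizing the marginal gain
    f(S + e) - f(S)."""
    S = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in sorted(ground - S):
            gain = f(S | {e}) - f(S)
            if gain > best_gain:
                best, best_gain = e, gain
        S.add(best)
    return S

# Coverage function: number of distinct items covered by the chosen sets.
cover = {0: {0, 1}, 1: {1, 2}, 2: {2}}
f = lambda S: len(set().union(*(cover[i] for i in S))) if S else 0
chosen = greedy_max(f, {0, 1, 2}, 2)
```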
A Unified Framework for Structured Low-rank Matrix Learning
Title | A Unified Framework for Structured Low-rank Matrix Learning |
Authors | Pratik Jawanpuria, Bamdev Mishra |
Abstract | We consider the problem of learning a low-rank matrix, constrained to lie in a linear subspace, and introduce a novel factorization for modeling such matrices. A salient feature of the proposed factorization scheme is that it decouples the low-rank and the structural constraints onto separate factors. We formulate the optimization problem on the Riemannian spectrahedron manifold, where the Riemannian framework allows us to develop computationally efficient conjugate gradient and trust-region algorithms. Experiments on problems such as standard/robust/non-negative matrix completion, Hankel matrix learning and multi-task learning demonstrate the efficacy of our approach. |
Tasks | Matrix Completion, Multi-Task Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1930 |
http://proceedings.mlr.press/v80/jawanpuria18a/jawanpuria18a.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-structured-low-rank |
Repo | |
Framework | |
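The paper optimizes over a Riemannian spectrahedron manifold; as a far simpler Euclidean stand-in, here is a rank-1 factored stochastic-gradient sketch for matrix completion (illustrative only, not the authors' algorithm):

```python
import random

def rank1_complete(observed, shape, lr=0.05, steps=5000):
    """Fit M ≈ u vᵀ to the observed entries {(i, j): value} by
    stochastic gradient descent on the squared residuals."""
    random.seed(0)
    m, n = shape
    u = [random.random() for _ in range(m)]
    v = [random.random() for _ in range(n)]
    for _ in range(steps):
        for (i, j), x in observed.items():
            e = u[i] * v[j] - x
            u[i] -= lr * e * v[j]
            v[j] -= lr * e * u[i]
    return u, v

# Three observed entries of the rank-1 matrix [[1, 2], [2, 4]];
# the missing (1, 1) entry is predicted as u[1] * v[1].
u, v = rank1_complete({(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0}, (2, 2))
```

Any exact rank-1 fit of the three observed entries must predict 4 for the missing one, which the gradient iteration approaches.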
Using Universal Dependencies in cross-linguistic complexity research
Title | Using Universal Dependencies in cross-linguistic complexity research |
Authors | Aleksandrs Berdicevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama, Christian Bentz |
Abstract | We evaluate corpus-based measures of linguistic complexity obtained using Universal Dependencies (UD) treebanks. We propose a method of estimating robustness of the complexity values obtained using a given measure and a given treebank. The results indicate that measures of syntactic complexity might be on average less robust than those of morphological complexity. We also estimate the validity of complexity measures by comparing the results for very similar languages and checking for unexpected differences. We show that some of those differences that arise can be diminished by using parallel treebanks and, more importantly from the practical point of view, by harmonizing the language-specific solutions in the UD annotation. |
Tasks | |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6002/ |
https://www.aclweb.org/anthology/W18-6002 | |
PWC | https://paperswithcode.com/paper/using-universal-dependencies-in-cross |
Repo | |
Framework | |
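One toy instance of a corpus-based morphological measure of the kind being evaluated: the ratio of distinct FEATS bundles to tokens in a UD treebank column (the paper's actual measures differ):

```python
def morph_complexity(feats_column):
    """Ratio of distinct morphological feature bundles to tokens,
    skipping UD's '_' (no features) placeholder."""
    bundles = [f for f in feats_column if f != "_"]
    return len(set(bundles)) / len(bundles) if bundles else 0.0
```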
Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling
Title | Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling |
Authors | Shannon Mccurdy |
Abstract | Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results. We provide provable guarantees for deterministic column sampling using ridge leverage scores. The matrix sketch returned by our algorithm is a column subset of the original matrix, yielding additional interpretability. Like the randomized counterparts, the deterministic algorithm provides $(1+\epsilon)$ error column subset selection, $(1+\epsilon)$ error projection-cost preservation, and an additive-multiplicative spectral bound. We also show that under the assumption of power-law decay of ridge leverage scores, this deterministic algorithm is provably as accurate as randomized algorithms. Lastly, ridge regression is frequently used to regularize ill-posed linear least-squares problems. While ridge regression provides shrinkage for the regression coefficients, many of the coefficients remain small but non-zero. Performing ridge regression with the matrix sketch returned by our algorithm and a particular regularization parameter forces coefficients to zero and has a provable $(1+\epsilon)$ bound on the statistical risk. As such, it is an interesting alternative to elastic net regularization. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7513-ridge-regression-and-provable-deterministic-ridge-leverage-score-sampling |
http://papers.nips.cc/paper/7513-ridge-regression-and-provable-deterministic-ridge-leverage-score-sampling.pdf | |
PWC | https://paperswithcode.com/paper/ridge-regression-and-provable-deterministic |
Repo | |
Framework | |
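When the column spectrum is known (e.g. orthogonal columns with singular values s_i), ridge leverage scores take the closed form s_i²/(s_i² + λ); a sketch of deterministic thresholding on such scores (a simplification of the paper's column-subset algorithm):

```python
def ridge_leverage_scores(singular_values, lam):
    """Ridge leverage scores from a known spectrum: s^2 / (s^2 + lam).
    As lam -> 0 these approach the ordinary leverage scores."""
    return [s * s / (s * s + lam) for s in singular_values]

def deterministic_sample(scores, theta):
    """Keep exactly the columns whose ridge leverage score clears the
    threshold -- no randomness, so the output is always the same."""
    return [i for i, t in enumerate(scores) if t >= theta]
```

Determinism is the point: repeated runs return the same column subset, with no failure probability.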