Paper Group NANR 162
Transductive Non-linear Learning for Chinese Hypernym Prediction
Title | Transductive Non-linear Learning for Chinese Hypernym Prediction |
Authors | Chengyu Wang, Junchi Yan, Aoying Zhou, Xiaofeng He |
Abstract | Finding the correct hypernyms for entities is essential for taxonomy learning, fine-grained entity categorization, query understanding, etc. Due to the flexibility of the Chinese language, it is challenging to identify hypernyms in Chinese accurately. Rather than extracting hypernyms from texts, in this paper, we present a transductive learning approach to establish mappings from entities to hypernyms in the embedding space directly. It combines linear and non-linear embedding projection models, with the capacity of encoding arbitrary language-specific rules. Experiments on real-world datasets illustrate that our approach outperforms previous methods for Chinese hypernym prediction. |
Tasks | Relation Extraction |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1128/ |
https://www.aclweb.org/anthology/P17-1128 | |
PWC | https://paperswithcode.com/paper/transductive-non-linear-learning-for-chinese |
Repo | |
Framework | |
Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference
Title | Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference |
Authors | Aditya Chaudhry, Pan Xu, Quanquan Gu |
Abstract | Causal inference among high-dimensional time series data proves an important research problem in many fields. While in the classical regime one often establishes causality among time series via a concept known as “Granger causality,” existing approaches for Granger causal inference in high-dimensional data lack the means to characterize the uncertainty associated with Granger causality estimates (e.g., p-values and confidence intervals). We make two contributions in this work. First, we introduce a novel asymptotically unbiased Granger causality estimator with corresponding test statistics and confidence intervals to allow, for the first time, uncertainty characterization in high-dimensional Granger causal inference. Second, we introduce a novel method for false discovery rate control that achieves higher power in multiple testing than existing techniques and that can cope with dependent test statistics and dependent observations. We corroborate our theoretical results with experiments on both synthetic data and real-world climatological data. |
Tasks | Causal Inference, Time Series |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=479 |
http://proceedings.mlr.press/v70/chaudhry17a/chaudhry17a.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-assessment-and-false-discovery |
Repo | |
Framework | |
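For readers unfamiliar with false discovery rate control, which the abstract above refers to, the following is a minimal sketch of the classical Benjamini-Hochberg step-up procedure. It is not the method proposed in the paper (which, per the abstract, is designed to handle dependent test statistics and dependent observations); the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical Benjamini-Hochberg step-up procedure (illustrative baseline).

    Returns a list of booleans, one per hypothesis, where True means
    the hypothesis is rejected at target false discovery rate alpha.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Made-up p-values, for illustration only.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Because the procedure is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank threshold, as long as a larger-ranked p-value falls under its threshold.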
Asynchronous Coordinate Descent under More Realistic Assumptions
Title | Asynchronous Coordinate Descent under More Realistic Assumptions |
Authors | Tao Sun, Robert Hannah, Wotao Yin |
Abstract | Asynchronous-parallel algorithms have the potential to vastly speed up algorithms by eliminating costly synchronization. However, our understanding of these algorithms is limited because the current convergence theory of asynchronous block coordinate descent algorithms is based on somewhat unrealistic assumptions. In particular, the age of the shared optimization variables being used to update blocks is assumed to be independent of the block being updated. Additionally, it is assumed that the updates are applied to randomly chosen blocks. In this paper, we argue that these assumptions either fail to hold or will imply less efficient implementations. We then prove the convergence of asynchronous-parallel block coordinate descent under more realistic assumptions, in particular, always without the independence assumption. The analysis permits both the deterministic (essentially) cyclic and random rules for block choices. Because a bound on the asynchronous delays may or may not be available, we establish convergence for both bounded delays and unbounded delays. The analysis also covers nonconvex, weakly convex, and strongly convex functions. The convergence theory involves a Lyapunov function that directly incorporates both objective progress and delays. A continuous-time ODE is provided to motivate the construction at a high level. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7198-asynchronous-coordinate-descent-under-more-realistic-assumptions |
http://papers.nips.cc/paper/7198-asynchronous-coordinate-descent-under-more-realistic-assumptions.pdf | |
PWC | https://paperswithcode.com/paper/asynchronous-coordinate-descent-under-more |
Repo | |
Framework | |
Bidirectional learning for time-series models with hidden units
Title | Bidirectional learning for time-series models with hidden units |
Authors | Takayuki Osogami, Hiroshi Kajino, Taro Sekiyama |
Abstract | Hidden units can play essential roles in modeling time-series having long-term dependency or non-linearity but make it difficult to learn associated parameters. Here we propose a way to learn such a time-series model by training a backward model for the time-reversed time-series, where the backward model has a common set of parameters as the original (forward) model. Our key observation is that only a subset of the parameters is hard to learn, and that subset is complementary between the forward model and the backward model. By training both of the two models, we can effectively learn the values of the parameters that are hard to learn if only either of the two models is trained. We apply bidirectional learning to a dynamic Boltzmann machine extended with hidden units. Numerical experiments with synthetic and real datasets clearly demonstrate advantages of bidirectional learning. |
Tasks | Time Series |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=528 |
http://proceedings.mlr.press/v70/osogami17a/osogami17a.pdf | |
PWC | https://paperswithcode.com/paper/bidirectional-learning-for-time-series-models |
Repo | |
Framework | |
Predicting Counselor Behaviors in Motivational Interviewing Encounters
Title | Predicting Counselor Behaviors in Motivational Interviewing Encounters |
Authors | Verónica Pérez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence An, Kathy J. Goggin, Delwyn Catley |
Abstract | As the number of people receiving psycho-therapeutic treatment increases, the automatic evaluation of counseling practice arises as an important challenge in the clinical domain. In this paper, we address the automatic evaluation of counseling performance by analyzing counselors' language during their interaction with clients. In particular, we present a model towards the automation of Motivational Interviewing (MI) coding, which is the current gold standard to evaluate MI counseling. First, we build a dataset of hand-labeled MI encounters; second, we use text-based methods to extract and analyze linguistic patterns associated with counselor behaviors; and third, we develop an automatic system to predict these behaviors. We introduce a new set of features based on semantic information and syntactic patterns, and show that they lead to accuracy figures of up to 90%, which represent a significant improvement with respect to features used in the past. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1106/ |
https://www.aclweb.org/anthology/E17-1106 | |
PWC | https://paperswithcode.com/paper/predicting-counselor-behaviors-in |
Repo | |
Framework | |
Authorship Attribution Using Text Distortion
Title | Authorship Attribution Using Text Distortion |
Authors | Efstathios Stamatatos |
Abstract | Authorship attribution is associated with important applications in forensics and humanities research. A crucial point in this field is to quantify the personal style of writing, ideally in a way that is not affected by changes in topic or genre. In this paper, we present a novel method that enhances authorship attribution effectiveness by introducing a text distortion step before extracting stylometric measures. The proposed method attempts to mask topic-specific information that is not related to the personal style of authors. Based on experiments on two main tasks in authorship attribution, closed-set attribution and authorship verification, we demonstrate that the proposed approach can enhance existing methods especially under cross-topic conditions, where the training and test corpora do not match in topic. |
Tasks | Text Categorization |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1107/ |
https://www.aclweb.org/anthology/E17-1107 | |
PWC | https://paperswithcode.com/paper/authorship-attribution-using-text-distortion |
Repo | |
Framework | |
Complex Verbs are Different: Exploring the Visual Modality in Multi-Modal Models to Predict Compositionality
Title | Complex Verbs are Different: Exploring the Visual Modality in Multi-Modal Models to Predict Compositionality |
Authors | Maximilian Köper, Sabine Schulte im Walde |
Abstract | This paper compares a neural network DSM relying on textual co-occurrences with a multi-modal model integrating visual information. We focus on nominal vs. verbal compounds, and zoom into lexical, empirical and perceptual target properties to explore the contribution of the visual modality. Our experiments show that (i) visual features contribute differently for verbs than for nouns, and (ii) images complement textual information, if (a) the textual modality by itself is poor and appropriate image subsets are used, or (b) the textual modality by itself is rich and large (potentially noisy) images are added. |
Tasks | Semantic Textual Similarity |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1728/ |
https://www.aclweb.org/anthology/W17-1728 | |
PWC | https://paperswithcode.com/paper/complex-verbs-are-different-exploring-the |
Repo | |
Framework | |
Efficient Second-Order Online Kernel Learning with Adaptive Embedding
Title | Efficient Second-Order Online Kernel Learning with Adaptive Embedding |
Authors | Daniele Calandriello, Alessandro Lazaric, Michal Valko |
Abstract | Online kernel learning (OKL) is a flexible framework to approach prediction problems, since the large approximation space provided by reproducing kernel Hilbert spaces can contain an accurate function for the problem. Nonetheless, optimizing over this space is computationally expensive. Not only do first-order methods accumulate $\mathcal{O}(\sqrt{T})$ more loss than the optimal function, but the curse of kernelization results in an $\mathcal{O}(t)$ per-step complexity. Second-order methods get closer to the optimum much faster, suffering only $\mathcal{O}(\log(T))$ regret, but second-order updates are even more expensive, with an $\mathcal{O}(t^2)$ per-step cost. Existing approximate OKL methods try to reduce this complexity either by limiting the Support Vectors (SV) introduced in the predictor, or by avoiding the kernelization process altogether using embedding. Nonetheless, as long as the size of the approximation space or the number of SV does not grow over time, an adversary can always exploit the approximation process. In this paper, we propose PROS-N-KONS, a method that combines Nyström sketching to project the input point in a small, accurate embedded space, and performs efficient second-order updates in this space. The embedded space is continuously updated to guarantee that the embedding remains accurate, and we show that the per-step cost only grows with the effective dimension of the problem and not with $T$. Moreover, the second-order update allows us to achieve logarithmic regret. We empirically compare our algorithm on recent large-scale benchmarks and show it performs favorably. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7194-efficient-second-order-online-kernel-learning-with-adaptive-embedding |
http://papers.nips.cc/paper/7194-efficient-second-order-online-kernel-learning-with-adaptive-embedding.pdf | |
PWC | https://paperswithcode.com/paper/efficient-second-order-online-kernel-learning |
Repo | |
Framework | |
StruAP: A Tool for Bundling Linguistic Trees through Structure-based Abstract Pattern
Title | StruAP: A Tool for Bundling Linguistic Trees through Structure-based Abstract Pattern |
Authors | Kohsuke Yanai, Misa Sato, Toshihiko Yanase, Kenzo Kurotsuchi, Yuta Koreeda, Yoshiki Niwa |
Abstract | We present a tool for developing tree structure patterns that makes it easy to define the relations among textual phrases and create a search index for these newly defined relations. By using the proposed tool, users develop tree structure patterns through abstracting syntax trees. The tool features (1) intuitive pattern syntax, (2) unique functions such as recursive call of patterns and use of lexicon dictionaries, and (3) whole workflow support for relation development and validation. We report the current implementation of the tool and its effectiveness. |
Tasks | Decision Making, Information Retrieval, Relation Extraction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-2006/ |
https://www.aclweb.org/anthology/D17-2006 | |
PWC | https://paperswithcode.com/paper/struap-a-tool-for-bundling-linguistic-trees |
Repo | |
Framework | |
Gated Self-Matching Networks for Reading Comprehension and Question Answering
Title | Gated Self-Matching Networks for Reading Comprehension and Question Answering |
Authors | Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, Ming Zhou |
Abstract | In this paper, we present the gated self-matching networks for reading comprehension style question answering, which aims to answer questions from a given passage. We first match the question and passage with gated attention-based recurrent networks to obtain the question-aware passage representation. Then we propose a self-matching attention mechanism to refine the representation by matching the passage against itself, which effectively encodes information from the whole passage. We finally employ the pointer networks to locate the positions of answers from the passages. We conduct extensive experiments on the SQuAD dataset. The single model achieves 71.3% on the evaluation metrics of exact match on the hidden test set, while the ensemble model further boosts the results to 75.9%. At the time of submission of the paper, our model holds the first place on the SQuAD leaderboard for both single and ensemble model. |
Tasks | Question Answering, Reading Comprehension |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1018/ |
https://www.aclweb.org/anthology/P17-1018 | |
PWC | https://paperswithcode.com/paper/gated-self-matching-networks-for-reading |
Repo | |
Framework | |
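The exact match metric cited in the abstract above can be sketched as follows. This is a minimal illustration in the style of the SQuAD v1.1 evaluation convention (lowercasing, then stripping punctuation, English articles, and extra whitespace before comparing), not the authors' code; the function names and example strings are assumptions.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """True if the normalized prediction equals any normalized gold answer."""
    normalized = normalize_answer(prediction)
    return any(normalized == normalize_answer(gold) for gold in gold_answers)


def exact_match_score(predictions, references):
    """Percentage of questions whose prediction exactly matches a gold answer."""
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Made-up predictions and gold answers, for illustration only.
print(exact_match("The Eiffel Tower.", ["Eiffel Tower"]))
print(exact_match_score(["Paris", "in 1999"], [["Paris"], ["1998"]]))
```

A score such as the reported 71.3% corresponds to `exact_match_score` computed over every question in the test set.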
Results of the WMT17 Neural MT Training Task
Title | Results of the WMT17 Neural MT Training Task |
Authors | Ondřej Bojar, Jindřich Helcl, Tom Kocmi, Jindřich Libovický, Tomáš Musil |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4757/ |
https://www.aclweb.org/anthology/W17-4757 | |
PWC | https://paperswithcode.com/paper/results-of-the-wmt17-neural-mt-training-task |
Repo | |
Framework | |
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing
Title | Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing |
Authors | |
Abstract | |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2400/ |
https://www.aclweb.org/anthology/W17-2400 | |
PWC | https://paperswithcode.com/paper/proceedings-of-textgraphs-11-the-workshop-on |
Repo | |
Framework | |
SuperAgent: A Customer Service Chatbot for E-commerce Websites
Title | SuperAgent: A Customer Service Chatbot for E-commerce Websites |
Authors | Lei Cui, Shaohan Huang, Furu Wei, Chuanqi Tan, Chaoqun Duan, Ming Zhou |
Abstract | |
Tasks | Chatbot, Opinion Mining, Question Answering |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-4017/ |
https://www.aclweb.org/anthology/P17-4017 | |
PWC | https://paperswithcode.com/paper/superagent-a-customer-service-chatbot-for-e |
Repo | |
Framework | |
Coherent probabilistic forecasts for hierarchical time series
Title | Coherent probabilistic forecasts for hierarchical time series |
Authors | Souhaib Ben Taieb, James W. Taylor, Rob J. Hyndman |
Abstract | Many applications require forecasts for a hierarchy comprising a set of time series along with aggregates of subsets of these series. Hierarchical forecasting requires not only good prediction accuracy at each level of the hierarchy, but also coherency between different levels: the property that forecasts add up appropriately across the hierarchy. A fundamental limitation of prior research is the focus on forecasting the mean of each time series. We consider the situation where probabilistic forecasts are needed for each series in the hierarchy, and propose an algorithm to compute predictive distributions rather than mean forecasts only. Our algorithm has the advantage of synthesizing information from different levels in the hierarchy through a sparse forecast combination and a probabilistic hierarchical aggregation. We evaluate the accuracy of our forecasting algorithm on both simulated data and large-scale electricity smart meter data. The results show consistent performance gains compared to state-of-the-art methods. |
Tasks | Time Series |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=633 |
http://proceedings.mlr.press/v70/taieb17a/taieb17a.pdf | |
PWC | https://paperswithcode.com/paper/coherent-probabilistic-forecasts-for |
Repo | |
Framework | |
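The coherency property described in the abstract above (forecasts adding up across the hierarchy) can be illustrated with a small sketch. It uses simple bottom-up aggregation of mean forecasts, which is a standard baseline and not the probabilistic algorithm proposed in the paper; the two-level hierarchy, series names, and numbers are all hypothetical.

```python
# Hypothetical two-level hierarchy: "total" aggregates series "a" and "b".
AGGREGATES = {"total": ["a", "b"]}


def bottom_up(bottom_forecasts, aggregates=AGGREGATES):
    """Build coherent mean forecasts by summing the bottom-level forecasts."""
    coherent = dict(bottom_forecasts)
    for parent, children in aggregates.items():
        coherent[parent] = sum(bottom_forecasts[child] for child in children)
    return coherent


def is_coherent(forecasts, aggregates=AGGREGATES, tol=1e-9):
    """Check that every aggregate forecast equals the sum of its children."""
    return all(
        abs(forecasts[parent] - sum(forecasts[child] for child in children)) <= tol
        for parent, children in aggregates.items()
    )


# Forecasts produced independently per series need not add up (made-up values).
independent = {"a": 10.0, "b": 5.0, "total": 16.5}
print(is_coherent(independent))
print(is_coherent(bottom_up({"a": 10.0, "b": 5.0})))
```

Bottom-up aggregation guarantees coherency by construction but ignores information available at the aggregate level, which is the kind of limitation that forecast combination across levels aims to address.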
The Role of Prosody and Speech Register in Word Segmentation: A Computational Modelling Perspective
Title | The Role of Prosody and Speech Register in Word Segmentation: A Computational Modelling Perspective |
Authors | Bogdan Ludusan, Reiko Mazuka, Mathieu Bernard, Alejandrina Cristia, Emmanuel Dupoux |
Abstract | This study explores the role of speech register and prosody for the task of word segmentation. Since these two factors are thought to play an important role in early language acquisition, we aim to quantify their contribution for this task. We study a Japanese corpus containing both infant- and adult-directed speech and we apply four different word segmentation models, with and without knowledge of prosodic boundaries. The results showed that the difference between registers is smaller than previously reported and that prosodic boundary information helps more adult- than infant-directed speech. |
Tasks | Language Acquisition |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2028/ |
https://www.aclweb.org/anthology/P17-2028 | |
PWC | https://paperswithcode.com/paper/the-role-of-prosody-and-speech-register-in |
Repo | |
Framework | |