Paper Group NANR 78
High-Rank Matrix Completion and Clustering under Self-Expressive Models
Title | High-Rank Matrix Completion and Clustering under Self-Expressive Models |
Authors | Ehsan Elhamifar |
Abstract | We propose efficient algorithms for simultaneous clustering and completion of incomplete high-dimensional data that lie in a union of low-dimensional subspaces. We cast the problem as finding a completion of the data matrix so that each point can be reconstructed as a linear or affine combination of a few data points. Since the problem is NP-hard, we propose a lifting framework and reformulate the problem as a group-sparse recovery of each incomplete data point in a dictionary built using incomplete data, subject to rank-one constraints. To solve the problem efficiently, we propose a rank pursuit algorithm and a convex relaxation. The solutions of our algorithms recover missing entries and provide a similarity matrix for clustering. Our algorithms can deal with both low-rank and high-rank matrices, do not suffer from initialization, do not need to know the dimensions of the subspaces, and can work with a small number of data points. By extensive experiments on synthetic data and real problems of video motion segmentation and completion of motion capture data, we show that when the data matrix is low-rank, our algorithm performs on par with or better than low-rank matrix completion methods, while for high-rank data matrices, our method significantly outperforms existing algorithms. |
Tasks | Low-Rank Matrix Completion, Matrix Completion, Motion Capture, Motion Segmentation |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6357-high-rank-matrix-completion-and-clustering-under-self-expressive-models |
http://papers.nips.cc/paper/6357-high-rank-matrix-completion-and-clustering-under-self-expressive-models.pdf | |
PWC | https://paperswithcode.com/paper/high-rank-matrix-completion-and-clustering |
Repo | |
Framework | |
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
Title | Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3) |
Authors | |
Abstract | |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4800/ |
https://www.aclweb.org/anthology/W16-4800 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-third-workshop-on-nlp-for |
Repo | |
Framework | |
Forecasting Emerging Trends from Scientific Literature
Title | Forecasting Emerging Trends from Scientific Literature |
Authors | Kartik Asooja, Georgeta Bordea, Gabriela Vulcu, Paul Buitelaar |
Abstract | Text analysis methods for the automatic identification of emerging technologies by analyzing scientific publications are gaining attention because of their socio-economic impact. The approaches so far have mainly focused on retrospective analysis by mapping scientific topic evolution over time. We propose regression-based approaches to predict future keyword distribution. The prediction is based on historical data of the keywords, which in our case are the LREC conference proceedings. Considering the insufficient number of data points available from LREC proceedings, we do not employ standard time series forecasting methods. We form a dataset by extracting the keywords from previous years' proceedings and quantify their yearly relevance using tf-idf scores. This dataset additionally contains ranked lists of related keywords and experts for each keyword. |
Tasks | Time Series, Time Series Forecasting |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1066/ |
https://www.aclweb.org/anthology/L16-1066 | |
PWC | https://paperswithcode.com/paper/forecasting-emerging-trends-from-scientific |
Repo | |
Framework | |
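The abstract above describes scoring keywords per year with tf-idf and predicting future scores by regression. A minimal sketch of that idea follows; it is not the authors' implementation. The smoothed idf formula, the straight-line least-squares fit, the function names, and the toy counts are all assumptions made for illustration.

```python
import math


def tfidf_by_year(keyword_counts):
    """Map {year: {keyword: count}} to {year: {keyword: tf-idf score}}.

    Each year's proceedings are treated as one document (an assumption).
    """
    n_years = len(keyword_counts)
    # Document frequency: the number of years in which a keyword occurs.
    doc_freq = {}
    for counts in keyword_counts.values():
        for keyword in counts:
            doc_freq[keyword] = doc_freq.get(keyword, 0) + 1

    scores = {}
    for year, counts in keyword_counts.items():
        total = sum(counts.values())
        scores[year] = {
            # idf = log(N / df) + 1 keeps keywords seen every year above zero.
            keyword: (count / total) * (math.log(n_years / doc_freq[keyword]) + 1.0)
            for keyword, count in counts.items()
        }
    return scores


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(scores, keyword, target_year):
    """Extrapolate a keyword's yearly tf-idf score to target_year."""
    years = sorted(scores)
    values = [scores[year].get(keyword, 0.0) for year in years]
    slope, intercept = fit_line(years, values)
    return slope * target_year + intercept


# Hypothetical counts for three biennial editions.
toy_counts = {
    2012: {"a": 1, "b": 9},
    2014: {"a": 2, "b": 8},
    2016: {"a": 3, "b": 7},
}
toy_scores = tfidf_by_year(toy_counts)
prediction = forecast(toy_scores, "a", 2018)
```

With only a handful of yearly data points per keyword, a low-capacity model such as a straight line is one plausible choice; the paper may well use different regressors and features.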
If You Even Don’t Have a Bit of Bible: Learning Delexicalized POS Taggers
Title | If You Even Don’t Have a Bit of Bible: Learning Delexicalized POS Taggers |
Authors | Zhiwei Yu, David Mareček, Zdeněk Žabokrtský, Daniel Zeman |
Abstract | Part-of-speech (POS) induction is one of the most popular tasks in research on unsupervised NLP. Various unsupervised and semi-supervised methods have been proposed to tag an unseen language. However, many of them require some partial understanding of the target language because they rely on dictionaries or parallel corpora such as the Bible. In this paper, we propose a different method named delexicalized tagging, for which we only need a raw corpus of the target language. We transfer tagging models trained on annotated corpora of one or more resource-rich languages. We employ language-independent features such as word length, frequency, neighborhood entropy, character classes (alphabetic vs. numeric vs. punctuation), etc. We demonstrate that such features can, to a certain extent, serve as predictors of the part of speech, represented by the universal POS tag. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1015/ |
https://www.aclweb.org/anthology/L16-1015 | |
PWC | https://paperswithcode.com/paper/if-you-even-dont-have-a-bit-of-bible-learning |
Repo | |
Framework | |
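The language-independent features named in the abstract above (word length, frequency, neighborhood entropy, character classes) can be computed from a raw token list alone. The sketch below is illustrative rather than the authors' feature extractor: the feature names, the use of right-neighbor entropy only, and the extra "mixed" character class are assumptions.

```python
import math
from collections import Counter, defaultdict


def char_class(word):
    """Coarse character class of a token."""
    if word.isalpha():
        return "alpha"
    if word.isdigit():
        return "numeric"
    if all(not ch.isalnum() for ch in word):
        return "punct"
    return "mixed"


def entropy(counter):
    """Shannon entropy (in bits) of a Counter of neighbor counts."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def delexicalized_features(tokens):
    """Return {word type: feature dict} without using the word's identity."""
    freq = Counter(tokens)
    right_neighbors = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        right_neighbors[current][following] += 1

    features = {}
    for word, count in freq.items():
        neighbors = right_neighbors.get(word)
        features[word] = {
            "length": len(word),
            "rel_freq": count / len(tokens),
            # Entropy of the distribution over words that follow this word.
            "right_entropy": entropy(neighbors) if neighbors else 0.0,
            "char_class": char_class(word),
        }
    return features


toy_tokens = ["the", "cat", "saw", "the", "dog", "."]
toy_features = delexicalized_features(toy_tokens)
```

Because none of these features depends on the word form itself, a tagger trained on them in one language can in principle be applied to the raw corpus of another, which is the transfer setting the abstract describes.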
A Low-Rank Approximation Approach to Learning Joint Embeddings of News Stories and Images for Timeline Summarization
Title | A Low-Rank Approximation Approach to Learning Joint Embeddings of News Stories and Images for Timeline Summarization |
Authors | William Yang Wang, Yashar Mehdad, Dragomir R. Radev, Amanda Stent |
Abstract | |
Tasks | Feature Engineering, Recommendation Systems, Representation Learning, Timeline Summarization |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1008/ |
https://www.aclweb.org/anthology/N16-1008 | |
PWC | https://paperswithcode.com/paper/a-low-rank-approximation-approach-to-learning |
Repo | |
Framework | |
SemAligner: A Method and Tool for Aligning Chunks with Semantic Relation Types and Semantic Similarity Scores
Title | SemAligner: A Method and Tool for Aligning Chunks with Semantic Relation Types and Semantic Similarity Scores |
Authors | Nabin Maharjan, Rajendra Banjade, Nobal Bikram Niraula, Vasile Rus |
Abstract | This paper introduces a rule-based method and software tool, called SemAligner, for aligning chunks across texts in a given pair of short English texts. The tool, based on the top-performing method at the Interpretable Short Text Similarity shared task at SemEval 2015, where it was used with human-annotated (gold) chunks, can now additionally process plain text-pairs using two powerful chunkers we developed, e.g. using Conditional Random Fields. Besides aligning chunks, the tool automatically assigns semantic relations to the aligned chunks (such as EQUI for equivalent and OPPO for opposite) and semantic similarity scores that measure the strength of the semantic relation between the aligned chunks. Experiments show that SemAligner performs competitively for system-generated chunks and that these results are also comparable to results obtained on gold chunks. SemAligner has other capabilities such as handling various input formats and chunkers as well as extending lookup resources. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1192/ |
https://www.aclweb.org/anthology/L16-1192 | |
PWC | https://paperswithcode.com/paper/semaligner-a-method-and-tool-for-aligning |
Repo | |
Framework | |
Joining-in-type Humanoid Robot Assisted Language Learning System
Title | Joining-in-type Humanoid Robot Assisted Language Learning System |
Authors | AlBara Khalifa, Tsuneo Kato, Seiichi Yamamoto |
Abstract | Dialogue robots are attractive to people, and in language learning systems, they motivate learners and let them practice conversational skills in a more realistic environment. However, automatic speech recognition (ASR) of second language (L2) learners is still a challenge, because their speech not only contains pronunciation, lexical, and grammatical errors, but is sometimes totally disordered. Hence, we propose a novel robot assisted language learning (RALL) system using two robots, one as a teacher and the other as an advanced learner. The system is designed to simulate multiparty conversation, expecting implicit learning and enhancement of the predictability of learners' utterances through an alignment similar to "interactive alignment", which is observed in human-human conversation. We collected a database with the prototypes, and measured in an initial analysis how much of the alignment phenomenon is observed in the database. |
Tasks | Speech Recognition |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1037/ |
https://www.aclweb.org/anthology/L16-1037 | |
PWC | https://paperswithcode.com/paper/joining-in-type-humanoid-robot-assisted |
Repo | |
Framework | |
Protein contact prediction from amino acid co-evolution using convolutional networks for graph-valued images
Title | Protein contact prediction from amino acid co-evolution using convolutional networks for graph-valued images |
Authors | Vladimir Golkov, Marcin J. Skwark, Antonij Golkov, Alexey Dosovitskiy, Thomas Brox, Jens Meiler, Daniel Cremers |
Abstract | Proteins are the “building blocks of life”, the most abundant organic molecules, and the central focus of most areas of biomedicine. Protein structure is strongly related to protein function, thus structure prediction is a crucial task on the way to solving many biological questions. A contact map is a compact representation of the three-dimensional structure of a protein via the pairwise contacts between the amino acids constituting the protein. We use a convolutional network to calculate protein contact maps from inferred statistical coupling between positions in the protein sequence. The input to the network has an image-like structure amenable to convolutions, but every “pixel” instead of color channels contains a bipartite undirected edge-weighted graph. We propose several methods for treating such “graph-valued images” in a convolutional network. The proposed method outperforms state-of-the-art methods by a large margin. It also allows for great flexibility with regard to the input data, which makes it useful for studying a wide range of problems. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6488-protein-contact-prediction-from-amino-acid-co-evolution-using-convolutional-networks-for-graph-valued-images |
http://papers.nips.cc/paper/6488-protein-contact-prediction-from-amino-acid-co-evolution-using-convolutional-networks-for-graph-valued-images.pdf | |
PWC | https://paperswithcode.com/paper/protein-contact-prediction-from-amino-acid-co |
Repo | |
Framework | |
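The contact map mentioned in the abstract above is the prediction target, and it can be illustrated by computing one directly from known coordinates. This sketch does not implement the paper's convolutional network; it only shows what a contact map is. The one-point-per-residue representation, the 8 Å distance threshold, and the toy coordinates are assumptions (a cutoff of this size is a common convention, but the paper's exact definition is not given here).

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric boolean matrix of pairwise residue contacts.

    coords: one (x, y, z) tuple per residue, in angstroms (assumed).
    Two distinct residues are in contact if their distance is at most
    the threshold; the diagonal is left as False.
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                contacts[i][j] = True
                contacts[j][i] = True
    return contacts


# Hypothetical three-residue chain laid out along the x axis.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

The resulting matrix is indexed by residue pairs, which is why the network input described in the abstract has an image-like layout over sequence positions.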
Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation
Title | Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation |
Authors | Domagoj Alagić, Jan Šnajder |
Abstract | We introduce Cro36WSD, a freely available medium-sized lexical sample for Croatian word sense disambiguation (WSD). Cro36WSD comprises 36 words: 12 adjectives, 12 nouns, and 12 verbs, balanced across both frequency bands and polysemy levels. We adopt the multi-label annotation scheme in the hope of lessening the drawbacks of discrete sense inventories and obtaining more realistic annotations from human experts. Sense-annotated data is collected through multiple annotation rounds to ensure high-quality annotations: with a 115 person-hour effort we reached an inter-annotator agreement score of 0.877. We analyze the obtained data and perform a correlation analysis between several relevant variables, including word frequency, number of senses, sense distribution skewness, average annotation time, and the observed inter-annotator agreement (IAA). Using the obtained data, we compile multi- and single-labeled dataset variants using different label aggregation schemes. Finally, we evaluate three different baseline WSD models on both dataset variants and report on the insights gained. We make both dataset variants freely available. |
Tasks | Word Sense Disambiguation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1267/ |
https://www.aclweb.org/anthology/L16-1267 | |
PWC | https://paperswithcode.com/paper/cro36wsd-a-lexical-sample-for-croatian-word |
Repo | |
Framework | |
Verbal fields in Hungarian simple sentences and infinitival clausal complements
Title | Verbal fields in Hungarian simple sentences and infinitival clausal complements |
Authors | Kata Balogh |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-3306/ |
https://www.aclweb.org/anthology/W16-3306 | |
PWC | https://paperswithcode.com/paper/verbal-fields-in-hungarian-simple-sentences |
Repo | |
Framework | |
Annotating Temporally-Anchored Spatial Knowledge on Top of OntoNotes Semantic Roles
Title | Annotating Temporally-Anchored Spatial Knowledge on Top of OntoNotes Semantic Roles |
Authors | Alakananda Vempala, Eduardo Blanco |
Abstract | This paper presents a two-step methodology to annotate spatial knowledge on top of OntoNotes semantic roles. First, we manipulate semantic roles to automatically generate potential additional spatial knowledge. Second, we crowdsource annotations with Amazon Mechanical Turk to either validate or discard the potential additional spatial knowledge. The resulting annotations indicate whether entities are or are not located somewhere with a degree of certainty, and temporally anchor this spatial information. Crowdsourcing experiments show that the additional spatial knowledge is ubiquitous and intuitive to humans, and experimental results show that it can be inferred automatically using standard supervised machine learning techniques. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1604/ |
https://www.aclweb.org/anthology/L16-1604 | |
PWC | https://paperswithcode.com/paper/annotating-temporally-anchored-spatial-2 |
Repo | |
Framework | |
Decomposing Bilexical Dependencies into Semantic and Syntactic Vectors
Title | Decomposing Bilexical Dependencies into Semantic and Syntactic Vectors |
Authors | Jeff Mitchell |
Abstract | |
Tasks | Chunking, Language Modelling, Named Entity Recognition, Part-Of-Speech Tagging, Representation Learning, Semantic Role Labeling |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1615/ |
https://www.aclweb.org/anthology/W16-1615 | |
PWC | https://paperswithcode.com/paper/decomposing-bilexical-dependencies-into |
Repo | |
Framework | |
Keyphrase Extraction Using Deep Recurrent Neural Networks on Twitter
Title | Keyphrase Extraction Using Deep Recurrent Neural Networks on Twitter |
Authors | Qi Zhang, Yang Wang, Yeyun Gong, Xuanjing Huang |
Abstract | |
Tasks | |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1080/ |
https://www.aclweb.org/anthology/D16-1080 | |
PWC | https://paperswithcode.com/paper/keyphrase-extraction-using-deep-recurrent |
Repo | |
Framework | |
Generating Paraphrases from DBPedia using Deep Learning
Title | Generating Paraphrases from DBPedia using Deep Learning |
Authors | Amin Sleimi, Claire Gardent |
Abstract | |
Tasks | Language Modelling, Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3511/ |
https://www.aclweb.org/anthology/W16-3511 | |
PWC | https://paperswithcode.com/paper/generating-paraphrases-from-dbpedia-using |
Repo | |
Framework | |
ReadME generation from an OWL ontology describing NLP tools
Title | ReadME generation from an OWL ontology describing NLP tools |
Authors | Driss Sadoun, Satenik Mkhitaryan, Damien Nouvel, Mathieu Valette |
Abstract | |
Tasks | Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3509/ |
https://www.aclweb.org/anthology/W16-3509 | |
PWC | https://paperswithcode.com/paper/readme-generation-from-an-owl-ontology |
Repo | |
Framework | |