Paper Group NANR 41
Multi-task Learning on MNIST Image Datasets. Comparing Bayesian Models of Annotation. Analyzing Middle High German Syntax with RDF and SPARQL. Modeling Speech Acts in Asynchronous Conversations: A Neural-CRF Approach. Cross-checking WordNet and SUMO Using Meronymy. Extending the Framework of Equilibrium Propagation to General Dynamics. Multiple Granularity Group Interaction Prediction. Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus. A Review of Standard Text Classification Practices for Multi-label Toxicity Identification of Online Content. Camera Pose Estimation With Unknown Principal Point. Modeling Dynamic Missingness of Implicit Feedback for Recommendation. BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools. Di-LSTM Contrast : A Deep Neural Network for Metaphor Detection. Accurate SHRG-Based Semantic Parsing. Generating Differentially Private Datasets Using GANs.
Multi-task Learning on MNIST Image Datasets
Title | Multi-task Learning on MNIST Image Datasets |
Authors | Po-Chen Hsieh, Chia-Ping Chen |
Abstract | We apply multi-task learning to image classification tasks on MNIST-like datasets. The MNIST dataset has been referred to as the drosophila of machine learning and has been the testbed of many learning theories. The NotMNIST dataset and the FashionMNIST dataset have been created with the MNIST dataset as reference. In this work, we exploit these MNIST-like datasets for multi-task learning. The datasets are pooled together for learning the parameters of joint classification networks. Then the learned parameters are used as the initial parameters to retrain disjoint classification networks. The baseline recognition models are all-convolutional neural networks. Without multi-task learning, the recognition accuracies for MNIST, NotMNIST and FashionMNIST are 99.56%, 97.22% and 94.32% respectively. With multi-task learning to pre-train the networks, the recognition accuracies are respectively 99.70%, 97.46% and 95.25%. The results re-affirm that the multi-task learning framework, even with data from different genres, does lead to significant improvement. |
Tasks | Image Classification, Multi-Task Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=S1PWi_lC- |
https://openreview.net/pdf?id=S1PWi_lC- | |
PWC | https://paperswithcode.com/paper/multi-task-learning-on-mnist-image-datasets |
Repo | |
Framework | |
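A minimal sketch of the pooled pre-training / per-dataset fine-tuning scheme the abstract above describes, written with tf.keras and restricted to MNIST plus Fashion-MNIST (NotMNIST is not bundled with Keras). The exact all-convolutional architecture and the joint 20-class output are assumptions, not the paper's configuration.

```python
import numpy as np
import tensorflow as tf

def build_allconv(num_classes):
    # A small all-convolutional classifier for 28x28 grayscale images (illustrative).
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu"),
        tf.keras.layers.Conv2D(128, 3, strides=2, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

(xm, ym), _ = tf.keras.datasets.mnist.load_data()
(xf, yf), _ = tf.keras.datasets.fashion_mnist.load_data()
x = np.concatenate([xm, xf])[..., None] / 255.0
y = np.concatenate([ym, yf + 10])              # disjoint label ranges: 20 joint classes

joint = build_allconv(20)
joint.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
joint.fit(x, y, epochs=1, batch_size=128)      # multi-task pre-training on pooled data

mnist_only = build_allconv(10)
# Copy the shared convolutional weights; the final classifier layer is re-initialised.
for src, dst in zip(joint.layers[:-1], mnist_only.layers[:-1]):
    dst.set_weights(src.get_weights())
mnist_only.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
mnist_only.fit(xm[..., None] / 255.0, ym, epochs=1, batch_size=128)
```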
Comparing Bayesian Models of Annotation
Title | Comparing Bayesian Models of Annotation |
Authors | Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio |
Abstract | The analysis of crowdsourced annotations in natural language processing is concerned with identifying (1) gold standard labels, (2) annotator accuracies and biases, and (3) item difficulties and error patterns. Traditionally, majority voting was used for (1), and coefficients of agreement for (2) and (3). Lately, model-based analyses of corpus annotations have proven better at all three tasks. But there has been relatively little work comparing them on the same datasets. This paper aims to fill this gap by analyzing six models of annotation, covering different approaches to annotator ability, item difficulty, and parameter pooling (tying) across annotators and items. We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators. We conclude with guidelines for model selection, application, and implementation. |
Tasks | Model Selection |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/Q18-1040/ |
https://www.aclweb.org/anthology/Q18-1040 | |
PWC | https://paperswithcode.com/paper/comparing-bayesian-models-of-annotation |
Repo | |
Framework | |
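For contrast with the Bayesian models compared above, here is a sketch of the traditional baselines the abstract mentions: majority voting for gold labels and agreement-with-majority as a crude annotator-accuracy estimate. The (item, annotator, label) tuple layout is an assumption for illustration.

```python
from collections import Counter, defaultdict

annotations = [  # (item_id, annotator_id, label) -- toy example
    ("i1", "a1", "POS"), ("i1", "a2", "POS"), ("i1", "a3", "NEG"),
    ("i2", "a1", "NEG"), ("i2", "a2", "NEG"), ("i2", "a3", "NEG"),
]

by_item = defaultdict(list)
for item, annotator, label in annotations:
    by_item[item].append((annotator, label))

# (1) Majority-vote "gold" labels.
majority = {item: Counter(l for _, l in votes).most_common(1)[0][0]
            for item, votes in by_item.items()}

# (2) Agreement with the majority as a rough per-annotator accuracy.
hits, totals = defaultdict(int), defaultdict(int)
for item, votes in by_item.items():
    for annotator, label in votes:
        totals[annotator] += 1
        hits[annotator] += int(label == majority[item])
accuracy = {a: hits[a] / totals[a] for a in totals}
print(majority, accuracy)
```

The Bayesian models evaluated in the paper replace these point estimates with probabilistic ones that pool information across annotators and items.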
Analyzing Middle High German Syntax with RDF and SPARQL
Title | Analyzing Middle High German Syntax with RDF and SPARQL |
Authors | Christian Chiarcos, Benjamin Kosmehl, Christian Fäth, Maria Sukhareva |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1717/ |
https://www.aclweb.org/anthology/L18-1717 | |
PWC | https://paperswithcode.com/paper/analyzing-middle-high-german-syntax-with-rdf |
Repo | |
Framework | |
Modeling Speech Acts in Asynchronous Conversations: A Neural-CRF Approach
Title | Modeling Speech Acts in Asynchronous Conversations: A Neural-CRF Approach |
Authors | Shafiq Joty, Tasnim Mohiuddin |
Abstract | Participants in an asynchronous conversation (e.g., forum, e-mail) interact with each other at different times, performing certain communicative acts, called speech acts (e.g., question, request). In this article, we propose a hybrid approach to speech act recognition in asynchronous conversations. Our approach works in two main steps: a long short-term memory recurrent neural network (LSTM-RNN) first encodes each sentence separately into a task-specific distributed representation, and this is then used in a conditional random field (CRF) model to capture the conversational dependencies between sentences. The LSTM-RNN model uses pretrained word embeddings learned from a large conversational corpus and is trained to classify sentences into speech act types. The CRF model can consider arbitrary graph structures to model conversational dependencies in an asynchronous conversation. In addition, to mitigate the problem of limited annotated data in the asynchronous domains, we adapt the LSTM-RNN model to learn from synchronous conversations (e.g., meetings), using domain adversarial training of neural networks. Empirical evaluation shows the effectiveness of our approach over existing ones: (i) LSTM-RNNs provide better task-specific representations, (ii) conversational word embeddings benefit the LSTM-RNNs more than the off-the-shelf ones, (iii) adversarial training gives better domain-invariant representations, and (iv) the global CRF model improves over local models. |
Tasks | Word Embeddings |
Published | 2018-12-01 |
URL | https://www.aclweb.org/anthology/J18-4012/ |
https://www.aclweb.org/anthology/J18-4012 | |
PWC | https://paperswithcode.com/paper/modeling-speech-acts-in-asynchronous |
Repo | |
Framework | |
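A rough PyTorch sketch of the two-step pipeline described in the abstract above: an LSTM encodes each sentence into a vector, and a CRF labels the sequence of sentences in a conversation with speech-act tags. It assumes the third-party pytorch-crf package and a linear-chain CRF; the paper's CRF supports arbitrary conversational graphs, and the adversarial domain adaptation is not reproduced. Dimensions are illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf (assumed dependency)

class SpeechActTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, num_tags=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)           # ideally pre-trained, per the paper
        self.sent_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * hid_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def encode_sentences(self, token_ids):                     # (num_sents, max_len)
        _, (h, _) = self.sent_lstm(self.emb(token_ids))
        return torch.cat([h[-2], h[-1]], dim=-1)               # last forward/backward states

    def forward(self, token_ids, tags=None):
        sent_reprs = self.encode_sentences(token_ids)          # (num_sents, 2*hid)
        em = self.emissions(sent_reprs).unsqueeze(0)           # (1, num_sents, num_tags)
        if tags is not None:
            return -self.crf(em, tags.unsqueeze(0))            # negative log-likelihood
        return self.crf.decode(em)[0]                          # best speech-act sequence

model = SpeechActTagger(vocab_size=10000)
tokens = torch.randint(0, 10000, (4, 12))                      # one conversation: 4 sentences
loss = model(tokens, tags=torch.tensor([0, 1, 2, 1]))
pred = model(tokens)
```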
Cross-checking WordNet and SUMO Using Meronymy
Title | Cross-checking WordNet and SUMO Using Meronymy |
Authors | Javier Álvez, Itziar Gonzalez-Dios, German Rigau |
Abstract | |
Tasks | Automated Theorem Proving |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1723/ |
https://www.aclweb.org/anthology/L18-1723 | |
PWC | https://paperswithcode.com/paper/cross-checking-wordnet-and-sumo-using |
Repo | |
Framework | |
Extending the Framework of Equilibrium Propagation to General Dynamics
Title | Extending the Framework of Equilibrium Propagation to General Dynamics |
Authors | Benjamin Scellier, Anirudh Goyal, Jonathan Binas, Thomas Mesnard, Yoshua Bengio |
Abstract | The biological plausibility of the backpropagation algorithm has long been doubted by neuroscientists. Two major reasons are that neurons would need to send two different types of signal in the forward and backward phases, and that pairs of neurons would need to communicate through symmetric bidirectional connections. We present a simple two-phase learning procedure for fixed point recurrent networks that addresses both these issues. In our model, neurons perform leaky integration and synaptic weights are updated through a local mechanism. Our learning method extends the framework of Equilibrium Propagation to general dynamics, relaxing the requirement of an energy function. As a consequence of this generalization, the algorithm does not compute the true gradient of the objective function, but rather approximates it at a precision which is proven to be directly related to the degree of symmetry of the feedforward and feedback weights. We show experimentally that the intrinsic properties of the system lead to alignment of the feedforward and feedback weights, and that our algorithm optimizes the objective function. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJTB5GZCb |
https://openreview.net/pdf?id=SJTB5GZCb | |
PWC | https://paperswithcode.com/paper/extending-the-framework-of-equilibrium |
Repo | |
Framework | |
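A toy numpy sketch of the two-phase recipe the abstract above describes: relax the network to a fixed point (free phase), weakly nudge the output toward the target (nudged phase), and update each weight from the purely local difference of activity products between the two phases. The leaky-integration dynamics are simplified, the feedback weights are kept fixed and untied from the feedforward ones, and all constants are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = lambda s: np.clip(s, 0.0, 1.0)                    # hard-sigmoid firing rate

nx, nh, ny = 4, 8, 2
W1 = rng.normal(0, 0.1, (nh, nx))                       # feedforward input -> hidden
W2 = rng.normal(0, 0.1, (ny, nh))                       # feedforward hidden -> output
B2 = rng.normal(0, 0.1, (nh, ny))                       # feedback output -> hidden (not tied to W2.T)

def relax(x, y_target=None, beta=0.0, steps=50, dt=0.1):
    h, y = np.zeros(nh), np.zeros(ny)
    for _ in range(steps):                              # leaky integration toward a fixed point
        dh = -h + rho(W1 @ x + B2 @ y)
        dy = -y + rho(W2 @ h)
        if y_target is not None:
            dy += beta * (y_target - y)                 # weak nudging toward the target
        h, y = h + dt * dh, y + dt * dy
    return h, y

def train_step(x, y_target, beta=0.5, lr=0.05):
    global W1, W2
    h0, y0 = relax(x)                                   # free phase
    h1, y1 = relax(x, y_target, beta)                   # nudged phase
    # Local contrastive updates from pre/post activity products in the two phases.
    W1 += lr / beta * (np.outer(h1, x) - np.outer(h0, x))
    W2 += lr / beta * (np.outer(y1, h1) - np.outer(y0, h0))

x, t = rng.random(nx), np.array([1.0, 0.0])
for _ in range(100):
    train_step(x, t)
```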
Multiple Granularity Group Interaction Prediction
Title | Multiple Granularity Group Interaction Prediction |
Authors | Taiping Yao, Minsi Wang, Bingbing Ni, Huawei Wei, Xiaokang Yang |
Abstract | Most human activity analysis works (i.e., recognition or prediction) only focus on a single granularity, i.e., either modelling global motion based on coarse-level movement such as human trajectories, or forecasting future detailed action based on body parts' movement such as skeleton motion. In contrast, in this work, we propose a multi-granularity interaction prediction network which integrates both global motion and detailed local action. Built on a bi-directional LSTM network, the proposed method possesses between-granularity links which encourage feature sharing as well as cross-feature consistency between both global and local granularity (e.g., trajectory or local action), and in turn predicts the long-term global location and local dynamics of each individual. We validate our method on several public datasets with promising performance. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Yao_Multiple_Granularity_Group_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Yao_Multiple_Granularity_Group_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/multiple-granularity-group-interaction |
Repo | |
Framework | |
Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus
Title | Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus |
Authors | Erwan Moreau, Carl Vogel |
Abstract | |
Tasks | Tokenization |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1180/ |
https://www.aclweb.org/anthology/L18-1180 | |
PWC | https://paperswithcode.com/paper/multilingual-word-segmentation-training-many |
Repo | |
Framework | |
A Review of Standard Text Classification Practices for Multi-label Toxicity Identification of Online Content
Title | A Review of Standard Text Classification Practices for Multi-label Toxicity Identification of Online Content |
Authors | Isuru Gunasekara, Isar Nejadgholi |
Abstract | Language toxicity identification presents a gray area in the ethical debate surrounding freedom of speech and censorship. Today's social media landscape is littered with unfiltered content that can be anywhere from slightly abusive to hate-inducing. In response, we focused on training a multi-label classifier to detect both the type and level of toxicity in online content. This content is typically colloquial and conversational in style. Its classification therefore requires huge amounts of annotated data due to its variability and inconsistency. We compare standard methods of text classification in this task. A conventional one-vs-rest SVM classifier with character- and word-level frequency-based representation of text reaches a 0.9763 ROC AUC score. We demonstrate that leveraging more advanced technologies such as word embeddings, recurrent neural networks, attention mechanisms, stacking of classifiers and semi-supervised training can improve the ROC AUC score of classification to 0.9862. We suggest that in order to choose the right model, one has to consider the accuracy of the models as well as their inference complexity, based on the application. |
Tasks | Text Classification, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5103/ |
https://www.aclweb.org/anthology/W18-5103 | |
PWC | https://paperswithcode.com/paper/a-review-of-standard-text-classification |
Repo | |
Framework | |
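A hedged scikit-learn sketch of the conventional baseline described in the abstract above: character- and word-level TF-IDF features feeding a one-vs-rest linear SVM for multi-label toxicity tagging. The column names, label set, and file path follow Jigsaw-style toxicity data and are assumptions; the paper's exact n-gram ranges and regularisation are not reproduced.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline, make_union
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
train = pd.read_csv("train.csv")                 # hypothetical path to labelled comments

features = make_union(
    TfidfVectorizer(analyzer="word", ngram_range=(1, 2), max_features=50000),
    TfidfVectorizer(analyzer="char", ngram_range=(2, 5), max_features=50000),
)
clf = make_pipeline(features, OneVsRestClassifier(LinearSVC(C=1.0)))
clf.fit(train["comment_text"], train[labels].values)

# decision_function scores per label can be fed to roc_auc_score to reproduce
# the kind of ROC AUC figure quoted above (on a held-out split, not the training data).
scores = clf.decision_function(train["comment_text"])
```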
Camera Pose Estimation With Unknown Principal Point
Title | Camera Pose Estimation With Unknown Principal Point |
Authors | Viktor Larsson, Zuzana Kukelova, Yinqiang Zheng |
Abstract | Estimating the 6-DoF extrinsic pose of a pinhole camera with partially unknown intrinsic parameters is a critical sub-problem in structure-from-motion and camera localization. In most existing camera pose estimation solvers, the principal point is assumed to be in the image center. Unfortunately, this assumption is not always true, especially for asymmetrically cropped images. In this paper, we develop the first exactly minimal solver for the case of unknown principal point and focal length by using four and a half point correspondences (P4.5Pfuv). We also present an extremely fast solver for the case of unknown aspect ratio (P5Pfuva). The new solvers outperform the previous state-of-the-art in terms of stability and speed. Finally, we explore the extremely challenging case of both unknown principal point and radial distortion, and develop the first practical non-minimal solver by using seven point correspondences (P7Pfruv). Experimental results on both simulated data and real Internet images demonstrate the usefulness of our new solvers. |
Tasks | Camera Localization, Pose Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Larsson_Camera_Pose_Estimation_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Larsson_Camera_Pose_Estimation_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/camera-pose-estimation-with-unknown-principal |
Repo | |
Framework | |
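To make the assumption the paper relaxes concrete, here is a sketch of the standard baseline, not the paper's P4.5Pfuv/P5Pfuva/P7Pfruv solvers: PnP pose estimation with an intrinsic matrix that places the principal point at the image centre (OpenCV's solvePnP). The focal length and correspondences are illustrative placeholders.

```python
import numpy as np
import cv2

w, h, f = 640, 480, 800.0
K = np.array([[f, 0, w / 2],        # principal point assumed at the image centre,
              [0, f, h / 2],        # which breaks down for asymmetrically cropped images
              [0, 0, 1.0]])

object_pts = np.random.rand(6, 3).astype(np.float64)          # 3D points in world coords
image_pts = (np.random.rand(6, 2) * [w, h]).astype(np.float64)  # their 2D projections (toy)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)          # 6-DoF extrinsic pose: rotation R and translation tvec
```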
Modeling Dynamic Missingness of Implicit Feedback for Recommendation
Title | Modeling Dynamic Missingness of Implicit Feedback for Recommendation |
Authors | Menghan Wang, Mingming Gong, Xiaolin Zheng, Kun Zhang |
Abstract | Implicit feedback is widely used in collaborative filtering methods for recommendation. It is well known that implicit feedback contains a large number of values that are missing not at random (MNAR); the missing data is a mixture of negative and unknown feedback, making it difficult to learn users' negative preferences. Recent studies modeled exposure, a latent missingness variable which indicates whether an item is missing to a user, to give each missing entry a confidence of being negative feedback. However, these studies use static models and ignore the information in temporal dependencies among items, which seems to be an essential underlying factor of subsequent missingness. To model and exploit the dynamics of missingness, we propose a latent variable named "user intent" to govern the temporal changes of item missingness, and a hidden Markov model to represent such a process. The resulting framework captures the dynamic item missingness and incorporates it into matrix factorization (MF) for recommendation. We also explore two types of constraints to achieve a more compact and interpretable representation of user intents. Experiments on real-world datasets demonstrate the superiority of our method against state-of-the-art recommender systems. |
Tasks | Recommendation Systems |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7901-modeling-dynamic-missingness-of-implicit-feedback-for-recommendation |
http://papers.nips.cc/paper/7901-modeling-dynamic-missingness-of-implicit-feedback-for-recommendation.pdf | |
PWC | https://paperswithcode.com/paper/modeling-dynamic-missingness-of-implicit |
Repo | |
Framework | |
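A minimal numpy sketch of the confidence-weighted matrix factorisation that the exposure/missingness models described above build on: observed interactions are positives, and each missing (user, item) cell is treated as a negative with some confidence. The paper replaces a static confidence with one driven by an HMM over latent "user intents"; here the confidence is just a fixed constant, and all sizes and rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 50, 40, 8
R = (rng.random((n_users, n_items)) < 0.05).astype(float)   # binary implicit feedback
C = np.where(R > 0, 1.0, 0.1)                               # low confidence on missing entries

U = rng.normal(0, 0.1, (n_users, k))
V = rng.normal(0, 0.1, (n_items, k))
lr, reg = 0.05, 0.01

for _ in range(200):                                         # confidence-weighted gradient steps
    E = C * (R - U @ V.T)                                    # weighted residuals
    U += lr * (E @ V - reg * U)
    V += lr * (E.T @ U - reg * V)

scores = U @ V.T                                             # rank unobserved items per user
```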
BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Title | BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools |
Authors | Fatima Hamlaoui, Emmanuel-Moselly Makasso, Markus Müller, Jonas Engelmann, Gilles Adda, Alex Waibel, Sebastian Stüker |
Abstract | |
Tasks | Machine Translation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1533/ |
https://www.aclweb.org/anthology/L18-1533 | |
PWC | https://paperswithcode.com/paper/bulbasaa-a-bilingual-basaa-french-speech |
Repo | |
Framework | |
Di-LSTM Contrast : A Deep Neural Network for Metaphor Detection
Title | Di-LSTM Contrast : A Deep Neural Network for Metaphor Detection |
Authors | Krishnkant Swarnkar, Anil Kumar Singh |
Abstract | The contrast between the contextual and general meaning of a word serves as an important clue for detecting its metaphoricity. In this paper, we present a deep neural architecture for metaphor detection which exploits this contrast. Additionally, we also use cost-sensitive learning by re-weighting examples, and baseline features like concreteness ratings, POS and WordNet-based features. The best performing system of ours achieves an overall F1 score of 0.570 on All POS category and 0.605 on the Verbs category at the Metaphor Shared Task 2018. |
Tasks | Topic Models, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0914/ |
https://www.aclweb.org/anthology/W18-0914 | |
PWC | https://paperswithcode.com/paper/di-lstm-contrast-a-deep-neural-network-for |
Repo | |
Framework | |
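A loose PyTorch sketch of the idea described in the abstract above: contrast a word's contextual representation (from a BiLSTM over the sentence) with its context-independent embedding, and classify metaphoricity from the two vectors and their difference. The actual Di-LSTM Contrast architecture, the cost-sensitive re-weighting, and the extra features (concreteness, POS, WordNet) are not reproduced; this is only the contrast intuition.

```python
import torch
import torch.nn as nn

class ContrastMetaphorTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hid_dim=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)                  # "general meaning"
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hid_dim, emb_dim)                   # contextual meaning
        self.clf = nn.Linear(3 * emb_dim, 2)                          # literal vs metaphorical

    def forward(self, token_ids):                                     # (batch, seq_len)
        general = self.emb(token_ids)
        contextual = self.proj(self.lstm(general)[0])
        feats = torch.cat([general, contextual, contextual - general], dim=-1)
        return self.clf(feats)                                        # per-token logits

model = ContrastMetaphorTagger(vocab_size=20000)
logits = model(torch.randint(0, 20000, (2, 15)))                      # shape (2, 15, 2)
```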
Accurate SHRG-Based Semantic Parsing
Title | Accurate SHRG-Based Semantic Parsing |
Authors | Yufei Chen, Weiwei Sun, Xiaojun Wan |
Abstract | We demonstrate that an SHRG-based parser can produce semantic graphs much more accurately than previously shown, by relating synchronous production rules to the syntacto-semantic composition process. Our parser achieves an accuracy of 90.35 for EDS (89.51 for DMRS) in terms of elementary dependency match, which is a 4.87 (5.45) point improvement over the best existing data-driven model, indicating, in our view, the importance of linguistically-informed derivation for data-driven semantic parsing. This accuracy is equivalent to that of English Resource Grammar guided models, suggesting that (recurrent) neural network models are able to effectively learn deep linguistic knowledge from annotations. |
Tasks | Semantic Composition, Semantic Parsing |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1038/ |
https://www.aclweb.org/anthology/P18-1038 | |
PWC | https://paperswithcode.com/paper/accurate-shrg-based-semantic-parsing |
Repo | |
Framework | |
Generating Differentially Private Datasets Using GANs
Title | Generating Differentially Private Datasets Using GANs |
Authors | Aleksei Triastcyn, Boi Faltings |
Abstract | In this paper, we present a technique for generating artificial datasets that retain statistical properties of the real data while providing differential privacy guarantees with respect to this data. We include a Gaussian noise layer in the discriminator of a generative adversarial network to make the output and the gradients differentially private with respect to the training data, and then use the generator component to synthesise a privacy-preserving artificial dataset. Our experiments show that under a reasonably small privacy budget we are able to generate data of high quality and successfully train machine learning models on this artificial data. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJv4XWZA- |
https://openreview.net/pdf?id=rJv4XWZA- | |
PWC | https://paperswithcode.com/paper/generating-differentially-private-datasets |
Repo | |
Framework | |
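A hedged PyTorch sketch of the mechanism described in the abstract above: a Gaussian noise layer inside the GAN discriminator, so that the signal the generator is trained on is a noisy function of the training data. The noise scale, its placement, and the architecture are illustrative; calibrating the noise to an actual (epsilon, delta) budget, as the paper requires, is not shown.

```python
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Adds zero-mean Gaussian noise during training; identity at inference time."""
    def __init__(self, sigma=0.5):
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        return x + self.sigma * torch.randn_like(x) if self.training else x

discriminator = nn.Sequential(
    nn.Linear(784, 256), nn.LeakyReLU(0.2),
    GaussianNoise(sigma=0.5),            # noise injected before the final decision
    nn.Linear(256, 1), nn.Sigmoid(),
)
generator = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Tanh(),
)

z = torch.randn(16, 64)
fake = generator(z)
d_fake = discriminator(fake)             # the generator only ever sees this noisy signal
```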