Paper Group NANR 139
Manifold Learning in Quotient Spaces. Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages. Unsupervised Deep Generative Adversarial Hashing Network. Lifelong Learning by Adjusting Priors. Characters or Morphemes: How to Represent Words?. Improving Neural Network Performance by Injecting Background Knowledge …
Manifold Learning in Quotient Spaces
Title | Manifold Learning in Quotient Spaces |
Authors | Éloi Mehr, André Lieutier, Fernando Sanchez Bermudez, Vincent Guitteny, Nicolas Thome, Matthieu Cord |
Abstract | When learning 3D shapes we are usually interested in their intrinsic geometry rather than in their orientation. To deal with the orientation variations the usual trick consists in augmenting the data to exhibit all possible variability, and thus let the model learn both the geometry and the rotations. In this paper we introduce a new autoencoder model for encoding and synthesis of 3D shapes. To get rid of undesirable input variability our model learns a manifold in a quotient space of the input space. Typically, we propose to quotient the space of 3D models by the action of rotations. Thus, our quotient autoencoder learns directly in the space of interest, ignoring side information. This is reflected in better performance on reconstruction and interpolation tasks, as our experiments show that our model outperforms a vanilla autoencoder on the well-known Shapenet dataset. Moreover, our model learns a rotation-invariant representation, leading to interesting results in shape co-alignment. Finally, we extend our quotient autoencoder to quotient by non-rigid transformations. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Mehr_Manifold_Learning_in_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Mehr_Manifold_Learning_in_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/manifold-learning-in-quotient-spaces |
Repo | |
Framework | |
Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages
Title | Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages |
Authors | Talha Javed, Nizar Habash, Dima Taji |
Abstract | |
Tasks | Dependency Parsing, Tokenization, Transliteration |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1345/ |
https://www.aclweb.org/anthology/L18-1345 | |
PWC | https://paperswithcode.com/paper/palmyra-a-platform-independent-dependency |
Repo | |
Framework | |
Unsupervised Deep Generative Adversarial Hashing Network
Title | Unsupervised Deep Generative Adversarial Hashing Network |
Authors | Kamran Ghasedi Dizaji, Feng Zheng, Najmeh Sadoughi, Yanhua Yang, Cheng Deng, Heng Huang |
Abstract | Unsupervised deep hash functions have not shown satisfactory improvements against the shallow alternatives, and usually require supervised pretraining to avoid getting stuck in bad local minima. In this paper, we propose a deep unsupervised hashing function, called HashGAN, which outperforms unsupervised hashing models with significant margins without any supervised pretraining. HashGAN consists of three networks: a generator, a discriminator and an encoder. By sharing the parameters of the encoder and discriminator, we benefit from the adversarial loss as a data-dependent regularization in training our deep hash function. Moreover, a novel loss function is introduced for hashing real images, resulting in minimum entropy, uniform frequency, consistent and independent hash bits. Furthermore, we train the generator conditioning on random binary inputs and also use these binary variables in a triplet ranking loss for improving hash codes. In our experiments, HashGAN outperforms the previous unsupervised hash functions in image retrieval and achieves the state-of-the-art performance in image clustering. We also provide an ablation study, showing the contribution of each component in our loss function. |
Tasks | Image Clustering, Image Retrieval |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Dizaji_Unsupervised_Deep_Generative_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Dizaji_Unsupervised_Deep_Generative_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-deep-generative-adversarial |
Repo | |
Framework | |
Lifelong Learning by Adjusting Priors
Title | Lifelong Learning by Adjusting Priors |
Authors | Ron Amit, Ron Meir |
Abstract | In representational lifelong learning an agent aims to continually learn to solve novel tasks while updating its representation in light of previous tasks. Under the assumption that future tasks are related to previous tasks, representations should be learned in such a way that they capture the common structure across learned tasks, while allowing the learner sufficient flexibility to adapt to novel aspects of a new task. We develop a framework for lifelong learning in deep neural networks that is based on generalization bounds, developed within the PAC-Bayes framework. Learning takes place through the construction of a distribution over networks based on the tasks seen so far, and its utilization for learning a new task. Thus, prior knowledge is incorporated through setting a history-dependent prior for novel tasks. We develop a gradient-based algorithm implementing these ideas, based on minimizing an objective function motivated by generalization bounds, and demonstrate its effectiveness through numerical examples. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJUBryZ0W |
https://openreview.net/pdf?id=rJUBryZ0W | |
PWC | https://paperswithcode.com/paper/lifelong-learning-by-adjusting-priors |
Repo | |
Framework | |
Characters or Morphemes: How to Represent Words?
Title | Characters or Morphemes: How to Represent Words? |
Authors | Ahmet Üstün, Murathan Kurfalı, Burcu Can |
Abstract | In this paper, we investigate the effects of using subword information in representation learning. We argue that using syntactic subword units affects the quality of the word representations positively. We introduce a morpheme-based model and compare it against word-based, character-based, and character n-gram level models. Our model takes a list of candidate segmentations of a word and learns the representation of the word based on different segmentations that are weighted by an attention mechanism. We performed experiments on Turkish as a morphologically rich language and English with a comparably poorer morphology. The results show that morpheme-based models are better at learning word representations of morphologically complex languages compared to character-based and character n-gram level models, since the morphemes help to incorporate more syntactic knowledge in learning, which makes morpheme-based models better at syntactic tasks. |
Tasks | Representation Learning, Semantic Textual Similarity |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3019/ |
https://www.aclweb.org/anthology/W18-3019 | |
PWC | https://paperswithcode.com/paper/characters-or-morphemes-how-to-represent |
Repo | |
Framework | |
Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts
Title | Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts |
Authors | Wafia Adouane, Jean-Philippe Bernardy, Simon Dobnik |
Abstract | We explore the effect of injecting background knowledge into different deep neural network (DNN) configurations in order to mitigate the problem of the scarcity of annotated data when applying these models to datasets of low-resourced languages. The background knowledge is encoded in the form of lexicons and pre-trained sub-word embeddings. The DNN models are evaluated on the task of detecting code-switching and borrowing points in non-standardised user-generated Algerian texts. Overall results show that DNNs benefit from adding background knowledge. However, the gain varies between models and categories. The proposed DNN architectures are generic and could be applied to other low-resourced languages. |
Tasks | Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3203/ |
https://www.aclweb.org/anthology/W18-3203 | |
PWC | https://paperswithcode.com/paper/improving-neural-network-performance-by |
Repo | |
Framework | |
Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ
Title | Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ |
Authors | Alessandra Teresa Cignarella, Cristina Bosco, Viviana Patti, Mirko Lai |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1664/ |
https://www.aclweb.org/anthology/L18-1664 | |
PWC | https://paperswithcode.com/paper/application-and-analysis-of-a-multi-layered |
Repo | |
Framework | |
Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity
Title | Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity |
Authors | Fariborz Salehi, Ehsan Abbasi, Babak Hassibi |
Abstract | The problem of estimating an unknown signal, $\mathbf x_0\in \mathbb R^n$, from a vector $\mathbf y\in \mathbb R^m$ consisting of $m$ magnitude-only measurements of the form $y_i=\lvert\mathbf a_i\mathbf x_0\rvert$, where the $\mathbf a_i$'s are the rows of a known measurement matrix $\mathbf A$, is a classical problem known as phase retrieval. This problem arises when measuring the phase is costly or altogether infeasible. In many applications in machine learning, signal processing, statistics, etc., the underlying signal has certain structure (sparse, low-rank, finite alphabet, etc.), opening up the possibility of recovering $\mathbf x_0$ from a number of measurements smaller than the ambient dimension, i.e., $m<n$. Ideally, one would like to recover the signal from a number of phaseless measurements that is on the order of the “degrees of freedom” of the structured $\mathbf x_0$. To this end, inspired by the PhaseMax algorithm, we formulate a convex optimization problem, where the objective function relies on an initial estimate of the true signal and also includes an additive regularization term to encourage structure. The new formulation is referred to as regularized PhaseMax. We analyze the performance of regularized PhaseMax to find the minimum number of phaseless measurements required for perfect signal recovery. The results are asymptotic and are in terms of the geometrical properties (such as the Gaussian width) of certain convex cones. When the measurement matrix has i.i.d. Gaussian entries, we show that our proposed method is indeed order-wise optimal, allowing perfect recovery from a number of phaseless measurements that is only a constant factor away from the degrees of freedom. We explicitly compute this constant factor, in terms of the quality of the initial estimate, by deriving the exact phase transition. The theory matches empirical results from numerical simulations well. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8082-learning-without-the-phase-regularized-phasemax-achieves-optimal-sample-complexity |
http://papers.nips.cc/paper/8082-learning-without-the-phase-regularized-phasemax-achieves-optimal-sample-complexity.pdf | |
PWC | https://paperswithcode.com/paper/learning-without-the-phase-regularized |
Repo | |
Framework | |
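The measurement model in the abstract above can be illustrated with a small sketch. It only shows how magnitude-only measurements are formed and why the sign of the signal is lost; it does not implement regularized PhaseMax. The dimensions, the random seed, the sparse test signal, and the helper name `magnitude_measurements` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def magnitude_measurements(A: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Return y_i = |a_i . x| for every row a_i of the measurement matrix A."""
    return [abs(sum(a * b for a, b in zip(row, x))) for row in A]


# Hypothetical setup: n = 8 ambient dimensions, m = 5 measurements (m < n),
# and a measurement matrix with i.i.d. Gaussian entries.
rng = random.Random(0)
n, m = 8, 5
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

# A sparse signal: only two of the eight entries are non-zero.
x0 = [1.0, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0, 0.0]

y = magnitude_measurements(A, x0)
y_flipped = magnitude_measurements(A, [-v for v in x0])
```

Here `y` and `y_flipped` agree, so `x0` and `-x0` cannot be told apart from the measurements alone. Any recovery method therefore needs extra information, such as the initial estimate mentioned in the abstract, to resolve this ambiguity.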
Active learning for deep semantic parsing
Title | Active learning for deep semantic parsing |
Authors | Long Duong, Hadi Afshar, Dominique Estival, Glen Pink, Philip Cohen, Mark Johnson |
Abstract | Semantic parsing requires training data that is expensive and slow to collect. We apply active learning to both traditional and “overnight” data collection approaches. We show that it is possible to obtain good training hyperparameters from seed data which is only a small fraction of the full dataset. We show that uncertainty sampling based on least confidence score is competitive in traditional data collection but not applicable for overnight collection. We propose several active learning strategies for overnight data collection and show that different example selection strategies per domain perform best. |
Tasks | Active Learning, Semantic Parsing |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2008/ |
https://www.aclweb.org/anthology/P18-2008 | |
PWC | https://paperswithcode.com/paper/active-learning-for-deep-semantic-parsing |
Repo | |
Framework | |
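The uncertainty sampling mentioned in the abstract above can be sketched generically as follows. This is a textbook form of least-confidence selection, not the authors' implementation: the function names, the annotation budget, and the probability values are hypothetical, and a real semantic parser would score whole output sequences rather than three fixed candidates.

```python
from typing import List, Sequence


def least_confidence_scores(probs: Sequence[Sequence[float]]) -> List[float]:
    """Return 1 - max probability per example; a higher score means less confidence."""
    return [1.0 - max(p) for p in probs]


def select_for_annotation(probs: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return the indices of the `budget` examples the model is least confident about."""
    scores = least_confidence_scores(probs)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Made-up model probabilities over three candidate parses for four unlabeled utterances.
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.34, 0.33, 0.33],
]
selected = select_for_annotation(pool_probs, budget=2)
```

With these made-up numbers, the fourth and second utterances (indices 3 and 1) are selected, because their most likely parse has the lowest probability.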
A Simple Cache Model for Image Recognition
Title | A Simple Cache Model for Image Recognition |
Authors | Emin Orhan |
Abstract | Training large-scale image recognition models is computationally expensive. This raises the question of whether there might be simple ways to improve the test performance of an already trained model without having to re-train or fine-tune it with new data. Here, we show that, surprisingly, this is indeed possible. The key observation we make is that the layers of a deep network close to the output layer contain independent, easily extractable class-relevant information that is not contained in the output layer itself. We propose to extract this extra class-relevant information using a simple key-value cache memory to improve the classification performance of the model at test time. Our cache memory is directly inspired by a similar cache model previously proposed for language modeling (Grave et al., 2017). This cache component does not require any training or fine-tuning; it can be applied to any pre-trained model and, by properly setting only two hyper-parameters, leads to significant improvements in its classification performance. Improvements are observed across several architectures and datasets. In the cache component, using features extracted from layers close to the output (but not from the output layer itself) as keys leads to the largest improvements. Concatenating features from multiple layers to form keys can further improve performance over using single-layer features as keys. The cache component also has a regularizing effect, a simple consequence of which is that it substantially increases the robustness of models against adversarial attacks. |
Tasks | Language Modelling |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8214-a-simple-cache-model-for-image-recognition |
http://papers.nips.cc/paper/8214-a-simple-cache-model-for-image-recognition.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-cache-model-for-image-recognition |
Repo | |
Framework | |
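A minimal sketch of a key-value cache of the kind the abstract above describes: stored feature vectors act as keys, their class labels as values, and the cache prediction is mixed with the model's own output. The exact functional form is an assumption here (exponentiated cosine similarity with sharpness `theta`, linear mixing with weight `lam`, standing in for the two hyper-parameters the abstract mentions), and all feature values are made up.

```python
import math
from typing import List, Sequence


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cache_distribution(query: Sequence[float], keys: Sequence[Sequence[float]],
                       labels: Sequence[int], num_classes: int, theta: float) -> List[float]:
    """Class distribution from the cache: similarity-weighted vote of the stored labels."""
    q = _normalize(query)
    weights = [0.0] * num_classes
    for key, label in zip(keys, labels):
        sim = sum(a * b for a, b in zip(q, _normalize(key)))
        weights[label] += math.exp(theta * sim)
    total = sum(weights)
    return [w / total for w in weights]


def combine(model_probs: Sequence[float], cache_probs: Sequence[float], lam: float) -> List[float]:
    """Linear mixture of the network's prediction and the cache prediction."""
    return [(1.0 - lam) * m + lam * c for m, c in zip(model_probs, cache_probs)]


# Hypothetical cached features (keys) and their labels (values).
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = [0, 0, 1]

# Hypothetical test item: the network slightly prefers class 1,
# but its features lie close to the cached class-0 items.
query = [1.0, 0.1]
model_probs = [0.45, 0.55]

cache_probs = cache_distribution(query, keys, labels, num_classes=2, theta=10.0)
final_probs = combine(model_probs, cache_probs, lam=0.5)
```

In this toy case the cache pulls the final prediction toward class 0 without any retraining; only `theta` and `lam` need to be chosen.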
Does Syntactic Knowledge in Multilingual Language Models Transfer Across Languages?
Title | Does Syntactic Knowledge in Multilingual Language Models Transfer Across Languages? |
Authors | Prajit Dhar, Arianna Bisazza |
Abstract | Recent work has shown that neural models can be successfully trained on multiple languages simultaneously. We investigate whether such models learn to share and exploit common syntactic knowledge among the languages on which they are trained. This extended abstract presents our preliminary results. |
Tasks | Language Acquisition, Language Modelling |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5453/ |
https://www.aclweb.org/anthology/W18-5453 | |
PWC | https://paperswithcode.com/paper/does-syntactic-knowledge-in-multilingual |
Repo | |
Framework | |
Toddler-Inspired Visual Object Learning
Title | Toddler-Inspired Visual Object Learning |
Authors | Sven Bambach, David Crandall, Linda Smith, Chen Yu |
Abstract | Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider. How should a system go about choosing a subset of the possible training examples that still allows for learning accurate, generalizable models? To help address this question, we draw inspiration from a highly efficient practical learning system: the human child. Using head-mounted cameras, eye gaze trackers, and a model of foveated vision, we collected first-person (egocentric) images that represent a highly accurate approximation of the “training data” that toddlers’ visual systems collect in everyday, naturalistic learning contexts. We used state-of-the-art computer vision learning models (convolutional neural networks) to help characterize the structure of these data, and found that child data produce significantly better object models than egocentric data experienced by adults in exactly the same environment. By using the CNNs as a modeling tool to investigate the properties of the child data that may enable this rapid learning, we found that child data exhibit a unique combination of quality and diversity, with not only many similar large, high-quality object views but also a greater number and diversity of rare views. This novel methodology of analyzing the visual “training data” used by children may not only reveal insights to improve machine learning, but also may suggest new experimental tools to better understand infant learning in developmental psychology. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7396-toddler-inspired-visual-object-learning |
http://papers.nips.cc/paper/7396-toddler-inspired-visual-object-learning.pdf | |
PWC | https://paperswithcode.com/paper/toddler-inspired-visual-object-learning |
Repo | |
Framework | |
Comparing Constraints for Taxonomic Organization
Title | Comparing Constraints for Taxonomic Organization |
Authors | Anne Cocos, Marianna Apidianaki, Chris Callison-Burch |
Abstract | Building a taxonomy from the ground up involves several sub-tasks: selecting terms to include, predicting semantic relations between terms, and selecting a subset of relational instances to keep, given constraints on the taxonomy graph. Methods for this final step – taxonomic organization – vary both in terms of the constraints they impose, and whether they enable discovery of synonymous terms. It is hard to isolate the impact of these factors on the quality of the resulting taxonomy because organization methods are rarely compared directly. In this paper, we present a head-to-head comparison of six taxonomic organization algorithms that vary with respect to their structural and transitivity constraints, and treatment of synonymy. We find that while transitive algorithms out-perform their non-transitive counterparts, the top-performing transitive algorithm is prohibitively slow for taxonomies with as few as 50 entities. We propose a simple modification to a non-transitive optimum branching algorithm to explicitly incorporate synonymy, resulting in a method that is substantially faster than the best transitive algorithm while giving complementary performance. |
Tasks | Entity Extraction |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1030/ |
https://www.aclweb.org/anthology/N18-1030 | |
PWC | https://paperswithcode.com/paper/comparing-constraints-for-taxonomic |
Repo | |
Framework | |
Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network
Title | Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network |
Authors | Daniel Merget, Matthias Rock, Gerhard Rigoll |
Abstract | While fully-convolutional neural networks are very strong at modeling local features, they fail to aggregate global context due to their constrained receptive field. Modern methods typically address the lack of global context by introducing cascades, pooling, or by fitting a statistical model. In this work, we propose a new approach that introduces global context into a fully-convolutional neural network directly. The key concept is an implicit kernel convolution within the network. The kernel convolution blurs the output of a local-context subnet, which is then refined by a global-context subnet using dilated convolutions. The kernel convolution is crucial for the convergence of the network because it smoothens the gradients and reduces overfitting. In a postprocessing step, a simple PCA-based 2D shape model is fitted to the network output in order to filter outliers. Our experiments demonstrate the effectiveness of our approach, outperforming several state-of-the-art methods in facial landmark detection. |
Tasks | Facial Landmark Detection |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Merget_Robust_Facial_Landmark_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Merget_Robust_Facial_Landmark_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/robust-facial-landmark-detection-via-a-fully |
Repo | |
Framework | |
Viewpoint Estimation—Insights & Model
Title | Viewpoint Estimation—Insights & Model |
Authors | Gilad Divon, Ayellet Tal |
Abstract | This paper addresses the problem of viewpoint estimation of an object in a given image. It presents five key insights and a CNN that is based on them. The network’s major properties are as follows. (i) The architecture jointly solves detection, classification, and viewpoint estimation. (ii) New types of data are added and trained on. (iii) A novel loss function, which takes into account both the geometry of the problem and the new types of data, is proposed. Our network allows a substantial boost in performance: from 36.1% gained by SOTA algorithms to 45.9%. |
Tasks | Viewpoint Estimation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Gilad_Divon_Viewpoint_Estimation_-_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Gilad_Divon_Viewpoint_Estimation_-_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/viewpoint-estimation-insights-model-1 |
Repo | |
Framework | |