October 15, 2019

2660 words 13 mins read

Paper Group NANR 139

Manifold Learning in Quotient Spaces. Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages. Unsupervised Deep Generative Adversarial Hashing Network. Lifelong Learning by Adjusting Priors. Characters or Morphemes: How to Represent Words?. Improving Neural Network Performance by Injecting Background Knowledge …

Manifold Learning in Quotient Spaces

Title Manifold Learning in Quotient Spaces
Authors Éloi Mehr, André Lieutier, Fernando Sanchez Bermudez, Vincent Guitteny, Nicolas Thome, Matthieu Cord
Abstract When learning 3D shapes we are usually interested in their intrinsic geometry rather than in their orientation. To deal with orientation variations, the usual trick consists in augmenting the data to exhibit all possible variability, and thus letting the model learn both the geometry and the rotations. In this paper we introduce a new autoencoder model for encoding and synthesis of 3D shapes. To get rid of undesirable input variability, our model learns a manifold in a quotient space of the input space. Typically, we propose to quotient the space of 3D models by the action of rotations. Thus, our quotient autoencoder allows us to learn directly in the space of interest, ignoring side information. This is reflected in better performance on reconstruction and interpolation tasks, as our experiments show that our model outperforms a vanilla autoencoder on the well-known ShapeNet dataset. Moreover, our model learns a rotation-invariant representation, leading to interesting results in shape co-alignment. Finally, we extend our quotient autoencoder to quotient by non-rigid transformations.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Mehr_Manifold_Learning_in_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Mehr_Manifold_Learning_in_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/manifold-learning-in-quotient-spaces
Repo
Framework
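
One way to picture the quotient training objective: instead of penalizing reconstruction error against a single oriented target, score the reconstruction against the best rotation of the input, so orientation is factored out of the loss. The sketch below is a minimal illustration of this idea under the assumption of a sampled set of candidate rotations; it is not the paper's exact formulation, and all names are illustrative.

```python
# Hedged sketch of a rotation-quotient reconstruction loss (illustrative only).
import torch

def quotient_loss(x, x_hat, rotations):
    """x, x_hat: (B, N, 3) point clouds; rotations: (K, 3, 3) sampled rotations."""
    # Rotate the target by every candidate rotation: (K, B, N, 3).
    x_rot = torch.einsum('kij,bnj->kbni', rotations, x)
    # Per-rotation reconstruction error, then the minimum over rotations,
    # so the autoencoder is never penalized for orientation.
    errs = ((x_rot - x_hat.unsqueeze(0)) ** 2).mean(dim=(2, 3))  # (K, B)
    return errs.min(dim=0).values.mean()
```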

Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages

Title Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages
Authors Talha Javed, Nizar Habash, Dima Taji
Abstract
Tasks Dependency Parsing, Tokenization, Transliteration
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1345/
PDF https://www.aclweb.org/anthology/L18-1345
PWC https://paperswithcode.com/paper/palmyra-a-platform-independent-dependency
Repo
Framework

Unsupervised Deep Generative Adversarial Hashing Network

Title Unsupervised Deep Generative Adversarial Hashing Network
Authors Kamran Ghasedi Dizaji, Feng Zheng, Najmeh Sadoughi, Yanhua Yang, Cheng Deng, Heng Huang
Abstract Unsupervised deep hash functions have not shown satisfactory improvements over the shallow alternatives, and usually require supervised pretraining to avoid getting stuck in bad local minima. In this paper, we propose a deep unsupervised hashing function, called HashGAN, which outperforms unsupervised hashing models by significant margins without any supervised pretraining. HashGAN consists of three networks: a generator, a discriminator and an encoder. By sharing the parameters of the encoder and discriminator, we benefit from the adversarial loss as a data-dependent regularization in training our deep hash function. Moreover, a novel loss function is introduced for hashing real images, resulting in minimum-entropy, uniform-frequency, consistent and independent hash bits. Furthermore, we train the generator conditioned on random binary inputs and also use these binary variables in a triplet ranking loss for improving the hash codes. In our experiments, HashGAN outperforms previous unsupervised hash functions in image retrieval and achieves state-of-the-art performance in image clustering. We also provide an ablation study, showing the contribution of each component of our loss function.
Tasks Image Clustering, Image Retrieval
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Dizaji_Unsupervised_Deep_Generative_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Dizaji_Unsupervised_Deep_Generative_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/unsupervised-deep-generative-adversarial
Repo
Framework
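
The abstract names three bit-level properties for the hash codes: minimum entropy, uniform frequency, and independence. A plausible realization, sketched below on the encoder's sigmoid outputs, is binary entropy per bit, a batch-frequency penalty toward 0.5, and a decorrelation penalty; the paper's exact loss terms and weighting may differ.

```python
# Hedged sketch of the three bit-level regularizers (not the paper's exact losses).
import torch

def hash_bit_losses(b, eps=1e-7):
    """b: (B, K) sigmoid outputs of the encoder, interpreted as soft hash bits."""
    # Minimum entropy: push each bit toward 0 or 1 (low binary entropy).
    ent = -(b * (b + eps).log() + (1 - b) * (1 - b + eps).log()).mean()
    # Uniform frequency: each bit should fire for about half of the batch.
    freq = ((b.mean(dim=0) - 0.5) ** 2).mean()
    # Independence: penalize covariance between different bits.
    c = b - b.mean(dim=0, keepdim=True)
    cov = (c.t() @ c) / b.shape[0]
    indep = (cov - torch.diag(torch.diagonal(cov))).pow(2).mean()
    return ent, freq, indep
```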

Lifelong Learning by Adjusting Priors

Title Lifelong Learning by Adjusting Priors
Authors Ron Amit, Ron Meir
Abstract In representational lifelong learning an agent aims to continually learn to solve novel tasks while updating its representation in light of previous tasks. Under the assumption that future tasks are related to previous tasks, representations should be learned in such a way that they capture the common structure across learned tasks, while allowing the learner sufficient flexibility to adapt to novel aspects of a new task. We develop a framework for lifelong learning in deep neural networks that is based on generalization bounds, developed within the PAC-Bayes framework. Learning takes place through the construction of a distribution over networks based on the tasks seen so far, and its utilization for learning a new task. Thus, prior knowledge is incorporated through setting a history-dependent prior for novel tasks. We develop a gradient-based algorithm implementing these ideas, based on minimizing an objective function motivated by generalization bounds, and demonstrate its effectiveness through numerical examples.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=rJUBryZ0W
PDF https://openreview.net/pdf?id=rJUBryZ0W
PWC https://paperswithcode.com/paper/lifelong-learning-by-adjusting-priors
Repo
Framework
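
The objective described has the usual PAC-Bayes shape: the empirical loss of a stochastic network plus a complexity term measuring the KL divergence between the posterior over weights and a history-dependent prior built from previous tasks. The sketch below shows that shape for diagonal-Gaussian posteriors and priors, using a generic McAllester-style sqrt-KL term; the paper's exact bound and constants differ.

```python
# Hedged sketch of a PAC-Bayes-style objective with a history-dependent prior.
import torch

def kl_diag_gauss(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians over the network weights."""
    return 0.5 * (logvar_p - logvar_q
                  + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                  - 1).sum()

def pac_bayes_objective(emp_loss, mu_q, logvar_q, mu_p, logvar_p, m, delta=0.05):
    # Empirical loss plus a sqrt-KL complexity term: the typical shape of a
    # McAllester-style PAC-Bayes bound, not the paper's exact bound.
    kl = kl_diag_gauss(mu_q, logvar_q, mu_p, logvar_p)
    complexity = torch.sqrt((kl + torch.log(torch.tensor(2 * m ** 0.5 / delta))) / (2 * m))
    return emp_loss + complexity
```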

Characters or Morphemes: How to Represent Words?

Title Characters or Morphemes: How to Represent Words?
Authors Ahmet Üstün, Murathan Kurfalı, Burcu Can
Abstract In this paper, we investigate the effects of using subword information in representation learning. We argue that using syntactic subword units affects the quality of word representations positively. We introduce a morpheme-based model and compare it against word-based, character-based, and character n-gram level models. Our model takes a list of candidate segmentations of a word and learns the representation of the word based on the different segmentations, which are weighted by an attention mechanism. We performed experiments on Turkish, a morphologically rich language, and on English, which has comparatively poor morphology. The results show that morpheme-based models are better at learning word representations of morphologically complex languages than character-based and character n-gram level models, since the morphemes help to incorporate more syntactic knowledge during learning, making morpheme-based models better at syntactic tasks.
Tasks Representation Learning, Semantic Textual Similarity
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3019/
PDF https://www.aclweb.org/anthology/W18-3019
PWC https://paperswithcode.com/paper/characters-or-morphemes-how-to-represent
Repo
Framework
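
The core mechanism is easy to state: encode each candidate segmentation of a word (for instance, as the sum of its morpheme embeddings), then take an attention-weighted combination as the word's representation. Below is a minimal sketch with an assumed learned attention vector; the paper's scoring function may be more elaborate.

```python
# Hedged sketch of attention over candidate segmentations of one word.
import torch
import torch.nn.functional as F

def word_from_segmentations(seg_vecs, attn_w):
    """seg_vecs: (S, D) one vector per candidate segmentation of the word
    (e.g. the sum of its morpheme embeddings); attn_w: (D,) learned attention."""
    scores = seg_vecs @ attn_w            # (S,) relevance of each segmentation
    alpha = F.softmax(scores, dim=0)      # attention weights over segmentations
    return (alpha.unsqueeze(1) * seg_vecs).sum(dim=0)  # weighted word vector
```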

Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts

Title Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts
Authors Wafia Adouane, Jean-Philippe Bernardy, Simon Dobnik
Abstract We explore the effect of injecting background knowledge into different deep neural network (DNN) configurations in order to mitigate the problem of the scarcity of annotated data when applying these models to datasets of low-resourced languages. The background knowledge is encoded in the form of lexicons and pre-trained sub-word embeddings. The DNN models are evaluated on the task of detecting code-switching and borrowing points in non-standardised user-generated Algerian texts. Overall results show that DNNs benefit from adding background knowledge. However, the gain varies between models and categories. The proposed DNN architectures are generic and could be applied to other low-resourced languages.
Tasks Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3203/
PDF https://www.aclweb.org/anthology/W18-3203
PWC https://paperswithcode.com/paper/improving-neural-network-performance-by
Repo
Framework
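
A simple, and admittedly speculative, reading of injecting background knowledge at the input level is to concatenate the pre-trained sub-word embeddings with binary lexicon-membership features per token before the sequence model. The sketch below shows only that concatenation step; the lexicon names and feature layout are hypothetical.

```python
# Hedged sketch of background-knowledge injection by feature concatenation.
import torch

def token_features(subword_emb, lexicon_flags):
    """subword_emb: (T, D) pre-trained sub-word embeddings for the tokens;
    lexicon_flags: (T, L) binary features, one per background lexicon
    (e.g. 'appears in lexicon X'). Names and layout are illustrative."""
    # The combined vectors would then feed the sequence model (BiLSTM, CNN, ...).
    return torch.cat([subword_emb, lexicon_flags.float()], dim=-1)
```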

Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ

Title Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ
Authors Alessandra Teresa Cignarella, Cristina Bosco, Viviana Patti, Mirko Lai
Abstract
Tasks Sentiment Analysis
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1664/
PDF https://www.aclweb.org/anthology/L18-1664
PWC https://paperswithcode.com/paper/application-and-analysis-of-a-multi-layered
Repo
Framework

Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity

Title Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity
Authors Fariborz Salehi, Ehsan Abbasi, Babak Hassibi
Abstract The problem of estimating an unknown signal, $\mathbf x_0\in \mathbb R^n$, from a vector $\mathbf y\in \mathbb R^m$ consisting of $m$ magnitude-only measurements of the form $y_i=|\mathbf a_i^T\mathbf x_0|$, where the $\mathbf a_i$'s are the rows of a known measurement matrix $\mathbf A$, is a classical problem known as phase retrieval. This problem arises when measuring the phase is costly or altogether infeasible. In many applications in machine learning, signal processing, statistics, etc., the underlying signal has certain structure (sparse, low-rank, finite alphabet, etc.), opening up the possibility of recovering $\mathbf x_0$ from a number of measurements smaller than the ambient dimension, i.e., $m<n$. Ideally, one would like to recover the signal from a number of phaseless measurements that is on the order of the “degrees of freedom” of the structured $\mathbf x_0$. To this end, inspired by the PhaseMax algorithm, we formulate a convex optimization problem, where the objective function relies on an initial estimate of the true signal and also includes an additive regularization term to encourage structure. The new formulation is referred to as regularized PhaseMax. We analyze the performance of regularized PhaseMax to find the minimum number of phaseless measurements required for perfect signal recovery. The results are asymptotic and are in terms of the geometrical properties (such as the Gaussian width) of certain convex cones. When the measurement matrix has i.i.d. Gaussian entries, we show that our proposed method is indeed order-wise optimal, allowing perfect recovery from a number of phaseless measurements that is only a constant factor away from the degrees of freedom. We explicitly compute this constant factor, in terms of the quality of the initial estimate, by deriving the exact phase transition. The theory matches empirical results from numerical simulations well.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/8082-learning-without-the-phase-regularized-phasemax-achieves-optimal-sample-complexity
PDF http://papers.nips.cc/paper/8082-learning-without-the-phase-regularized-phasemax-achieves-optimal-sample-complexity.pdf
PWC https://paperswithcode.com/paper/learning-without-the-phase-regularized
Repo
Framework
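
The regularized PhaseMax program itself is convex: maximize correlation with the initial estimate, minus a regularizer encouraging the assumed structure, subject to the magnitude constraints. Below is a small cvxpy sketch using an l1 regularizer for a sparse signal; the choice of regularizer and of the weight lam are assumptions for illustration.

```python
# Hedged cvxpy sketch of regularized PhaseMax with an l1 (sparsity) regularizer.
import cvxpy as cp
import numpy as np

def regularized_phasemax(A, y, x_init, lam=0.1):
    """A: (m, n) measurement matrix; y: (m,) magnitudes |a_i^T x_0|;
    x_init: (n,) initial estimate of the true signal."""
    n = A.shape[1]
    x = cp.Variable(n)
    # Maximize alignment with the anchor minus the structure regularizer,
    # subject to the (convex) magnitude constraints |a_i^T x| <= y_i.
    objective = cp.Maximize(x_init @ x - lam * cp.norm1(x))
    prob = cp.Problem(objective, [cp.abs(A @ x) <= y])
    prob.solve()
    return x.value
```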

Active learning for deep semantic parsing

Title Active learning for deep semantic parsing
Authors Long Duong, Hadi Afshar, Dominique Estival, Glen Pink, Philip Cohen, Mark Johnson
Abstract Semantic parsing requires training data that is expensive and slow to collect. We apply active learning to both traditional and “overnight” data collection approaches. We show that it is possible to obtain good training hyperparameters from seed data that is only a small fraction of the full dataset. We show that uncertainty sampling based on the least confidence score is competitive in traditional data collection but not applicable to overnight collection. We propose several active learning strategies for overnight data collection and show that the best example selection strategy differs per domain.
Tasks Active Learning, Semantic Parsing
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2008/
PDF https://www.aclweb.org/anthology/P18-2008
PWC https://paperswithcode.com/paper/active-learning-for-deep-semantic-parsing
Repo
Framework
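
Least-confidence uncertainty sampling, as referenced in the abstract, ranks unlabelled examples by the probability of their top predicted class and queries the lowest. A minimal sketch:

```python
# Minimal sketch of least-confidence example selection (illustrative names).
import numpy as np

def least_confidence_batch(probs, k):
    """probs: (N, C) model probabilities on the unlabelled pool.
    Returns the indices of the k examples the model is least confident about."""
    confidence = probs.max(axis=1)     # probability of the top predicted class
    return np.argsort(confidence)[:k]  # lowest-confidence examples first
```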

A Simple Cache Model for Image Recognition

Title A Simple Cache Model for Image Recognition
Authors Emin Orhan
Abstract Training large-scale image recognition models is computationally expensive. This raises the question of whether there might be simple ways to improve the test performance of an already trained model without having to re-train or fine-tune it with new data. Here, we show that, surprisingly, this is indeed possible. The key observation we make is that the layers of a deep network close to the output layer contain independent, easily extractable class-relevant information that is not contained in the output layer itself. We propose to extract this extra class-relevant information using a simple key-value cache memory to improve the classification performance of the model at test time. Our cache memory is directly inspired by a similar cache model previously proposed for language modeling (Grave et al., 2017). This cache component does not require any training or fine-tuning; it can be applied to any pre-trained model and, by properly setting only two hyper-parameters, leads to significant improvements in its classification performance. Improvements are observed across several architectures and datasets. In the cache component, using features extracted from layers close to the output (but not from the output layer itself) as keys leads to the largest improvements. Concatenating features from multiple layers to form keys can further improve performance over using single-layer features as keys. The cache component also has a regularizing effect, a simple consequence of which is that it substantially increases the robustness of models against adversarial attacks.
Tasks Language Modelling
Published 2018-12-01
URL http://papers.nips.cc/paper/8214-a-simple-cache-model-for-image-recognition
PDF http://papers.nips.cc/paper/8214-a-simple-cache-model-for-image-recognition.pdf
PWC https://paperswithcode.com/paper/a-simple-cache-model-for-image-recognition
Repo
Framework
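
The cache mechanism can be summarized compactly: store (feature, label) pairs from the training set using a late layer as keys, then at test time mix a similarity-weighted vote over cached labels with the model's softmax. The sketch below follows that description with the two hyper-parameters the abstract mentions, here a similarity scale theta and a mixture weight lam; the exact kernel in the paper may differ.

```python
# Hedged sketch of the key-value cache prediction mixed with the model softmax.
import numpy as np

def cache_predict(query, keys, values, p_model, theta=30.0, lam=0.5):
    """query: (D,) test feature from a late layer; keys: (N, D) cached training
    features (L2-normalized); values: (N,) integer training labels;
    p_model: (C,) model softmax. theta scales similarities, lam mixes predictions."""
    num_classes = p_model.shape[0]
    sims = keys @ query                       # cosine similarities to cached keys
    w = np.exp(theta * (sims - sims.max()))   # stabilized exponential weights
    p_cache = np.zeros(num_classes)
    np.add.at(p_cache, values, w)             # similarity-weighted class votes
    p_cache /= p_cache.sum()
    return (1 - lam) * p_model + lam * p_cache
```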

Does Syntactic Knowledge in Multilingual Language Models Transfer Across Languages?

Title Does Syntactic Knowledge in Multilingual Language Models Transfer Across Languages?
Authors Prajit Dhar, Arianna Bisazza
Abstract Recent work has shown that neural models can be successfully trained on multiple languages simultaneously. We investigate whether such models learn to share and exploit common syntactic knowledge among the languages on which they are trained. This extended abstract presents our preliminary results.
Tasks Language Acquisition, Language Modelling
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5453/
PDF https://www.aclweb.org/anthology/W18-5453
PWC https://paperswithcode.com/paper/does-syntactic-knowledge-in-multilingual
Repo
Framework

Toddler-Inspired Visual Object Learning

Title Toddler-Inspired Visual Object Learning
Authors Sven Bambach, David Crandall, Linda Smith, Chen Yu
Abstract Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider. How should a system go about choosing a subset of the possible training examples that still allows for learning accurate, generalizable models? To help address this question, we draw inspiration from a highly efficient practical learning system: the human child. Using head-mounted cameras, eye gaze trackers, and a model of foveated vision, we collected first-person (egocentric) images that represent a highly accurate approximation of the “training data” that toddlers’ visual systems collect in everyday, naturalistic learning contexts. We used state-of-the-art computer vision learning models (convolutional neural networks) to help characterize the structure of these data, and found that child data produce significantly better object models than egocentric data experienced by adults in exactly the same environment. By using the CNNs as a modeling tool to investigate the properties of the child data that may enable this rapid learning, we found that child data exhibit a unique combination of quality and diversity, with not only many similar large, high-quality object views but also a greater number and diversity of rare views. This novel methodology of analyzing the visual “training data” used by children may not only reveal insights to improve machine learning, but may also suggest new experimental tools to better understand infant learning in developmental psychology.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7396-toddler-inspired-visual-object-learning
PDF http://papers.nips.cc/paper/7396-toddler-inspired-visual-object-learning.pdf
PWC https://paperswithcode.com/paper/toddler-inspired-visual-object-learning
Repo
Framework

Comparing Constraints for Taxonomic Organization

Title Comparing Constraints for Taxonomic Organization
Authors Anne Cocos, Marianna Apidianaki, Chris Callison-Burch
Abstract Building a taxonomy from the ground up involves several sub-tasks: selecting terms to include, predicting semantic relations between terms, and selecting a subset of relational instances to keep, given constraints on the taxonomy graph. Methods for this final step – taxonomic organization – vary both in terms of the constraints they impose, and whether they enable discovery of synonymous terms. It is hard to isolate the impact of these factors on the quality of the resulting taxonomy because organization methods are rarely compared directly. In this paper, we present a head-to-head comparison of six taxonomic organization algorithms that vary with respect to their structural and transitivity constraints, and treatment of synonymy. We find that while transitive algorithms outperform their non-transitive counterparts, the top-performing transitive algorithm is prohibitively slow for taxonomies with as few as 50 entities. We propose a simple modification to a non-transitive optimum branching algorithm to explicitly incorporate synonymy, resulting in a method that is substantially faster than the best transitive algorithm while giving complementary performance.
Tasks Entity Extraction
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1030/
PDF https://www.aclweb.org/anthology/N18-1030
PWC https://paperswithcode.com/paper/comparing-constraints-for-taxonomic
Repo
Framework
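
For the non-transitive optimum branching step, the relation scores become edge weights in a directed graph and the organization problem reduces to a maximum-weight spanning arborescence (Chu-Liu/Edmonds). A minimal networkx sketch, with a virtual root so a tree always exists; the paper's synonymy modification is not shown.

```python
# Hedged sketch of taxonomic organization via optimum branching (illustrative).
import networkx as nx

def organize_taxonomy(terms, scores, root='ROOT'):
    """scores: dict mapping (parent, child) pairs to relation confidences.
    Keeps the highest-scoring set of edges that forms a tree (an arborescence)."""
    g = nx.DiGraph()
    for (parent, child), s in scores.items():
        g.add_edge(parent, child, weight=s)
    for t in terms:
        g.add_edge(root, t, weight=0.0)   # virtual root so a tree always exists
    # Chu-Liu/Edmonds optimum branching: max-weight spanning arborescence.
    return nx.maximum_spanning_arborescence(g, attr='weight')
```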

Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network

Title Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network
Authors Daniel Merget, Matthias Rock, Gerhard Rigoll
Abstract While fully-convolutional neural networks are very strong at modeling local features, they fail to aggregate global context due to their constrained receptive field. Modern methods typically address the lack of global context by introducing cascades, pooling, or by fitting a statistical model. In this work, we propose a new approach that introduces global context into a fully-convolutional neural network directly. The key concept is an implicit kernel convolution within the network. The kernel convolution blurs the output of a local-context subnet, which is then refined by a global-context subnet using dilated convolutions. The kernel convolution is crucial for the convergence of the network because it smooths the gradients and reduces overfitting. In a postprocessing step, a simple PCA-based 2D shape model is fitted to the network output in order to filter outliers. Our experiments demonstrate the effectiveness of our approach, outperforming several state-of-the-art methods in facial landmark detection.
Tasks Facial Landmark Detection
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Merget_Robust_Facial_Landmark_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Merget_Robust_Facial_Landmark_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/robust-facial-landmark-detection-via-a-fully
Repo
Framework
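
The described pipeline is: a local-context subnet produces landmark heatmaps, a fixed (implicit) kernel convolution blurs them, and a dilated-convolution global-context subnet refines the result. Below is a hedged PyTorch sketch of that wiring, assuming a Gaussian kernel and 68 landmark channels; layer sizes are illustrative, not the paper's architecture.

```python
# Hedged sketch of the local -> fixed blur -> dilated global-context wiring.
import torch
import torch.nn as nn

def gaussian_blur(channels, ksize=15, sigma=3.0):
    """Fixed (non-trainable) Gaussian kernel convolution, applied per channel."""
    ax = torch.arange(ksize) - ksize // 2
    g = torch.exp(-ax.float() ** 2 / (2 * sigma ** 2))
    k2d = torch.outer(g, g)
    k2d = (k2d / k2d.sum()).expand(channels, 1, ksize, ksize)
    conv = nn.Conv2d(channels, channels, ksize, padding=ksize // 2,
                     groups=channels, bias=False)
    conv.weight.data.copy_(k2d)
    conv.weight.requires_grad_(False)   # the blur kernel is fixed, not learned
    return conv

local_net = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(64, 68, 3, padding=1))   # 68 landmark heatmaps
global_net = nn.Sequential(nn.Conv2d(68, 64, 3, padding=2, dilation=2), nn.ReLU(),
                           nn.Conv2d(64, 68, 3, padding=4, dilation=4))
blur = gaussian_blur(68)

def forward(img):
    local_maps = local_net(img)
    return global_net(blur(local_maps))   # blurred local output, refined globally
```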

Viewpoint Estimation—Insights & Model

Title Viewpoint Estimation—Insights & Model
Authors Gilad Divon, Ayellet Tal
Abstract This paper addresses the problem of viewpoint estimation of an object in a given image. It presents five key insights and a CNN that is based on them. The network’s major properties are as follows. (i) The architecture jointly solves detection, classification, and viewpoint estimation. (ii) New types of data are added and trained on. (iii) A novel loss function, which takes into account both the geometry of the problem and the new types of data, is proposed. Our network allows a substantial boost in performance: from 36.1% gained by SOTA algorithms to 45.9%.
Tasks Viewpoint Estimation
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Gilad_Divon_Viewpoint_Estimation_-_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Gilad_Divon_Viewpoint_Estimation_-_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/viewpoint-estimation-insights-model-1
Repo
Framework
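
The abstract does not spell out the loss, so the sketch below shows one standard geometry-aware choice for viewpoint estimation, the geodesic distance on SO(3) between predicted and ground-truth rotations; it is offered as an illustration of a geometry-respecting loss, not as the paper's actual formulation.

```python
# Hedged sketch of a geodesic (geometry-aware) rotation loss on SO(3).
import torch

def geodesic_loss(R_pred, R_gt):
    """R_pred, R_gt: (B, 3, 3) rotation matrices."""
    m = torch.bmm(R_pred.transpose(1, 2), R_gt)          # relative rotation
    cos = (m.diagonal(dim1=1, dim2=2).sum(dim=1) - 1) / 2  # (trace - 1) / 2
    # Clamp for numerical safety before acos; result is the rotation angle.
    return torch.acos(cos.clamp(-1 + 1e-6, 1 - 1e-6)).mean()
```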