Paper Group ANR 168
Parallel Long Short-Term Memory for Multi-stream Classification
Title | Parallel Long Short-Term Memory for Multi-stream Classification |
Authors | Mohamed Bouaziz, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori |
Abstract | Recently, machine learning methods have provided a broad spectrum of original and efficient algorithms based on Deep Neural Networks (DNN) to automatically predict an outcome with respect to a sequence of inputs. Recurrent hidden cells allow DNN-based models such as Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks to manage long-term dependencies. Nevertheless, these RNNs process a single input stream in one (LSTM) or two (Bidirectional LSTM) directions, whereas most of the information available nowadays comes from multiple streams or multimedia documents and requires RNNs to process this information synchronously during training. This paper presents an original LSTM-based architecture, named Parallel LSTM (PLSTM), that processes multiple parallel synchronized input sequences in order to predict a common output. The proposed PLSTM method can be used for parallel sequence classification. The PLSTM approach is evaluated on an automatic telecast genre classification task and compared with different state-of-the-art architectures. Results show that the proposed PLSTM method outperforms both the baseline n-gram models and the state-of-the-art LSTM approach. |
Tasks | |
Published | 2017-02-11 |
URL | http://arxiv.org/abs/1702.03402v1 |
http://arxiv.org/pdf/1702.03402v1.pdf | |
PWC | https://paperswithcode.com/paper/parallel-long-short-term-memory-for-multi |
Repo | |
Framework | |
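As an illustration of the idea, here is a minimal sketch (in PyTorch, an assumed framework since the entry lists none): one LSTM per synchronized input stream, with the final hidden states merged by concatenation to predict a single shared label. The layer sizes and the concatenation-based merge are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ParallelLSTM(nn.Module):
    def __init__(self, n_streams, input_dim, hidden_dim, n_classes):
        super().__init__()
        # One LSTM per input stream, all trained jointly.
        self.lstms = nn.ModuleList(
            [nn.LSTM(input_dim, hidden_dim, batch_first=True)
             for _ in range(n_streams)])
        self.classifier = nn.Linear(n_streams * hidden_dim, n_classes)

    def forward(self, streams):
        # streams: list of (batch, seq_len, input_dim) tensors, assumed
        # synchronized (same seq_len across streams).
        last_states = []
        for lstm, x in zip(self.lstms, streams):
            _, (h_n, _) = lstm(x)        # h_n: (1, batch, hidden_dim)
            last_states.append(h_n[-1])  # (batch, hidden_dim)
        merged = torch.cat(last_states, dim=1)
        return self.classifier(merged)   # one common output for all streams

model = ParallelLSTM(n_streams=3, input_dim=16, hidden_dim=32, n_classes=5)
streams = [torch.randn(4, 10, 16) for _ in range(3)]  # 3 synchronized streams
logits = model(streams)                               # (4, 5)
```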
A Survey of Neural Network Techniques for Feature Extraction from Text
Title | A Survey of Neural Network Techniques for Feature Extraction from Text |
Authors | Vineet John |
Abstract | This paper aims to catalyze the discussions about text feature extraction techniques using neural network architectures. The research questions discussed in the paper focus on the state-of-the-art neural network techniques that have proven to be useful tools for language processing, language generation, text classification and other computational linguistics tasks. |
Tasks | Text Classification, Text Generation |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08531v1 |
http://arxiv.org/pdf/1704.08531v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-neural-network-techniques-for |
Repo | |
Framework | |
Composing Meta-Policies for Autonomous Driving Using Hierarchical Deep Reinforcement Learning
Title | Composing Meta-Policies for Autonomous Driving Using Hierarchical Deep Reinforcement Learning |
Authors | Richard Liaw, Sanjay Krishnan, Animesh Garg, Daniel Crankshaw, Joseph E. Gonzalez, Ken Goldberg |
Abstract | Rather than learning new control policies for each new task, it is possible, when tasks share some structure, to compose a “meta-policy” from previously learned policies. This paper reports results from experiments using Deep Reinforcement Learning on a continuous-state, discrete-action autonomous driving simulator. We explore how Deep Neural Networks can represent meta-policies that switch among a set of previously learned policies, specifically in settings where the dynamics of a new scenario are composed of a mixture of previously learned dynamics and where the state observation is possibly corrupted by sensing noise. We also report the results of experiments varying dynamics mixes, distractor policies, magnitudes/distributions of sensing noise, and obstacles. In a fully observed experiment, the meta-policy learning algorithm achieves 2.6x the reward achieved by the next best policy composition technique with 80% less exploration. In a partially observed experiment, the meta-policy learning algorithm converges after 50 iterations while a direct application of RL fails to converge even after 200 iterations. |
Tasks | Autonomous Driving |
Published | 2017-11-04 |
URL | http://arxiv.org/abs/1711.01503v1 |
http://arxiv.org/pdf/1711.01503v1.pdf | |
PWC | https://paperswithcode.com/paper/composing-meta-policies-for-autonomous |
Repo | |
Framework | |
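A minimal sketch of the switching idea, assuming stand-in sub-policies (in the paper both the sub-policies and the meta-policy are learned with deep RL; the selector architecture below is a hypothetical choice):

```python
import torch
import torch.nn as nn

class MetaPolicy(nn.Module):
    """Maps a (possibly noisy) state observation to a sub-policy choice."""
    def __init__(self, state_dim, n_subpolicies):
        super().__init__()
        self.selector = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_subpolicies))

    def forward(self, state):
        return self.selector(state)  # logits over sub-policies

# Stand-in sub-policies: each maps a state to a discrete action.
def policy_keep_lane(state):   return 0
def policy_change_lane(state): return 1
sub_policies = [policy_keep_lane, policy_change_lane]

meta = MetaPolicy(state_dim=8, n_subpolicies=len(sub_policies))
state = torch.randn(8)                   # continuous state observation
choice = meta(state).argmax().item()     # switch to one sub-policy
action = sub_policies[choice](state)     # delegate action selection to it
```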
A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration
Title | A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration |
Authors | Bartolomeo Della Corte, Igor Bogoslavskyi, Cyrill Stachniss, Giorgio Grisetti |
Abstract | The ability to build maps is a key functionality for the majority of mobile robots. A central ingredient to most mapping systems is the registration or alignment of the recorded sensor data. In this paper, we present a general methodology for photometric registration that can deal with multiple different cues. We provide examples for registering RGBD as well as 3D LIDAR data. In contrast to popular point cloud registration approaches such as ICP, our method does not rely on explicit data association and exploits multiple modalities such as raw range and image data streams. Color, depth, and normal information are handled in a uniform manner and the registration is obtained by minimizing the pixel-wise difference between two multi-channel images. We developed a flexible and general framework and implemented our approach inside that framework. We also released our implementation as open source C++ code. The experiments show that our approach allows for an accurate registration of the sensor data without requiring an explicit data association or model-specific adaptations to datasets or sensors. Our approach exploits the different cues in a natural and consistent way, and the registration can be done at framerate for a typical range or imaging sensor. |
Tasks | Point Cloud Registration |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.05945v1 |
http://arxiv.org/pdf/1709.05945v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-framework-for-flexible-multi-cue |
Repo | |
Framework | |
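To make the objective concrete, here is a toy sketch of registration as minimization of a pixel-wise difference between two multi-channel images (e.g., color and depth channels stacked together). The exhaustive search over integer 2D translations is a deliberately simplified stand-in for the paper's least-squares solver over full rigid-body transforms:

```python
import numpy as np

def photometric_error(ref, cur, dx, dy):
    """Mean squared channel-wise difference on the overlapping region."""
    h, w, _ = ref.shape
    a = ref[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
    b = cur[max(0, -dy):h + min(0, -dy), max(0, -dx):w + min(0, -dx)]
    return np.mean((a - b) ** 2)

def register_translation(ref, cur, search=5):
    """Brute-force the translation minimizing the photometric error."""
    best = min((photometric_error(ref, cur, dx, dy), dx, dy)
               for dx in range(-search, search + 1)
               for dy in range(-search, search + 1))
    return best[1], best[2]

# A multi-channel "image": 3 color channels plus 1 depth channel.
ref = np.random.rand(64, 64, 4)
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))  # shifted copy of ref
print(register_translation(ref, cur))           # (3, -2): aligns ref to cur
```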
Foundations of Declarative Data Analysis Using Limit Datalog Programs
Title | Foundations of Declarative Data Analysis Using Limit Datalog Programs |
Authors | Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, Ian Horrocks |
Abstract | Motivated by applications in declarative data analysis, we study $\mathit{Datalog}_{\mathbb{Z}}$—an extension of positive Datalog with arithmetic functions over integers. This language is known to be undecidable, so we propose two fragments. In $\mathit{limit}~\mathit{Datalog}_{\mathbb{Z}}$ predicates are axiomatised to keep minimal/maximal numeric values, allowing us to show that fact entailment is coNExpTime-complete in combined, and coNP-complete in data complexity. Moreover, an additional $\mathit{stability}$ requirement causes the complexity to drop to ExpTime and PTime, respectively. Finally, we show that stable $\mathit{Datalog}_{\mathbb{Z}}$ can express many useful data analysis tasks, and so our results provide a sound foundation for the development of advanced information systems. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.06927v2 |
http://arxiv.org/pdf/1705.06927v2.pdf | |
PWC | https://paperswithcode.com/paper/foundations-of-declarative-data-analysis |
Repo | |
Framework | |
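A small illustration of the limit-predicate idea in plain Python: a predicate axiomatised to keep only the minimal numeric value per tuple. Shortest path over a weighted edge relation is a canonical task expressible in (stable) limit $\mathit{Datalog}_{\mathbb{Z}}$; the naive fixpoint below mimics the rules dist(x, 0) ← start(x) and dist(y, d + w) ← dist(x, d), edge(x, y, w), with smaller distance values subsuming larger ones:

```python
def shortest_paths(edges, start):
    """Naive fixpoint for a min-limit predicate dist(node, value)."""
    dist = {start: 0}                    # limit semantics: keep the minimum
    changed = True
    while changed:
        changed = False
        for (x, y), w in edges.items():
            if x in dist and dist[x] + w < dist.get(y, float("inf")):
                dist[y] = dist[x] + w    # smaller value subsumes the larger
                changed = True
    return dist

edges = {("a", "b"): 3, ("b", "c"): 2, ("a", "c"): 10}
print(shortest_paths(edges, "a"))  # {'a': 0, 'b': 3, 'c': 5}
```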
A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers
Title | A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers |
Authors | Thomas E. Potok, Catherine Schuman, Steven R. Young, Robert M. Patton, Federico Spedalieri, Jeremy Liu, Ke-Thia Yao, Garrett Rose, Gangotree Chakma |
Abstract | Current Deep Learning approaches have been very successful using convolutional neural networks (CNN) trained on large graphics processing unit (GPU)-based computers. Three limitations of this approach are: 1) they are based on a simple layered network topology, i.e., highly connected layers, without intra-layer connections; 2) the networks are manually configured to achieve optimal results; and 3) the implementation of the neuron model is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. We use the MNIST dataset for our experiment, due to input size limitations of current quantum computers. Our results show the feasibility of using the three architectures in tandem to address the above deep learning limitations. We show that a quantum computer can find high-quality values of intra-layer connection weights in a tractable time as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low-power memristive hardware. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05364v2 |
http://arxiv.org/pdf/1703.05364v2.pdf | |
PWC | https://paperswithcode.com/paper/a-study-of-complex-deep-learning-networks-on |
Repo | |
Framework | |
TokTrack: A Complete Token Provenance and Change Tracking Dataset for the English Wikipedia
Title | TokTrack: A Complete Token Provenance and Change Tracking Dataset for the English Wikipedia |
Authors | Fabian Flöck, Kenan Erdogan, Maribel Acosta |
Abstract | We present a dataset that contains every instance of all tokens (~ words) ever written in undeleted, non-redirect English Wikipedia articles until October 2016, in total 13,545,349,787 instances. Each token is annotated with (i) the article revision it was originally created in, and (ii) lists with all the revisions in which the token was ever deleted and (potentially) re-added and re-deleted from its article, enabling a complete and straightforward tracking of its history. This data would be exceedingly hard for an average potential user to create, as (i) it is very expensive to compute and (ii) accurately tracking the history of each token in revisioned documents is a non-trivial task. Adapting a state-of-the-art algorithm, we have produced a dataset that allows a range of analyses and metrics, already popular in research and beyond, to be generated on complete-Wikipedia scale, ensuring quality and allowing researchers to forgo the expensive text-comparison computation that has so far hindered scalable usage. We show how this data enables, at the token level, computation of provenance, measuring survival of content over time, very detailed conflict metrics, and fine-grained interactions of editors such as partial reverts, re-additions and other metrics, in the process gaining several novel insights. |
Tasks | |
Published | 2017-03-23 |
URL | http://arxiv.org/abs/1703.08244v1 |
http://arxiv.org/pdf/1703.08244v1.pdf | |
PWC | https://paperswithcode.com/paper/toktrack-a-complete-token-provenance-and |
Repo | |
Framework | |
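As a sketch of the kind of token-level analysis the dataset enables, the snippet below decides whether a token is still present in the current article from its annotations alone. The record layout (origin revision plus deletion and re-addition revision lists) is a simplified, hypothetical rendering of the annotations described in the abstract:

```python
def is_alive(token):
    """A token survives iff it was re-added after its last deletion."""
    if not token["deleted_in"]:
        return True                      # never deleted since creation
    last_deletion = max(token["deleted_in"])
    return any(rev > last_deletion for rev in token["readded_in"])

token = {"text": "provenance", "origin_rev": 101,
         "deleted_in": [205], "readded_in": [230]}
print(is_alive(token))  # True: re-added in rev 230, after deletion in 205
```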
Gradient Methods for Submodular Maximization
Title | Gradient Methods for Submodular Maximization |
Authors | Hamed Hassani, Mahdi Soltanolkotabi, Amin Karbasi |
Abstract | In this paper, we study the problem of maximizing continuous submodular functions that naturally arise in many learning applications such as those involving utility functions in active learning and sensing, matrix approximations and network inference. Despite the apparent lack of convexity in such functions, we prove that stochastic projected gradient methods can provide strong approximation guarantees for maximizing continuous submodular functions with convex constraints. More specifically, we prove that for monotone continuous DR-submodular functions, all fixed points of projected gradient ascent provide a factor $1/2$ approximation to the global maximum. We also study stochastic gradient and mirror methods and show that after $\mathcal{O}(1/\epsilon^2)$ iterations these methods reach solutions which achieve in expectation objective values exceeding $(\frac{\text{OPT}}{2}-\epsilon)$. An immediate application of our results is to maximize submodular functions that are defined stochastically, i.e., the submodular function is defined as an expectation over a family of submodular functions with an unknown distribution. We show how stochastic gradient methods are naturally well-suited for this setting, leading to a factor $1/2$ approximation when the function is monotone. In particular, this allows us to approximately solve discrete, monotone submodular optimization problems via projected gradient ascent on a continuous relaxation, directly connecting the discrete and continuous domains. Finally, experiments on real data demonstrate that our projected gradient methods consistently achieve the best utility compared to other continuous baselines while remaining competitive in terms of computational effort. |
Tasks | Active Learning |
Published | 2017-08-13 |
URL | http://arxiv.org/abs/1708.03949v2 |
http://arxiv.org/pdf/1708.03949v2.pdf | |
PWC | https://paperswithcode.com/paper/gradient-methods-for-submodular-maximization |
Repo | |
Framework | |
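A minimal sketch of the main algorithmic ingredient: projected gradient ascent on a monotone continuous DR-submodular function, here the multilinear extension of a weighted coverage function under a budget constraint $\sum_i x_i \le k$. The instance, step size and iteration count are illustrative choices:

```python
import numpy as np

def project(y, k, iters=50):
    """Euclidean projection onto {x in [0,1]^n : sum(x) <= k}."""
    x = np.clip(y, 0.0, 1.0)
    if x.sum() <= k:
        return x
    lo, hi = 0.0, float(y.max())         # bisect on the Lagrange multiplier
    for _ in range(iters):
        tau = (lo + hi) / 2
        lo, hi = (tau, hi) if np.clip(y - tau, 0, 1).sum() > k else (lo, tau)
    return np.clip(y - hi, 0.0, 1.0)

def coverage(x, sets, w):
    """Value and gradient of the multilinear coverage extension."""
    val, grad = 0.0, np.zeros_like(x)
    for S, weight in zip(sets, w):
        xs = x[list(S)]
        val += weight * (1.0 - np.prod(1.0 - xs))
        for idx, i in enumerate(S):      # d/dx_i of the extension
            grad[i] += weight * np.prod(np.delete(1.0 - xs, idx))
    return val, grad

sets = [(0, 1), (1, 2), (2, 3), (0, 3)]  # items covering each element
w = [1.0, 2.0, 1.5, 0.5]
x = np.zeros(4)
for _ in range(200):                     # projected gradient ascent
    _, g = coverage(x, sets, w)
    x = project(x + 0.1 * g, k=2.0)
print(np.round(x, 2), round(coverage(x, sets, w)[0], 3))
```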
Towards Linguistically Generalizable NLP Systems: A Workshop and Shared Task
Title | Towards Linguistically Generalizable NLP Systems: A Workshop and Shared Task |
Authors | Allyson Ettinger, Sudha Rao, Hal Daumé III, Emily M. Bender |
Abstract | This paper presents a summary of the first Workshop on Building Linguistically Generalizable Natural Language Processing Systems, and the associated Build It Break It, The Language Edition shared task. The goal of this workshop was to bring together researchers in NLP and linguistics with a shared task aimed at testing the generalizability of NLP systems beyond the distributions of their training data. We describe the motivation, setup, and participation of the shared task, provide discussion of some highlighted results, and discuss lessons learned. |
Tasks | |
Published | 2017-11-04 |
URL | http://arxiv.org/abs/1711.01505v1 |
http://arxiv.org/pdf/1711.01505v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-linguistically-generalizable-nlp |
Repo | |
Framework | |
A New Semantic Theory of Natural Language
Title | A New Semantic Theory of Natural Language |
Authors | Kun Xing |
Abstract | Formal Semantics and Distributional Semantics are two important semantic frameworks in Natural Language Processing (NLP). Cognitive Semantics belongs to the movement of Cognitive Linguistics, which is based on contemporary cognitive science. Each framework can deal with some meaning phenomena, but none of them fulfills all the requirements posed by applications. A unified semantic theory characterizing all important language phenomena has both theoretical and practical significance; however, although many attempts have been made in recent years, no existing theory has achieved this goal yet. This article introduces a new semantic theory that has the potential to characterize most of the important meaning phenomena of natural language and to fulfill most of the necessary requirements for philosophical analysis and for NLP applications. The theory is based on a unified representation of information, and constructs a kind of mathematical model called a cognitive model to interpret natural language expressions in a compositional manner. It accepts the empirical assumption of Cognitive Semantics, and overcomes most shortcomings of Formal Semantics and of Distributional Semantics. The theory, however, is not a simple combination of existing theories, but an extensive generalization of classic logic and Formal Semantics. It inherits nearly all advantages of Formal Semantics, and also provides descriptive content for objects and events that is as fine-grained as possible, content which represents the results of human cognition. |
Tasks | |
Published | 2017-09-10 |
URL | http://arxiv.org/abs/1709.04857v1 |
http://arxiv.org/pdf/1709.04857v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-semantic-theory-of-natural-language |
Repo | |
Framework | |
Multiple Reflection Symmetry Detection via Linear-Directional Kernel Density Estimation
Title | Multiple Reflection Symmetry Detection via Linear-Directional Kernel Density Estimation |
Authors | Mohamed Elawady, Olivier Alata, Christophe Ducottet, Cecile Barat, Philippe Colantoni |
Abstract | Symmetry is an important compositional feature that captures similar sides within an image plane, and it plays a crucial role in recognizing man-made and natural objects. Recent symmetry detection approaches apply a smoothing kernel over different voting maps in the polar coordinate system to detect symmetry peaks, which splits the regions of symmetry axis candidates in an inefficient way. We propose a reliable voting representation based on weighted linear-directional kernel density estimation to detect multiple symmetries in challenging real-world and synthetic images. Experimental evaluation on two public datasets demonstrates the superior performance of the proposed algorithm in detecting global symmetry axes with respect to the major image shapes. |
Tasks | Density Estimation |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06392v1 |
http://arxiv.org/pdf/1704.06392v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-reflection-symmetry-detection-via |
Repo | |
Framework | |
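A sketch of the voting representation, under the simplifying assumption that each pair of mirrored features has already cast a weighted vote $(\theta, \rho)$ for a candidate axis: the axes are read off as peaks of a linear-directional kernel density estimate, with a von Mises kernel on the (axial) angle and a Gaussian kernel on the displacement. Bandwidths and the vote set are illustrative:

```python
import numpy as np

def ld_kde(votes, weights, thetas, rhos, kappa=20.0, sigma=5.0):
    """Weighted linear-directional KDE on an (angle, displacement) grid."""
    density = np.zeros((len(thetas), len(rhos)))
    for (t, r), w in zip(votes, weights):
        ang = np.exp(kappa * np.cos(2 * (thetas - t)))  # axial von Mises
        lin = np.exp(-0.5 * ((rhos - r) / sigma) ** 2)  # Gaussian on rho
        density += w * np.outer(ang, lin)
    return density

votes = [(0.80, 40.0), (0.78, 42.0), (0.82, 39.0), (2.10, 10.0)]
weights = [1.0, 0.9, 0.8, 0.3]              # feature-pair similarities
thetas = np.linspace(0, np.pi, 180)         # axis orientation
rhos = np.linspace(0, 100, 101)             # axis distance from origin
d = ld_kde(votes, weights, thetas, rhos)
i, j = np.unravel_index(d.argmax(), d.shape)
print(f"dominant axis: theta={thetas[i]:.2f} rad, rho={rhos[j]:.1f}")
```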
Machine Translation using Semantic Web Technologies: A Survey
Title | Machine Translation using Semantic Web Technologies: A Survey |
Authors | Diego Moussallem, Matthias Wauer, Axel-Cyrille Ngonga Ngomo |
Abstract | A large number of machine translation approaches have recently been developed to facilitate the fluid migration of content across languages. However, the literature suggests that many obstacles must still be dealt with to achieve better automatic translations. One of these obstacles is lexical and syntactic ambiguity. A promising way of overcoming this problem is using Semantic Web technologies. This article presents the results of a systematic review of machine translation approaches that rely on Semantic Web technologies for translating texts. Overall, our survey suggests that while Semantic Web technologies can enhance the quality of machine translation outputs for various problems, the combination of both is still in its infancy. |
Tasks | Machine Translation |
Published | 2017-11-26 |
URL | http://arxiv.org/abs/1711.09476v3 |
http://arxiv.org/pdf/1711.09476v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-translation-using-semantic-web |
Repo | |
Framework | |
Synthesising Dynamic Textures using Convolutional Neural Networks
Title | Synthesising Dynamic Textures using Convolutional Neural Networks |
Authors | Christina M. Funke, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge |
Abstract | Here we present a parametric model for dynamic textures. The model is based on spatiotemporal summary statistics computed from the feature representations of a Convolutional Neural Network (CNN) trained on object recognition. We demonstrate how the model can be used to synthesise new samples of dynamic textures and to predict motion in simple movies. |
Tasks | Object Recognition |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.07006v1 |
http://arxiv.org/pdf/1702.07006v1.pdf | |
PWC | https://paperswithcode.com/paper/synthesising-dynamic-textures-using |
Repo | |
Framework | |
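A minimal sketch of the spatiotemporal summary statistics underlying the model: Gram matrices computed over CNN feature maps of several consecutive frames, so that correlations across time are captured, then matched by gradient descent on the synthesized frames. The tiny random network below is a stand-in for a pretrained object recognition CNN such as VGG:

```python
import torch
import torch.nn as nn

feature_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
for p in feature_net.parameters():
    p.requires_grad_(False)              # the feature extractor stays fixed

def spatiotemporal_gram(frames):
    """Gram matrix over features of T frames stacked along channels."""
    feats = [feature_net(f.unsqueeze(0)).squeeze(0) for f in frames]
    fmat = torch.cat(feats, dim=0).flatten(1)   # (T*C, H*W)
    return fmat @ fmat.t() / fmat.shape[1]

target = [torch.rand(3, 32, 32) for _ in range(2)]   # 2 texture frames
target_gram = spatiotemporal_gram(target)

synth = [torch.rand(3, 32, 32).requires_grad_() for _ in range(2)]
opt = torch.optim.Adam(synth, lr=0.05)
for _ in range(100):                     # match the target statistics
    opt.zero_grad()
    loss = (spatiotemporal_gram(synth) - target_gram).pow(2).sum()
    loss.backward()
    opt.step()
```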
Regularizing Model Complexity and Label Structure for Multi-Label Text Classification
Title | Regularizing Model Complexity and Label Structure for Multi-Label Text Classification |
Authors | Bingyu Wang, Cheng Li, Virgil Pavlu, Javed Aslam |
Abstract | Multi-label text classification is a popular machine learning task where each document is assigned multiple relevant labels. This task is challenging due to high-dimensional features and correlated labels. Multi-label text classifiers need to be carefully regularized to prevent severe over-fitting in the high-dimensional space, and also need to take into account label dependencies in order to make accurate predictions under uncertainty. We demonstrate significant and practical improvement by carefully regularizing the model complexity during the training phase and regularizing the label search space during the prediction phase. Specifically, we regularize classifier training using an Elastic-net (L1+L2) penalty to reduce model complexity/size, and employ early stopping to prevent overfitting. At prediction time, we apply support inference to restrict the search space to label sets encountered in the training set, and the F-measure optimizer GFM to make optimal predictions for the F1 metric. We show that although support inference only provides density estimations on existing label combinations, when combined with the GFM predictor the algorithm can output unseen label combinations. Taken collectively, our experiments show state-of-the-art results on many benchmark datasets. Beyond the performance and practical contributions, we make some interesting observations. Contrary to prior belief, which deems support inference purely an approximate inference procedure, we show that support inference acts as a strong regularizer on the label prediction structure. It allows the classifier to take into account label dependencies during prediction even if it did not model any label dependencies during training. |
Tasks | Multi-Label Text Classification, Text Classification |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00740v1 |
http://arxiv.org/pdf/1705.00740v1.pdf | |
PWC | https://paperswithcode.com/paper/regularizing-model-complexity-and-label |
Repo | |
Framework | |
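A sketch of support inference at prediction time: candidate label sets are restricted to combinations observed in training and scored under per-label probabilities from independently trained classifiers. The exact GFM step (optimal expected-F1 prediction) is replaced here by a simple argmax over the supported combinations:

```python
import numpy as np

def support_inference(p, train_label_sets):
    """Pick the training label combination most probable under p."""
    best, best_logprob = None, -np.inf
    for combo in train_label_sets:
        mask = np.zeros_like(p, dtype=bool)
        mask[list(combo)] = True
        # log-probability of exactly this label set, assuming independence
        logprob = np.sum(np.log(np.where(mask, p, 1.0 - p)))
        if logprob > best_logprob:
            best, best_logprob = combo, logprob
    return best

p = np.array([0.9, 0.7, 0.1, 0.4])             # per-label probabilities
train_label_sets = [(0,), (0, 1), (0, 1, 3), (2,)]
print(support_inference(p, train_label_sets))  # (0, 1)
```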
Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening
Title | Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening |
Authors | Mohsen Ahmadi Fahandar, Eyke Hüllermeier, Inés Couso |
Abstract | We consider the problem of statistical inference for ranking data, specifically rank aggregation, under the assumption that samples are incomplete in the sense of not comprising all choice alternatives. In contrast to most existing methods, we explicitly model the process of turning a full ranking into an incomplete one, which we call the coarsening process. To this end, we propose the concept of rank-dependent coarsening, which assumes that incomplete rankings are produced by projecting a full ranking to a random subset of ranks. For a concrete instantiation of our model, in which full rankings are drawn from a Plackett-Luce distribution and observations take the form of pairwise preferences, we study the performance of various rank aggregation methods. In addition to predictive accuracy in the finite sample setting, we address the theoretical question of consistency, by which we mean the ability to recover a target ranking when the sample size goes to infinity, despite a potential bias in the observations caused by the (unknown) coarsening. |
Tasks | |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.01158v1 |
http://arxiv.org/pdf/1712.01158v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-inference-for-incomplete-ranking |
Repo | |
Framework | |
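A sketch of the generative story studied in the paper: a full ranking is drawn from a Plackett-Luce distribution, then coarsened by projecting it onto a random subset of ranks. Here each rank is kept independently with its own probability, a simple instance of rank-dependent coarsening; the skill parameters and keep-probabilities are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_plackett_luce(skills):
    """Pick items successively with probability proportional to skill."""
    items, ranking = list(range(len(skills))), []
    s = np.asarray(skills, dtype=float)
    while items:
        probs = s[items] / s[items].sum()
        pick = int(rng.choice(items, p=probs))
        ranking.append(pick)
        items.remove(pick)
    return ranking

def coarsen(ranking, keep_prob):
    """Rank-dependent coarsening: rank k survives w.p. keep_prob[k]."""
    return [item for k, item in enumerate(ranking)
            if rng.random() < keep_prob[k]]

skills = [4.0, 2.0, 1.0, 0.5]         # Plackett-Luce skill parameters
keep_prob = [0.9, 0.7, 0.5, 0.3]      # top ranks are observed more often
full = sample_plackett_luce(skills)
print(full, "->", coarsen(full, keep_prob))
```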