Paper Group NANR 266
State Abstractions for Lifelong Reinforcement Learning
Title | State Abstractions for Lifelong Reinforcement Learning |
Authors | David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman |
Abstract | In lifelong reinforcement learning, agents must effectively transfer knowledge across tasks while simultaneously addressing exploration, credit assignment, and generalization. State abstraction can help overcome these hurdles by compressing the representation used by an agent, thereby reducing the computational and statistical burdens of learning. To this end, we develop theory to compute and use state abstractions in lifelong reinforcement learning. We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks. We show that the joint family of transitive PAC abstractions can be acquired efficiently, preserve near-optimal behavior, and experimentally reduce sample complexity in simple domains, thereby yielding a family of desirable abstractions for use in lifelong reinforcement learning. Along with these positive results, we show that there are pathological cases where state abstractions can negatively impact performance. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2087 |
http://proceedings.mlr.press/v80/abel18a/abel18a.pdf | |
PWC | https://paperswithcode.com/paper/state-abstractions-for-lifelong-reinforcement |
Repo | |
Framework | |
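To make the idea of a transitive state abstraction from the abstract above concrete, here is a minimal sketch. It assumes a hypothetical tabular Q-function (`q_values`) and merges states whose action values fall into the same bucket of width `d` for every action. Bucket equality is a transitive relation, so one pass over the states is enough to build the clusters. The function names, the input format, and the bucketing predicate are illustrative assumptions, not the authors' implementation.

```python
import math

def bucket_signature(q_row, d):
    """Discretize each action value into a bucket of width d."""
    return tuple(math.ceil(q / d) for q in q_row)

def transitive_abstraction(q_values, d):
    """Map each ground state to an abstract state id.

    q_values: dict mapping state -> list of action values (hypothetical input).
    States with identical bucket signatures share one abstract state.
    """
    cluster_ids = {}  # bucket signature -> abstract state id
    phi = {}          # ground state -> abstract state id
    for state, q_row in q_values.items():
        signature = bucket_signature(q_row, d)
        if signature not in cluster_ids:
            cluster_ids[signature] = len(cluster_ids)
        phi[state] = cluster_ids[signature]
    return phi

# Toy Q-table (made-up numbers, for illustration only).
q_values = {
    "s0": [0.91, 0.12],
    "s1": [0.95, 0.18],
    "s2": [0.40, 0.75],
}
phi = transitive_abstraction(q_values, d=0.2)
```

In this toy table, `s0` and `s1` land in the same buckets for both actions and are merged, while `s2` keeps its own abstract state.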
Learning to select examples for program synthesis
Title | Learning to select examples for program synthesis |
Authors | Yewen Pu, Zachery Miranda, Armando Solar-Lezama, Leslie Pack Kaelbling |
Abstract | Program synthesis is a class of regression problems where one seeks a solution, in the form of a source-code program, that maps the inputs to their corresponding outputs exactly. Due to its precise and combinatorial nature, it is commonly formulated as a constraint satisfaction problem, where input-output examples are expressed as constraints, and solved with a constraint solver. A key challenge of this formulation is that of scalability: while constraint solvers work well with few well-chosen examples, constraining the entire set of examples constitutes a significant overhead in both time and memory. In this paper we address this challenge by constructing a representative subset of examples that is both small and able to constrain the solver sufficiently. We build the subset one example at a time, using a trained discriminator to predict the probability of unchosen input-output examples conditioned on the chosen input-output examples, adding the least probable example to the subset. Experiments on a diagram drawing domain show that our approach produces subsets of examples that are small and representative for the constraint solver. |
Tasks | Program Synthesis |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B1CQGfZ0b |
https://openreview.net/pdf?id=B1CQGfZ0b | |
PWC | https://paperswithcode.com/paper/learning-to-select-examples-for-program |
Repo | |
Framework | |
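The greedy selection loop described in the abstract above can be sketched as follows. The scoring function `prob_given_chosen` is a hypothetical stand-in for the trained discriminator, and the fixed `budget` stopping rule is an assumption made here for simplicity; the abstract does not specify when selection stops.

```python
def select_examples(examples, prob_given_chosen, budget):
    """Greedily build a subset by repeatedly adding the least probable example.

    prob_given_chosen(example, chosen) is a hypothetical stand-in for the
    trained discriminator: it scores how predictable `example` is given
    the examples already chosen.
    """
    chosen = []
    remaining = list(examples)
    while remaining and len(chosen) < budget:
        # The least probable example is the most informative one to add.
        surprising = min(remaining, key=lambda ex: prob_given_chosen(ex, chosen))
        chosen.append(surprising)
        remaining.remove(surprising)
    return chosen

def toy_prob(example, chosen):
    """Toy scorer: an output is predictable if chosen examples share it."""
    _, output = example
    if not chosen:
        return 0.5
    matches = sum(1 for _, other in chosen if other == output)
    return matches / len(chosen)

# Toy input-output examples: (pixel coordinate, color).
examples = [((0, 0), "red"), ((0, 1), "red"), ((5, 5), "blue"), ((1, 0), "red")]
subset = select_examples(examples, toy_prob, budget=2)
```

With the toy scorer, the second pick is the `"blue"` example, because the already chosen `"red"` example makes the other `"red"` examples fully predictable.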
Learning non-linear transform with discriminative and minimum information loss priors
Title | Learning non-linear transform with discriminative and minimum information loss priors |
Authors | Dimche Kostadinov, Slava Voloshynovskiy |
Abstract | This paper proposes a novel approach for learning discriminative and sparse representations. It consists of utilizing two different models. A predefined number of non-linear transform models are used in the learning stage, and one sparsifying transform model is used at test time. The non-linear transform models have discriminative and minimum information loss priors. A novel measure related to the discriminative prior is proposed and defined on the support intersection for the transform representations. The minimum information loss prior is expressed as a constraint on the conditioning and the expected coherence of the transform matrix. An equivalence between the non-linear models and the sparsifying model is shown only when the measure that is used to define the discriminative prior goes to zero. An approximation of the measure used in the discriminative prior is addressed, connecting it to a similarity concentration. To quantify the discriminative properties of the transform representation, we introduce another measure and present its bounds. Reflecting the discriminative quality of the transform representation, we name it discrimination power. To support and validate the theoretical analysis, a practical learning algorithm is presented. We evaluate the advantages and the potential of the proposed algorithm by a computer simulation. A favorable performance is shown considering the execution time and the quality of the representation, measured by the discrimination power and the recognition accuracy, in comparison with the state-of-the-art methods of the same category. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJzmJEq6W |
https://openreview.net/pdf?id=SJzmJEq6W | |
PWC | https://paperswithcode.com/paper/learning-non-linear-transform-with |
Repo | |
Framework | |
Corpus Phonetics: Past, Present, and Future
Title | Corpus Phonetics: Past, Present, and Future |
Authors | Mark Liberman |
Abstract | Invited talk |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3801/ |
https://www.aclweb.org/anthology/W18-3801 | |
PWC | https://paperswithcode.com/paper/corpus-phonetics-past-present-and-future |
Repo | |
Framework | |
Visualization of the occurrence trend of infectious diseases using Twitter
Title | Visualization of the occurrence trend of infectious diseases using Twitter |
Authors | Ryusei Matsumoto, Minoru Yoshida, Kazuyuki Matsumoto, Hironobu Matsuda, Kenji Kita |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1081/ |
https://www.aclweb.org/anthology/L18-1081 | |
PWC | https://paperswithcode.com/paper/visualization-of-the-occurrence-trend-of |
Repo | |
Framework | |
Trust Your Model: Light Field Depth Estimation With Inline Occlusion Handling
Title | Trust Your Model: Light Field Depth Estimation With Inline Occlusion Handling |
Authors | Hendrik Schilling, Maximilian Diebold, Carsten Rother, Bernd Jähne |
Abstract | We address the problem of depth estimation from light-field images. Our main contribution is a new way to handle occlusions which improves general accuracy and quality of object borders. In contrast to all prior work we work with a model which directly incorporates both depth and occlusion, using a local optimization scheme based on the PatchMatch algorithm. The key benefit of this joint approach is that we utilize all available data, and not erroneously discard valuable information in pre-processing steps. We see the benefit of our approach not only at improved object boundaries, but also at smooth surface reconstruction, where we outperform even methods which focus on good surface regularization. We have evaluated our method on a public light-field dataset, where we achieve state-of-the-art results in nine out of twelve error metrics, with a close tie for the remaining three. |
Tasks | Depth Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Schilling_Trust_Your_Model_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Schilling_Trust_Your_Model_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/trust-your-model-light-field-depth-estimation |
Repo | |
Framework | |
Linearly Constrained Weights: Resolving the Vanishing Gradient Problem by Reducing Angle Bias
Title | Linearly Constrained Weights: Resolving the Vanishing Gradient Problem by Reducing Angle Bias |
Authors | Takuro Kutsuna |
Abstract | In this paper, we first identify angle bias, a simple but remarkable phenomenon that causes the vanishing gradient problem in a multilayer perceptron (MLP) with sigmoid activation functions. We then propose linearly constrained weights (LCW) to reduce the angle bias in a neural network, so as to train the network under the constraints that the sum of the elements of each weight vector is zero. A reparameterization technique is presented to efficiently train a model with LCW by embedding the constraints on weight vectors into the structure of the network. Interestingly, batch normalization (Ioffe & Szegedy, 2015) can be viewed as a mechanism to correct angle bias. Preliminary experiments show that LCW helps train a 100-layered MLP more efficiently than does batch normalization. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HylgYB3pZ |
https://openreview.net/pdf?id=HylgYB3pZ | |
PWC | https://paperswithcode.com/paper/linearly-constrained-weights-resolving-the |
Repo | |
Framework | |
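The zero-sum constraint in the abstract above can be illustrated with a small sketch. One simple way to enforce it, assumed here and not necessarily the reparameterization used in the paper, is to subtract the mean from an unconstrained vector. A zero-sum weight vector is orthogonal to the all-ones direction, so adding the same offset to every input leaves the pre-activation unchanged. All names and numbers below are illustrative.

```python
def project_zero_sum(v):
    """Subtract the mean so that the elements of the result sum to zero."""
    mean = sum(v) / len(v)
    return [x - mean for x in v]

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

# Unconstrained parameters (made-up numbers) and their constrained version.
v = [0.5, -1.0, 2.0, 0.3]
w = project_zero_sum(v)

# Inputs with and without a large shared offset, as produced by
# activations whose mean is far from zero.
x = [1.0, 2.0, 0.5, 1.5]
shifted = [xi + 10.0 for xi in x]

pre_activation = dot(w, x)
pre_activation_shifted = dot(w, shifted)
```

Up to floating-point rounding, the two pre-activations agree, which is the sense in which the constraint removes the dependence on a shared input offset.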
Authorship Attribution By Consensus Among Multiple Features
Title | Authorship Attribution By Consensus Among Multiple Features |
Authors | Jagadeesh Patchala, Raj Bhatnagar |
Abstract | Most existing research on authorship attribution uses various lexical, syntactic and semantic features. In this paper we demonstrate an effective template-based approach for combining various syntactic features of a document for authorship analysis. The parse-tree based features that we propose are independent of the topic of a document and reflect the innate writing styles of authors. We show that the use of templates including sub-trees of parse trees in conjunction with other syntactic features results in improved author attribution rates. Another contribution is the demonstration that Dempster's rule based combination of evidence from syntactic features performs better than other evidence-combination methods. We also demonstrate that our methodology works well for the case where the actual author is not included in the candidate author set. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1234/ |
https://www.aclweb.org/anthology/C18-1234 | |
PWC | https://paperswithcode.com/paper/authorship-attribution-by-consensus-among |
Repo | |
Framework | |
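Dempster's rule of combination, mentioned in the abstract above, merges two mass functions by multiplying the masses of intersecting hypotheses and renormalizing by the non-conflicting mass. The sketch below shows the standard rule on a toy frame of two candidate authors. The author labels and mass values are made up for illustration, and how the paper derives masses from syntactic features is not shown.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {hyp: mass / norm for hyp, mass in combined.items()}

A = frozenset({"author_A"})
B = frozenset({"author_B"})
EITHER = A | B

# Evidence from two hypothetical feature types.
m_feature_1 = {A: 0.6, EITHER: 0.4}
m_feature_2 = {B: 0.5, EITHER: 0.5}
fused = dempster_combine(m_feature_1, m_feature_2)
```

Here the conflicting mass is 0.6 × 0.5 = 0.3, so the remaining products are divided by 0.7.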
USI-IR at IEST 2018: Sequence Modeling and Pseudo-Relevance Feedback for Implicit Emotion Detection
Title | USI-IR at IEST 2018: Sequence Modeling and Pseudo-Relevance Feedback for Implicit Emotion Detection |
Authors | Esteban Ríssola, Anastasia Giachanou, Fabio Crestani |
Abstract | This paper describes the participation of USI-IR in the WASSA 2018 Implicit Emotion Shared Task. We propose a relevance feedback approach employing a sequential model (biLSTM) and word embeddings derived from a large collection of tweets. To this end, we assume that the top-k predictions produced at a first classification step are correct (based on the model accuracy) and use them as new examples to re-train the network. |
Tasks | Emotion Recognition, Sentiment Analysis, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6233/ |
https://www.aclweb.org/anthology/W18-6233 | |
PWC | https://paperswithcode.com/paper/usi-ir-at-iest-2018-sequence-modeling-and |
Repo | |
Framework | |
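The feedback step described in the abstract above, treating the top-k most confident first-pass predictions as correct and adding them to the training data, can be sketched as below. The tuple layout, function names, and example tweets are assumptions for illustration; the classifier itself (a biLSTM in the paper) and the retraining call are left to the caller.

```python
def top_k_pseudo_labels(predictions, k):
    """Return the k most confident predictions as (text, label) pairs.

    predictions: list of (text, predicted_label, confidence) tuples
    from a first classification pass (hypothetical format).
    """
    ranked = sorted(predictions, key=lambda p: p[2], reverse=True)
    return [(text, label) for text, label, _ in ranked[:k]]

def augment_training_set(train, predictions, k):
    """Append the top-k pseudo-labeled examples to the labeled data."""
    return list(train) + top_k_pseudo_labels(predictions, k)

# Made-up labeled data and first-pass predictions.
train = [("I can't stop smiling", "joy")]
first_pass = [
    ("this is so unfair", "anger", 0.93),
    ("well that happened", "surprise", 0.41),
    ("I miss her every day", "sad", 0.88),
]
augmented = augment_training_set(train, first_pass, k=2)
```

The low-confidence prediction is left out, and the network would then be retrained on `augmented`.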
Where Have I Heard This Story Before? Identifying Narrative Similarity in Movie Remakes
Title | Where Have I Heard This Story Before? Identifying Narrative Similarity in Movie Remakes |
Authors | Snigdha Chaturvedi, Shashank Srivastava, Dan Roth |
Abstract | People can identify correspondences between narratives in everyday life. For example, an analogy with the Cinderella story may be made in describing the unexpected success of an underdog in seemingly different stories. We present a new task and dataset for story understanding: identifying instances of similar narratives from a collection of narrative texts. We present an initial approach for this problem, which finds correspondences between narratives in terms of plot events, and resemblances between characters and their social relationships. Our approach yields an 8% absolute improvement in performance over a competitive information-retrieval baseline on a novel dataset of plot summaries of 577 movie remakes from Wikipedia. |
Tasks | Information Retrieval |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2106/ |
https://www.aclweb.org/anthology/N18-2106 | |
PWC | https://paperswithcode.com/paper/where-have-i-heard-this-story-before |
Repo | |
Framework | |
Convolutional Interaction Network for Natural Language Inference
Title | Convolutional Interaction Network for Natural Language Inference |
Authors | Jingjing Gong, Xipeng Qiu, Xinchi Chen, Dong Liang, Xuanjing Huang |
Abstract | Attention-based neural models have achieved great success in natural language inference (NLI). In this paper, we propose the Convolutional Interaction Network (CIN), a general model to capture the interaction between two sentences, which can be an alternative to the attention mechanism for NLI. Specifically, CIN encodes one sentence with the filters dynamically generated based on another sentence. Since the filters may be designed to have various numbers and sizes, CIN can capture more complicated interaction patterns. Experiments on three large datasets demonstrate CIN's efficacy. |
Tasks | Information Retrieval, Natural Language Inference, Question Answering |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1186/ |
https://www.aclweb.org/anthology/D18-1186 | |
PWC | https://paperswithcode.com/paper/convolutional-interaction-network-for-natural |
Repo | |
Framework | |
DisNet: A novel method for distance estimation from monocular camera
Title | DisNet: A novel method for distance estimation from monocular camera |
Authors | Muhammad Abdul Haseeb, Jianyu Guan, Danijela Ristić-Durrant, Axel Gräser |
Abstract | This paper presents a machine learning setup that provides an obstacle detection system with a method to estimate the distance from a monocular camera to the object viewed with the camera. In particular, it gives the preliminary results of ongoing research to allow the onboard multisensory system, which is under development within the H2020 Shift2Rail project SMART, to autonomously learn distances to objects that are possible obstacles on the rail tracks ahead of the locomotive. The presented distance estimation system is based on a multi-hidden-layer neural network, named DisNet, which is used to learn and predict the distance between the object and the camera sensor. DisNet was trained using a supervised learning technique in which the input features were manually calculated parameters of the object bounding boxes produced by the YOLO object classifier and the outputs were accurate 3D laser scanner measurements of the distances to objects in the recorded scene. The DisNet-based distance estimation system was evaluated on images of railway scenes as well as on images of a road scene. The results demonstrate the general nature of the proposed DisNet system, which enables its use for estimating distances to objects imaged with different types of monocular cameras. |
Tasks | Depth Estimation |
Published | 2018-10-01 |
URL | https://webcache.googleusercontent.com/search?q=cache:x7y3KzAxZdgJ:https://project.inria.fr/ppniv18/files/2018/10/paper22.pdf+&cd=2&hl=en&ct=clnk&gl=ca |
https://project.inria.fr/ppniv18/files/2018/10/paper22.pdf | |
PWC | https://paperswithcode.com/paper/disnet-a-novel-method-for-distance-estimation |
Repo | |
Framework | |
Propagating LSTM: 3D Pose Estimation based on Joint Interdependency
Title | Propagating LSTM: 3D Pose Estimation based on Joint Interdependency |
Authors | Kyoungoh Lee, Inwoong Lee, Sanghoon Lee |
Abstract | We present a novel 3D pose estimation method based on joint interdependency (JI) for acquiring 3D joints from the human pose of an RGB image. The JI incorporates the body-part-based structural connectivity of joints to learn the high spatial correlation of human posture in our method. Towards this goal, we propose a new long short-term memory (LSTM)-based deep learning architecture named propagating LSTM networks (p-LSTMs), where each LSTM is connected sequentially to reconstruct 3D depth from the centroid to edge joints through learning the intrinsic JI. In the first LSTM, the seed joints of 3D pose are created and reconstructed into the whole-body joints through the connected LSTMs. Utilizing the p-LSTMs, we achieve about 11.2% higher accuracy than state-of-the-art methods on the largest publicly available database. Importantly, we demonstrate that the JI drastically reduces the structural errors at body edges, thereby leading to a significant improvement. |
Tasks | 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Kyoungoh_Lee_Propagating_LSTM_3D_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Kyoungoh_Lee_Propagating_LSTM_3D_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/propagating-lstm-3d-pose-estimation-based-on |
Repo | |
Framework | |
Learning to Generate Word Representations using Subword Information
Title | Learning to Generate Word Representations using Subword Information |
Authors | Yeachan Kim, Kang-Min Kim, Ji-Min Lee, SangKeun Lee |
Abstract | Distributed representations of words play a major role in the field of natural language processing by encoding semantic and syntactic information of words. However, most existing works on learning word representations typically regard words as individual atomic units and thus are blind to subword information in words. This further gives rise to a difficulty in representing out-of-vocabulary (OOV) words. In this paper, we present a character-based word representation approach to deal with this limitation. The proposed model learns to generate word representations from characters. In our model, we employ a convolutional neural network and a highway network over characters to extract salient features effectively. Unlike previous models that learn word representations from a large corpus, we take a set of pre-trained word embeddings and generalize it to word entries, including OOV words. To demonstrate the efficacy of the proposed model, we perform both an intrinsic and an extrinsic task which are word similarity and language modeling, respectively. Experimental results show clearly that the proposed model significantly outperforms strong baseline models that regard words or their subwords as atomic units. For example, we achieve as much as 18.5% improvement on average in perplexity for morphologically rich languages compared to strong baselines in the language modeling task. |
Tasks | Chunking, Language Modelling, Named Entity Recognition, Question Answering, Text Classification, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1216/ |
https://www.aclweb.org/anthology/C18-1216 | |
PWC | https://paperswithcode.com/paper/learning-to-generate-word-representations |
Repo | |
Framework | |
Image Quality Assessment Techniques Improve Training and Evaluation of Energy-Based Generative Adversarial Networks
Title | Image Quality Assessment Techniques Improve Training and Evaluation of Energy-Based Generative Adversarial Networks |
Authors | Michael O. Vertolli, Jim Davies |
Abstract | We propose a new, multi-component energy function for energy-based Generative Adversarial Networks (GANs) based on methods from the image quality assessment literature. Our approach expands on the Boundary Equilibrium Generative Adversarial Network (BEGAN) by outlining some of the short-comings of the original energy and loss functions. We address these short-comings by incorporating an l1 score, the Gradient Magnitude Similarity score, and a chrominance score into the new energy function. We then provide a set of systematic experiments that explore its hyper-parameters. We show that each of the energy function’s components is able to represent a slightly different set of features, which require their own evaluation criteria to assess whether they have been adequately learned. We show that models using the new energy function are able to produce better image representations than the BEGAN model in predicted ways. |
Tasks | Image Quality Assessment |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=ryzm6BATZ |
https://openreview.net/pdf?id=ryzm6BATZ | |
PWC | https://paperswithcode.com/paper/image-quality-assessment-techniques-improve |
Repo | |
Framework | |