October 16, 2019

Paper Group ANR 1110

Escape Room: A Configurable Testbed for Hierarchical Reinforcement Learning. Automatically Segmenting the Left Atrium from Cardiac Images Using Successive 3D U-Nets and a Contour Loss. Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters. Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation …

Escape Room: A Configurable Testbed for Hierarchical Reinforcement Learning


Title	Escape Room: A Configurable Testbed for Hierarchical Reinforcement Learning
Authors	Jacob Menashe, Peter Stone
Abstract	Recent successes in Reinforcement Learning have encouraged a fast-growing network of RL researchers and a number of breakthroughs in RL research. As the RL community and the body of RL work grows, so does the need for widely applicable benchmarks that can fairly and effectively evaluate a variety of RL algorithms. This need is particularly apparent in the realm of Hierarchical Reinforcement Learning (HRL). While many existing test domains may exhibit hierarchical action or state structures, modern RL algorithms still exhibit great difficulty in solving domains that necessitate hierarchical modeling and action planning, even when such domains are seemingly trivial. These difficulties highlight both the need for more focus on HRL algorithms themselves, and the need for new testbeds that will encourage and validate HRL research. Existing HRL testbeds exhibit a Goldilocks problem; they are often either too simple (e.g. Taxi) or too complex (e.g. Montezuma’s Revenge from the Arcade Learning Environment). In this paper we present the Escape Room Domain (ERD), a new flexible, scalable, and fully implemented testing domain for HRL that bridges the “moderate complexity” gap left behind by existing alternatives. ERD is open-source and freely available through GitHub, and conforms to widely-used public testing interfaces for simple integration and testing with a variety of public RL agent implementations. We show that the ERD presents a suite of challenges with scalable difficulty to provide a smooth learning gradient from Taxi to the Arcade Learning Environment.
Tasks	Atari Games, Hierarchical Reinforcement Learning, Montezuma’s Revenge
Published	2018-12-22
URL	http://arxiv.org/abs/1812.09521v1
PDF	http://arxiv.org/pdf/1812.09521v1.pdf
PWC	https://paperswithcode.com/paper/escape-room-a-configurable-testbed-for
Repo
Framework

Automatically Segmenting the Left Atrium from Cardiac Images Using Successive 3D U-Nets and a Contour Loss


Title	Automatically Segmenting the Left Atrium from Cardiac Images Using Successive 3D U-Nets and a Contour Loss
Authors	Shuman Jia, Antoine Despinasse, Zihao Wang, Hervé Delingette, Xavier Pennec, Pierre Jaïs, Hubert Cochet, Maxime Sermesant
Abstract	Radiological imaging offers effective measurement of anatomy, which is useful in disease diagnosis and assessment. A previous study has shown that left atrial wall remodeling can provide information to predict treatment outcome in atrial fibrillation. Nevertheless, the segmentation of the left atrial structures from medical images is still very time-consuming. Current advances in neural networks may help create automatic segmentation models that reduce the workload for clinicians. In this preliminary study, we propose automated, two-stage, three-dimensional U-Nets with convolutional neural networks for the challenging task of left atrial segmentation. Unlike previous two-dimensional image segmentation methods, we use 3D U-Nets to obtain the heart cavity directly in 3D. The dual 3D U-Net structure consists of a first U-Net to coarsely segment and locate the left atrium, and a second U-Net to accurately segment the left atrium at higher resolution. In addition, we introduce a Contour loss based on additional distance information to adjust the final segmentation. We randomly split the data into training datasets (80 subjects) and validation datasets (20 subjects) to train multiple models with different augmentation settings. Experiments show that the average Dice coefficients for the validation datasets are around 0.91-0.92, the sensitivity around 0.90-0.94, and the specificity 0.99. Compared with the traditional Dice loss, models trained with the Contour loss in general offer a smaller Hausdorff distance with a similar Dice coefficient, and have fewer connected components in their predictions. Finally, we integrate several trained models in an ensemble prediction to segment testing datasets.
Tasks	Semantic Segmentation
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02518v1
PDF	http://arxiv.org/pdf/1812.02518v1.pdf
PWC	https://paperswithcode.com/paper/automatically-segmenting-the-left-atrium-from
Repo
Framework
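
The abstract above reports Dice coefficient, sensitivity, and specificity for the predicted masks. As a reminder of what those numbers measure, here is a minimal sketch using the standard definitions on flattened binary masks. The function names and the toy masks are illustrative assumptions, not the authors' evaluation code.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two equal-length binary masks (flattened voxels)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(pred, truth):
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def sensitivity(pred, truth):
    """Fraction of true foreground voxels that were predicted as foreground."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn)


def specificity(pred, truth):
    """Fraction of true background voxels that were predicted as background."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp)


# Toy masks (hypothetical data): 1 = left atrium voxel, 0 = background.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0, 0, 0]
print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))
```

Specificity is typically very high in this setting because background voxels vastly outnumber foreground voxels, which is one reason overlap and distance measures such as Dice and Hausdorff distance are reported alongside it.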

Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters


Title	Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters
Authors	Christopher L. Choi, Jason Rebello, Leonid Koppel, Pranav Ganti, Arun Das, Steven L. Waslander
Abstract	Dynamic Camera Clusters (DCCs) are multi-camera systems where one or more cameras are mounted on actuated mechanisms such as a gimbal. Existing methods for DCC calibration rely on joint angle measurements to resolve the time-varying transformation between the dynamic and static camera. This information is usually provided by motor encoders; however, joint angle measurements are not always readily available on off-the-shelf mechanisms. In this paper, we present an encoderless approach for DCC calibration which simultaneously estimates the kinematic parameters of the transformation chain as well as the unknown joint angles. We also demonstrate the integration of an encoderless gimbal mechanism with a state-of-the-art VIO algorithm, and show the extensions required in order to perform simultaneous online estimation of the joint angles and vehicle localization state. The proposed calibration approach is validated both in simulation and on a physical DCC composed of a 2-DOF gimbal mounted on a UAV. Finally, we show the experimental results of the calibrated mechanism integrated into the OKVIS VIO package, and demonstrate successful online joint angle estimation while maintaining localization accuracy that is comparable to a standard static multi-camera configuration.
Tasks	Calibration
Published	2018-07-24
URL	http://arxiv.org/abs/1807.09304v1
PDF	http://arxiv.org/pdf/1807.09304v1.pdf
PWC	https://paperswithcode.com/paper/encoderless-gimbal-calibration-of-dynamic
Repo
Framework

Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation


Title	Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation
Authors	Lu Wang, Wei Zhang, Xiaofeng He, Hongyuan Zha
Abstract	Dynamic treatment recommendation systems based on large-scale electronic health records (EHRs) have become key to improving practical clinical outcomes. Prior studies recommend treatments using either supervised learning (e.g. matching the indicator signal, which denotes doctor prescriptions) or reinforcement learning (e.g. maximizing the evaluation signal, which indicates cumulative reward from survival rates). However, none of these studies has considered combining the benefits of supervised learning and reinforcement learning. In this paper, we propose Supervised Reinforcement Learning with Recurrent Neural Network (SRL-RNN), which fuses them into a synergistic learning framework. Specifically, SRL-RNN applies an off-policy actor-critic framework to handle complex relations among multiple medications, diseases and individual characteristics. The “actor” in the framework is adjusted by both the indicator signal and the evaluation signal to ensure effective prescription and low mortality. An RNN is further utilized to solve the Partially-Observed Markov Decision Process (POMDP) problem that arises from the lack of fully observed states in real-world applications. Experiments on the publicly available real-world dataset MIMIC-3 illustrate that our model can reduce the estimated mortality while providing promising accuracy in matching doctors’ prescriptions.
Tasks	Recommendation Systems
Published	2018-07-04
URL	http://arxiv.org/abs/1807.01473v2
PDF	http://arxiv.org/pdf/1807.01473v2.pdf
PWC	https://paperswithcode.com/paper/supervised-reinforcement-learning-with
Repo
Framework

Memory Time Span in LSTMs for Multi-Speaker Source Separation


Title	Memory Time Span in LSTMs for Multi-Speaker Source Separation
Authors	Jeroen Zegers, Hugo Van hamme
Abstract	With deep learning approaches becoming state-of-the-art in many speech (as well as non-speech) related machine learning tasks, efforts are being made to delve into the neural networks, which are often considered a black box. In this paper we analyze how recurrent neural networks (RNNs) cope with temporal dependencies by determining the relevant memory time span in a long short-term memory (LSTM) cell. This is done by leaking the state variable with a controlled lifetime and evaluating the task performance. This technique can be used for any task to estimate the time span the LSTM exploits in that specific scenario. The focus in this paper is on the task of separating speakers from overlapping speech. We discern two effects: a long-term effect, probably due to speaker characterization, and a short-term effect, probably exploiting phone-size formant tracks.
Tasks	Multi-Speaker Source Separation
Published	2018-08-24
URL	http://arxiv.org/abs/1808.08097v1
PDF	http://arxiv.org/pdf/1808.08097v1.pdf
PWC	https://paperswithcode.com/paper/memory-time-span-in-lstms-for-multi-speaker
Repo
Framework
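
To make the idea of "leaking the state variable with a controlled lifetime" concrete, the sketch below applies an exponential leak to a scalar accumulator that stands in for one LSTM cell-state unit. The exponential form `exp(-1 / lifetime)` and the names `leak_factor` and `leaky_states` are assumptions chosen for illustration; the abstract does not specify the exact leak formulation used in the paper.

```python
import math


def leak_factor(lifetime):
    """Per-step multiplier so that stored content decays to 1/e after `lifetime` steps.

    Assumption: an exponential leak; the paper's exact formulation may differ.
    """
    if lifetime <= 0:
        raise ValueError("lifetime must be positive")
    return math.exp(-1.0 / lifetime)


def leaky_states(inputs, lifetime):
    """Run a scalar accumulator c_t = a * c_{t-1} + x_t and return every c_t.

    The accumulator is a stand-in for a single cell-state unit with the
    forget and input gates fixed at 1, so only the leak limits memory.
    """
    a = leak_factor(lifetime)
    state = 0.0
    states = []
    for x in inputs:
        state = a * state + x
        states.append(state)
    return states


# A unit impulse at t = 0 followed by silence: the trace shows how long
# the information survives for a given lifetime.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
for lifetime in (1, 2, 10):
    print(lifetime, [round(s, 3) for s in leaky_states(impulse, lifetime)])
```

Sweeping the lifetime and measuring task performance at each setting is what lets one read off the time span the network actually relies on: performance stops improving once the lifetime exceeds the memory the task needs.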

OFF-ApexNet on Micro-expression Recognition System


Title	OFF-ApexNet on Micro-expression Recognition System
Authors	Sze-Teng Liong, Y. S. Gan, Wei-Chuen Yau, Yen-Chang Huang, Tan Lit Ken
Abstract	When a person attempts to conceal an emotion, the genuine emotion is manifest as a micro-expression. Exploration of automatic facial micro-expression recognition systems is relatively new in the computer vision domain. This is due to the difficulty in implementing optimal feature extraction methods to cope with the subtlety and brief motion characteristics of the expression. Most of the existing approaches extract the subtle facial movements based on hand-crafted features. In this paper, we address the micro-expression recognition task with a convolutional neural network (CNN) architecture, which well integrates the features extracted from each video. A new feature descriptor, Optical Flow Features from Apex frame Network (OFF-ApexNet), is introduced. This feature descriptor combines the optical flow guided context with the CNN. Firstly, we obtain the location of the apex frame from each video sequence as it portrays the highest intensity of facial motion among all frames. Then, the optical flow information is obtained from the apex frame and a reference frame (i.e., onset frame). Finally, the optical flow features are fed into a pre-designed CNN model for further feature enhancement as well as to carry out the expression classification. To evaluate the effectiveness of OFF-ApexNet, comprehensive evaluations are conducted on three public spontaneous micro-expression datasets (i.e., SMIC, CASME II and SAMM). The promising recognition result suggests that the proposed method can optimally describe the significant micro-expression details. In particular, we report that, in a multi-database with leave-one-subject-out cross-validation experimental protocol, the recognition performance reaches 74.60% of recognition accuracy and F-measure of 71.04%. We also note that this is the first work that performs cross-dataset validation on three databases in this domain.
Tasks	Optical Flow Estimation
Published	2018-05-10
URL	http://arxiv.org/abs/1805.08699v1
PDF	http://arxiv.org/pdf/1805.08699v1.pdf
PWC	https://paperswithcode.com/paper/off-apexnet-on-micro-expression-recognition
Repo
Framework

Post-mortem Human Iris Recognition


Title	Post-mortem Human Iris Recognition
Authors	Mateusz Trokielewicz, Adam Czajka, Piotr Maciejewicz
Abstract	This paper presents a unique analysis of post-mortem human iris recognition. Post-mortem human iris images were collected at the university mortuary in three sessions separated by approximately 11 hours, with the first session organized from 5 to 7 hours after demise. Analysis performed for four independent iris recognition methods shows that the common claim of the iris being useless for biometric identification soon after death is not entirely true. Since the pupil has a constant and neutral dilation after death (the so called “cadaveric position”), this makes the iris pattern perfectly visible from the standpoint of dilation. We found that more than 90% of irises are still correctly recognized when captured a few hours after death, and that serious iris deterioration begins approximately 22 hours later, since the recognition rate drops to a range of 13.3-73.3% (depending on the method used) when the cornea starts to be cloudy. There were only two failures to enroll (out of 104 images) observed for only a single method (out of four employed in this study). These findings show that the dynamics of post-mortem changes to the iris that are important for biometric identification are much more moderate than previously believed. To the best of our knowledge, this paper presents the first experimental study of how iris recognition works after death, and we hope that these preliminary findings will stimulate further research in this area.
Tasks	Iris Recognition
Published	2018-09-01
URL	http://arxiv.org/abs/1809.00208v1
PDF	http://arxiv.org/pdf/1809.00208v1.pdf
PWC	https://paperswithcode.com/paper/post-mortem-human-iris-recognition
Repo
Framework

Investor Reaction to Financial Disclosures Across Topics: An Application of Latent Dirichlet Allocation


Title	Investor Reaction to Financial Disclosures Across Topics: An Application of Latent Dirichlet Allocation
Authors	Stefan Feuerriegel, Nicolas Pröllochs
Abstract	This paper provides a holistic study of how stock prices vary in their response to financial disclosures across different topics. In particular, we shed light on the extensive number of filings for which no a priori categorization of their content exists. For this purpose, we utilize an approach from data mining, namely latent Dirichlet allocation, as a means of topic modeling. This technique facilitates our task of automatically categorizing, ex ante, the content of more than 70,000 regulatory 8-K filings from U.S. companies. We then evaluate the subsequent stock market reaction. Our empirical evidence suggests a considerable discrepancy among various types of news stories in terms of their relevance and impact on financial markets. For instance, we find a statistically significant abnormal return in response to earnings results and credit rating, but also for disclosures regarding business strategy, the health sector, as well as mergers and acquisitions. Our results yield findings that benefit managers, investors and policy-makers by indicating how regulatory filings should be structured and the topics most likely to precede changes in stock valuations.
Tasks
Published	2018-05-08
URL	http://arxiv.org/abs/1805.03308v1
PDF	http://arxiv.org/pdf/1805.03308v1.pdf
PWC	https://paperswithcode.com/paper/investor-reaction-to-financial-disclosures
Repo
Framework

BEST : A decision tree algorithm that handles missing values


Title	BEST : A decision tree algorithm that handles missing values
Authors	Cédric Beaulac, Jeffrey S. Rosenthal
Abstract	The main contribution of this paper is the development of a new decision tree algorithm. The proposed approach allows users to guide the algorithm through the data partitioning process. We believe this feature has many applications but in this paper we demonstrate how to utilize this algorithm to analyse data sets containing missing values. We tested our algorithm against simulated data sets with various missing data structures and a real data set. The results demonstrate that this new classification procedure efficiently handles missing values and produces results that are slightly more accurate and more interpretable than most common procedures without any imputations or pre-processing.
Tasks
Published	2018-04-26
URL	https://arxiv.org/abs/1804.10168v3
PDF	https://arxiv.org/pdf/1804.10168v3.pdf
PWC	https://paperswithcode.com/paper/best-a-decision-tree-algorithm-that-handles
Repo
Framework

Multi-Source Pointer Network for Product Title Summarization


Title	Multi-Source Pointer Network for Product Title Summarization
Authors	Fei Sun, Peng Jiang, Hanxiao Sun, Changhua Pei, Wenwu Ou, Xiaobo Wang
Abstract	In this paper, we study the product title summarization problem in E-commerce applications for display on mobile devices. Compared with conventional sentence summarization, product title summarization has some extra and essential constraints. For example, factual errors or loss of the key information are intolerable for E-commerce applications. Therefore, we abstract two more constraints for product title summarization: (i) do not introduce irrelevant information; (ii) retain the key information (e.g., brand name and commodity name). To address these issues, we propose a novel multi-source pointer network by adding a new knowledge encoder to the pointer network. The first constraint is handled by the pointer mechanism. For the second constraint, we restore the key information by copying words from the knowledge encoder with the help of the soft gating mechanism. For evaluation, we build a large collection of real-world product titles along with human-written short titles. Experimental results demonstrate that our model significantly outperforms the other baselines. Finally, online deployment of our proposed model has yielded a significant business impact, as measured by the click-through rate.
Tasks
Published	2018-08-21
URL	http://arxiv.org/abs/1808.06885v3
PDF	http://arxiv.org/pdf/1808.06885v3.pdf
PWC	https://paperswithcode.com/paper/multi-source-pointer-network-for-product
Repo
Framework

Multi-Observation Regression


Title	Multi-Observation Regression
Authors	Rafael Frongillo, Nishant A. Mehta, Tom Morgan, Bo Waggoner
Abstract	Recent work introduced loss functions which measure the error of a prediction based on multiple simultaneous observations or outcomes. In this paper, we explore the theoretical and practical questions that arise when using such multi-observation losses for regression on data sets of $(x,y)$ pairs. When a loss depends on only one observation, the average empirical loss decomposes by applying the loss to each pair, but for the multi-observation case, empirical loss is not even well-defined, and the possibility of statistical guarantees is unclear without several $(x,y)$ pairs with exactly the same $x$ value. We propose four algorithms formalizing the concept of empirical risk minimization for this problem, two of which have statistical guarantees in settings allowing both slow and fast convergence rates, but which are out-performed empirically by the other two. Empirical results demonstrate practicality of these algorithms in low-dimensional settings, while lower bounds demonstrate intrinsic difficulty in higher dimensions. Finally, we demonstrate the potential benefit of the algorithms over natural baselines that use traditional single-observation losses via both lower bounds and simulations.
Tasks
Published	2018-02-27
URL	http://arxiv.org/abs/1802.09680v1
PDF	http://arxiv.org/pdf/1802.09680v1.pdf
PWC	https://paperswithcode.com/paper/multi-observation-regression
Repo
Framework

Learning Montezuma’s Revenge from a Single Demonstration


Title	Learning Montezuma’s Revenge from a Single Demonstration
Authors	Tim Salimans, Richard Chen
Abstract	We propose a new method for learning from a single demonstration to solve hard exploration tasks like the Atari game Montezuma’s Revenge. Instead of imitating human demonstrations, as proposed in other recent works, our approach is to maximize rewards directly. Our agent is trained using off-the-shelf reinforcement learning, but starts every episode by resetting to a state from a demonstration. By starting from such demonstration states, the agent requires much less exploration to learn a game compared to when it starts from the beginning of the game at every episode. We analyze reinforcement learning for tasks with sparse rewards in a simple toy environment, where we show that the run-time of standard RL methods scales exponentially in the number of states between rewards. Our method reduces this to quadratic scaling, opening up many tasks that were previously infeasible. We then apply our method to Montezuma’s Revenge, for which we present a trained agent achieving a high-score of 74,500, better than any previously published result.
Tasks	Montezuma’s Revenge
Published	2018-12-08
URL	http://arxiv.org/abs/1812.03381v1
PDF	http://arxiv.org/pdf/1812.03381v1.pdf
PWC	https://paperswithcode.com/paper/learning-montezumas-revenge-from-a-single
Repo
Framework
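
The abstract above argues that sparse-reward tasks get exponentially harder as the number of states between rewards grows, and that starting episodes from demonstration states reduces the exploration needed. The toy calculation below illustrates that intuition only. It assumes a hypothetical chain in which exactly one of `n_actions` actions advances toward the reward and any other action ends the episode; it is not the toy environment analyzed in the paper, and `success_probability` is an illustrative name.

```python
def success_probability(steps_to_reward, n_actions):
    """Chance that uniformly random actions pick the single correct action
    at every one of `steps_to_reward` consecutive steps (hypothetical chain)."""
    if steps_to_reward < 0 or n_actions < 1:
        raise ValueError("need steps_to_reward >= 0 and n_actions >= 1")
    return (1.0 / n_actions) ** steps_to_reward


# A hypothetical demonstration that visits states 0..10; the reward is at state 10.
demo_states = list(range(11))
reward_index = len(demo_states) - 1

# Resetting the episode to a later demonstration state shortens the random
# exploration needed to see the reward for the first time.
for start in (0, 5, 9):
    remaining = reward_index - start
    p = success_probability(remaining, n_actions=4)
    print(f"start at demo state {start}: {remaining} steps left, P(success) = {p:.3g}")
```

Under these assumptions, an episode started one step before the reward succeeds a quarter of the time, while one started from the beginning almost never does, which is why resetting to demonstration states makes the reward reachable for an off-the-shelf learner.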

KinshipGAN: Synthesizing of Kinship Faces From Family Photos by Regularizing a Deep Face Network


Title	KinshipGAN: Synthesizing of Kinship Faces From Family Photos by Regularizing a Deep Face Network
Authors	Savas Ozkan, Akin Ozkan
Abstract	In this paper, we propose a kinship generator network that can synthesize a possible child face by analyzing his/her parent’s photo. To this end, we focus throughout the paper on handling the scarcity of kinship datasets by proposing novel solutions. To extract robust features, we integrate a pre-trained face model into the kinship face generator. Moreover, the generator network is regularized with an additional face dataset and an adversarial loss to reduce overfitting to the limited samples. Lastly, we adapt cycle-domain transformation to attain more stable results. Experiments are conducted on the Families in the Wild (FIW) dataset. The experimental results show that the contributions presented in the paper provide important performance improvements compared to the baseline architecture, and that our proposed method yields promising perceptual results.
Tasks
Published	2018-06-22
URL	http://arxiv.org/abs/1806.08600v2
PDF	http://arxiv.org/pdf/1806.08600v2.pdf
PWC	https://paperswithcode.com/paper/kinshipgan-synthesizing-of-kinship-faces-from
Repo
Framework

Motion Feature Network: Fixed Motion Filter for Action Recognition


Title	Motion Feature Network: Fixed Motion Filter for Action Recognition
Authors	Myunggi Lee, Seungeui Lee, Sungjoon Son, Gyutae Park, Nojun Kwak
Abstract	Spatio-temporal representations in frame sequences play an important role in the task of action recognition. Previously, a method using optical flow as temporal information, in combination with a set of RGB images that contain spatial information, has shown great performance enhancement in action recognition tasks. However, it has an expensive computational cost and requires a two-stream (RGB and optical flow) framework. In this paper, we propose MFNet (Motion Feature Network), containing motion blocks which make it possible to encode spatio-temporal information between adjacent frames in a unified network that can be trained end-to-end. The motion block can be attached to any existing CNN-based action recognition framework with only a small additional cost. We evaluated our network on two action recognition datasets (Jester and Something-Something) and achieved competitive performance on both datasets by training the networks from scratch.
Tasks	Action Recognition In Videos, Optical Flow Estimation, Temporal Action Localization
Published	2018-07-26
URL	http://arxiv.org/abs/1807.10037v2
PDF	http://arxiv.org/pdf/1807.10037v2.pdf
PWC	https://paperswithcode.com/paper/motion-feature-network-fixed-motion-filter
Repo
Framework

Deep Predictive Models in Interactive Music


Title	Deep Predictive Models in Interactive Music
Authors	Charles P. Martin, Kai Olav Ellefsen, Jim Torresen
Abstract	Musical performance requires prediction to operate instruments, to perform in groups and to improvise. In this paper, we investigate how a number of digital musical instruments (DMIs), including two of our own, have applied predictive machine learning models that assist users by predicting unknown states of musical processes. We characterise these predictions as focussed within a musical instrument, at the level of individual performers, and between members of an ensemble. These models can connect to existing frameworks for DMI design and have parallels in the cognitive predictions of human musicians. We discuss how recent advances in deep learning highlight the role of prediction in DMIs, by allowing data-driven predictive models with a long memory of past states. The systems we review are used to motivate musical use-cases where prediction is a necessary component, and to highlight a number of challenges for DMI designers seeking to apply deep predictive models in interactive music systems of the future.
Tasks
Published	2018-01-31
URL	http://arxiv.org/abs/1801.10492v3
PDF	http://arxiv.org/pdf/1801.10492v3.pdf
PWC	https://paperswithcode.com/paper/deep-predictive-models-in-interactive-music
Repo
Framework