January 29, 2020

3381 words 16 mins read

Paper Group ANR 668

Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective

Title Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective
Authors Nikki Lijing Kuang, Clement H. C. Leung
Abstract In reinforcement learning, a decision needs to be made at some point as to whether it is worthwhile to carry on with the learning process or to terminate it. In many such situations, stochastic elements govern the occurrence of rewards, with sequential occurrences of positive rewards randomly interleaved with negative rewards. For most practical learners, the learning is considered useful if the number of positive rewards always exceeds the negative ones. A situation that often calls for learning termination is when the number of negative rewards exceeds the number of positive rewards. However, while this rule seems reasonable, the error of premature termination, whereby learning is terminated and concluded a failure even though the positive rewards would eventually far outnumber the negative ones, can be significant. In this paper, using combinatorial analysis we study the probability of wrongly terminating a reinforcement learning activity, an error which undermines the effectiveness of an optimal policy, and we show that the resultant error can be quite high. Whilst we demonstrate mathematically that such errors can never be eliminated, we propose some practical mechanisms that can effectively reduce them. Simulation experiments have been carried out, and their results are in close agreement with our theoretical findings.
Tasks
Published 2019-02-11
URL http://arxiv.org/abs/1902.04179v1
PDF http://arxiv.org/pdf/1902.04179v1.pdf
PWC https://paperswithcode.com/paper/performance-dynamics-and-termination-errors
Repo
Framework
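
A minimal simulation can make the termination rule studied above concrete: stop as soon as negative rewards outnumber positive ones. The sketch below is illustrative only (i.i.d. ±1 rewards with success probability p > 0.5 are an assumption, and the helper name is hypothetical, not the authors' code); the classical gambler's-ruin identity P(ever reach -1) = (1-p)/p shows why the premature-termination error stays substantial even when learning would eventually succeed.

```python
import random

def premature_termination_prob(p, horizon=10_000, trials=2_000, seed=0):
    """Fraction of runs terminated by the rule 'stop once negatives
    outnumber positives', with i.i.d. +1/-1 rewards (+1 w.p. p)."""
    rng = random.Random(seed)
    terminated = 0
    for _ in range(trials):
        balance = 0                      # positives minus negatives so far
        for _ in range(horizon):
            balance += 1 if rng.random() < p else -1
            if balance < 0:              # negatives exceed positives: terminate
                terminated += 1
                break
    return terminated / trials

for p in (0.55, 0.6, 0.7):
    # Gambler's ruin: a walk with upward drift still reaches -1 w.p. (1-p)/p,
    # so the premature-termination error is high even for strong learners.
    print(f"p={p}: simulated {premature_termination_prob(p):.3f}, "
          f"closed form {(1 - p) / p:.3f}")
```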

Towards fully automated post-event data collection and analysis: pre-event and post-event information fusion

Title Towards fully automated post-event data collection and analysis: pre-event and post-event information fusion
Authors Ali Lenjani, Shirley J. Dyke, Ilias Bilionis, Chul Min Yeum, Kenzo Kamiya, Jongseong Choi, Xiaoyu Liu, Arindam G. Chowdhury
Abstract In post-event reconnaissance missions, engineers and researchers collect perishable information about damaged buildings in the affected geographical region to learn from the consequences of the event. A typical post-event reconnaissance mission is conducted by first doing a preliminary survey, followed by a detailed survey. The preliminary survey is typically conducted by driving slowly along a pre-determined route, observing the damage, and noting where further detailed data should be collected. This involves several manual, time-consuming steps that can be accelerated by exploiting recent advances in computer vision and artificial intelligence. The objective of this work is to develop and validate an automated technique to support post-event reconnaissance teams in the rapid collection of reliable and sufficiently comprehensive data for planning the detailed survey. The technique incorporates several methods designed to automate the process of categorizing buildings based on their key physical attributes, and rapidly assessing their post-event structural condition. It is divided into pre-event and post-event streams, each intended to extract all possible information about the target buildings from pre-event and post-event images, respectively. Algorithms based on convolutional neural networks (CNNs) are implemented for scene (image) classification. A probabilistic approach is developed to fuse the results obtained from analyzing several images to yield a robust decision regarding the attributes and condition of a target building. We validate the technique using post-event images captured during reconnaissance missions that took place after hurricanes Harvey and Irma. The validation data were collected by a structural wind and coastal engineering reconnaissance team, the National Science Foundation (NSF) funded Structural Extreme Events Reconnaissance (StEER) Network.
Tasks Image Classification
Published 2019-06-30
URL https://arxiv.org/abs/1907.05285v1
PDF https://arxiv.org/pdf/1907.05285v1.pdf
PWC https://paperswithcode.com/paper/towards-fully-automated-post-event-data
Repo
Framework
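
The probabilistic fusion step lends itself to a short sketch. Under the assumed simplification that images of a building are conditionally independent given its true attribute class, per-image CNN posteriors can be fused by summing log-probabilities; the function below is a hypothetical stand-in, not the paper's exact model.

```python
import numpy as np

def fuse_image_posteriors(posteriors, prior=None):
    """posteriors: (n_images, n_classes) per-image softmax outputs for one
    building; returns the fused posterior over the attribute classes."""
    posteriors = np.asarray(posteriors, dtype=float)
    n_classes = posteriors.shape[1]
    prior = np.full(n_classes, 1.0 / n_classes) if prior is None else np.asarray(prior)
    # Assuming uniform per-image priors, p(class | all images) is proportional
    # to prior * prod_i p(class | image_i); work in log space for stability.
    log_post = np.log(prior) + np.log(posteriors + 1e-12).sum(axis=0)
    log_post -= log_post.max()
    fused = np.exp(log_post)
    return fused / fused.sum()

# Three noisy views of the same building fuse into a confident decision.
views = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.7, 0.2, 0.1]]
print(fuse_image_posteriors(views))   # class 0 clearly dominates
```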

Learn to Segment Organs with a Few Bounding Boxes

Title Learn to Segment Organs with a Few Bounding Boxes
Authors Abhijeet Parida, Arianne Tran, Nassir Navab, Shadi Albarqouni
Abstract Semantic segmentation is an important task in the medical field to identify the exact extent and orientation of significant structures like organs and pathology. Deep neural networks can perform this task well by leveraging the information from a large well-labeled dataset. This paper aims to present a method that mitigates the necessity of an extensive well-labeled dataset. This method also addresses semi-supervision by enabling segmentation based on bounding box annotations, avoiding the need for full pixel-level annotations. The network presented consists of a single U-Net based unbranched architecture that generates a few-shot segmentation for an unseen human organ using just 4 example annotations of that specific organ. The network is trained by alternately minimizing the nearest neighbor loss for prototype learning and a weighted cross-entropy loss for segmentation learning, to perform a fast 3D segmentation with a median score of 54.64%.
Tasks Semantic Segmentation
Published 2019-09-17
URL https://arxiv.org/abs/1909.07809v1
PDF https://arxiv.org/pdf/1909.07809v1.pdf
PWC https://paperswithcode.com/paper/learn-to-segment-organs-with-a-few-bounding
Repo
Framework
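
The prototype-learning idea admits a compact illustration. The sketch below is a simplified assumption of how nearest-prototype segmentation works (support features averaged inside box-derived masks into a foreground prototype, query pixels labeled by the nearest prototype); it is not the paper's U-Net pipeline, and all names are hypothetical.

```python
import numpy as np

def prototypes(support_feats, support_masks):
    """support_feats: (k, H, W, C) features; support_masks: (k, H, W) in {0,1}.
    Averages support features into foreground/background prototypes."""
    f = support_feats.reshape(-1, support_feats.shape[-1])
    m = support_masks.reshape(-1).astype(bool)
    return f[m].mean(axis=0), f[~m].mean(axis=0)

def segment(query_feats, fg_proto, bg_proto):
    """Labels each query pixel by its nearest prototype (1 = organ)."""
    d_fg = np.linalg.norm(query_feats - fg_proto, axis=-1)
    d_bg = np.linalg.norm(query_feats - bg_proto, axis=-1)
    return (d_fg < d_bg).astype(np.uint8)

# Toy run with k=4 support examples, mirroring the paper's 4-shot setting.
rng = np.random.default_rng(0)
feats, masks = rng.normal(size=(4, 8, 8, 16)), rng.integers(0, 2, (4, 8, 8))
fg, bg = prototypes(feats, masks)
print(segment(rng.normal(size=(8, 8, 16)), fg, bg).shape)   # (8, 8)
```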

Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization

Title Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization
Authors Michael Volpp, Lukas P. Fröhlich, Kirsten Fischer, Andreas Doerr, Stefan Falkner, Frank Hutter, Christian Daniel
Abstract Transferring knowledge across tasks to improve data-efficiency is one of the open key challenges in the field of global black-box optimization. Readily available algorithms are typically designed to be universal optimizers and, therefore, often suboptimal for specific tasks. We propose a novel transfer learning method to obtain customized optimizers within the well-established framework of Bayesian optimization, allowing our algorithm to utilize the proven generalization capabilities of Gaussian processes. Using reinforcement learning to meta-train an acquisition function (AF) on a set of related tasks, the proposed method learns to extract implicit structural information and to exploit it for improved data-efficiency. We present experiments on a simulation-to-real transfer task as well as on several synthetic functions and on two hyperparameter search problems. The results show that our algorithm (1) automatically identifies structural properties of objective functions from available source tasks or simulations, (2) performs favourably in settings with both scarce and abundant source data, and (3) falls back to the performance level of general AFs if no particular structure is present.
Tasks Gaussian Processes, Meta-Learning, Transfer Learning
Published 2019-04-04
URL https://arxiv.org/abs/1904.02642v5
PDF https://arxiv.org/pdf/1904.02642v5.pdf
PWC https://paperswithcode.com/paper/meta-learning-acquisition-functions-for
Repo
Framework
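
The overall loop is standard Bayesian optimization with the hand-designed acquisition function swapped for a learned one. Below is a schematic sketch under stated assumptions: `learned_af` is a stand-in for the meta-trained network (here a UCB-like placeholder rather than an RL-trained model), and the surrogate is scikit-learn's off-the-shelf `GaussianProcessRegressor`.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def learned_af(mu, sigma, t):
    # Placeholder for the meta-trained acquisition network; MetaBO trains
    # this mapping with RL on source tasks. Here: a simple UCB-like form.
    return mu + (1.0 / np.sqrt(t + 1)) * sigma

def bo_step(X, y, candidates, t):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    return candidates[np.argmax(learned_af(mu, sigma, t))]

# Maximize a toy 1-D objective for a few BO iterations.
f = lambda x: -np.sin(3 * x) - x**2 + 0.7 * x
X = np.array([[0.0], [1.0]]); y = f(X).ravel()
cands = np.linspace(-1.0, 2.0, 200).reshape(-1, 1)
for t in range(5):
    x_next = bo_step(X, y, cands, t)
    X, y = np.vstack([X, [x_next]]), np.append(y, f(x_next))
print("best x:", X[np.argmax(y)].item(), "best y:", y.max())
```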

Less is More: Learning Highlight Detection from Video Duration

Title Less is More: Learning Highlight Detection from Video Duration
Authors Bo Xiong, Yannis Kalantidis, Deepti Ghadiyaram, Kristen Grauman
Abstract Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos. We propose a scalable unsupervised solution that exploits video duration as an implicit supervision signal. Our key insight is that video segments from shorter user-generated videos are more likely to be highlights than those from longer videos, since users tend to be more selective about the content when capturing shorter videos. Leveraging this insight, we introduce a novel ranking framework that prefers segments from shorter videos, while properly accounting for the inherent noise in the (unlabeled) training data. We use it to train a highlight detector with 10M hashtagged Instagram videos. In experiments on two challenging public video highlight detection benchmarks, our method substantially improves the state-of-the-art for unsupervised highlight detection.
Tasks
Published 2019-03-03
URL http://arxiv.org/abs/1903.00859v1
PDF http://arxiv.org/pdf/1903.00859v1.pdf
PWC https://paperswithcode.com/paper/less-is-more-learning-highlight-detection
Repo
Framework
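
The duration signal can be turned into a conventional pairwise ranking objective. The sketch below assumes a hinge form and approximates the paper's explicit noise modeling by simply discarding the highest-loss pairs; both are simplifications, not the authors' exact loss.

```python
import numpy as np

def duration_ranking_loss(scores_short, scores_long, margin=1.0, keep_frac=0.8):
    """Hinge ranking loss over (short-video, long-video) segment pairs.
    Dropping the highest-loss pairs is a crude stand-in for the paper's
    explicit modeling of label noise in the duration signal."""
    losses = np.maximum(0.0, margin - (scores_short - scores_long))
    k = max(1, int(keep_frac * losses.size))
    return np.sort(losses)[:k].mean()

s_short = np.array([2.1, 0.3, 1.7, 1.9])   # detector scores, short-video segments
s_long = np.array([0.5, 1.8, 0.2, 0.4])    # detector scores, long-video segments
print(duration_ranking_loss(s_short, s_long))
```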

ZeLiC and ZeChipC: Time Series Interpolation Methods for Lebesgue or Event-based Sampling

Title ZeLiC and ZeChipC: Time Series Interpolation Methods for Lebesgue or Event-based Sampling
Authors Matthieu Bellucci, Luis Miralles, M. Atif Qureshi, Brian Mac Namee
Abstract Lebesgue sampling is based on collecting information depending on the values of the signal. Although interpolation methods for periodic sampling have been a topic of research for a long time, there is a lack of study of methods capable of taking advantage of the Lebesgue sampling characteristics to reconstruct time series more accurately. Indeed, Lebesgue sampling contains additional information about the shape of the signal in-between two sampled points. Using this information allows us to generate an interpolated signal closer to the original one; that is to say, the average distance between the interpolated signal and the original signal will be smaller than for a signal interpolated with other interpolation methods. In this paper, we propose two novel time series interpolation methods specifically designed for Lebesgue sampling, called ZeLiC and ZeChipC. ZeLiC is an algorithm that combines Zero-order hold interpolation and Linear interpolation to reconstruct time series. ZeChipC follows the same idea, combining Zero-order hold and PCHIP interpolation. Zero-order hold interpolation is favourable for interpolating abrupt changes, while Linear and PCHIP interpolation are more suitable for smooth transitions. In order to decide when to apply one method or the other, we introduce a new concept called the tolerated region. ZeLiC and ZeChipC also include a new functionality to adapt the reconstructed signal to concave/convex regions. The proposed methods have been compared with state-of-the-art interpolation methods using Lebesgue sampling and have offered higher average performance. Additionally, we have compared the performance of the methods using both Riemann and Lebesgue sampling with approximately the same number of sampled points. The performance of the combination “Lebesgue sampling with ZeChipC interpolation method” is clearly much better than any other combination.
Tasks Time Series
Published 2019-06-06
URL https://arxiv.org/abs/1906.03110v1
PDF https://arxiv.org/pdf/1906.03110v1.pdf
PWC https://paperswithcode.com/paper/zelic-and-zechipc-time-series-interpolation
Repo
Framework
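
The core switching rule can be sketched in a few lines. The version below is a crude simplification (a fixed value-jump tolerance standing in for the paper's tolerated-region concept, and no concave/convex adaptation): zero-order hold across abrupt changes, linear interpolation across smooth ones.

```python
import numpy as np

def zelic_like(t_samples, y_samples, t_query, tol):
    """Hybrid reconstruction of a Lebesgue-sampled signal: hold across big
    jumps, interpolate linearly across small ones."""
    y_out = np.empty(len(t_query))
    for j, t in enumerate(t_query):
        i = np.clip(np.searchsorted(t_samples, t, side="right") - 1,
                    0, len(t_samples) - 2)
        t0, t1 = t_samples[i], t_samples[i + 1]
        y0, y1 = y_samples[i], y_samples[i + 1]
        if abs(y1 - y0) > tol:       # abrupt change: zero-order hold
            y_out[j] = y0
        else:                        # smooth transition: linear interpolation
            y_out[j] = y0 + (y1 - y0) * (t - t0) / (t1 - t0)
    return y_out

t_s, y_s = np.array([0.0, 1.0, 2.0, 3.0]), np.array([0.0, 0.1, 2.0, 2.1])
print(zelic_like(t_s, y_s, np.linspace(0.0, 3.0, 7), tol=0.5))
```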

Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring

Title Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Authors Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
Abstract The use of deep pre-trained bidirectional transformers has led to remarkable progress in a number of applications (Devlin et al., 2018). For tasks that make pairwise comparisons between sequences, matching a given input with a corresponding label, two approaches are common: Cross-encoders performing full self-attention over the pair and Bi-encoders encoding the pair separately. The former often performs better, but is too slow for practical use. In this work, we develop a new transformer architecture, the Poly-encoder, that learns global rather than token level self-attention features. We perform a detailed comparison of all three approaches, including what pre-training and fine-tuning strategies work best. We show our models achieve state-of-the-art results on three existing tasks; that Poly-encoders are faster than Cross-encoders and more accurate than Bi-encoders; and that the best results are obtained by pre-training on large datasets similar to the downstream tasks.
Tasks Conversational Response Selection
Published 2019-04-22
URL https://arxiv.org/abs/1905.01969v4
PDF https://arxiv.org/pdf/1905.01969v4.pdf
PWC https://paperswithcode.com/paper/190501969
Repo
Framework
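
The scoring scheme can be illustrated with untrained weights. In the sketch below (shapes and names assumed, no multi-head machinery), m learned codes attend over the context token vectors to produce m global features, the candidate embedding attends over those features, and the final score is a dot product, which keeps candidate representations precomputable as in a Bi-encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, keys, values):
    return softmax(queries @ keys.T / np.sqrt(keys.shape[-1])) @ values

def poly_encoder_score(ctx_tokens, cand_vec, codes):
    """ctx_tokens: (T, d) context token vectors; cand_vec: (d,) candidate
    embedding; codes: (m, d) learned context codes."""
    ctx_feats = attend(codes, ctx_tokens, ctx_tokens)          # (m, d)
    ctx_emb = attend(cand_vec[None, :], ctx_feats, ctx_feats)  # (1, d)
    return (ctx_emb @ cand_vec).item()

rng = np.random.default_rng(0)
d, T, m = 64, 20, 4                       # m = number of learned codes
print(poly_encoder_score(rng.normal(size=(T, d)),
                         rng.normal(size=d),
                         rng.normal(size=(m, d))))
```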

Differential Privacy-enabled Federated Learning for Sensitive Health Data

Title Differential Privacy-enabled Federated Learning for Sensitive Health Data
Authors Olivia Choudhury, Aris Gkoulalas-Divanis, Theodoros Salonidis, Issa Sylla, Yoonyoung Park, Grace Hsu, Amar Das
Abstract Leveraging real-world health data for machine learning tasks requires addressing many practical challenges, such as distributed data silos, privacy concerns with creating a centralized database from person-specific sensitive data, resource constraints for transferring and integrating data from multiple sites, and risk of a single point of failure. In this paper, we introduce a federated learning framework that can learn a global model from distributed health data held locally at different sites. The framework offers two levels of privacy protection. First, it does not move or share raw data across sites or with a centralized server during the model training process. Second, it uses a differential privacy mechanism to further protect the model from potential privacy attacks. We perform a comprehensive evaluation of our approach on two healthcare applications, using real-world electronic health data of 1 million patients. We demonstrate the feasibility and effectiveness of the federated learning framework in offering an elevated level of privacy and maintaining utility of the global model.
Tasks
Published 2019-10-07
URL https://arxiv.org/abs/1910.02578v3
PDF https://arxiv.org/pdf/1910.02578v3.pdf
PWC https://paperswithcode.com/paper/differential-privacy-enabled-federated
Repo
Framework
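
The two protection levels map onto a short federated-averaging sketch: raw data never leaves a site, and each site's update is clipped and perturbed with Gaussian noise before aggregation. This is an assumed bare-bones mechanism (a least-squares toy model with hypothetical names), not the authors' framework, and it omits privacy accounting.

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1, epochs=5):
    w = global_w.copy()
    for _ in range(epochs):               # toy least-squares gradient steps
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w - global_w                   # only the update leaves the site

def dp_federated_round(global_w, sites, clip=1.0, noise_mult=0.5, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    noisy = []
    for X, y in sites:                    # raw (X, y) is never transmitted
        u = local_update(global_w, X, y)
        u = u * min(1.0, clip / (np.linalg.norm(u) + 1e-12))  # bound sensitivity
        noisy.append(u + rng.normal(0.0, noise_mult * clip, u.shape))
    return global_w + np.mean(noisy, axis=0)

def make_site(rng, true_w, n=50):
    X = rng.normal(size=(n, 2))
    return X, X @ true_w + rng.normal(0.0, 0.1, n)

rng = np.random.default_rng(1)
true_w = np.array([1.0, -2.0])
sites = [make_site(rng, true_w) for _ in range(3)]
w = np.zeros(2)
for _ in range(30):
    w = dp_federated_round(w, sites, rng=rng)
print(w)   # drifts toward true_w, up to the injected DP noise
```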

Joint-task Self-supervised Learning for Temporal Correspondence

Title Joint-task Self-supervised Learning for Temporal Correspondence
Authors Xueting Li, Sifei Liu, Shalini De Mello, Xiaolong Wang, Jan Kautz, Ming-Hsuan Yang
Abstract This paper proposes to learn reliable dense correspondence from videos in a self-supervised manner. Our learning process integrates two highly related tasks: tracking large image regions \emph{and} establishing fine-grained pixel-level associations between consecutive video frames. We exploit the synergy between both tasks through a shared inter-frame affinity matrix, which simultaneously models transitions between video frames at both the region and pixel levels. While region-level localization helps reduce ambiguities in fine-grained matching by narrowing down search regions, fine-grained matching provides bottom-up features to facilitate region-level localization. Our method outperforms the state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video-object and part-segmentation propagation, keypoint tracking, and object tracking. Our self-supervised method even surpasses the fully-supervised affinity feature representation obtained from a ResNet-18 pre-trained on ImageNet.
Tasks Object Tracking
Published 2019-09-26
URL https://arxiv.org/abs/1909.11895v1
PDF https://arxiv.org/pdf/1909.11895v1.pdf
PWC https://paperswithcode.com/paper/joint-task-self-supervised-learning-for
Repo
Framework
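
The shared quantity is easy to sketch: a softmax-normalized affinity matrix between per-pixel features of two frames, which can propagate labels from one frame to the other. The formulation below is an assumed simplification at the pixel level only (names hypothetical), without the region-level branch.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def affinity(feat1, feat2, temperature=0.07):
    """feat1, feat2: (N, C) L2-normalized per-pixel features of two frames;
    returns an (N, N) row-stochastic inter-frame transition matrix."""
    return softmax(feat1 @ feat2.T / temperature, axis=1)

def propagate(A, labels2):
    """Carry per-pixel labels from frame 2 back to frame 1 via the affinity."""
    return A @ labels2                    # (N, n_classes)

rng = np.random.default_rng(0)
f1 = rng.normal(size=(16, 8)); f1 /= np.linalg.norm(f1, axis=1, keepdims=True)
f2 = rng.normal(size=(16, 8)); f2 /= np.linalg.norm(f2, axis=1, keepdims=True)
mask2 = np.eye(3)[rng.integers(0, 3, 16)]   # one-hot frame-2 segmentation
print(propagate(affinity(f1, f2), mask2).shape)   # (16, 3)
```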

Infant Contact-less Non-Nutritive Sucking Pattern Quantification via Facial Gesture Analysis

Title Infant Contact-less Non-Nutritive Sucking Pattern Quantification via Facial Gesture Analysis
Authors Xiaofei Huang, Alaina Martens, Emily Zimmerman, Sarah Ostadabbas
Abstract Non-nutritive sucking (NNS) is defined as the sucking action that occurs when a finger, pacifier, or other object is placed in the baby’s mouth but no nutrient is delivered. In addition to providing a sense of safety, NNS can even be regarded as an indicator of an infant’s central nervous system development. Rich data such as the sucking frequency, the number of sucking cycles, and their amplitude during a baby’s non-nutritive sucking are important clues for judging the brain development of infants or preterm infants. Nowadays, most researchers collect NNS data using contact devices such as pressure transducers. However, such invasive contact has a direct impact on the baby’s natural sucking behavior, resulting in significant distortion in the collected data. Therefore, we propose a novel contact-less NNS data acquisition and quantification scheme, which leverages facial landmark tracking technology to extract the movement signals of the baby’s jaw from recorded sucking videos. Since completion of the sucking action requires a large amount of synchronous coordination and neural integration of the facial muscles and the cranial nerves, the facial muscle movement signals accompanying a baby’s sucking on a pacifier can indirectly substitute for the NNS signal. We have evaluated our method on videos collected from several infants during their NNS behaviors, and we have achieved quantified NNS patterns closely comparable to results from visual inspection as well as contact-based sensor readings.
Tasks
Published 2019-06-05
URL https://arxiv.org/abs/1906.01821v1
PDF https://arxiv.org/pdf/1906.01821v1.pdf
PWC https://paperswithcode.com/paper/infant-contact-less-non-nutritive-sucking
Repo
Framework
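
Once the jaw landmark has been tracked (assumed here to be done by an off-the-shelf facial-landmark detector), quantifying the sucking pattern reduces to signal processing. A hypothetical sketch: detrend the vertical jaw trajectory and read the dominant frequency off its spectrum.

```python
import numpy as np

def sucking_frequency(jaw_y, fps):
    """jaw_y: per-frame vertical position of a tracked jaw landmark (pixels);
    returns the dominant oscillation frequency in Hz."""
    sig = jaw_y - np.mean(jaw_y)               # remove the baseline position
    spectrum = np.abs(np.fft.rfft(sig))
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fps)
    return freqs[1:][np.argmax(spectrum[1:])]  # skip the DC bin

# Synthetic 30 fps trajectory: 2 Hz sucking plus slow drift and jitter.
rng = np.random.default_rng(0)
fps, seconds = 30, 10
t = np.arange(fps * seconds) / fps
jaw_y = 0.5 * np.sin(2 * np.pi * 2.0 * t) + 0.05 * t + 0.05 * rng.normal(size=t.size)
print(sucking_frequency(jaw_y, fps))   # ~2.0 Hz
```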

Predicting Eating Events in Free Living Individuals – A Technical Report

Title Predicting Eating Events in Free Living Individuals – A Technical Report
Authors Jiayi Wang, Jiue-An Yang, Supun Nakandala, Arun Kumar, Marta M. Jankowska
Abstract This technical report records the experiments of applying multiple machine learning algorithms for predicting eating and food purchasing behaviors of free-living individuals. Data was collected with an accelerometer, a global positioning system (GPS), and body-worn cameras called SenseCam over a one-week period from 81 individuals of a variety of ages and demographic backgrounds. These data were turned into minute-level features from sensors as well as engineered features that included time (e.g., time since last eating) and environmental context (e.g., distance to nearest grocery store). Algorithms included Logistic Regression, RBF-SVM, Random Forest, and Gradient Boosting. Our results show that the Gradient Boosting model has the highest mean accuracy score (0.7289) for predicting eating events 0 to 4 minutes in advance. For predicting food purchasing events, the RBF-SVM model (0.7395) outperforms the others. For both prediction tasks, temporal and spatial features were important contributors to predicting eating and food purchasing events.
Tasks
Published 2019-08-14
URL https://arxiv.org/abs/1908.05304v1
PDF https://arxiv.org/pdf/1908.05304v1.pdf
PWC https://paperswithcode.com/paper/predicting-eating-events-in-free-living
Repo
Framework
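
The best-performing setup is a standard gradient-boosting classifier over the engineered features. A minimal sketch with synthetic stand-in features (the feature names and label construction are hypothetical, for illustration only):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.normal(size=n),              # stand-in: accelerometer magnitude
    rng.exponential(60.0, size=n),   # stand-in: minutes since last eating
    rng.exponential(500.0, size=n),  # stand-in: meters to nearest grocery store
])
# Synthetic label loosely tied to "time since last eating", for illustration.
y = (X[:, 1] + rng.normal(0.0, 20.0, n) > 90.0).astype(int)

clf = GradientBoostingClassifier()
print(cross_val_score(clf, X, y, cv=5).mean())   # mean accuracy on toy data
```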

Determination of the Mitotically Most Active Region for Computer-Aided Mitotic Count

Title Determination of the Mitotically Most Active Region for Computer-Aided Mitotic Count
Authors Marc Aubreville, Christof A. Bertram, Christian Marzahl, Corinne Gurtner, Martina Dettwiler, Anja Schmidt, Florian Bartenschlager, Sophie Merz, Marco Fragoso, Olivia Kershaw, Robert Klopfleisch, Andreas Maier
Abstract Manual count of mitotic figures, which is determined in the tumor region with the highest mitotic activity, is a key parameter of most grading schemes. It can be, however, strongly dependent on the area selection due to uneven mitotic figure distribution. We aimed to assess how significantly the area selection impacts the mitotic count, which is known to have high inter-rater disagreement. On a data set of 32 whole slide images of H&E-stained canine cutaneous mast cell tumor, fully annotated for mitotic figures, we asked 8 veterinary pathologists (5 board-certified, 3 in training) to select a field of interest for the mitotic count. To assess the potential difference in grading, we compared the mitotic count of the selected regions to the overall distribution on the slide. Additionally, we evaluated three deep learning-based methods on the same task: In the first approach, the model directly predicts the mitotic count for the presented image patches as a regression task. The second method derives a segmentation mask for mitotic figures, which is then used to obtain a mitotic density. Finally, we evaluated a two-stage object-detection pipeline based on state-of-the-art architectures to identify individual mitotic figures. We found that the predictions by all models were, on average, better than those of the experts. The two-stage object detector performed best and outperformed most of the human experts on the majority of tumor cases. The correlation between the predicted and the ground truth mitotic count was also best for this approach (0.963 to 0.979). Further, we found considerable differences in position selection between experts, which could partially explain the high variance that has been reported for the manual mitotic count. To achieve better inter-rater agreement, we propose to use a computer-based area selection for the manual mitotic count.
Tasks Object Detection
Published 2019-02-12
URL https://arxiv.org/abs/1902.05414v2
PDF https://arxiv.org/pdf/1902.05414v2.pdf
PWC https://paperswithcode.com/paper/field-of-interest-prediction-for-computer
Repo
Framework
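
The proposed computer-based area selection amounts to finding the window with the highest aggregated mitotic density. A schematic sketch (assumed interface; the density map would come from any of the three models) using a summed-area table for efficiency:

```python
import numpy as np

def select_field(density_map, field_h, field_w):
    """density_map: 2-D predicted mitotic density over the slide; returns the
    top-left corner and aggregated count of the most active field."""
    H, W = density_map.shape
    I = np.pad(density_map.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    best, best_pos = -1.0, (0, 0)
    for r in range(H - field_h + 1):
        for c in range(W - field_w + 1):     # summed-area table lookup
            s = (I[r + field_h, c + field_w] - I[r, c + field_w]
                 - I[r + field_h, c] + I[r, c])
            if s > best:
                best, best_pos = s, (r, c)
    return best_pos, best

rng = np.random.default_rng(0)
density = rng.random((64, 64))
density[40:50, 10:20] += 2.0                 # plant a mitotic hot spot
print(select_field(density, 10, 10))         # ((40, 10), ...) expected
```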

PAC-Bayesian Transportation Bound

Title PAC-Bayesian Transportation Bound
Authors Kohei Miyaguchi
Abstract Empirically, the PAC-Bayesian analysis is known to produce tight risk bounds for practical machine learning algorithms. However, in its naive form, it can only deal with stochastic predictors, while such predictors are rarely used and deterministic predictors often perform well in practice. To fill this gap, we develop a new generalization error bound, the PAC-Bayesian transportation bound, unifying the PAC-Bayesian analysis and the chaining method in view of optimal transportation. It is the first PAC-Bayesian bound that relates the risks of any two predictors according to their distance, and it is capable of evaluating the cost of de-randomizing stochastic predictors faced with continuous loss functions. As an example, we give an upper bound on the de-randomization cost of spectrally normalized neural networks (NNs) to evaluate how much randomness contributes to the generalization of NNs.
Tasks
Published 2019-05-31
URL https://arxiv.org/abs/1905.13435v3
PDF https://arxiv.org/pdf/1905.13435v3.pdf
PWC https://paperswithcode.com/paper/pac-bayesian-transportation-bound
Repo
Framework
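
For context, the classical McAllester/Maurer-type PAC-Bayes bound that this line of work starts from (a standard result, stated here for orientation; it is not the paper's new transportation bound) reads:

```latex
% Classical PAC-Bayes bound (McAllester/Maurer form): for a prior \pi, any
% posterior \rho, true risk R, empirical risk \hat{R} on n i.i.d. samples,
% with probability at least 1 - \delta, simultaneously for all \rho:
\mathbb{E}_{h \sim \rho}\!\left[ R(h) \right]
  \;\le\;
\mathbb{E}_{h \sim \rho}\!\left[ \hat{R}(h) \right]
  + \sqrt{ \frac{ \mathrm{KL}(\rho \,\|\, \pi) + \ln \frac{2\sqrt{n}}{\delta} }{ 2n } }
```

The transportation bound extends this style of guarantee so that the risks of any two predictors can be related through their distance, which is what makes the de-randomization cost quantifiable.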

Two-Headed Monster And Crossed Co-Attention Networks

Title Two-Headed Monster And Crossed Co-Attention Networks
Authors Yaoyiran Li, Jing Jiang
Abstract This paper presents some preliminary investigations of a new co-attention mechanism in neural transduction models. We propose a paradigm, termed Two-Headed Monster (THM), which consists of two symmetric encoder modules and one decoder module connected with co-attention. As a specific and concrete implementation of THM, Crossed Co-Attention Networks (CCNs) are designed based on the Transformer model. We demonstrate CCNs on WMT 2014 EN-DE and WMT 2016 EN-FI translation tasks and our model outperforms the strong Transformer baseline by 0.51 (big) and 0.74 (base) BLEU points on EN-DE and by 0.17 (big) and 0.47 (base) BLEU points on EN-FI.
Tasks
Published 2019-11-10
URL https://arxiv.org/abs/1911.03897v1
PDF https://arxiv.org/pdf/1911.03897v1.pdf
PWC https://paperswithcode.com/paper/two-headed-monster-and-crossed-co-attention
Repo
Framework
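
The "crossing" can be illustrated directly. In the toy sketch below (an assumed reading of the abstract, with untrained numpy weights and no multi-head machinery), each encoder stream's queries attend over the other stream's keys and values:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V

def crossed_co_attention(H1, H2):
    """H1: (T1, d) and H2: (T2, d) outputs of the two symmetric encoders."""
    out1 = attention(H1, H2, H2)   # stream 1 queries attend over stream 2
    out2 = attention(H2, H1, H1)   # stream 2 queries attend over stream 1
    return out1, out2              # both feed the decoder downstream

rng = np.random.default_rng(0)
o1, o2 = crossed_co_attention(rng.normal(size=(5, 32)),
                              rng.normal(size=(7, 32)))
print(o1.shape, o2.shape)          # (5, 32) (7, 32)
```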

Underwater Stereo using Refraction-free Image Synthesized from Light Field Camera

Title Underwater Stereo using Refraction-free Image Synthesized from Light Field Camera
Authors Kazuto Ichimaru, Hiroshi Kawasaki
Abstract There is a strong demand for capturing underwater scenes without the distortions caused by refraction. Since a light field camera can capture several light rays at each point of an image plane from various directions, if geometrically correct rays are chosen, it is possible to synthesize a refraction-free image. In this paper, we propose a novel technique to efficiently select such rays to synthesize a refraction-free image from an underwater image captured by a light field camera. In addition, we propose a stereo technique to reconstruct 3D shapes using a pair of our refraction-free images, which are central projections. In the experiment, we captured several underwater scenes with two light field cameras, synthesized refraction-free images, and applied the stereo technique to reconstruct 3D shapes. The results are compared with previous techniques based on approximations, showing the strength of our method.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09588v1
PDF https://arxiv.org/pdf/1905.09588v1.pdf
PWC https://paperswithcode.com/paper/underwater-stereo-using-refraction-free-image
Repo
Framework
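
The distortion being undone is ordinary Snell's-law refraction at the flat air/water interface, which destroys the central-projection property. A small geometric sketch of that effect (standard optics, not the paper's ray-selection algorithm):

```python
import numpy as np

def refract(d, normal, n1=1.0, n2=1.33):
    """Refract unit direction d at a plane with unit normal pointing toward
    the incoming side (Snell's law); returns None on total internal reflection."""
    d, normal = np.asarray(d, float), np.asarray(normal, float)
    cos_i = -d @ normal
    ratio = n1 / n2
    sin2_t = ratio**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None
    cos_t = np.sqrt(1.0 - sin2_t)
    return ratio * d + (ratio * cos_i - cos_t) * normal

# A ray 30 degrees off the interface normal bends to ~22 degrees in water,
# and the bend depends on the angle: straight lines no longer stay straight.
theta = np.radians(30.0)
d_in = np.array([np.sin(theta), 0.0, -np.cos(theta)])
d_out = refract(d_in, np.array([0.0, 0.0, 1.0]))
print(np.degrees(np.arcsin(np.hypot(d_out[0], d_out[1]))))   # ~22.1
```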