October 17, 2019


Paper Group ANR 892

Recurrent Neural Networks based Obesity Status Prediction Using Activity Data. 3D Video Quality Assessment. Refacing: reconstructing anonymized facial features using GANs. Plug-In Stochastic Gradient Method. LinkNBed: Multi-Graph Representation Learning with Entity Linkage. Data-efficient Auto-tuning with Bayesian Optimization: An Industrial Contro …

Recurrent Neural Networks based Obesity Status Prediction Using Activity Data


Title	Recurrent Neural Networks based Obesity Status Prediction Using Activity Data
Authors	Qinghan Xue, Xiaoran Wang, Samuel Meehan, Jilong Kuang, Alex Gao, Mooi Choo Chuah
Abstract	Obesity is a serious public health concern worldwide, which increases the risk of many diseases, including hypertension, stroke, and type 2 diabetes. To tackle this problem, researchers across the health ecosystem are collecting diverse types of data, including biomedical, behavioral, and activity data, and utilizing machine learning techniques to mine hidden patterns for obesity status improvement prediction. While existing machine learning methods such as Recurrent Neural Networks (RNNs) can provide exceptional results, it is challenging to discover hidden patterns of the sequential data due to the irregular observation time instances. Meanwhile, the lack of understanding of why those learning models are effective also limits further improvements on their architectures. Thus, in this work, we develop an RNN-based time-aware architecture to tackle the challenging problem of handling irregular observation times and relevant feature extractions from longitudinal patient records for obesity status improvement prediction. To improve the prediction performance, we train our model using two data sources: (i) electronic medical records containing information regarding lab tests, diagnoses, and demographics; (ii) continuous activity data collected from popular wearables. Evaluations on real-world data demonstrate that our proposed method can capture the underlying structures in users’ time sequences with irregularities, and achieves an accuracy of 77-86% in predicting the obesity status improvement.
Tasks
Published	2018-09-20
URL	http://arxiv.org/abs/1809.07828v1
PDF	http://arxiv.org/pdf/1809.07828v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-networks-based-obesity
Repo
Framework

3D Video Quality Assessment


Title	3D Video Quality Assessment
Authors	Amin Banitalebi Dehkordi
Abstract	A key factor in designing 3D systems is to understand how different visual cues and distortions affect the perceptual quality of 3D video. The ultimate way to assess video quality is through subjective tests. However, subjective evaluation is time-consuming, expensive, and in most cases not even possible. An alternative solution is objective quality metrics, which attempt to model the Human Visual System (HVS) in order to assess the perceptual quality. The potential of 3D technology to significantly improve the immersiveness of video content has been hampered by the difficulty of objectively assessing Quality of Experience (QoE). A no-reference (NR) objective 3D quality metric, which could help determine capturing parameters and improve playback perceptual quality, would be welcomed by camera and display manufacturers. Network providers would embrace a full-reference (FR) 3D quality metric, as they could use it to ensure efficient QoE-based resource management during compression and Quality of Service (QoS) during transmission.
Tasks	Video Quality Assessment
Published	2018-03-13
URL	http://arxiv.org/abs/1803.04836v1
PDF	http://arxiv.org/pdf/1803.04836v1.pdf
PWC	https://paperswithcode.com/paper/3d-video-quality-assessment
Repo
Framework

Refacing: reconstructing anonymized facial features using GANs


Title	Refacing: reconstructing anonymized facial features using GANs
Authors	David Abramian, Anders Eklund
Abstract	Anonymization of medical images is necessary for protecting the identity of the test subjects, and is therefore an essential step in data sharing. However, recent developments in deep learning may raise the bar on the amount of distortion that needs to be applied to guarantee anonymity. To test such possibilities, we have applied the novel CycleGAN unsupervised image-to-image translation framework on sagittal slices of T1 MR images, in order to reconstruct facial features from anonymized data. We applied the CycleGAN framework on both face-blurred and face-removed images. Our results show that face blurring may not provide adequate protection against malicious attempts at identifying the subjects, while face removal provides more robust anonymization, but is still partially reversible.
Tasks	Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published	2018-10-15
URL	http://arxiv.org/abs/1810.06455v2
PDF	http://arxiv.org/pdf/1810.06455v2.pdf
PWC	https://paperswithcode.com/paper/refacing-reconstructing-anonymized-facial
Repo
Framework

Plug-In Stochastic Gradient Method


Title	Plug-In Stochastic Gradient Method
Authors	Yu Sun, Brendt Wohlberg, Ulugbek S. Kamilov
Abstract	Plug-and-play priors (PnP) is a popular framework for regularized signal reconstruction by using advanced denoisers within an iterative algorithm. In this paper, we discuss our recent online variant of PnP that uses only a subset of measurements at every iteration, which makes it scalable to very large datasets. We additionally present novel convergence results for both batch and online PnP algorithms.
Tasks
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03659v1
PDF	http://arxiv.org/pdf/1811.03659v1.pdf
PWC	https://paperswithcode.com/paper/plug-in-stochastic-gradient-method
Repo
Framework
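
The online iteration described in the abstract (a gradient step computed from only a subset of the measurements, followed by a denoiser) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the soft-thresholding function stands in for an advanced denoiser, and the names `online_pnp` and `soft_threshold`, the step size, the batch size, and the toy data are all assumptions.

```python
import random


def soft_threshold(x, tau=1e-3):
    """Placeholder denoiser (assumption): shrink each entry toward zero."""
    return [(abs(v) - tau if abs(v) > tau else 0.0) * (1.0 if v > 0 else -1.0)
            for v in x]


def online_pnp(A, y, denoise, step=0.05, batch=20, iters=300, seed=0):
    """Gradient step on a random subset of measurements, then denoise."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(iters):
        rows = rng.sample(range(n), batch)  # subset of measurements
        grad = [0.0] * d
        for i in rows:
            residual = sum(a * v for a, v in zip(A[i], x)) - y[i]
            for j in range(d):
                grad[j] += A[i][j] * residual / batch
        # The denoiser takes the place of a proximal (regularization) step.
        x = denoise([v - step * g for v, g in zip(x, grad)])
    return x


# Toy noise-free linear measurements y = A x_true (hypothetical data).
rng = random.Random(1)
x_true = [1.0, 0.0, -2.0, 0.0, 0.5]
A = [[rng.gauss(0.0, 1.0) for _ in x_true] for _ in range(200)]
y = [sum(a * v for a, v in zip(row, x_true)) for row in A]
x_hat = online_pnp(A, y, soft_threshold)
```

Each iteration touches only `batch` of the 200 rows, which is where the per-iteration cost saving comes from.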

LinkNBed: Multi-Graph Representation Learning with Entity Linkage


Title	LinkNBed: Multi-Graph Representation Learning with Entity Linkage
Authors	Rakshit Trivedi, Bunyamin Sisman, Jun Ma, Christos Faloutsos, Hongyuan Zha, Xin Luna Dong
Abstract	Knowledge graphs have emerged as an important model for studying complex multi-relational data. This has given rise to the construction of numerous large-scale but incomplete knowledge graphs encoding information extracted from various resources. An effective and scalable approach to jointly learn over multiple graphs and eventually construct a unified graph is a crucial next step for the success of knowledge-based inference for many downstream applications. To this end, we propose LinkNBed, a deep relational learning framework that learns entity and relationship representations across multiple graphs. We identify entity linkage across graphs as a vital component to achieve our goal. We design a novel objective that leverages entity linkage and build an efficient multi-task training procedure. Experiments on link prediction and entity linkage demonstrate substantial improvements over the state-of-the-art relational learning approaches.
Tasks	Graph Representation Learning, Knowledge Graphs, Link Prediction, Relational Reasoning, Representation Learning
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08447v1
PDF	http://arxiv.org/pdf/1807.08447v1.pdf
PWC	https://paperswithcode.com/paper/linknbed-multi-graph-representation-learning
Repo
Framework

Data-efficient Auto-tuning with Bayesian Optimization: An Industrial Control Study


Title	Data-efficient Auto-tuning with Bayesian Optimization: An Industrial Control Study
Authors	Matthias Neumann-Brosig, Alonso Marco, Dieter Schwarzmann, Sebastian Trimpe
Abstract	Bayesian optimization is proposed for automatic learning of optimal controller parameters from experimental data. A probabilistic description (a Gaussian process) is used to model the unknown function from controller parameters to a user-defined cost. The probabilistic model is updated with data, which is obtained by testing a set of parameters on the physical system and evaluating the cost. In order to learn fast, the Bayesian optimization algorithm selects the next parameters to evaluate in a systematic way, for example, by maximizing information gain about the optimum. The algorithm thus iteratively finds the globally optimal parameters with only a few experiments. Taking throttle valve control as a representative industrial control example, the proposed auto-tuning method is shown to outperform manual calibration: it consistently achieves better performance with a low number of experiments. The proposed auto-tuning framework is flexible and can handle different control structures and objectives.
Tasks	Calibration
Published	2018-12-15
URL	http://arxiv.org/abs/1812.06325v2
PDF	http://arxiv.org/pdf/1812.06325v2.pdf
PWC	https://paperswithcode.com/paper/data-efficient-auto-tuning-with-bayesian
Repo
Framework
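
The loop the abstract describes (fit a Gaussian process to the costs seen so far, pick the next parameters with an acquisition rule, run the experiment, repeat) might look roughly like the sketch below. It is not the authors' method: a lower-confidence-bound rule is swapped in for the information-gain criterion mentioned in the abstract, and `toy_cost`, the kernel length scale, `kappa`, and the one-dimensional candidate grid are all hypothetical.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar parameters."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = max(1.0 - sum(k * w for k, w in zip(k_star, solve(K, k_star))), 0.0)
    return mean, math.sqrt(var)


def bayes_opt(cost_fn, candidates, n_iter=8, kappa=2.0):
    """Minimize cost_fn over a candidate grid with a GP surrogate."""
    xs = [candidates[0], candidates[-1]]  # two initial experiments
    ys = [cost_fn(x) for x in xs]
    for _ in range(n_iter):
        untested = [c for c in candidates if c not in xs]

        def acquisition(c):
            mean, std = gp_posterior(xs, ys, c)
            return mean - kappa * std  # lower confidence bound (assumption)

        x_next = min(untested, key=acquisition)
        xs.append(x_next)
        ys.append(cost_fn(x_next))  # one "experiment" per iteration
    best = min(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best]


def toy_cost(gain):
    """Hypothetical cost; a real study would run the controller on the plant."""
    return (gain - 0.63) ** 2


grid = [i / 20 for i in range(21)]
best_gain, best_cost = bayes_opt(toy_cost, grid)
```

The sketch spends ten evaluations in total (two initial, eight chosen by the acquisition rule), which mirrors the data-efficiency argument: each evaluation corresponds to one physical experiment.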

A Generative Adversarial Model for Right Ventricle Segmentation


Title	A Generative Adversarial Model for Right Ventricle Segmentation
Authors	Nicoló Savioli, Miguel Silva Vieira, Pablo Lamata, Giovanni Montana
Abstract	The clinical management of several cardiovascular conditions, such as pulmonary hypertension, requires the assessment of the right ventricular (RV) function. This work addresses the fully automatic and robust access to one of the key RV biomarkers, its ejection fraction, from the gold standard imaging modality, MRI. The problem becomes the accurate segmentation of the RV blood pool from cine MRI sequences. This work proposes a solution based on Fully Convolutional Neural Networks (FCNN), where our first contribution is the optimal combination of three concepts (the convolution Gated Recurrent Units (GRU), the Generative Adversarial Networks (GAN), and the L1 loss function) that achieves an improvement of 0.05 and 3.49 mm in Dice Index and Hausdorff Distance respectively with respect to the baseline FCNN. This improvement is then doubled by our second contribution, the ROI-GAN, that sets two GANs to cooperate working at two fields of view of the image, its full resolution and the region of interest (ROI). Our rationale here is to better guide the FCNN learning by combining global (full resolution) and local Region Of Interest (ROI) features. The study is conducted on a large in-house dataset of $\sim$23,000 segmented MRI slices, and its generality is verified on a publicly available dataset.
Tasks
Published	2018-09-27
URL	http://arxiv.org/abs/1810.03969v1
PDF	http://arxiv.org/pdf/1810.03969v1.pdf
PWC	https://paperswithcode.com/paper/a-generative-adversarial-model-for-right
Repo
Framework
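
The two evaluation metrics named in the abstract, the Dice Index and the Hausdorff Distance, can be computed on small masks as in the sketch below. It assumes each segmentation is given as a set of pixel coordinates and reports distances in pixel units; converting to millimetres, as in the abstract, would require the scan's pixel spacing, which is not given here. The function names are illustrative.

```python
import math


def dice_index(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of pixel coordinates."""
    if not pred and not truth:
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff_distance(pred, truth):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(pred, truth), directed(truth, pred))


# Two 2x2 masks that overlap in one column (hypothetical example).
pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
truth = {(0, 1), (1, 1), (0, 2), (1, 2)}
dice = dice_index(pred, truth)
hausdorff = hausdorff_distance(pred, truth)
```

A higher Dice value and a lower Hausdorff distance both indicate better agreement, which is why the abstract reports an increase in the former and a decrease in the latter.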

Video Summarization by Learning from Unpaired Data


Title	Video Summarization by Learning from Unpaired Data
Authors	Mrigank Rochan, Yang Wang
Abstract	We consider the problem of video summarization. Given an input raw video, the goal is to select a small subset of key frames from the input video to create a shorter summary video that best describes the content of the original video. Most of the current state-of-the-art video summarization approaches use supervised learning and require labeled training data. Each training instance consists of a raw input video and its ground truth summary video curated by human annotators. However, it is very expensive and difficult to create such labeled training examples. To address this limitation, we propose a novel formulation to learn video summarization from unpaired data. We present an approach that learns to generate optimal video summaries using a set of raw videos ($V$) and a set of summary videos ($S$), where there exists no correspondence between $V$ and $S$. We argue that this type of data is much easier to collect. Our model aims to learn a mapping function $F : V \rightarrow S$ such that the distribution of resultant summary videos from $F(V)$ is similar to the distribution of $S$ with the help of an adversarial objective. In addition, we enforce a diversity constraint on $F(V)$ to ensure that the generated video summaries are visually diverse. Experimental results on two benchmark datasets indicate that our proposed approach significantly outperforms other alternative methods.
Tasks	Video Summarization
Published	2018-05-30
URL	http://arxiv.org/abs/1805.12174v2
PDF	http://arxiv.org/pdf/1805.12174v2.pdf
PWC	https://paperswithcode.com/paper/learning-video-summarization-using-unpaired
Repo
Framework
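
The abstract does not say how the diversity constraint on $F(V)$ is formulated. One common way to express such a constraint, shown here purely as an assumption, is to penalize the mean pairwise cosine similarity between the feature vectors of the selected frames, so that a lower value means a more visually varied summary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(frames):
    """Mean pairwise cosine similarity; lower means more diverse."""
    n = len(frames)
    if n < 2:
        return 0.0
    total = sum(cosine(frames[i], frames[j])
                for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)


# Hypothetical 2-D frame features: a redundant summary and a diverse one.
redundant = diversity_penalty([[1.0, 0.0], [1.0, 0.0]])
diverse = diversity_penalty([[1.0, 0.0], [0.0, 1.0]])
```

In a training setup this term would be added to the adversarial objective with a weighting factor; that weighting is likewise not specified in the abstract.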

Omnidirectional DSO: Direct Sparse Odometry with Fisheye Cameras


Title	Omnidirectional DSO: Direct Sparse Odometry with Fisheye Cameras
Authors	Hidenobu Matsuki, Lukas von Stumberg, Vladyslav Usenko, Jörg Stückler, Daniel Cremers
Abstract	We propose a novel real-time direct monocular visual odometry for omnidirectional cameras. Our method extends direct sparse odometry (DSO) by using the unified omnidirectional model as a projection function, which can be applied to fisheye cameras with a field-of-view (FoV) well above 180 degrees. This formulation allows for using the full area of the input image even with strong distortion, while most existing visual odometry methods can only use a rectified and cropped part of it. Model parameters within an active keyframe window are jointly optimized, including the intrinsic/extrinsic camera parameters, 3D position of points, and affine brightness parameters. Thanks to the wide FoV, image overlap between frames becomes bigger and points are more spatially distributed. Our results demonstrate that our method provides increased accuracy and robustness over state-of-the-art visual odometry algorithms.
Tasks	Monocular Visual Odometry, Visual Odometry
Published	2018-08-08
URL	http://arxiv.org/abs/1808.02775v1
PDF	http://arxiv.org/pdf/1808.02775v1.pdf
PWC	https://paperswithcode.com/paper/omnidirectional-dso-direct-sparse-odometry
Repo
Framework
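
The unified omnidirectional model used as the projection function can be written in a few lines. The sketch below follows the standard formulation, in which a 3D point is divided by $z + \xi d$ with $d$ the point's distance from the camera center; the function name, the parameter values, and the simple positivity check on the denominator are illustrative assumptions rather than details taken from the paper.

```python
import math


def project_unified(point, fx, fy, cx, cy, xi):
    """Project a 3D point in the camera frame to pixel coordinates."""
    x, y, z = point
    d = math.sqrt(x * x + y * y + z * z)
    denom = z + xi * d
    if denom <= 0.0:
        raise ValueError("point lies outside the valid projection range")
    return fx * x / denom + cx, fy * y / denom + cy


# With xi = 0 the model reduces to an ordinary pinhole projection.
pinhole_uv = project_unified((1.0, 2.0, 4.0), 100.0, 100.0, 320.0, 240.0, xi=0.0)

# With xi = 1 a point slightly behind the image plane (z < 0) still projects,
# which is what allows a field of view above 180 degrees.
wide_uv = project_unified((1.0, 0.0, -0.5), 100.0, 100.0, 320.0, 240.0, xi=1.0)
```

The same point with `xi=0.0` raises `ValueError`, which illustrates why a pinhole model forces the rectified and cropped images that the abstract mentions.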

A Synchronized Stereo and Plenoptic Visual Odometry Dataset


Title	A Synchronized Stereo and Plenoptic Visual Odometry Dataset
Authors	Niclas Zeller, Franz Quint, Uwe Stilla
Abstract	We present a new dataset to evaluate monocular, stereo, and plenoptic camera based visual odometry algorithms. The dataset comprises a set of synchronized image sequences recorded by a micro lens array (MLA) based plenoptic camera and a stereo camera system. For this, the stereo cameras and the plenoptic camera were assembled on a common hand-held platform. All sequences are recorded in a very large loop, where beginning and end show the same scene. Therefore, the tracking accuracy of a visual odometry algorithm can be measured from the drift between beginning and end of the sequence. For both the plenoptic camera and the stereo system, we supply full intrinsic camera models, as well as vignetting data. The dataset consists of 11 sequences which were recorded in challenging indoor and outdoor scenarios. We present, by way of example, the results achieved by state-of-the-art algorithms.
Tasks	Visual Odometry
Published	2018-07-24
URL	http://arxiv.org/abs/1807.09372v2
PDF	http://arxiv.org/pdf/1807.09372v2.pdf
PWC	https://paperswithcode.com/paper/a-synchronized-stereo-and-plenoptic-visual
Repo
Framework
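
Because every sequence starts and ends at the same scene, an estimated trajectory that does not close the loop reveals accumulated drift. A simplified version of that measurement is sketched below, assuming the trajectory is a list of estimated 3D positions; the dataset's actual evaluation protocol (for example, how start and end segments are aligned or how scale is handled) is not described in the abstract, so `loop_drift` is only an illustration.

```python
import math


def loop_drift(trajectory):
    """Return (absolute drift, drift as a fraction of the path length)."""
    path_length = sum(math.dist(a, b)
                      for a, b in zip(trajectory, trajectory[1:]))
    drift = math.dist(trajectory[0], trajectory[-1])
    relative = drift / path_length if path_length > 0 else 0.0
    return drift, relative


# Hypothetical estimate of a square loop that ends 0.1 units short of closing.
estimate = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0),
            (0.0, 1.0, 0.0), (0.0, 0.1, 0.0)]
absolute, relative = loop_drift(estimate)
```

Reporting the drift relative to the path length makes sequences of different lengths comparable.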

Dilated Temporal Relational Adversarial Network for Generic Video Summarization


Title	Dilated Temporal Relational Adversarial Network for Generic Video Summarization
Authors	Yujia Zhang, Michael Kampffmeyer, Xiaodan Liang, Dingwen Zhang, Min Tan, Eric P. Xing
Abstract	The large number of videos appearing every day makes it increasingly critical that key information within videos can be extracted and understood in a very short time. Video summarization, the task of finding the smallest subset of frames that still conveys the whole story of a given video, is thus of great significance to improve efficiency of video understanding. We propose a novel Dilated Temporal Relational Generative Adversarial Network (DTR-GAN) to achieve frame-level video summarization. Given a video, it selects the set of key frames that contain the most meaningful and compact information. Specifically, DTR-GAN learns a dilated temporal relational generator and a discriminator with three-player loss in an adversarial manner. A new dilated temporal relation (DTR) unit is introduced to enhance temporal representation capturing. The generator uses this unit to effectively exploit global multi-scale temporal context to select key frames and to complement the commonly used Bi-LSTM. To ensure that summaries capture enough key video representation from a global perspective rather than a trivial randomly shortened sequence, we present a discriminator that learns to enforce both the information completeness and compactness of summaries via a three-player loss. The loss includes the generated summary loss, the random summary loss, and the real summary (ground-truth) loss, which play important roles for better regularizing the learned model to obtain useful summaries. Comprehensive experiments on three public datasets show the effectiveness of the proposed approach.
Tasks	Video Summarization, Video Understanding
Published	2018-04-30
URL	https://arxiv.org/abs/1804.11228v2
PDF	https://arxiv.org/pdf/1804.11228v2.pdf
PWC	https://paperswithcode.com/paper/dtr-gan-dilated-temporal-relational
Repo
Framework

Superregular grammars do not provide additional explanatory power but allow for a compact analysis of animal song


Title	Superregular grammars do not provide additional explanatory power but allow for a compact analysis of animal song
Authors	Takashi Morita, Hiroki Koda
Abstract	A pervasive belief with regard to the differences between human language and animal vocal sequences (song) is that they belong to different classes of computational complexity, with animal song belonging to regular languages, whereas human language is superregular. This argument, however, lacks empirical evidence since superregular analyses of animal song are understudied. The goal of this paper is to perform a superregular analysis of animal song, using data from gibbons as a case study, and demonstrate that a superregular analysis can be effectively used with non-human data. A key finding is that a superregular analysis does not increase explanatory power but rather provides for compact analysis: Fewer grammatical rules are necessary once superregularity is allowed. This pattern is analogous to a previous computational analysis of human language, and accordingly, the null hypothesis, that human language and animal song are governed by the same type of grammatical systems, cannot be rejected.
Tasks
Published	2018-11-05
URL	https://arxiv.org/abs/1811.02507v2
PDF	https://arxiv.org/pdf/1811.02507v2.pdf
PWC	https://paperswithcode.com/paper/superregular-grammars-do-not-provide
Repo
Framework

Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains


Title	Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
Authors	Jiahao Pang, Wenxiu Sun, Chengxi Yang, Jimmy Ren, Ruichao Xiao, Jin Zeng, Liang Lin
Abstract	Despite the recent success of stereo matching with convolutional neural networks (CNNs), it remains arduous to generalize a pre-trained deep stereo model to a novel domain. A major difficulty is to collect accurate ground-truth disparities for stereo pairs in the target domain. In this work, we propose a self-adaptation approach for CNN training, utilizing both synthetic training data (with ground-truth disparities) and stereo pairs in the new domain (without ground-truths). Our method is driven by two empirical observations. By feeding real stereo pairs of different domains to stereo models pre-trained with synthetic data, we see that: i) a pre-trained model does not generalize well to the new domain, producing artifacts at boundaries and ill-posed regions; however, ii) feeding an up-sampled stereo pair leads to a disparity map with extra details. To avoid i) while exploiting ii), we formulate an iterative optimization problem with graph Laplacian regularization. At each iteration, the CNN adapts itself better to the new domain: we let the CNN learn its own higher-resolution output; meanwhile, a graph Laplacian regularization is imposed to discriminatively keep the desired edges while smoothing out the artifacts. We demonstrate the effectiveness of our method in two domains: daily scenes collected by smartphone cameras, and street views captured from a moving car.
Tasks	Stereo Matching, Stereo Matching Hand
Published	2018-03-18
URL	http://arxiv.org/abs/1803.06641v1
PDF	http://arxiv.org/pdf/1803.06641v1.pdf
PWC	https://paperswithcode.com/paper/zoom-and-learn-generalizing-deep-stereo
Repo
Framework
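
Graph Laplacian regularization penalizes the quantity $x^\top L x = \sum_{(i,j)} w_{ij} (x_i - x_j)^2$, so differences across strongly connected pixels are smoothed while differences across weakly connected pixels are left alone. The sketch below illustrates this on a 1-D chain of disparities. The Gaussian weighting from a guide signal and the names `chain_edges` and `laplacian_penalty` are assumptions made for illustration; the abstract does not state how the paper builds its graph.

```python
import math


def laplacian_penalty(values, edges):
    """x^T L x written as a sum over weighted edges (i, j, w)."""
    return sum(w * (values[i] - values[j]) ** 2 for i, j, w in edges)


def chain_edges(guide, sigma=0.1):
    """Edge weights on a 1-D chain: near zero across strong guide edges."""
    return [(i, i + 1, math.exp(-((guide[i] - guide[i + 1]) / sigma) ** 2))
            for i in range(len(guide) - 1)]


# Hypothetical disparity row containing one true depth edge.
disparity = [0.0, 0.0, 1.0, 1.0]

# Uniform weights penalize the edge as if it were an artifact.
uniform = laplacian_penalty(disparity, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)])

# Weights derived from a guide with the same edge leave it almost unpenalized.
edge_aware = laplacian_penalty(disparity, chain_edges([0.0, 0.0, 1.0, 1.0]))
```

With edge-aware weights the true discontinuity contributes almost nothing to the penalty, whereas a spurious jump inside a region where the guide is flat would still be charged at full weight.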

ALMN: Deep Embedding Learning with Geometrical Virtual Point Generating


Title	ALMN: Deep Embedding Learning with Geometrical Virtual Point Generating
Authors	Binghui Chen, Weihong Deng
Abstract	Deep embedding learning is becoming more attractive for discriminative feature learning, but many methods still require hard-class mining, which is computationally complex and performance-sensitive. To this end, we propose Adaptive Large Margin N-Pair loss (ALMN) to address the aforementioned issues. Instead of exploring a hard example-mining strategy, we introduce the concept of a large margin constraint. This constraint aims at encouraging a local-adaptive large angular decision margin among dissimilar samples in multimodal feature space so as to significantly encourage intraclass compactness and interclass separability. It is mainly achieved by a simple yet novel geometrical Virtual Point Generating (VPG) method, which converts artificially setting a fixed margin into automatically generating a boundary training sample in feature space and is an open question. We demonstrate the effectiveness of our method on several popular datasets for image retrieval and clustering tasks.
Tasks	Image Retrieval
Published	2018-06-04
URL	http://arxiv.org/abs/1806.00974v2
PDF	http://arxiv.org/pdf/1806.00974v2.pdf
PWC	https://paperswithcode.com/paper/almn-deep-embedding-learning-with-geometrical
Repo
Framework
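
For context, the plain N-pair loss that ALMN builds on compares one anchor against a positive sample and several negatives at once. The sketch below shows only that baseline; it does not include the adaptive large margin or the Virtual Point Generating step, whose details are not given in the abstract, and the function name and toy embeddings are illustrative.

```python
import math


def n_pair_loss(anchor, positive, negatives):
    """log(1 + sum_i exp(a . n_i - a . p)) for a single anchor."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    pos_score = dot(anchor, positive)
    return math.log(1.0 + sum(math.exp(dot(anchor, n) - pos_score)
                              for n in negatives))


# Hypothetical 2-D embeddings: an easy negative and a hard negative.
easy = n_pair_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
hard = n_pair_loss([1.0, 0.0], [1.0, 0.0], [[1.0, 0.0]])
```

The loss grows as a negative moves closer to the anchor than the positive is.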

Machine Learning and Applied Linguistics


Title	Machine Learning and Applied Linguistics
Authors	Sowmya Vajjala
Abstract	This entry introduces the topic of machine learning and provides an overview of its relevance for applied linguistics and language learning. The discussion will focus on giving an introduction to the methods and applications of machine learning in applied linguistics, and will provide references for further study.
Tasks
Published	2018-03-24
URL	http://arxiv.org/abs/1803.09103v1
PDF	http://arxiv.org/pdf/1803.09103v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-and-applied-linguistics
Repo
Framework