October 18, 2019


Paper Group ANR 492

HMLasso: Lasso with High Missing Rate. Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data. A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration. Estimating Metric Poses of Dynamic Objects Using Monocular Visual-Inertial Fusion. Local reservoir model for choice-based learning. End …

HMLasso: Lasso with High Missing Rate


Title	HMLasso: Lasso with High Missing Rate
Authors	Masaaki Takada, Hironori Fujisawa, Takeichiro Nishikawa
Abstract	Sparse regression such as the Lasso has achieved great success in handling high-dimensional data. However, one of the biggest practical problems is that high-dimensional data often contain large amounts of missing values. Convex Conditioned Lasso (CoCoLasso) has been proposed for dealing with high-dimensional data with missing values, but it performs poorly when there are many missing values, so that the high missing rate problem has not been resolved. In this paper, we propose a novel Lasso-type regression method for high-dimensional data with high missing rates. We effectively incorporate mean imputed covariance, overcoming its inherent estimation bias. The result is an optimally weighted modification of CoCoLasso according to missing ratios. We theoretically and experimentally show that our proposed method is highly effective even when there are many missing values.
Tasks
Published	2018-11-01
URL	https://arxiv.org/abs/1811.00255v4
PDF	https://arxiv.org/pdf/1811.00255v4.pdf
PWC	https://paperswithcode.com/paper/hmlasso-lasso-for-high-dimensional-and-highly
Repo
Framework
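
The estimation bias of mean imputed covariance mentioned in the abstract can be illustrated with a small sketch. This is not the HMLasso estimator itself; the function name `covariance_estimates` and the use of NaN to mark missing values are assumptions made for this example. After each column is centered on its observed mean, mean imputation fills missing entries with zero, so every entry of the mean-imputed covariance equals the pairwise-complete estimate scaled by the fraction of rows in which both features are observed.

```python
import numpy as np

def covariance_estimates(X):
    """Compare two covariance estimates for data with missing values.

    X is an (n, p) array in which np.nan marks a missing entry
    (an assumed encoding for this illustration).
    """
    observed = ~np.isnan(X)
    n = X.shape[0]

    # Center each column on the mean of its observed entries.
    col_means = np.nanmean(X, axis=0)
    # After centering, mean imputation is the same as filling with zero.
    Z = np.where(observed, X - col_means, 0.0)

    # pair_counts[j, k] = number of rows where features j and k are both observed.
    obs = observed.astype(float)
    pair_counts = obs.T @ obs

    cross = Z.T @ Z
    s_imputed = cross / n                               # mean-imputed covariance
    s_pairwise = cross / np.maximum(pair_counts, 1.0)   # pairwise-complete covariance
    ratio = pair_counts / n                             # observed-pair ratio
    return s_imputed, s_pairwise, ratio
```

The identity `s_imputed == ratio * s_pairwise` shows that the mean-imputed estimate shrinks toward zero as the missing rate grows, which is the kind of bias a weighting by missing ratios has to account for.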

Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data


Title	Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data
Authors	Xinlei Pan, Sung-Li Chiang, John Canny
Abstract	Self-driving vehicle vision systems must deal with an extremely broad and challenging set of scenes. They can potentially exploit an enormous amount of training data collected from vehicles in the field, but the volumes are too large to train offline naively. Not all training instances are equally valuable though, and importance sampling can be used to prioritize which training images to collect. This approach assumes that objects in images are labeled with high accuracy. To generate accurate labels in the field, we exploit the spatio-temporal coherence of vehicle video. We use a near-to-far labeling strategy by first labeling large, close objects in the video, and tracking them back in time to induce labels on small distant presentations of those objects. In this paper we demonstrate the feasibility of this approach in several steps. First, we note that an optimal subset (relative to all the objects encountered and labeled) of labeled objects in images can be obtained by importance sampling using gradients of the recognition network. Next we show that these gradients can be approximated with very low error using the loss function, which is already available when the CNN is running inference. Then, we generalize these results to objects in a larger scene using an object detection system. Finally, we describe a self-labeling scheme using object tracking. Objects are tracked back in time (near-to-far) and labels of near objects are used to check accuracy of those objects in the far field. We then evaluate the accuracy of models trained on importance sampled data vs models trained on complete data.
Tasks	Object Detection, Object Tracking
Published	2018-08-26
URL	http://arxiv.org/abs/1808.08603v1
PDF	http://arxiv.org/pdf/1808.08603v1.pdf
PWC	https://paperswithcode.com/paper/label-and-sample-efficient-training-of
Repo
Framework
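
A minimal sketch of the importance sampling idea, assuming the per-image loss is used as the proxy for gradient magnitude and that images are drawn with probability proportional to that loss. The function name `sample_by_loss` and the reweighting scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sample_by_loss(losses, k, rng):
    """Draw k example indices with probability proportional to their loss.

    Returns the sampled indices and importance weights 1 / (N * p_i), which
    keep a weighted average over the sample an unbiased estimate of the
    uniform average over all N examples.
    """
    losses = np.asarray(losses, dtype=float)
    total = losses.sum()
    if total <= 0:
        raise ValueError("at least one loss must be positive")
    probs = losses / total
    idx = rng.choice(len(losses), size=k, replace=True, p=probs)
    weights = 1.0 / (len(losses) * probs[idx])
    return idx, weights
```

Because the loss is already computed during inference, a scheme like this needs no extra backward pass to decide which images are worth collecting.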

A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration


Title	A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration
Authors	Marco Cavallo, Çağatay Demiralp
Abstract	Dimensionality reduction is a common method for analyzing and visualizing high-dimensional data. However, reasoning dynamically about the results of a dimensionality reduction is difficult. Dimensionality-reduction algorithms use complex optimizations to reduce the number of dimensions of a dataset, but these new dimensions often lack a clear relation to the initial data dimensions, thus making them difficult to interpret. Here we propose a visual interaction framework to improve dimensionality-reduction based exploratory data analysis. We introduce two interaction techniques, forward projection and backward projection, for dynamically reasoning about dimensionally reduced data. We also contribute two visualization techniques, prolines and feasibility maps, to facilitate the effective use of the proposed interactions. We apply our framework to PCA and autoencoder-based dimensionality reductions. Through data-exploration examples, we demonstrate how our visual interactions can improve the use of dimensionality reduction in exploratory data analysis.
Tasks	Dimensionality Reduction
Published	2018-11-28
URL	http://arxiv.org/abs/1811.12199v1
PDF	http://arxiv.org/pdf/1811.12199v1.pdf
PWC	https://paperswithcode.com/paper/a-visual-interaction-framework-for
Repo
Framework

Estimating Metric Poses of Dynamic Objects Using Monocular Visual-Inertial Fusion


Title	Estimating Metric Poses of Dynamic Objects Using Monocular Visual-Inertial Fusion
Authors	Kejie Qiu, Tong Qin, Hongwen Xie, Shaojie Shen
Abstract	A monocular 3D object tracking system generally has only up-to-scale pose estimation results without any prior knowledge of the tracked object. In this paper, we propose a novel idea to recover the metric scale of an arbitrary dynamic object by optimizing the trajectory of the object in the world frame, without motion assumptions. By introducing an additional constraint in the time domain, our monocular visual-inertial tracking system can obtain continuous six degree of freedom (6-DoF) pose estimation without scale ambiguity. Our method requires neither fixed multi-camera nor depth sensor settings for scale observability; instead, the IMU inside the monocular sensing suite provides scale information for both the camera itself and the tracked object. We build the proposed system on top of our monocular visual-inertial system (VINS) to obtain accurate state estimation of the monocular camera in the world frame. The whole system consists of a 2D object tracker, an object region-based visual bundle adjustment (BA), VINS and a correlation analysis-based metric scale estimator. Experimental comparisons with ground truth demonstrate the accuracy of our 3D tracking, while a mobile augmented reality (AR) demo shows the feasibility of potential applications.
Tasks	Object Tracking, Pose Estimation
Published	2018-08-21
URL	http://arxiv.org/abs/1808.06753v1
PDF	http://arxiv.org/pdf/1808.06753v1.pdf
PWC	https://paperswithcode.com/paper/estimating-metric-poses-of-dynamic-objects
Repo
Framework

Local reservoir model for choice-based learning


Title	Local reservoir model for choice-based learning
Authors	Makoto Naruse, Eiji Yamamoto, Takashi Nakao, Takuma Akimoto, Hayato Saigo, Kazuya Okamura, Izumi Ojima, Georg Northoff, Hirokazu Hori
Abstract	Decision making based on behavioral and neural observations of living systems has been extensively studied in brain science, psychology, and other disciplines. Decision-making mechanisms have also been experimentally implemented in physical processes, such as single photons and chaotic lasers. The findings of these experiments suggest that there is a certain common basis in describing decision making, regardless of its physical realizations. In this study, we propose a local reservoir model to account for choice-based learning (CBL). CBL describes decision consistency as a phenomenon where making a certain decision increases the possibility of making that same decision again later, which has been intensively investigated in neuroscience, psychology, etc. Our proposed model is inspired by the viewpoint that a decision is affected by its local environment, which is referred to as a local reservoir. If the size of the local reservoir is large enough, consecutive decision making will not be affected by previous decisions, thus showing lower degrees of decision consistency in CBL. In contrast, if the size of the local reservoir decreases, a biased distribution occurs within it, which leads to higher degrees of decision consistency in CBL. In this study, an analytical approach to local reservoirs is presented, as well as several numerical demonstrations. Furthermore, a physical architecture for CBL based on single photons is discussed, and the effects of local reservoirs are numerically demonstrated. Decision consistency in human decision-making tasks and in recruiting empirical data is evaluated based on the local reservoir. In summary, the proposed local reservoir model paves a path toward establishing a foundation for computational mechanisms and the systematic analysis of decision making on different levels.
Tasks	Decision Making
Published	2018-04-12
URL	http://arxiv.org/abs/1804.04324v1
PDF	http://arxiv.org/pdf/1804.04324v1.pdf
PWC	https://paperswithcode.com/paper/local-reservoir-model-for-choice-based
Repo
Framework

End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning


Title	End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning
Authors	Wenhan Luo, Peng Sun, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang
Abstract	We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as input and produces the corresponding camera control signals as output (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control tasks separately, and the resulting system is difficult to tune jointly. These methods also require significant human effort for image labeling and expensive trial-and-error system tuning in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning. A ConvNet-LSTM function approximator is adopted for the direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function, which are crucial for successful training. The tracker trained in simulators (ViZDoom and Unreal Engine) demonstrates good generalization behaviors in the case of unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. The system is robust and can restore tracking after occasional loss of the target being tracked. We also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios. We demonstrate successful examples of such transfer, via experiments over the VOT dataset and the deployment of a real-world robot using the proposed active tracker trained in simulation.
Tasks	Object Tracking
Published	2018-08-10
URL	http://arxiv.org/abs/1808.03405v2
PDF	http://arxiv.org/pdf/1808.03405v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-active-object-tracking-and-its
Repo
Framework

An Occam’s Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets


Title	An Occam’s Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets
Authors	Valentin Vielzeuf, Corentin Kervadec, Stéphane Pateux, Alexis Lechervy, Frédéric Jurie
Abstract	This paper presents a light-weight and accurate deep neural model for audiovisual emotion recognition. To design this model, the authors followed a philosophy of simplicity, drastically limiting the number of parameters to learn from the target datasets and always choosing the simplest learning methods: i) Transfer learning and low-dimensional space embedding reduce the dimensionality of the representations. ii) The visual temporal information is handled by a simple score-per-frame selection process, averaged across time. iii) A simple frame selection mechanism is also proposed to weight the images of a sequence. iv) The fusion of the different modalities is performed at prediction level (late fusion). We also highlight the inherent challenges of the AFEW dataset and the difficulty of model selection with as few as 383 validation sequences. The proposed real-time emotion classifier achieved a state-of-the-art accuracy of 60.64% on the test set of AFEW, and ranked 4th at the Emotion in the Wild 2018 challenge.
Tasks	Emotion Recognition, Model Selection, Transfer Learning
Published	2018-08-08
URL	http://arxiv.org/abs/1808.02668v1
PDF	http://arxiv.org/pdf/1808.02668v1.pdf
PWC	https://paperswithcode.com/paper/an-occams-razor-view-on-learning-audiovisual
Repo
Framework

Unnamed Entity Recognition of Sense Mentions


Title	Unnamed Entity Recognition of Sense Mentions
Authors	Ndapa Nakashole
Abstract	We consider the problem of recognizing mentions of human senses in text. Our contribution is a method for acquiring labeled data, and a learning method that is trained on this data. Experiments show the effectiveness of our proposed data labeling approach and our learning model on the task of sense recognition in text.
Tasks
Published	2018-11-17
URL	https://arxiv.org/abs/1811.07092v2
PDF	https://arxiv.org/pdf/1811.07092v2.pdf
PWC	https://paperswithcode.com/paper/unnamed-entity-recognition-of-sense-mentions
Repo
Framework

Sample Complexity of Sparse System Identification Problem


Title	Sample Complexity of Sparse System Identification Problem
Authors	Salar Fattahi, Somayeh Sojoudi
Abstract	In this paper, we study the system identification problem for sparse linear time-invariant systems. We propose a sparsity promoting block-regularized estimator to identify the dynamics of the system with only a limited number of input-state data samples. We characterize the properties of this estimator under high-dimensional scaling, where the growth rate of the system dimension is comparable to or even faster than that of the number of available sample trajectories. In particular, using contemporary results on high-dimensional statistics, we show that the proposed estimator results in a small element-wise error, provided that the number of sample trajectories is above a threshold. This threshold depends polynomially on the size of each block and the number of nonzero elements at different rows of input and state matrices, but only logarithmically on the system dimension. A by-product of this result is that the number of sample trajectories required for sparse system identification is significantly smaller than the dimension of the system. Furthermore, we show that, unlike the recently celebrated least-squares estimators for system identification problems, the method developed in this work is capable of exact recovery of the underlying sparsity structure of the system with the aforementioned number of data samples. Extensive case studies on synthetically generated systems, physical mass-spring networks, and multi-agent systems are offered to demonstrate the effectiveness of the proposed method.
Tasks
Published	2018-03-21
URL	http://arxiv.org/abs/1803.07753v2
PDF	http://arxiv.org/pdf/1803.07753v2.pdf
PWC	https://paperswithcode.com/paper/sample-complexity-of-sparse-system
Repo
Framework

Sobolev Descent


Title	Sobolev Descent
Authors	Youssef Mroueh, Tom Sercu, Anant Raj
Abstract	We study a simplification of GAN training: the problem of transporting particles from a source to a target distribution. Starting from the Sobolev GAN critic, part of the gradient regularized GAN family, we show a strong relation with Optimal Transport (OT), specifically with the less popular dynamic formulation of OT, which finds a path of distributions from source to target minimizing a “kinetic energy”. We introduce Sobolev descent, which constructs similar paths by following gradient flows of a critic function in a kernel space or parametrized by a neural network. In the kernel version, we show convergence to the target distribution in the MMD sense. We show in theory and experiments that regularization has an important role in favoring smooth transitions between distributions, avoiding large gradients from the critic. This analysis in a simplified particle setting provides insight into paths to equilibrium in GANs.
Tasks
Published	2018-05-30
URL	https://arxiv.org/abs/1805.12062v2
PDF	https://arxiv.org/pdf/1805.12062v2.pdf
PWC	https://paperswithcode.com/paper/regularized-kernel-and-neural-sobolev-descent
Repo
Framework
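
The particle-transport setting can be sketched in one dimension. The example below is an assumption-laden simplification: it uses the plain, unregularized MMD witness function with a Gaussian kernel as the critic, whereas the paper's Sobolev critic is gradient regularized. The function names, bandwidth, step size, and step count are all chosen for illustration only.

```python
import numpy as np

def mean_kernel_grad(x, centers, bandwidth):
    """For each x_i, the mean over centers c of d/dx k(x_i, c), Gaussian kernel."""
    diff = x[:, None] - centers[None, :]
    k = np.exp(-diff**2 / (2.0 * bandwidth**2))
    return np.mean(-diff / bandwidth**2 * k, axis=1)

def witness_flow(source, target, bandwidth=2.0, step=1.0, n_steps=200):
    """Move source particles along the gradient of the MMD witness function.

    The witness is f(x) = mean_j k(x, y_j) - mean_i k(x, x_i), where y_j are
    target samples and x_i are the current particles. Particles ascend f,
    moving toward regions where the target has more mass than the particles.
    """
    particles = np.array(source, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(n_steps):
        grad = (mean_kernel_grad(particles, target, bandwidth)
                - mean_kernel_grad(particles, particles, bandwidth))
        particles = particles + step * grad
    return particles
```

Without regularization, isolated particles far from the target receive only a weak pull, which gives some intuition for why the abstract stresses the role of regularization in producing smooth transitions.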

LSTMs Exploit Linguistic Attributes of Data


Title	LSTMs Exploit Linguistic Attributes of Data
Authors	Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith
Abstract	While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM’s ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.
Tasks
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11653v2
PDF	http://arxiv.org/pdf/1805.11653v2.pdf
PWC	https://paperswithcode.com/paper/lstms-exploit-linguistic-attributes-of-data
Repo
Framework

Deep Learned Frame Prediction for Video Compression


Title	Deep Learned Frame Prediction for Video Compression
Authors	Serkan Sulun
Abstract	Motion compensation is one of the most essential methods for any video compression algorithm. Video frame prediction is a task analogous to motion compensation. In recent years, the task of frame prediction has been undertaken by deep neural networks (DNNs). In this thesis we create a DNN to perform learned frame prediction and additionally implement a codec that contains our DNN. We train our network using two methods for two different goals. Firstly, we train our network based on mean square error (MSE) only, aiming to obtain the highest PSNR values in frame prediction and video compression. Secondly, we use adversarial training to produce visually more realistic frame predictions. For frame prediction, we compare our method with the baseline methods of frame difference and 16x16 block motion compensation. For video compression we further include the x264 video codec in the comparison. We show that in frame prediction, adversarial training produces frames that look sharper and more realistic compared to MSE-based training, but in video compression it consistently performs worse. This proves that even though adversarial training is useful for generating video frames that are more pleasing to the human eye, it should not be employed for video compression. Moreover, our network trained with MSE produces accurate frame predictions, and in quantitative results, for both tasks, it produces comparable results in all videos and outperforms other methods on average. More specifically, learned frame prediction outperforms other methods in terms of rate-distortion performance in the case of high motion video, while the rate-distortion performance of our method is competitive with that of x264 in low motion video.
Tasks	Motion Compensation, Video Compression
Published	2018-11-27
URL	http://arxiv.org/abs/1811.10946v1
PDF	http://arxiv.org/pdf/1811.10946v1.pdf
PWC	https://paperswithcode.com/paper/deep-learned-frame-prediction-for-video
Repo
Framework
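
For reference, PSNR is a monotone function of MSE, which is why training on MSE alone directly targets high PSNR. A minimal sketch, assuming 8-bit frames (peak value 255) stored as NumPy arrays; the function name `psnr` is chosen for this example.

```python
import numpy as np

def psnr(reference, prediction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    pred = np.asarray(prediction, dtype=np.float64)
    mse = np.mean((ref - pred) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_value**2 / mse)
```

A lower MSE always yields a higher PSNR, while perceptual sharpness, which adversarial training favors, is not captured by this measure.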

Fuzzy expert system for prediction of prostate cancer


Title	Fuzzy expert system for prediction of prostate cancer
Authors	Juthika Mahanta, Subhasis Panda
Abstract	A fuzzy expert system (FES) for the prediction of prostate cancer (PC) is prescribed in this article. Age, prostate-specific antigen (PSA), prostate volume (PV) and % Free PSA (%FPSA) are fed as inputs into the FES and prostate cancer risk (PCR) is obtained as the output. The output is calculated using knowledge-based rules in a Mamdani-type inference method. If PCR ≥ 50%, then the patient shall be advised to go for a biopsy test for confirmation. The efficacy of the designed FES is tested against a clinical data set. The true prediction for all the patients turns out to be 68.91%, whereas for positive biopsy cases only it rises to 73.77%. This simple yet effective FES can be used as a supportive tool for decision making in medical diagnosis.
Tasks	Decision Making, Medical Diagnosis
Published	2018-12-01
URL	http://arxiv.org/abs/1812.00236v1
PDF	http://arxiv.org/pdf/1812.00236v1.pdf
PWC	https://paperswithcode.com/paper/fuzzy-expert-system-for-prediction-of
Repo
Framework

Estimation of Personalized Effects Associated With Causal Pathways


Title	Estimation of Personalized Effects Associated With Causal Pathways
Authors	Razieh Nabi, Phyllis Kanki, Ilya Shpitser
Abstract	The goal of personalized decision making is to map a unit’s characteristics to an action tailored to maximize the expected outcome for that unit. Obtaining high-quality mappings of this type is the goal of the dynamic regime literature. In healthcare settings, optimizing policies with respect to a particular causal pathway may be of interest as well. For example, we may wish to maximize the chemical effect of a drug given data from an observational study where the chemical effect of the drug on the outcome is entangled with the indirect effect mediated by differential adherence. In such cases, we may wish to optimize the direct effect of a drug, while keeping the indirect effect to that of some reference treatment. [16] shows how to combine mediation analysis and dynamic treatment regime ideas to define policies associated with causal pathways and counterfactual responses to these policies. In this paper, we derive a variety of methods for learning high-quality policies of this type from data, in a causal model corresponding to a longitudinal setting of practical importance. We illustrate our methods via a dataset of HIV patients undergoing therapy, gathered in the Nigerian PEPFAR program.
Tasks	Decision Making
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10791v1
PDF	http://arxiv.org/pdf/1809.10791v1.pdf
PWC	https://paperswithcode.com/paper/estimation-of-personalized-effects-associated
Repo
Framework

Two-path 3D CNNs for calibration of system parameters for OCT-based motion compensation


Title	Two-path 3D CNNs for calibration of system parameters for OCT-based motion compensation
Authors	Nils Gessert, Martin Gromniak, Matthias Schlüter, Alexander Schlaefer
Abstract	Automatic motion compensation and adjustment of an intraoperative imaging modality’s field of view is a common problem during interventions. Optical coherence tomography (OCT) is an imaging modality which is used in interventions due to its high spatial resolution of a few micrometers and its temporal resolution of potentially several hundred volumes per second. However, performing motion compensation with OCT is problematic due to its small field of view, which might lead to tracked objects being lost quickly. We propose a novel deep learning-based approach that directly learns input parameters of motors that move the scan area for motion compensation from optical coherence tomography volumes. We design a two-path 3D convolutional neural network (CNN) architecture that takes two volumes with an object to be tracked as its input and predicts the necessary motor input parameters to compensate for the object’s movement. In this way, we learn the calibration between object movement and system parameters for motion compensation with arbitrary objects. Thus, we avoid error-prone hand-eye calibration and handcrafted feature tracking from classical approaches. We achieve an average correlation coefficient of 0.998 between predicted and ground-truth motor parameters, which leads to sub-voxel accuracy. Furthermore, we show that our deep learning model is real-time capable for use with the system’s high volume acquisition frequency.
Tasks	Calibration, Motion Compensation
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09582v1
PDF	http://arxiv.org/pdf/1810.09582v1.pdf
PWC	https://paperswithcode.com/paper/two-path-3d-cnns-for-calibration-of-system
Repo
Framework