January 26, 2020


Paper Group ANR 1595

Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks. Benchmarking machine learning models on eICU critical care dataset. Learning Super-resolution 3D Segmentation of Plant Root MRI Images from Few Examples. Revisiting Self-Training for Neural Sequence Generation. Controversial stimuli: pitting neural ne …

Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks


Title	Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks
Authors	Stefan Milz, Martin Simon, Kai Fischer, Maximillian Pöpperl
Abstract	We present the first approach for 3D point-cloud to image translation based on conditional Generative Adversarial Networks (cGAN). The model handles multi-modal information sources from different domains, i.e. raw point-sets and images. The generator is capable of processing three conditions, with the point-cloud encoded both as a raw point-set and as a camera projection. An image background patch is used as a constraint to bias environmental texturing. A global approximation function within the generator is directly applied on the point-cloud (Point-Net). Hence, the representative learning model incorporates global 3D characteristics directly at the latent feature space. Conditions are used to bias the background and the viewpoint of the generated image. This opens up new ways of augmenting or texturing 3D data, with the aim of generating fully individual images. We successfully evaluated our method on the KITTI and SunRGBD datasets with an outstanding object detection inception score.
Tasks	Object Detection
Published	2019-01-26
URL	https://arxiv.org/abs/1901.09280v2
PDF	https://arxiv.org/pdf/1901.09280v2.pdf
PWC	https://paperswithcode.com/paper/points2pix-3d-point-cloud-to-image
Repo
Framework
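
The abstract above mentions that the point cloud is also supplied to the generator as a camera projection. As a rough illustration of what such a projection involves, here is a minimal pinhole-camera sketch. It is not the authors' code: the function name, the intrinsics `fx`, `fy`, `cx`, `cy`, and the assumption that points are already expressed in the camera frame are all hypothetical choices made for this example.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[int, int]


def project_points(points: List[Point3D], fx: float, fy: float,
                   cx: float, cy: float, width: int, height: int) -> List[Pixel]:
    """Project camera-frame 3D points to pixel coordinates (pinhole model).

    Assumed convention: z points forward, x right, y down.
    Points behind the camera (z <= 0) or outside the image are dropped.
    """
    pixels: List[Pixel] = []
    for x, y, z in points:
        if z <= 0:
            continue  # not visible: behind or on the camera plane
        u = int(round(fx * x / z + cx))  # horizontal pixel coordinate
        v = int(round(fy * y / z + cy))  # vertical pixel coordinate
        if 0 <= u < width and 0 <= v < height:
            pixels.append((u, v))
    return pixels
```

A point on the optical axis lands at the principal point `(cx, cy)`, and points that fall outside the image bounds are discarded rather than clamped.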

Benchmarking machine learning models on eICU critical care dataset


Title	Benchmarking machine learning models on eICU critical care dataset
Authors	Seyedmostafa Sheikhalishahi, Vevake Balaraman, Venet Osmani
Abstract	Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research (such as vision and NLP) have already established various competitions and benchmarks, whereas only the recent availability of large clinical datasets has enabled the possibility of public benchmarks. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of clinical models as well as baseline and deep models using the eICU critical care dataset of around 73,000 patients. Furthermore, we investigate the impact of numerical variables as well as handling of categorical variables for each of the defined tasks.
Tasks	Mortality Prediction
Published	2019-10-02
URL	https://arxiv.org/abs/1910.00964v1
PDF	https://arxiv.org/pdf/1910.00964v1.pdf
PWC	https://paperswithcode.com/paper/benchmarking-machine-learning-models-on-eicu
Repo
Framework

Learning Super-resolution 3D Segmentation of Plant Root MRI Images from Few Examples


Title	Learning Super-resolution 3D Segmentation of Plant Root MRI Images from Few Examples
Authors	Ali Oguz Uzman, Jannis Horn, Sven Behnke
Abstract	Analyzing plant roots is crucial to understand plant performance in different soil environments. While magnetic resonance imaging (MRI) can be used to obtain 3D images of plant roots, extracting the root structural model is challenging due to highly noisy soil environments and the low resolution of MRI images. To improve both contrast and resolution, we adapt the state-of-the-art method RefineNet for 3D segmentation of the plant root MRI images in super-resolution. The networks are trained from few manual segmentations that are augmented by geometric transformations, realistic noise, and other variabilities. The resulting segmentations contain most root structures, including branches not extracted by the human annotator.
Tasks	Super-Resolution
Published	2019-03-16
URL	http://arxiv.org/abs/1903.06855v1
PDF	http://arxiv.org/pdf/1903.06855v1.pdf
PWC	https://paperswithcode.com/paper/learning-super-resolution-3d-segmentation-of
Repo
Framework

Revisiting Self-Training for Neural Sequence Generation


Title	Revisiting Self-Training for Neural Sequence Generation
Authors	Junxian He, Jiatao Gu, Jiajun Shen, Marc’Aurelio Ranzato
Abstract	Self-training is one of the earliest and simplest semi-supervised methods. The key idea is to augment the original labeled dataset with unlabeled data paired with the model’s prediction (i.e. the pseudo-parallel data). While self-training has been extensively studied on classification problems, in complex sequence generation tasks (e.g. machine translation) it is still unclear how self-training works due to the compositionality of the target space. In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks. Through careful examination of the performance gains, we find that the perturbation on the hidden states (i.e. dropout) is critical for self-training to benefit from the pseudo-parallel data, which acts as a regularizer and forces the model to yield close predictions for similar unlabeled inputs. This effect helps the model correct some incorrect predictions on unlabeled data. To further encourage this mechanism, we propose to inject noise into the input space, resulting in a “noisy” version of self-training. Empirical study on standard machine translation and text summarization benchmarks shows that noisy self-training is able to effectively utilize unlabeled data and improve the performance of the supervised baseline by a large margin.
Tasks	Machine Translation, Text Summarization
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13788v2
PDF	https://arxiv.org/pdf/1909.13788v2.pdf
PWC	https://paperswithcode.com/paper/revisiting-self-training-for-neural-sequence
Repo
Framework
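
The abstract above describes pairing unlabeled inputs with model predictions and injecting noise into the input space. The sketch below shows one way that data-construction step could look. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the noise function, so random token dropping is a stand-in, and `predict` is a hypothetical placeholder for a trained sequence model.

```python
import random
from typing import Callable, List, Tuple

Tokens = List[str]


def perturb(tokens: Tokens, drop_prob: float, rng: random.Random) -> Tokens:
    """Randomly drop tokens (assumed noise type); never return an empty input
    for a non-empty sequence."""
    kept = [t for t in tokens if rng.random() >= drop_prob]
    return kept if kept else tokens[:1]


def build_pseudo_parallel(unlabeled: List[Tokens],
                          predict: Callable[[Tokens], Tokens],
                          drop_prob: float,
                          seed: int = 0) -> List[Tuple[Tokens, Tokens]]:
    """Pair a noisy copy of each unlabeled input with the model's prediction.

    The prediction is made on the clean input; only the training-side
    source is perturbed (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pairs = []
    for source in unlabeled:
        pseudo_target = predict(source)
        pairs.append((perturb(source, drop_prob, rng), pseudo_target))
    return pairs
```

The resulting pairs would then be mixed with the labeled data for further training. Setting `drop_prob` to 0 recovers plain self-training without input noise.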

Controversial stimuli: pitting neural networks against each other as models of human recognition


Title	Controversial stimuli: pitting neural networks against each other as models of human recognition
Authors	Tal Golan, Prashant C. Raju, Nikolaus Kriegeskorte
Abstract	Distinct scientific theories can make similar predictions. To adjudicate between theories, we must design experiments for which the theories make distinct predictions. Here we consider the problem of comparing deep neural networks as models of human visual recognition. To efficiently determine which models better explain human responses, we synthesize controversial stimuli: images for which different models produce distinct responses. We tested nine different models, which employed different architectures and recognition algorithms, including discriminative and generative models, all trained to recognize handwritten digits (from the MNIST set of digit images). We synthesized controversial stimuli to maximize the disagreement among the models. Human subjects viewed hundreds of these stimuli and judged the probability of presence of each digit in each image. We quantified how accurately each model predicted the human judgements. We found that the generative models (which learn the distribution of images for each class) better predicted the human judgments than the discriminative models (which learn to directly map from images to labels). The best performing model was the generative Analysis-by-Synthesis model (based on variational autoencoders). However, a simpler generative model (based on Gaussian-kernel-density estimation) also performed better than each of the discriminative models. None of the candidate models fully explained the human responses. We discuss the advantages and limitations of controversial stimuli as an experimental paradigm and how they generalize and improve on adversarial examples as probes of discrepancies between models and human perception.
Tasks	Density Estimation
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09288v1
PDF	https://arxiv.org/pdf/1911.09288v1.pdf
PWC	https://paperswithcode.com/paper/controversial-stimuli-pitting-neural-networks
Repo
Framework
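
The abstract above describes synthesizing images on which models disagree. One simple way to score disagreement between two classifiers on a single image is sketched below: the score is high only when model A is confident in one class and model B is confident in a different one. The min-based score and both function names are assumptions made for illustration; the abstract does not state the objective the authors optimized.

```python
from typing import Sequence, Tuple


def controversiality(probs_a: Sequence[float], probs_b: Sequence[float],
                     class_a: int, class_b: int) -> float:
    """Disagreement score for one image and one pair of distinct classes.

    Large only if model A assigns high probability to class_a AND
    model B assigns high probability to class_b.
    """
    if class_a == class_b:
        raise ValueError("class_a and class_b must differ")
    return min(probs_a[class_a], probs_b[class_b])


def most_controversial_pair(probs_a: Sequence[float],
                            probs_b: Sequence[float]) -> Tuple[float, int, int]:
    """Return (score, class_a, class_b) for the best-scoring class pair."""
    best = (-1.0, -1, -1)
    for a in range(len(probs_a)):
        for b in range(len(probs_b)):
            if a != b:
                score = controversiality(probs_a, probs_b, a, b)
                if score > best[0]:
                    best = (score, a, b)
    return best
```

When the two models agree on an image, every class pair scores low, so such an image would not be selected as a stimulus under this score.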

Space lower bounds for linear prediction in the streaming model


Title	Space lower bounds for linear prediction in the streaming model
Authors	Yuval Dagan, Gil Kur, Ohad Shamir
Abstract	We show that fundamental learning tasks, such as finding an approximate linear separator or linear regression, require memory at least quadratic in the dimension, in a natural streaming setting. This implies that such problems cannot be solved (at least in this setting) by scalable memory-efficient streaming algorithms. Our results build on a memory lower bound for a simple linear-algebraic problem – finding orthogonal vectors – and utilize the estimates on the packing of the Grassmannian, the manifold of all linear subspaces of fixed dimension.
Tasks
Published	2019-02-09
URL	https://arxiv.org/abs/1902.03498v3
PDF	https://arxiv.org/pdf/1902.03498v3.pdf
PWC	https://paperswithcode.com/paper/space-lower-bounds-for-linear-prediction
Repo
Framework

Learning to Solve Large-Scale Security-Constrained Unit Commitment Problems


Title	Learning to Solve Large-Scale Security-Constrained Unit Commitment Problems
Authors	Alinson S. Xavier, Feng Qiu, Shabbir Ahmed
Abstract	Security-Constrained Unit Commitment (SCUC) is a fundamental problem in power systems and electricity markets. In practical settings, SCUC is repeatedly solved via Mixed-Integer Linear Programming, sometimes multiple times per day, with only minor changes in input data. In this work, we propose a number of machine learning (ML) techniques to effectively extract information from previously solved instances in order to significantly improve the computational performance of MIP solvers when solving similar instances in the future. Based on statistical data, we predict redundant constraints in the formulation, good initial feasible solutions and affine subspaces where the optimal solution is likely to lie, leading to significant reduction in problem size. Computational results on a diverse set of realistic and large-scale instances show that, using the proposed techniques, SCUC can be solved on average 4.3x faster with optimality guarantees, and 10.2x faster without optimality guarantees, but with no observed reduction in solution quality. Out-of-distribution experiments provide evidence that the method is somewhat robust against dataset shift.
Tasks
Published	2019-02-04
URL	https://arxiv.org/abs/1902.01697v2
PDF	https://arxiv.org/pdf/1902.01697v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-solve-large-scale-security
Repo
Framework

Generating Logical Forms from Graph Representations of Text and Entities


Title	Generating Logical Forms from Graph Representations of Text and Entities
Authors	Peter Shaw, Philip Massey, Angelica Chen, Francesco Piccinno, Yasemin Altun
Abstract	Structured information about entities is critical for many semantic parsing tasks. We present an approach that uses a Graph Neural Network (GNN) architecture to incorporate information about relevant entities and their relations during parsing. Combined with a decoder copy mechanism, this approach provides a conceptually simple mechanism to generate logical forms with entities. We demonstrate that this approach is competitive with the state-of-the-art across several tasks without pre-training, and outperforms existing approaches when combined with BERT pre-training.
Tasks	Semantic Parsing
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08407v3
PDF	https://arxiv.org/pdf/1905.08407v3.pdf
PWC	https://paperswithcode.com/paper/generating-logical-forms-from-graph
Repo
Framework

Multiple Face Analyses through Adversarial Learning


Title	Multiple Face Analyses through Adversarial Learning
Authors	Shangfei Wang, Shi Yin, Longfei Hao, Guang Liang
Abstract	The inherent relations among multiple face analysis tasks, such as landmark detection, head pose estimation, gender recognition and face attribute estimation, are crucial to boost the performance of each task, but have not been thoroughly explored since these tasks are typically handled separately. In this paper, we propose a novel deep multi-task adversarial learning method to localize facial landmarks, estimate head pose and recognize gender jointly, or estimate multiple face attributes simultaneously, by exploring their dependencies at both the image representation level and the label level. Specifically, the proposed method consists of a deep recognition network R and a discriminator D. The deep recognition network is used to learn the shared middle-level image representation and conducts multiple face analysis tasks simultaneously. Through a multi-task learning mechanism, the recognition network explores the dependencies among multiple face analysis tasks, such as facial landmark localization, head pose estimation, gender recognition and face attribute estimation, at the image representation level. The discriminator is introduced to enforce the distribution of the multiple face analysis tasks to converge to that inherent in the ground-truth labels. During training, the recognizer tries to confuse the discriminator, while the discriminator competes with the recognizer by distinguishing the predicted label combination from the ground-truth one. Through adversarial learning, we explore the dependencies among multiple face analysis tasks at the label level. Experimental results on four benchmark databases, i.e., the AFLW database, the Multi-PIE database, the CelebA database and the LFWA database, demonstrate the effectiveness of the proposed method for multiple face analyses.
Tasks	Face Alignment, Head Pose Estimation, Multi-Task Learning, Pose Estimation
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07846v1
PDF	https://arxiv.org/pdf/1911.07846v1.pdf
PWC	https://paperswithcode.com/paper/multiple-face-analyses-through-adversarial
Repo
Framework

Contrast Enhancement of Medical X-Ray Image Using Morphological Operators with Optimal Structuring Element


Title	Contrast Enhancement of Medical X-Ray Image Using Morphological Operators with Optimal Structuring Element
Authors	Rafsanjany Kushol, Md. Nishat Raihan, Md Sirajus Salekin, A. B. M. Ashikur Rahman
Abstract	To guide surgical and medical treatment, X-ray images are used by physicians in every modern healthcare organization and hospital. Doctors’ evaluation and disease identification for the skeletal system can be performed faster and more efficiently with the help of X-ray imaging, as it depicts bone structure painlessly. This paper presents an efficient contrast enhancement technique using morphological operators which will help to visualize important bone segments and soft tissues more clearly. Top-hat and bottom-hat transforms are utilized to enhance the image, where the gradient magnitude is calculated to automatically select the structuring element (SE) size. Experimental evaluation on different X-ray imaging databases shows the effectiveness of our method, which also produces comparatively better output than some existing image enhancement techniques.
Tasks	Image Enhancement
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08545v1
PDF	https://arxiv.org/pdf/1905.08545v1.pdf
PWC	https://paperswithcode.com/paper/contrast-enhancement-of-medical-x-ray-image
Repo
Framework
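
The abstract above relies on top-hat and bottom-hat transforms. A common way to combine them for contrast enhancement is to add the top-hat (bright detail) and subtract the bottom-hat (dark detail) from the image; the sketch below implements that textbook combination in plain Python on a small 8-bit grayscale image. It is not the authors' pipeline: the square structuring element and the fixed `radius` parameter are assumptions, whereas the paper selects the SE size automatically from the gradient magnitude.

```python
from typing import Callable, List

Image = List[List[int]]


def _morph(img: Image, radius: int, pick: Callable[[List[int]], int]) -> Image:
    """Apply min (erosion) or max (dilation) over a square window.
    Pixels outside the image are ignored at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - radius), min(h, r + radius + 1))
                      for cc in range(max(0, c - radius), min(w, c + radius + 1))]
            row.append(pick(window))
        out.append(row)
    return out


def erode(img: Image, radius: int) -> Image:
    return _morph(img, radius, min)


def dilate(img: Image, radius: int) -> Image:
    return _morph(img, radius, max)


def enhance(img: Image, radius: int) -> Image:
    """Return img + top_hat - bottom_hat, clipped to the 8-bit range."""
    opened = dilate(erode(img, radius), radius)   # removes small bright details
    closed = erode(dilate(img, radius), radius)   # removes small dark details
    out = []
    for r, row in enumerate(img):
        out_row = []
        for c, v in enumerate(row):
            top_hat = v - opened[r][c]      # bright detail smaller than the SE
            bottom_hat = closed[r][c] - v   # dark detail smaller than the SE
            out_row.append(max(0, min(255, v + top_hat - bottom_hat)))
        out.append(out_row)
    return out
```

Bright features narrower than the structuring element get brighter and dark ones get darker, while flat regions are left unchanged. For real images a library routine would be far faster than these nested loops.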

Physics-guided Design and Learning of Neural Networks for Predicting Drag Force on Particle Suspensions in Moving Fluids


Title	Physics-guided Design and Learning of Neural Networks for Predicting Drag Force on Particle Suspensions in Moving Fluids
Authors	Nikhil Muralidhar, Jie Bu, Ze Cao, Long He, Naren Ramakrishnan, Danesh Tafti, Anuj Karpatne
Abstract	Physics-based simulations are often used to model and understand complex physical systems and processes in domains like fluid dynamics. Such simulations, although used frequently, have many limitations which could arise either due to the inability to accurately model a physical process owing to incomplete knowledge about certain facets of the process or due to the underlying process being too complex to accurately encode into a simulation model. In such situations, it is often useful to rely on machine learning methods to fill in the gap by learning a model of the complex physical process directly from simulation data. However, as data generation through simulations is costly, we need to develop models that are cognizant of data paucity issues. In such scenarios it is often helpful if the rich physical knowledge of the application domain is incorporated in the architectural design of machine learning models. Further, we can also use information from physics-based simulations to guide the learning process using aggregate supervision to favorably constrain the learning process. In this paper, we propose PhyDNN, a deep learning model using physics-guided structural priors and physics-guided aggregate supervision for modeling the drag forces acting on each particle in a Computational Fluid Dynamics-Discrete Element Method (CFD-DEM). We conduct extensive experiments in the context of drag force prediction and showcase the usefulness of including physics knowledge in our deep learning formulation both in the design and through the learning process. Our proposed PhyDNN model has been compared to several state-of-the-art models and achieves a significant performance improvement of 8.46% on average across all baseline models. The source code has been made available and the dataset used is detailed in [1, 2].
Tasks
Published	2019-11-06
URL	https://arxiv.org/abs/1911.04240v1
PDF	https://arxiv.org/pdf/1911.04240v1.pdf
PWC	https://paperswithcode.com/paper/physics-guided-design-and-learning-of-neural
Repo
Framework

Automatic Calcium Scoring in Cardiac and Chest CT Using DenseRAUnet


Title	Automatic Calcium Scoring in Cardiac and Chest CT Using DenseRAUnet
Authors	Jiechao Ma, Rongguo Zhang
Abstract	Cardiovascular disease (CVD) is a common and strong threat to human beings, featuring high prevalence, disability and mortality. The amount of coronary artery calcification (CAC) is an effective factor for CVD risk evaluation. Conventionally, CAC is quantified using ECG-synchronized cardiac CT but rarely from general chest CT scans. However, compared with ECG-synchronized cardiac CT, chest CT is more prevalent and economical in clinical practice. To address this, we propose an automatic method based on Dense U-Net to segment coronary calcium pixels on both types of CT scans. Our contribution is two-fold. First, we propose a novel network called DenseRAUnet, which takes advantage of Dense U-net, ResNet and atrous convolutions. We prove the robustness and generalizability of our model by training it exclusively on chest CT while testing on both types of CT scans. Second, we design a loss function combining bootstrap with an IoU function to balance foreground and background classes. DenseRAUnet is trained in a 2.5D fashion and tested on a private dataset consisting of 144 scans. Results show an F1-score of 0.75, with 0.83 accuracy of predicting cardiovascular disease risk.
Tasks
Published	2019-07-26
URL	https://arxiv.org/abs/1907.11392v1
PDF	https://arxiv.org/pdf/1907.11392v1.pdf
PWC	https://paperswithcode.com/paper/automatic-calcium-scoring-in-cardiac-and
Repo
Framework
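
The abstract above refers to an IoU term in the loss and reports an F1-score. For readers unfamiliar with the two overlap measures, the sketch below computes both from flat binary masks. It only illustrates the standard definitions: the bootstrap component of the loss is not reproduced because the abstract does not specify it, and returning 1.0 when both masks are empty is a convention assumed here.

```python
from typing import Sequence, Tuple


def overlap_metrics(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Return (IoU, F1) for two equally sized binary masks (1 = foreground)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # false negatives
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0  # assumed convention: two empty masks match perfectly
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1
```

Both measures ignore true negatives, which is why they are popular when foreground pixels such as calcium deposits are rare compared with background.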

Composition operators on reproducing kernel Hilbert spaces with analytic positive definite functions


Title	Composition operators on reproducing kernel Hilbert spaces with analytic positive definite functions
Authors	Masahiro Ikeda, Isao Ishikawa, Yoshihiro Sawano
Abstract	Composition operators have been extensively studied in complex analysis, and recently, they have been utilized in engineering and machine learning. Here, we focus on composition operators associated with maps in Euclidean spaces that are on reproducing kernel Hilbert spaces with respect to analytic positive definite functions, and prove the maps are affine if the composition operators are bounded. Our result covers composition operators on Paley-Wiener spaces and reproducing kernel spaces with respect to the Gaussian kernel on ${\mathbb R}^d$, widely used in the context of engineering.
Tasks
Published	2019-11-27
URL	https://arxiv.org/abs/1911.11992v1
PDF	https://arxiv.org/pdf/1911.11992v1.pdf
PWC	https://paperswithcode.com/paper/composition-operators-on-reproducing-kernel
Repo
Framework

RIO: 3D Object Instance Re-Localization in Changing Indoor Environments


Title	RIO: 3D Object Instance Re-Localization in Changing Indoor Environments
Authors	Johanna Wald, Armen Avetisyan, Nassir Navab, Federico Tombari, Matthias Nießner
Abstract	In this work, we introduce the task of 3D object instance re-localization (RIO): given one or multiple objects in an RGB-D scan, we want to estimate their corresponding 6DoF poses in another 3D scan of the same environment taken at a later point in time. We consider RIO a particularly important task in 3D vision since it enables a wide range of practical applications, including AI-assistants or robots that are asked to find a specific object in a 3D scene. To address this problem, we first introduce 3RScan, a novel dataset and benchmark, which features 1482 RGB-D scans of 478 environments across multiple time steps. Each scene includes several objects whose positions change over time, together with ground truth annotations of object instances and their respective 6DoF mappings among re-scans. Automatically finding 6DoF object poses leads to a particularly challenging feature matching task due to varying partial observations and changes in the surrounding context. To this end, we introduce a new data-driven approach that efficiently finds matching features using a fully-convolutional 3D correspondence network operating on multiple spatial scales. Combined with a 6DoF pose optimization, our method outperforms state-of-the-art baselines on our newly-established benchmark, achieving an accuracy of 30.58%.
Tasks
Published	2019-08-16
URL	https://arxiv.org/abs/1908.06109v1
PDF	https://arxiv.org/pdf/1908.06109v1.pdf
PWC	https://paperswithcode.com/paper/rio-3d-object-instance-re-localization-in
Repo
Framework

Simultaneous Spectral-Spatial Feature Selection and Extraction for Hyperspectral Images


Title	Simultaneous Spectral-Spatial Feature Selection and Extraction for Hyperspectral Images
Authors	Lefei Zhang, Qian Zhang, Bo Du, Xin Huang, Yuan Yan Tang, Dacheng Tao
Abstract	In hyperspectral remote sensing data mining, it is important to take into account both spectral and spatial information, such as the spectral signature, texture feature and morphological property, to improve performance, e.g., the image classification accuracy. From a feature representation point of view, a natural approach to handle this situation is to concatenate the spectral and spatial features into a single but high dimensional vector and then apply a certain dimension reduction technique directly on that concatenated vector before feeding it into the subsequent classifier. However, multiple features from various domains have different physical meanings and statistical properties, and thus such concatenation does not efficiently explore the complementary properties among different features, which should help boost feature discriminability. Furthermore, it is also difficult to interpret the transformed results of the concatenated vector. Consequently, finding a physically meaningful consensus low dimensional feature representation of the original multiple features is still a challenging task. In order to address these issues, we propose a novel feature learning framework, i.e., the simultaneous spectral-spatial feature selection and extraction algorithm, for hyperspectral image spectral-spatial feature representation and classification. Specifically, the proposed method learns a latent low dimensional subspace by projecting the spectral-spatial feature into a common feature space, where the complementary information has been effectively exploited, and simultaneously, only the most significant original features have been transformed. Encouraging experimental results on three publicly available hyperspectral remote sensing datasets confirm that our proposed method is effective and efficient.
Tasks	Dimensionality Reduction, Feature Selection, Image Classification
Published	2019-04-08
URL	http://arxiv.org/abs/1904.03982v1
PDF	http://arxiv.org/pdf/1904.03982v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-spectral-spatial-feature
Repo
Framework