Paper Group ANR 1595
Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks. Benchmarking machine learning models on eICU critical care dataset. Learning Super-resolution 3D Segmentation of Plant Root MRI Images from Few Examples. Revisiting Self-Training for Neural Sequence Generation. Controversial stimuli: pitting neural ne …
Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks
Title | Points2Pix: 3D Point-Cloud to Image Translation using conditional Generative Adversarial Networks |
Authors | Stefan Milz, Martin Simon, Kai Fischer, Maximillian Pöpperl |
Abstract | We present the first approach for 3D point-cloud to image translation based on conditional Generative Adversarial Networks (cGAN). The model handles multi-modal information sources from different domains, i.e. raw point-sets and images. The generator is capable of processing three conditions, where the point-cloud is encoded as a raw point-set and a camera projection. An image background patch is used as a constraint to bias environmental texturing. A global approximation function within the generator is directly applied on the point-cloud (Point-Net). Hence, the representative learning model incorporates global 3D characteristics directly at the latent feature space. Conditions are used to bias the background and the viewpoint of the generated image. This opens up new ways of augmenting or texturing 3D data with the aim of generating fully individual images. We successfully evaluated our method on the KITTI and SunRGBD datasets with an outstanding object detection inception score. |
Tasks | Object Detection |
Published | 2019-01-26 |
URL | https://arxiv.org/abs/1901.09280v2 |
https://arxiv.org/pdf/1901.09280v2.pdf | |
PWC | https://paperswithcode.com/paper/points2pix-3d-point-cloud-to-image |
Repo | |
Framework | |
Benchmarking machine learning models on eICU critical care dataset
Title | Benchmarking machine learning models on eICU critical care dataset |
Authors | Seyedmostafa Sheikhalishahi, Vevake Balaraman, Venet Osmani |
Abstract | Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research (such as vision and NLP) have already established various competitions and benchmarks, whereas only the recent availability of large clinical datasets has enabled the possibility of public benchmarks. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of clinical models as well as baseline and deep models using the eICU critical care dataset of around 73,000 patients. Furthermore, we investigate the impact of numerical variables as well as the handling of categorical variables for each of the defined tasks. |
Tasks | Mortality Prediction |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.00964v1 |
https://arxiv.org/pdf/1910.00964v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-machine-learning-models-on-eicu |
Repo | |
Framework | |
Learning Super-resolution 3D Segmentation of Plant Root MRI Images from Few Examples
Title | Learning Super-resolution 3D Segmentation of Plant Root MRI Images from Few Examples |
Authors | Ali Oguz Uzman, Jannis Horn, Sven Behnke |
Abstract | Analyzing plant roots is crucial to understand plant performance in different soil environments. While magnetic resonance imaging (MRI) can be used to obtain 3D images of plant roots, extracting the root structural model is challenging due to highly noisy soil environments and the low resolution of MRI images. To improve both contrast and resolution, we adapt the state-of-the-art method RefineNet for 3D segmentation of the plant root MRI images in super-resolution. The networks are trained from a few manual segmentations that are augmented by geometric transformations, realistic noise, and other variabilities. The resulting segmentations contain most root structures, including branches not extracted by the human annotator. |
Tasks | Super-Resolution |
Published | 2019-03-16 |
URL | http://arxiv.org/abs/1903.06855v1 |
http://arxiv.org/pdf/1903.06855v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-super-resolution-3d-segmentation-of |
Repo | |
Framework | |
Revisiting Self-Training for Neural Sequence Generation
Title | Revisiting Self-Training for Neural Sequence Generation |
Authors | Junxian He, Jiatao Gu, Jiajun Shen, Marc’Aurelio Ranzato |
Abstract | Self-training is one of the earliest and simplest semi-supervised methods. The key idea is to augment the original labeled dataset with unlabeled data paired with the model’s prediction (i.e. the pseudo-parallel data). While self-training has been extensively studied on classification problems, in complex sequence generation tasks (e.g. machine translation) it is still unclear how self-training works due to the compositionality of the target space. In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks. Through careful examination of the performance gains, we find that the perturbation on the hidden states (i.e. dropout) is critical for self-training to benefit from the pseudo-parallel data, which acts as a regularizer and forces the model to yield close predictions for similar unlabeled inputs. This effect helps the model correct some incorrect predictions on unlabeled data. To further encourage this mechanism, we propose to inject noise into the input space, resulting in a “noisy” version of self-training. An empirical study on standard machine translation and text summarization benchmarks shows that noisy self-training is able to effectively utilize unlabeled data and improve the performance of the supervised baseline by a large margin. |
Tasks | Machine Translation, Text Summarization |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13788v2 |
https://arxiv.org/pdf/1909.13788v2.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-self-training-for-neural-sequence |
Repo | |
Framework | |
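As a rough illustration of the noisy self-training idea described in the abstract above, the following Python sketch builds pseudo-parallel pairs from unlabeled inputs. The function names (`perturb`, `build_pseudo_parallel`), the choice of random token dropping as the input noise, and the decision to predict on the clean input while storing the noisy one are all assumptions made for illustration; the abstract only states that noise is injected into the input space.

```python
import random
from typing import Callable, List, Sequence, Tuple

Tokens = List[str]


def perturb(tokens: Sequence[str], drop_prob: float, rng: random.Random) -> Tokens:
    """Randomly drop tokens (one hypothetical form of input-space noise)."""
    kept = [tok for tok in tokens if rng.random() >= drop_prob]
    # Keep at least one token so a non-empty input never becomes empty.
    if not kept and tokens:
        kept = [tokens[0]]
    return kept


def build_pseudo_parallel(
    predict: Callable[[Tokens], Tokens],
    unlabeled: Sequence[Sequence[str]],
    drop_prob: float = 0.1,
    seed: int = 0,
) -> List[Tuple[Tokens, Tokens]]:
    """Pair a noisy copy of each unlabeled input with the model's prediction."""
    rng = random.Random(seed)
    pairs = []
    for source in unlabeled:
        # Pseudo-label produced by the current model on the clean input (assumed).
        target = predict(list(source))
        # The training pair uses the perturbed input instead of the clean one.
        noisy_source = perturb(source, drop_prob, rng)
        pairs.append((noisy_source, target))
    return pairs


if __name__ == "__main__":
    # Toy stand-in for a trained sequence model: it just upper-cases tokens.
    toy_model = lambda toks: [t.upper() for t in toks]
    data = [["the", "cat", "sat"], ["hello", "world"]]
    for noisy, pseudo in build_pseudo_parallel(toy_model, data, drop_prob=0.3):
        print(noisy, "->", pseudo)
```

In a full pipeline, the resulting pairs would be mixed with the labeled data to retrain the model; that training step is omitted here because it depends on the model and framework in use.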
Controversial stimuli: pitting neural networks against each other as models of human recognition
Title | Controversial stimuli: pitting neural networks against each other as models of human recognition |
Authors | Tal Golan, Prashant C. Raju, Nikolaus Kriegeskorte |
Abstract | Distinct scientific theories can make similar predictions. To adjudicate between theories, we must design experiments for which the theories make distinct predictions. Here we consider the problem of comparing deep neural networks as models of human visual recognition. To efficiently determine which models better explain human responses, we synthesize controversial stimuli: images for which different models produce distinct responses. We tested nine different models, which employed different architectures and recognition algorithms, including discriminative and generative models, all trained to recognize handwritten digits (from the MNIST set of digit images). We synthesized controversial stimuli to maximize the disagreement among the models. Human subjects viewed hundreds of these stimuli and judged the probability of presence of each digit in each image. We quantified how accurately each model predicted the human judgements. We found that the generative models (which learn the distribution of images for each class) better predicted the human judgments than the discriminative models (which learn to directly map from images to labels). The best performing model was the generative Analysis-by-Synthesis model (based on variational autoencoders). However, a simpler generative model (based on Gaussian-kernel-density estimation) also performed better than each of the discriminative models. None of the candidate models fully explained the human responses. We discuss the advantages and limitations of controversial stimuli as an experimental paradigm and how they generalize and improve on adversarial examples as probes of discrepancies between models and human perception. |
Tasks | Density Estimation |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09288v1 |
https://arxiv.org/pdf/1911.09288v1.pdf | |
PWC | https://paperswithcode.com/paper/controversial-stimuli-pitting-neural-networks |
Repo | |
Framework | |
Space lower bounds for linear prediction in the streaming model
Title | Space lower bounds for linear prediction in the streaming model |
Authors | Yuval Dagan, Gil Kur, Ohad Shamir |
Abstract | We show that fundamental learning tasks, such as finding an approximate linear separator or linear regression, require memory at least quadratic in the dimension, in a natural streaming setting. This implies that such problems cannot be solved (at least in this setting) by scalable memory-efficient streaming algorithms. Our results build on a memory lower bound for a simple linear-algebraic problem – finding orthogonal vectors – and utilize the estimates on the packing of the Grassmannian, the manifold of all linear subspaces of fixed dimension. |
Tasks | |
Published | 2019-02-09 |
URL | https://arxiv.org/abs/1902.03498v3 |
https://arxiv.org/pdf/1902.03498v3.pdf | |
PWC | https://paperswithcode.com/paper/space-lower-bounds-for-linear-prediction |
Repo | |
Framework | |
Learning to Solve Large-Scale Security-Constrained Unit Commitment Problems
Title | Learning to Solve Large-Scale Security-Constrained Unit Commitment Problems |
Authors | Alinson S. Xavier, Feng Qiu, Shabbir Ahmed |
Abstract | Security-Constrained Unit Commitment (SCUC) is a fundamental problem in power systems and electricity markets. In practical settings, SCUC is repeatedly solved via Mixed-Integer Linear Programming, sometimes multiple times per day, with only minor changes in input data. In this work, we propose a number of machine learning (ML) techniques to effectively extract information from previously solved instances in order to significantly improve the computational performance of MIP solvers when solving similar instances in the future. Based on statistical data, we predict redundant constraints in the formulation, good initial feasible solutions and affine subspaces where the optimal solution is likely to lie, leading to a significant reduction in problem size. Computational results on a diverse set of realistic and large-scale instances show that, using the proposed techniques, SCUC can be solved on average 4.3x faster with optimality guarantees, and 10.2x faster without optimality guarantees, but with no observed reduction in solution quality. Out-of-distribution experiments provide evidence that the method is somewhat robust against dataset shift. |
Tasks | |
Published | 2019-02-04 |
URL | https://arxiv.org/abs/1902.01697v2 |
https://arxiv.org/pdf/1902.01697v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-solve-large-scale-security |
Repo | |
Framework | |
Generating Logical Forms from Graph Representations of Text and Entities
Title | Generating Logical Forms from Graph Representations of Text and Entities |
Authors | Peter Shaw, Philip Massey, Angelica Chen, Francesco Piccinno, Yasemin Altun |
Abstract | Structured information about entities is critical for many semantic parsing tasks. We present an approach that uses a Graph Neural Network (GNN) architecture to incorporate information about relevant entities and their relations during parsing. Combined with a decoder copy mechanism, this approach provides a conceptually simple mechanism to generate logical forms with entities. We demonstrate that this approach is competitive with the state-of-the-art across several tasks without pre-training, and outperforms existing approaches when combined with BERT pre-training. |
Tasks | Semantic Parsing |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08407v3 |
https://arxiv.org/pdf/1905.08407v3.pdf | |
PWC | https://paperswithcode.com/paper/generating-logical-forms-from-graph |
Repo | |
Framework | |
Multiple Face Analyses through Adversarial Learning
Title | Multiple Face Analyses through Adversarial Learning |
Authors | Shangfei Wang, Shi Yin, Longfei Hao, Guang Liang |
Abstract | The inherent relations among multiple face analysis tasks, such as landmark detection, head pose estimation, gender recognition and face attribute estimation, are crucial to boost the performance of each task, but have not been thoroughly explored since these multiple face analysis tasks are typically handled as separate tasks. In this paper, we propose a novel deep multi-task adversarial learning method to localize facial landmarks, estimate head pose and recognize gender jointly, or estimate multiple face attributes simultaneously, by exploring their dependencies at both the image representation level and the label level. Specifically, the proposed method consists of a deep recognition network R and a discriminator D. The deep recognition network is used to learn the shared middle-level image representation and conducts multiple face analysis tasks simultaneously. Through a multi-task learning mechanism, the recognition network explores the dependencies among multiple face analysis tasks, such as facial landmark localization, head pose estimation, gender recognition and face attribute estimation, at the image representation level. The discriminator is introduced to enforce the distribution of the multiple face analysis tasks to converge to that inherent in the ground-truth labels. During training, the recognizer tries to confuse the discriminator, while the discriminator competes with the recognizer by distinguishing the predicted label combination from the ground-truth one. Through adversarial learning, we explore the dependencies among multiple face analysis tasks at the label level. Experimental results on four benchmark databases, i.e., the AFLW database, the Multi-PIE database, the CelebA database and the LFWA database, demonstrate the effectiveness of the proposed method for multiple face analyses. |
Tasks | Face Alignment, Head Pose Estimation, Multi-Task Learning, Pose Estimation |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07846v1 |
https://arxiv.org/pdf/1911.07846v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-face-analyses-through-adversarial |
Repo | |
Framework | |
Contrast Enhancement of Medical X-Ray Image Using Morphological Operators with Optimal Structuring Element
Title | Contrast Enhancement of Medical X-Ray Image Using Morphological Operators with Optimal Structuring Element |
Authors | Rafsanjany Kushol, Md. Nishat Raihan, Md Sirajus Salekin, A. B. M. Ashikur Rahman |
Abstract | To guide surgical and medical treatment, X-ray images are used by physicians in every modern healthcare organization and hospital. Doctors’ evaluation process and disease identification in the area of the skeletal system can be performed faster and more efficiently with the help of X-ray imaging, as it can depict bone structure painlessly. This paper presents an efficient contrast enhancement technique using morphological operators which helps to visualize important bone segments and soft tissues more clearly. Top-hat and bottom-hat transforms are utilized to enhance the image, where the gradient magnitude value is calculated for automatically selecting the structuring element (SE) size. Experimental evaluation on different X-ray imaging databases shows the effectiveness of our method, which also produces comparatively better output than some existing image enhancement techniques. |
Tasks | Image Enhancement |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08545v1 |
https://arxiv.org/pdf/1905.08545v1.pdf | |
PWC | https://paperswithcode.com/paper/contrast-enhancement-of-medical-x-ray-image |
Repo | |
Framework | |
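To make the top-hat and bottom-hat idea from the abstract above more concrete, here is a minimal pure-Python sketch. It uses the commonly cited combination `enhanced = image + top_hat - bottom_hat` with a square structuring element of fixed radius. Both the combination formula and the fixed `radius` parameter are assumptions for illustration: the abstract says the structuring element size is chosen automatically from the gradient magnitude, and that selection step is not reproduced here.

```python
from typing import Callable, List

Image = List[List[int]]


def _window_filter(img: Image, radius: int, pick: Callable) -> Image:
    """Apply `pick` (min or max) over a square window clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                img[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def erode(img: Image, radius: int) -> Image:
    return _window_filter(img, radius, min)


def dilate(img: Image, radius: int) -> Image:
    return _window_filter(img, radius, max)


def opening(img: Image, radius: int) -> Image:
    return dilate(erode(img, radius), radius)


def closing(img: Image, radius: int) -> Image:
    return erode(dilate(img, radius), radius)


def enhance_contrast(img: Image, radius: int = 1, max_value: int = 255) -> Image:
    """Add the top-hat and subtract the bottom-hat, clamping to [0, max_value]."""
    opened = opening(img, radius)
    closed = closing(img, radius)
    out = []
    for r, row in enumerate(img):
        new_row = []
        for c, pixel in enumerate(row):
            top_hat = pixel - opened[r][c]     # bright detail smaller than the SE
            bottom_hat = closed[r][c] - pixel  # dark detail smaller than the SE
            value = pixel + top_hat - bottom_hat
            new_row.append(min(max_value, max(0, value)))
        out.append(new_row)
    return out
```

On a flat region the two transforms are zero and the pixel is left unchanged; small bright structures become brighter and small dark structures become darker, which is the contrast-stretching effect the technique relies on.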
Physics-guided Design and Learning of Neural Networks for Predicting Drag Force on Particle Suspensions in Moving Fluids
Title | Physics-guided Design and Learning of Neural Networks for Predicting Drag Force on Particle Suspensions in Moving Fluids |
Authors | Nikhil Muralidhar, Jie Bu, Ze Cao, Long He, Naren Ramakrishnan, Danesh Tafti, Anuj Karpatne |
Abstract | Physics-based simulations are often used to model and understand complex physical systems and processes in domains like fluid dynamics. Such simulations, although used frequently, have many limitations, which can arise either from the inability to accurately model a physical process owing to incomplete knowledge about certain facets of the process, or from the underlying process being too complex to accurately encode into a simulation model. In such situations, it is often useful to rely on machine learning methods to fill in the gap by learning a model of the complex physical process directly from simulation data. However, as data generation through simulations is costly, we need to develop models that are cognizant of data paucity issues. In such scenarios it is often helpful if the rich physical knowledge of the application domain is incorporated in the architectural design of machine learning models. Further, we can also use information from physics-based simulations to guide the learning process, using aggregate supervision to favorably constrain it. In this paper, we propose PhyDNN, a deep learning model using physics-guided structural priors and physics-guided aggregate supervision for modeling the drag forces acting on each particle in a Computational Fluid Dynamics-Discrete Element Method (CFD-DEM). We conduct extensive experiments in the context of drag force prediction and showcase the usefulness of including physics knowledge in our deep learning formulation, both in the design and through the learning process. Our proposed PhyDNN model has been compared to several state-of-the-art models and achieves a significant performance improvement of 8.46% on average across all baseline models. The source code has been made available and the dataset used is detailed in [1, 2]. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.04240v1 |
https://arxiv.org/pdf/1911.04240v1.pdf | |
PWC | https://paperswithcode.com/paper/physics-guided-design-and-learning-of-neural |
Repo | |
Framework | |
Automatic Calcium Scoring in Cardiac and Chest CT Using DenseRAUnet
Title | Automatic Calcium Scoring in Cardiac and Chest CT Using DenseRAUnet |
Authors | Jiechao Ma, Rongguo Zhang |
Abstract | Cardiovascular disease (CVD) is a common and strong threat to human beings, featuring high prevalence, disability and mortality. The amount of coronary artery calcification (CAC) is an effective factor for CVD risk evaluation. Conventionally, CAC is quantified using ECG-synchronized cardiac CT but rarely from general chest CT scans. However, compared with ECG-synchronized cardiac CT, chest CT is more prevalent and economical in clinical practice. To address this, we propose an automatic method based on Dense U-Net to segment coronary calcium pixels on both types of CT scans. Our contribution is two-fold. First, we propose a novel network called DenseRAUnet, which takes advantage of Dense U-Net, ResNet and atrous convolutions. We prove the robustness and generalizability of our model by training it exclusively on chest CT while testing it on both types of CT scans. Second, we design a loss function combining bootstrap with an IoU function to balance foreground and background classes. DenseRAUnet is trained in a 2.5D fashion and tested on a private dataset consisting of 144 scans. Results show an F1-score of 0.75, with 0.83 accuracy of predicting cardiovascular disease risk. |
Tasks | |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11392v1 |
https://arxiv.org/pdf/1907.11392v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-calcium-scoring-in-cardiac-and |
Repo | |
Framework | |
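The abstract above mentions a loss that combines bootstrap with an IoU function but does not give its formula. The Python sketch below shows one plausible reading, offered purely as an assumption: a bootstrapped binary cross-entropy (the mean over only the hardest pixels) added to a soft IoU loss. The function names, the `keep_fraction` and `iou_weight` parameters, and the additive combination are hypothetical and may differ from what the paper actually uses.

```python
import math
from typing import Sequence


def soft_iou_loss(probs: Sequence[float], labels: Sequence[int], eps: float = 1e-7) -> float:
    """1 - soft IoU, where the intersection and union use predicted probabilities."""
    inter = sum(p * y for p, y in zip(probs, labels))
    union = sum(p + y - p * y for p, y in zip(probs, labels))
    return 1.0 - (inter + eps) / (union + eps)


def bootstrapped_bce(
    probs: Sequence[float],
    labels: Sequence[int],
    keep_fraction: float = 0.25,
    eps: float = 1e-7,
) -> float:
    """Mean binary cross-entropy over only the hardest `keep_fraction` of pixels."""
    losses = []
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    k = max(1, int(len(losses) * keep_fraction))
    hardest = sorted(losses, reverse=True)[:k]
    return sum(hardest) / k


def combined_loss(
    probs: Sequence[float],
    labels: Sequence[int],
    iou_weight: float = 1.0,
    keep_fraction: float = 0.25,
) -> float:
    """Hypothetical combination: bootstrapped BCE plus a weighted soft IoU loss."""
    return bootstrapped_bce(probs, labels, keep_fraction) + iou_weight * soft_iou_loss(probs, labels)


if __name__ == "__main__":
    labels = [1, 0, 0, 0, 0, 0, 0, 1]  # sparse foreground, as with calcium pixels
    good = [0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.8]
    bad = [0.2, 0.1, 0.1, 0.6, 0.1, 0.1, 0.1, 0.3]
    print(combined_loss(good, labels), combined_loss(bad, labels))
```

The intuition for such a pairing is that the IoU term is insensitive to the large number of easy background pixels, while the bootstrapped term concentrates the gradient on the pixels the model currently gets most wrong.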
Composition operators on reproducing kernel Hilbert spaces with analytic positive definite functions
Title | Composition operators on reproducing kernel Hilbert spaces with analytic positive definite functions |
Authors | Masahiro Ikeda, Isao Ishikawa, Yoshihiro Sawano |
Abstract | Composition operators have been extensively studied in complex analysis, and recently, they have been utilized in engineering and machine learning. Here, we focus on composition operators associated with maps in Euclidean spaces that are on reproducing kernel Hilbert spaces with respect to analytic positive definite functions, and prove the maps are affine if the composition operators are bounded. Our result covers composition operators on Paley-Wiener spaces and reproducing kernel spaces with respect to the Gaussian kernel on ${\mathbb R}^d$, widely used in the context of engineering. |
Tasks | |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11992v1 |
https://arxiv.org/pdf/1911.11992v1.pdf | |
PWC | https://paperswithcode.com/paper/composition-operators-on-reproducing-kernel |
Repo | |
Framework | |
RIO: 3D Object Instance Re-Localization in Changing Indoor Environments
Title | RIO: 3D Object Instance Re-Localization in Changing Indoor Environments |
Authors | Johanna Wald, Armen Avetisyan, Nassir Navab, Federico Tombari, Matthias Nießner |
Abstract | In this work, we introduce the task of 3D object instance re-localization (RIO): given one or multiple objects in an RGB-D scan, we want to estimate their corresponding 6DoF poses in another 3D scan of the same environment taken at a later point in time. We consider RIO a particularly important task in 3D vision since it enables a wide range of practical applications, including AI-assistants or robots that are asked to find a specific object in a 3D scene. To address this problem, we first introduce 3RScan, a novel dataset and benchmark, which features 1482 RGB-D scans of 478 environments across multiple time steps. Each scene includes several objects whose positions change over time, together with ground truth annotations of object instances and their respective 6DoF mappings among re-scans. Automatically finding 6DoF object poses leads to a particularly challenging feature matching task due to varying partial observations and changes in the surrounding context. To this end, we introduce a new data-driven approach that efficiently finds matching features using a fully-convolutional 3D correspondence network operating on multiple spatial scales. Combined with a 6DoF pose optimization, our method outperforms state-of-the-art baselines on our newly-established benchmark, achieving an accuracy of 30.58%. |
Tasks | |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.06109v1 |
https://arxiv.org/pdf/1908.06109v1.pdf | |
PWC | https://paperswithcode.com/paper/rio-3d-object-instance-re-localization-in |
Repo | |
Framework | |
Simultaneous Spectral-Spatial Feature Selection and Extraction for Hyperspectral Images
Title | Simultaneous Spectral-Spatial Feature Selection and Extraction for Hyperspectral Images |
Authors | Lefei Zhang, Qian Zhang, Bo Du, Xin Huang, Yuan Yan Tang, Dacheng Tao |
Abstract | In hyperspectral remote sensing data mining, it is important to take into account both spectral and spatial information, such as the spectral signature, texture feature and morphological property, to improve performance, e.g., the image classification accuracy. From a feature representation point of view, a natural approach to handle this situation is to concatenate the spectral and spatial features into a single but high dimensional vector and then apply a certain dimension reduction technique directly on that concatenated vector before feeding it into the subsequent classifier. However, multiple features from various domains definitely have different physical meanings and statistical properties, and thus such concatenation does not efficiently explore the complementary properties among different features, which should help boost the feature discriminability. Furthermore, it is also difficult to interpret the transformed results of the concatenated vector. Consequently, finding a physically meaningful consensus low dimensional feature representation of the original multiple features is still a challenging task. In order to address these issues, we propose a novel feature learning framework, i.e., the simultaneous spectral-spatial feature selection and extraction algorithm, for hyperspectral image spectral-spatial feature representation and classification. Specifically, the proposed method learns a latent low dimensional subspace by projecting the spectral-spatial feature into a common feature space, where the complementary information has been effectively exploited, and simultaneously, only the most significant original features have been transformed. Encouraging experimental results on three publicly available hyperspectral remote sensing datasets confirm that our proposed method is effective and efficient. |
Tasks | Dimensionality Reduction, Feature Selection, Image Classification |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.03982v1 |
http://arxiv.org/pdf/1904.03982v1.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-spectral-spatial-feature |
Repo | |
Framework | |