January 28, 2020

Paper Group ANR 996

From Receptive to Productive: Learning to Use Confusing Words through Automatically Selected Example Sentences. Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning. Robust Super-Resolution GAN, with Manifold-based and Perception Loss. Blood lactate concentration prediction in critical care patients: handling missing values. …

From Receptive to Productive: Learning to Use Confusing Words through Automatically Selected Example Sentences


Title	From Receptive to Productive: Learning to Use Confusing Words through Automatically Selected Example Sentences
Authors	Chieh-Yang Huang, Yi-Ting Huang, Mei-Hua Chen, Lun-Wei Ku
Abstract	Knowing how to use words appropriately has been a key to improving language proficiency. Previous studies typically discuss how students learn receptively to select the correct candidate from a set of confusing words in the fill-in-the-blank task where specific context is given. In this paper, we go one step further, assisting students to learn to use confusing words appropriately in a productive task: sentence translation. We leverage the GiveMeExample system, which suggests example sentences for each confusing word, to achieve this goal. In this study, students learn to differentiate the confusing words by reading the example sentences, and then choose the appropriate word(s) to complete the sentence translation task. Results show students made substantial progress in terms of sentence structure. In addition, highly proficient students better managed to learn confusing words. In view of the influence of the first language on learners, we further propose an effective approach to improve the quality of the suggested sentences.
Tasks
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02782v1
PDF	https://arxiv.org/pdf/1906.02782v1.pdf
PWC	https://paperswithcode.com/paper/from-receptive-to-productive-learning-to-use
Repo
Framework

Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning


Title	Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning
Authors	Tanzila Rahman, Bicheng Xu, Leonid Sigal
Abstract	Multi-modal learning, particularly among imaging and linguistic modalities, has made amazing strides in many high-level fundamental visual understanding problems, ranging from language grounding to dense event captioning. However, much of the research has been limited to approaches that either do not take audio corresponding to video into account at all, or those that model the audio-visual correlations in service of sound or sound source localization. In this paper, we present evidence that audio signals can carry a surprising amount of information when it comes to high-level visual-lingual tasks. Specifically, we focus on the problem of weakly-supervised dense event captioning in videos and show that audio on its own can nearly rival the performance of a state-of-the-art visual model and, combined with video, can improve on the state-of-the-art performance. Extensive experiments on the ActivityNet Captions dataset show that our proposed multi-modal approach outperforms state-of-the-art unimodal methods, and validate specific feature representation and architecture design choices.
Tasks
Published	2019-09-22
URL	https://arxiv.org/abs/1909.09944v2
PDF	https://arxiv.org/pdf/1909.09944v2.pdf
PWC	https://paperswithcode.com/paper/190909944
Repo
Framework

Robust Super-Resolution GAN, with Manifold-based and Perception Loss


Title	Robust Super-Resolution GAN, with Manifold-based and Perception Loss
Authors	Uddeshya Upadhyay, Suyash P. Awate
Abstract	Super-resolution using deep neural networks typically relies on highly curated training sets that are often unavailable in clinical deployment scenarios. Using loss functions that assume Gaussian-distributed residuals makes the learning sensitive to corruptions in clinical training sets. We propose novel loss functions that are robust to corruptions in training sets by modeling heavy-tailed non-Gaussian distributions on the residuals. We propose a loss based on an autoencoder-based manifold-distance between the super-resolved and high-resolution images, to reproduce realistic textural content in super-resolved images. We propose to learn to super-resolve images to match human perceptions of structure, luminance, and contrast. Results on a large clinical dataset show the advantages of each of our contributions, where our framework improves over the state of the art.
Tasks	Super-Resolution
Published	2019-03-16
URL	http://arxiv.org/abs/1903.06920v1
PDF	http://arxiv.org/pdf/1903.06920v1.pdf
PWC	https://paperswithcode.com/paper/robust-super-resolution-gan-with-manifold
Repo
Framework

Blood lactate concentration prediction in critical care patients: handling missing values


Title	Blood lactate concentration prediction in critical care patients: handling missing values
Authors	Behrooz Mamandipoor, Mahshid Majd, Monica Moz, Venet Osmani
Abstract	Blood lactate concentration is a strong indicator of mortality risk in critically ill patients. While frequent lactate measurements are necessary to assess a patient’s health state, the measurement is an invasive procedure that can increase the risk of hospital-acquired infections. For this reason we formally define the problem of lactate prediction as a clinically relevant benchmark problem for the machine learning community so as to assist clinical decision making in blood lactate testing. Accordingly, we demonstrate the relevant challenges of the problem and its data in addition to the adopted solutions. Also, we evaluate the performance of different prediction algorithms on a large dataset of ICU patients from the multi-centre eICU database. More specifically, we focus on investigating the impact of missing value imputation methods in lactate prediction for each algorithm. The experimental analysis shows promising prediction results that encourage further investigation of this problem.
Tasks	Decision Making, Imputation
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01473v1
PDF	https://arxiv.org/pdf/1910.01473v1.pdf
PWC	https://paperswithcode.com/paper/blood-lactate-concentration-prediction-in
Repo
Framework
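
As a rough illustration of why the choice of imputation method matters for the lactate prediction task described above, the sketch below fills gaps in a short series of readings in two common ways: forward fill (carry the last observation forward) and mean imputation. The abstract does not say which imputation methods the authors compared, so the function names, the two strategies, and the sample values are all assumptions made for illustration.

```python
from typing import List, Optional

Series = List[Optional[float]]


def forward_fill(values: Series) -> Series:
    """Replace each missing value (None) with the most recent observed value.

    Leading gaps stay None because there is nothing to carry forward.
    """
    filled: Series = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def mean_impute(values: Series) -> Series:
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so nothing to impute from
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical hourly lactate readings in mmol/L; None marks a missing draw.
readings: Series = [None, 2.0, None, None, 4.0, None]

print(forward_fill(readings))  # [None, 2.0, 2.0, 2.0, 4.0, 4.0]
print(mean_impute(readings))   # [3.0, 2.0, 3.0, 3.0, 4.0, 3.0]
```

The two strategies hand a downstream predictor different inputs for the same patient, which is the kind of effect the paper sets out to measure per algorithm.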

Heteroscedastic Calibration of Uncertainty Estimators in Deep Learning


Title	Heteroscedastic Calibration of Uncertainty Estimators in Deep Learning
Authors	Bindya Venkatesh, Jayaraman J. Thiagarajan
Abstract	The role of uncertainty quantification (UQ) in deep learning has become crucial with growing use of predictive models in high-risk applications. Though a large class of methods exists for measuring deep uncertainties, in practice, the resulting estimates are found to be poorly calibrated, thus making it challenging to translate them into actionable insights. A common workaround is to utilize a separate recalibration step, which adjusts the estimates to compensate for the miscalibration. Instead, we propose to repurpose the heteroscedastic regression objective as a surrogate for calibration and enable any existing uncertainty estimator to be inherently calibrated. In addition to eliminating the need for recalibration, this also regularizes the training process. Using regression experiments, we demonstrate the effectiveness of the proposed heteroscedastic calibration with two popular uncertainty estimators.
Tasks	Calibration
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14179v1
PDF	https://arxiv.org/pdf/1910.14179v1.pdf
PWC	https://paperswithcode.com/paper/heteroscedastic-calibration-of-uncertainty
Repo
Framework
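
For readers unfamiliar with the heteroscedastic regression objective mentioned in the abstract above, the sketch below shows its standard form: a Gaussian negative log-likelihood in which the model predicts a variance for each sample alongside the mean. The abstract does not spell out the authors' exact formulation, so this is the textbook version, and the function name and the log-variance parameterization are assumptions.

```python
import math
from typing import Sequence


def heteroscedastic_nll(
    targets: Sequence[float],
    means: Sequence[float],
    log_vars: Sequence[float],
) -> float:
    """Mean Gaussian negative log-likelihood with a per-sample variance.

    Sample i contributes
        0.5 * (log_var_i + (y_i - mu_i)**2 / exp(log_var_i) + log(2 * pi)).
    Working with the log-variance keeps the variance positive.
    """
    if not (len(targets) == len(means) == len(log_vars)):
        raise ValueError("all inputs must have the same length")
    if len(targets) == 0:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for y, mu, log_var in zip(targets, means, log_vars):
        sq_err = (y - mu) ** 2
        total += 0.5 * (log_var + sq_err / math.exp(log_var) + math.log(2 * math.pi))
    return total / len(targets)


# Same prediction error (|y - mu| = 1) under three stated uncertainties.
matched = heteroscedastic_nll([1.0], [0.0], [0.0])                    # variance 1
overconfident = heteroscedastic_nll([1.0], [0.0], [math.log(0.01)])   # variance 0.01
underconfident = heteroscedastic_nll([1.0], [0.0], [math.log(100.0)]) # variance 100

print(matched, overconfident, underconfident)
```

For a fixed squared error the loss is smallest when the predicted variance equals that error, so both overconfident and underconfident variance estimates are penalized. This is what makes the objective plausible as a surrogate for calibration.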

Cylindrical shape decomposition for 3D segmentation of tubular objects


Title	Cylindrical shape decomposition for 3D segmentation of tubular objects
Authors	Ali Abdollahzadeh, Alejandra Sierra, Jussi Tohka
Abstract	We develop a cylindrical shape decomposition (CSD) algorithm to decompose an object, which is a union of several tubular structures, into its semantic components. We decompose the object using its curve skeleton and translational sweeps. For that, CSD partitions the curve skeleton into maximal-length sub-skeletons over an orientation function; each sub-skeleton corresponds to a semantic component. To find the intersection of the tubular components, CSD translationally sweeps the object in decomposition intervals to identify critical points at which the shape of the object changes substantially. CSD cuts the object at critical points and assigns the same label to parts along the same sub-skeleton, thereby constructing a semantic component. CSD further reconstructs the semantic components between parts using generalized cylinders. We apply CSD to the segmentation of axons in large 3D electron microscopy images and to the decomposition of vascular networks, as well as synthetic objects. We show that CSD outperforms state-of-the-art decomposition techniques in these applications.
Tasks	Semantic Segmentation
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00571v2
PDF	https://arxiv.org/pdf/1911.00571v2.pdf
PWC	https://paperswithcode.com/paper/cylindrical-shape-decomposition-algorithm-for
Repo
Framework

Combining Shape Priors with Conditional Adversarial Networks for Improved Scapula Segmentation in MR images


Title	Combining Shape Priors with Conditional Adversarial Networks for Improved Scapula Segmentation in MR images
Authors	Arnaud Boutillon, Bhushan Borotikar, Valérie Burdin, Pierre-Henri Conze
Abstract	This paper proposes an automatic method for scapula bone segmentation from Magnetic Resonance (MR) images using deep learning. The purpose of this work is to incorporate anatomical priors into a conditional adversarial framework, given a limited amount of heterogeneous annotated images. Our approach encourages the segmentation model to follow the global anatomical properties of the underlying anatomy through a learnt non-linear shape representation while the adversarial contribution refines the model by promoting realistic delineations. These contributions are evaluated on a dataset of 15 pediatric shoulder examinations, and compared to state-of-the-art architectures including UNet and recent derivatives. The significant improvements achieved bring new perspectives for the pre-operative management of musculo-skeletal diseases.
Tasks
Published	2019-10-20
URL	https://arxiv.org/abs/1910.08963v3
PDF	https://arxiv.org/pdf/1910.08963v3.pdf
PWC	https://paperswithcode.com/paper/combining-shape-priors-with-conditional
Repo
Framework

Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation


Title	Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation
Authors	Fengda Zhu, Linchao Zhu, Yi Yang
Abstract	There has been an increasing interest in 3D indoor navigation, where a robot in an environment moves to a target according to an instruction. To deploy a robot for navigation in the physical world, a large amount of training data is required to learn an effective policy. It is quite labour intensive to obtain sufficient real environment data for training robots, while synthetic data is much easier to construct by rendering. Though it is promising to utilize synthetic environments to facilitate navigation training in the real world, real environments differ from synthetic environments in two respects. First, the visual representations of the two environments differ significantly. Second, the house plans of the two environments are quite different. Therefore two types of information, i.e. visual representation and policy behavior, need to be adapted in the reinforcement model. The learning procedure of visual representation and that of policy behavior are presumably reciprocal. We propose to jointly adapt visual representation and policy behavior to leverage the mutual impacts of environment and policy. Specifically, our method employs an adversarial feature adaptation model for visual representation transfer and a policy mimic strategy for policy behavior imitation. Experiments show that our method outperforms the baseline by 19.47% without any additional human annotations.
Tasks
Published	2019-04-08
URL	http://arxiv.org/abs/1904.03895v2
PDF	http://arxiv.org/pdf/1904.03895v2.pdf
PWC	https://paperswithcode.com/paper/sim-real-joint-reinforcement-transfer-for-3d
Repo
Framework

Self-supervised 3D Shape and Viewpoint Estimation from Single Images for Robotics


Title	Self-supervised 3D Shape and Viewpoint Estimation from Single Images for Robotics
Authors	Oier Mees, Maxim Tatarchenko, Thomas Brox, Wolfram Burgard
Abstract	We present a convolutional neural network for joint 3D shape prediction and viewpoint estimation from a single input image. During training, our network gets the learning signal from a silhouette of an object in the input image - a form of self-supervision. It does not require ground truth data for 3D shapes and the viewpoints. Because it relies on such a weak form of supervision, our approach can easily be applied to real-world data. We demonstrate that our method produces reasonable qualitative and quantitative results on natural images for both shape estimation and viewpoint prediction. Unlike previous approaches, our method does not require multiple views of the same object instance in the dataset, which significantly expands the applicability in practical robotics scenarios. We showcase it by using the hallucinated shapes to improve the performance on the task of grasping real-world objects both in simulation and with a PR2 robot.
Tasks	Viewpoint Estimation
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07948v1
PDF	https://arxiv.org/pdf/1910.07948v1.pdf
PWC	https://paperswithcode.com/paper/self-supervised-3d-shape-and-viewpoint
Repo
Framework

Boldly Going Where No Prover Has Gone Before


Title	Boldly Going Where No Prover Has Gone Before
Authors	Giles Reger
Abstract	I argue that the most interesting goal facing researchers in automated reasoning is being able to solve problems that cannot currently be solved by existing tools and methods. This may appear obvious, and is clearly not an original thought, but focusing on this as a primary goal allows us to examine other goals in a new light. Many successful theorem provers employ a portfolio of different methods for solving problems. This changes the landscape on which we perform our research: solving problems that can already be solved may not improve the state of the art, and a method that can solve a handful of problems unsolvable by current methods, but generally performs poorly on most problems, can be very useful. We acknowledge that forcing new methods to compete against portfolio solvers can stifle innovation. However, this is only the case when comparisons are made at the level of total problems solved. We propose a movement towards focussing on unique solutions in evaluation and competitions, i.e. measuring the potential contribution to a portfolio solver. This state of affairs is particularly prominent in first-order logic, which is undecidable. When reasoning in a decidable logic there can be a focus on optimising a decision procedure and measuring average solving times. But in a setting where solutions are difficult to find, average solving times lose meaning, and whilst improving the efficiency of a technique can move potential solutions within acceptable time limits, in general, complementary strategies may be more successful.
Tasks
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12958v1
PDF	https://arxiv.org/pdf/1912.12958v1.pdf
PWC	https://paperswithcode.com/paper/boldly-going-where-no-prover-has-gone-before
Repo
Framework
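
The evaluation idea in the abstract above, counting unique solutions rather than total problems solved, can be sketched in a few lines. The solver names, the problem identifiers, and the `unique_solutions` helper below are hypothetical and only illustrate the metric; the abstract does not prescribe an implementation.

```python
from typing import Dict, Set


def unique_solutions(results: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each solver, return the problems that no other solver solved."""
    unique: Dict[str, Set[str]] = {}
    for name, solved in results.items():
        solved_by_others: Set[str] = set()
        for other_name, other_solved in results.items():
            if other_name != name:
                solved_by_others |= other_solved
        unique[name] = solved - solved_by_others
    return unique


# Hypothetical solvers mapped to the problems each one solved.
results = {
    "established": {"p1", "p2", "p3", "p4", "p5"},
    "tuned": {"p1", "p2", "p3", "p4"},
    "experimental": {"p2", "p6"},
}

unique = unique_solutions(results)
for name, solved in results.items():
    print(name, "total:", len(solved), "unique:", sorted(unique[name]))
```

Here the "tuned" solver solves four problems but adds nothing to a portfolio that already contains "established", while "experimental" solves only two yet contributes a problem nobody else can solve. Ranking by total problems solved would hide that difference.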

Spatiotemporal Filtering for Event-Based Action Recognition


Title	Spatiotemporal Filtering for Event-Based Action Recognition
Authors	Rohan Ghosh, Anupam Gupta, Andrei Nakagawa, Alcimar Soares, Nitish Thakor
Abstract	In this paper, we address the challenging problem of action recognition using event-based cameras. To recognise most gestural actions, higher temporal precision is often required for sampling visual information. Actions are defined by motion, and therefore, when using event-based cameras it is often unnecessary to re-sample the entire scene. Neuromorphic, event-based cameras have presented an alternative to visual information acquisition by asynchronously time-encoding pixel intensity changes through temporally precise spikes (10-microsecond resolution), making them well equipped for action recognition. However, other challenges exist, which are intrinsic to event-based imagers, such as higher signal-to-noise ratio and spatiotemporally sparse information. One option is to convert event data into frames, but this could result in significant temporal precision loss. In this work we introduce spatiotemporal filtering in the spike-event domain, as an alternative way of channeling spatiotemporal information through to a convolutional neural network. The filters are local spatiotemporal weight matrices, learned from the spike-event data in an unsupervised manner. We find that appropriate spatiotemporal filtering significantly improves CNN performance beyond state-of-the-art on the event-based DVS Gesture dataset. On our newly recorded action recognition dataset, our method shows significant improvement when compared with other, standard ways of generating the spatiotemporal filters.
Tasks	Temporal Action Localization
Published	2019-03-17
URL	http://arxiv.org/abs/1903.07067v1
PDF	http://arxiv.org/pdf/1903.07067v1.pdf
PWC	https://paperswithcode.com/paper/spatiotemporal-filtering-for-event-based
Repo
Framework

Progressive NAPSAC: sampling from gradually growing neighborhoods


Title	Progressive NAPSAC: sampling from gradually growing neighborhoods
Authors	Daniel Barath, Maksym Ivashechkin, Jiri Matas
Abstract	We propose Progressive NAPSAC, P-NAPSAC in short, which merges the advantages of local and global sampling by drawing samples from gradually growing neighborhoods. Exploiting the fact that nearby points are more likely to originate from the same geometric model, P-NAPSAC finds local structures earlier than global samplers. We show that the progressive spatial sampling in P-NAPSAC can be integrated with PROSAC sampling, which is applied to the first, location-defining, point. P-NAPSAC is embedded in USAC, a state-of-the-art robust estimation pipeline, which we further improve by implementing its local optimization as in Graph-Cut RANSAC. We call the resulting estimator USAC*. The method is tested on homography and fundamental matrix fitting on a total of 10,691 models from seven publicly available datasets. USAC* with P-NAPSAC outperforms reference methods in terms of speed on all problems.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02295v1
PDF	https://arxiv.org/pdf/1906.02295v1.pdf
PWC	https://paperswithcode.com/paper/progressive-napsac-sampling-from-gradually
Repo
Framework

AITom: Open-source AI platform for cryo-electron Tomography data analysis


Title	AITom: Open-source AI platform for cryo-electron Tomography data analysis
Authors	Xiangrui Zeng, Min Xu
Abstract	Cryo-electron tomography (cryo-ET) is an emerging technology for the 3D visualization of structural organizations and interactions of subcellular components at near-native state and sub-molecular resolution. Tomograms captured by cryo-ET contain heterogeneous structures representing the complex and dynamic subcellular environment. Since the structures are not purified or fluorescently labeled, the spatial organization and interaction between both the known and unknown structures can be studied in their native environment. The rapid advances of cryo-ET have generated abundant 3D cellular imaging data. However, the systematic localization, identification, segmentation, and structural recovery of the subcellular components require efficient and accurate large-scale image analysis methods. We introduce AITom, an open-source artificial intelligence platform for cryo-ET researchers. AITom provides many public as well as in-house algorithms for performing cryo-ET data analysis through both traditional template-based or template-free approaches and deep learning approaches. Comprehensive tutorials for each analysis module are provided to guide the user. We welcome researchers and developers to join this collaborative open-source software development project. Availability: https://github.com/xulabs/aitom
Tasks	Electron Tomography
Published	2019-11-08
URL	https://arxiv.org/abs/1911.03044v1
PDF	https://arxiv.org/pdf/1911.03044v1.pdf
PWC	https://paperswithcode.com/paper/aitom-open-source-ai-platform-for-cryo
Repo
Framework

Spectral-Spatial Diffusion Geometry for Hyperspectral Image Clustering


Title	Spectral-Spatial Diffusion Geometry for Hyperspectral Image Clustering
Authors	James M. Murphy, Mauro Maggioni
Abstract	An unsupervised learning algorithm to cluster hyperspectral image (HSI) data is proposed that exploits spatially-regularized random walks. Markov diffusions are defined on the space of HSI spectra with transitions constrained to near spatial neighbors. The explicit incorporation of spatial regularity into the diffusion construction leads to smoother random processes that are more adapted for unsupervised machine learning than those based on spectra alone. The regularized diffusion process is subsequently used to embed the high-dimensional HSI into a lower dimensional space through diffusion distances. Cluster modes are computed using density estimation and diffusion distances, and all other points are labeled according to these modes. The proposed method has low computational complexity and performs competitively against state-of-the-art HSI clustering algorithms on real data. In particular, the proposed spatial regularization confers an empirical advantage over non-regularized methods.
Tasks	Density Estimation, Image Clustering
Published	2019-02-08
URL	http://arxiv.org/abs/1902.05402v1
PDF	http://arxiv.org/pdf/1902.05402v1.pdf
PWC	https://paperswithcode.com/paper/spectral-spatial-diffusion-geometry-for
Repo
Framework

Minimizing the Societal Cost of Credit Card Fraud with Limited and Imbalanced Data


Title	Minimizing the Societal Cost of Credit Card Fraud with Limited and Imbalanced Data
Authors	Samuel Showalter, Zhixin Wu
Abstract	Machine learning has automated much of financial fraud detection, notifying firms of, or even blocking, questionable transactions instantly. However, data imbalance starves traditionally trained models of the content necessary to detect fraud. This study examines three separate factors of credit card fraud detection via machine learning. First, it assesses the potential for different sampling methods, undersampling and Synthetic Minority Oversampling Technique (SMOTE), to improve algorithm performance in data-starved environments. Additionally, five industry-practical machine learning algorithms are evaluated on total fraud cost savings in addition to traditional statistical metrics. Finally, an ensemble of individual models is trained with a genetic algorithm to attempt to generate higher cost efficiency than its components. Monte Carlo performance distributions showed that random undersampling outperformed SMOTE in lowering fraud costs, and that an ensemble was unable to outperform its individual parts. Most notably, the F-1 Score, a traditional metric often used to measure performance with imbalanced data, was uncorrelated with derived cost efficiency. Assuming a realistic cost structure can be derived, cost-based metrics provide an essential supplement to objective statistical evaluation.
Tasks	Fraud Detection
Published	2019-09-03
URL	https://arxiv.org/abs/1909.01486v2
PDF	https://arxiv.org/pdf/1909.01486v2.pdf
PWC	https://paperswithcode.com/paper/minimizing-the-societal-cost-of-credit-card
Repo
Framework
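
To make the abstract's point about the F-1 Score and cost concrete, the sketch below compares two hypothetical classifiers on the same 100 fraud cases. The confusion-matrix counts and the per-error costs are invented for illustration and are not taken from the paper. The example only shows that equal F-1 scores can coincide with very different costs; it does not reproduce the study's correlation analysis.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F-1 score written directly in terms of confusion-matrix counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fraud_cost(fp: int, fn: int, cost_fp: float, cost_fn: float) -> float:
    """Total cost of errors: false alarms plus missed fraud."""
    return fp * cost_fp + fn * cost_fn


# Assumed cost structure: a false alarm is cheap, a missed fraud is expensive.
COST_FALSE_ALARM = 5.0
COST_MISSED_FRAUD = 100.0

# Two hypothetical models scored on the same 100 fraudulent transactions.
model_a = {"tp": 80, "fp": 20, "fn": 20}
model_b = {"tp": 90, "fp": 35, "fn": 10}

for name, m in (("A", model_a), ("B", model_b)):
    f1 = f1_score(m["tp"], m["fp"], m["fn"])
    cost = fraud_cost(m["fp"], m["fn"], COST_FALSE_ALARM, COST_MISSED_FRAUD)
    print(f"model {name}: F1={f1:.2f} cost={cost:.0f}")
```

Both models score an F-1 of 0.80, yet under the assumed costs model A loses 2100 and model B loses 1175. F-1 weights false positives and false negatives equally, so it cannot see the asymmetry that a cost-based metric captures.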