January 31, 2020

3043 words 15 mins read

Paper Group ANR 10

Learning Robust 3D Face Reconstruction and Discriminative Identity Representation. Modeling Named Entity Embedding Distribution into Hypersphere. PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model. Occlusions for Effective Data Augmentation in Image Classification. IMHO Fine-Tuning Improves Claim Detection. Maize Yield and …

Learning Robust 3D Face Reconstruction and Discriminative Identity Representation

Title Learning Robust 3D Face Reconstruction and Discriminative Identity Representation
Authors Yao Luo, Xiaoguang Tu, Mei Xie
Abstract 3D face reconstruction from a single 2D image is an important topic in computer vision. However, current reconstruction methods are usually insensitive to face identities and over-sensitive to facial poses, which may produce similar 3D geometries for faces of different identities, or different shapes for the same identity under different poses. In practice, their 3D estimates are either unstable across different photos of the same subject or over-regularized and too generic to distinguish face identities. In this paper, we propose a robust solution to this problem by carefully designing a novel Siamese Convolutional Neural Network (SCNN). Specifically, regarding the 3D Morphable face Model (3DMM) parameters of the same individual as the same class, we employ a contrastive loss to enlarge the inter-class distance and reduce the intra-class distance of the output 3DMM parameters. We also propose an identity loss to preserve the identity information of the same individual in the feature space. Trained with these two losses, our SCNN learns representations that are more discriminative for face identity and generalize better across pose variations. Experiments on the challenging 300W-LP and AFLW2000-3D databases show the effectiveness of our method in comparison with state-of-the-art approaches.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2019-05-16
URL https://arxiv.org/abs/1905.06505v1
PDF https://arxiv.org/pdf/1905.06505v1.pdf
PWC https://paperswithcode.com/paper/learning-robust-3d-face-reconstruction-and
Repo
Framework
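
The contrastive-loss idea above can be sketched as follows; this is an illustrative NumPy toy, where the margin and the 62-dimensional 3DMM parameter size are assumptions, not the paper's settings.

```python
import numpy as np

def contrastive_loss(p1, p2, same_identity, margin=1.0):
    """Contrastive loss on two predicted 3DMM parameter vectors.

    Pulls parameters of the same identity together and pushes
    different identities at least `margin` apart.
    """
    d = np.linalg.norm(p1 - p2)              # Euclidean distance in 3DMM parameter space
    if same_identity:
        return 0.5 * d ** 2                   # reduce intra-class distance
    return 0.5 * max(0.0, margin - d) ** 2    # enlarge inter-class distance

# Toy example: two 62-dim parameter vectors (dimensionality is illustrative)
rng = np.random.default_rng(0)
a, b = rng.normal(size=62), rng.normal(size=62)
print(contrastive_loss(a, b, same_identity=False))
```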

Modeling Named Entity Embedding Distribution into Hypersphere

Title Modeling Named Entity Embedding Distribution into Hypersphere
Authors Zhuosheng Zhang, Bingjie Tang, Zuchao Li, Hai Zhao
Abstract This work models the distribution of named entities by visualizing the topological structure of the embedding space, leading to the assumption that most, if not all, named entities (NEs) of a language tend to aggregate together and can be accommodated by a specific hypersphere in that space. We thus present a novel open definition of NEs which alleviates the obvious drawback of previous closed definitions that rely on a limited NE dictionary. We then show two applications of the proposed named entity hypersphere model. First, a generative adversarial network is used to learn a transformation matrix between two embedding spaces, which conveniently determines the named entity distribution in the target language and indicates the potential for fast named entity discovery using only the isomorphic relation between embedding spaces. Second, the named entity hypersphere model is directly integrated with various named entity recognition models over sentences to achieve state-of-the-art results. Assuming only that embeddings are available, we show a prior-knowledge-free approach to effectively depicting named entity distributions.
Tasks Named Entity Recognition
Published 2019-09-03
URL https://arxiv.org/abs/1909.01065v1
PDF https://arxiv.org/pdf/1909.01065v1.pdf
PWC https://paperswithcode.com/paper/modeling-named-entity-embedding-distribution
Repo
Framework
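
A minimal sketch of the open hypersphere definition described above, assuming synthetic embeddings; the center/radius fitting rule and the coverage quantile are illustrative choices, not the paper's procedure.

```python
import numpy as np

def fit_hypersphere(ne_embeddings, coverage=0.95):
    """Fit a center and radius covering most known named-entity embeddings."""
    center = ne_embeddings.mean(axis=0)
    dists = np.linalg.norm(ne_embeddings - center, axis=1)
    radius = np.quantile(dists, coverage)    # radius enclosing `coverage` of the NEs
    return center, radius

def is_named_entity(vec, center, radius):
    """Open definition: any embedding inside the ball counts as an NE."""
    return np.linalg.norm(vec - center) <= radius

# Toy data standing in for word embeddings (dimensions are illustrative)
rng = np.random.default_rng(1)
ne_vecs = rng.normal(loc=1.0, scale=0.3, size=(500, 50))
center, radius = fit_hypersphere(ne_vecs)
print(is_named_entity(rng.normal(loc=1.0, scale=0.3, size=50), center, radius))
```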

PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model

Title PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model
Authors Jieh-Sheng Lee, Jieh Hsiang
Abstract In this work we focus on fine-tuning a pre-trained BERT model and applying it to patent classification. When applied to a large dataset of over two million patents, our approach outperforms the previous state of the art, a CNN-based method using word embeddings. In addition, we focus on patent claims and ignore the other parts of patent documents. Our contributions include: (1) a new state-of-the-art method based on fine-tuning a pre-trained BERT model for patent classification, (2) a large dataset, USPTO-3M, at the CPC subclass level, with SQL statements that can be used by future researchers, and (3) showing that, contrary to conventional wisdom, patent claims alone are sufficient for the classification task.
Tasks Word Embeddings
Published 2019-05-14
URL https://arxiv.org/abs/1906.02124v2
PDF https://arxiv.org/pdf/1906.02124v2.pdf
PWC https://paperswithcode.com/paper/190602124
Repo
Framework
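
A hedged sketch of the fine-tuning setup with the Hugging Face transformers API; the checkpoint name, label count, and claim text are placeholders, and the paper's own training pipeline may differ.

```python
# Sketch only: pip install torch transformers
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

NUM_CPC_SUBCLASSES = 656           # illustrative; use the actual CPC subclass label count
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_CPC_SUBCLASSES)

claim = "A method for transmitting data over a wireless network, comprising..."
inputs = tokenizer(claim, truncation=True, max_length=512, return_tensors="pt")
label = torch.tensor([42])          # placeholder CPC subclass index

# One fine-tuning step on the claim text alone (no other patent sections)
outputs = model(**inputs, labels=label)
outputs.loss.backward()             # in practice, wrap this in an optimizer loop
```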

Occlusions for Effective Data Augmentation in Image Classification

Title Occlusions for Effective Data Augmentation in Image Classification
Authors Ruth Fong, Andrea Vedaldi
Abstract Deep networks for visual recognition are known to leverage “easy to recognise” portions of objects such as faces and distinctive texture patterns. The lack of a holistic understanding of objects may increase fragility and overfitting. In recent years, several papers have proposed to address this issue by means of occlusions as a form of data augmentation. However, successes have been limited to tasks such as weak localization and model interpretation, but no benefit was demonstrated on image classification on large-scale datasets. In this paper, we show that, by using a simple technique based on batch augmentation, occlusions as data augmentation can result in better performance on ImageNet for high-capacity models (e.g., ResNet50). We also show that varying amounts of occlusions used during training can be used to study the robustness of different neural network architectures.
Tasks Data Augmentation, Image Classification
Published 2019-10-23
URL https://arxiv.org/abs/1910.10651v2
PDF https://arxiv.org/pdf/1910.10651v2.pdf
PWC https://paperswithcode.com/paper/occlusions-for-effective-data-augmentation-in
Repo
Framework
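
A toy sketch of occlusion-based augmentation combined with batch augmentation (each image repeated in the batch with different occlusions); the patch size and repeat factor are illustrative, not the paper's settings.

```python
import numpy as np

def occlude(img, patch=56, rng=None):
    """Zero out a random square patch of a CHW image (Cutout-style occlusion)."""
    rng = rng or np.random.default_rng()
    c, h, w = img.shape
    y, x = rng.integers(0, h - patch), rng.integers(0, w - patch)
    out = img.copy()
    out[:, y:y + patch, x:x + patch] = 0.0
    return out

def batch_augment(images, repeats=4):
    """Batch augmentation: each image appears `repeats` times with different occlusions."""
    rng = np.random.default_rng(0)
    return np.stack([occlude(img, rng=rng) for img in images for _ in range(repeats)])

batch = np.random.rand(8, 3, 224, 224).astype(np.float32)
augmented = batch_augment(batch)
print(augmented.shape)   # (32, 3, 224, 224)
```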

IMHO Fine-Tuning Improves Claim Detection

Title IMHO Fine-Tuning Improves Claim Detection
Authors Tuhin Chakrabarty, Christopher Hidey, Kathleen McKeown
Abstract Claims are the central component of an argument. Detecting claims across different domains or data sets can often be challenging due to their varying conceptualization. We propose to alleviate this problem by fine-tuning a language model on a Reddit corpus of 5.5 million opinionated claims. These claims are self-labeled by their authors using the internet acronyms IMO/IMHO (in my (humble) opinion). Empirical results show that this approach improves state-of-the-art performance across four benchmark argumentation data sets by an average of 4 absolute F1 points in claim detection. As these data sets include diverse domains such as social media and student essays, this improvement demonstrates the robustness of fine-tuning on this novel corpus.
Tasks Language Modelling
Published 2019-05-16
URL https://arxiv.org/abs/1905.07000v1
PDF https://arxiv.org/pdf/1905.07000v1.pdf
PWC https://paperswithcode.com/paper/imho-fine-tuning-improves-claim-detection
Repo
Framework
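
A minimal sketch of the corpus-construction idea: comments are kept only if they contain the IMO/IMHO acronym, which is then stripped before fine-tuning. The regex and example are illustrative.

```python
import re

IMHO_PATTERN = re.compile(r"\b(imo|imho)\b[,:]?\s*", flags=re.IGNORECASE)

def extract_claim(comment):
    """Return the self-labeled claim if the comment contains IMO/IMHO, else None.

    Mirrors the corpus-construction idea described above: the acronym marks an
    opinionated claim and is removed from the text used for fine-tuning.
    """
    if IMHO_PATTERN.search(comment):
        return IMHO_PATTERN.sub("", comment).strip()
    return None

print(extract_claim("IMHO, the new policy will hurt small businesses."))
# -> "the new policy will hurt small businesses."
```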

Maize Yield and Nitrate Loss Prediction with Machine Learning Algorithms

Title Maize Yield and Nitrate Loss Prediction with Machine Learning Algorithms
Authors Mohsen Shahhosseini, Rafael A. Martinez-Feria, Guiping Hu, Sotirios V. Archontoulis
Abstract Pre-season prediction of crop production outcomes such as grain yields and N losses can provide insights to stakeholders when making decisions. Simulation models can assist in scenario planning, but their use is limited because of data requirements and long run times. Thus, there is a need for more computationally expedient approaches to scale up predictions. We evaluated the potential of five machine learning (ML) algorithms as meta-models for a cropping systems simulator (APSIM) to inform future decision-support tool development. We asked: 1) How well do ML meta-models predict maize yield and N losses using pre-season information? 2) How much data is needed to train ML algorithms to achieve acceptable predictions? 3) Which input data variables are most important for accurate prediction? and 4) Do ensembles of ML meta-models improve prediction? The simulated dataset included more than 3 million genotype, environment, and management scenarios. Random forests most accurately predicted maize yield and N loss at planting time, with an RRMSE of 14% and 55%, respectively. ML meta-models reasonably reproduced simulated maize yields but not N losses. They also differed in their sensitivity to the size of the training dataset. Across all ML models, yield prediction error decreased by 10-40% as the training dataset grew from 0.5 to 1.8 million data points, whereas N loss prediction error showed no consistent pattern. ML models also differed in their sensitivity to input variables. Averaged across all ML models, weather conditions, soil properties, management information, and initial conditions were roughly equally important for predicting yields. Modest prediction improvements resulted from ML ensembles. These results can help accelerate progress in coupling simulation models and ML toward developing dynamic decision-support tools for pre-season management.
Tasks
Published 2019-08-14
URL https://arxiv.org/abs/1908.06746v4
PDF https://arxiv.org/pdf/1908.06746v4.pdf
PWC https://paperswithcode.com/paper/maize-yield-and-nitrate-loss-prediction-with
Repo
Framework
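
A sketch of the meta-modelling workflow with scikit-learn on synthetic stand-in data; the features, model hyperparameters, and RRMSE figure are illustrative, not the study's configuration or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for simulated (weather, soil, management) -> yield scenarios
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 12))                      # pre-season input features
y = 10.0 + 1.5 * X[:, 0] - 0.8 * X[:, 3] + rng.normal(scale=0.5, size=5000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Relative RMSE of the meta-model on held-out scenarios
pred = rf.predict(X_te)
rrmse = np.sqrt(np.mean((pred - y_te) ** 2)) / y_te.mean() * 100
print(f"RRMSE: {rrmse:.1f}%")
```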

MRI Pulse Sequence Integration for Deep-Learning Based Brain Metastasis Segmentation

Title MRI Pulse Sequence Integration for Deep-Learning Based Brain Metastasis Segmentation
Authors Darvin Yi, Endre Grøvik, Michael Iv, Elizabeth Tong, Kyrre Eeg Emblem, Line Brennhaug Nilsen, Cathrine Saxhaug, Anna Latysheva, Kari Dolven Jacobsen, Åslaug Helland, Greg Zaharchuk, Daniel Rubin
Abstract Magnetic resonance (MR) imaging is an essential diagnostic tool in clinical medicine. Recently, a variety of deep learning methods have been applied to segmentation tasks in medical images, with promising results for computer-aided diagnosis. For MR images, effectively integrating different pulse sequences is important to optimize performance. However, the best way to integrate different pulse sequences remains unclear. In this study, we evaluate multiple architectural features and characterize their effects on the task of metastasis segmentation. Specifically, we consider (1) different pulse sequence integration schemas, (2) different modes of weight sharing for parallel network branches, and (3) a new approach for enabling robustness to missing pulse sequences. We find that levels of integration and modes of weight sharing that favor low variance work best in our regime of small data (n = 100). By adding an input-level dropout layer, we can preserve the overall performance of these networks while allowing for inference on inputs with missing pulse sequences. We illustrate not only the generalizability of the network but also the utility of this robustness when applying the trained model to data from a different center, which does not use the same pulse sequences. Finally, we apply network visualization methods to better understand which input features are most important for network performance. Together, these results provide a framework for building networks with enhanced robustness to missing data while maintaining comparable performance in medical imaging applications.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08775v1
PDF https://arxiv.org/pdf/1912.08775v1.pdf
PWC https://paperswithcode.com/paper/mri-pulse-sequence-integration-for-deep
Repo
Framework
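
A sketch of the input-level dropout idea described above, not the authors' architecture: whole pulse-sequence channels are randomly zeroed during training so the network tolerates missing sequences at inference.

```python
import numpy as np

def input_sequence_dropout(volume, drop_prob=0.25, rng=None):
    """Randomly drop entire pulse-sequence channels of a (C, H, W) input.

    Keeps at least one channel so the network always sees some signal.
    The drop probability is an illustrative choice.
    """
    rng = rng or np.random.default_rng()
    c = volume.shape[0]
    keep = rng.random(c) >= drop_prob
    if not keep.any():
        keep[rng.integers(c)] = True
    return volume * keep[:, None, None]

# Toy multi-sequence MR slice: 4 pulse sequences, 128x128
slice_ = np.random.rand(4, 128, 128).astype(np.float32)
out = input_sequence_dropout(slice_, rng=np.random.default_rng(0))
print(out.mean(axis=(1, 2)))   # dropped channels show up as zero means
```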

Scenarios and Recommendations for Ethical Interpretive AI

Title Scenarios and Recommendations for Ethical Interpretive AI
Authors John Licato, Zaid Marji, Sophia Abraham
Abstract Artificially intelligent systems, given a set of non-trivial ethical rules to follow, will inevitably be faced with scenarios which call into question the scope of those rules. In such cases, human reasoners typically will engage in interpretive reasoning, where interpretive arguments are used to support or attack claims that some rule should be understood a certain way. Artificially intelligent reasoners, however, currently lack the ability to carry out human-like interpretive reasoning, and we argue that bridging this gulf is of tremendous importance to human-centered AI. In order to better understand how future artificial reasoners capable of human-like interpretive reasoning must be developed, we have collected a dataset of ethical rules, scenarios designed to invoke interpretive reasoning, and interpretations of those scenarios. We perform a qualitative analysis of our dataset, and summarize our findings in the form of practical recommendations.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.01917v1
PDF https://arxiv.org/pdf/1911.01917v1.pdf
PWC https://paperswithcode.com/paper/scenarios-and-recommendations-for-ethical
Repo
Framework

Dynamic Fusion for Multimodal Data

Title Dynamic Fusion for Multimodal Data
Authors Gaurav Sahu, Olga Vechtomova
Abstract Effective fusion of data from multiple modalities, such as video, speech, and text, is challenging due to the heterogeneous nature of multimodal data. In this paper, we propose dynamic fusion techniques that model context from different modalities efficiently. Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide “how” to best combine the given multimodal features. We propose two networks: 1) a transfusion network, which learns to compress information from different modalities while preserving the context, and 2) a GAN-based network, which regularizes the learned latent space given context from complementary modalities. A quantitative evaluation on the tasks of machine translation and emotion recognition suggests that such adaptive networks are able to model context better than all existing methods.
Tasks Emotion Recognition, Machine Translation
Published 2019-11-10
URL https://arxiv.org/abs/1911.03821v1
PDF https://arxiv.org/pdf/1911.03821v1.pdf
PWC https://paperswithcode.com/paper/dynamic-fusion-for-multimodal-data
Repo
Framework
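
A rough PyTorch sketch of a transfusion-style fusion module: modality features are concatenated, compressed to a joint vector, and regularized to preserve their content. The layer sizes and the reconstruction loss are assumptions; the paper's networks and objectives differ in detail.

```python
import torch
import torch.nn as nn

class Transfusion(nn.Module):
    """Compress concatenated modality features into a joint vector and train it
    to preserve their content via a reconstruction term (an illustrative proxy
    for the context-preserving objective described above)."""
    def __init__(self, dims=(512, 128, 300), fused_dim=256):
        super().__init__()
        total = sum(dims)
        self.compress = nn.Linear(total, fused_dim)
        self.reconstruct = nn.Linear(fused_dim, total)

    def forward(self, video, speech, text):
        joint = torch.cat([video, speech, text], dim=-1)
        fused = torch.tanh(self.compress(joint))
        recon_loss = nn.functional.mse_loss(self.reconstruct(fused), joint)
        return fused, recon_loss

v, s, t = torch.randn(8, 512), torch.randn(8, 128), torch.randn(8, 300)
fused, loss = Transfusion()(v, s, t)
print(fused.shape, loss.item())
```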

A Decentralized Communication Policy for Multi Agent Multi Armed Bandit Problems

Title A Decentralized Communication Policy for Multi Agent Multi Armed Bandit Problems
Authors Pathmanathan Pankayaraj, D. H. S. Maithripala
Abstract This paper proposes a novel policy for a group of agents to, individually as well as collectively, solve a multi-armed bandit (MAB) problem. The policy relies solely on the information that an agent has obtained by sampling the options on its own and by communicating with neighbors. The option-selection policy is based on an Upper Confidence Bound (UCB) strategy, while the proposed communication strategy forces agents to communicate with other agents whom they believe are more likely to be exploring than exploiting. The overall strategy is shown to significantly outperform an independent Erdős-Rényi (ER) graph based random communication policy. The policy is shown to be cost effective in terms of communication and thus to scale easily to a large network of agents.
Tasks
Published 2019-10-07
URL https://arxiv.org/abs/1910.02635v3
PDF https://arxiv.org/pdf/1910.02635v3.pdf
PWC https://paperswithcode.com/paper/an-option-and-agent-selection-policy-with
Repo
Framework
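
A minimal single-agent UCB sketch of the option-selection side; the exploration constant is illustrative, and the paper's communication policy between agents is not modeled here.

```python
import math
import random

def ucb_select(counts, means, t, c=2.0):
    """Pick the arm maximizing estimated mean reward plus an exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm                        # sample every arm at least once
    scores = [means[a] + math.sqrt(c * math.log(t) / counts[a])
              for a in range(len(counts))]
    return max(range(len(counts)), key=scores.__getitem__)

# Toy run: 3 arms with Bernoulli rewards
probs, counts, means = [0.2, 0.5, 0.7], [0, 0, 0], [0.0, 0.0, 0.0]
random.seed(0)
for t in range(1, 1001):
    a = ucb_select(counts, means, t)
    r = 1.0 if random.random() < probs[a] else 0.0
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]    # incremental mean update
print(counts)    # most pulls should go to the best arm
```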

PIV-Based 3D Fluid Flow Reconstruction Using Light Field Camera

Title PIV-Based 3D Fluid Flow Reconstruction Using Light Field Camera
Authors Zhong Li, Jinwei Ye, Yu Ji, Hao Sheng, Jingyi Yu
Abstract Particle Imaging Velocimetry (PIV) estimates the flow of fluid by analyzing the motion of injected particles. The problem is challenging because the particles lie at different depths but have similar appearance, and tracking a large number of particles is particularly difficult. In this paper, we present a PIV solution that uses a densely sampled light field to reconstruct and track 3D particles. We exploit the refocusing capability and focal symmetry constraint of the light field for reliable particle depth estimation. We further propose a new motion-constrained optical flow estimation scheme by enforcing local motion rigidity and the Navier-Stokes constraint. Comprehensive experiments on synthetic and real data show that, using a single light field camera, our technique can recover dense and accurate 3D fluid flows in small to medium volumes.
Tasks Depth Estimation, Optical Flow Estimation
Published 2019-04-15
URL https://arxiv.org/abs/1904.06841v2
PDF https://arxiv.org/pdf/1904.06841v2.pdf
PWC https://paperswithcode.com/paper/piv-based-3d-fluid-flow-reconstruction-using
Repo
Framework

Low-cost Measurement of Industrial Shock Signals via Deep Learning Calibration

Title Low-cost Measurement of Industrial Shock Signals via Deep Learning Calibration
Authors Houpu Yao, Jingjing Wen, Yi Ren, Bin Wu, Ze Ji
Abstract Special high-end sensors with expensive hardware are usually needed to measure shock signals with high accuracy. In this paper, we show that cheap low-end sensors calibrated by deep neural networks can also measure high-g shocks accurately. First, we perform drop shock tests to collect a dataset of shock signals measured by sensors of different fidelity. Second, we propose a novel network to effectively learn both the signal peak and the overall shape. The results show that the proposed network is able to map low-end shock signals to their high-end counterparts with satisfactory accuracy. To the best of our knowledge, this is the first work to apply deep learning techniques to calibrate shock sensors.
Tasks Calibration
Published 2019-02-07
URL http://arxiv.org/abs/1902.02829v1
PDF http://arxiv.org/pdf/1902.02829v1.pdf
PWC https://paperswithcode.com/paper/low-cost-measurement-of-industrial-shock
Repo
Framework
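
A PyTorch sketch of the calibration idea: a small 1-D network maps a low-end sensor trace to its high-end counterpart, trained with a loss that emphasizes the signal peak. The layer sizes and peak weighting are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class ShockCalibrator(nn.Module):
    """Map a low-end sensor trace (1 channel) to a high-end reference trace."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=9, padding=4))

    def forward(self, x):
        return self.net(x)

def peak_weighted_loss(pred, target, peak_weight=5.0):
    """MSE that up-weights samples near the signal peak, so the network learns
    both the overall shape and the peak amplitude."""
    weights = 1.0 + peak_weight * (target.abs() / target.abs().max())
    return (weights * (pred - target) ** 2).mean()

low = torch.randn(4, 1, 1024)      # low-end sensor traces (toy data)
high = torch.randn(4, 1, 1024)     # paired high-end references (toy data)
loss = peak_weighted_loss(ShockCalibrator()(low), high)
loss.backward()                    # in practice, wrap in an optimizer loop
```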

RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search

Title RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search
Authors Sam Green, Craig M. Vineyard, Ryan Helinski, Çetin Kaya Koç
Abstract Early neural network architectures were designed by so-called “grad student descent”. Since then, the field of Neural Architecture Search (NAS) has developed with the goal of algorithmically designing architectures tailored to a dataset of interest. Recently, gradient-based NAS approaches have been created to rapidly perform the search. Gradient-based approaches impose more structure on the search, compared to alternative NAS methods, enabling faster search-phase optimization. In the real world, neural architecture performance is measured by more than just high accuracy. There is an increasing need for efficient neural architectures, where resources such as model size or latency must also be considered. Gradient-based NAS is also suitable for such multi-objective optimization. In this work we extend a popular gradient-based NAS method to support one or more resource costs. We then perform in-depth analysis on the discovery of architectures satisfying single-resource constraints for classification of CIFAR-10.
Tasks Neural Architecture Search
Published 2019-11-08
URL https://arxiv.org/abs/1911.05704v1
PDF https://arxiv.org/pdf/1911.05704v1.pdf
PWC https://paperswithcode.com/paper/rapdarts-resource-aware-progressive
Repo
Framework
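
A sketch of how a differentiable resource penalty can be added to a gradient-based NAS objective, as described above; the penalty weight and per-operation costs are illustrative, not the paper's values.

```python
import torch
import torch.nn.functional as F

def resource_aware_loss(task_loss, alpha, op_costs, lam=0.1):
    """Add a differentiable resource penalty to a gradient-based NAS objective.

    The expected cost of an edge is the softmax-weighted sum of per-operation
    costs (e.g. latency or parameter count); `lam` trades it off against the
    task loss.
    """
    weights = F.softmax(alpha, dim=-1)           # architecture mixing weights
    expected_cost = (weights * op_costs).sum()   # differentiable resource term
    return task_loss + lam * expected_cost

alpha = torch.zeros(5, requires_grad=True)            # 5 candidate operations
op_costs = torch.tensor([1.0, 2.5, 4.0, 0.5, 3.0])    # relative costs (illustrative)
loss = resource_aware_loss(torch.tensor(0.7), alpha, op_costs)
loss.backward()
print(alpha.grad)    # gradients steer the search toward cheaper operations
```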

Non-Lambertian Surface Shape and Reflectance Reconstruction Using Concentric Multi-Spectral Light Field

Title Non-Lambertian Surface Shape and Reflectance Reconstruction Using Concentric Multi-Spectral Light Field
Authors Mingyuan Zhou, Yu Ji, Yuqi Ding, Jinwei Ye, S. Susan Young, Jingyi Yu
Abstract Recovering the shape and reflectance of non-Lambertian surfaces remains a challenging problem in computer vision, since the view-dependent appearance invalidates the traditional photo-consistency constraint. In this paper, we introduce a novel concentric multi-spectral light field (CMSLF) design that is able to recover the shape and reflectance of surfaces of arbitrary materials in one shot. Our CMSLF system consists of an array of cameras arranged on concentric circles, where each ring captures a specific spectrum. Coupled with a multi-spectral ring light, we are able to sample viewpoint and lighting variations in a single shot via spectral multiplexing. We further show that such a concentric camera/light setting results in a unique pattern of specular changes across views that enables robust depth estimation. We formulate a physics-based reflectance model on the CMSLF to estimate depth and a multi-spectral reflectance map without imposing any surface prior. Extensive synthetic and real experiments show that our method outperforms state-of-the-art light field-based techniques, especially in non-Lambertian scenes.
Tasks Depth Estimation
Published 2019-04-09
URL http://arxiv.org/abs/1904.04875v2
PDF http://arxiv.org/pdf/1904.04875v2.pdf
PWC https://paperswithcode.com/paper/non-lambertian-surface-shape-and-reflectance
Repo
Framework

Defogging Kinect: Simultaneous Estimation of Object Region and Depth in Foggy Scenes

Title Defogging Kinect: Simultaneous Estimation of Object Region and Depth in Foggy Scenes
Authors Yuki Fujimura, Motoharu Sonogashira, Masaaki Iiyama
Abstract Three-dimensional (3D) reconstruction and scene depth estimation from 2-dimensional (2D) images are major tasks in computer vision. However, conventional 3D reconstruction techniques become challenging in participating media such as murky water, fog, or smoke. We have developed a method that uses a time-of-flight (ToF) camera to estimate an object region and depth in participating media simultaneously. The scattering component is saturated, so it does not depend on the scene depth; moreover, signals bouncing off distant points are negligible due to light attenuation in the participating media, so observations of such points contain only the scattering component. These phenomena enable us to estimate the scattering component in an object region from a background that contains only the scattering component. The problem is formulated as robust estimation in which the object region is treated as outliers, enabling the simultaneous estimation of object region and depth via an iteratively reweighted least squares (IRLS) optimization scheme. We demonstrate the effectiveness of the proposed method using images captured with a Kinect v2 in real foggy scenes and evaluate its applicability with synthesized data.
Tasks 3D Reconstruction, Depth Estimation
Published 2019-04-01
URL http://arxiv.org/abs/1904.00558v1
PDF http://arxiv.org/pdf/1904.00558v1.pdf
PWC https://paperswithcode.com/paper/defogging-kinect-simultaneous-estimation-of
Repo
Framework
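
A toy IRLS sketch of the robust-estimation idea: the scattering component is estimated as a robust mean while object pixels receive small weights as outliers. The data and weighting rule are illustrative stand-ins for the paper's formulation.

```python
import numpy as np

def irls_robust_mean(obs, iters=20, eps=1e-3):
    """Estimate the scattering component as a robust mean of observations,
    down-weighting pixels that deviate from it (the object region, i.e. the
    'outliers'). A toy stand-in for the full IRLS formulation."""
    est = obs.mean()
    w = np.ones_like(obs)
    for _ in range(iters):
        est = (w * obs).sum() / w.sum()      # weighted least-squares update
        r = np.abs(obs - est)
        w = 1.0 / np.maximum(r, eps)         # re-weight: large residual -> small weight
    return est, w

# Background pixels near 0.8 (pure scattering) plus object pixels near 0.3
obs = np.concatenate([np.random.normal(0.8, 0.02, 900),
                      np.random.normal(0.3, 0.05, 100)])
est, w = irls_robust_mean(obs)
print(round(est, 3))   # close to 0.8; the low weights mark the object region
```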