January 31, 2020


Paper Group ANR 10


Learning Robust 3D Face Reconstruction and Discriminative Identity Representation


Title	Learning Robust 3D Face Reconstruction and Discriminative Identity Representation
Authors	Yao Luo, Xiaoguang Tu, Mei Xie
Abstract	3D face reconstruction from a single 2D image is a very important topic in computer vision. However, current reconstruction methods are usually insensitive to face identities and over-sensitive to facial poses, which may result in similar 3D geometries for faces of different identities, or in different shapes for the same identity under different poses. When such methods are applied in practice, their 3D estimates either change across different photos of the same subject or are too over-regularized and generic to distinguish face identities. In this paper, we propose a robust solution to this problem by carefully designing a novel Siamese Convolutional Neural Network (SCNN). Specifically, regarding the 3D Morphable face Model (3DMM) parameters of the same individual as the same class, we employ the contrastive loss to enlarge the inter-class distance and meanwhile reduce the intra-class distance for the output 3DMM parameters. We also propose an identity loss to preserve the identity information for the same individual in the feature space. Training with these two losses, our SCNN could learn representations that are more discriminative for face identity and generalizable for pose variants. Experiments on the challenging databases 300W-LP and AFLW2000-3D have shown the effectiveness of our method in comparison with state-of-the-art methods.
Tasks	3D Face Reconstruction, Face Reconstruction
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06505v1
PDF	https://arxiv.org/pdf/1905.06505v1.pdf
PWC	https://paperswithcode.com/paper/learning-robust-3d-face-reconstruction-and
Repo
Framework
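
The contrastive loss mentioned in the abstract can be illustrated with a minimal sketch. This uses the commonly cited pairwise form (squared distance for same-class pairs, squared hinge on a margin for different-class pairs); the abstract does not give the authors' exact formulation, so the function name, the `margin` default, and the use of plain Euclidean distance on parameter vectors are all assumptions.

```python
import math


def contrastive_loss(params_a, params_b, same_identity, margin=1.0):
    """Pairwise contrastive loss in its common textbook form (assumed here).

    params_a, params_b: two parameter vectors (hypothetical stand-ins for
    the 3DMM outputs of the two Siamese branches).
    same_identity: True if both vectors belong to the same person.
    """
    dist = math.dist(params_a, params_b)  # Euclidean distance
    if same_identity:
        # Same class: pull the pair together.
        return dist ** 2
    # Different class: push apart until the distance reaches the margin.
    return max(0.0, margin - dist) ** 2
```

A same-identity pair is penalized in proportion to its squared distance, while a different-identity pair contributes nothing once it is at least `margin` apart.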

Modeling Named Entity Embedding Distribution into Hypersphere


Title	Modeling Named Entity Embedding Distribution into Hypersphere
Authors	Zhuosheng Zhang, Bingjie Tang, Zuchao Li, Hai Zhao
Abstract	This work models named entity distribution by visualizing the topological structure of the embedding space, under the assumption that most, if not all, named entities (NEs) for a language tend to aggregate together and can be accommodated by a specific hypersphere in embedding space. Thus we present a novel open definition for NE which alleviates the obvious drawback of the previous closed NE definition with a limited NE dictionary. Then, we show two applications of the proposed named entity hypersphere model. First, we use a generative adversarial neural network to learn a transformation matrix between two embedding spaces, which results in a convenient determination of named entity distribution in the target language, indicating the potential of fast named entity discovery using only the isomorphic relation between embedding spaces. Second, the named entity hypersphere model is directly integrated with various named entity recognition models over sentences to achieve state-of-the-art results. Assuming only that embeddings are available, we show a prior-knowledge-free approach to effective named entity distribution depiction.
Tasks	Named Entity Recognition
Published	2019-09-03
URL	https://arxiv.org/abs/1909.01065v1
PDF	https://arxiv.org/pdf/1909.01065v1.pdf
PWC	https://paperswithcode.com/paper/modeling-named-entity-embedding-distribution
Repo
Framework

PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model


Title	PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model
Authors	Jieh-Sheng Lee, Jieh Hsiang
Abstract	In this work we focus on fine-tuning a pre-trained BERT model and applying it to patent classification. When applied to large datasets of over two million patents, our approach outperforms the previous state of the art, an approach using a CNN with word embeddings. In addition, we focus on patent claims without other parts of patent documents. Our contributions include: (1) a new state-of-the-art method based on a pre-trained BERT model and fine-tuning for patent classification, (2) a large dataset, USPTO-3M, at the CPC subclass level with SQL statements that can be used by future researchers, (3) showing that patent claims alone are sufficient for the classification task, in contrast to conventional wisdom.
Tasks	Word Embeddings
Published	2019-05-14
URL	https://arxiv.org/abs/1906.02124v2
PDF	https://arxiv.org/pdf/1906.02124v2.pdf
PWC	https://paperswithcode.com/paper/190602124
Repo
Framework

Occlusions for Effective Data Augmentation in Image Classification


Title	Occlusions for Effective Data Augmentation in Image Classification
Authors	Ruth Fong, Andrea Vedaldi
Abstract	Deep networks for visual recognition are known to leverage “easy to recognise” portions of objects such as faces and distinctive texture patterns. The lack of a holistic understanding of objects may increase fragility and overfitting. In recent years, several papers have proposed to address this issue by means of occlusions as a form of data augmentation. However, successes have been limited to tasks such as weak localization and model interpretation, but no benefit was demonstrated on image classification on large-scale datasets. In this paper, we show that, by using a simple technique based on batch augmentation, occlusions as data augmentation can result in better performance on ImageNet for high-capacity models (e.g., ResNet50). We also show that varying amounts of occlusions used during training can be used to study the robustness of different neural network architectures.
Tasks	Data Augmentation, Image Classification
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10651v2
PDF	https://arxiv.org/pdf/1910.10651v2.pdf
PWC	https://paperswithcode.com/paper/occlusions-for-effective-data-augmentation-in
Repo
Framework

IMHO Fine-Tuning Improves Claim Detection


Title	IMHO Fine-Tuning Improves Claim Detection
Authors	Tuhin Chakrabarty, Christopher Hidey, Kathleen McKeown
Abstract	Claims are the central component of an argument. Detecting claims across different domains or data sets can often be challenging due to their varying conceptualization. We propose to alleviate this problem by fine-tuning a language model using a Reddit corpus of 5.5 million opinionated claims. These claims are self-labeled by their authors using the internet acronyms IMO/IMHO (in my (humble) opinion). Empirical results show that using this approach improves the state-of-the-art performance across four benchmark argumentation data sets by an average of 4 absolute F1 points in claim detection. As these data sets include diverse domains such as social media and student essays, this improvement demonstrates the robustness of fine-tuning on this novel corpus.
Tasks	Language Modelling
Published	2019-05-16
URL	https://arxiv.org/abs/1905.07000v1
PDF	https://arxiv.org/pdf/1905.07000v1.pdf
PWC	https://paperswithcode.com/paper/imho-fine-tuning-improves-claim-detection
Repo
Framework

Maize Yield and Nitrate Loss Prediction with Machine Learning Algorithms


Title	Maize Yield and Nitrate Loss Prediction with Machine Learning Algorithms
Authors	Mohsen Shahhosseini, Rafael A. Martinez-Feria, Guiping Hu, Sotirios V. Archontoulis
Abstract	Pre-season prediction of crop production outcomes such as grain yields and N losses can provide insights to stakeholders when making decisions. Simulation models can assist in scenario planning, but their use is limited because of data requirements and long run times. Thus, there is a need for more computationally expedient approaches to scale up predictions. We evaluated the potential of five machine learning (ML) algorithms as meta-models for a cropping systems simulator (APSIM) to inform future decision-support tool development. We asked: 1) How well do ML meta-models predict maize yield and N losses using pre-season information? 2) How many data are needed to train ML algorithms to achieve acceptable predictions? 3) Which input data variables are most important for accurate prediction? and 4) Do ensembles of ML meta-models improve prediction? The simulated dataset included more than 3 million genotype, environment and management scenarios. Random forests most accurately predicted maize yield and N loss at planting time, with an RRMSE of 14% and 55%, respectively. ML meta-models reasonably reproduced simulated maize yields but not N loss. They also differed in their sensitivities to the size of the training dataset. Across all ML models, yield prediction error decreased by 10-40% as the training dataset increased from 0.5 to 1.8 million data points, whereas N loss prediction error showed no consistent pattern. ML models also differed in their sensitivities to input variables. Averaged across all ML models, weather conditions, soil properties, management information and initial conditions were roughly equally important when predicting yields. Modest prediction improvements resulted from ML ensembles. These results can help accelerate progress in coupling simulation models and ML toward developing dynamic decision support tools for pre-season management.
Tasks
Published	2019-08-14
URL	https://arxiv.org/abs/1908.06746v4
PDF	https://arxiv.org/pdf/1908.06746v4.pdf
PWC	https://paperswithcode.com/paper/maize-yield-and-nitrate-loss-prediction-with
Repo
Framework
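
The RRMSE figures quoted in the abstract can be read with the help of a small sketch. One common definition of relative RMSE divides the RMSE by the mean of the observed values and expresses the result as a percentage; the abstract does not state which normalization the authors used, so this definition and the function name are assumptions.

```python
import math


def rrmse_percent(observed, predicted):
    """Relative RMSE in percent: 100 * RMSE / mean(observed) (assumed definition)."""
    n = len(observed)
    # Root mean squared error between observations and predictions.
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    mean_observed = sum(observed) / n
    return 100.0 * rmse / mean_observed
```

Under this definition, predictions that miss a mean value of 10 by 2 units in each direction give an RRMSE of 20%.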

MRI Pulse Sequence Integration for Deep-Learning Based Brain Metastasis Segmentation


Title	MRI Pulse Sequence Integration for Deep-Learning Based Brain Metastasis Segmentation
Authors	Darvin Yi, Endre Grøvik, Michael Iv, Elizabeth Tong, Kyrre Eeg Emblem, Line Brennhaug Nilsen, Cathrine Saxhaug, Anna Latysheva, Kari Dolven Jacobsen, Åslaug Helland, Greg Zaharchuk, Daniel Rubin
Abstract	Magnetic resonance (MR) imaging is an essential diagnostic tool in clinical medicine. Recently, a variety of deep learning methods have been applied to segmentation tasks in medical images, with promising results for computer-aided diagnosis. For MR images, effectively integrating different pulse sequences is important to optimize performance. However, the best way to integrate different pulse sequences remains unclear. In this study, we evaluate multiple architectural features and characterize their effects in the task of metastasis segmentation. Specifically, we consider (1) different pulse sequence integration schemas, (2) different modes of weight sharing for parallel network branches, and (3) a new approach for enabling robustness to missing pulse sequences. We find that levels of integration and modes of weight sharing that favor low variance work best in our regime of small data (n = 100). By adding an input-level dropout layer, we could preserve the overall performance of these networks while allowing for inference on inputs with missing pulse sequences. We illustrate not only the generalizability of the network but also the utility of this robustness when applying the trained model to data from a different center, which does not use the same pulse sequences. Finally, we apply network visualization methods to better understand which input features are most important for network performance. Together, these results provide a framework for building networks with enhanced robustness to missing data while maintaining comparable performance in medical imaging applications.
Tasks
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08775v1
PDF	https://arxiv.org/pdf/1912.08775v1.pdf
PWC	https://paperswithcode.com/paper/mri-pulse-sequence-integration-for-deep
Repo
Framework
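
The idea of an input-level dropout layer can be sketched as zeroing whole input channels at random during training, so the network learns to cope when a pulse sequence is absent. This is only an illustration of the general idea: the function name, the flat-list representation of a channel, and the rule that at least one channel is always kept are assumptions, not details taken from the paper.

```python
import random


def drop_input_channels(channels, drop_prob, rng):
    """Zero out whole input channels at random (illustrative sketch).

    channels: list of channels, each a flat list of floats (hypothetical
    stand-in for one pulse sequence per channel).
    drop_prob: probability of dropping each channel independently.
    rng: a random.Random instance, passed in for reproducibility.
    """
    keep = [rng.random() >= drop_prob for _ in channels]
    if not any(keep):
        # Assumption: never drop every channel, so some signal remains.
        keep[rng.randrange(len(channels))] = True
    return [ch if k else [0.0] * len(ch) for ch, k in zip(channels, keep)]
```

At inference time, a missing sequence would then be represented the same way, as an all-zero channel.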

Scenarios and Recommendations for Ethical Interpretive AI


Title	Scenarios and Recommendations for Ethical Interpretive AI
Authors	John Licato, Zaid Marji, Sophia Abraham
Abstract	Artificially intelligent systems, given a set of non-trivial ethical rules to follow, will inevitably be faced with scenarios which call into question the scope of those rules. In such cases, human reasoners typically will engage in interpretive reasoning, where interpretive arguments are used to support or attack claims that some rule should be understood a certain way. Artificially intelligent reasoners, however, currently lack the ability to carry out human-like interpretive reasoning, and we argue that bridging this gulf is of tremendous importance to human-centered AI. In order to better understand how future artificial reasoners capable of human-like interpretive reasoning must be developed, we have collected a dataset of ethical rules, scenarios designed to invoke interpretive reasoning, and interpretations of those scenarios. We perform a qualitative analysis of our dataset, and summarize our findings in the form of practical recommendations.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01917v1
PDF	https://arxiv.org/pdf/1911.01917v1.pdf
PWC	https://paperswithcode.com/paper/scenarios-and-recommendations-for-ethical
Repo
Framework

Dynamic Fusion for Multimodal Data


Title	Dynamic Fusion for Multimodal Data
Authors	Gaurav Sahu, Olga Vechtomova
Abstract	Effective fusion of data from multiple modalities, such as video, speech, and text, is challenging owing to the heterogeneous nature of multimodal data. In this paper, we propose dynamic fusion techniques that model context from different modalities efficiently. Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide “how” to combine given multimodal features in the most optimal way. We propose two networks: 1) a transfusion network, which learns to compress information from different modalities while preserving the context, and 2) a GAN-based network, which regularizes the learned latent space given context from complementary modalities. A quantitative evaluation on the tasks of machine translation and emotion recognition suggests that such adaptive networks are able to model context better than all existing methods.
Tasks	Emotion Recognition, Machine Translation
Published	2019-11-10
URL	https://arxiv.org/abs/1911.03821v1
PDF	https://arxiv.org/pdf/1911.03821v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-fusion-for-multimodal-data
Repo
Framework

A Decentralized Communication Policy for Multi Agent Multi Armed Bandit Problems


Title	A Decentralized Communication Policy for Multi Agent Multi Armed Bandit Problems
Authors	Pathmanathan Pankayaraj, D. H. S. Maithripala
Abstract	This paper proposes a novel policy for a group of agents to, individually as well as collectively, solve a multi armed bandit (MAB) problem. The policy relies solely on the information that an agent has obtained through sampling of the options on its own and through communication with neighbors. The option selection policy is based on an Upper Confidence Bound (UCB) strategy, while the proposed communication strategy forces agents to communicate with other agents who they believe are more likely to be exploring than exploiting. The overall strategy is shown to significantly outperform an independent Erdős-Rényi (ER) graph based random communication policy. The policy is shown to be cost effective in terms of communication and thus to be easily scalable to a large network of agents.
Tasks
Published	2019-10-07
URL	https://arxiv.org/abs/1910.02635v3
PDF	https://arxiv.org/pdf/1910.02635v3.pdf
PWC	https://paperswithcode.com/paper/an-option-and-agent-selection-policy-with
Repo
Framework
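
The UCB-style option selection mentioned in the abstract can be illustrated for a single agent with the classic UCB1 rule. This is a sketch of the textbook rule only: the abstract does not specify the authors' exact index or how communicated samples enter it, so the function name and the exploration term `sqrt(2 ln t / n)` are assumptions.

```python
import math


def ucb1_select(counts, means, t):
    """Choose an arm by the textbook UCB1 rule (assumed variant).

    counts: number of times each arm has been sampled so far.
    means: empirical mean reward of each arm.
    t: total number of samples taken so far (t >= 1).
    """
    # Sample every arm once before using the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    indices = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    # Return the position of the largest index.
    return max(range(len(indices)), key=indices.__getitem__)
```

Arms that have been sampled rarely get a large bonus, which is what separates exploring from exploiting in this family of policies.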

PIV-Based 3D Fluid Flow Reconstruction Using Light Field Camera


Title	PIV-Based 3D Fluid Flow Reconstruction Using Light Field Camera
Authors	Zhong Li, Jinwei Ye, Yu Ji, Hao Sheng, Jingyi Yu
Abstract	Particle Imaging Velocimetry (PIV) estimates the flow of fluid by analyzing the motion of injected particles. The problem is challenging as the particles lie at different depths but have similar appearance, and tracking a large number of particles is particularly difficult. In this paper, we present a PIV solution that uses a densely sampled light field to reconstruct and track 3D particles. We exploit the refocusing capability and focal symmetry constraint of the light field for reliable particle depth estimation. We further propose a new motion-constrained optical flow estimation scheme by enforcing local motion rigidity and the Navier-Stokes constraint. Comprehensive experiments on synthetic and real data show that, using a single light field camera, our technique can recover dense and accurate 3D fluid flows in small to medium volumes.
Tasks	Depth Estimation, Optical Flow Estimation
Published	2019-04-15
URL	https://arxiv.org/abs/1904.06841v2
PDF	https://arxiv.org/pdf/1904.06841v2.pdf
PWC	https://paperswithcode.com/paper/piv-based-3d-fluid-flow-reconstruction-using
Repo
Framework

Low-cost Measurement of Industrial Shock Signals via Deep Learning Calibration


Title	Low-cost Measurement of Industrial Shock Signals via Deep Learning Calibration
Authors	Houpu Yao, Jingjing Wen, Yi Ren, Bin Wu, Ze Ji
Abstract	Special high-end sensors with expensive hardware are usually needed to measure shock signals with high accuracy. In this paper, we show that cheap low-end sensors calibrated by deep neural networks are also capable of measuring high-g shocks accurately. First, we perform drop shock tests to collect a dataset of shock signals measured by sensors of different fidelity. Second, we propose a novel network to effectively learn both the signal peak and overall shape. The results show that the proposed network is capable of mapping low-end shock signals to their high-end counterparts with satisfactory accuracy. To the best of our knowledge, this is the first work to apply deep learning techniques to calibrate shock sensors.
Tasks	Calibration
Published	2019-02-07
URL	http://arxiv.org/abs/1902.02829v1
PDF	http://arxiv.org/pdf/1902.02829v1.pdf
PWC	https://paperswithcode.com/paper/low-cost-measurement-of-industrial-shock
Repo
Framework

RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search


Title	RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search
Authors	Sam Green, Craig M. Vineyard, Ryan Helinski, Çetin Kaya Koç
Abstract	Early neural network architectures were designed by so-called “grad student descent”. Since then, the field of Neural Architecture Search (NAS) has developed with the goal of algorithmically designing architectures tailored for a dataset of interest. Recently, gradient-based NAS approaches have been created to rapidly perform the search. Gradient-based approaches impose more structure on the search, compared to alternative NAS methods, enabling faster search phase optimization. In the real world, neural architecture performance is measured by more than just high accuracy. There is an increasing need for efficient neural architectures, where resources such as model size or latency must also be considered. Gradient-based NAS is also suitable for such multi-objective optimization. In this work we extend a popular gradient-based NAS method to support one or more resource costs. We then perform in-depth analysis on the discovery of architectures satisfying single-resource constraints for classification of CIFAR-10.
Tasks	Neural Architecture Search
Published	2019-11-08
URL	https://arxiv.org/abs/1911.05704v1
PDF	https://arxiv.org/pdf/1911.05704v1.pdf
PWC	https://paperswithcode.com/paper/rapdarts-resource-aware-progressive
Repo
Framework

Non-Lambertian Surface Shape and Reflectance Reconstruction Using Concentric Multi-Spectral Light Field


Title	Non-Lambertian Surface Shape and Reflectance Reconstruction Using Concentric Multi-Spectral Light Field
Authors	Mingyuan Zhou, Yu Ji, Yuqi Ding, Jinwei Ye, S. Susan Young, Jingyi Yu
Abstract	Recovering the shape and reflectance of non-Lambertian surfaces remains a challenging problem in computer vision since the view-dependent appearance invalidates the traditional photo-consistency constraint. In this paper, we introduce a novel concentric multi-spectral light field (CMSLF) design that is able to recover the shape and reflectance of surfaces with arbitrary material in one shot. Our CMSLF system consists of an array of cameras arranged on concentric circles where each ring captures a specific spectrum. Coupled with a multi-spectral ring light, we are able to sample viewpoint and lighting variations in a single shot via spectral multiplexing. We further show that such a concentric camera/light setting results in a unique pattern of specular changes across views that enables robust depth estimation. We formulate a physically based reflectance model on CMSLF to estimate depth and a multi-spectral reflectance map without imposing any surface prior. Extensive synthetic and real experiments show that our method outperforms state-of-the-art light field-based techniques, especially in non-Lambertian scenes.
Tasks	Depth Estimation
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04875v2
PDF	http://arxiv.org/pdf/1904.04875v2.pdf
PWC	https://paperswithcode.com/paper/non-lambertian-surface-shape-and-reflectance
Repo
Framework

Defogging Kinect: Simultaneous Estimation of Object Region and Depth in Foggy Scenes


Title	Defogging Kinect: Simultaneous Estimation of Object Region and Depth in Foggy Scenes
Authors	Yuki Fujimura, Motoharu Sonogashira, Masaaki Iiyama
Abstract	Three-dimensional (3D) reconstruction and scene depth estimation from 2-dimensional (2D) images are major tasks in computer vision. However, using conventional 3D reconstruction techniques becomes challenging in participating media such as murky water, fog, or smoke. We have developed a method that uses a time-of-flight (ToF) camera to estimate an object region and depth in participating media simultaneously. The scattering component is saturated, so it does not depend on the scene depth, and received signals bouncing off distant points are negligible due to light attenuation in the participating media, so the observation of such a point contains only a scattering component. These phenomena enable us to estimate the scattering component in an object region from a background that only contains the scattering component. The problem is formulated as robust estimation where the object region is regarded as outliers, and it enables the simultaneous estimation of an object region and depth on the basis of an iteratively reweighted least squares (IRLS) optimization scheme. We demonstrate the effectiveness of the proposed method using captured images from a Kinect v2 in real foggy scenes and evaluate the applicability with synthesized data.
Tasks	3D Reconstruction, Depth Estimation
Published	2019-04-01
URL	http://arxiv.org/abs/1904.00558v1
PDF	http://arxiv.org/pdf/1904.00558v1.pdf
PWC	https://paperswithcode.com/paper/defogging-kinect-simultaneous-estimation-of
Repo
Framework
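
The robust-estimation idea behind IRLS can be shown on a toy problem: estimating a single constant from samples in which a few values are outliers. This is not the paper's model; the function name, the Cauchy-style weight function, and the `scale` parameter are assumptions chosen only to show how reweighting suppresses outliers.

```python
def irls_location(values, scale=1.0, iterations=50):
    """Robustly estimate one constant from values containing outliers.

    Toy illustration of iteratively reweighted least squares: each pass
    solves a weighted least-squares problem (here, a weighted mean) and
    then recomputes the weights from the residuals.
    """
    estimate = sum(values) / len(values)  # start from the ordinary mean
    for _ in range(iterations):
        # Cauchy-style weights (assumed): large residuals get small weight.
        weights = [1.0 / (1.0 + ((v - estimate) / scale) ** 2) for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate
```

Samples that end up with very small weights are the ones treated as outliers, which is the role the object region plays in the formulation described above.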