February 2, 2020


Paper Group AWR 7


Unsupervised Clinical Language Translation


Title	Unsupervised Clinical Language Translation
Authors	Wei-Hung Weng, Yu-An Chung, Peter Szolovits
Abstract	As patients’ access to their doctors’ clinical notes becomes common, translating professional, clinical jargon to layperson-understandable language is essential to improve patient-clinician communication. Such translation yields better clinical outcomes by enhancing patients’ understanding of their own health conditions, and thus improving patients’ involvement in their own care. Existing research has used dictionary-based word replacement or definition insertion to approach the need. However, these methods are limited by expert curation, which is hard to scale and has trouble generalizing to unseen datasets that do not share an overlapping vocabulary. In contrast, we approach the clinical word and sentence translation problem in a completely unsupervised manner. We show that a framework using representation learning, bilingual dictionary induction and statistical machine translation yields the best precision at 10 of 0.827 on professional-to-consumer word translation, and mean opinion scores of 4.10 and 4.28 out of 5 for clinical correctness and layperson readability, respectively, on sentence translation. Our fully-unsupervised strategy overcomes the curation problem, and the clinically meaningful evaluation reduces biases from inappropriate evaluators, which are critical in clinical machine learning.
Tasks	Clinical Language Translation, Machine Translation, Representation Learning
Published	2019-02-04
URL	https://arxiv.org/abs/1902.01177v2
PDF	https://arxiv.org/pdf/1902.01177v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-clinical-language-translation
Repo	https://github.com/ckbjimmy/p2c
Framework	pytorch
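
The "precision at 10" figure quoted in the abstract can be read as the share of professional terms for which an accepted consumer translation appears among the top 10 ranked candidates. The sketch below assumes that common dictionary-induction definition; the function names and the toy vocabulary are hypothetical and are not taken from the paper or its repository.

```python
def precision_at_k(ranked, gold, k=10):
    """1.0 if any of the top-k ranked candidates is an accepted translation."""
    return float(any(candidate in gold for candidate in ranked[:k]))


def mean_precision_at_k(predictions, gold_map, k=10):
    """Average precision@k over every source term that has a gold entry."""
    scores = [
        precision_at_k(predictions.get(term, []), gold_map[term], k)
        for term in gold_map
    ]
    return sum(scores) / len(scores)


# Hypothetical toy data, for illustration only.
predictions = {
    "myocardial infarction": ["heart attack", "stroke"],
    "pyrexia": ["cough", "chill"],
}
gold_map = {
    "myocardial infarction": {"heart attack"},
    "pyrexia": {"fever"},
}
score = mean_precision_at_k(predictions, gold_map)
```

Here one of the two terms is translated correctly within the top candidates, so the score is one half.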

Learning Robust Global Representations by Penalizing Local Predictive Power


Title	Learning Robust Global Representations by Penalizing Local Predictive Power
Authors	Haohan Wang, Songwei Ge, Eric P. Xing, Zachary C. Lipton
Abstract	Despite their renowned predictive power on i.i.d. data, convolutional neural networks are known to rely more on high-frequency patterns that humans deem superficial than on low-frequency patterns that agree better with intuitions about what constitutes category membership. This paper proposes a method for training robust convolutional networks by penalizing the predictive power of the local representations learned by earlier layers. Intuitively, our networks are forced to discard predictive signals such as color and texture that can be gleaned from local receptive fields and to rely instead on the global structures of the image. Across a battery of synthetic and benchmark domain adaptation tasks, our method confers improved generalization out of the domain. Also, to evaluate cross-domain transfer, we introduce ImageNet-Sketch, a new dataset consisting of sketch-like images, that matches the ImageNet classification validation set in categories and scale.
Tasks	Domain Adaptation
Published	2019-05-29
URL	https://arxiv.org/abs/1905.13549v2
PDF	https://arxiv.org/pdf/1905.13549v2.pdf
PWC	https://paperswithcode.com/paper/190513549
Repo	https://github.com/HaohanWang/PAR
Framework	tf

Noise-Tolerant Paradigm for Training Face Recognition CNNs


Title	Noise-Tolerant Paradigm for Training Face Recognition CNNs
Authors	Wei Hu, Yangyu Huang, Fan Zhang, Ruirui Li
Abstract	Benefiting from large-scale training datasets, deep Convolutional Neural Networks (CNNs) have achieved impressive results in face recognition (FR). However, the tremendous scale of these datasets inevitably leads to noisy data, which reduces the performance of the trained CNN models. Removing wrong labels from large-scale FR datasets is still very expensive, although some cleaning approaches have been proposed. From an analysis of the whole process of training CNN models supervised by angular-margin-based loss (AM-Loss) functions, we find that the $\theta$ distribution of training samples implicitly reflects their probability of being clean. Thus, we propose a novel training paradigm that weights samples based on this probability. Without any prior knowledge of the noise, we can train high-performance CNN models with large-scale FR datasets. Experiments demonstrate the effectiveness of our training paradigm. The code is available at https://github.com/huangyangyu/NoiseFace.
Tasks	Face Recognition
Published	2019-03-25
URL	http://arxiv.org/abs/1903.10357v2
PDF	http://arxiv.org/pdf/1903.10357v2.pdf
PWC	https://paperswithcode.com/paper/noise-tolerant-paradigm-for-training-face
Repo	https://github.com/huangyangyu/NoiseFace
Framework	none

Known-class Aware Self-ensemble for Open Set Domain Adaptation


Title	Known-class Aware Self-ensemble for Open Set Domain Adaptation
Authors	Qing Lian, Wen Li, Lin Chen, Lixin Duan
Abstract	Existing domain adaptation methods generally assume that different domains share an identical label space, which is quite restrictive for real-world applications. In this paper, we focus on a more realistic and challenging case of open set domain adaptation. Particularly, in open set domain adaptation, we allow the classes from the source and target domains to partially overlap. In this case, the assumption of conventional distribution alignment does not hold anymore, due to the different label spaces in two domains. To tackle this challenge, we propose a new approach coined as Known-class Aware Self-Ensemble (KASE), which is built upon the recently developed self-ensemble model. In KASE, we first introduce a Known-class Aware Recognition (KAR) module to identify the known and unknown classes from the target domain, which is achieved by encouraging a low cross-entropy for known classes and a high entropy based on the source data from the unknown class. Then, we develop a Known-class Aware Adaptation (KAA) module to better adapt from the source domain to the target by reweighing the adaptation loss based on the likeliness to belong to known classes of unlabeled target samples as predicted by KAR. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our approach.
Tasks	Domain Adaptation
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01068v1
PDF	https://arxiv.org/pdf/1905.01068v1.pdf
PWC	https://paperswithcode.com/paper/known-class-aware-self-ensemble-for-open-set
Repo	https://github.com/ChenJinBIT/OSDA
Framework	tf

Radial-Based Undersampling for Imbalanced Data Classification


Title	Radial-Based Undersampling for Imbalanced Data Classification
Authors	Michał Koziarski
Abstract	Data imbalance remains one of the most widespread problems affecting contemporary machine learning. The negative effect data imbalance can have on the traditional learning algorithms is most severe in combination with other dataset difficulty factors, such as small disjuncts, presence of outliers and insufficient number of training observations. Said difficulty factors can also limit the applicability of some of the methods of dealing with data imbalance, in particular the neighborhood-based oversampling algorithms based on SMOTE. Radial-Based Oversampling (RBO) was previously proposed to mitigate some of the limitations of the neighborhood-based methods. In this paper we examine the possibility of utilizing the concept of mutual class potential, used to guide the oversampling process in RBO, in the undersampling procedure. Conducted computational complexity analysis indicates a significantly reduced time complexity of the proposed Radial-Based Undersampling algorithm, and the results of the performed experimental study indicate its usefulness, especially on difficult datasets.
Tasks
Published	2019-06-02
URL	https://arxiv.org/abs/1906.00452v1
PDF	https://arxiv.org/pdf/1906.00452v1.pdf
PWC	https://paperswithcode.com/paper/190600452
Repo	https://github.com/michalkoziarski/RBU
Framework	none
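
The "mutual class potential" mentioned in the abstract can be illustrated with Gaussian radial basis functions: each majority observation adds to the potential at a point and each minority observation subtracts from it. The sketch below assumes that form, with a hypothetical spread parameter `gamma`; the exact definition and the rule RBU uses to pick observations for removal should be checked against the paper and its repository.

```python
import numpy as np


def mutual_class_potential(x, majority, minority, gamma=1.0):
    """Sum of RBF contributions from majority points minus those from minority points.

    The Gaussian form and the `gamma` spread are assumptions made for this sketch.
    """
    x = np.asarray(x, dtype=float)

    def rbf_sum(points):
        distances = np.linalg.norm(np.asarray(points, dtype=float) - x, axis=1)
        return np.exp(-((distances / gamma) ** 2)).sum()

    return rbf_sum(majority) - rbf_sum(minority)


# Hypothetical toy data: one majority point at the query, one minority point at distance 1.
potential = mutual_class_potential([0.0, 0.0], [[0.0, 0.0]], [[1.0, 0.0]])
```

A positive value marks a region dominated by the majority class, a negative value one dominated by the minority class.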

GORC: A large contextual citation graph of academic papers


Title	GORC: A large contextual citation graph of academic papers
Authors	Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, Dan S. Weld
Abstract	We introduce the Semantic Scholar Graph of References in Context (GORC), a large contextual citation graph of 81.1M academic publications, including parsed full text for 8.1M open access papers, across broad domains of science. Each paper is represented with rich paper metadata (title, authors, abstract, etc.), and where available: cleaned full text, section headers, figure and table captions, and parsed bibliography entries. In-line citation mentions in full text are linked to their corresponding bibliography entries, which are in turn linked to in-corpus cited papers, forming the edges of a contextual citation graph. To our knowledge, this is the largest publicly available contextual citation graph; the full text alone is the largest parsed academic text corpus publicly available. We demonstrate the ability to identify similar papers using these citation contexts and propose several applications for language modeling and citation-related tasks.
Tasks	Language Modelling
Published	2019-11-07
URL	https://arxiv.org/abs/1911.02782v1
PDF	https://arxiv.org/pdf/1911.02782v1.pdf
PWC	https://paperswithcode.com/paper/gorc-a-large-contextual-citation-graph-of
Repo	https://github.com/allenai/s2-gorc
Framework	none

Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning


Title	Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
Authors	Seungyul Han, Youngchul Sung
Abstract	In importance sampling (IS)-based reinforcement learning algorithms such as Proximal Policy Optimization (PPO), IS weights are typically clipped to avoid large variance in learning. However, policy update from clipped statistics induces large bias in tasks with high action dimensions, and bias from clipping makes it difficult to reuse old samples with large IS weights. In this paper, we consider PPO, a representative on-policy algorithm, and propose its improvement by dimension-wise IS weight clipping which separately clips the IS weight of each action dimension to avoid large bias and adaptively controls the IS weight to bound policy update from the current policy. This new technique enables efficient learning for high action-dimensional tasks and reusing of old samples like in off-policy learning to increase the sample efficiency. Numerical results show that the proposed new algorithm outperforms PPO and other RL algorithms in various OpenAI Gym tasks.
Tasks
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02363v2
PDF	https://arxiv.org/pdf/1905.02363v2.pdf
PWC	https://paperswithcode.com/paper/dimension-wise-importance-sampling-weight
Repo	https://github.com/seungyulhan/disc
Framework	tf
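
To make the contrast in the abstract concrete, the sketch below compares the usual PPO clipped surrogate, which clips the joint IS weight, with a version that clips the IS weight of each action dimension separately. It assumes a policy that factorizes over action dimensions, so the joint weight is the product of per-dimension weights. Averaging the per-dimension terms is a choice made for this illustration only; the paper's actual objective and its adaptive control of the IS weight are not reproduced here.

```python
import numpy as np


def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO surrogate: clip the joint importance-sampling weight."""
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return np.minimum(ratio * advantage, clipped * advantage)


def dimwise_clipped_objective(ratio_per_dim, advantage, eps=0.2):
    """Clip each action dimension's weight separately, then average over dimensions.

    The averaging step is an assumption of this sketch, not the paper's formula.
    """
    adv = np.asarray(advantage, dtype=float)[:, None]
    clipped = np.clip(ratio_per_dim, 1.0 - eps, 1.0 + eps)
    per_dim = np.minimum(ratio_per_dim * adv, clipped * adv)
    return per_dim.mean(axis=1)


# One sample with three action dimensions, each only slightly off-policy.
ratio_per_dim = np.array([[1.1, 1.1, 1.1]])
advantage = np.array([1.0])

joint_ratio = ratio_per_dim.prod(axis=1)          # 1.1 ** 3 = 1.331
joint_value = ppo_clipped_objective(joint_ratio, advantage)
dimwise_value = dimwise_clipped_objective(ratio_per_dim, advantage)
```

Although every dimension stays inside the clipping range, the product of the three weights exceeds it, so the joint version clips this sample while the dimension-wise version does not. This is the effect that grows with the number of action dimensions.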

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image


Title	Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
Authors	Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, Michael J. Black
Abstract	To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body models (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground-truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de.
Tasks	3D Human Pose Estimation, 3D Reconstruction
Published	2019-04-11
URL	http://arxiv.org/abs/1904.05866v1
PDF	http://arxiv.org/pdf/1904.05866v1.pdf
PWC	https://paperswithcode.com/paper/expressive-body-capture-3d-hands-face-and
Repo	https://github.com/vchoutas/smplify-x
Framework	pytorch

DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation


Title	DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation
Authors	Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, Steven Lovegrove
Abstract	Computer graphics, 3D computer vision and robotics communities have produced multiple approaches to representing 3D geometry for rendering and reconstruction. These provide trade-offs across fidelity, efficiency and compression capabilities. In this work, we introduce DeepSDF, a learned continuous Signed Distance Function (SDF) representation of a class of shapes that enables high quality shape representation, interpolation and completion from partial and noisy 3D input data. DeepSDF, like its classical counterpart, represents a shape’s surface by a continuous volumetric field: the magnitude of a point in the field represents the distance to the surface boundary and the sign indicates whether the region is inside (-) or outside (+) of the shape, hence our representation implicitly encodes a shape’s boundary as the zero-level-set of the learned function while explicitly representing the classification of space as being part of the shape’s interior or not. While classical SDFs, in either analytical or discretized voxel form, typically represent the surface of a single shape, DeepSDF can represent an entire class of shapes. Furthermore, we show state-of-the-art performance for learned 3D shape representation and completion while reducing the model size by an order of magnitude compared with previous work.
Tasks	3D Reconstruction, 3D Shape Representation
Published	2019-01-16
URL	http://arxiv.org/abs/1901.05103v1
PDF	http://arxiv.org/pdf/1901.05103v1.pdf
PWC	https://paperswithcode.com/paper/deepsdf-learning-continuous-signed-distance
Repo	https://github.com/crazyleg/workshop-3d-neural
Framework	pytorch
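
The sign convention described in the abstract is easiest to see on a classical, analytical SDF. The sketch below evaluates the signed distance to a sphere centered at the origin; it illustrates only the classical counterpart, not the learned DeepSDF network, and the function name is hypothetical.

```python
import numpy as np


def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere at the origin: negative inside, positive outside."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    return np.linalg.norm(points, axis=1) - radius


# Three query points: the center, a surface point, and a point outside.
queries = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
values = sphere_sdf(queries)
```

The surface is the set of points where the function returns zero, which is the zero-level-set the abstract refers to.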

Learning Embedding of 3D models with Quadric Loss


Title	Learning Embedding of 3D models with Quadric Loss
Authors	Nitin Agarwal, Sung-eui Yoon, M Gopi
Abstract	Sharp features such as edges and corners play an important role in the perception of 3D models. In order to capture them better, we propose quadric loss, a point-surface loss function, which minimizes the quadric error between the reconstructed points and the input surface. Computing quadric loss is easy and efficient, since the quadric matrices can be computed a priori, and the loss is fully differentiable, making it suitable for training point- and mesh-based architectures. Through extensive experiments we show the merits and demerits of quadric loss. When combined with Chamfer loss, quadric loss achieves better reconstruction results than either loss alone or other point-surface loss functions.
Tasks	3D Reconstruction, 3D Shape Representation, Representation Learning
Published	2019-07-24
URL	https://arxiv.org/abs/1907.10250v1
PDF	https://arxiv.org/pdf/1907.10250v1.pdf
PWC	https://paperswithcode.com/paper/learning-embedding-of-3d-models-with-quadric
Repo	https://github.com/nitinagarwal/QuadricLoss
Framework	pytorch
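
The quadric error named in the abstract can be sketched with the classical quadric error metric from mesh simplification: each plane contributes a 4x4 matrix, the matrices are summed ahead of time, and the error of a point is a quadratic form in its homogeneous coordinates. The code below assumes that classical construction; how the paper assigns quadrics to input vertices and aggregates them into a training loss may differ.

```python
import numpy as np


def plane_quadric(normal, offset):
    """4x4 quadric for the plane normal . x + offset = 0 (normal is normalized here)."""
    n = np.asarray(normal, dtype=float)
    length = np.linalg.norm(n)
    plane = np.append(n / length, offset / length)
    return np.outer(plane, plane)


def quadric_error(point, Q):
    """Sum of squared distances from the point to the planes accumulated in Q."""
    p = np.append(np.asarray(point, dtype=float), 1.0)
    return float(p @ Q @ p)


# Precompute the quadric of a sharp edge formed by the planes z = 0 and x = 0.
Q = plane_quadric([0.0, 0.0, 1.0], 0.0) + plane_quadric([1.0, 0.0, 0.0], 0.0)

# A reconstructed point at distance 2 from z = 0 and distance 1 from x = 0.
err = quadric_error([1.0, 0.0, 2.0], Q)
```

Points lying on the edge itself (the y axis here) have zero error, which is why this kind of loss pulls reconstructed points toward sharp features.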

Jointly Measuring Diversity and Quality in Text Generation Models


Title	Jointly Measuring Diversity and Quality in Text Generation Models
Authors	Ehsan Montahaei, Danial Alihosseini, Mahdieh Soleymani Baghshah
Abstract	Text generation is an important Natural Language Processing task with various applications. Although several metrics have already been introduced to evaluate the text generation methods, each of them has its own shortcomings. The most widely used metrics such as BLEU only consider the quality of generated sentences and neglect their diversity. For example, repeatedly generating only one high-quality sentence would result in a high BLEU score. On the other hand, the more recent metric introduced to evaluate the diversity of generated texts known as Self-BLEU ignores the quality of generated texts. In this paper, we propose metrics to evaluate both the quality and diversity simultaneously by approximating the distance of the learned generative model and the real data distribution. For this purpose, we first introduce a metric that approximates this distance using n-gram based measures. Then, a feature-based measure which is based on a recent very deep model trained on a large text corpus called BERT is introduced. Finally, for oracle training mode in which the generator’s density can also be calculated, we propose to use the distance measures between the corresponding explicit distributions. Eventually, the most popular and recent text generation models are evaluated using both the existing and the proposed metrics and the preferences of the proposed metrics are determined.
Tasks	Text Generation
Published	2019-04-08
URL	https://arxiv.org/abs/1904.03971v2
PDF	https://arxiv.org/pdf/1904.03971v2.pdf
PWC	https://paperswithcode.com/paper/jointly-measuring-diversity-and-quality-in
Repo	https://github.com/Danial-Alh/FastBLEU
Framework	none

SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation


Title	SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation
Authors	Michaël Ramamonjisoa, Vincent Lepetit
Abstract	We introduce SharpNet, a method that predicts an accurate depth map for an input color image, with particular attention to the reconstruction of occluding contours. Occluding contours are an important cue for object recognition, and for realistic integration of virtual objects in Augmented Reality, but they are also notoriously difficult to reconstruct accurately. For example, they are a challenge for stereo-based reconstruction methods, as points around an occluding contour are visible in only one image. Inspired by recent methods that introduce normal estimation to improve depth prediction, we introduce a novel term that constrains depth and occluding contours predictions. Since ground truth depth is difficult to obtain with pixel-perfect accuracy along occluding contours, we use synthetic images for training, followed by fine-tuning on real data. We demonstrate our approach on the challenging NYUv2-Depth dataset, and show that our method outperforms the state-of-the-art along occluding contours, while performing on par with the best recent methods for the rest of the images. Its accuracy along the occluding contours is actually better than the ‘ground truth’ acquired by a depth camera based on structured light. We show this by introducing a new benchmark based on NYUv2-Depth for evaluating occluding contours in monocular reconstruction, which is our second contribution.
Tasks	Depth Estimation, Monocular Depth Estimation, Object Recognition
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08598v3
PDF	https://arxiv.org/pdf/1905.08598v3.pdf
PWC	https://paperswithcode.com/paper/190508598
Repo	https://github.com/MichaelRamamonjisoa/SharpNet
Framework	pytorch

Implicit Generation and Generalization in Energy-Based Models


Title	Implicit Generation and Generalization in Energy-Based Models
Authors	Yilun Du, Igor Mordatch
Abstract	Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train. We present techniques to scale MCMC based EBM training on continuous neural networks, and we show its success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples than other likelihood models and nearing the performance of contemporary GAN approaches, while covering all modes of the data. We highlight some unique capabilities of implicit generation such as compositionality and corrupt image reconstruction and inpainting. Finally, we show that EBMs are useful models across a wide variety of tasks, achieving state-of-the-art out-of-distribution classification, adversarially robust classification, state-of-the-art continual online class learning, and coherent long term predicted trajectory rollouts.
Tasks	Image Reconstruction
Published	2019-03-20
URL	https://arxiv.org/abs/1903.08689v3
PDF	https://arxiv.org/pdf/1903.08689v3.pdf
PWC	https://paperswithcode.com/paper/implicit-generation-and-generalization-in
Repo	https://github.com/rosinality/igebm-pytorch
Framework	pytorch
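
The abstract says only that training relies on MCMC sampling from the energy function. A common choice for continuous data is Langevin dynamics, and the sketch below assumes that sampler on a toy quadratic energy whose gradient is known in closed form. The function names, step size, and chain count are hypothetical; in an actual EBM the gradient would come from backpropagating through a neural network.

```python
import numpy as np


def langevin_sample(grad_energy, x0, step=0.1, n_steps=200, noise_scale=1.0, rng=None):
    """Unadjusted Langevin dynamics: a gradient step on the energy plus Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step * grad_energy(x) + noise_scale * np.sqrt(step) * noise
    return x


def grad_quadratic(x):
    """Gradient of the toy energy E(x) = 0.5 * x**2, whose density is a standard normal."""
    return x


# Run 2000 independent one-dimensional chains, all started far from the mode.
samples = langevin_sample(
    grad_quadratic, np.full(2000, 5.0), rng=np.random.default_rng(0)
)
```

After enough steps the chains forget their starting point and their empirical mean and spread approach those of the target distribution, up to a small discretization bias from the finite step size.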

FDA: Feature Disruptive Attack


Title	FDA: Feature Disruptive Attack
Authors	Aditya Ganeshan, B. S. Vivek, R. Venkatesh Babu
Abstract	Though Deep Neural Networks (DNN) show excellent performance across various computer vision tasks, several works show their vulnerability to adversarial samples, i.e., image samples with imperceptible noise engineered to manipulate the network’s prediction. Adversarial sample generation methods range from simple to complex optimization techniques. The majority of these methods generate adversaries through optimization objectives that are tied to the pre-softmax or softmax output of the network. In this work, we (i) show the drawbacks of such attacks, (ii) propose two new evaluation metrics: Old Label New Rank (OLNR) and New Label Old Rank (NLOR) in order to quantify the extent of damage made by an attack, and (iii) propose a new adversarial attack FDA: Feature Disruptive Attack, to address the drawbacks of existing attacks. FDA works by generating image perturbations that disrupt features at each layer of the network and cause deep features to be highly corrupt. This allows FDA adversaries to severely reduce the performance of deep networks. We experimentally validate that FDA generates stronger adversaries than other state-of-the-art methods for image classification, even in the presence of various defense measures. More importantly, we show that FDA disrupts feature-representation based tasks even without access to the task-specific network or methodology. Code available at: https://github.com/BardOfCodes/fda
Tasks	Adversarial Attack, Image Classification
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04385v1
PDF	https://arxiv.org/pdf/1909.04385v1.pdf
PWC	https://paperswithcode.com/paper/fda-feature-disruptive-attack
Repo	https://github.com/BardOfCodes/fda
Framework	tf

Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL


Title	Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL
Authors	Corey Lammie, Wei Xiang, Mostafa Rahimi Azghadi
Abstract	Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance critical, low power systems, e.g. the Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further enhanced by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption, compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively.
Tasks
Published	2019-05-15
URL	https://arxiv.org/abs/1905.06105v1
PDF	https://arxiv.org/pdf/1905.06105v1.pdf
PWC	https://paperswithcode.com/paper/accelerating-deterministic-and-stochastic
Repo	https://github.com/coreylammie/Accelerating-Stochastically-Binarized-Neural-Networks-on-FPGAs-using-OpenCL
Framework	pytorch
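
The difference between deterministic and stochastic weight binarization can be sketched in a few lines. The version below assumes the widely used scheme in which deterministic binarization takes the sign of the weight and stochastic binarization draws +1 with a probability given by a hard sigmoid of the weight. The abstract does not spell out the paper's exact scheme, so treat the function names and the probability mapping as assumptions of this sketch.

```python
import numpy as np


def hard_sigmoid(w):
    """Map a real-valued weight to a probability in [0, 1]."""
    return np.clip((np.asarray(w, dtype=float) + 1.0) / 2.0, 0.0, 1.0)


def binarize_deterministic(w):
    """Sign-based binarization: +1 for non-negative weights, -1 otherwise."""
    return np.where(np.asarray(w) >= 0, 1.0, -1.0)


def binarize_stochastic(w, rng):
    """Draw +1 with probability hard_sigmoid(w), otherwise -1."""
    p = hard_sigmoid(w)
    return np.where(rng.random(p.shape) < p, 1.0, -1.0)


weights = np.array([-0.3, 0.0, 0.7])
det = binarize_deterministic(weights)
sto = binarize_stochastic(weights, np.random.default_rng(0))
```

Weights near zero are binarized to either value with roughly equal probability under the stochastic rule, while weights at or beyond -1 and +1 always map to the same value under both rules.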