January 31, 2020

Paper Group AWR 422

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

Title EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
Authors Jason Wei, Kai Zou
Abstract We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five text classification tasks, we show that EDA improves performance for both convolutional and recurrent neural networks. EDA demonstrates particularly strong results for smaller datasets; on average, across five datasets, training with EDA while using only 50% of the available training set achieved the same accuracy as normal training with all available data. We also performed extensive ablation studies and suggest parameters for practical use.
Tasks Data Augmentation, Sentiment Analysis, Subjectivity Analysis, Text Augmentation, Text Classification
Published 2019-01-31
URL https://arxiv.org/abs/1901.11196v2
PDF https://arxiv.org/pdf/1901.11196v2.pdf
PWC https://paperswithcode.com/paper/eda-easy-data-augmentation-techniques-for
Repo https://github.com/alikhodadoost/Persian-Sentence-Augmenter
Framework none
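
The four operations named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `synonyms` dictionary (a hypothetical stand-in for a real thesaurus lookup), and the parameters `n`, `n_swaps`, and `p` are all assumptions made for this example.

```python
import random


def synonym_replacement(words, synonyms, n, rng):
    """Replace up to n words that have an entry in the (hypothetical) synonyms dict."""
    words = list(words)
    candidates = [i for i, w in enumerate(words) if w in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        words[i] = rng.choice(synonyms[words[i]])
    return words


def random_insertion(words, synonyms, n, rng):
    """Insert a synonym of a randomly chosen word at a random position, n times."""
    words = list(words)
    sources = [w for w in words if w in synonyms]
    if not sources:
        return words
    for _ in range(n):
        new_word = rng.choice(synonyms[rng.choice(sources)])
        words.insert(rng.randint(0, len(words)), new_word)
    return words


def random_swap(words, n_swaps, rng):
    """Swap the words at two distinct random positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word independently with probability p, keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the quick brown fox jumps".split()
    toy_synonyms = {"quick": ["fast"], "jumps": ["leaps"]}  # toy data for the demo
    print(synonym_replacement(sentence, toy_synonyms, 1, rng))
    print(random_insertion(sentence, toy_synonyms, 1, rng))
    print(random_swap(sentence, 1, rng))
    print(random_deletion(sentence, 0.2, rng))
```

Each function returns a new list and leaves its input untouched, so several augmented variants can be generated from one sentence.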

Exploiting temporal context for 3D human pose estimation in the wild

Title Exploiting temporal context for 3D human pose estimation in the wild
Authors Anurag Arnab, Carl Doersch, Andrew Zisserman
Abstract We present a bundle-adjustment-based algorithm for recovering accurate 3D human pose and meshes from monocular videos. Unlike previous algorithms which operate on single frames, we show that reconstructing a person over an entire sequence gives extra constraints that can resolve ambiguities. This is because videos often give multiple views of a person, yet the overall body shape does not change and 3D positions vary slowly. Our method improves not only on standard mocap-based datasets like Human 3.6M – where we show quantitative improvements – but also on challenging in-the-wild datasets such as Kinetics. Building upon our algorithm, we present a new dataset of more than 3 million frames of YouTube videos from Kinetics with automatically generated 3D poses and meshes. We show that retraining a single-frame 3D pose estimator on this data improves accuracy on both real-world and mocap data by evaluating on the 3DPW and HumanEVA datasets.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2019-05-10
URL https://arxiv.org/abs/1905.04266v1
PDF https://arxiv.org/pdf/1905.04266v1.pdf
PWC https://paperswithcode.com/paper/exploiting-temporal-context-for-3d-human-pose
Repo https://github.com/deepmind/Temporal-3D-Pose-Kinetics
Framework tf

Visual Relationship Detection with Language prior and Softmax

Title Visual Relationship Detection with Language prior and Softmax
Authors Jaewon Jung, Jongyoul Park
Abstract Visual relationship detection is an intermediate image understanding task that detects two objects and classifies a predicate that explains the relationship between the two objects in an image. The three components are linguistically and visually correlated (e.g. “wear” is related to “person” and “shirt”, while “laptop” is related to “table” and “on”); thus, the solution space is huge because there are many possible cases among them. Language and visual modules are exploited and a sophisticated spatial vector is proposed. The models in this work outperformed the state of the art without costly linguistic knowledge distillation from a large text corpus or building complex loss functions. All experiments were evaluated only on the Visual Relationship Detection and Visual Genome datasets.
Tasks
Published 2019-04-16
URL http://arxiv.org/abs/1904.07798v1
PDF http://arxiv.org/pdf/1904.07798v1.pdf
PWC https://paperswithcode.com/paper/visual-relationship-detection-with-language-1
Repo https://github.com/Jungjaewon/Visual-Relationship-Detection
Framework caffe2

ConvPoint: Continuous Convolutions for Point Cloud Processing

Title ConvPoint: Continuous Convolutions for Point Cloud Processing
Authors Alexandre Boulch
Abstract Point clouds are unstructured and unordered data, as opposed to images. Thus, most machine learning approaches developed for images cannot be directly transferred to point clouds. In this paper, we propose a generalization of discrete convolutional neural networks (CNNs) in order to deal with point clouds by replacing discrete kernels with continuous ones. This formulation is simple, allows arbitrary point cloud sizes and can easily be used for designing neural networks similarly to 2D CNNs. We present experimental results with various architectures, highlighting the flexibility of the proposed approach. We obtain competitive results compared to the state-of-the-art on shape classification, part segmentation and semantic segmentation for large-scale point clouds.
Tasks 3D Part Segmentation, Semantic Segmentation
Published 2019-04-04
URL https://arxiv.org/abs/1904.02375v5
PDF https://arxiv.org/pdf/1904.02375v5.pdf
PWC https://paperswithcode.com/paper/generalizing-discrete-convolutions-for
Repo https://github.com/aboulch/ConvPoint
Framework pytorch

Automated Classification of Histopathology Images Using Transfer Learning

Title Automated Classification of Histopathology Images Using Transfer Learning
Authors Muhammed Talo
Abstract There is a strong need for automated systems to improve diagnostic quality and reduce the analysis time in histopathology image processing. Automated detection and classification of pathological tissue characteristics with computer-aided diagnostic systems are a critical step in the early diagnosis and treatment of diseases. Once a pathology image is scanned by a microscope and loaded onto a computer, it can be used for automated detection and classification of diseases. In this study, the DenseNet-161 and ResNet-50 pre-trained CNN models have been used to classify digital histopathology patches into the corresponding whole slide images via transfer learning. The proposed pre-trained models were tested on grayscale and color histopathology images. The DenseNet-161 pre-trained model achieved a classification accuracy of 97.89% using grayscale images and the ResNet-50 model obtained an accuracy of 98.87% for color images. The proposed pre-trained models outperform state-of-the-art methods in all performance metrics to classify digital pathology patches into 24 categories.
Tasks Image Classification, Transfer Learning
Published 2019-03-24
URL https://arxiv.org/abs/1903.10035v2
PDF https://arxiv.org/pdf/1903.10035v2.pdf
PWC https://paperswithcode.com/paper/convolutional-neural-networks-for-multi-class
Repo https://github.com/MichelML/ml-cellsignal
Framework pytorch

Named Entity Recognition – Is there a glass ceiling?

Title Named Entity Recognition – Is there a glass ceiling?
Authors Tomasz Stanislawek, Anna Wróblewska, Alicja Wójcicka, Daniel Ziembicki, Przemyslaw Biecek
Abstract Recent developments in Named Entity Recognition (NER) have resulted in better and better models. However, is there a glass ceiling? Do we know which types of errors are still hard or even impossible to correct? In this paper, we present a detailed analysis of the types of errors in state-of-the-art machine learning (ML) methods. Our study reveals the weak and strong points of the Stanford, CMU, FLAIR, ELMO and BERT models, as well as their shared limitations. We also introduce new techniques for improving annotation, for training processes and for checking a model’s quality and stability. Presented results are based on the CoNLL 2003 data set for the English language. A new enriched semantic annotation of errors for this data set and new diagnostic data sets are attached in the supplementary materials.
Tasks Named Entity Recognition
Published 2019-10-06
URL https://arxiv.org/abs/1910.02403v2
PDF https://arxiv.org/pdf/1910.02403v2.pdf
PWC https://paperswithcode.com/paper/named-entity-recognition-is-there-a-glass
Repo https://github.com/applicaai/ner-resources
Framework none

From Node Embedding To Community Embedding : A Hyperbolic Approach

Title From Node Embedding To Community Embedding : A Hyperbolic Approach
Authors Thomas Gerald, Hadi Zaatiti, Hatem Hajri, Nicolas Baskiotis, Olivier Schwander
Abstract Detecting communities on graphs has received significant interest in recent literature. The current state-of-the-art community embedding approach, ComE, tackles this problem by coupling graph embedding with community detection. Considering the success of hyperbolic representations of graph-structured data in recent years, an ongoing challenge is to set up a hyperbolic approach for the community detection problem. The present paper meets this challenge by introducing a Riemannian equivalent of ComE. Our proposed approach combines hyperbolic embeddings with Riemannian K-means or Riemannian mixture models to perform community detection. We illustrate the usefulness of this framework through several experiments on real-world social networks and comparisons with ComE and recent hyperbolic-based classification approaches.
Tasks Community Detection, Graph Embedding
Published 2019-07-02
URL https://arxiv.org/abs/1907.01662v2
PDF https://arxiv.org/pdf/1907.01662v2.pdf
PWC https://paperswithcode.com/paper/learning-graph-structured-data-using-poincare
Repo https://github.com/drewwilimitis/poincare
Framework pytorch
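
Riemannian K-means and mixture models of the kind mentioned in the abstract rely on a geodesic distance in place of the Euclidean one. The abstract does not say which model of hyperbolic space is used; the sketch below assumes the Poincaré ball model, where the distance between points u and v with norm below 1 is arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The function name `poincare_distance` is hypothetical and not taken from the paper's code.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball (Poincare model)."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


if __name__ == "__main__":
    origin = (0.0, 0.0)
    # The same Euclidean step costs more hyperbolic distance near the boundary.
    print(poincare_distance(origin, (0.1, 0.0)))
    print(poincare_distance((0.8, 0.0), (0.9, 0.0)))
```

A K-means variant in this setting would assign each embedding to the centroid with the smallest such distance rather than the smallest Euclidean distance.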

Learning monocular depth estimation infusing traditional stereo knowledge

Title Learning monocular depth estimation infusing traditional stereo knowledge
Authors Fabio Tosi, Filippo Aleotti, Matteo Poggi, Stefano Mattoccia
Abstract Depth estimation from a single image represents a fascinating, yet challenging problem with countless applications. Recent works proved that this task could be learned without direct supervision from ground truth labels leveraging image synthesis on sequences or stereo pairs. Focusing on this second case, in this paper we leverage stereo matching in order to improve monocular depth estimation. To this aim we propose monoResMatch, a novel deep architecture designed to infer depth from a single input image by synthesizing features from a different point of view, horizontally aligned with the input image, performing stereo matching between the two cues. In contrast to previous works sharing this rationale, our network is the first trained end-to-end from scratch. Moreover, we show how obtaining proxy ground truth annotation through traditional stereo algorithms, such as Semi-Global Matching, enables more accurate monocular depth estimation still countering the need for expensive depth labels by keeping a self-supervised approach. Exhaustive experimental results prove how the synergy between i) the proposed monoResMatch architecture and ii) proxy-supervision attains state-of-the-art for self-supervised monocular depth estimation. The code is publicly available at https://github.com/fabiotosi92/monoResMatch-Tensorflow.
Tasks Depth Estimation, Image Generation, Monocular Depth Estimation, Stereo Matching, Stereo Matching Hand
Published 2019-04-08
URL http://arxiv.org/abs/1904.04144v1
PDF http://arxiv.org/pdf/1904.04144v1.pdf
PWC https://paperswithcode.com/paper/learning-monocular-depth-estimation-infusing
Repo https://github.com/fabiotosi92/monoResMatch-Tensorflow
Framework tf

Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive Networks

Title Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive Networks
Authors Nicholas Geneva, Nicholas Zabaras
Abstract In recent years, deep learning has proven to be a viable methodology for surrogate modeling and uncertainty quantification for a vast number of physical systems. However, in their traditional form, such models can require a large amount of training data. This is of particular importance for various engineering and scientific applications where data may be extremely expensive to obtain. To overcome this shortcoming, physics-constrained deep learning provides a promising methodology as it only utilizes the governing equations. In this work, we propose a novel auto-regressive dense encoder-decoder convolutional neural network to solve and model non-linear dynamical systems without training data at a computational cost that is potentially magnitudes lower than standard numerical solvers. This model includes a Bayesian framework that allows for uncertainty quantification of the predicted quantities of interest at each time-step. We rigorously test this model on several non-linear transient partial differential equation systems including the turbulence of the Kuramoto-Sivashinsky equation, multi-shock formation and interaction with 1D Burgers’ equation and 2D wave dynamics with coupled Burgers’ equations. For each system, the predictive results and uncertainty are presented and discussed together with comparisons to the results obtained from traditional numerical analysis methods.
Tasks
Published 2019-06-13
URL https://arxiv.org/abs/1906.05747v3
PDF https://arxiv.org/pdf/1906.05747v3.pdf
PWC https://paperswithcode.com/paper/modeling-the-dynamics-of-pde-systems-with
Repo https://github.com/cics-nd/ar-pde-cnn
Framework pytorch

Conditional Independence Testing using Generative Adversarial Networks

Title Conditional Independence Testing using Generative Adversarial Networks
Authors Alexis Bellot, Mihaela van der Schaar
Abstract We consider the hypothesis testing problem of detecting conditional dependence, with a focus on high-dimensional feature spaces. Our contribution is a new test statistic based on samples from a generative adversarial network designed to approximate directly a conditional distribution that encodes the null hypothesis, in a manner that maximizes power (the rate of true negatives). We show that such an approach requires only that density approximation be viable in order to ensure that we control type I error (the rate of false positives); in particular, no assumptions need to be made on the form of the distributions or feature dependencies. Using synthetic simulations with high-dimensional data we demonstrate significant gains in power over competing methods. In addition, we illustrate the use of our test to discover causal markers of disease in genetic data.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.04068v2
PDF https://arxiv.org/pdf/1907.04068v2.pdf
PWC https://paperswithcode.com/paper/conditional-independence-testing-using
Repo https://github.com/alexisbellot/GCIT
Framework tf

Post-editese: an Exacerbated Translationese

Title Post-editese: an Exacerbated Translationese
Authors Antonio Toral
Abstract Post-editing (PE) machine translation (MT) is widely used for dissemination because it leads to higher productivity than human translation from scratch (HT). In addition, PE translations are found to be of equal or better quality than HTs. However, most such studies measure quality solely as the number of errors. We conduct a set of computational analyses in which we compare PE against HT on three different datasets that cover five translation directions with measures that address different translation universals and laws of translation: simplification, normalisation and interference. We find that PEs are simpler and more normalised and have a higher degree of interference from the source language than HTs.
Tasks Machine Translation
Published 2019-07-01
URL https://arxiv.org/abs/1907.00900v2
PDF https://arxiv.org/pdf/1907.00900v2.pdf
PWC https://paperswithcode.com/paper/post-editese-an-exacerbated-translationese
Repo https://github.com/antot/posteditese_mtsummit19
Framework none

Adaptive Thompson Sampling Stacks for Memory Bounded Open-Loop Planning

Title Adaptive Thompson Sampling Stacks for Memory Bounded Open-Loop Planning
Authors Thomy Phan, Thomas Gabor, Robert Müller, Christoph Roch, Claudia Linnhoff-Popien
Abstract We propose Stable Yet Memory Bounded Open-Loop (SYMBOL) planning, a general memory bounded approach to partially observable open-loop planning. SYMBOL maintains an adaptive stack of Thompson Sampling bandits, whose size is bounded by the planning horizon and can be automatically adapted according to the underlying domain without any prior domain knowledge beyond a generative model. We empirically test SYMBOL in four large POMDP benchmark problems to demonstrate its effectiveness and robustness w.r.t. the choice of hyperparameters and evaluate its adaptive memory consumption. We also compare its performance with other open-loop planning algorithms and POMCP.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05861v1
PDF https://arxiv.org/pdf/1907.05861v1.pdf
PWC https://paperswithcode.com/paper/adaptive-thompson-sampling-stacks-for-memory
Repo https://github.com/thomyphan/planning
Framework none
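
For readers unfamiliar with the building block that SYMBOL stacks, a single Thompson Sampling bandit can be sketched as follows. This is a simplified Beta-Bernoulli illustration for binary rewards, not the SYMBOL algorithm and not necessarily the reward model used in the paper; the class name `BernoulliThompsonBandit` and its methods are hypothetical.

```python
import random


class BernoulliThompsonBandit:
    """Thompson Sampling over arms with binary rewards and Beta(1, 1) priors."""

    def __init__(self, n_arms, rng=None):
        self.alpha = [1.0] * n_arms  # 1 + number of observed successes per arm
        self.beta = [1.0] * n_arms   # 1 + number of observed failures per arm
        self.rng = rng if rng is not None else random.Random()

    def select_arm(self):
        """Sample a success rate from each arm's posterior and pick the largest."""
        samples = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        """Fold one observed binary reward into the chosen arm's posterior."""
        if reward:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    true_success_rates = [0.2, 0.5, 0.8]  # toy environment for the demo
    bandit = BernoulliThompsonBandit(len(true_success_rates), rng)
    for _ in range(1000):
        arm = bandit.select_arm()
        bandit.update(arm, rng.random() < true_success_rates[arm])
    print([a + b - 2.0 for a, b in zip(bandit.alpha, bandit.beta)])  # pulls per arm
```

In an open-loop planner, one such bandit per time step would choose the action at that step, which is why a stack bounded by the planning horizon bounds memory.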

3D Packing for Self-Supervised Monocular Depth Estimation

Title 3D Packing for Self-Supervised Monocular Depth Estimation
Authors Vitor Guizilini, Rares Ambrus, Sudeep Pillai, Allan Raventos, Adrien Gaidon
Abstract Although cameras are ubiquitous, robotic platforms typically rely on active sensors like LiDAR for direct 3D perception. In this work, we propose a novel self-supervised monocular depth estimation method combining geometry with a new deep network, PackNet, learned only from unlabeled monocular videos. Our architecture leverages novel symmetrical packing and unpacking blocks to jointly learn to compress and decompress detail-preserving representations using 3D convolutions. Although self-supervised, our method outperforms other self, semi, and fully supervised methods on the KITTI benchmark. The 3D inductive bias in PackNet enables it to scale with input resolution and number of parameters without overfitting, generalizing better on out-of-domain data such as the NuScenes dataset. Furthermore, it does not require large-scale supervised pretraining on ImageNet and can run in real-time. Finally, we release DDAD (Dense Depth for Automated Driving), a new urban driving dataset with more challenging and accurate depth evaluation, thanks to longer-range and denser ground-truth depth generated from high-density LiDARs mounted on a fleet of self-driving cars operating world-wide.
Tasks Depth Estimation, Monocular Depth Estimation, Self-Driving Cars
Published 2019-05-06
URL https://arxiv.org/abs/1905.02693v4
PDF https://arxiv.org/pdf/1905.02693v4.pdf
PWC https://paperswithcode.com/paper/packnet-sfm-3d-packing-for-self-supervised
Repo https://github.com/ToyotaResearchInstitute/packnet-sfm
Framework none

Recovery of Vertex Orderings in Dynamic Graphs: Algorithms for a General Solution

Title Recovery of Vertex Orderings in Dynamic Graphs: Algorithms for a General Solution
Authors Krzysztof Turowski, Jithin K. Sreedharan, Wojciech Szpankowski
Abstract Dynamic networks model important processes in diverse applications, from spread of infectious diseases to flow of capital in economic systems. The arrival order of nodes in such networks reveals critical insights into their structural and functional organization. In typical applications, the state of a dynamic network is available as a snapshot in time, and one must infer a node arrival order from it. We formulate a more general problem that infers a partial order or clusters of node arrivals. We provide nearly-optimal and approximate solutions to the associated optimization problem that are suitable for any dynamic graph model with only node and edge additions. Finally, our methods are validated for a particular graph model through experiments on both synthetic and real-world networks.
Tasks
Published 2019-05-02
URL https://arxiv.org/abs/1905.00672v3
PDF https://arxiv.org/pdf/1905.00672v3.pdf
PWC https://paperswithcode.com/paper/temporal-ordered-clustering-in-dynamic
Repo https://github.com/krzysztof-turowski/duplication-divergence
Framework none

ConveRT: Efficient and Accurate Conversational Representations from Transformers

Title ConveRT: Efficient and Accurate Conversational Representations from Transformers
Authors Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien, Ivan Vulić
Abstract General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train. We propose ConveRT (Conversational Representations from Transformers), a faster, more compact dual sentence encoder specifically optimized for dialog tasks. We pretrain using a retrieval-based response selection task, effectively leveraging quantization and subword-level parameterization in the dual encoder to build a lightweight memory- and energy-efficient model. In our evaluation, we show that ConveRT achieves state-of-the-art performance across widely established response selection tasks. We also demonstrate that the use of extended dialog history as context yields further performance gains. Finally, we show that pretrained representations from the proposed encoder can be transferred to the intent classification task, yielding strong results across three diverse data sets. ConveRT trains substantially faster than standard sentence encoders or previous state-of-the-art dual encoders. With its reduced size and superior performance, we believe this model promises wider portability and scalability for Conversational AI applications.
Tasks Conversational Response Selection, Intent Classification, Quantization
Published 2019-11-09
URL https://arxiv.org/abs/1911.03688v1
PDF https://arxiv.org/pdf/1911.03688v1.pdf
PWC https://paperswithcode.com/paper/convert-efficient-and-accurate-conversational
Repo https://github.com/codertimo/ConveRT-pytorch
Framework pytorch