January 28, 2020

3556 words 17 mins read

Paper Group ANR 1018

On the Effectiveness of the Pooling Methods for Biomedical Relation Extraction with Deep Learning. HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects. Automated Detecting and Placing Road Objects from Street-level Images. Towards Robust Neural Vocoding for Speech Generation: A Survey. Learning to Dress 3D People in Generative Clothing …

On the Effectiveness of the Pooling Methods for Biomedical Relation Extraction with Deep Learning

Title On the Effectiveness of the Pooling Methods for Biomedical Relation Extraction with Deep Learning
Authors Tuan Ngo Nguyen, Franck Dernoncourt, Thien Huu Nguyen
Abstract Deep learning models have achieved state-of-the-art performance on many relation extraction (RE) datasets. A common element of these models is the pooling mechanism, whereby a sequence of hidden vectors is aggregated into a single representation vector that serves as the features for prediction. Unfortunately, models in the literature tend to employ different pooling strategies for RE, making it difficult to determine the best pooling mechanism for this problem, especially in the biomedical domain. To answer this question, we conduct a comprehensive study evaluating the effectiveness of different pooling mechanisms for deep learning models in biomedical RE. The experimental results suggest that dependency-based pooling is the best pooling strategy for RE in the biomedical domain, yielding state-of-the-art performance on two benchmark datasets for this problem.
Tasks Relation Extraction
Published 2019-11-04
URL https://arxiv.org/abs/1911.01055v1
PDF https://arxiv.org/pdf/1911.01055v1.pdf
PWC https://paperswithcode.com/paper/on-the-effectiveness-of-the-pooling-methods-1
Repo
Framework
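As a hedged illustration of the pooling strategies this study compares, the sketch below contrasts standard max pooling over all encoder states with a dependency-based variant that pools only over tokens on the dependency path between the two entity mentions. The indices and shapes are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch: max pooling over all tokens vs. dependency-based pooling
# over tokens assumed to lie on the dependency path between two entities.
import torch

hidden = torch.randn(1, 12, 256)   # (batch, seq_len, hidden_dim) from an encoder
dep_path = [2, 4, 5, 9]            # hypothetical token indices on the dependency path

# Standard max pooling: aggregate over the full sentence.
sent_repr = hidden.max(dim=1).values                  # (1, 256)

# Dependency-based pooling: aggregate only over dependency-path tokens.
dep_repr = hidden[:, dep_path, :].max(dim=1).values   # (1, 256)
```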

HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

Title HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects
Authors Roman Kaskman, Sergey Zakharov, Ivan Shugurov, Slobodan Ilic
Abstract Among the most important prerequisites for creating and evaluating 6D object pose detectors are datasets with labeled 6D poses. With the advent of deep learning, demand for such datasets is growing continuously. Although some such datasets exist, they are scarce and typically have restricted setups, such as a single object per sequence, or they focus on specific object types, such as textureless industrial parts. Besides, two significant components are often ignored: training using only available 3D models instead of real data, and scalability, i.e. training one method to detect all objects rather than training one detector per object. Other challenges, such as occlusions, changing light conditions and changes in object appearance, as well as precisely defined benchmarks, are either not present or are scattered among different datasets. In this paper we present a dataset for 6D pose estimation that covers the above-mentioned challenges, mainly targeting training from 3D models (both textured and textureless), scalability, occlusions, and changes in light conditions and object appearance. The dataset features 33 objects (17 toy, 8 household and 8 industry-relevant objects) over 13 scenes of varying difficulty. We also present a set of benchmarks to test various desired detector properties, particularly focusing on scalability with respect to the number of objects and resistance to changing light conditions, occlusions and clutter. We also set a baseline for the presented benchmarks using a state-of-the-art DPOD detector. Considering the difficulty of making such datasets, we plan to release the code allowing other researchers to extend this dataset or make their own datasets in the future.
Tasks 6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published 2019-04-05
URL https://arxiv.org/abs/1904.03167v2
PDF https://arxiv.org/pdf/1904.03167v2.pdf
PWC https://paperswithcode.com/paper/homebreweddb-rgb-d-dataset-for-6d-pose
Repo
Framework

Automated Detecting and Placing Road Objects from Street-level Images

Title Automated Detecting and Placing Road Objects from Street-level Images
Authors Chaoquan Zhang, Hongchao Fan, Wanzhi Li, Bo Mao, Xuan Ding
Abstract Navigation services utilized by autonomous vehicles or ordinary users require the availability of detailed information about road-related objects and their geolocations, especially at road intersections. However, these road intersections are mainly represented as point elements without detailed information, or are not even available in current versions of crowdsourced mapping databases including OpenStreetMap (OSM). This study develops an approach to automatically detect road objects and place them in the right locations from street-level images. Our processing pipeline relies on two convolutional neural networks: the first segments the images, while the second detects and classifies the specific objects. Moreover, to locate the detected objects, we establish an attributed topological binary tree (ATBT) based on urban grammar for each image to depict the coherent relations among the topologies, attributes and semantics of the road objects. The ATBT is then matched with map features on OSM to determine the correct placement. The proposed method has been applied to a case study in Berlin, Germany. We validate the effectiveness of our method on two object classes: traffic signs and traffic lights. Experimental results demonstrate that the proposed approach provides near-precise localization results in terms of completeness and positional accuracy. Among many potential applications, the output may be combined with other sources of data to guide autonomous vehicles.
Tasks Autonomous Vehicles
Published 2019-08-29
URL https://arxiv.org/abs/1909.05621v3
PDF https://arxiv.org/pdf/1909.05621v3.pdf
PWC https://paperswithcode.com/paper/automated-detecting-and-placing-road-objects
Repo
Framework
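A hedged sketch of the attributed topological binary tree the abstract describes: each detected road object carries attributes and semantics, while the left/right children encode topological relations in the image. The actual tree construction and OSM matching rules are not reproduced here; the field names and child semantics are assumptions.

```python
# Illustrative ATBT node structure; field names are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ATBTNode:
    object_class: str                    # e.g. "traffic_sign", "traffic_light"
    attributes: dict                     # e.g. {"height_px": 84, "side": "left"}
    left: Optional["ATBTNode"] = None    # assumed: next object along the road
    right: Optional["ATBTNode"] = None   # assumed: object across the intersection

# A toy tree for one street-level image:
root = ATBTNode("traffic_light", {"side": "right"},
                left=ATBTNode("traffic_sign", {"side": "left"}))
```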

Towards Robust Neural Vocoding for Speech Generation: A Survey

Title Towards Robust Neural Vocoding for Speech Generation: A Survey
Authors Po-chun Hsu, Chun-hsuan Wang, Andy T. Liu, Hung-yi Lee
Abstract Recently, neural vocoders have been widely used in speech synthesis tasks, including text-to-speech and voice conversion. However, when there is a data distribution mismatch between training and inference, neural vocoders trained on real data often degrade in voice quality for unseen scenarios. In this paper, we train three commonly used neural vocoders, including WaveNet, WaveRNN, and WaveGlow, alternately on five different datasets. To study the robustness of neural vocoders, we evaluate the models using acoustic features from seen/unseen speakers, seen/unseen languages, a text-to-speech model, and a voice conversion model. We find that WaveNet is more robust than WaveRNN, especially in the face of inconsistency between training and testing data. Through our experiments, we show that WaveNet is more suitable for text-to-speech models, and WaveRNN more suitable for voice conversion applications. Furthermore, we present subjective human evaluation results of considerable reference value for future studies.
Tasks Speech Synthesis, Voice Conversion
Published 2019-12-05
URL https://arxiv.org/abs/1912.02461v1
PDF https://arxiv.org/pdf/1912.02461v1.pdf
PWC https://paperswithcode.com/paper/towards-robust-neural-vocoding-for-speech
Repo
Framework

Learning to Dress 3D People in Generative Clothing

Title Learning to Dress 3D People in Generative Clothing
Authors Qianli Ma, Jinlong Yang, Anurag Ranjan, Sergi Pujades, Gerard Pons-Moll, Siyu Tang, Michael J. Black
Abstract Three-dimensional human body models are widely used in the analysis of human pose and motion. Existing models, however, are learned from minimally-clothed 3D scans and thus do not generalize to the complexity of dressed people in common images and videos. Additionally, current models lack the expressive power needed to represent the complex non-linear geometry of pose-dependent clothing shape. To address this, we learn a generative 3D mesh model of clothed people from 3D scans with varying pose and clothing. Specifically, we train a conditional Mesh-VAE-GAN to learn the clothing deformation from the SMPL body model, making clothing an additional term on SMPL. Our model is conditioned on both pose and clothing type, giving the ability to draw samples of clothing to dress different body shapes in a variety of styles and poses. To preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to 3D meshes. Our model, named CAPE, represents global shape and fine local structure, effectively extending the SMPL body model to clothing. To our knowledge, this is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.
Tasks
Published 2019-07-31
URL https://arxiv.org/abs/1907.13615v2
PDF https://arxiv.org/pdf/1907.13615v2.pdf
PWC https://paperswithcode.com/paper/dressing-3d-humans-using-a-conditional-mesh
Repo
Framework
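A hedged sketch of the "clothing as an additional term on SMPL" idea from the abstract: a decoder conditioned on pose and clothing type predicts per-vertex displacements that are added to the minimally-clothed SMPL surface. The decoder architecture, names, and shapes below are illustrative assumptions, not the released CAPE code.

```python
# Minimal sketch: clothing as an additive per-vertex displacement on SMPL.
import torch
import torch.nn as nn

NUM_VERTS = 6890  # SMPL template vertex count

class ClothingDecoder(nn.Module):
    def __init__(self, z_dim=64, pose_dim=72, num_types=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + pose_dim + num_types, 512), nn.ReLU(),
            nn.Linear(512, NUM_VERTS * 3),
        )

    def forward(self, z, pose, clothing_onehot):
        x = torch.cat([z, pose, clothing_onehot], dim=-1)
        return self.net(x).view(-1, NUM_VERTS, 3)  # per-vertex offsets

# clothed vertices = body vertices + predicted clothing displacement:
# body_verts = smpl(betas, pose)            # (B, 6890, 3), from an SMPL layer
# clothed_verts = body_verts + decoder(z, pose, clothing_onehot)
```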

SPEC2: SPECtral SParsE CNN Accelerator on FPGAs

Title SPEC2: SPECtral SParsE CNN Accelerator on FPGAs
Authors Yue Niu, Hanqing Zeng, Ajitesh Srivastava, Kartik Lakhotia, Rajgopal Kannan, Yanzhi Wang, Viktor Prasanna
Abstract To accelerate inference of Convolutional Neural Networks (CNNs), various techniques have been proposed to reduce computation redundancy. Converting convolutional layers into the frequency domain significantly reduces the computational complexity of the sliding-window operations in the spatial domain. On the other hand, weight pruning techniques address the redundancy in model parameters by converting dense convolutional kernels into sparse ones. To obtain high-throughput FPGA implementation, we propose SPEC2 – the first work to prune and accelerate spectral CNNs. First, we propose a systematic pruning algorithm based on the Alternating Direction Method of Multipliers (ADMM). The offline pruning iteratively sets the majority of spectral weights to zero, without using any handcrafted heuristics. Then, we design an optimized pipeline architecture on FPGA that has efficient random access into the sparse kernels and exploits various dimensions of parallelism in convolutional layers. Overall, SPEC2 achieves high inference throughput with extremely low computational complexity and negligible accuracy degradation. We demonstrate SPEC2 by pruning and implementing LeNet and VGG16 on the Xilinx Virtex platform. After pruning 75% of the spectral weights, SPEC2 achieves 0% accuracy loss for LeNet, and <1% accuracy loss for VGG16. The resulting accelerators achieve up to 24x higher throughput, compared with the state-of-the-art FPGA implementations for VGG16.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.11103v1
PDF https://arxiv.org/pdf/1910.11103v1.pdf
PWC https://paperswithcode.com/paper/spec2-spectral-sparse-cnn-accelerator-on
Repo
Framework
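A hedged sketch of the spectral-convolution idea SPEC2 builds on: by the convolution theorem, a spatial convolution becomes an elementwise product in the frequency domain, and pruned (sparse) spectral kernels skip most of those products. Tile sizes and the overlap-and-add bookkeeping used on the FPGA are omitted; this only shows the core transform, with a 75% sparsity level matching the abstract.

```python
# Minimal sketch of spectral convolution with a pruned kernel.
import numpy as np

x = np.random.randn(16, 16)   # input tile
k = np.random.randn(16, 16)   # kernel, zero-padded to the tile size

X = np.fft.fft2(x)
K = np.fft.fft2(k)

# Pruning in the spectral domain: zero out 75% of kernel coefficients.
mask = np.abs(K) >= np.quantile(np.abs(K), 0.75)
K_sparse = K * mask

# Elementwise product replaces the sliding-window convolution.
y = np.real(np.fft.ifft2(X * K_sparse))  # circular convolution of x with pruned k
```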

Learning De-biased Representations with Biased Representations

Title Learning De-biased Representations with Biased Representations
Authors Hyojin Bahng, Sanghyuk Chun, Sangdoo Yun, Jaegul Choo, Seong Joon Oh
Abstract Many machine learning algorithms are trained and evaluated by splitting data from a single source into training and test sets. While such a focus on in-distribution learning scenarios has led to interesting advances, it has not been able to tell whether models are relying on dataset biases as shortcuts for successful prediction (e.g., using snow cues for recognising snowmobiles). Such biased models fail to generalise when the bias shifts to a different class. The cross-bias generalisation problem has been addressed by de-biasing training data through augmentation or re-sampling, which are often prohibitive due to data collection costs (e.g., collecting images of a snowmobile in a desert) and the difficulty of quantifying or expressing biases in the first place. In this work, we propose a novel framework to train a de-biased representation by encouraging it to be different from a set of representations that are biased by design. This tactic is feasible in many scenarios where it is much easier to define a set of biased representations than to define and quantify bias. We demonstrate the efficacy of our method across a variety of synthetic and real-world biases. Our experiments and analyses show that the method discourages models from taking bias shortcuts, resulting in improved generalisation.
Tasks
Published 2019-10-07
URL https://arxiv.org/abs/1910.02806v2
PDF https://arxiv.org/pdf/1910.02806v2.pdf
PWC https://paperswithcode.com/paper/learning-de-biased-representations-with-1
Repo
Framework
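A hedged sketch of the training signal the abstract describes: encourage the de-biased model's representation to differ from that of a model that is biased by design. A simple cosine-similarity penalty stands in for the paper's actual independence measure, so treat this as an illustration of the idea only.

```python
# Illustrative de-biasing objective: task loss plus a penalty on
# alignment with features from a bias-characterizing model.
import torch
import torch.nn.functional as F

def debias_loss(logits, labels, feat_main, feat_biased, lam=1.0):
    task_loss = F.cross_entropy(logits, labels)
    # Penalize similarity between main and deliberately-biased features.
    sim = F.cosine_similarity(feat_main, feat_biased.detach(), dim=-1).abs().mean()
    return task_loss + lam * sim
```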

An Efficient Convolutional Neural Network for Coronary Heart Disease Prediction

Title An Efficient Convolutional Neural Network for Coronary Heart Disease Prediction
Authors Aniruddha Dutta, Tamal Batabyal, Meheli Basu, Scott T. Acton
Abstract This study proposes an efficient neural network with convolutional layers to classify significantly class-imbalanced clinical data. The data are curated from the National Health and Nutritional Examination Survey (NHANES) with the goal of predicting the occurrence of Coronary Heart Disease (CHD). While the majority of the existing machine learning models that have been used on this class of data are vulnerable to class imbalance even after the adjustment of class-specific weights, our simple two-layer CNN exhibits resilience to the imbalance with fair harmony in class-specific performance. In order to obtain significant improvement in classification accuracy under supervised learning settings, it is common practice to train a neural network architecture on massive data and thereafter test the resulting network on a comparatively smaller amount of data. However, given a highly imbalanced dataset, it is often challenging to achieve high class 1 accuracy (the true CHD prediction rate) as the testing data size increases. We adopt a two-step approach: first, we employ least absolute shrinkage and selection operator (LASSO) based feature weight assessment followed by majority-voting based identification of important features. Next, the important features are homogenized by using a fully connected layer, a crucial step before passing the output of the layer to successive convolutional stages. We also propose a training routine per epoch, akin to a simulated annealing process, to boost the classification accuracy. Despite a 35:1 (Non-CHD:CHD) ratio in the NHANES dataset, the investigation confirms that our proposed CNN architecture correctly classifies the presence of CHD with 77% accuracy and the absence of CHD with 81.8% accuracy on testing data comprising 85.70% of the total dataset. (Abstract truncated; please see the paper for the full abstract.)
Tasks Disease Prediction
Published 2019-09-01
URL https://arxiv.org/abs/1909.00489v1
PDF https://arxiv.org/pdf/1909.00489v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-convolutional-neural-network-for
Repo
Framework
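A hedged sketch of the two-step feature selection the abstract describes: fit LASSO models on repeated splits, then keep the features selected by a majority of runs. The thresholds, split counts, and regularization strength are assumptions for illustration.

```python
# Illustrative LASSO + majority-voting feature selection.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import ShuffleSplit

def majority_vote_features(X, y, n_runs=11, alpha=0.01):
    votes = np.zeros(X.shape[1])
    for train_idx, _ in ShuffleSplit(n_splits=n_runs, test_size=0.2).split(X):
        model = Lasso(alpha=alpha).fit(X[train_idx], y[train_idx])
        votes += (np.abs(model.coef_) > 1e-6)   # features with nonzero weight
    return np.where(votes > n_runs / 2)[0]      # features picked by a majority
```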

Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate

Title Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate
Authors Ugur Kursuncu, Manas Gaur, Carlos Castillo, Amanuel Alambo, K. Thirunarayan, Valerie Shalin, Dilshod Achilov, I. Budak Arpinar, Amit Sheth
Abstract Terror attacks have been linked in part to online extremist content. Although tens of thousands of Islamist extremism supporters consume such content, they are a small fraction relative to peaceful Muslims. The efforts to contain the ever-evolving extremism on social media platforms have remained inadequate and mostly ineffective. Divergent extremist and mainstream contexts challenge machine interpretation, with a particular threat to the precision of classification algorithms. Our context-aware computational approach to the analysis of extremist content on Twitter breaks down this persuasion process into building blocks that acknowledge inherent ambiguity and sparsity that likely challenge both manual and automated classification. We model this process using a combination of three contextual dimensions – religion, ideology, and hate – each elucidating a degree of radicalization and highlighting independent features to render them computationally accessible. We utilize domain-specific knowledge resources for each of these contextual dimensions, such as the Qur’an for religion, the books of extremist ideologues and preachers for political ideology, and a social media hate speech corpus for hate. Our study makes three contributions to reliable analysis: (i) Development of a computational approach rooted in the contextual dimensions of religion, ideology, and hate that reflects strategies employed by online Islamist extremist groups, (ii) An in-depth analysis of relevant tweet datasets with respect to these dimensions to exclude likely mislabeled users, and (iii) A framework for understanding online radicalization as a process to assist counter-programming. Given the potentially significant social impact, we evaluate the performance of our algorithms to minimize mislabeling, where our approach outperforms a competitive baseline by 10.2% in precision.
Tasks
Published 2019-08-18
URL https://arxiv.org/abs/1908.06520v2
PDF https://arxiv.org/pdf/1908.06520v2.pdf
PWC https://paperswithcode.com/paper/modeling-islamist-extremist-communications-on
Repo
Framework
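A hedged sketch of the three-dimensional contextual modeling: score a tweet against representations derived from each domain corpus (religion, ideology, hate) and use the per-dimension scores as features. The paper's actual knowledge resources and scoring functions are richer; the placeholder vectors and cosine scoring below are assumptions.

```python
# Illustrative per-dimension contextual scoring of a tweet.
import numpy as np

def dimension_scores(tweet_vec, corpus_vecs):
    # Cosine similarity of the tweet to each contextual-dimension corpus.
    return {
        dim: float(np.dot(tweet_vec, v)
                   / (np.linalg.norm(tweet_vec) * np.linalg.norm(v)))
        for dim, v in corpus_vecs.items()
    }

corpora = {"religion": np.random.randn(300),   # placeholder corpus embeddings
           "ideology": np.random.randn(300),
           "hate": np.random.randn(300)}
features = dimension_scores(np.random.randn(300), corpora)
```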

Leveling the Playing Field – Fairness in AI Versus Human Game Benchmarks

Title Leveling the Playing Field – Fairness in AI Versus Human Game Benchmarks
Authors Rodrigo Canaan, Christoph Salge, Julian Togelius, Andy Nealen
Abstract From the beginning of the history of AI, there has been interest in games as a platform for research. As the field developed, human-level competence in complex games became a target researchers worked to reach. Only relatively recently has this target finally been met for traditional tabletop games such as Backgammon, Chess and Go. Current research focus has shifted to electronic games, which provide unique challenges. As is often the case with AI research, these results are liable to be exaggerated or misrepresented by either authors or third parties. The extent to which these game benchmarks consist of fair competition between human and AI is also a matter of debate. In this work, we review the statements made by authors and third parties in the general media and academic circles about these game benchmark results and discuss factors that can impact the perception of fairness in the contest between humans and machines.
Tasks
Published 2019-03-17
URL https://arxiv.org/abs/1903.07008v4
PDF https://arxiv.org/pdf/1903.07008v4.pdf
PWC https://paperswithcode.com/paper/leveling-the-playing-field-fairness-in-ai
Repo
Framework

How much data is sufficient to learn high-performing algorithms?

Title How much data is sufficient to learn high-performing algorithms?
Authors Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik
Abstract Algorithms – for example for scientific analysis – typically have tunable parameters that significantly influence computational efficiency and solution quality. If a parameter setting leads to strong algorithmic performance on average over a set of training instances, that parameter setting – ideally – will perform well on previously unseen future instances. However, if the set of training instances is too small, average performance will not generalize to future performance. This raises the question: how large should this training set be? We answer this question for any algorithm satisfying an easy-to-describe, ubiquitous property: its performance is a piecewise-structured function of its parameters. We provide the first unified sample complexity framework for algorithm parameter configuration; prior research followed case-by-case analyses. We present example applications to diverse domains including biology, political science, economics, integer programming, and clustering.
Tasks
Published 2019-08-08
URL https://arxiv.org/abs/1908.02894v3
PDF https://arxiv.org/pdf/1908.02894v3.pdf
PWC https://paperswithcode.com/paper/how-much-data-is-sufficient-to-learn-high
Repo
Framework
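The piecewise-structure property drives bounds of a standard uniform-convergence shape. As a hedged illustration only (not the paper's exact theorem), if the utility functions u_rho are bounded in [0, H] and the parameterized family has pseudo-dimension d, then with probability at least 1 - delta over the draw of N training instances, average performance is close to expected performance for every parameter setting:

```latex
% Illustrative uniform-convergence form, not the paper's exact statement:
\left| \frac{1}{N} \sum_{i=1}^{N} u_{\rho}(x_i)
  - \mathbb{E}_{x \sim \mathcal{D}}\big[u_{\rho}(x)\big] \right|
\le O\!\left( H \sqrt{\frac{d + \ln(1/\delta)}{N}} \right)
\quad \text{for all parameter settings } \rho .
```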

Atalaya at TASS 2019: Data Augmentation and Robust Embeddings for Sentiment Analysis

Title Atalaya at TASS 2019: Data Augmentation and Robust Embeddings for Sentiment Analysis
Authors Franco M. Luque
Abstract In this article we describe our participation in TASS 2019, a shared task aimed at the detection of sentiment polarity of Spanish tweets. We combined different representations such as bag-of-words, bag-of-characters, and tweet embeddings. In particular, we trained robust subword-aware word embeddings and computed tweet representations using a weighted-averaging strategy. We also used two data augmentation techniques to deal with data scarcity: two-way translation augmentation, and instance crossover augmentation, a novel technique that generates new instances by combining halves of tweets. In experiments, we trained linear classifiers and ensemble models, obtaining highly competitive results despite the simplicity of our approaches.
Tasks Data Augmentation, Sentiment Analysis, Word Embeddings
Published 2019-09-25
URL https://arxiv.org/abs/1909.11241v1
PDF https://arxiv.org/pdf/1909.11241v1.pdf
PWC https://paperswithcode.com/paper/atalaya-at-tass-2019-data-augmentation-and
Repo
Framework
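A hedged sketch of the instance crossover augmentation the abstract describes: build new training instances by combining halves of two tweets that share the same sentiment label. Token-level splitting at the midpoint is an assumption; the paper may split differently.

```python
# Illustrative instance crossover augmentation for labeled tweets.
import random

def crossover(tweet_a, tweet_b):
    ta, tb = tweet_a.split(), tweet_b.split()
    # First half of one tweet joined with the second half of another.
    return " ".join(ta[: len(ta) // 2] + tb[len(tb) // 2 :])

def augment(tweets_by_label, n_new_per_label=100):
    new = []
    for label, tweets in tweets_by_label.items():
        for _ in range(n_new_per_label):
            a, b = random.sample(tweets, 2)   # same-label pair
            new.append((crossover(a, b), label))
    return new
```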

Improved Surrogates in Inertial Confinement Fusion with Manifold and Cycle Consistencies

Title Improved Surrogates in Inertial Confinement Fusion with Manifold and Cycle Consistencies
Authors Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Brian K. Spears
Abstract Neural networks have become very popular in surrogate modeling because of their ability to characterize arbitrary, high-dimensional functions in a data-driven fashion. This paper advocates for the training of surrogates that are consistent with the physical manifold, i.e., predictions are always physically meaningful, and cyclically consistent, i.e., the predictions of the surrogate, when passed through an independently trained inverse model, give back the original input parameters. We find that these two consistencies lead to surrogates that are superior in terms of predictive performance, more resilient to sampling artifacts, and tend to be more data efficient. Using Inertial Confinement Fusion (ICF) as a test bed problem, we model a 1D semi-analytic numerical simulator and demonstrate the effectiveness of our approach. Code and data are available at https://github.com/rushilanirudh/macc/
Tasks
Published 2019-12-17
URL https://arxiv.org/abs/1912.08113v1
PDF https://arxiv.org/pdf/1912.08113v1.pdf
PWC https://paperswithcode.com/paper/improved-surrogates-in-inertial-confinement
Repo
Framework
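A hedged sketch of the cycle consistency from the abstract: a surrogate maps simulator inputs to outputs, an independently trained inverse model maps outputs back, and a cycle term penalizes failure to recover the original inputs. The network definitions and the manifold-consistency term are simplified away; lambda_cyc is an assumed weight.

```python
# Illustrative cycle-consistent surrogate objective.
import torch
import torch.nn.functional as F

def surrogate_loss(surrogate, inverse_model, x, y_true, lambda_cyc=0.1):
    y_pred = surrogate(x)             # forward prediction of simulator outputs
    cycle = inverse_model(y_pred)     # map predictions back to input parameters
    pred_loss = F.mse_loss(y_pred, y_true)
    cycle_loss = F.mse_loss(cycle, x) # cycle-consistency term
    return pred_loss + lambda_cyc * cycle_loss
```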

DCGANs for Realistic Breast Mass Augmentation in X-ray Mammography

Title DCGANs for Realistic Breast Mass Augmentation in X-ray Mammography
Authors Basel Alyafi, Oliver Diaz, Robert Marti
Abstract Early detection of breast cancer contributes greatly to curability, and using mammographic images, this can be achieved non-invasively. Supervised deep learning, the dominant CADe tool currently, has played a great role in object detection in computer vision, but it suffers from a limiting property: the need for a large amount of labelled data. This becomes stricter when it comes to medical datasets, which require high-cost and time-consuming annotations. Furthermore, medical datasets are usually imbalanced, a condition that often hinders classifier performance. The aim of this paper is to learn the distribution of the minority class to synthesise new samples in order to improve lesion detection in mammography. Deep Convolutional Generative Adversarial Networks (DCGANs) can efficiently generate breast masses. They are trained on increasing-size subsets of one mammographic dataset and used to generate diverse and realistic breast masses. The effect of including the generated images and/or applying horizontal and vertical flipping is tested in an environment where a 1:10 imbalanced dataset of masses and normal tissue patches is classified by a fully-convolutional network. A maximum improvement of ~0.09 in F1 score is reported by using DCGANs along with flipping augmentation over using the original images. We show that DCGANs can be used for synthesising photo-realistic breast mass patches with considerable diversity. It is demonstrated that appending synthetic images in this environment, along with flipping, outperforms the traditional augmentation method of flipping alone, offering faster improvements as a function of the training set size.
Tasks Object Detection
Published 2019-09-04
URL https://arxiv.org/abs/1909.02062v1
PDF https://arxiv.org/pdf/1909.02062v1.pdf
PWC https://paperswithcode.com/paper/dcgans-for-realistic-breast-mass-augmentation
Repo
Framework
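A hedged sketch of the augmentation pipeline from the abstract: rebalance a 1:10 mass/normal training set by adding DCGAN-generated mass patches plus horizontal and vertical flips. `generator` stands for a trained DCGAN generator; its interface and the latent dimension are assumptions.

```python
# Illustrative minority-class augmentation with a trained DCGAN generator.
import numpy as np

def augment_minority(mass_patches, generator, n_synthetic, latent_dim=100):
    z = np.random.randn(n_synthetic, latent_dim).astype("float32")
    synthetic = generator(z)                               # generated mass patches
    flips = [np.flip(p, axis=1) for p in mass_patches]     # horizontal flips
    flips += [np.flip(p, axis=0) for p in mass_patches]    # vertical flips
    return list(mass_patches) + flips + list(synthetic)
```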

What is the Point of Fairness? Disability, AI and The Complexity of Justice

Title What is the Point of Fairness? Disability, AI and The Complexity of Justice
Authors Cynthia L. Bennett, Os Keyes
Abstract Work integrating conversations around AI and Disability is vital and valued, particularly when done through a lens of fairness. Yet at the same time, analyzing the ethical implications of AI for disabled people solely through the lens of a singular idea of “fairness” risks reinforcing existing power dynamics, either through reinforcing the position of existing medical gatekeepers, or promoting tools and techniques that benefit otherwise-privileged disabled people while harming those who are rendered outliers in multiple ways. In this paper we present two case studies from within computer vision – a subdiscipline of AI focused on training algorithms that can “see” – of technologies putatively intended to help disabled people but, through failures to consider structural injustices in their design, are likely to result in harms not addressed by a “fairness” framing of ethics. Drawing on disability studies and critical data science, we call on researchers in AI ethics and disability to move beyond simplistic notions of fairness, and towards notions of justice.
Tasks
Published 2019-08-02
URL https://arxiv.org/abs/1908.01024v3
PDF https://arxiv.org/pdf/1908.01024v3.pdf
PWC https://paperswithcode.com/paper/what-is-the-point-of-fairness-disability-ai
Repo
Framework