October 17, 2019

Paper Group ANR 750

The impact of imbalanced training data on machine learning for author name disambiguation. One “Ruler” for All Languages: Multi-Lingual Dialogue Evaluation with Adversarial Multi-Task Learning. Spatial Frequency Loss for Learning Convolutional Autoencoders. Defending against Adversarial Images using Basis Functions Transformations. T-RECS: Training …

The impact of imbalanced training data on machine learning for author name disambiguation

Title The impact of imbalanced training data on machine learning for author name disambiguation
Authors Jinseok Kim, Jenna Kim
Abstract In supervised machine learning for author name disambiguation, negative training data are often dominantly larger than positive training data. This paper examines how the ratios of negative to positive training data can affect the performance of machine learning algorithms to disambiguate author names in bibliographic records. On multiple labeled datasets, three classifiers - Logistic Regression, Naïve Bayes, and Random Forest - are trained through representative features such as coauthor names and title words extracted from the same training data but with various positive-negative training data ratios. Results show that increasing negative training data can improve disambiguation performance, but only by a few percent, and can sometimes degrade it. Logistic Regression and Naïve Bayes learn optimal disambiguation models even with a base ratio (1:1) of positive and negative training data. Also, the performance improvement by Random Forest tends to quickly saturate roughly after 1:10 ~ 1:15. These findings imply that contrary to the common practice using all training data, name disambiguation algorithms can be trained using part of negative training data without degrading much disambiguation performance while increasing computational efficiency. This study calls for more attention from author name disambiguation scholars to methods for machine learning from imbalanced data.
Tasks
Published 2018-07-30
URL http://arxiv.org/abs/1808.00525v2
PDF http://arxiv.org/pdf/1808.00525v2.pdf
PWC https://paperswithcode.com/paper/the-impact-of-imbalanced-training-data-on
Repo
Framework
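
The abstract above varies the ratio of negative to positive training pairs. A minimal sketch of how such a ratio could be imposed by random undersampling is given here; the function name `subsample_negatives`, the 0/1 label encoding, and the fixed seed are assumptions made for illustration and are not taken from the paper.

```python
import random

def subsample_negatives(pairs, labels, ratio, seed=0):
    """Keep every positive pair and at most `ratio` negatives per positive.

    Hypothetical helper: `labels` uses 1 for a positive (same-author) pair
    and 0 for a negative pair; `ratio` is the number of negatives per positive.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Sample without replacement; never ask for more negatives than exist.
    keep_neg = rng.sample(neg, min(len(neg), ratio * len(pos)))
    keep = sorted(pos + keep_neg)  # preserve the original ordering
    return [pairs[i] for i in keep], [labels[i] for i in keep]
```

Calling it with `ratio=1` gives the 1:1 base setting mentioned in the abstract, and larger values such as 10 or 15 correspond to the range where Random Forest is reported to saturate.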

One “Ruler” for All Languages: Multi-Lingual Dialogue Evaluation with Adversarial Multi-Task Learning

Title One “Ruler” for All Languages: Multi-Lingual Dialogue Evaluation with Adversarial Multi-Task Learning
Authors Xiaowei Tong, Zhenxin Fu, Mingyue Shang, Dongyan Zhao, Rui Yan
Abstract Automatically evaluating the performance of open-domain dialogue systems is a challenging problem. Recent work in neural network-based metrics has shown promising opportunities for automatic dialogue evaluation. However, existing methods mainly focus on monolingual evaluation, in which the trained metric is not flexible enough to transfer across different languages. To address this issue, we propose an adversarial multi-task neural metric (ADVMT) for multi-lingual dialogue evaluation, with shared feature extraction across languages. We evaluate the proposed model in two different languages. Experiments show that the adversarial multi-task neural metric achieves a high correlation with human annotation, which yields better performance than monolingual ones and various existing metrics.
Tasks Multi-Task Learning
Published 2018-05-08
URL http://arxiv.org/abs/1805.02914v1
PDF http://arxiv.org/pdf/1805.02914v1.pdf
PWC https://paperswithcode.com/paper/one-ruler-for-all-languages-multi-lingual
Repo
Framework

Spatial Frequency Loss for Learning Convolutional Autoencoders

Title Spatial Frequency Loss for Learning Convolutional Autoencoders
Authors Naoyuki Ichimura
Abstract This paper presents a learning method for convolutional autoencoders (CAEs) for extracting features from images. CAEs can be obtained by utilizing convolutional neural networks to learn an approximation to the identity function in an unsupervised manner. The loss function based on the pixel loss (PL) that is the mean squared error between the pixel values of original and reconstructed images is the common choice for learning. However, using the loss function leads to blurred reconstructed images. A method for learning CAEs using a loss function computed from features reflecting spatial frequencies is proposed to mitigate the problem. The blurs in reconstructed images show lack of high spatial frequency components mainly constituting edges and detailed textures that are important features for tasks such as object detection and spatial matching. In order to evaluate the lack of components, a convolutional layer with a Laplacian filter bank as weights is added to CAEs and the mean squared error of features in a subband, called the spatial frequency loss (SFL), is computed from the outputs of each filter. The learning is performed using a loss function based on the SFL. Empirical evaluation demonstrates that using the SFL reduces the blurs in reconstructed images.
Tasks Object Detection
Published 2018-06-06
URL http://arxiv.org/abs/1806.02336v1
PDF http://arxiv.org/pdf/1806.02336v1.pdf
PWC https://paperswithcode.com/paper/spatial-frequency-loss-for-learning
Repo
Framework
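
To make the contrast between the pixel loss and a spatial-frequency-based loss concrete, here is a small sketch in plain Python. It is a simplification under stated assumptions: it applies a single 3x3 Laplacian kernel rather than the Laplacian filter bank described in the abstract, and the names `filter_valid`, `pixel_loss`, and `spatial_frequency_loss` are hypothetical.

```python
# A single 4-neighbour Laplacian kernel (assumption: the paper uses a bank
# of such filters over several subbands; one kernel is enough to illustrate).
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]

def filter_valid(img, kernel):
    """Slide `kernel` over `img` (list of rows) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(kernel[a][b] * img[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]

def mse(a, b):
    """Mean squared error between two equally sized 2D lists."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def pixel_loss(original, reconstructed):
    """MSE computed directly on pixel values."""
    return mse(original, reconstructed)

def spatial_frequency_loss(original, reconstructed, kernel=LAPLACIAN):
    """MSE computed on the filter responses instead of the raw pixels."""
    return mse(filter_valid(original, kernel),
               filter_valid(reconstructed, kernel))
```

In this sketch, a reconstruction that replaces a fine checkerboard texture with its mean grey level incurs a much larger spatial frequency loss than pixel loss, which matches the abstract's point that blur corresponds to missing high-frequency components.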

Defending against Adversarial Images using Basis Functions Transformations

Title Defending against Adversarial Images using Basis Functions Transformations
Authors Uri Shaham, James Garritano, Yutaro Yamada, Ethan Weinberger, Alex Cloninger, Xiuyuan Cheng, Kelly Stanton, Yuval Kluger
Abstract We study the effectiveness of various approaches that defend against adversarial attacks on deep networks via manipulations based on basis function representations of images. Specifically, we experiment with low-pass filtering, PCA, JPEG compression, low resolution wavelet approximation, and soft-thresholding. We evaluate these defense techniques using three types of popular attacks in black, gray and white-box settings. Our results show JPEG compression tends to outperform the other tested defenses in most of the settings considered, in addition to soft-thresholding, which performs well in specific cases, and yields a more mild decrease in accuracy on benign examples. In addition, we also mathematically derive a novel white-box attack in which the adversarial perturbation is composed only of terms corresponding to a pre-determined subset of the basis functions, of which a “low frequency attack” is a special case.
Tasks
Published 2018-03-28
URL http://arxiv.org/abs/1803.10840v3
PDF http://arxiv.org/pdf/1803.10840v3.pdf
PWC https://paperswithcode.com/paper/defending-against-adversarial-images-using
Repo
Framework
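
As an illustration of one of the defenses named in the abstract above, the following sketch applies soft-thresholding to the detail coefficients of a one-level Haar wavelet transform. It works on a 1D signal of even length for brevity, whereas the paper operates on images; the choice of the Haar basis, the threshold value, and all function names are assumptions for illustration only.

```python
import math

def soft_threshold(c, t):
    """Shrink a coefficient toward zero by t; values within [-t, t] become 0."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out

def denoise(x, t):
    """Soft-threshold only the detail (high-frequency) coefficients."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, [soft_threshold(d, t) for d in detail])
```

A small high-frequency perturbation added to a piecewise-constant signal falls below the threshold and is removed, while a threshold of zero leaves the input unchanged up to floating-point error.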

T-RECS: Training for Rate-Invariant Embeddings by Controlling Speed for Action Recognition

Title T-RECS: Training for Rate-Invariant Embeddings by Controlling Speed for Action Recognition
Authors Madan Ravi Ganesh, Eric Hofesmann, Byungsu Min, Nadha Gafoor, Jason J. Corso
Abstract An action should remain identifiable when modifying its speed: consider the contrast between an expert chef and a novice chef each chopping an onion. Here, we expect the novice chef to have a relatively measured and slow approach to chopping when compared to the expert. In general, the speed at which actions are performed, whether slower or faster than average, should not dictate how they are recognized. We explore the erratic behavior caused by this phenomenon on state-of-the-art deep network-based methods for action recognition in terms of maximum performance and stability in recognition accuracy across a range of input video speeds. By observing the trends in these metrics and summarizing them based on expected temporal behaviour w.r.t. variations in input video speeds, we find two distinct types of network architectures. In this paper, we propose a preprocessing method named T-RECS, as a way to extend deep-network-based methods for action recognition to explicitly account for speed variability in the data. We do so by adaptively resampling the inputs to a given model. T-RECS is agnostic to the specific deep-network model; we apply it to four state-of-the-art action recognition architectures, C3D, I3D, TSN, and ConvNet+LSTM. On HMDB51 and UCF101, T-RECS-based I3D models show a peak improvement of at least 2.9% in performance over the baseline while T-RECS-based C3D models achieve a maximum improvement in stability by 59% over the baseline, on the HMDB51 dataset.
Tasks Temporal Action Localization
Published 2018-03-21
URL http://arxiv.org/abs/1803.08094v2
PDF http://arxiv.org/pdf/1803.08094v2.pdf
PWC https://paperswithcode.com/paper/t-recs-training-for-rate-invariant-embeddings
Repo
Framework

A syllogistic system for propositions with intermediate quantifiers

Title A syllogistic system for propositions with intermediate quantifiers
Authors Pasquale Iero, Allan Third, Paul Piwek
Abstract This paper describes a formalism that subsumes Peterson’s intermediate quantifier syllogistic system, and extends the ideas by van Eijck on Aristotle’s logic. Syllogisms are expressed in a concise form making use of and extending the Monotonicity Calculus. Contradictory and contrary relationships are added so that deduction can derive propositions expressing a form of negation.
Tasks
Published 2018-05-18
URL http://arxiv.org/abs/1805.08707v1
PDF http://arxiv.org/pdf/1805.08707v1.pdf
PWC https://paperswithcode.com/paper/a-syllogistic-system-for-propositions-with
Repo
Framework

Beyond One Glance: Gated Recurrent Architecture for Hand Segmentation

Title Beyond One Glance: Gated Recurrent Architecture for Hand Segmentation
Authors Wei Wang, Kaicheng Yu, Joachim Hugonot, Pascal Fua, Mathieu Salzmann
Abstract As mixed reality is gaining increased momentum, the development of effective and efficient solutions to egocentric hand segmentation is becoming critical. Traditional segmentation techniques typically follow a one-shot approach, where the image is passed forward only once through a model that produces a segmentation mask. This strategy, however, does not reflect the perception of humans, who continuously refine their representation of the world. In this paper, we therefore introduce a novel gated recurrent architecture. It goes beyond both iteratively passing the predicted segmentation mask through the network and adding a standard recurrent unit to it. Instead, it incorporates multiple encoder-decoder layers of the segmentation network, so as to keep track of its internal state in the refinement process. As evidenced by our results on standard hand segmentation benchmarks and on our own dataset, our approach outperforms these other, simpler recurrent segmentation techniques, as well as the state-of-the-art hand segmentation one. Furthermore, we demonstrate the generality of our approach by applying it to road segmentation, where it also outperforms other baseline methods.
Tasks Hand Segmentation
Published 2018-11-27
URL http://arxiv.org/abs/1811.10914v3
PDF http://arxiv.org/pdf/1811.10914v3.pdf
PWC https://paperswithcode.com/paper/beyond-one-glance-gated-recurrent
Repo
Framework

FDMO: Feature Assisted Direct Monocular Odometry

Title FDMO: Feature Assisted Direct Monocular Odometry
Authors Georges Younes, Daniel Asmar, John Zelek
Abstract Visual Odometry (VO) can be categorized as being either direct or feature based. When the system is calibrated photometrically, and images are captured at high rates, direct methods have been shown to outperform feature-based ones in terms of accuracy and processing time; they are also more robust to failure in feature-deprived environments. On the downside, direct methods rely on heuristic motion models to seed the estimation of camera motion between frames; in the event that these models are violated (e.g., erratic motion), direct methods easily fail. This paper proposes a novel system entitled FDMO (Feature assisted Direct Monocular Odometry), which complements the advantages of both direct and feature-based techniques. FDMO bootstraps indirect feature tracking upon the sub-pixel accurate localized direct keyframes only when failure modes (e.g., large baselines) of direct tracking occur. Control returns to direct odometry when these conditions are no longer violated. Efficiencies are introduced to help FDMO perform in real time. FDMO shows significant drift (alignment, rotation & scale) reduction when compared to DSO & ORB SLAM when evaluated using the TumMono and EuroC datasets.
Tasks Visual Odometry
Published 2018-04-15
URL http://arxiv.org/abs/1804.05422v1
PDF http://arxiv.org/pdf/1804.05422v1.pdf
PWC https://paperswithcode.com/paper/fdmo-feature-assisted-direct-monocular
Repo
Framework

Automated Early Leaderboard Generation From Comparative Tables

Title Automated Early Leaderboard Generation From Comparative Tables
Authors Mayank Singh, Rajdeep Sarkar, Atharva Vyas, Pawan Goyal, Animesh Mukherjee, Soumen Chakrabarti
Abstract A leaderboard is a tabular presentation of performance scores of the best competing techniques that address a specific scientific problem. Manually maintained leaderboards take time to emerge, which induces a latency in performance discovery and meaningful comparison. This can delay dissemination of best practices to non-experts and practitioners. Regarding papers as proxies for techniques, we present a new system to automatically discover and maintain leaderboards in the form of partial orders between papers, based on performance reported therein. In principle, a leaderboard depends on the task, data set, other experimental settings, and the choice of performance metrics. Often there are also tradeoffs between different metrics. Thus, leaderboard discovery is not just a matter of accurately extracting performance numbers and comparing them. In fact, the levels of noise and uncertainty around performance comparisons are so large that reliable traditional extraction is infeasible. We mitigate these challenges by using relatively cleaner, structured parts of the papers, e.g., performance tables. We propose a novel performance improvement graph with papers as nodes, where edges encode noisy performance comparison information extracted from tables. Every individual performance edge is extracted from a table with citations to other papers. These extractions resemble (noisy) outcomes of ‘matches’ in an incomplete tournament. We propose several approaches to rank papers from these noisy ‘match’ outcomes. We show that our ranking scheme can reproduce various manually curated leaderboards very well. Using widely-used lists of state-of-the-art papers in 27 areas of Computer Science, we demonstrate that our system produces very reliable rankings.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04538v2
PDF http://arxiv.org/pdf/1802.04538v2.pdf
PWC https://paperswithcode.com/paper/automated-early-leaderboard-generation-from
Repo
Framework
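
The abstract above treats table-derived comparisons as noisy 'match' outcomes between papers. One very simple way to rank from such outcomes is by net wins, sketched here; this is a generic baseline chosen for illustration and is not claimed to be one of the ranking approaches proposed in the paper. The `(winner, loser)` edge encoding and the function name are assumptions.

```python
from collections import defaultdict

def rank_by_net_wins(edges):
    """Rank papers by (wins - losses) over noisy pairwise comparisons.

    `edges` is an iterable of (winner, loser) pairs; repeated or
    contradictory pairs are allowed and simply accumulate.
    """
    score = defaultdict(int)
    for winner, loser in edges:
        score[winner] += 1
        score[loser] -= 1
    # Highest net score first; ties broken alphabetically for determinism.
    return sorted(score, key=lambda p: (-score[p], p))
```

Contradictory edges, such as one table reporting A above B and another reporting B above A, cancel out in the score rather than causing an error, which is one crude way to tolerate the extraction noise the abstract mentions.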

Inference under Information Constraints I: Lower Bounds from Chi-Square Contraction

Title Inference under Information Constraints I: Lower Bounds from Chi-Square Contraction
Authors Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
Abstract Multiple players are each given one independent sample, about which they can only provide limited information to a central referee. Each player is allowed to describe its observed sample to the referee using a channel from a family of channels $\mathcal{W}$, which can be instantiated to capture both the communication- and privacy-constrained settings and beyond. The referee uses the messages from players to solve an inference problem for the unknown distribution that generated the samples. We derive lower bounds for sample complexity of learning and testing discrete distributions in this information-constrained setting. Underlying our bounds is a characterization of the contraction in chi-square distances between the observed distributions of the samples when information constraints are placed. This contraction is captured in a local neighborhood in terms of chi-square and decoupled chi-square fluctuations of a given channel, two quantities we introduce. The former captures the average distance between distributions of channel output for two product distributions on the input, and the latter for a product distribution and a mixture of product distributions on the input. Our bounds are tight for both public- and private-coin protocols. Interestingly, the sample complexity of testing is order-wise higher when restricted to private-coin protocols.
Tasks
Published 2018-12-30
URL http://arxiv.org/abs/1812.11476v3
PDF http://arxiv.org/pdf/1812.11476v3.pdf
PWC https://paperswithcode.com/paper/inference-under-information-constraints-i
Repo
Framework

Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning

Title Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning
Authors Thommen George Karimpanal, Roland Bouffanais
Abstract The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In this work, we describe a novel approach for reusing previously acquired knowledge by using it to guide the exploration of an agent while it learns new tasks. In order to do so, we employ a variant of the growing self-organizing map algorithm, which is trained using a measure of similarity that is defined directly in the space of the vectorized representations of the value functions. In addition to enabling transfer across tasks, the resulting map is simultaneously used to enable the efficient storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment, and also demonstrate its utility through simple experiments using a mobile micro-robotics platform. In addition, we demonstrate the scalability of this approach, and analytically examine its relation to the proposed network growth mechanism. Further, we briefly discuss some of the possible improvements and extensions to this approach, as well as its relevance to real world scenarios in the context of continual learning.
Tasks Continual Learning
Published 2018-11-18
URL http://arxiv.org/abs/1811.08318v1
PDF http://arxiv.org/pdf/1811.08318v1.pdf
PWC https://paperswithcode.com/paper/self-organizing-maps-for-storage-and-transfer
Repo
Framework

A Survey on Neural Network-Based Summarization Methods

Title A Survey on Neural Network-Based Summarization Methods
Authors Yue Dong
Abstract Automatic text summarization, the automated process of shortening a text while preserving the main ideas of the document(s), is a critical research area in natural language processing. The aim of this literature review is to survey the recent work on neural-based models in automatic text summarization. We examine in detail ten state-of-the-art neural-based summarizers: five abstractive models and five extractive models. In addition, we discuss the related techniques that can be applied to the summarization tasks and present promising paths for future research in neural-based summarization.
Tasks Text Summarization
Published 2018-03-19
URL http://arxiv.org/abs/1804.04589v1
PDF http://arxiv.org/pdf/1804.04589v1.pdf
PWC https://paperswithcode.com/paper/a-survey-on-neural-network-based
Repo
Framework

Multi-Frame Super-Resolution Reconstruction with Applications to Medical Imaging

Title Multi-Frame Super-Resolution Reconstruction with Applications to Medical Imaging
Authors Thomas Köhler
Abstract The optical resolution of a digital camera is one of its most crucial parameters with broad relevance for consumer electronics, surveillance systems, remote sensing, or medical imaging. However, resolution is physically limited by the optics and sensor characteristics. In addition, practical and economic reasons often stipulate the use of out-dated or low-cost hardware. Super-resolution is a class of retrospective techniques that aims at high-resolution imagery by means of software. Multi-frame algorithms approach this task by fusing multiple low-resolution frames to reconstruct high-resolution images. This work covers novel super-resolution methods along with new applications in medical imaging.
Tasks Multi-Frame Super-Resolution, Super-Resolution
Published 2018-12-21
URL http://arxiv.org/abs/1812.09375v1
PDF http://arxiv.org/pdf/1812.09375v1.pdf
PWC https://paperswithcode.com/paper/multi-frame-super-resolution-reconstruction
Repo
Framework

ARBEE: Towards Automated Recognition of Bodily Expression of Emotion In the Wild

Title ARBEE: Towards Automated Recognition of Bodily Expression of Emotion In the Wild
Authors Yu Luo, Jianbo Ye, Reginald B. Adams, Jr., Jia Li, Michelle G. Newman, James Z. Wang
Abstract Humans are arguably innately prepared to comprehend others’ emotional expressions from subtle body movements. If robots or computers can be empowered with this capability, a number of robotic applications become possible. Automatically recognizing human bodily expression in unconstrained situations, however, is daunting given the incomplete understanding of the relationship between emotional expressions and body movements. The current research, as a multidisciplinary effort among computer and information sciences, psychology, and statistics, proposes a scalable and reliable crowdsourcing approach for collecting in-the-wild perceived emotion data for computers to learn to recognize body languages of humans. To accomplish this task, a large and growing annotated dataset with 9,876 video clips of body movements and 13,239 human characters, named BoLD (Body Language Dataset), has been created. Comprehensive statistical analysis of the dataset revealed many interesting insights. A system to model the emotional expressions based on bodily movements, named ARBEE (Automated Recognition of Bodily Expression of Emotion), has also been developed and evaluated. Our analysis shows the effectiveness of Laban Movement Analysis (LMA) features in characterizing arousal, and our experiments using LMA features further demonstrate computability of bodily expression. We report and compare results of several other baseline methods which were developed for action recognition based on two different modalities, body skeleton, and raw image. The dataset and findings presented in this work will likely serve as a launchpad for future discoveries in body language understanding that will enable future robots to interact and collaborate more effectively with humans.
Tasks
Published 2018-08-28
URL https://arxiv.org/abs/1808.09568v2
PDF https://arxiv.org/pdf/1808.09568v2.pdf
PWC https://paperswithcode.com/paper/arbee-towards-automated-recognition-of-bodily
Repo
Framework

Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation

Title Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation
Authors Adam Poliak, Aparajita Haldar, Rachel Rudinger, J. Edward Hu, Ellie Pavlick, Aaron Steven White, Benjamin Van Durme
Abstract We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. We refer to our collection as the DNC: Diverse Natural Language Inference Collection. The DNC is available online at https://www.decomp.net, and will grow over time as additional resources are recast and added from novel sources.
Tasks Natural Language Inference
Published 2018-04-23
URL http://arxiv.org/abs/1804.08207v2
PDF http://arxiv.org/pdf/1804.08207v2.pdf
PWC https://paperswithcode.com/paper/collecting-diverse-natural-language-inference
Repo
Framework