October 19, 2019


Paper Group ANR 302

Three Dimensional Fluorescence Microscopy Image Synthesis and Segmentation. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning. A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition. Automat …

Three Dimensional Fluorescence Microscopy Image Synthesis and Segmentation


Title	Three Dimensional Fluorescence Microscopy Image Synthesis and Segmentation
Authors	Chichen Fu, Soonam Lee, David Joon Ho, Shuo Han, Paul Salama, Kenneth W. Dunn, Edward J. Delp
Abstract	Advances in fluorescence microscopy enable acquisition of 3D image volumes with better image quality and deeper penetration into tissue. Segmentation is a required step to characterize and analyze biological structures in the images and recent 3D segmentation using deep learning has achieved promising results. One issue is that deep learning techniques require a large set of groundtruth data which is impractical to annotate manually for large 3D microscopy volumes. This paper describes a 3D deep learning nuclei segmentation method using synthetic 3D volumes for training. A set of synthetic volumes and the corresponding groundtruth are generated using spatially constrained cycle-consistent adversarial networks. Segmentation results demonstrate that our proposed method is capable of segmenting nuclei successfully for various data sets.
Tasks	Image Generation
Published	2018-01-22
URL	http://arxiv.org/abs/1801.07198v2
PDF	http://arxiv.org/pdf/1801.07198v2.pdf
PWC	https://paperswithcode.com/paper/three-dimensional-fluorescence-microscopy
Repo
Framework

A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research


Title	A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research
Authors	Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie Shalin, Amit Sheth
Abstract	Having a quality annotated corpus is essential, especially for applied research. Despite the recent focus of the Web science community on cyberbullying research, the community still does not have standard benchmarks. In this paper, we publish, first, a quality annotated corpus and, second, an offensive-words lexicon capturing different types of harassment: (i) sexual harassment, (ii) racial harassment, (iii) appearance-related harassment, (iv) intellectual harassment, and (v) political harassment. We crawled data from Twitter using our offensive lexicon. We then relied on human judgment to annotate the collected tweets w.r.t. the contextual types, because the use of offensive words is not sufficient to reliably detect harassment. Our corpus consists of 25,000 annotated tweets in five contextual types. We are pleased to share this novel annotated corpus and the lexicon with the research community. The instructions for acquiring the corpus have been published on the Git repository.
Tasks
Published	2018-02-26
URL	http://arxiv.org/abs/1802.09416v2
PDF	http://arxiv.org/pdf/1802.09416v2.pdf
PWC	https://paperswithcode.com/paper/a-quality-type-aware-annotated-corpus-and
Repo
Framework

A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning


Title	A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning
Authors	Shengdong Du, Tianrui Li, Xun Gong, Shi-Jinn Horng
Abstract	Traffic flow forecasting has been regarded as a key problem of intelligent transport systems. In this work, we propose a hybrid multimodal deep learning method for short-term traffic flow forecasting, which can jointly and adaptively learn the spatial-temporal correlation features and long temporal interdependence of multi-modality traffic data by an attention-auxiliary multimodal deep learning architecture. Given the highly nonlinear characteristics of multi-modality traffic data, the base module of our method consists of one-dimensional Convolutional Neural Networks (1D CNN) and Gated Recurrent Units (GRU) with the attention mechanism. The former captures the local trend features and the latter captures the long temporal dependencies. Then, we design a hybrid multimodal deep learning framework (HMDLF) for fusing shared representation features of different modality traffic data by multiple CNN-GRU-Attention modules. The experimental results indicate that the proposed multimodal deep learning model is capable of dealing with complex nonlinear urban traffic flow forecasting with satisfying accuracy and effectiveness.
Tasks
Published	2018-03-06
URL	http://arxiv.org/abs/1803.02099v4
PDF	http://arxiv.org/pdf/1803.02099v4.pdf
PWC	https://paperswithcode.com/paper/a-hybrid-method-for-traffic-flow-forecasting
Repo
Framework
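
As a rough illustration of the attention step mentioned in the abstract, the sketch below pools a sequence of recurrent hidden states into one context vector using softmax weights. It is not the authors' HMDLF implementation: the function name `attention_pool`, the single scoring vector `w`, and the toy inputs are all assumptions made for the example, and the CNN and GRU stages are omitted.

```python
import numpy as np

def attention_pool(hidden, w):
    """Pool hidden states of shape (T, H) into a single (H,) context vector.

    `w` is a hypothetical (H,) scoring vector; a real model would learn it.
    """
    scores = hidden @ w                  # one scalar score per time step
    scores = scores - scores.max()       # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over time
    context = weights @ hidden           # weighted sum of hidden states
    return context, weights

# Toy stand-in for GRU outputs: 3 time steps, 2 hidden units.
hidden = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w = np.array([1.0, 1.0])
context, weights = attention_pool(hidden, w)
```

Time steps with larger scores contribute more to `context`, which is the sense in which attention lets the model emphasize the most relevant parts of a long sequence.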

A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition


Title	A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition
Authors	Hao Tang, Wei-Ning Hsu, Francois Grondin, James Glass
Abstract	Speech recognizers trained on close-talking speech do not generalize to distant speech and the word error rate degradation can be as large as 40% absolute. Most studies focus on tackling distant speech recognition as a separate problem, with little effort devoted to adapting close-talking speech recognizers to distant speech. In this work, we review several approaches from a domain adaptation perspective. These approaches, including speech enhancement, multi-condition training, data augmentation, and autoencoders, all involve a transformation of the data between domains. We conduct experiments on the AMI data set, where these approaches can be realized under the same controlled setting. These approaches lead to different amounts of improvement under their respective assumptions. The purpose of this paper is to quantify and characterize the performance gap between the two domains, setting up the basis for studying adaptation of speech recognizers from close-talking speech to distant speech. Our results also have implications for improving distant speech recognition.
Tasks	Data Augmentation, Distant Speech Recognition, Domain Adaptation, Speech Enhancement, Speech Recognition
Published	2018-06-13
URL	http://arxiv.org/abs/1806.04841v1
PDF	http://arxiv.org/pdf/1806.04841v1.pdf
PWC	https://paperswithcode.com/paper/a-study-of-enhancement-augmentation-and
Repo
Framework

Automatic context window composition for distant speech recognition


Title	Automatic context window composition for distant speech recognition
Authors	Mirco Ravanelli, Maurizio Omologo
Abstract	Distant speech recognition is being revolutionized by deep learning, which has contributed to significantly outperforming previous HMM-GMM systems. A key aspect behind the rapid rise and success of DNNs is their ability to better manage large time contexts. In this regard, asymmetric context windows that embed more past than future frames have recently been used with feed-forward neural networks. This context configuration turns out to be useful not only to address low-latency speech recognition, but also to boost the recognition performance under reverberant conditions. This paper investigates the mechanisms occurring inside DNNs that lead to an effective application of asymmetric contexts. In particular, we propose a novel method for automatic context window composition based on a gradient analysis. The experiments, performed with different acoustic environments, features, DNN architectures, microphone settings, and recognition tasks, show that our simple and efficient strategy leads to a less redundant frame configuration, which makes DNN training more effective in reverberant scenarios.
Tasks	Distant Speech Recognition, Speech Recognition
Published	2018-05-26
URL	http://arxiv.org/abs/1805.10498v1
PDF	http://arxiv.org/pdf/1805.10498v1.pdf
PWC	https://paperswithcode.com/paper/automatic-context-window-composition-for
Repo
Framework
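
To make the idea of an asymmetric context window concrete, here is a minimal sketch that splices each feature frame with more past than future frames before it is fed to a feed-forward network. It only shows the windowing itself, not the gradient-based composition method the paper proposes; the function name `splice_frames`, the edge padding by frame repetition, and the window sizes are assumptions chosen for the example.

```python
import numpy as np

def splice_frames(features, n_past, n_future):
    """Stack each frame with n_past preceding and n_future following frames.

    features: (T, D) array of per-frame features.
    Edges are padded by repeating the first/last frame (an assumption).
    Returns an array of shape (T, (n_past + 1 + n_future) * D).
    """
    num_frames = features.shape[0]
    padded = np.concatenate([
        np.repeat(features[:1], n_past, axis=0),
        features,
        np.repeat(features[-1:], n_future, axis=0),
    ], axis=0)
    window = n_past + 1 + n_future
    return np.stack([padded[t:t + window].reshape(-1) for t in range(num_frames)])

# 5 frames of 2-dimensional features; 3 past frames and 1 future frame.
feats = np.arange(10, dtype=float).reshape(5, 2)
spliced = splice_frames(feats, n_past=3, n_future=1)
```

With `n_past=3` and `n_future=1`, every input vector looks only one frame ahead, which is why such windows suit low-latency recognition.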

Lifted Proximal Operator Machines


Title	Lifted Proximal Operator Machines
Authors	Jia Li, Cong Fang, Zhouchen Lin
Abstract	We propose a new optimization method for training feed-forward neural networks. By rewriting the activation function as an equivalent proximal operator, we approximate a feed-forward neural network by adding the proximal operators to the objective function as penalties; hence we call it the lifted proximal operator machine (LPOM). LPOM is block multi-convex in all layer-wise weights and activations. This allows us to use block coordinate descent to update the layer-wise weights and activations in parallel. Most notably, we only use the mapping of the activation function itself, rather than its derivatives, thus avoiding the gradient vanishing or blow-up issues in gradient-based training methods. Our method is therefore applicable to various non-decreasing Lipschitz continuous activation functions, which can be saturating and non-differentiable. LPOM does not require more auxiliary variables than the layer-wise activations, thus using roughly the same amount of memory as stochastic gradient descent (SGD) does. We further prove the convergence of updating the layer-wise weights and activations. Experiments on MNIST and CIFAR-10 datasets testify to the advantages of LPOM.
Tasks
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01501v1
PDF	http://arxiv.org/pdf/1811.01501v1.pdf
PWC	https://paperswithcode.com/paper/lifted-proximal-operator-machines
Repo
Framework
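
The phrase "rewriting the activation function as an equivalent proximal operator" can be checked numerically for one familiar case. The sketch below uses the standard fact that ReLU is the proximal operator of the indicator function of the nonnegative reals, i.e. ReLU(x) = argmin over y >= 0 of 0.5 * (y - x)^2. It is an illustration of that single identity, not the LPOM training algorithm; the helper name `prox_nonneg_grid` and the brute-force grid search are assumptions made for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def prox_nonneg_grid(x, candidates):
    """Brute-force argmin over nonnegative candidates y of 0.5 * (y - x)**2."""
    costs = 0.5 * (candidates[None, :] - x[:, None]) ** 2
    return candidates[np.argmin(costs, axis=1)]

# Nonnegative search grid with step 0.01; the constraint y >= 0 is built in.
candidates = np.linspace(0.0, 3.0, 301)
x = np.array([-2.0, -0.5, 0.0, 0.7, 2.5])
approx = prox_nonneg_grid(x, candidates)
```

For negative inputs the minimizer is pinned at 0, and for positive inputs it is the input itself, so `approx` matches `relu(x)` up to grid resolution.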

Scheduled Multi-Task Learning: From Syntax to Translation


Title	Scheduled Multi-Task Learning: From Syntax to Translation
Authors	Eliyahu Kiperwasser, Miguel Ballesteros
Abstract	Neural encoder-decoder models of machine translation have achieved impressive results, while learning linguistic knowledge of both the source and target languages in an implicit end-to-end manner. We propose a framework in which our model begins learning syntax and translation interleaved, gradually putting more focus on translation. Using this approach, we achieve considerable improvements in terms of BLEU score on a relatively large parallel corpus (WMT14 English to German) and a low-resource (WIT German to English) setup.
Tasks	Machine Translation, Multi-Task Learning
Published	2018-04-24
URL	http://arxiv.org/abs/1804.08915v1
PDF	http://arxiv.org/pdf/1804.08915v1.pdf
PWC	https://paperswithcode.com/paper/scheduled-multi-task-learning-from-syntax-to
Repo
Framework
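
One way to picture "gradually putting more focus on translation" is a schedule that sets the probability of drawing a translation batch as training progresses. The sketch below is purely illustrative: the exponential form, the function names `translation_prob` and `pick_task`, and the parameters `start` and `rate` are hypothetical and are not taken from the paper.

```python
import math
import random

def translation_prob(progress, start=0.5, rate=4.0):
    """Hypothetical schedule: probability of sampling a translation batch.

    progress: fraction of training completed, in [0, 1].
    Starts at `start` and rises monotonically toward 1.
    """
    return 1.0 - (1.0 - start) * math.exp(-rate * progress)

def pick_task(progress, rng):
    """Choose which task the next mini-batch comes from."""
    return "translation" if rng.random() < translation_prob(progress) else "syntax"

rng = random.Random(0)
early = [pick_task(0.0, rng) for _ in range(1000)]
late = [pick_task(1.0, rng) for _ in range(1000)]
```

Early in training the two tasks are interleaved roughly evenly; near the end almost every batch is a translation batch.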

Designing Adversarially Resilient Classifiers using Resilient Feature Engineering


Title	Designing Adversarially Resilient Classifiers using Resilient Feature Engineering
Authors	Kevin Eykholt, Atul Prakash
Abstract	We provide a methodology, resilient feature engineering, for creating adversarially resilient classifiers. According to existing work, adversarial attacks identify weakly correlated or non-predictive features learned by the classifier during training and design the adversarial noise to utilize these features. Therefore, highly predictive features should be used first during classification in order to determine the set of possible output labels. Our methodology reduces the problem of designing resilient classifiers to the problem of designing resilient feature extractors for these highly predictive features. We provide two theorems that support our methodology. The Serial Composition Resilience and Parallel Composition Resilience theorems show that the output of adversarially resilient feature extractors can be combined to create an equally resilient classifier. Based on our theoretical results, we outline the design of an adversarially resilient classifier.
Tasks	Feature Engineering
Published	2018-12-17
URL	http://arxiv.org/abs/1812.06626v1
PDF	http://arxiv.org/pdf/1812.06626v1.pdf
PWC	https://paperswithcode.com/paper/designing-adversarially-resilient-classifiers
Repo
Framework

Mixed Precision Training of Convolutional Neural Networks using Integer Operations


Title	Mixed Precision Training of Convolutional Neural Networks using Integer Operations
Authors	Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesus Corbal, Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov
Abstract	The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular, FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand, while a lot of research has also happened in the domain of low and mixed-precision Integer training, these works either present results for non-SOTA networks (for instance only AlexNet for ImageNet-1K), or relatively small datasets (like CIFAR-10). In this work, we train state-of-the-art visual understanding neural networks on the ImageNet-1K dataset, with Integer operations on General Purpose (GP) hardware. In particular, we focus on Integer Fused-Multiply-and-Accumulate (FMA) operations which take two pairs of INT16 operands and accumulate results into an INT32 output. We propose a shared exponent representation of tensors and develop a Dynamic Fixed Point (DFP) scheme suitable for common neural network operations. The nuances of developing an efficient integer convolution kernel are examined, including methods to handle overflow of the INT32 accumulator. We implement CNN training for ResNet-50, GoogLeNet-v1, VGG-16 and AlexNet; and these networks achieve or exceed SOTA accuracy within the same number of iterations as their FP32 counterparts without any change in hyper-parameters and with a 1.8X improvement in end-to-end training throughput. To the best of our knowledge these results represent the first INT16 training results on GP hardware for ImageNet-1K dataset using SOTA CNNs and achieve highest reported accuracy using half-precision
Tasks
Published	2018-02-03
URL	http://arxiv.org/abs/1802.00930v2
PDF	http://arxiv.org/pdf/1802.00930v2.pdf
PWC	https://paperswithcode.com/paper/mixed-precision-training-of-convolutional
Repo
Framework
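
The shared-exponent idea can be sketched in a few lines: one exponent is chosen per tensor so that the largest magnitude fits in INT16, and every element is stored as a 16-bit integer mantissa. This is a simplified reading of a dynamic fixed point format, not the paper's kernel; the function names `to_dfp16` and `from_dfp16` and the exponent rule are assumptions made for the example, and the INT32 accumulation and overflow handling are not shown.

```python
import numpy as np

def to_dfp16(x):
    """Quantize a float tensor to (int16 mantissas, shared exponent).

    The exponent rule below is an assumption: it places the largest
    magnitude in [2**14, 2**15), so it always fits in int16.
    """
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return np.zeros(x.shape, dtype=np.int16), 0
    exp = int(np.floor(np.log2(max_abs))) - 14
    q = np.clip(np.rint(x / 2.0 ** exp), -32768, 32767).astype(np.int16)
    return q, exp

def from_dfp16(q, exp):
    """Recover approximate float values: mantissa * 2**exp."""
    return q.astype(np.float32) * np.float32(2.0 ** exp)

x = np.array([0.5, -1.25, 3.0, 0.001], dtype=np.float32)
q, exp = to_dfp16(x)
x_hat = from_dfp16(q, exp)
max_err = float(np.max(np.abs(x_hat - x)))
```

Because the exponent is shared, small elements such as `0.001` keep fewer significant bits than large ones; the rounding error per element is at most half of one step, `2**(exp - 1)`.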

Learning Cheap and Novel Flight Itineraries


Title	Learning Cheap and Novel Flight Itineraries
Authors	Dmytro Karamshuk, David Matthews
Abstract	We consider the problem of efficiently constructing cheap and novel round trip flight itineraries by combining legs from different airlines. We analyse the factors that contribute towards the price of such itineraries and find that many result from the combination of just 30% of airlines and that the closer the departure of such itineraries is to the user’s search date the more likely they are to be cheaper than the tickets from one airline. We use these insights to formulate the problem as a trade-off between the recall of cheap itinerary constructions and the costs associated with building them. We propose a supervised learning solution with location embeddings which achieves an AUC=80.48, a substantial improvement over simpler baselines. We discuss various practical considerations for dealing with the staleness and the stability of the model and present the design of the machine learning pipeline. Finally, we present an analysis of the model’s performance in production and its impact on Skyscanner’s users.
Tasks
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01735v1
PDF	http://arxiv.org/pdf/1812.01735v1.pdf
PWC	https://paperswithcode.com/paper/learning-cheap-and-novel-flight-itineraries
Repo
Framework

Feature Pyramid Network for Multi-Class Land Segmentation


Title	Feature Pyramid Network for Multi-Class Land Segmentation
Authors	Selim S. Seferbekov, Vladimir I. Iglovikov, Alexander V. Buslaev, Alexey A. Shvets
Abstract	Semantic segmentation is in demand in satellite imagery processing. Because of the complex environment, automatic categorization and segmentation of land cover is a challenging problem. Solving it can help to overcome many obstacles in urban planning, environmental engineering or natural landscape monitoring. In this paper, we propose an approach for automatic multi-class land segmentation based on a fully convolutional neural network of the feature pyramid network (FPN) family. This network consists of a ResNet50 encoder pre-trained on ImageNet and a neatly developed decoder. Based on validation results, leaderboard score and our own experience, this network shows reliable results for the DEEPGLOBE - CVPR 2018 land cover classification sub-challenge. Moreover, this network uses a moderate amount of memory, which allows the whole training to be performed on GTX 1080 or 1080 Ti video cards, and it makes fairly fast predictions.
Tasks	Semantic Segmentation
Published	2018-06-09
URL	http://arxiv.org/abs/1806.03510v2
PDF	http://arxiv.org/pdf/1806.03510v2.pdf
PWC	https://paperswithcode.com/paper/feature-pyramid-network-for-multi-class-land
Repo
Framework

Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency


Title	Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency
Authors	Zhuang Ma, Michael Collins
Abstract	Noise Contrastive Estimation (NCE) is a powerful parameter estimation method for log-linear models, which avoids calculation of the partition function or its derivatives at each training step, a computationally demanding step in many cases. It is closely related to negative sampling methods, now widely used in NLP. This paper considers NCE-based estimation of conditional models. Conditional models are frequently encountered in practice; however there has not been a rigorous theoretical analysis of NCE in this setting, and we will argue there are subtle but important questions when generalizing NCE to the conditional case. In particular, we analyze two variants of NCE for conditional models: one based on a classification objective, the other based on a ranking objective. We show that the ranking-based variant of NCE gives consistent parameter estimates under weaker assumptions than the classification-based method; we analyze the statistical efficiency of the ranking-based and classification-based variants of NCE; finally we describe experiments on synthetic data and language modeling showing the effectiveness and trade-offs of both methods.
Tasks	Language Modelling, Question Answering
Published	2018-09-06
URL	http://arxiv.org/abs/1809.01812v1
PDF	http://arxiv.org/pdf/1809.01812v1.pdf
PWC	https://paperswithcode.com/paper/noise-contrastive-estimation-and-negative
Repo
Framework
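
The two NCE variants contrasted in the abstract can be written down compactly for a single training example. The sketch below follows the commonly used forms of the two objectives (a softmax over the true label plus K noise samples for ranking, and K + 1 binary decisions for classification), each applied to scores corrected by the log noise probability. It is a hedged illustration rather than the paper's exact formulation; the function names and toy numbers are assumptions.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    return -np.logaddexp(0.0, -z)

def ranking_nce_loss(scores, log_noise):
    """scores[0] belongs to the true label, scores[1:] to K noise samples."""
    corrected = scores - log_noise
    return -(corrected[0] - np.logaddexp.reduce(corrected))

def binary_nce_loss(scores, log_noise):
    """Classify the true label as data and each noise sample as noise."""
    k = len(scores) - 1
    logits = scores - log_noise - np.log(k)
    return -(log_sigmoid(logits[0]) + np.sum(log_sigmoid(-logits[1:])))

scores = np.array([2.0, 0.5, -1.0, 0.1])             # model scores s(x, y)
log_noise = np.log(np.array([0.1, 0.3, 0.4, 0.2]))   # log p_N(y) per candidate

rank_base = ranking_nce_loss(scores, log_noise)
rank_shift = ranking_nce_loss(scores + 5.0, log_noise)
bin_base = binary_nce_loss(scores, log_noise)
bin_shift = binary_nce_loss(scores + 5.0, log_noise)
```

Adding the same constant to every score for a given input leaves the ranking loss unchanged but alters the binary loss. This invariance is one intuition for why the ranking variant can get by with weaker assumptions about normalization, though the paper's formal conditions should be consulted for the precise statement.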

Reflection Analysis for Face Morphing Attack Detection


Title	Reflection Analysis for Face Morphing Attack Detection
Authors	Clemens Seibold, Anna Hilsmann, Peter Eisert
Abstract	A facial morph is a synthetically created image of a face that looks similar to two different individuals and can even trick biometric facial recognition systems into recognizing both individuals. This attack is known as a face morphing attack. The process of creating such a facial morph is well documented, and many tutorials and software tools for creating them are freely available. Therefore, it is mandatory to be able to detect this kind of fraud to ensure the integrity of the face as a reliable biometric feature. In this work, we study the effects of face morphing on the physical correctness of the illumination. We estimate the direction to the light sources based on specular highlights in the eyes and use them to generate a synthetic map for highlights on the skin. This map is compared with the highlights in the image that is suspected to be a fraud. Morphing faces with different geometries, a bad alignment of the source images, or using images with different illuminations can lead to inconsistencies in reflections that indicate the existence of a morphing attack.
Tasks
Published	2018-07-05
URL	http://arxiv.org/abs/1807.02030v1
PDF	http://arxiv.org/pdf/1807.02030v1.pdf
PWC	https://paperswithcode.com/paper/reflection-analysis-for-face-morphing-attack
Repo
Framework

Seeing isn’t Believing: Practical Adversarial Attack Against Object Detectors


Title	Seeing isn’t Believing: Practical Adversarial Attack Against Object Detectors
Authors	Yue Zhao, Hong Zhu, Ruigang Liang, Qintao Shen, Shengzhi Zhang, Kai Chen
Abstract	In this paper, we presented systematic solutions to build robust and practical AEs against real world object detectors. Particularly, for Hiding Attack (HA), we proposed the feature-interference reinforcement (FIR) method and the enhanced realistic constraints generation (ERG) to enhance robustness, and for Appearing Attack (AA), we proposed the nested-AE, which combines two AEs together to attack object detectors in both long and short distance. We also designed diverse styles of AEs to make AA more surreptitious. Evaluation results show that our AEs can attack the state-of-the-art real-time object detectors (i.e., YOLO V3 and faster-RCNN) at the success rate up to 92.4% with varying distance from 1m to 25m and angles from -60° to 60°. Our AEs are also demonstrated to be highly transferable, capable of attacking another three state-of-the-art black-box models with high success rate.
Tasks	Adversarial Attack, Autonomous Driving
Published	2018-12-26
URL	https://arxiv.org/abs/1812.10217v3
PDF	https://arxiv.org/pdf/1812.10217v3.pdf
PWC	https://paperswithcode.com/paper/practical-adversarial-attack-against-object
Repo
Framework

Prior Networks for Detection of Adversarial Attacks


Title	Prior Networks for Detection of Adversarial Attacks
Authors	Andrey Malinin, Mark Gales
Abstract	Adversarial examples are considered a serious issue for safety critical applications of AI, such as finance, autonomous vehicle control and medicinal applications. Though significant work has resulted in increased robustness of systems to these attacks, systems are still vulnerable to well-crafted attacks. To address this problem, several adversarial attack detection methods have been proposed. However, a system can still be vulnerable to adversarial samples that are designed to specifically evade these detection methods. One recent detection scheme that has shown good performance is based on uncertainty estimates derived from Monte-Carlo dropout ensembles. Prior Networks, a new method of estimating predictive uncertainty, have been shown to outperform Monte-Carlo dropout on a range of tasks. One of the advantages of this approach is that the behaviour of a Prior Network can be explicitly tuned to, for example, predict high uncertainty in regions where there are no training data samples. In this work, Prior Networks are applied to adversarial attack detection using measures of uncertainty in a similar fashion to Monte-Carlo dropout. Detection based on measures of uncertainty derived from DNNs and Monte-Carlo dropout ensembles is used as a baseline. Prior Networks are shown to significantly outperform these baseline approaches over a range of adversarial attacks, in detection under both whitebox and blackbox configurations. Even when the adversarial attacks are constructed with full knowledge of the detection mechanism, it is shown to be highly challenging to successfully generate an adversarial sample.
Tasks	Adversarial Attack
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02575v1
PDF	http://arxiv.org/pdf/1812.02575v1.pdf
PWC	https://paperswithcode.com/paper/prior-networks-for-detection-of-adversarial
Repo
Framework
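
To give a feel for the "measures of uncertainty" a Prior Network can expose, the sketch below derives a few simple quantities from the concentration parameters of a Dirichlet distribution over class probabilities. It assumes the network's logits are exponentiated to obtain the concentrations, and it covers only three easy-to-compute measures; the function name `dirichlet_uncertainty` and the example logits are hypothetical, and this is not the paper's detection pipeline.

```python
import numpy as np

def dirichlet_uncertainty(logits):
    """Simple uncertainty measures from Dirichlet concentrations.

    Assumes alpha = exp(logits); this parameterization is an assumption here.
    """
    alpha = np.exp(logits)
    alpha0 = alpha.sum()                 # precision of the Dirichlet
    mean = alpha / alpha0                # expected class probabilities
    entropy = -np.sum(mean * np.log(mean))
    return {
        "max_prob": float(mean.max()),
        "entropy": float(entropy),
        "precision": float(alpha0),
    }

confident = dirichlet_uncertainty(np.array([5.0, 0.0, 0.0]))
flat = dirichlet_uncertainty(np.array([0.0, 0.0, 0.0]))
```

A detector could threshold any of these values to flag suspicious inputs. The entropy of the expected probabilities alone cannot tell a flat distribution with low precision from a flat one with high precision, which is why the precision is reported alongside it.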