October 21, 2019

Paper Group AWR 32

Batch-normalized Recurrent Highway Networks. Fixed-sized representation learning from Offline Handwritten Signatures of different sizes. Syntactic Scaffolds for Semantic Structures. Data-driven Summarization of Scientific Articles. Power of Tempospatially Unified Spectral Density for Perceptual Video Quality Assessment. M$^3$RL: Mind-aware Multi-ag …

Batch-normalized Recurrent Highway Networks


Title	Batch-normalized Recurrent Highway Networks
Authors	Chi Zhang, Thang Nguyen, Shagan Sah, Raymond Ptucha, Alexander Loui, Carl Salvaggio
Abstract	Gradient control plays an important role in feed-forward networks applied to various computer vision tasks. Previous work has shown that Recurrent Highway Networks minimize the problem of vanishing or exploding gradients. They achieve this by setting the eigenvalues of the temporal Jacobian to 1 across the time steps. In this work, batch normalized recurrent highway networks are proposed to control the gradient flow in an improved way for network convergence. Specifically, the introduced model can be formed by batch normalizing the inputs at each recurrence loop. The proposed model is tested on an image captioning task using the MSCOCO dataset. Experimental results indicate that the batch normalized recurrent highway networks converge faster and perform better compared with the traditional LSTM and RHN based models.
Tasks	Image Captioning
Published	2018-09-26
URL	http://arxiv.org/abs/1809.10271v1
PDF	http://arxiv.org/pdf/1809.10271v1.pdf
PWC	https://paperswithcode.com/paper/batch-normalized-recurrent-highway-networks
Repo	https://github.com/KurochkinAlexey/Hierarchical-Attention-Based-Recurrent-Highway-Networks-for-Time-Series-Prediction
Framework	pytorch
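
To make the idea of normalizing inside the recurrence concrete, here is a minimal sketch of one highway-style recurrent step with batch normalization. It is an illustration under stated assumptions, not the authors' implementation: the helper names, the coupled carry gate (carry = 1 - transform), the choice to normalize the summed pre-activations, and the omission of learnable scale and shift parameters are all assumptions made for this example.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize each feature to zero mean, unit variance across the batch."""
    n, dims = len(batch), len(batch[0])
    means = [sum(row[d] for row in batch) / n for d in range(dims)]
    variances = [sum((row[d] - means[d]) ** 2 for row in batch) / n
                 for d in range(dims)]
    return [[(row[d] - means[d]) / math.sqrt(variances[d] + eps)
             for d in range(dims)] for row in batch]

def linear(batch, weights):
    """Apply one weight matrix (a list of output rows) to every sample."""
    return [[sum(w * v for w, v in zip(w_row, sample)) for w_row in weights]
            for sample in batch]

def add(a, b):
    """Elementwise sum of two batches of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bn_highway_step(states, inputs, W_h, R_h, W_t, R_t):
    """One recurrence step: s_new = h * t + s * (1 - t), elementwise."""
    # Assumption: normalization is applied to the summed pre-activations.
    pre_h = batch_norm(add(linear(inputs, W_h), linear(states, R_h)))
    pre_t = batch_norm(add(linear(inputs, W_t), linear(states, R_t)))
    new_states = []
    for s_row, h_row, t_row in zip(states, pre_h, pre_t):
        row = []
        for s, ph, pt in zip(s_row, h_row, t_row):
            gate = sigmoid(pt)  # transform gate t; carry gate is 1 - t
            row.append(math.tanh(ph) * gate + s * (1.0 - gate))
        new_states.append(row)
    return new_states
```

Calling `bn_highway_step` repeatedly with the previous output as `states` gives a deeper recurrence; with a zero initial state, every updated value stays strictly inside (-1, 1) because it is a gated tanh output.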

Fixed-sized representation learning from Offline Handwritten Signatures of different sizes


Title	Fixed-sized representation learning from Offline Handwritten Signatures of different sizes
Authors	Luiz G. Hafemann, Robert Sabourin, Luiz S. Oliveira
Abstract	Methods for learning feature representations for Offline Handwritten Signature Verification have been successfully proposed in recent literature, using Deep Convolutional Neural Networks to learn representations from signature pixels. Such methods reported large performance improvements compared to handcrafted feature extractors. However, they also introduced an important constraint: the inputs to the neural networks must have a fixed size, while signatures vary significantly in size between different users. In this paper we propose addressing this issue by learning a fixed-sized representation from variable-sized signatures by modifying the network architecture, using Spatial Pyramid Pooling. We also investigate the impact of the resolution of the images used for training, and the impact of adapting (fine-tuning) the representations to new operating conditions (different acquisition protocols, such as writing instruments and scan resolution). On the GPDS dataset, we achieve results comparable with the state-of-the-art, while removing the constraint of having a maximum size for the signatures to be processed. We also show that using higher resolutions (300 or 600dpi) can improve performance when skilled forgeries from a subset of users are available for feature learning, but lower resolutions (around 100dpi) can be used if only genuine signatures are used. Lastly, we show that fine-tuning can improve performance when the operating conditions change.
Tasks	Representation Learning
Published	2018-04-02
URL	http://arxiv.org/abs/1804.00448v2
PDF	http://arxiv.org/pdf/1804.00448v2.pdf
PWC	https://paperswithcode.com/paper/fixed-sized-representation-learning-from
Repo	https://github.com/luizgh/sigver_wiwd
Framework	tf
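
As a rough illustration of how Spatial Pyramid Pooling yields a fixed-size output from variable-sized inputs, the sketch below max-pools a 2D feature map over 1x1, 2x2, and 4x4 grids and concatenates the results. The function names, the pyramid levels, and the use of plain nested lists are assumptions made for this example; they are not taken from the paper or from the sigver_wiwd repository.

```python
def max_pool_bins(feature_map, bins):
    """Max-pool a 2D map (list of rows) into a bins x bins grid, row-major."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        # Floor for the start and ceil for the end, so the bins cover the
        # whole map even when its size is not divisible by `bins`.
        r0 = (i * height) // bins
        r1 = -(-((i + 1) * height) // bins)
        for j in range(bins):
            c0 = (j * width) // bins
            c1 = -(-((j + 1) * width) // bins)
            pooled.append(max(feature_map[r][c]
                              for r in range(r0, r1)
                              for c in range(c0, c1)))
    return pooled

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Concatenate pooled grids; output length is sum(b * b for b in levels)."""
    output = []
    for bins in levels:
        output.extend(max_pool_bins(feature_map, bins))
    return output

# Two maps of different sizes produce vectors of the same length (1 + 4 + 16).
small = [[r * 7 + c for c in range(7)] for r in range(5)]
large = [[(r * c) % 11 for c in range(9)] for r in range(12)]
small_vector = spatial_pyramid_pool(small)
large_vector = spatial_pyramid_pool(large)
```

Because the number of bins is fixed while the bin size adapts to the input, the layers that follow the pooling step always see a vector of the same length.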

Syntactic Scaffolds for Semantic Structures


Title	Syntactic Scaffolds for Semantic Structures
Authors	Swabha Swayamdipta, Sam Thomson, Kenton Lee, Luke Zettlemoyer, Chris Dyer, Noah A. Smith
Abstract	We introduce the syntactic scaffold, an approach to incorporating syntactic information into semantic tasks. Syntactic scaffolds avoid expensive syntactic processing at runtime, only making use of a treebank during training, through a multitask objective. We improve over strong baselines on PropBank semantics, frame semantics, and coreference resolution, achieving competitive performance on all three tasks.
Tasks	Coreference Resolution
Published	2018-08-30
URL	http://arxiv.org/abs/1808.10485v1
PDF	http://arxiv.org/pdf/1808.10485v1.pdf
PWC	https://paperswithcode.com/paper/syntactic-scaffolds-for-semantic-structures
Repo	https://github.com/swabhs/scaffolding
Framework	pytorch

Data-driven Summarization of Scientific Articles


Title	Data-driven Summarization of Scientific Articles
Authors	Nikola I. Nikolov, Michael Pfeiffer, Richard H. R. Hahnloser
Abstract	Data-driven approaches to sequence-to-sequence modelling have been successfully applied to short text summarization of news articles. Such models are typically trained on input-summary pairs consisting of only a single or a few sentences, partially due to limited availability of multi-sentence training data. Here, we propose to use scientific articles as a new milestone for text summarization: large-scale training data come almost for free with two types of high-quality summaries at different levels - the title and the abstract. We generate two novel multi-sentence summarization datasets from scientific articles and test the suitability of a wide range of existing extractive and abstractive neural network-based summarization approaches. Our analysis demonstrates that scientific papers are suitable for data-driven text summarization. Our results could serve as valuable benchmarks for scaling sequence-to-sequence models to very long sequences.
Tasks	Text Summarization
Published	2018-04-24
URL	http://arxiv.org/abs/1804.08875v1
PDF	http://arxiv.org/pdf/1804.08875v1.pdf
PWC	https://paperswithcode.com/paper/data-driven-summarization-of-scientific
Repo	https://github.com/Santosh-Gupta/Datasets
Framework	none

Power of Tempospatially Unified Spectral Density for Perceptual Video Quality Assessment


Title	Power of Tempospatially Unified Spectral Density for Perceptual Video Quality Assessment
Authors	Mohammed A. Aabed, Gukyeong Kwon, Ghassan AlRegib
Abstract	We propose a perceptual video quality assessment (PVQA) metric for distorted videos by analyzing the power spectral density (PSD) of a group of pictures. This is an estimation approach that relies on changes in video dynamics that are calculated in the frequency domain and are primarily caused by distortion. We obtain a feature map by processing a 3D PSD tensor obtained from a set of distorted frames. This is a full-reference tempospatial approach that considers both temporal and spatial PSD characteristics. This makes it ubiquitously suitable for videos with varying motion patterns and spatial contents. Our technique does not make any assumptions about the coding conditions, streaming conditions or distortion. This approach is also computationally inexpensive, which makes it feasible for real-time and practical implementations. We validate our proposed metric by testing it on a variety of distorted sequences from PVQA databases. The results show that our metric estimates the perceptual quality at the sequence level accurately. We report the correlation coefficients with the differential mean opinion scores (DMOS) reported in the databases. The results show high and competitive correlations compared with the state-of-the-art techniques.
Tasks	Video Quality Assessment
Published	2018-12-12
URL	http://arxiv.org/abs/1812.05177v1
PDF	http://arxiv.org/pdf/1812.05177v1.pdf
PWC	https://paperswithcode.com/paper/power-of-tempospatially-unified-spectral
Repo	https://github.com/gukyeongkwon/3DPSD-VQA
Framework	none

M$^3$RL: Mind-aware Multi-agent Management Reinforcement Learning


Title	M$^3$RL: Mind-aware Multi-agent Management Reinforcement Learning
Authors	Tianmin Shu, Yuandong Tian
Abstract	Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward. In this paper, we aim to address this from a different angle. In particular, we consider scenarios where there are self-interested agents (i.e., worker agents) which have their own minds (preferences, intentions, skills, etc.) and cannot be dictated to perform tasks they do not wish to do. For achieving optimal coordination among these agents, we train a super agent (i.e., the manager) to manage them by first inferring their minds based on both current and past observations and then initiating contracts to assign suitable tasks to workers and promise to reward them with corresponding bonuses so that they will agree to work together. The objective of the manager is maximizing the overall productivity as well as minimizing payments made to the workers for ad-hoc worker teaming. To train the manager, we propose Mind-aware Multi-agent Management Reinforcement Learning (M$^3$RL), which consists of agent modeling and policy learning. We have evaluated our approach in two environments, Resource Collection and Crafting, to simulate multi-agent management problems with various task settings and multiple designs for the worker agents. The experimental results have validated the effectiveness of our approach in modeling worker agents’ minds online, and in achieving optimal ad-hoc teaming with good generalization and fast adaptation.
Tasks	Multi-agent Reinforcement Learning
Published	2018-09-29
URL	http://arxiv.org/abs/1810.00147v3
PDF	http://arxiv.org/pdf/1810.00147v3.pdf
PWC	https://paperswithcode.com/paper/m3rl-mind-aware-multi-agent-management-1
Repo	https://github.com/facebookresearch/M3RL
Framework	pytorch

Robust Audio Adversarial Example for a Physical Attack


Title	Robust Audio Adversarial Example for a Physical Attack
Authors	Hiromu Yakura, Jun Sakuma
Abstract	We propose a method to generate audio adversarial examples that can attack a state-of-the-art speech recognition model in the physical world. Previous work assumes that generated adversarial examples are directly fed to the recognition model, and is not able to perform such a physical attack because of reverberation and noise from playback environments. In contrast, our method obtains robust adversarial examples by simulating transformations caused by playback or recording in the physical world and incorporating the transformations into the generation process. Evaluation and a listening experiment demonstrated that our adversarial examples are able to attack without being noticed by humans. This result suggests that audio adversarial examples generated by the proposed method may become a real threat.
Tasks	Speech Recognition
Published	2018-10-28
URL	https://arxiv.org/abs/1810.11793v4
PDF	https://arxiv.org/pdf/1810.11793v4.pdf
PWC	https://paperswithcode.com/paper/robust-audio-adversarial-example-for-a
Repo	https://github.com/hiromu/robust_audio_ae
Framework	tf

A Trajectory Calculus for Qualitative Spatial Reasoning Using Answer Set Programming


Title	A Trajectory Calculus for Qualitative Spatial Reasoning Using Answer Set Programming
Authors	George Baryannis, Ilias Tachmazidis, Sotiris Batsakis, Grigoris Antoniou, Mario Alviano, Timos Sellis, Pei-Wei Tsai
Abstract	Spatial information is often expressed using qualitative terms such as natural language expressions instead of coordinates; reasoning over such terms has several practical applications, such as bus route planning. Representing and reasoning on trajectories is a specific case of qualitative spatial reasoning that focuses on moving objects and their paths. In this work, we propose two versions of a trajectory calculus based on the allowed properties over trajectories, where trajectories are defined as a sequence of non-overlapping regions of a partitioned map. More specifically, if a given trajectory is allowed to start and finish at the same region, 6 base relations are defined (TC-6). If a given trajectory should have different start and finish regions but cycles are allowed within, 10 base relations are defined (TC-10). Both versions of the calculus are implemented as ASP programs; we propose several different encodings, including a generalised program capable of encoding any qualitative calculus in ASP. All proposed encodings are experimentally evaluated using a real-world dataset. Experimental results show that the best performing implementation can scale up to an input of 250 trajectories for TC-6 and 150 trajectories for TC-10 for the problem of discovering a consistent configuration, a significant improvement compared to previous ASP implementations for similar qualitative spatial and temporal calculi. This manuscript is under consideration for acceptance in TPLP.
Tasks
Published	2018-04-19
URL	http://arxiv.org/abs/1804.07088v1
PDF	http://arxiv.org/pdf/1804.07088v1.pdf
PWC	https://paperswithcode.com/paper/a-trajectory-calculus-for-qualitative-spatial
Repo	https://github.com/gmparg/ICLP2018
Framework	none

Tensor-Tensor Product Toolbox


Title	Tensor-Tensor Product Toolbox
Authors	Canyi Lu
Abstract	The tensor-tensor product (t-product) [M. E. Kilmer and C. D. Martin, 2011] is a natural generalization of matrix multiplication. Based on the t-product, many operations on matrices can be extended to tensors, including tensor SVD, tensor spectral norm, tensor nuclear norm [C. Lu, et al., 2018] and many others. The linear algebraic structure of tensors is similar to that of matrices. We develop a Matlab toolbox to implement several basic operations on tensors based on the t-product. The toolbox is available at https://github.com/canyilu/tproduct.
Tasks
Published	2018-06-17
URL	http://arxiv.org/abs/1806.07247v2
PDF	http://arxiv.org/pdf/1806.07247v2.pdf
PWC	https://paperswithcode.com/paper/tensor-tensor-product-toolbox
Repo	https://github.com/canyilu/tproduct
Framework	none
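
To show in what sense the t-product generalizes matrix multiplication, the sketch below computes it directly from its definition as a circular convolution over frontal slices: slice k of the result is the sum over j of A[j] times B[(k - j) mod n3]. This is a plain-Python illustration, not the toolbox's Matlab code, and implementations commonly compute the same product through an FFT along the third dimension instead. Storing a tensor as a list of frontal slices, and the function names, are assumptions made for this example.

```python
def matmul(X, Y):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(X[i][p] * Y[p][j] for p in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Elementwise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) and B (n2 x n4 x n3).

    Each tensor is a list of n3 frontal slices; A[k] is an n1 x n2 matrix.
    """
    n3 = len(A)
    n1, n4 = len(A[0]), len(B[0][0])
    C = []
    for k in range(n3):
        acc = [[0] * n4 for _ in range(n1)]
        for j in range(n3):
            # Circular indexing along the third dimension.
            acc = matadd(acc, matmul(A[j], B[(k - j) % n3]))
        C.append(acc)
    return C

# With a single frontal slice, the t-product is ordinary matrix multiplication.
single = t_product([[[1, 2], [3, 4]]], [[[5, 6], [7, 8]]])

# The identity tensor has the identity matrix as its first slice and zeros elsewhere.
identity = [[[1, 0], [0, 1]], [[0, 0], [0, 0]]]
B = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
unchanged = t_product(identity, B)
```

For 1 x 1 x n3 tensors the operation reduces to the circular convolution of two length-n3 vectors, which is why an FFT along the third dimension turns it into independent matrix products.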

Towards automatic initialization of registration algorithms using simulated endoscopy images


Title	Towards automatic initialization of registration algorithms using simulated endoscopy images
Authors	Ayushi Sinha, Masaru Ishii, Russell H. Taylor, Gregory D. Hager, Austin Reiter
Abstract	Registering images from different modalities is an active area of research in computer aided medical interventions. Several registration algorithms have been developed, many of which achieve high accuracy. However, these results are dependent on many factors, including the quality of the extracted features or segmentations being registered as well as the initial alignment. Although several methods have been developed towards improving segmentation algorithms and automating the segmentation process, few automatic initialization algorithms have been explored. In many cases, the initial alignment from which a registration is initiated is performed manually, which interferes with the clinical workflow. Our aim is to use scene classification in endoscopic procedures to achieve coarse alignment of the endoscope and a preoperative image of the anatomy. In this paper, we show using simulated scenes that a neural network can predict the region of anatomy (with respect to a preoperative image) that the endoscope is located in by observing a single endoscopic video frame. With limited training and without any hyperparameter tuning, our method achieves an accuracy of 76.53 (+/-1.19)%. There are several avenues for improvement, making this a promising direction of research. Code is available at https://github.com/AyushiSinha/AutoInitialization.
Tasks	Scene Classification
Published	2018-06-28
URL	http://arxiv.org/abs/1806.10748v1
PDF	http://arxiv.org/pdf/1806.10748v1.pdf
PWC	https://paperswithcode.com/paper/towards-automatic-initialization-of
Repo	https://github.com/AyushiSinha/AutoInitialization
Framework	pytorch

Are You Tampering With My Data?


Title	Are You Tampering With My Data?
Authors	Michele Alberti, Vinaychandran Pondenkandath, Marcel Würsch, Manuel Bouillon, Mathias Seuret, Rolf Ingold, Marcus Liwicki
Abstract	We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering with the data used for training instead of generating attacks on trained models. Our network-agnostic method creates a backdoor during training which can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate on two widely used datasets (CIFAR-10 and SVHN) that a universal modification of just one pixel per image for all the images of a class in the training set is enough to corrupt the training procedure of several state-of-the-art deep neural networks, causing the networks to misclassify any images to which the modification is applied. Our aim is to bring to the attention of the machine learning community the possibility that even learning-based methods that are personally trained on public datasets can be subject to attacks by a skillful adversary.
Tasks
Published	2018-08-21
URL	http://arxiv.org/abs/1808.06809v1
PDF	http://arxiv.org/pdf/1808.06809v1.pdf
PWC	https://paperswithcode.com/paper/are-you-tampering-with-my-data
Repo	https://github.com/vinaychandranp/Are-You-Tampering-With-My-Data
Framework	pytorch

Multi-chart Generative Surface Modeling


Title	Multi-chart Generative Surface Modeling
Authors	Heli Ben-Hamu, Haggai Maron, Itay Kezurer, Gal Avineri, Yaron Lipman
Abstract	This paper introduces a 3D shape generative model based on deep neural networks. A new image-like (i.e., tensor) data representation for genus-zero 3D shapes is devised. It is based on the observation that complicated shapes can be well represented by multiple parameterizations (charts), each focusing on a different part of the shape. The new tensor data representation is used as input to Generative Adversarial Networks for the task of 3D shape generation. The 3D shape tensor representation is based on a multi-chart structure that enjoys a shape covering property and scale-translation rigidity. Scale-translation rigidity facilitates high quality 3D shape learning and guarantees unique reconstruction. The multi-chart structure uses as input a dataset of 3D shapes (with arbitrary connectivity) and a sparse correspondence between them. The output of our algorithm is a generative model that learns the shape distribution and is able to generate novel shapes, interpolate shapes, and explore the generated shape space. The effectiveness of the method is demonstrated for the task of anatomic shape generation including human body and bone (teeth) shape generation.
Tasks	3D Shape Generation
Published	2018-06-06
URL	http://arxiv.org/abs/1806.02143v3
PDF	http://arxiv.org/pdf/1806.02143v3.pdf
PWC	https://paperswithcode.com/paper/multi-chart-generative-surface-modeling
Repo	https://github.com/helibenhamu/multichart3dgans
Framework	tf

Live Blog Corpus for Summarization


Title	Live Blog Corpus for Summarization
Authors	Avinesh P. V. S., Maxime Peyrard, Christian M. Meyer
Abstract	Live blogs are an increasingly popular news format to cover breaking news and live events in online journalism. Online news websites around the world are using this medium to give their readers a minute-by-minute update on an event. Good summaries enhance the value of the live blogs for a reader but are often not available. In this paper, we study a way of collecting corpora for automatic live blog summarization. In an empirical evaluation using well-known state-of-the-art summarization systems, we show that the live blog corpus poses new challenges in the field of summarization. We make our tools for reconstructing the corpus publicly available to encourage the research community to replicate our results.
Tasks
Published	2018-02-27
URL	http://arxiv.org/abs/1802.09884v1
PDF	http://arxiv.org/pdf/1802.09884v1.pdf
PWC	https://paperswithcode.com/paper/live-blog-corpus-for-summarization
Repo	https://github.com/UKPLab/lrec2018-live-blog-corpus
Framework	none

Deep Neural Networks Motivated by Partial Differential Equations


Title	Deep Neural Networks Motivated by Partial Differential Equations
Authors	Lars Ruthotto, Eldad Haber
Abstract	Partial differential equations (PDEs) are indispensable for modeling many physical phenomena and are also commonly used for solving image processing tasks. In the latter area, PDE-based approaches interpret image data as discretizations of multivariate functions and the output of image processing algorithms as solutions to certain PDEs. Posing image processing problems in the infinite dimensional setting provides powerful tools for their analysis and solution. Over the last few decades, the reinterpretation of classical image processing problems through the PDE lens has produced multiple celebrated approaches that benefit a vast area of tasks including image segmentation, denoising, registration, and reconstruction. In this paper, we establish a new PDE-interpretation of a class of deep convolutional neural networks (CNN) that are commonly used to learn from speech, image, and video data. Our interpretation includes convolutional residual neural networks (ResNet), which are among the most promising approaches for tasks such as image classification, having improved the state-of-the-art performance in prestigious benchmark challenges. Despite their recent successes, deep ResNets still face some critical challenges associated with their design, immense computational costs and memory requirements, and lack of understanding of their reasoning. Guided by well-established PDE theory, we derive three new ResNet architectures that fall into two new classes: parabolic and hyperbolic CNNs. We demonstrate how PDE theory can provide new insights and algorithms for deep learning and demonstrate the competitiveness of three new CNN architectures using numerical experiments.
Tasks	Denoising, Image Classification, Semantic Segmentation
Published	2018-04-12
URL	http://arxiv.org/abs/1804.04272v2
PDF	http://arxiv.org/pdf/1804.04272v2.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-motivated-by-partial
Repo	https://github.com/EmoryMLIP/DynamicBlocks
Framework	pytorch
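
The link between ResNets and differential equations is often written as follows. A residual block adds a learned update to its input, which has the same form as a forward Euler step of an initial value problem in the features. The notation here is assumed for illustration and does not come from the abstract: $Y_j$ denotes the features at layer $j$, $\theta_j$ the layer weights, $F$ the layer function, and $h$ a step size.

```latex
\begin{aligned}
Y_{j+1} &= Y_j + h\,F(\theta_j, Y_j), \qquad j = 0, \dots, N-1, \\
\partial_t Y(t) &= F(\theta(t), Y(t)), \qquad Y(0) = Y_0 .
\end{aligned}
```

Read this way, choosing a network architecture amounts to choosing the right-hand side $F$ and a discretization scheme, which is the setting in which stability results from differential equation theory can be brought to bear.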

Generating Natural Language Adversarial Examples


Title	Generating Natural Language Adversarial Examples
Authors	Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava, Kai-Wei Chang
Abstract	Deep neural networks (DNNs) are vulnerable to adversarial examples, perturbations to correctly classified examples which can cause the model to misclassify. In the image domain, these perturbations are often virtually indistinguishable to human perception, causing humans and state-of-the-art models to disagree. However, in the natural language domain, small perturbations are clearly perceptible, and the replacement of a single word can drastically alter the semantics of the document. Given these challenges, we use a black-box population-based optimization algorithm to generate semantically and syntactically similar adversarial examples that fool well-trained sentiment analysis and textual entailment models with success rates of 97% and 70%, respectively. We additionally demonstrate that 92.3% of the successful sentiment analysis adversarial examples are classified to their original label by 20 human annotators, and that the examples are perceptibly quite similar. Finally, we discuss an attempt to use adversarial training as a defense, but fail to yield improvement, demonstrating the strength and diversity of our adversarial examples. We hope our findings encourage researchers to pursue improving the robustness of DNNs in the natural language domain.
Tasks	Natural Language Inference, Sentiment Analysis
Published	2018-04-21
URL	http://arxiv.org/abs/1804.07998v2
PDF	http://arxiv.org/pdf/1804.07998v2.pdf
PWC	https://paperswithcode.com/paper/generating-natural-language-adversarial
Repo	https://github.com/alankarj/robust_nlp
Framework	none