October 21, 2019

Paper Group AWR 131

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents


Title	A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
Authors	Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian
Abstract	Neural abstractive summarization models have led to promising results in summarizing relatively short documents. We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). Our approach consists of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary. Empirical results on two large-scale datasets of scientific papers show that our model significantly outperforms state-of-the-art models.
Tasks	Abstractive Text Summarization
Published	2018-04-16
URL	http://arxiv.org/abs/1804.05685v2
PDF	http://arxiv.org/pdf/1804.05685v2.pdf
PWC	https://paperswithcode.com/paper/a-discourse-aware-attention-model-for
Repo	https://github.com/acohan/long-summarization
Framework	tf

Local Spectral Graph Convolution for Point Set Feature Learning


Title	Local Spectral Graph Convolution for Point Set Feature Learning
Authors	Chu Wang, Babak Samari, Kaleem Siddiqi
Abstract	Feature learning on point clouds has shown great promise, with the introduction of effective and generalizable deep learning frameworks such as PointNet++. Thus far, however, point features have been abstracted in an independent and isolated manner, ignoring the relative layout of neighboring points as well as their features. In the present article, we propose to overcome this limitation by using spectral graph convolution on a local graph, combined with a novel graph pooling strategy. In our approach, graph convolution is carried out on a nearest neighbor graph constructed from a point’s neighborhood, such that features are jointly learned. We replace the standard max pooling step with a recursive clustering and pooling strategy, devised to aggregate information from within clusters of nodes that are close to one another in their spectral coordinates, leading to richer overall feature descriptors. Through extensive experiments on diverse datasets, we show a consistent demonstrable advantage for the tasks of both point set classification and segmentation.
Tasks
Published	2018-03-15
URL	http://arxiv.org/abs/1803.05827v1
PDF	http://arxiv.org/pdf/1803.05827v1.pdf
PWC	https://paperswithcode.com/paper/local-spectral-graph-convolution-for-point
Repo	https://github.com/fate3439/LocalSpecGCN
Framework	tf

Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents


Title	Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents
Authors	Wenhan Xiong, Xiaoxiao Guo, Mo Yu, Shiyu Chang, Bowen Zhou, William Yang Wang
Abstract	We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs. In contrast to existing methods which start with learning from demonstrations (LfD) and then use reinforcement learning (RL) to fine-tune the model parameters, we propose a novel policy optimization algorithm which dynamically schedules demonstration learning and RL. The proposed training paradigm provides efficient exploration and better generalization beyond existing methods. Compared to existing ensemble models, the best single model based on our proposed method tremendously decreases the execution error by over 50% on a block-world environment. To further illustrate the exploration strategy of our RL algorithm, we also include systematic studies on the evolution of policy entropy during training.
Tasks	Efficient Exploration
Published	2018-06-16
URL	http://arxiv.org/abs/1806.06187v2
PDF	http://arxiv.org/pdf/1806.06187v2.pdf
PWC	https://paperswithcode.com/paper/scheduled-policy-optimization-for-natural
Repo	https://github.com/clic-lab/ciff
Framework	pytorch

MLtuner: System Support for Automatic Machine Learning Tuning


Title	MLtuner: System Support for Automatic Machine Learning Tuning
Authors	Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons
Abstract	MLtuner automatically tunes settings for training tunables (such as the learning rate, the momentum, the mini-batch size, and the data staleness bound) that have a significant impact on large-scale machine learning (ML) performance. Traditionally, these tunables are set manually, which is unsurprisingly error-prone and difficult to do without extensive domain knowledge. MLtuner uses efficient snapshotting, branching, and optimization-guided online trial-and-error to find good initial settings as well as to re-tune settings during execution. Experiments show that MLtuner can robustly find and re-tune tunable settings for a variety of ML applications, including image classification (for 3 models and 2 datasets), video classification, and matrix factorization. Compared to state-of-the-art ML auto-tuning approaches, MLtuner is more robust for large problems and over an order of magnitude faster.
Tasks	Image Classification, Video Classification
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07445v1
PDF	http://arxiv.org/pdf/1803.07445v1.pdf
PWC	https://paperswithcode.com/paper/mltuner-system-support-for-automatic-machine
Repo	https://github.com/cuihenggang/geeps
Framework	none

Traffic Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting


Title	Traffic Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting
Authors	Zhiyong Cui, Kristian Henrickson, Ruimin Ke, Ziyuan Pu, Yinhai Wang
Abstract	Traffic forecasting is a particularly challenging application of spatiotemporal forecasting, due to the time-varying traffic patterns and the complicated spatial dependencies on road networks. To address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, Traffic Graph Convolutional Long Short-Term Memory Neural Network (TGC-LSTM), to learn the interactions between roadways in the traffic network and forecast the network-wide traffic state. We define the traffic graph convolution based on the physical network topology. The relationship between the proposed traffic graph convolution and the spectral graph convolution is also discussed. An L1-norm on graph convolution weights and an L2-norm on graph convolution features are added to the model’s loss function to enhance the interpretability of the proposed model. Experimental results show that the proposed model outperforms baseline methods on two real-world traffic state datasets. The visualization of the graph convolution weights indicates that the proposed framework can recognize the most influential road segments in real-world traffic networks.
Tasks	Traffic Prediction
Published	2018-02-20
URL	https://arxiv.org/abs/1802.07007v3
PDF	https://arxiv.org/pdf/1802.07007v3.pdf
PWC	https://paperswithcode.com/paper/traffic-graph-convolutional-recurrent-neural
Repo	https://github.com/zhiyongc/Seattle-Loop-Data
Framework	none
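
The abstract above says that an L1-norm on the graph convolution weights and an L2-norm on the graph convolution features are added to the loss. A minimal sketch of what such a regularized loss could look like follows. The base loss (mean squared error), the coefficient names `lam1` and `lam2`, their values, and the flattening of weights and features into plain lists are all assumptions made for illustration, not details taken from the paper.

```python
import math


def l1_norm(values):
    """Sum of absolute values; penalizing it pushes weights toward sparsity."""
    return sum(abs(v) for v in values)


def l2_norm(values):
    """Euclidean norm of a flattened feature vector."""
    return math.sqrt(sum(v * v for v in values))


def regularized_loss(pred, target, gc_weights, gc_features, lam1=0.1, lam2=0.2):
    """Prediction error plus the two norm penalties.

    lam1 and lam2 are hypothetical trade-off coefficients.
    """
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return mse + lam1 * l1_norm(gc_weights) + lam2 * l2_norm(gc_features)


loss = regularized_loss(
    pred=[1.0, 2.0],
    target=[1.0, 4.0],
    gc_weights=[1.0, -2.0],
    gc_features=[3.0, 4.0],
)
print(loss)
```

In this toy call the error term is 2.0, the L1 penalty contributes 0.1 × 3.0 and the L2 penalty contributes 0.2 × 5.0, so the total is approximately 3.3.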

A Bayesian Perspective of Statistical Machine Learning for Big Data


Title	A Bayesian Perspective of Statistical Machine Learning for Big Data
Authors	Rajiv Sambasivan, Sourish Das, Sujit K Sahu
Abstract	Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers are allowed to discover important features of input data sets which are often very large in size. The very task of feature discovery from data is essentially the meaning of the keyword ‘learning’ in SML. Theoretical justifications for the effectiveness of the SML algorithms are underpinned by sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings particularly justified by statistical inference methods are together termed as statistical learning theory. This paper provides a review of SML from a Bayesian decision theoretic point of view – where we argue that many SML techniques are closely connected to making inference by using the so-called Bayesian paradigm. We discuss many important SML techniques such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes especially in the context of very large data sets where these are often employed. We present a dictionary which maps the key concepts of SML from Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets where we also discuss many practical implementation issues. Thus the review is especially targeted at statisticians and computer scientists who are aspiring to understand and apply SML for moderately large to big data sets.
Tasks	Gaussian Processes
Published	2018-11-09
URL	http://arxiv.org/abs/1811.04788v2
PDF	http://arxiv.org/pdf/1811.04788v2.pdf
PWC	https://paperswithcode.com/paper/a-bayesian-perspective-of-statistical-machine
Repo	https://github.com/fraziezr/Machine_Learning_Final_Project_Team_12
Framework	none

FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations


Title	FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations
Authors	Roman A. Solovyev, Alexandr A. Kalinin, Alexander G. Kustov, Dmitry V. Telpukhov, Vladimir S. Ruhlov
Abstract	Neural network-based methods for image processing are becoming widely used in practical applications. Modern neural networks are computationally expensive and require specialized hardware, such as graphics processing units. Since such hardware is not always available in real life applications, there is a compelling need for the design of neural networks for mobile devices. Mobile neural networks typically have a reduced number of parameters and require a relatively small number of arithmetic operations. However, they usually still are executed at the software level and use floating-point calculations. The use of mobile networks without further optimization may not provide sufficient performance when high processing speed is required, for example, in real-time video processing (30 frames per second). In this study, we suggest optimizations to speed up computations in order to efficiently use already trained neural networks on a mobile device. Specifically, we propose an approach for speeding up neural networks by moving computation from software to hardware and by using fixed-point calculations instead of floating-point. We propose a number of methods for neural network architecture design to improve the performance with fixed-point calculations. We also show an example of how existing datasets can be modified and adapted for the recognition task at hand. Finally, we present the design and the implementation of a field-programmable gate array (FPGA)-based device to solve the practical problem of real-time handwritten digit classification from mobile camera video feed.
Tasks
Published	2018-08-29
URL	http://arxiv.org/abs/1808.09945v1
PDF	http://arxiv.org/pdf/1808.09945v1.pdf
PWC	https://paperswithcode.com/paper/fpga-implementation-of-convolutional-neural
Repo	https://github.com/ZFTurbo/Verilog-Generator-of-Neural-Net-Digit-Detector-for-FPGA
Framework	tf
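
To illustrate the idea of replacing floating-point with fixed-point calculations mentioned in the abstract above, here is a small sketch of a dot product carried out with integers only. The number of fractional bits (`FRAC_BITS = 8`) and the helper names are hypothetical choices for this example; the bit widths and rounding scheme used in the paper are not stated in this listing.

```python
FRAC_BITS = 8            # assumed number of fractional bits (illustrative only)
SCALE = 1 << FRAC_BITS   # 2**FRAC_BITS


def to_fixed(x):
    """Quantize a float to a signed fixed-point integer."""
    return int(round(x * SCALE))


def to_float(q):
    """Convert a fixed-point integer back to a float."""
    return q / SCALE


def fixed_dot(weights, inputs):
    """Dot product using only integer multiply, add, and shift."""
    acc = 0
    for w, x in zip(weights, inputs):
        acc += w * x           # each product carries 2 * FRAC_BITS fractional bits
    return acc >> FRAC_BITS    # rescale back to FRAC_BITS fractional bits


weights = [to_fixed(v) for v in (0.5, -0.25, 1.0)]
inputs = [to_fixed(v) for v in (1.0, 2.0, 0.5)]
result = to_float(fixed_dot(weights, inputs))
print(result)
```

The values in this example are exactly representable with 8 fractional bits, so the fixed-point result matches the floating-point dot product (0.5). In general, quantization introduces a small error that depends on the chosen bit width.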

U-Finger: Multi-Scale Dilated Convolutional Network for Fingerprint Image Denoising and Inpainting


Title	U-Finger: Multi-Scale Dilated Convolutional Network for Fingerprint Image Denoising and Inpainting
Authors	Ramakrishna Prabhu, Xiaojing Yu, Zhangyang Wang, Ding Liu, Anxiao Jiang
Abstract	This paper studies the challenging problem of fingerprint image denoising and inpainting. To tackle the challenge of suppressing complicated artifacts (blur, brightness, contrast, elastic transformation, occlusion, scratch, resolution, rotation, and so on) while preserving fine textures, we develop a multi-scale convolutional network, termed U-Finger. Based on the domain expertise, we show that the usage of dilated convolutions as well as the removal of padding have important positive impacts on the final restoration performance, in addition to multi-scale cascaded feature modules. Our model achieves the overall ranking of No.2 in the ECCV 2018 Chalearn LAP Inpainting Competition Track 3 (Fingerprint Denoising and Inpainting). Among all participating teams, we obtain the MSE of 0.0231 (rank 2), PSNR 16.9688 dB (rank 2), and SSIM 0.8093 (rank 3) on the hold-out testing set.
Tasks	Denoising, Image Denoising
Published	2018-07-29
URL	http://arxiv.org/abs/1807.10993v2
PDF	http://arxiv.org/pdf/1807.10993v2.pdf
PWC	https://paperswithcode.com/paper/u-finger-multi-scale-dilated-convolutional
Repo	https://github.com/rgsl888/U-Finger-A-Fingerprint-Denosing-Network
Framework	none
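
The abstract above reports results in terms of MSE and PSNR. As a reminder of how these two metrics are commonly defined, the sketch below computes them for flattened images with pixel values assumed to lie in [0, 1]. The pixel range, the function names, and the per-image computation are assumptions for illustration; the competition's exact evaluation protocol (for example, how scores are averaged over images) is not described in this listing, so the sketch is not expected to reproduce the reported numbers.

```python
import math


def mse(pred, target):
    """Mean squared error between two equally sized, flattened images."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value**2 / MSE)."""
    err = mse(pred, target)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


restored = [0.0, 0.5, 1.0, 0.5]   # toy 2x2 image, flattened
reference = [0.1, 0.5, 0.9, 0.5]
print(mse(restored, reference), psnr(restored, reference))
```

Lower MSE and higher PSNR both indicate a restoration closer to the reference image.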

StarMap for Category-Agnostic Keypoint and Viewpoint Estimation


Title	StarMap for Category-Agnostic Keypoint and Viewpoint Estimation
Authors	Xingyi Zhou, Arjun Karpur, Linjie Luo, Qixing Huang
Abstract	Semantic keypoints provide concise abstractions for a variety of visual understanding tasks. Existing methods define semantic keypoints separately for each category with a fixed number of semantic labels in fixed indices. As a result, this keypoint representation is infeasible when objects have a varying number of parts, e.g. chairs with a varying number of legs. We propose a category-agnostic keypoint representation, which combines a multi-peak heatmap (StarMap) for all the keypoints and their corresponding features as 3D locations in the canonical viewpoint (CanViewFeature) defined for each instance. Our intuition is that the 3D locations of the keypoints in canonical object views contain rich semantic and compositional information. Using our flexible representation, we demonstrate competitive performance in keypoint detection and localization compared to category-specific state-of-the-art methods. Moreover, we show that when augmented with an additional depth channel (DepthMap) to lift the 2D keypoints to 3D, our representation can achieve state-of-the-art results in viewpoint estimation. Finally, we show that our category-agnostic keypoint representation can be generalized to novel categories.
Tasks	Keypoint Detection, Viewpoint Estimation
Published	2018-03-25
URL	http://arxiv.org/abs/1803.09331v2
PDF	http://arxiv.org/pdf/1803.09331v2.pdf
PWC	https://paperswithcode.com/paper/starmap-for-category-agnostic-keypoint-and
Repo	https://github.com/xingyizhou/StarMap
Framework	pytorch

EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras


Title	EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras
Authors	Alex Zihao Zhu, Liangzhe Yuan, Kenneth Chaney, Kostas Daniilidis
Abstract	Event-based cameras have shown great promise in a variety of situations where frame based cameras suffer, such as high speed motions and high dynamic range scenes. However, developing algorithms for event measurements requires a new class of hand crafted algorithms. Deep learning has shown great success in providing model free solutions to many problems in the vision community, but existing networks have been developed with frame based images in mind, and there does not exist the wealth of labeled data for events as there does for images for supervised training. To these points, we present EV-FlowNet, a novel self-supervised deep learning pipeline for optical flow estimation for event based cameras. In particular, we introduce an image based representation of a given event stream, which is fed into a self-supervised neural network as the sole input. The corresponding grayscale images captured from the same camera at the same time as the events are then used as a supervisory signal to provide a loss function at training time, given the estimated flow from the network. We show that the resulting network is able to accurately predict optical flow from events only in a variety of different scenes, with performance competitive to image based networks. This method not only allows for accurate estimation of dense optical flow, but also provides a framework for the transfer of other self-supervised methods to the event-based domain.
Tasks	Optical Flow Estimation
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06898v4
PDF	http://arxiv.org/pdf/1802.06898v4.pdf
PWC	https://paperswithcode.com/paper/ev-flownet-self-supervised-optical-flow
Repo	https://github.com/daniilidis-group/EV-FlowNet
Framework	tf

Bird Species Classification using Transfer Learning with Multistage Training


Title	Bird Species Classification using Transfer Learning with Multistage Training
Authors	Sourya Dipta Das, Akash Kumar
Abstract	Bird species classification has received more and more attention in the field of computer vision, for its promising applications in biology and environmental studies. Recognizing bird species is difficult due to the challenges of discriminative region localization and fine-grained feature learning. In this paper, we introduce a transfer-learning-based method with multistage training. We use both a pre-trained Mask R-CNN and an ensemble model consisting of Inception networks (InceptionV3 and InceptionResNetV2) to obtain the localization and the species of the bird from the images, respectively. Our final model achieves an F1 score of 0.5567 (55.67%) on the dataset provided in the CVIP 2018 Challenge.
Tasks	Transfer Learning
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04250v2
PDF	http://arxiv.org/pdf/1810.04250v2.pdf
PWC	https://paperswithcode.com/paper/bird-species-classification-using-transfer
Repo	https://github.com/AKASH2907/bird_species_classification
Framework	tf

Recent Advances in Neural Program Synthesis


Title	Recent Advances in Neural Program Synthesis
Authors	Neel Kant
Abstract	In recent years, deep learning has made tremendous progress in a number of fields that were previously out of reach for artificial intelligence. The successes in these problems have led researchers to consider the possibilities for intelligent systems to tackle a problem that humans have only recently themselves considered: program synthesis. This challenge is unlike others such as object recognition and speech translation, since its abstract nature and demand for rigor make it difficult even for human minds to attempt. While it is still far from being solved or even competitive with most existing methods, neural program synthesis is a rapidly growing discipline which holds great promise if completely realized. In this paper, we start with exploring the problem statement and challenges of program synthesis. Then, we examine the fascinating evolution of program induction models, along with how they have succeeded, failed and been reimagined since. Finally, we conclude with a contrastive look at program synthesis and future research recommendations for the field.
Tasks	Object Recognition, Program Synthesis
Published	2018-02-07
URL	http://arxiv.org/abs/1802.02353v1
PDF	http://arxiv.org/pdf/1802.02353v1.pdf
PWC	https://paperswithcode.com/paper/recent-advances-in-neural-program-synthesis
Repo	https://github.com/qu-arx/arx-inf
Framework	none

Matching Convolutional Neural Networks without Priors about Data


Title	Matching Convolutional Neural Networks without Priors about Data
Authors	Carlos Eduardo Rosar Kos Lassance, Jean-Charles Vialatte, Vincent Gripon
Abstract	We propose an extension of Convolutional Neural Networks (CNNs) to graph-structured data, including strided convolutions and data augmentation on graphs. Our method matches the accuracy of state-of-the-art CNNs when applied on images, without any prior about their 2D regular structure. On fMRI data, we obtain a significant gain in accuracy compared with existing graph-based alternatives.
Tasks	Data Augmentation
Published	2018-02-27
URL	http://arxiv.org/abs/1802.09802v1
PDF	http://arxiv.org/pdf/1802.09802v1.pdf
PWC	https://paperswithcode.com/paper/matching-convolutional-neural-networks
Repo	https://github.com/brain-bzh/MCNN
Framework	pytorch

Neural Article Pair Modeling for Wikipedia Sub-article Matching


Title	Neural Article Pair Modeling for Wikipedia Sub-article Matching
Authors	Muhao Chen, Changping Meng, Gang Huang, Carlo Zaniolo
Abstract	Nowadays, editors tend to separate different subtopics of a long Wikipedia article into multiple sub-articles. This separation seeks to improve human readability. However, it also has a deleterious effect on many Wikipedia-based tasks that rely on the article-as-concept assumption, which requires each entity (or concept) to be described solely by one article. This underlying assumption significantly simplifies knowledge representation and extraction, and it is vital to many existing technologies such as automated knowledge base construction, cross-lingual knowledge alignment, semantic search and data lineage of Wikipedia entities. In this paper we provide an approach to match the scattered sub-articles back to their corresponding main-articles, with the intent of facilitating automated Wikipedia curation and processing. The proposed model adopts a hierarchical learning structure that combines multiple variants of neural document pair encoders with a comprehensive set of explicit features. A large crowdsourced dataset is created to support the evaluation and feature extraction for the task. Based on the large dataset, the proposed model achieves promising results of cross-validation and significantly outperforms previous approaches. Large-scale serving on the entire English Wikipedia also proves the practicability and scalability of the proposed model by effectively extracting a vast collection of newly paired main and sub-articles.
Tasks
Published	2018-07-31
URL	http://arxiv.org/abs/1807.11689v2
PDF	http://arxiv.org/pdf/1807.11689v2.pdf
PWC	https://paperswithcode.com/paper/neural-article-pair-modeling-for-wikipedia
Repo	https://github.com/muhaochen/subarticle
Framework	tf

On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples


Title	On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples
Authors	Pei-Hsuan Lu, Pin-Yu Chen, Chia-Mu Yu
Abstract	Understanding and characterizing the subspaces of adversarial examples aid in studying the robustness of deep neural networks (DNNs) to adversarial perturbations. Very recently, Ma et al. (ICLR 2018) proposed to use local intrinsic dimensionality (LID) in layer-wise hidden representations of DNNs to study adversarial subspaces. It was demonstrated that LID can be used to characterize the adversarial subspaces associated with different attack methods, e.g., the Carlini and Wagner (C&W) attack and the fast gradient sign attack. In this paper, we use MNIST and CIFAR-10 to conduct two new sets of experiments that are absent in existing LID analysis and report the limitation of LID in characterizing the corresponding adversarial subspaces, which are (i) oblivious attacks and LID analysis using adversarial examples with different confidence levels; and (ii) black-box transfer attacks. For (i), we find that the performance of LID is very sensitive to the confidence parameter deployed by an attack, and the LID learned from ensembles of adversarial examples with varying confidence levels surprisingly gives poor performance. For (ii), we find that when adversarial examples are crafted from another DNN model, LID is ineffective in characterizing their adversarial subspaces. These two findings together suggest the limited capability of LID in characterizing the subspaces of adversarial examples.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09638v1
PDF	http://arxiv.org/pdf/1803.09638v1.pdf
PWC	https://paperswithcode.com/paper/on-the-limitation-of-local-intrinsic
Repo	https://github.com/ysharma1126/EAD_Attack
Framework	tf
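
For readers unfamiliar with local intrinsic dimensionality, the sketch below shows a maximum-likelihood estimate of LID computed from the distances between a point and its k nearest neighbors, a form that is commonly used in practice. Whether this matches the exact estimator and neighborhood setup in the paper above is an assumption, and the function name `lid_mle` is hypothetical.

```python
import math


def lid_mle(distances):
    """Estimate LID from positive distances to the k nearest neighbors.

    Uses the estimate -1 / mean(log(r_i / r_k)), where r_k is the
    largest of the k distances.
    """
    r = sorted(distances)
    r_max = r[-1]
    mean_log = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log == 0:
        return math.inf  # all neighbors at the same distance
    return -1.0 / mean_log


estimate = lid_mle([1.0, 2.0, 4.0])
print(estimate)
```

For the toy distances 1, 2 and 4 the mean log-ratio is -ln 2, so the estimate is 1 / ln 2, roughly 1.44. In the setting described in the abstract, the distances would be measured between hidden-layer representations rather than raw inputs.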