Paper Group AWR 131
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
Title | A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents |
Authors | Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian |
Abstract | Neural abstractive summarization models have led to promising results in summarizing relatively short documents. We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). Our approach consists of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary. Empirical results on two large-scale datasets of scientific papers show that our model significantly outperforms state-of-the-art models. |
Tasks | Abstractive Text Summarization |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05685v2 |
http://arxiv.org/pdf/1804.05685v2.pdf | |
PWC | https://paperswithcode.com/paper/a-discourse-aware-attention-model-for |
Repo | https://github.com/acohan/long-summarization |
Framework | tf |
Local Spectral Graph Convolution for Point Set Feature Learning
Title | Local Spectral Graph Convolution for Point Set Feature Learning |
Authors | Chu Wang, Babak Samari, Kaleem Siddiqi |
Abstract | Feature learning on point clouds has shown great promise, with the introduction of effective and generalizable deep learning frameworks such as PointNet++. Thus far, however, point features have been abstracted in an independent and isolated manner, ignoring the relative layout of neighboring points as well as their features. In the present article, we propose to overcome this limitation by using spectral graph convolution on a local graph, combined with a novel graph pooling strategy. In our approach, graph convolution is carried out on a nearest neighbor graph constructed from a point’s neighborhood, such that features are jointly learned. We replace the standard max pooling step with a recursive clustering and pooling strategy, devised to aggregate information from within clusters of nodes that are close to one another in their spectral coordinates, leading to richer overall feature descriptors. Through extensive experiments on diverse datasets, we show a consistent demonstrable advantage for the tasks of both point set classification and segmentation. |
Tasks | |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05827v1 |
http://arxiv.org/pdf/1803.05827v1.pdf | |
PWC | https://paperswithcode.com/paper/local-spectral-graph-convolution-for-point |
Repo | https://github.com/fate3439/LocalSpecGCN |
Framework | tf |
Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents
Title | Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents |
Authors | Wenhan Xiong, Xiaoxiao Guo, Mo Yu, Shiyu Chang, Bowen Zhou, William Yang Wang |
Abstract | We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs. In contrast to existing methods which start with learning from demonstrations (LfD) and then use reinforcement learning (RL) to fine-tune the model parameters, we propose a novel policy optimization algorithm which dynamically schedules demonstration learning and RL. The proposed training paradigm provides efficient exploration and better generalization beyond existing methods. Compared to existing ensemble models, the best single model based on our proposed method tremendously decreases the execution error by over 50% on a block-world environment. To further illustrate the exploration strategy of our RL algorithm, we also include systematic studies on the evolution of policy entropy during training. |
Tasks | Efficient Exploration |
Published | 2018-06-16 |
URL | http://arxiv.org/abs/1806.06187v2 |
http://arxiv.org/pdf/1806.06187v2.pdf | |
PWC | https://paperswithcode.com/paper/scheduled-policy-optimization-for-natural |
Repo | https://github.com/clic-lab/ciff |
Framework | pytorch |
MLtuner: System Support for Automatic Machine Learning Tuning
Title | MLtuner: System Support for Automatic Machine Learning Tuning |
Authors | Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons |
Abstract | MLtuner automatically tunes settings for training tunables (such as the learning rate, the momentum, the mini-batch size, and the data staleness bound) that have a significant impact on large-scale machine learning (ML) performance. Traditionally, these tunables are set manually, which is unsurprisingly error-prone and difficult to do without extensive domain knowledge. MLtuner uses efficient snapshotting, branching, and optimization-guided online trial-and-error to find good initial settings as well as to re-tune settings during execution. Experiments show that MLtuner can robustly find and re-tune tunable settings for a variety of ML applications, including image classification (for 3 models and 2 datasets), video classification, and matrix factorization. Compared to state-of-the-art ML auto-tuning approaches, MLtuner is more robust for large problems and over an order of magnitude faster. |
Tasks | Image Classification, Video Classification |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07445v1 |
http://arxiv.org/pdf/1803.07445v1.pdf | |
PWC | https://paperswithcode.com/paper/mltuner-system-support-for-automatic-machine |
Repo | https://github.com/cuihenggang/geeps |
Framework | none |
Traffic Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting
Title | Traffic Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting |
Authors | Zhiyong Cui, Kristian Henrickson, Ruimin Ke, Ziyuan Pu, Yinhai Wang |
Abstract | Traffic forecasting is a particularly challenging application of spatiotemporal forecasting, due to the time-varying traffic patterns and the complicated spatial dependencies on road networks. To address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, Traffic Graph Convolutional Long Short-Term Memory Neural Network (TGC-LSTM), to learn the interactions between roadways in the traffic network and forecast the network-wide traffic state. We define the traffic graph convolution based on the physical network topology. The relationship between the proposed traffic graph convolution and the spectral graph convolution is also discussed. An L1-norm on graph convolution weights and an L2-norm on graph convolution features are added to the model’s loss function to enhance the interpretability of the proposed model. Experimental results show that the proposed model outperforms baseline methods on two real-world traffic state datasets. The visualization of the graph convolution weights indicates that the proposed framework can recognize the most influential road segments in real-world traffic networks. |
Tasks | Traffic Prediction |
Published | 2018-02-20 |
URL | https://arxiv.org/abs/1802.07007v3 |
https://arxiv.org/pdf/1802.07007v3.pdf | |
PWC | https://paperswithcode.com/paper/traffic-graph-convolutional-recurrent-neural |
Repo | https://github.com/zhiyongc/Seattle-Loop-Data |
Framework | none |
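The regularized objective described in the abstract above can be illustrated with a minimal sketch. It assumes a mean-squared-error base loss and uses hypothetical weighting coefficients `lam1` and `lam2`; none of the function names or values below come from the paper or its repository.

```python
import math


def l1_norm(values):
    """Sum of absolute values (encourages sparse graph convolution weights)."""
    return sum(abs(v) for v in values)


def l2_norm(values):
    """Euclidean norm (penalizes large graph convolution features)."""
    return math.sqrt(sum(v * v for v in values))


def regularized_loss(predictions, targets, conv_weights, conv_features,
                     lam1=0.1, lam2=0.1):
    """Base loss plus L1 and L2 penalty terms.

    The MSE base loss and the lam1/lam2 coefficients are illustrative
    assumptions, not settings taken from the paper.
    """
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
    return mse + lam1 * l1_norm(conv_weights) + lam2 * l2_norm(conv_features)


loss = regularized_loss(
    predictions=[1.0, 2.0],
    targets=[1.0, 4.0],
    conv_weights=[0.5, -0.5],
    conv_features=[3.0, 4.0],
)
```

In this toy call the base error is 2.0, the L1 term contributes 0.1 and the L2 term contributes 0.5, so the penalties shift the total without dominating it.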
A Bayesian Perspective of Statistical Machine Learning for Big Data
Title | A Bayesian Perspective of Statistical Machine Learning for Big Data |
Authors | Rajiv Sambasivan, Sourish Das, Sujit K Sahu |
Abstract | Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers are allowed to discover important features of input data sets which are often very large in size. The very task of feature discovery from data is essentially the meaning of the keyword ‘learning’ in SML. Theoretical justifications for the effectiveness of the SML algorithms are underpinned by sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings particularly justified by statistical inference methods are together termed as statistical learning theory. This paper provides a review of SML from a Bayesian decision theoretic point of view – where we argue that many SML techniques are closely connected to making inference by using the so-called Bayesian paradigm. We discuss many important SML techniques such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes, especially in the context of very large data sets where these are often employed. We present a dictionary which maps the key concepts of SML from Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets where we also discuss many practical implementation issues. Thus the review is especially targeted at statisticians and computer scientists who are aspiring to understand and apply SML for moderately large to big data sets. |
Tasks | Gaussian Processes |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.04788v2 |
http://arxiv.org/pdf/1811.04788v2.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-perspective-of-statistical-machine |
Repo | https://github.com/fraziezr/Machine_Learning_Final_Project_Team_12 |
Framework | none |
FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations
Title | FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations |
Authors | Roman A. Solovyev, Alexandr A. Kalinin, Alexander G. Kustov, Dmitry V. Telpukhov, Vladimir S. Ruhlov |
Abstract | Neural network-based methods for image processing are becoming widely used in practical applications. Modern neural networks are computationally expensive and require specialized hardware, such as graphics processing units. Since such hardware is not always available in real-life applications, there is a compelling need for the design of neural networks for mobile devices. Mobile neural networks typically have a reduced number of parameters and require a relatively small number of arithmetic operations. However, they are usually still executed at the software level and use floating-point calculations. The use of mobile networks without further optimization may not provide sufficient performance when high processing speed is required, for example, in real-time video processing (30 frames per second). In this study, we suggest optimizations to speed up computations in order to efficiently use already trained neural networks on a mobile device. Specifically, we propose an approach for speeding up neural networks by moving computation from software to hardware and by using fixed-point calculations instead of floating-point. We propose a number of methods for neural network architecture design to improve the performance with fixed-point calculations. We also show an example of how existing datasets can be modified and adapted for the recognition task at hand. Finally, we present the design and the implementation of a field-programmable gate array (FPGA)-based device to solve the practical problem of real-time handwritten digit classification from mobile camera video feed. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09945v1 |
http://arxiv.org/pdf/1808.09945v1.pdf | |
PWC | https://paperswithcode.com/paper/fpga-implementation-of-convolutional-neural |
Repo | https://github.com/ZFTurbo/Verilog-Generator-of-Neural-Net-Digit-Detector-for-FPGA |
Framework | tf |
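The replacement of floating-point with fixed-point arithmetic mentioned in the abstract above can be sketched in a few lines. This is a generic illustration of the idea, assuming 8 fractional bits; `FRAC_BITS` and the helper names are hypothetical and are not taken from the paper or its Verilog generator.

```python
# Minimal fixed-point sketch. FRAC_BITS and the helper names are illustrative
# assumptions, not values or functions from the paper or its repository.
FRAC_BITS = 8            # number of fractional bits (hypothetical choice)
SCALE = 1 << FRAC_BITS   # 2**8 = 256


def to_fixed(x: float) -> int:
    """Quantize a float to a signed fixed-point integer."""
    return round(x * SCALE)


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / SCALE


def fixed_dot(weights, inputs):
    """Dot product using integer arithmetic only."""
    acc = 0
    for w, x in zip(weights, inputs):
        acc += w * x          # each product carries 2 * FRAC_BITS fractional bits
    return acc >> FRAC_BITS   # rescale back to FRAC_BITS fractional bits


weights = [0.5, -0.25, 0.125]
inputs = [1.0, 2.0, 4.0]

q_weights = [to_fixed(w) for w in weights]
q_inputs = [to_fixed(x) for x in inputs]

result = to_float(fixed_dot(q_weights, q_inputs))
exact = sum(w * x for w, x in zip(weights, inputs))
```

The values here are exactly representable with 8 fractional bits, so `result` equals `exact`; arbitrary weights would incur a quantization error that shrinks as `FRAC_BITS` grows.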
U-Finger: Multi-Scale Dilated Convolutional Network for Fingerprint Image Denoising and Inpainting
Title | U-Finger: Multi-Scale Dilated Convolutional Network for Fingerprint Image Denoising and Inpainting |
Authors | Ramakrishna Prabhu, Xiaojing Yu, Zhangyang Wang, Ding Liu, Anxiao Jiang |
Abstract | This paper studies the challenging problem of fingerprint image denoising and inpainting. To tackle the challenge of suppressing complicated artifacts (blur, brightness, contrast, elastic transformation, occlusion, scratch, resolution, rotation, and so on) while preserving fine textures, we develop a multi-scale convolutional network, termed U-Finger. Based on the domain expertise, we show that the usage of dilated convolutions as well as the removal of padding have important positive impacts on the final restoration performance, in addition to multi-scale cascaded feature modules. Our model achieves the overall ranking of No. 2 in the ECCV 2018 Chalearn LAP Inpainting Competition Track 3 (Fingerprint Denoising and Inpainting). Among all participating teams, we obtain the MSE of 0.0231 (rank 2), PSNR 16.9688 dB (rank 2), and SSIM 0.8093 (rank 3) on the hold-out testing set. |
Tasks | Denoising, Image Denoising |
Published | 2018-07-29 |
URL | http://arxiv.org/abs/1807.10993v2 |
http://arxiv.org/pdf/1807.10993v2.pdf | |
PWC | https://paperswithcode.com/paper/u-finger-multi-scale-dilated-convolutional |
Repo | https://github.com/rgsl888/U-Finger-A-Fingerprint-Denosing-Network |
Framework | none |
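Two of the metrics reported in the abstract above, MSE and PSNR, are related by a standard formula. The sketch below assumes pixel intensities scaled to [0, 1] and computes the metrics for a single image; the function names are illustrative, and the competition's exact evaluation protocol (for example, how scores are averaged over the test set) is not stated in the source, so the sketch is not expected to reproduce the reported numbers.

```python
import math


def mse(reference, restored):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB.

    max_value=1.0 assumes intensities in [0, 1]; use 255 for 8-bit images.
    """
    err = mse(reference, restored)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / err)


# Toy 4-pixel "images" for illustration only.
clean = [0.0, 0.5, 1.0, 0.5]
noisy = [0.1, 0.5, 0.9, 0.5]

score = psnr(clean, noisy)
```

A lower MSE yields a higher PSNR, which is why the two rankings in the abstract tend to agree.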
StarMap for Category-Agnostic Keypoint and Viewpoint Estimation
Title | StarMap for Category-Agnostic Keypoint and Viewpoint Estimation |
Authors | Xingyi Zhou, Arjun Karpur, Linjie Luo, Qixing Huang |
Abstract | Semantic keypoints provide concise abstractions for a variety of visual understanding tasks. Existing methods define semantic keypoints separately for each category with a fixed number of semantic labels in fixed indices. As a result, this keypoint representation is infeasible when objects have a varying number of parts, e.g., chairs with a varying number of legs. We propose a category-agnostic keypoint representation, which combines a multi-peak heatmap (StarMap) for all the keypoints and their corresponding features as 3D locations in the canonical viewpoint (CanViewFeature) defined for each instance. Our intuition is that the 3D locations of the keypoints in canonical object views contain rich semantic and compositional information. Using our flexible representation, we demonstrate competitive performance in keypoint detection and localization compared to category-specific state-of-the-art methods. Moreover, we show that when augmented with an additional depth channel (DepthMap) to lift the 2D keypoints to 3D, our representation can achieve state-of-the-art results in viewpoint estimation. Finally, we show that our category-agnostic keypoint representation can be generalized to novel categories. |
Tasks | Keypoint Detection, Viewpoint Estimation |
Published | 2018-03-25 |
URL | http://arxiv.org/abs/1803.09331v2 |
http://arxiv.org/pdf/1803.09331v2.pdf | |
PWC | https://paperswithcode.com/paper/starmap-for-category-agnostic-keypoint-and |
Repo | https://github.com/xingyizhou/StarMap |
Framework | pytorch |
EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras
Title | EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras |
Authors | Alex Zihao Zhu, Liangzhe Yuan, Kenneth Chaney, Kostas Daniilidis |
Abstract | Event-based cameras have shown great promise in a variety of situations where frame-based cameras suffer, such as high-speed motions and high dynamic range scenes. However, developing algorithms for event measurements requires a new class of hand-crafted algorithms. Deep learning has shown great success in providing model-free solutions to many problems in the vision community, but existing networks have been developed with frame-based images in mind, and there does not exist the wealth of labeled data for events as there does for images for supervised training. To these points, we present EV-FlowNet, a novel self-supervised deep learning pipeline for optical flow estimation for event-based cameras. In particular, we introduce an image-based representation of a given event stream, which is fed into a self-supervised neural network as the sole input. The corresponding grayscale images captured from the same camera at the same time as the events are then used as a supervisory signal to provide a loss function at training time, given the estimated flow from the network. We show that the resulting network is able to accurately predict optical flow from events only in a variety of different scenes, with performance competitive to image-based networks. This method not only allows for accurate estimation of dense optical flow, but also provides a framework for the transfer of other self-supervised methods to the event-based domain. |
Tasks | Optical Flow Estimation |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.06898v4 |
http://arxiv.org/pdf/1802.06898v4.pdf | |
PWC | https://paperswithcode.com/paper/ev-flownet-self-supervised-optical-flow |
Repo | https://github.com/daniilidis-group/EV-FlowNet |
Framework | tf |
Bird Species Classification using Transfer Learning with Multistage Training
Title | Bird Species Classification using Transfer Learning with Multistage Training |
Authors | Sourya Dipta Das, Akash Kumar |
Abstract | Bird species classification has received more and more attention in the field of computer vision, for its promising applications in biology and environmental studies. Recognizing bird species is difficult due to the challenges of discriminative region localization and fine-grained feature learning. In this paper, we introduce a transfer-learning-based method with multistage training. We use a pre-trained Mask R-CNN to localize the bird in each image and an ensemble of Inception networks (InceptionV3 and InceptionResNetV2) to classify its species. Our final model achieves an F1 score of 0.5567 (55.67%) on the dataset provided in the CVIP 2018 Challenge. |
Tasks | Transfer Learning |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.04250v2 |
http://arxiv.org/pdf/1810.04250v2.pdf | |
PWC | https://paperswithcode.com/paper/bird-species-classification-using-transfer |
Repo | https://github.com/AKASH2907/bird_species_classification |
Framework | tf |
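For readers unfamiliar with the metric quoted in the abstract above, the following sketch shows how an F1 score is computed from true positives, false positives and false negatives. The counts are made up for illustration, and the sketch does not assume how the CVIP 2018 Challenge aggregated F1 across species, since the source does not say.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for a single class."""
    if true_pos == 0:
        return 0.0  # no correct predictions: precision and recall are both zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 correct detections, 2 false alarms, 4 misses.
example_f1 = f1_score(true_pos=6, false_pos=2, false_neg=4)
```

With these counts precision is 0.75 and recall is 0.6, so F1 lies between the two and closer to the smaller value.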
Recent Advances in Neural Program Synthesis
Title | Recent Advances in Neural Program Synthesis |
Authors | Neel Kant |
Abstract | In recent years, deep learning has made tremendous progress in a number of fields that were previously out of reach for artificial intelligence. The successes in these problems have led researchers to consider the possibilities for intelligent systems to tackle a problem that humans have only recently themselves considered: program synthesis. This challenge is unlike others such as object recognition and speech translation, since its abstract nature and demand for rigor make it difficult even for human minds to attempt. While it is still far from being solved or even competitive with most existing methods, neural program synthesis is a rapidly growing discipline which holds great promise if completely realized. In this paper, we start with exploring the problem statement and challenges of program synthesis. Then, we examine the fascinating evolution of program induction models, along with how they have succeeded, failed and been reimagined since. Finally, we conclude with a contrastive look at program synthesis and future research recommendations for the field. |
Tasks | Object Recognition, Program Synthesis |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02353v1 |
http://arxiv.org/pdf/1802.02353v1.pdf | |
PWC | https://paperswithcode.com/paper/recent-advances-in-neural-program-synthesis |
Repo | https://github.com/qu-arx/arx-inf |
Framework | none |
Matching Convolutional Neural Networks without Priors about Data
Title | Matching Convolutional Neural Networks without Priors about Data |
Authors | Carlos Eduardo Rosar Kos Lassance, Jean-Charles Vialatte, Vincent Gripon |
Abstract | We propose an extension of Convolutional Neural Networks (CNNs) to graph-structured data, including strided convolutions and data augmentation on graphs. Our method matches the accuracy of state-of-the-art CNNs when applied on images, without any prior about their 2D regular structure. On fMRI data, we obtain a significant gain in accuracy compared with existing graph-based alternatives. |
Tasks | Data Augmentation |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09802v1 |
http://arxiv.org/pdf/1802.09802v1.pdf | |
PWC | https://paperswithcode.com/paper/matching-convolutional-neural-networks |
Repo | https://github.com/brain-bzh/MCNN |
Framework | pytorch |
Neural Article Pair Modeling for Wikipedia Sub-article Matching
Title | Neural Article Pair Modeling for Wikipedia Sub-article Matching |
Authors | Muhao Chen, Changping Meng, Gang Huang, Carlo Zaniolo |
Abstract | Nowadays, editors tend to separate different subtopics of a long Wikipedia article into multiple sub-articles. This separation seeks to improve human readability. However, it also has a deleterious effect on many Wikipedia-based tasks that rely on the article-as-concept assumption, which requires each entity (or concept) to be described solely by one article. This underlying assumption significantly simplifies knowledge representation and extraction, and it is vital to many existing technologies such as automated knowledge base construction, cross-lingual knowledge alignment, semantic search and data lineage of Wikipedia entities. In this paper we provide an approach to match the scattered sub-articles back to their corresponding main-articles, with the intent of facilitating automated Wikipedia curation and processing. The proposed model adopts a hierarchical learning structure that combines multiple variants of neural document pair encoders with a comprehensive set of explicit features. A large crowdsourced dataset is created to support the evaluation and feature extraction for the task. Based on the large dataset, the proposed model achieves promising results of cross-validation and significantly outperforms previous approaches. Large-scale serving on the entire English Wikipedia also proves the practicability and scalability of the proposed model by effectively extracting a vast collection of newly paired main and sub-articles. |
Tasks | |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11689v2 |
http://arxiv.org/pdf/1807.11689v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-article-pair-modeling-for-wikipedia |
Repo | https://github.com/muhaochen/subarticle |
Framework | tf |
On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples
Title | On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples |
Authors | Pei-Hsuan Lu, Pin-Yu Chen, Chia-Mu Yu |
Abstract | Understanding and characterizing the subspaces of adversarial examples aid in studying the robustness of deep neural networks (DNNs) to adversarial perturbations. Very recently, Ma et al. (ICLR 2018) proposed to use local intrinsic dimensionality (LID) in layer-wise hidden representations of DNNs to study adversarial subspaces. It was demonstrated that LID can be used to characterize the adversarial subspaces associated with different attack methods, e.g., the Carlini and Wagner’s (C&W) attack and the fast gradient sign attack. In this paper, we use MNIST and CIFAR-10 to conduct two new sets of experiments that are absent in existing LID analysis and report the limitation of LID in characterizing the corresponding adversarial subspaces, which are (i) oblivious attacks and LID analysis using adversarial examples with different confidence levels; and (ii) black-box transfer attacks. For (i), we find that the performance of LID is very sensitive to the confidence parameter deployed by an attack, and the LID learned from ensembles of adversarial examples with varying confidence levels surprisingly gives poor performance. For (ii), we find that when adversarial examples are crafted from another DNN model, LID is ineffective in characterizing their adversarial subspaces. These two findings together suggest the limited capability of LID in characterizing the subspaces of adversarial examples. |
Tasks | |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09638v1 |
http://arxiv.org/pdf/1803.09638v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-limitation-of-local-intrinsic |
Repo | https://github.com/ysharma1126/EAD_Attack |
Framework | tf |
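The local intrinsic dimensionality discussed in the abstract above is commonly estimated with a maximum-likelihood formula over the distances from a point to its k nearest neighbors: LID ≈ -1 / mean(log(r_i / r_k)). The sketch below applies that estimator to raw coordinates, assuming Euclidean distance; the function name and the toy data are illustrative, and the LID analyses the paper examines operate on hidden-layer activations of a DNN rather than on points like these.

```python
import math


def lid_mle(query, reference, k):
    """Maximum-likelihood LID estimate at `query` from its k nearest neighbors.

    Euclidean distance and the handling of degenerate cases are assumptions
    made for this sketch.
    """
    # Distances to all reference points, ignoring exact duplicates of the query.
    dists = sorted(d for d in (math.dist(query, p) for p in reference) if d > 0)
    nearest = dists[:k]
    if len(nearest) < k:
        raise ValueError("need at least k reference points distinct from query")
    r_max = nearest[-1]  # distance to the k-th nearest neighbor
    mean_log = sum(math.log(r / r_max) for r in nearest) / k
    if mean_log == 0:
        return math.inf  # all neighbors equidistant: estimate is unbounded
    return -1.0 / mean_log


# Toy example: four neighbors on a line at distances 1, 2, 3, 4 from the origin.
neighbors = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
estimate = lid_mle((0.0, 0.0), neighbors, k=4)
```

With only four neighbors the estimate is noisy (about 1.69 for points that lie on a one-dimensional line), which illustrates why such estimates depend on the neighborhood size and sample used.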