July 29, 2019

3388 words 16 mins read

Paper Group AWR 85

Soft Proposal Networks for Weakly Supervised Object Localization. Chord Generation from Symbolic Melody Using BLSTM Networks. Outcome-Oriented Predictive Process Monitoring: Review and Benchmark. Passing the Brazilian OAB Exam: data preparation and some experiments. Boosting Adversarial Attacks with Momentum. DPC-Net: Deep Pose Correction for Visua …

Soft Proposal Networks for Weakly Supervised Object Localization

Title Soft Proposal Networks for Weakly Supervised Object Localization
Authors Yi Zhu, Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao
Abstract Weakly supervised object localization remains challenging, where only image labels instead of bounding boxes are available during training. Object proposal is an effective component in localization, but often computationally expensive and incapable of joint optimization with some of the remaining modules. In this paper, to the best of our knowledge, we are the first to integrate weakly supervised object proposal into convolutional neural networks (CNNs) in an end-to-end learning manner. We design a network component, Soft Proposal (SP), to be plugged into any standard convolutional architecture to introduce nearly cost-free object proposals, orders of magnitude faster than state-of-the-art methods. In the SP-augmented CNNs, referred to as Soft Proposal Networks (SPNs), iteratively evolved object proposals are generated based on the deep feature maps, projected back, and further jointly optimized with network parameters, with image-level supervision only. Through the unified learning process, SPNs learn better object-centric filters, discover more discriminative visual evidence, and suppress background interference, significantly boosting both weakly supervised object localization and classification performance. We report the best results on popular benchmarks, including PASCAL VOC, MS COCO, and ImageNet.
Tasks Object Localization, Weakly Supervised Object Detection, Weakly-Supervised Object Localization
Published 2017-09-06
URL http://arxiv.org/abs/1709.01829v1
PDF http://arxiv.org/pdf/1709.01829v1.pdf
PWC https://paperswithcode.com/paper/soft-proposal-networks-for-weakly-supervised
Repo https://github.com/yeezhu/SPN.pytorch
Framework pytorch
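
The Soft Proposal layer described above propagates an objectness map over a graph built from feature affinities and then couples that map back onto the features. The PyTorch snippet below is a simplified sketch of that idea under assumed details (raw dot-product affinities, a uniform initial map, a fixed number of propagation steps); it is not the reference implementation in the linked repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftProposal(nn.Module):
    """Simplified Soft Proposal layer: propagate an objectness map over a
    feature-affinity graph and reweight the input features with it."""
    def __init__(self, num_iters: int = 10, eps: float = 1e-6):
        super().__init__()
        self.num_iters = num_iters
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        feats = x.view(n, c, h * w)                          # N x C x HW
        affinity = F.relu(torch.einsum('nci,ncj->nij', feats, feats))
        trans = affinity / (affinity.sum(dim=2, keepdim=True) + self.eps)
        proposal = x.new_full((n, h * w, 1), 1.0 / (h * w))  # uniform start
        for _ in range(self.num_iters):                      # random-walk style propagation
            proposal = torch.bmm(trans, proposal)
            proposal = proposal / (proposal.sum(dim=1, keepdim=True) + self.eps)
        return x * proposal.view(n, 1, h, w)                 # soft coupling onto features
```

In a setup like the one the abstract describes, such a layer would sit between the last convolutional block and the image-level classification head and be trained with image labels only.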

Chord Generation from Symbolic Melody Using BLSTM Networks

Title Chord Generation from Symbolic Melody Using BLSTM Networks
Authors Hyungui Lim, Seungyeon Rhyu, Kyogu Lee
Abstract Generating a chord progression from a monophonic melody is a challenging problem because a chord progression requires a series of layered notes played simultaneously. This paper presents a novel method of generating chord sequences from a symbolic melody using bidirectional long short-term memory (BLSTM) networks trained on a lead sheet database. To this end, a group of feature vectors composed of 12 semitones is extracted from the notes in each bar of monophonic melodies. In order to ensure that the data shares uniform key and duration characteristics, the key and the time signatures of the vectors are normalized. The BLSTM networks then learn from the data to incorporate the temporal dependencies and produce a chord progression. Both quantitative and qualitative evaluations are conducted by comparing the proposed method with conventional HMM and DNN-HMM based approaches. The proposed model achieves performance increases of 23.8% and 11.4% over these models, respectively. User studies further confirm that the chord sequences generated by the proposed method are preferred by listeners.
Tasks
Published 2017-12-04
URL http://arxiv.org/abs/1712.01011v1
PDF http://arxiv.org/pdf/1712.01011v1.pdf
PWC https://paperswithcode.com/paper/chord-generation-from-symbolic-melody-using
Repo https://github.com/nprabala/Mixtape
Framework pytorch
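
A minimal PyTorch sketch of the BLSTM mapping described above, from per-bar 12-semitone (pitch-class) feature vectors to chord labels. The hidden size, layer count, and chord vocabulary below are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

class ChordBLSTM(nn.Module):
    """Map per-bar 12-semitone (pitch-class) vectors to chord logits."""
    def __init__(self, num_chords: int = 24, hidden: int = 128):
        super().__init__()
        self.blstm = nn.LSTM(input_size=12, hidden_size=hidden, num_layers=2,
                             batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_chords)

    def forward(self, bars: torch.Tensor) -> torch.Tensor:
        # bars: (batch, num_bars, 12) key-normalized pitch-class features
        h, _ = self.blstm(bars)
        return self.out(h)  # (batch, num_bars, num_chords) chord logits

# usage: chord_logits = ChordBLSTM()(torch.rand(8, 16, 12))
```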

Outcome-Oriented Predictive Process Monitoring: Review and Benchmark

Title Outcome-Oriented Predictive Process Monitoring: Review and Benchmark
Authors Irene Teinemaa, Marlon Dumas, Marcello La Rosa, Fabrizio Maria Maggi
Abstract Predictive business process monitoring refers to the act of making predictions about the future state of ongoing cases of a business process, based on their incomplete execution traces and logs of historical (completed) traces. Motivated by the increasingly pervasive availability of fine-grained event data about business process executions, the problem of predictive process monitoring has received substantial attention in recent years. In particular, a considerable number of methods have been put forward to address the problem of outcome-oriented predictive process monitoring, which refers to classifying each ongoing case of a process according to a given set of possible categorical outcomes - e.g., Will the customer complain or not? Will an order be delivered, canceled or withdrawn? Unfortunately, different authors have used different datasets, experimental settings, evaluation measures and baselines to assess their proposals, resulting in poor comparability and an unclear picture of the relative merits and applicability of different methods. To address this gap, this article presents a systematic review and taxonomy of outcome-oriented predictive process monitoring methods, and a comparative experimental evaluation of eleven representative methods using a benchmark covering 24 predictive process monitoring tasks based on nine real-life event logs.
Tasks
Published 2017-07-21
URL http://arxiv.org/abs/1707.06766v4
PDF http://arxiv.org/pdf/1707.06766v4.pdf
PWC https://paperswithcode.com/paper/outcome-oriented-predictive-process
Repo https://github.com/irhete/predictive-monitoring-benchmark
Framework none
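
The setup the benchmark evaluates, training a classifier on prefixes of completed traces and applying it to ongoing cases, can be sketched as follows. The prefix-length cap, the frequency (aggregation) encoding, and the random forest are illustrative choices; the benchmark itself compares several sequence encodings and classifiers.

```python
from collections import Counter

from sklearn.ensemble import RandomForestClassifier

def prefixes(trace, outcome, max_len=5):
    """Yield (prefix, outcome) pairs so the classifier sees every ongoing-case
    state up to max_len events."""
    for k in range(1, min(len(trace), max_len) + 1):
        yield trace[:k], outcome

def frequency_encode(prefix, vocabulary):
    """Aggregation encoding: how often each activity occurs in the prefix."""
    counts = Counter(prefix)
    return [counts[a] for a in vocabulary]

# toy event log: each case is a list of activity labels plus a binary outcome
log = [(["register", "check", "approve"], 1),
       (["register", "check", "reject"], 0)]
vocab = sorted({a for trace, _ in log for a in trace})

X, y = [], []
for trace, outcome in log:
    for prefix, label in prefixes(trace, outcome):
        X.append(frequency_encode(prefix, vocab))
        y.append(label)

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict([frequency_encode(["register", "check"], vocab)]))
```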

Passing the Brazilian OAB Exam: data preparation and some experiments

Title Passing the Brazilian OAB Exam: data preparation and some experiments
Authors Pedro Delfino, Bruno Cuconato, Edward Hermann Haeusler, Alexandre Rademaker
Abstract In Brazil, all legal professionals must demonstrate their knowledge of the law and its application by passing the OAB exams, the national bar exams. The OAB exams therefore provide an excellent benchmark for the performance of legal information systems, since passing the exam would arguably signal that the system has acquired a capacity for legal reasoning comparable to that of a human lawyer. This article describes the construction of a new data set and some preliminary experiments on it, treating the problem of finding the justification for the answers to questions. The results provide a baseline performance measure against which to evaluate future improvements. We discuss the reasons for the poor performance and propose next steps.
Tasks
Published 2017-12-14
URL http://arxiv.org/abs/1712.05128v1
PDF http://arxiv.org/pdf/1712.05128v1.pdf
PWC https://paperswithcode.com/paper/passing-the-brazilian-oab-exam-data
Repo https://github.com/own-pt/oab-exams
Framework none
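
For the justification task described above (linking a question and its answer to the supporting law article), a simple information-retrieval baseline looks like the sketch below. The TF-IDF ranking is a generic starting point of the kind such experiments use, not necessarily the authors' exact method.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_articles(question_with_answer, articles):
    """Rank candidate law articles by TF-IDF cosine similarity to the
    question/answer text; the top-ranked article is the predicted justification."""
    matrix = TfidfVectorizer().fit_transform(list(articles) + [question_with_answer])
    n = len(articles)
    sims = cosine_similarity(matrix[n], matrix[:n]).ravel()
    return sorted(range(n), key=lambda i: -sims[i])
```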

Boosting Adversarial Attacks with Momentum

Title Boosting Adversarial Attacks with Momentum
Authors Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li
Abstract Deep neural networks are vulnerable to adversarial examples, which raises security concerns about these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate to evaluate the robustness of deep learning models before they are deployed. However, most existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating the momentum term into the iterative process for attacks, our methods can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won first place in both the NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.
Tasks Adversarial Attack
Published 2017-10-17
URL http://arxiv.org/abs/1710.06081v3
PDF http://arxiv.org/pdf/1710.06081v3.pdf
PWC https://paperswithcode.com/paper/boosting-adversarial-attacks-with-momentum
Repo https://github.com/srk97/targeted-adversarial-mnist
Framework tf
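
The momentum update at the heart of the method is straightforward to sketch. The snippet below is an untargeted L-infinity variant written in PyTorch rather than the authors' TensorFlow code, with illustrative hyperparameters and a mean-absolute-value gradient normalization standing in for the paper's exact normalization.

```python
import torch

def momentum_ifgsm(model, x, y, eps=0.3, steps=10, mu=1.0):
    """Untargeted momentum iterative attack (a sketch of the idea above):
    accumulate normalized gradients with decay mu, step in the sign of the
    accumulated gradient, and project back into the eps-ball around x."""
    loss_fn = torch.nn.CrossEntropyLoss()
    alpha = eps / steps                      # per-step size
    g = torch.zeros_like(x)                  # accumulated (momentum) gradient
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # normalize the current gradient (here by its mean absolute value) and accumulate
        g = mu * g + grad / (grad.abs().mean() + 1e-12)
        x_adv = (x_adv + alpha * g.sign()).detach()
        # keep the perturbation within the eps-ball and valid pixel range
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv
```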

DPC-Net: Deep Pose Correction for Visual Localization

Title DPC-Net: Deep Pose Correction for Visual Localization
Authors Valentin Peretroukhin, Jonathan Kelly
Abstract We present a novel method to fuse the power of deep networks with the computational efficiency of geometric and probabilistic localization algorithms. In contrast to other methods that completely replace a classical visual estimator with a deep network, we propose an approach that uses a convolutional neural network to learn difficult-to-model corrections to the estimator from ground-truth training data. To this end, we derive a novel loss function for learning SE(3) corrections based on a matrix Lie groups approach, with a natural formulation for balancing translation and rotation errors. We use this loss to train a Deep Pose Correction network (DPC-Net) that predicts corrections for a particular estimator, sensor and environment. Using the KITTI odometry dataset, we demonstrate significant improvements to the accuracy of a computationally-efficient sparse stereo visual odometry pipeline, rendering it as accurate as a modern computationally-intensive dense estimator. Further, we show how DPC-Net can be used to mitigate the effect of poorly calibrated lens distortion parameters.
Tasks Visual Localization, Visual Odometry
Published 2017-09-10
URL http://arxiv.org/abs/1709.03128v3
PDF http://arxiv.org/pdf/1709.03128v3.pdf
PWC https://paperswithcode.com/paper/dpc-net-deep-pose-correction-for-visual
Repo https://github.com/utiasSTARS/dpc-net
Framework pytorch
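
One natural way to write an SE(3) correction loss with the translation/rotation balancing the abstract mentions is sketched below; the notation and the specific weighting are assumptions, not necessarily the paper's final formulation:

$$
\mathcal{L}(\boldsymbol{\xi}) \;=\; \tfrac{1}{2}\,\mathbf{g}(\boldsymbol{\xi})^{\top}\,\boldsymbol{\Sigma}^{-1}\,\mathbf{g}(\boldsymbol{\xi}),
\qquad
\mathbf{g}(\boldsymbol{\xi}) \;=\; \ln\!\big(\exp(\boldsymbol{\xi}^{\wedge})\,\mathbf{T}^{*\,-1}\big)^{\vee} \in \mathbb{R}^{6},
$$

where $\boldsymbol{\xi} \in \mathbb{R}^{6}$ is the predicted correction in the Lie algebra $\mathfrak{se}(3)$, $\mathbf{T}^{*} \in SE(3)$ is the ground-truth correction, $\wedge$ and $\vee$ map between $\mathbb{R}^{6}$ and $\mathfrak{se}(3)$, and the $6 \times 6$ weight $\boldsymbol{\Sigma}^{-1}$ balances the translational and rotational components of the error.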

MSC: A Dataset for Macro-Management in StarCraft II

Title MSC: A Dataset for Macro-Management in StarCraft II
Authors Huikai Wu, Junge Zhang, Kaiqi Huang
Abstract Macro-management is an important problem in StarCraft, which has been studied for a long time. Various datasets together with assorted methods have been proposed in the last few years. However, these datasets have shortcomings that limit academic and industrial research: 1) Some datasets provide neither standard preprocessing, parsing, and feature-extraction procedures nor predefined training, validation, and test sets. 2) Some datasets are only specified for certain tasks in macro-management. 3) Some datasets are either too small or don’t have enough labeled data for modern machine learning algorithms such as deep neural networks. As a result, most previous methods are trained with various features and evaluated on different test sets from the same or different datasets, making them difficult to compare directly. To boost the research of macro-management in StarCraft, we release a new dataset, MSC, based on the platform SC2LE. MSC consists of well-designed feature vectors, pre-defined high-level actions, and the final result of each match. We also split MSC into training, validation, and test sets for the convenience of evaluation and comparison. Besides the dataset, we propose a baseline model and present initial baseline results for global state evaluation and build order prediction, which are two of the key tasks in macro-management. Various downstream tasks and analyses of the dataset are also described for the sake of research on macro-management in StarCraft II. Homepage: https://github.com/wuhuikai/MSC.
Tasks Real-Time Strategy Games, Starcraft, Starcraft II
Published 2017-10-09
URL http://arxiv.org/abs/1710.03131v2
PDF http://arxiv.org/pdf/1710.03131v2.pdf
PWC https://paperswithcode.com/paper/msc-a-dataset-for-macro-management-in
Repo https://github.com/wuhuikai/MSC
Framework none
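
A minimal sketch of a global-state-evaluation baseline of the kind the abstract mentions: a recurrent model reads the per-step feature vectors of a match and predicts the final result at every step. The feature dimensionality and architecture below are assumptions, not the paper's baseline model.

```python
import torch
import torch.nn as nn

class GlobalStateEvaluator(nn.Module):
    """Predict win probability at every step of a match from MSC-style
    per-step feature vectors (sizes are illustrative)."""
    def __init__(self, feature_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(feature_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        h, _ = self.gru(states)              # states: (batch, time, feature_dim)
        return torch.sigmoid(self.head(h))   # per-step win probability
```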

Natural Language Processing with Small Feed-Forward Networks

Title Natural Language Processing with Small Feed-Forward Networks
Authors Jan A. Botha, Emily Pitler, Ji Ma, Anton Bakalov, Alex Salcianu, David Weiss, Ryan McDonald, Slav Petrov
Abstract We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural network models, and investigate different tradeoffs when deciding how to allocate a small memory budget.
Tasks
Published 2017-08-01
URL http://arxiv.org/abs/1708.00214v1
PDF http://arxiv.org/pdf/1708.00214v1.pdf
PWC https://paperswithcode.com/paper/natural-language-processing-with-small-feed
Repo https://github.com/bzz/LangID
Framework tf
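
The kind of model the abstract describes can be sketched as a hashed-feature feed-forward network: character n-grams are hashed into a small embedding table, averaged, and passed through one hidden layer. The bucket count, dimensions, and hashing scheme below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

def hash_ngrams(text, n=3, num_buckets=1 << 16):
    """Hash character n-grams into a fixed number of buckets (Python's hash is
    process-seeded; a deterministic hash would be used in practice)."""
    return [hash(text[i:i + n]) % num_buckets for i in range(max(len(text) - n + 1, 1))]

class TinyFFNN(nn.Module):
    """Small feed-forward classifier over hashed n-gram features."""
    def __init__(self, num_buckets=1 << 16, dim=16, hidden=64, num_classes=50):
        super().__init__()
        self.emb = nn.EmbeddingBag(num_buckets, dim, mode='mean')
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, ngram_ids: torch.Tensor) -> torch.Tensor:
        # ngram_ids: (batch, num_ngrams) integer bucket ids
        return self.mlp(self.emb(ngram_ids))

# usage: logits = TinyFFNN()(torch.tensor([hash_ngrams("bonjour tout le monde")]))
```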

Multitask learning and benchmarking with clinical time series data

Title Multitask learning and benchmarking with clinical time series data
Authors Hrayr Harutyunyan, Hrant Khachatrian, David C. Kale, Greg Ver Steeg, Aram Galstyan
Abstract Health care is one of the most exciting frontiers in data mining and machine learning. Successful adoption of electronic health records (EHRs) created an explosion in digital clinical data available for analysis, but progress in machine learning for healthcare research has been difficult to measure because of the absence of publicly available benchmark data sets. To address this problem, we propose four clinical prediction benchmarks using data derived from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database. These tasks cover a range of clinical problems including modeling risk of mortality, forecasting length of stay, detecting physiologic decline, and phenotype classification. We propose strong linear and neural baselines for all four tasks and evaluate the effect of deep supervision, multitask training and data-specific architectural modifications on the performance of neural models.
Tasks Computational Phenotyping, Length-of-Stay prediction, Mortality Prediction, Time Series
Published 2017-03-22
URL https://arxiv.org/abs/1703.07771v3
PDF https://arxiv.org/pdf/1703.07771v3.pdf
PWC https://paperswithcode.com/paper/multitask-learning-and-benchmarking-with
Repo https://github.com/ksu-hmi/Class-Assignment-https-github.com-pwhitley2-Evaluation-of-Python-suite-to-construct-benchmark
Framework none
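
A sketch of a shared-encoder multitask baseline in the spirit of the benchmark above: one LSTM reads the clinical time series and four small heads produce the task outputs. Layer sizes and head shapes are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

class MultitaskLSTM(nn.Module):
    """Shared LSTM encoder with one head per benchmark task (sizes illustrative)."""
    def __init__(self, num_features=76, hidden=64, num_phenotypes=25):
        super().__init__()
        self.encoder = nn.LSTM(num_features, hidden, batch_first=True)
        self.mortality = nn.Linear(hidden, 1)       # in-hospital mortality (binary)
        self.decomp = nn.Linear(hidden, 1)          # per-step physiologic decline
        self.los = nn.Linear(hidden, 1)             # per-step remaining length of stay
        self.phenotype = nn.Linear(hidden, num_phenotypes)

    def forward(self, x):
        h, _ = self.encoder(x)                      # x: (batch, time, features)
        last = h[:, -1]
        return {
            'mortality': self.mortality(last),
            'decompensation': self.decomp(h),
            'length_of_stay': self.los(h),
            'phenotypes': self.phenotype(last),
        }
```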

Self-view Grounding Given a Narrated 360° Video

Title Self-view Grounding Given a Narrated 360° Video
Authors Shih-Han Chou, Yi-Chun Chen, Kuo-Hao Zeng, Hou-Ning Hu, Jianlong Fu, Min Sun
Abstract Narrated 360° videos are typically provided in many touring scenarios to mimic real-world experience. However, previous work has shown that smart assistance (i.e., providing visual guidance) can significantly help users to follow the Normal Field of View (NFoV) corresponding to the narrative. In this project, we aim at automatically grounding the NFoVs of a 360° video given subtitles of the narrative (referred to as “NFoV-grounding”). We propose a novel Visual Grounding Model (VGM) to implicitly and efficiently predict the NFoVs given the video content and subtitles. Specifically, at each frame, we efficiently encode the panorama into a feature map of candidate NFoVs using a Convolutional Neural Network (CNN) and encode the subtitles into the same hidden space using an RNN with Gated Recurrent Units (GRU). Then, we apply soft attention over the candidate NFoVs to drive a sentence decoder that minimizes the reconstruction loss between the generated and given sentences. Finally, we obtain the NFoV as the candidate NFoV with the maximum attention, without any human supervision. To train the VGM more robustly, we also generate a reverse sentence conditioned on one minus the soft attention, such that the attention focuses on candidate NFoVs less relevant to the given sentence. The negative log reconstruction loss of the reverse sentence (referred to as the “irrelevant loss”) is jointly minimized to encourage the reverse sentence to be different from the given sentence. To evaluate our method, we collect the first narrated 360° video dataset and achieve state-of-the-art NFoV-grounding performance.
Tasks
Published 2017-11-23
URL http://arxiv.org/abs/1711.08664v1
PDF http://arxiv.org/pdf/1711.08664v1.pdf
PWC https://paperswithcode.com/paper/self-view-grounding-given-a-narrated-360
Repo https://github.com/ShihHanChou/360grounding
Framework pytorch
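
The soft-attention step described above (scoring every candidate NFoV feature against the encoded subtitle, then taking the most-attended candidate at inference time) can be sketched as follows; the dot-product scoring is an assumption, and the paper may use a different compatibility function.

```python
import torch
import torch.nn.functional as F

def nfov_soft_attention(nfov_feats, sentence_state):
    """Score each candidate NFoV feature against the encoded subtitle and
    return the attention weights plus the attended feature."""
    # nfov_feats: (batch, num_candidates, dim); sentence_state: (batch, dim)
    scores = torch.bmm(nfov_feats, sentence_state.unsqueeze(2)).squeeze(2)
    attn = F.softmax(scores, dim=1)                        # (batch, num_candidates)
    attended = torch.bmm(attn.unsqueeze(1), nfov_feats).squeeze(1)
    return attn, attended
```

At inference, `attn.argmax(dim=1)` would give the grounded NFoV index for each frame.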

Deep Convolutional Framelets: A General Deep Learning Framework for Inverse Problems

Title Deep Convolutional Framelets: A General Deep Learning Framework for Inverse Problems
Authors Jong Chul Ye, Yoseob Han, Eunju Cha
Abstract Recently, deep learning approaches with various network architectures have achieved significant performance improvements over existing iterative reconstruction methods in various imaging problems. However, it is still unclear why these deep learning architectures work for specific inverse problems. To address these issues, here we show that the long-searched-for missing link is the convolution framelets for representing a signal by convolving local and non-local bases. Convolution framelets were originally developed to generalize the theory of low-rank Hankel matrix approaches for inverse problems, and this paper further extends the idea so that we can obtain a deep neural network using multilayer convolution framelets with perfect reconstruction (PR) under rectified linear unit (ReLU) nonlinearity. Our analysis also shows that popular deep network components such as residual blocks, redundant filter channels, and concatenated ReLU (CReLU) do indeed help to achieve PR, while the pooling and unpooling layers should be augmented with high-pass branches to meet the PR condition. Moreover, by changing the number of filter channels and biases, we can control the shrinkage behavior of the neural network. This discovery leads us to propose a novel theory for deep convolutional framelets neural networks. Using numerical experiments with various inverse problems, we demonstrate that our deep convolutional framelets network shows consistent improvement over existing deep architectures. This discovery suggests that the success of deep learning comes not from a magical black-box power, but rather from the power of a novel signal representation that combines a non-local basis with a data-driven local basis, which is indeed a natural extension of classical signal processing theory.
Tasks
Published 2017-07-03
URL http://arxiv.org/abs/1707.00372v5
PDF http://arxiv.org/pdf/1707.00372v5.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-framelets-a-general-deep
Repo https://github.com/hanyoseob/framing-u-net
Framework none
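
The signal representation the abstract refers to is, schematically, a factorization of the Hankel-lifted signal by paired non-local and local bases. In simplified notation (not the paper's exact statement), the framelet coefficients and reconstruction are

$$
C \;=\; \Phi^{\top}\,\mathbb{H}_d(f)\,\Psi,
\qquad
\mathbb{H}_d(f) \;=\; \tilde{\Phi}\,C\,\tilde{\Psi}^{\top},
$$

which holds whenever the perfect-reconstruction conditions $\tilde{\Phi}\,\Phi^{\top} = I$ and $\Psi\,\tilde{\Psi}^{\top} = I$ are met. Here $\mathbb{H}_d(\cdot)$ is a Hankel matrix lifting of the signal $f$, $\Phi, \tilde{\Phi}$ are non-local bases (e.g., pooling/unpooling), and $\Psi, \tilde{\Psi}$ are local bases (convolution filters); stacking such decompositions with ReLU between layers gives the multilayer convolutional framelets the abstract analyzes.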

Soft-NMS – Improving Object Detection With One Line of Code

Title Soft-NMS – Improving Object Detection With One Line of Code
Authors Navaneeth Bodla, Bharat Singh, Rama Chellappa, Larry S. Davis
Abstract Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected, and all other detection boxes with a significant overlap (using a pre-defined threshold) with M are suppressed. This process is recursively applied to the remaining boxes. By design, if an actual object lies within the predefined overlap threshold of M, it leads to a miss. To address this, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the coco-style mAP metric on standard datasets like PASCAL VOC 2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves the state of the art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as that of traditional NMS, and hence it can be implemented efficiently. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub (http://bit.ly/2nJLNMu).
Tasks Object Detection
Published 2017-04-14
URL http://arxiv.org/abs/1704.04503v2
PDF http://arxiv.org/pdf/1704.04503v2.pdf
PWC https://paperswithcode.com/paper/soft-nms-improving-object-detection-with-one
Repo https://github.com/tkuanlun350/Kaggle_Ship_Detection_2018
Framework tf
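
The rescoring rule is simple enough to sketch in full. Below is a NumPy version of the Gaussian-penalty variant (the sigma and score threshold are illustrative); the "one line" of the title is the exponential decay of overlapping scores in place of hard suppression.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch: instead of discarding boxes that overlap the
    current top box M, decay their scores by exp(-iou^2 / sigma).
    boxes: ndarray of (x1, y1, x2, y2); scores: ndarray of detection scores."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    keep, idxs = [], list(range(len(scores)))
    while idxs:
        m = max(idxs, key=lambda i: scores[i])
        keep.append(m)
        idxs.remove(m)
        for i in idxs:
            iou = _iou(boxes[m], boxes[i])
            scores[i] *= np.exp(-(iou ** 2) / sigma)   # the "one line" rescoring
        idxs = [i for i in idxs if scores[i] > score_thresh]
    return keep, scores

def _iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)
```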

Generative Face Completion

Title Generative Face Completion
Authors Yijun Li, Sifei Liu, Jimei Yang, Ming-Hsuan Yang
Abstract In this paper, we propose an effective face completion algorithm using a deep generative model. Different from well-studied background completion, the face completion task is more challenging as it often requires generating semantically new pixels for the missing key components (e.g., eyes and mouths) that contain large appearance variations. Unlike existing nonparametric algorithms that search for patches to synthesize, our algorithm directly generates contents for missing regions based on a neural network. The model is trained with a combination of a reconstruction loss, two adversarial losses and a semantic parsing loss, which ensures pixel faithfulness and local-global content consistency. With extensive experimental results, we demonstrate qualitatively and quantitatively that our model is able to deal with a large area of missing pixels in arbitrary shapes and generate realistic face completion results.
Tasks Facial Inpainting, Semantic Parsing
Published 2017-04-19
URL http://arxiv.org/abs/1704.05838v1
PDF http://arxiv.org/pdf/1704.05838v1.pdf
PWC https://paperswithcode.com/paper/generative-face-completion
Repo https://github.com/easternCar/Face-Parsing-Network
Framework pytorch
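
A sketch of how the four training terms listed above might be combined for the generator update; the L1 reconstruction choice, the non-saturating adversarial form, and the loss weights are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def generator_loss(completed, target, d_local_logits, d_global_logits,
                   parsing_logits, parsing_labels, w_adv=1e-3, w_parse=1e-2):
    """Combine reconstruction, local + global adversarial, and semantic
    parsing terms for the face-completion generator (weights are illustrative)."""
    rec = F.l1_loss(completed, target)
    adv_local = F.binary_cross_entropy_with_logits(
        d_local_logits, torch.ones_like(d_local_logits))
    adv_global = F.binary_cross_entropy_with_logits(
        d_global_logits, torch.ones_like(d_global_logits))
    parse = F.cross_entropy(parsing_logits, parsing_labels)
    return rec + w_adv * (adv_local + adv_global) + w_parse * parse
```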

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

Title Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning
Authors Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum
Abstract Knowledge bases (KB), both automatically and manually constructed, are often incomplete — many valid facts can be inferred from the KB by synthesizing existing information. A popular approach to KB completion is to infer new relations by combinatory reasoning over the information found along other paths connecting a pair of entities. Given the enormous size of KBs and the exponential number of paths, previous path-based models have considered only the problem of predicting a missing relation given two entities, or evaluating the truth of a proposed triple. Additionally, these methods have traditionally used random paths between fixed entity pairs or more recently learned to pick paths between them. We propose a new algorithm, MINERVA, which addresses the much more difficult and practical task of answering questions where the relation is known but only one of the entities is given. Since random walks are impractical in a setting with combinatorially many destinations from a start node, we present a neural reinforcement learning approach which learns how to navigate the graph conditioned on the input query to find predictive paths. Empirically, this approach obtains state-of-the-art results on several datasets, significantly outperforming prior methods.
Tasks
Published 2017-11-15
URL http://arxiv.org/abs/1711.05851v2
PDF http://arxiv.org/pdf/1711.05851v2.pdf
PWC https://paperswithcode.com/paper/go-for-a-walk-and-arrive-at-the-answer
Repo https://github.com/markWJJ/rl
Framework tf
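
The query-answering walk described above can be pictured with a toy environment: the agent starts at the known entity, follows outgoing edges chosen by a policy conditioned on the query relation, and receives a terminal reward if the final entity answers the query. Everything in the sketch below (the dictionary KB, the uniform random policy) is a stand-in for the paper's learned neural policy.

```python
import random

def rollout(kb, start_entity, query_relation, answer, policy, max_steps=3):
    """Walk the KB graph from the query entity and collect a terminal reward
    if the final entity answers the query. `policy` is any callable that
    picks one of the available (relation, next_entity) actions."""
    entity, path = start_entity, []
    for _ in range(max_steps):
        actions = kb.get(entity, [])              # outgoing (relation, entity) edges
        if not actions:
            break
        relation, entity = policy(entity, query_relation, actions)
        path.append((relation, entity))
    reward = 1.0 if entity == answer else 0.0
    return path, reward

# toy usage with a uniform random policy
kb = {"Obama": [("bornIn", "Honolulu")], "Honolulu": [("locatedIn", "Hawaii")]}
path, r = rollout(kb, "Obama", "nationality", "USA",
                  policy=lambda e, q, acts: random.choice(acts))
```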

Fast MCMC sampling algorithms on polytopes

Title Fast MCMC sampling algorithms on polytopes
Authors Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu
Abstract We propose and analyze two new MCMC sampling algorithms, the Vaidya walk and the John walk, for generating samples from the uniform distribution over a polytope. Both random walks are sampling algorithms derived from interior point methods. The former is based on the volumetric-logarithmic barrier introduced by Vaidya, whereas the latter uses John’s ellipsoids. We show that the Vaidya walk mixes in significantly fewer steps than the logarithmic-barrier based Dikin walk studied in past work. For a polytope in $\mathbb{R}^d$ defined by $n >d$ linear constraints, we show that the mixing time from a warm start is bounded as $\mathcal{O}(n^{0.5}d^{1.5})$, compared to the $\mathcal{O}(nd)$ mixing time bound for the Dikin walk. The cost of each step of the Vaidya walk is of the same order as the Dikin walk, and at most twice as large in terms of constant pre-factors. For the John walk, we prove an $\mathcal{O}(d^{2.5}\cdot\log^4(n/d))$ bound on its mixing time and conjecture that an improved variant of it could achieve a mixing time of $\mathcal{O}(d^2\cdot\text{polylog}(n/d))$. Additionally, we propose variants of the Vaidya and John walks that mix in polynomial time from a deterministic starting point. The speed-up of the Vaidya walk over the Dikin walk is illustrated in numerical examples.
Tasks
Published 2017-10-23
URL http://arxiv.org/abs/1710.08165v3
PDF http://arxiv.org/pdf/1710.08165v3.pdf
PWC https://paperswithcode.com/paper/fast-mcmc-sampling-algorithms-on-polytopes
Repo https://github.com/yuachen/polytopewalk
Framework none
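
For intuition, here is a NumPy sketch of the logarithmic-barrier (Dikin) walk that the paper uses as its point of comparison; the Vaidya and John walks replace the barrier Hessian below with volumetric-barrier and John-ellipsoid analogues. The step radius and the Metropolis correction are written in a generic form and are not taken from the paper.

```python
import numpy as np

def dikin_walk(A, b, x0, num_steps=1000, radius=0.5):
    """Sample (approximately) uniformly from the polytope {x : Ax < b} using
    a Dikin-style walk: Gaussian proposals shaped by the log-barrier Hessian,
    corrected by a Metropolis-Hastings step."""
    d = len(x0)
    x = np.asarray(x0, dtype=float).copy()
    samples = []

    def barrier_hessian(p):
        s = b - A @ p                           # slacks, positive strictly inside
        return (A / s[:, None] ** 2).T @ A      # sum_i a_i a_i^T / s_i^2

    def log_proposal(frm, to, H):
        diff = to - frm
        return 0.5 * np.linalg.slogdet(H)[1] - (d / (2 * radius ** 2)) * diff @ H @ diff

    for _ in range(num_steps):
        Hx = barrier_hessian(x)
        step = np.linalg.solve(np.linalg.cholesky(Hx).T, np.random.randn(d))
        z = x + (radius / np.sqrt(d)) * step    # proposal ~ N(x, (r^2/d) Hx^{-1})
        if np.all(A @ z < b):                   # reject proposals leaving the polytope
            Hz = barrier_hessian(z)
            log_accept = log_proposal(z, x, Hz) - log_proposal(x, z, Hx)
            if np.log(np.random.rand()) < log_accept:
                x = z
        samples.append(x.copy())
    return np.array(samples)
```

A caller passes a strictly interior starting point and typically discards an initial burn-in portion of the returned samples.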