January 31, 2020

3269 words 16 mins read

Paper Group AWR 454

GRIP: Graph-based Interaction-aware Trajectory Prediction

Title GRIP: Graph-based Interaction-aware Trajectory Prediction
Authors Xin Li, Xiaowen Ying, Mooi Choo Chuah
Abstract Nowadays, autonomous driving cars have become commercially available. However, the safety of a self-driving car remains a challenging problem that has not been well studied. Motion prediction is one of the core functions of an autonomous driving car. In this paper, we propose a novel scheme called GRIP, designed to efficiently predict trajectories for traffic agents around an autonomous car. GRIP uses a graph to represent the interactions of close objects, applies several graph convolutional blocks to extract features, and subsequently uses an encoder-decoder long short-term memory (LSTM) model to make predictions. Experimental results on two well-known public datasets show that our proposed model improves the prediction accuracy of the state-of-the-art solution by 30%: the prediction error of GRIP is one meter lower than that of existing schemes. Such an improvement can help autonomous driving cars avoid many traffic accidents. In addition, GRIP runs 5x faster than state-of-the-art schemes.
Tasks Autonomous Driving, motion prediction, Trajectory Prediction
Published 2019-07-17
URL https://arxiv.org/abs/1907.07792v1
PDF https://arxiv.org/pdf/1907.07792v1.pdf
PWC https://paperswithcode.com/paper/grip-graph-based-interaction-aware-trajectory
Repo https://github.com/rohanchandra30/Spectral-Trajectory-Prediction
Framework pytorch
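
To make the pipeline in the abstract concrete (graph convolutions over an interaction graph, followed by an encoder-decoder LSTM), here is a minimal PyTorch sketch. Tensor shapes, layer sizes, and the decoder seeding are assumptions, not the authors' implementation (see the Repo link for that):

```python
import torch
import torch.nn as nn

class GraphConvBlock(nn.Module):
    """Mix features of close agents through a normalized adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, a):
        # x: (batch, time, agents, feat), a: (agents, agents)
        return torch.relu(self.linear(a @ x))

class GRIPSketch(nn.Module):
    def __init__(self, feat_dim=2, hidden=64, horizon=5):
        super().__init__()
        self.g1 = GraphConvBlock(feat_dim, hidden)
        self.g2 = GraphConvBlock(hidden, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # predicted (x, y) per future step
        self.horizon = horizon

    def forward(self, x, a):
        b, t, n, _ = x.shape
        h = self.g2(self.g1(x, a), a)                    # interaction-aware features
        h = h.permute(0, 2, 1, 3).reshape(b * n, t, -1)  # one sequence per agent
        _, state = self.encoder(h)
        dec_in = h[:, -1:].repeat(1, self.horizon, 1)    # seed decoder with last step
        out, _ = self.decoder(dec_in, state)
        return self.head(out).reshape(b, n, self.horizon, 2)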

Tweets Can Tell: Activity Recognition using Hybrid Long Short-Term Memory Model

Title Tweets Can Tell: Activity Recognition using Hybrid Long Short-Term Memory Model
Authors Renhao Cui, Gagan Agrawal, Rajiv Ramnath
Abstract This paper presents techniques to detect the “offline” activity a person is engaged in when she is tweeting (such as dining, shopping or entertainment), in order to create a dynamic profile of the user, for uses such as better targeting of advertisements. To this end, we propose a hybrid LSTM model for rich contextual learning, along with studies on the effects of applying and combining multiple LSTM based methods with different contextual features. The hybrid model is shown to outperform a set of baselines and state-of-the-art methods. Finally, this paper presents an orthogonal validation with a real-case application. Our model generates an offline activity analysis for the followers of several well-known accounts, which is quite representative of the expected characteristics of these accounts.
Tasks Activity Recognition
Published 2019-07-10
URL https://arxiv.org/abs/1908.02551v1
PDF https://arxiv.org/pdf/1908.02551v1.pdf
PWC https://paperswithcode.com/paper/tweets-can-tell-activity-recognition-using
Repo https://github.com/renhaocui/activityExtractor
Framework none
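
A hedged sketch of what a "hybrid" LSTM in the spirit of the abstract might look like: one branch over word embeddings, one over auxiliary per-token context features, merged before classifying the offline activity. The feature choices and sizes here are assumptions, not the paper's exact model:

```python
import torch
import torch.nn as nn

class HybridLSTM(nn.Module):
    def __init__(self, vocab=20000, emb=100, ctx_dim=16, hidden=64, n_activities=8):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.text_lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.ctx_lstm = nn.LSTM(ctx_dim, hidden, batch_first=True)
        self.classify = nn.Linear(2 * hidden, n_activities)

    def forward(self, tokens, ctx):
        # tokens: (batch, seq) word ids; ctx: (batch, seq, ctx_dim) per-token features
        _, (h_text, _) = self.text_lstm(self.embed(tokens))
        _, (h_ctx, _) = self.ctx_lstm(ctx)
        return self.classify(torch.cat([h_text[-1], h_ctx[-1]], dim=-1))
```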

Flow-Motion and Depth Network for Monocular Stereo and Beyond

Title Flow-Motion and Depth Network for Monocular Stereo and Beyond
Authors Kaixuan Wang, Shaojie Shen
Abstract We propose a learning-based method that solves monocular stereo and can be extended to fuse depth information from multiple target frames. Given two unconstrained images from a monocular camera with known intrinsic calibration, our network estimates relative camera poses and the depth map of the source image. The contribution of the proposed method is threefold. First, a network tailored to static scenes jointly estimates the optical flow and camera motion; through the joint estimation, the optical flow search space is gradually reduced, resulting in efficient and accurate flow estimation. Second, a novel triangulation layer is proposed to encode the estimated optical flow and camera motion while avoiding the common numerical issues caused by epipolar geometry. Third, beyond two-view depth estimation, we further extend the above networks to fuse depth information from multiple target images and estimate the depth map of the source image. To further benefit the research community, we introduce tools to generate photorealistic structure-from-motion datasets on which deep networks can be well trained and evaluated. The proposed method is compared with previous methods and achieves state-of-the-art results in less time. Images from real-world applications and Google Earth are used to demonstrate the generalization ability of the method.
Tasks Calibration, Depth Estimation, Optical Flow Estimation
Published 2019-09-12
URL https://arxiv.org/abs/1909.05452v1
PDF https://arxiv.org/pdf/1909.05452v1.pdf
PWC https://paperswithcode.com/paper/flow-motion-and-depth-network-for-monocular
Repo https://github.com/HKUST-Aerial-Robotics/Flow-Motion-Depth
Framework none
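
The triangulation layer above differentiably encodes flow and camera motion; the classical operation it approximates is two-view linear (DLT) triangulation, sketched below with NumPy. Variable names and the per-pixel usage are illustrative assumptions, not the paper's layer:

```python
import numpy as np

def triangulate(p1, p2, K, R, t):
    """Linear (DLT) triangulation of one pixel correspondence p1 <-> p2."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # source camera
    P2 = K @ np.hstack([R, t.reshape(3, 1)])            # target camera
    A = np.stack([
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
        p2[0] * P2[2] - P2[0],
        p2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]   # 3D point; depth is its z in the source frame

# Optical flow supplies the correspondence: p2 = p1 + flow[p1];
# applying this over all pixels yields the source depth map.
```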

Meta-learnt priors slow down catastrophic forgetting in neural networks

Title Meta-learnt priors slow down catastrophic forgetting in neural networks
Authors Giacomo Spigler
Abstract Current training regimes for deep learning usually involve exposure to a single task/dataset at a time. We start from the observation that, in this setting, the trained model is given no knowledge of anything outside its (single-task) training distribution, and thus has no way to learn parameters (i.e., feature detectors or policies) that could help solve other tasks and limit future interference with the acquired knowledge, and hence catastrophic forgetting. We show that catastrophic forgetting can be mitigated in a meta-learning context by exposing a neural network to multiple tasks in a sequential manner during training. Finally, we present SeqFOMAML, a meta-learning algorithm that implements these principles, and we evaluate it on sequential learning problems composed of Omniglot and MiniImageNet classification tasks.
Tasks Meta-Learning, Omniglot
Published 2019-09-09
URL https://arxiv.org/abs/1909.04170v2
PDF https://arxiv.org/pdf/1909.04170v2.pdf
PWC https://paperswithcode.com/paper/meta-learnt-priors-slow-down-catastrophic
Repo https://github.com/spiglerg/pyMeta
Framework tf
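
SeqFOMAML builds on first-order MAML; the sketch below shows a single first-order meta-update over a batch of sequentially presented tasks. The model, task batches, and hyperparameters are placeholders rather than the paper's exact algorithm:

```python
import copy
import torch
import torch.nn.functional as F

def fomaml_step(model, tasks, inner_lr=0.01, outer_lr=0.001, inner_steps=5):
    """One meta-update. tasks: list of ((sx, sy), (qx, qy)) support/query batches."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for (sx, sy), (qx, qy) in tasks:
        fast = copy.deepcopy(model)                 # adapt a copy, keep the prior
        opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                # inner-loop adaptation
            opt.zero_grad()
            F.cross_entropy(fast(sx), sy).backward()
            opt.step()
        q_loss = F.cross_entropy(fast(qx), qy)      # evaluate adapted weights
        grads = torch.autograd.grad(q_loss, fast.parameters())
        for m, g in zip(meta_grads, grads):
            m += g / len(tasks)                     # first-order approximation
    with torch.no_grad():
        for p, g in zip(model.parameters(), meta_grads):
            p -= outer_lr * g                       # update the meta-learnt prior
```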

Indoor Depth Completion with Boundary Consistency and Self-Attention

Title Indoor Depth Completion with Boundary Consistency and Self-Attention
Authors Yu-Kai Huang, Tsung-Han Wu, Yueh-Cheng Liu, Winston H. Hsu
Abstract Depth estimation features are helpful for 3D recognition. Commodity-grade depth cameras can capture depth and color images in real time. However, glossy, transparent, or distant surfaces cannot be scanned properly by the sensor. As a result, enhancing and restoring sensed depth is an important task. Depth completion aims at filling the holes that sensors fail to detect, which remains a complex task for machines to learn. Traditional hand-tuned methods have reached their limits, while neural-network-based methods tend to copy and interpolate the output from surrounding depth values; this leads to blurred boundaries, and the structure of the depth map is lost. Consequently, our main contribution is an end-to-end network that improves completed depth maps while maintaining edge clarity. We utilize a self-attention mechanism, previously used in image inpainting, to extract more useful information in each layer of convolution so that the completed depth map is enhanced. In addition, we propose a boundary consistency concept to enhance the depth map quality and structure. Experimental results validate the effectiveness of our self-attention and boundary consistency scheme, which outperforms previous state-of-the-art depth completion work on the Matterport3D dataset. Our code is publicly available at https://github.com/patrickwu2/Depth-Completion
Tasks Depth Completion, Depth Estimation, Image Inpainting
Published 2019-08-22
URL https://arxiv.org/abs/1908.08344v2
PDF https://arxiv.org/pdf/1908.08344v2.pdf
PWC https://paperswithcode.com/paper/indoor-depth-completion-with-boundary
Repo https://github.com/patrickwu2/Depth-Completion
Framework pytorch
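
The self-attention mechanism borrowed from image inpainting is typically the SAGAN-style block sketched below; the channel sizes and its exact placement in the completion network are assumptions:

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))   # starts as the identity map

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)    # (b, hw, c//8)
        k = self.k(x).flatten(2)                    # (b, c//8, hw)
        v = self.v(x).flatten(2)                    # (b, c, hw)
        attn = torch.softmax(q @ k, dim=-1)         # (b, hw, hw) pairwise weights
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out                 # residual attention
```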

OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching

Title OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching
Authors Changhee Won, Jongbin Ryu, Jongwoo Lim
Abstract In this paper, we propose a novel end-to-end deep neural network model for omnidirectional depth estimation from a wide-baseline multi-view stereo setup. The images captured with ultra wide field-of-view (FOV) cameras on an omnidirectional rig are processed by the feature extraction module, and then the deep feature maps are warped onto the concentric spheres swept through all candidate depths using the calibrated camera parameters. The 3D encoder-decoder block takes the aligned feature volume to produce the omnidirectional depth estimate with regularization on uncertain regions utilizing the global context information. In addition, we present large-scale synthetic datasets for training and testing omnidirectional multi-view stereo algorithms. Our datasets consist of 11K ground-truth depth maps and 45K fisheye images in four orthogonal directions with various objects and environments. Experimental results show that the proposed method generates excellent results in both synthetic and real-world environments, and it outperforms the prior art and the omnidirectional versions of the state-of-the-art conventional stereo algorithms.
Tasks Depth Estimation, Stereo Matching, Stereo Matching Hand
Published 2019-08-17
URL https://arxiv.org/abs/1908.06257v1
PDF https://arxiv.org/pdf/1908.06257v1.pdf
PWC https://paperswithcode.com/paper/omnimvs-end-to-end-learning-for
Repo https://github.com/matsuren/omnimvs_pytorch
Framework pytorch
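
The sphere-sweeping step warps the extracted feature maps to candidate depths and stacks them into a volume for the 3D encoder-decoder. The simplified sketch below shows the analogous sweep, with the calibration-dependent warp left as an assumed callback (warp_grid_fn), not the paper's fisheye geometry:

```python
import torch
import torch.nn.functional as F

def sweep_volume(feat, depths, warp_grid_fn):
    """feat: (b, c, h, w) feature map; depths: iterable of candidate depths.
    warp_grid_fn(d) -> (b, h, w, 2) sampling grid for depth d (from calibration)."""
    slices = [F.grid_sample(feat, warp_grid_fn(d), align_corners=True)
              for d in depths]
    return torch.stack(slices, dim=2)   # (b, c, n_depths, h, w) cost volume
```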

Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations

Title Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations
Authors Lily Xu, Shahrzad Gholami, Sara Mc Carthy, Bistra Dilkina, Andrew Plumptre, Milind Tambe, Rohit Singh, Mustapha Nsubuga, Joshua Mabonga, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, Tom Okello, Eric Enyel
Abstract Illegal wildlife poaching threatens ecosystems and drives endangered species toward extinction. However, efforts for wildlife protection are constrained by the limited resources of law enforcement agencies. To help combat poaching, the Protection Assistant for Wildlife Security (PAWS) is a machine learning pipeline that has been developed as a data-driven approach to identify areas at high risk of poaching throughout protected areas and compute optimal patrol routes. In this paper, we take an end-to-end approach to the data-to-deployment pipeline for anti-poaching. In doing so, we address challenges including extreme class imbalance (up to 1:200), bias, and uncertainty in wildlife poaching data to enhance PAWS, and we apply our methodology to three national parks with diverse characteristics. (i) We use Gaussian processes to quantify predictive uncertainty, which we exploit to improve robustness of our prescribed patrols and increase detection of snares by an average of 30%. We evaluate our approach on real-world historical poaching data from Murchison Falls and Queen Elizabeth National Parks in Uganda and, for the first time, Srepok Wildlife Sanctuary in Cambodia. (ii) We present the results of large-scale field tests conducted in Murchison Falls and Srepok Wildlife Sanctuary which confirm that the predictive power of PAWS extends promisingly to multiple parks. This paper is part of an effort to expand PAWS to 800 parks around the world through integration with SMART conservation software.
Tasks Gaussian Processes
Published 2019-03-08
URL https://arxiv.org/abs/1903.06669v3
PDF https://arxiv.org/pdf/1903.06669v3.pdf
PWC https://paperswithcode.com/paper/stay-ahead-of-poachers-illegal-wildlife
Repo https://github.com/lily-x/PAWS-public
Framework none
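
As a rough illustration of point (i), a Gaussian process classifier directly exposes predictive probabilities over a gridded risk map, and its uncertainty can inform robust patrol planning. The features and data below are made-up placeholders, not the PAWS pipeline:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

np.random.seed(0)
X = np.random.rand(200, 4)          # e.g. terrain / distance-to-road features per cell
y = np.random.rand(200) < 0.05      # rare snare detections (heavy class imbalance)

gp = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0)).fit(X, y)
risk = gp.predict_proba(X)[:, 1]    # poaching risk per grid cell
# Cells where the GP is uncertain can be prioritized or treated
# conservatively when prescribing patrol routes.
```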

TEASER: Early and Accurate Time Series Classification

Title TEASER: Early and Accurate Time Series Classification
Authors P. Schäfer, U. Leser
Abstract Early time series classification (eTSC) is the problem of classifying a time series after as few measurements as possible with the highest possible accuracy. The most critical issue for any eTSC method is deciding when enough of a time series has been seen to take a decision: waiting for more data points usually makes the classification problem easier but delays the decision; in contrast, earlier classification has to cope with less input data, often leading to inferior accuracy. State-of-the-art eTSC methods compute a fixed optimal decision time, assuming that every time series has the same defined start time (like turning on a machine). However, in many real-life applications measurements start at arbitrary times (like measuring the heartbeats of a patient), implying that the best time for taking a decision varies heavily between time series. We present TEASER, a novel algorithm that models eTSC as a two-tier classification problem: in the first tier, a classifier periodically assesses the incoming time series to compute class probabilities; these class probabilities are only used as the output label if a second-tier classifier decides that the predicted label is reliable enough, which can happen after a different number of measurements for each series. In an evaluation using 45 benchmark datasets, TEASER makes predictions two to three times earlier than its competitors while reaching the same or even higher classification accuracy. We further show TEASER's superior performance in real-life use cases, namely energy monitoring and gait detection.
Tasks Time Series, Time Series Classification
Published 2019-08-09
URL https://arxiv.org/abs/1908.03405v2
PDF https://arxiv.org/pdf/1908.03405v2.pdf
PWC https://paperswithcode.com/paper/teaser-early-and-accurate-time-series
Repo https://github.com/patrickzib/SFA
Framework tf
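
The two-tier idea reads naturally as the loop below: a first-tier classifier scores each prefix, and a second-tier gate decides whether that prediction is reliable enough to emit. Both models here are generic fitted stand-ins, not the paper's actual classifiers:

```python
import numpy as np

def early_classify(series, tier1, tier2, checkpoints):
    """Classify `series` (1-D array) as early as possible.
    tier1: fitted classifier with predict_proba over prefixes (stand-in).
    tier2: fitted gate mapping class probabilities -> {0, 1} (stand-in).
    checkpoints: non-empty increasing prefix lengths, e.g. every few samples."""
    label = None
    for t in checkpoints:
        proba = tier1.predict_proba(series[:t].reshape(1, -1))[0]
        label = int(proba.argmax())
        if tier2.predict(proba.reshape(1, -1))[0] == 1:  # "reliable enough"
            return label, t                              # early decision
    return label, checkpoints[-1]                        # forced final decision
```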

Simplify the Usage of Lexicon in Chinese NER

Title Simplify the Usage of Lexicon in Chinese NER
Authors Minlong Peng, Ruotian Ma, Qi Zhang, Xuanjing Huang
Abstract Recently, many works have tried to utilize word lexicons to augment the performance of Chinese named entity recognition (NER). As a representative work in this line, Lattice-LSTM \cite{zhang2018chinese} has achieved new state-of-the-art performance on several benchmark Chinese NER datasets. However, Lattice-LSTM suffers from a complicated model architecture, resulting in low computational efficiency. This heavily limits its application in many industrial areas that require real-time NER responses. In this work, we ask: can we simplify the usage of the lexicon and, at the same time, achieve performance comparable to Lattice-LSTM for Chinese NER? Starting from this question and motivated by the idea of Lattice-LSTM, we propose a concise but effective method to incorporate lexicon information into the vector representations of characters. This way, our method avoids introducing a complicated sequence-modeling architecture to model the lexicon information; instead, it only needs to subtly adjust the character representation layer of the neural sequence model. Experimental studies on four benchmark Chinese NER datasets show that our method achieves much faster inference speed and comparable or better performance than Lattice-LSTM and its follow-ups. They also show that our method can be easily transferred across different neural architectures.
Tasks Chinese Named Entity Recognition, Named Entity Recognition
Published 2019-08-16
URL https://arxiv.org/abs/1908.05969v1
PDF https://arxiv.org/pdf/1908.05969v1.pdf
PWC https://paperswithcode.com/paper/simplify-the-usage-of-lexicon-in-chinese-ner
Repo https://github.com/v-mipeng/LexiconAugmentedNER
Framework pytorch
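
The character-augmentation idea can be sketched directly: for each character, pool the embeddings of lexicon words containing it, grouped by the character's position in the word (Begin/Middle/End/Single), and concatenate the result to the character vector. Dimensions and lookup tables below are assumptions:

```python
import numpy as np

def augment_char(sentence, i, lexicon, char_vec, word_vec, dim=50):
    """Concatenate the char embedding with pooled lexicon-word embeddings.
    lexicon: set of words; char_vec/word_vec: dicts to vectors of size `dim`."""
    groups = {"B": [], "M": [], "E": [], "S": []}
    for j in range(len(sentence)):
        for k in range(j + 1, len(sentence) + 1):
            w = sentence[j:k]
            if w in lexicon and j <= i < k:
                if k - j == 1:
                    pos = "S"          # single-character word
                elif i == j:
                    pos = "B"          # char begins the word
                elif i == k - 1:
                    pos = "E"          # char ends the word
                else:
                    pos = "M"          # char is inside the word
                groups[pos].append(word_vec[w])
    pooled = [np.mean(g, axis=0) if g else np.zeros(dim) for g in groups.values()]
    return np.concatenate([char_vec[sentence[i]]] + pooled)  # feed to any tagger
```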

NeuronBlocks: Building Your NLP DNN Models Like Playing Lego

Title NeuronBlocks: Building Your NLP DNN Models Like Playing Lego
Authors Ming Gong, Linjun Shou, Wutao Lin, Zhijie Sang, Quanjia Yan, Ze Yang, Feixiang Cheng, Daxin Jiang
Abstract Deep Neural Networks (DNN) have been widely employed in industry to address various Natural Language Processing (NLP) tasks. However, many engineers find it a big overhead when they have to choose from multiple frameworks, compare different types of models, and understand various optimization mechanisms. An NLP toolkit for DNN models with both generality and flexibility can greatly improve the productivity of engineers by saving their learning cost and guiding them to find optimal solutions to their tasks. In this paper, we introduce NeuronBlocks\footnote{Code: \url{https://github.com/Microsoft/NeuronBlocks}} \footnote{Demo: \url{https://youtu.be/x6cOpVSZcdo}}, a toolkit encapsulating a suite of neural network modules as building blocks to construct various DNN models with complex architecture. This toolkit empowers engineers to build, train, and test various NLP models through simple configuration of JSON files. The experiments on several NLP datasets such as GLUE, WikiQA and CoNLL-2003 demonstrate the effectiveness of NeuronBlocks.
Tasks
Published 2019-04-21
URL https://arxiv.org/abs/1904.09535v3
PDF https://arxiv.org/pdf/1904.09535v3.pdf
PWC https://paperswithcode.com/paper/neuronblocks-building-your-nlp-dnn-models
Repo https://github.com/Microsoft/NeuronBlocks
Framework pytorch
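
A toy illustration of configuration-driven model assembly in the spirit of NeuronBlocks; the real toolkit's JSON schema and block set are far richer (see the repo), so treat the registry and config keys below as assumptions:

```python
import torch.nn as nn

# Hypothetical block registry: each entry builds a layer from a config dict.
BLOCKS = {"Embedding": lambda c: nn.Embedding(c["vocab"], c["dim"]),
          "BiLSTM":    lambda c: nn.LSTM(c["dim"], c["hidden"],
                                         batch_first=True, bidirectional=True),
          "Linear":    lambda c: nn.Linear(c["in"], c["out"])}

# A model described entirely as data, e.g. loaded from a JSON file.
config = [{"type": "Embedding", "vocab": 30000, "dim": 100},
          {"type": "BiLSTM", "dim": 100, "hidden": 128},
          {"type": "Linear", "in": 256, "out": 3}]

layers = [BLOCKS[c["type"]](c) for c in config]   # model assembled from config alone
```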

Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining

Title Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining
Authors Gökberk Koçak, Özgür Akgün, Tias Guns, Ian Miguel
Abstract Finding interesting patterns is a challenging task in data mining. Constraint-based mining is a well-known approach to this, and one for which constraint programming has been shown to be a well-suited and generic framework. Dominance programming has been proposed as an extension that can capture an even wider class of constraint-based mining problems by allowing relations between patterns to be compared. In this paper, in addition to specifying a dominance relation, we introduce the ability to specify an incomparability condition. Using these two concepts we devise a generic framework that performs a batch-wise search avoiding checks against incomparable solutions. We extend the ESSENCE language and underlying modelling pipeline to support this. We use the generator itemset mining problem as a test case and give a declarative specification for it. We also present preliminary experimental results on this specific problem class with a CP solver backend, showing that using the incomparability condition during search can improve the efficiency of dominance programming and reduce the need for post-processing to filter dominated solutions.
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.00505v1
PDF https://arxiv.org/pdf/1910.00505v1.pdf
PWC https://paperswithcode.com/paper/towards-improving-solution-dominance-with
Repo https://github.com/stacs-cp/ModRef2019-Dominance
Framework none
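
Outside a CP solver, the two concepts are easy to state in plain Python for generator itemset mining: dominance is "proper subset with equal support", and itemsets with different supports are incomparable, so the search never needs to compare them. This is an illustrative sketch, not the ESSENCE specification:

```python
def support(itemset, db):
    """Number of transactions (sets) containing the itemset."""
    return sum(itemset <= t for t in db)

def dominates(x, y, db):
    # x dominates y if x is a proper subset with equal support
    # (then y cannot be a generator)
    return x < y and support(x, db) == support(y, db)

def incomparable(x, y, db):
    # itemsets with different supports can never dominate one another,
    # so a batch-wise search can skip these comparisons entirely
    return support(x, db) != support(y, db)

db = [frozenset("abc"), frozenset("ab"), frozenset("ac")]
print(dominates(frozenset("a"), frozenset("ab"), db))     # False: supports 3 vs 2
print(incomparable(frozenset("a"), frozenset("ab"), db))  # True
```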

Dual-modality seq2seq network for audio-visual event localization

Title Dual-modality seq2seq network for audio-visual event localization
Authors Yan-Bo Lin, Yu-Jhe Li, Yu-Chiang Frank Wang
Abstract Audio-visual event localization requires one to identify the event which is both visible and audible in a video (either at a frame or video level). To address this task, we propose a deep neural network named Audio-Visual Sequence-to-Sequence Dual Network (AVSDN). By jointly taking both audio and visual features at each time segment as inputs, our proposed model learns global and local event information in a sequence-to-sequence manner, which can be realized in either fully supervised or weakly supervised settings. Empirical results confirm that our proposed method performs favorably against recent deep learning approaches in both settings.
Tasks
Published 2019-02-20
URL http://arxiv.org/abs/1902.07473v1
PDF http://arxiv.org/pdf/1902.07473v1.pdf
PWC https://paperswithcode.com/paper/dual-modality-seq2seq-network-for-audio
Repo https://github.com/YapengTian/AVE-ECCV18
Framework pytorch
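
A rough AVSDN-style sketch: fuse per-segment audio and visual features, encode the sequence for global event information, and decode per-segment event predictions. The feature dimensions and number of event classes are assumptions:

```python
import torch
import torch.nn as nn

class AVSDNSketch(nn.Module):
    def __init__(self, a_dim=128, v_dim=512, hidden=128, n_events=29):
        super().__init__()
        self.fuse = nn.Linear(a_dim + v_dim, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.classify = nn.Linear(hidden, n_events)

    def forward(self, audio, visual):
        # audio: (b, t, a_dim), visual: (b, t, v_dim), one pair per time segment
        x = torch.relu(self.fuse(torch.cat([audio, visual], dim=-1)))
        enc, state = self.encoder(x)        # global event information
        dec, _ = self.decoder(enc, state)   # local, per-segment refinement
        return self.classify(dec)           # event logits per segment
```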

Sobolev Independence Criterion

Title Sobolev Independence Criterion
Authors Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Dos Santos
Abstract We propose the Sobolev Independence Criterion (SIC), an interpretable dependency measure between a high-dimensional random variable X and a response variable Y. SIC decomposes to the sum of feature importance scores and hence can be used for nonlinear feature selection. SIC can be seen as a gradient regularized Integral Probability Metric (IPM) between the joint distribution of the two random variables and the product of their marginals. We use sparsity inducing gradient penalties to promote input sparsity of the critic of the IPM. In the kernel version, we show that SIC can be cast as a convex optimization problem by introducing auxiliary variables that play an important role in feature selection as they are normalized feature importance scores. We then present a neural version of SIC where the critic is parameterized as a homogeneous neural network, improving its representation power as well as its interpretability. We conduct experiments validating SIC for feature selection in synthetic and real-world experiments. We show that SIC enables reliable and interpretable discoveries, when used in conjunction with the holdout randomization test and knockoffs to control the False Discovery Rate. Code is available at http://github.com/ibm/sic.
Tasks Feature Importance, Feature Selection
Published 2019-10-31
URL https://arxiv.org/abs/1910.14212v1
PDF https://arxiv.org/pdf/1910.14212v1.pdf
PWC https://paperswithcode.com/paper/sobolev-independence-criterion
Repo https://github.com/IBM/SIC
Framework pytorch
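
The heart of SIC, per-feature importance scores derived from input gradients of a critic, can be sketched in a few lines. The critic's training against the joint-versus-product-of-marginals objective is elided, and this is not the released code:

```python
import torch

def feature_importance(critic, x):
    """Expected squared input gradient per feature (Sobolev-style scores).
    critic: any scalar-output torch module; x: (batch, features) tensor."""
    x = x.clone().requires_grad_(True)
    out = critic(x).sum()
    (grad,) = torch.autograd.grad(out, x)
    return (grad ** 2).mean(dim=0)   # one nonnegative score per input feature
```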

FCC-GAN: A Fully Connected and Convolutional Net Architecture for GANs

Title FCC-GAN: A Fully Connected and Convolutional Net Architecture for GANs
Authors Sukarna Barua, Sarah Monazam Erfani, James Bailey
Abstract Generative Adversarial Networks (GANs) are a powerful class of generative models. Despite their successes, the most appropriate choice of GAN network architecture is still not well understood. GAN models for image synthesis have adopted a deep convolutional network architecture, which eliminates or minimizes the use of fully connected and pooling layers in favor of convolution layers in the generator and discriminator. In this paper, we demonstrate that an architecture utilizing deep fully connected layers and pooling layers alongside convolutions can be more effective than the traditional convolution-only architecture, and we propose FCC-GAN, a fully connected and convolutional GAN architecture. Models based on our FCC-GAN architecture both learn faster than the conventional architecture and generate higher-quality samples. We demonstrate the effectiveness and stability of our approach across four popular image datasets.
Tasks Image Generation
Published 2019-05-07
URL https://arxiv.org/abs/1905.02417v2
PDF https://arxiv.org/pdf/1905.02417v2.pdf
PWC https://paperswithcode.com/paper/fcc-gan-a-fully-connected-and-convolutional
Repo https://github.com/sukarnabarua/fccgan
Framework tf
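
A hedged sketch of an FCC-GAN-style generator, with a deep fully connected head in place of DCGAN's single projection before the transposed convolutions; the widths and output resolution are assumptions:

```python
import torch
import torch.nn as nn

generator = nn.Sequential(
    nn.Linear(100, 256), nn.ReLU(),
    nn.Linear(256, 512), nn.ReLU(),
    nn.Linear(512, 128 * 8 * 8), nn.ReLU(),   # deep FC head, unlike DCGAN's single projection
    nn.Unflatten(1, (128, 8, 8)),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),    # 16x16 -> 32x32 RGB
)

imgs = generator(torch.randn(16, 100))   # 16 samples from latent noise
```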

Modeling Semantic Compositionality with Sememe Knowledge

Title Modeling Semantic Compositionality with Sememe Knowledge
Authors Fanchao Qi, Junjie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, Maosong Sun
Abstract Semantic compositionality (SC) refers to the phenomenon that the meaning of a complex linguistic unit can be composed from the meanings of its constituents. Most related works focus on using complicated compositionality functions to model SC, while few consider external knowledge in their models. In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC with a confirmatory experiment. Furthermore, we make the first attempt to incorporate sememe knowledge into SC models and employ the sememe-incorporated models in learning representations of multi-word expressions, a typical SC task. In our experiments, we implement these models by incorporating knowledge from HowNet, a well-known sememe knowledge base, and perform both intrinsic and extrinsic evaluations. Experimental results show that our models achieve a significant performance boost compared to baseline methods that do not consider sememe knowledge. We further conduct quantitative analyses and case studies to demonstrate the effectiveness of applying sememe knowledge in modeling SC. All the code and data of this paper can be obtained at https://github.com/thunlp/Sememe-SC.
Tasks multi-word expression embedding, multi-word expression sememe prediction
Published 2019-07-10
URL https://arxiv.org/abs/1907.04744v1
PDF https://arxiv.org/pdf/1907.04744v1.pdf
PWC https://paperswithcode.com/paper/modeling-semantic-compositionality-with
Repo https://github.com/thunlp/Sememe-SC
Framework tf
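
As a toy rendering of the idea, one can enhance each constituent word with the average of its sememe embeddings before composing; all lookups below are assumed stand-ins for HowNet-derived resources, and the additive composition replaces the paper's learned compositionality functions:

```python
import numpy as np

def sememe_enhanced(word, word_vec, sememes_of, sememe_vec):
    """Word embedding plus the average of its sememe embeddings (if any)."""
    sems = sememes_of.get(word, [])
    if not sems:
        return word_vec[word]
    return word_vec[word] + np.mean([sememe_vec[s] for s in sems], axis=0)

def compose(mwe_words, word_vec, sememes_of, sememe_vec):
    # simple additive composition over sememe-enhanced constituents
    return sum(sememe_enhanced(w, word_vec, sememes_of, sememe_vec)
               for w in mwe_words)
```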