Paper Group AWR 454
GRIP: Graph-based Interaction-aware Trajectory Prediction. Tweets Can Tell: Activity Recognition using Hybrid Long Short-Term Memory Model. Flow-Motion and Depth Network for Monocular Stereo and Beyond. Meta-learnt priors slow down catastrophic forgetting in neural networks. Indoor Depth Completion with Boundary Consistency and Self-Attention. OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching. Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations. TEASER: Early and Accurate Time Series Classification. Simplify the Usage of Lexicon in Chinese NER. NeuronBlocks: Building Your NLP DNN Models Like Playing Lego. Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining. Dual-modality seq2seq network for audio-visual event localization. Sobolev Independence Criterion. FCC-GAN: A Fully Connected and Convolutional Net Architecture for GANs. Modeling Semantic Compositionality with Sememe Knowledge.
GRIP: Graph-based Interaction-aware Trajectory Prediction
Title | GRIP: Graph-based Interaction-aware Trajectory Prediction |
Authors | Xin Li, Xiaowen Ying, Mooi Choo Chuah |
Abstract | Nowadays, autonomous driving cars have become commercially available. However, the safety of a self-driving car remains a challenging problem that has not been well studied. Motion prediction is one of the core functions of an autonomous driving car. In this paper, we propose a novel scheme called GRIP, designed to efficiently predict trajectories for traffic agents around an autonomous car. GRIP uses a graph to represent the interactions of close objects, applies several graph convolutional blocks to extract features, and subsequently uses an encoder-decoder long short-term memory (LSTM) model to make predictions. Experimental results on two well-known public datasets show that our proposed model improves prediction accuracy over the state-of-the-art solution by 30%: the prediction error of GRIP is one meter shorter than that of existing schemes. Such an improvement can help autonomous driving cars avoid many traffic accidents. In addition, GRIP runs 5x faster than state-of-the-art schemes. |
Tasks | Autonomous Driving, Motion Prediction, Trajectory Prediction |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07792v1 |
https://arxiv.org/pdf/1907.07792v1.pdf | |
PWC | https://paperswithcode.com/paper/grip-graph-based-interaction-aware-trajectory |
Repo | https://github.com/rohanchandra30/Spectral-Trajectory-Prediction |
Framework | pytorch |
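To make the described pipeline concrete, here is a minimal PyTorch sketch of the GRIP idea: a proximity graph over agents, a graph convolutional block, and an encoder-decoder LSTM. The layer sizes, 10 m distance threshold, and 5-step horizon are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

def proximity_adjacency(positions, threshold=10.0):
    """Connect agents whose last observed positions lie within `threshold` meters."""
    dist = torch.cdist(positions, positions)           # (N, N) pairwise distances
    adj = (dist < threshold).float()
    deg = adj.sum(-1, keepdim=True).clamp(min=1.0)
    return adj / deg                                   # row-normalized adjacency

class GraphConvBlock(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, T, F); mix features of neighboring agents at every timestep
        return torch.relu(self.lin(torch.einsum("nm,mtf->ntf", adj, x)))

class GRIPLike(nn.Module):
    def __init__(self, feat=2, hidden=64, horizon=5):
        super().__init__()
        self.gc = GraphConvBlock(feat, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat)
        self.horizon = horizon

    def forward(self, x, adj):
        h = self.gc(x, adj)                            # (N, T, hidden)
        _, (hn, cn) = self.encoder(h)
        step, preds = h[:, -1:, :], []
        for _ in range(self.horizon):                  # roll the decoder forward
            step, (hn, cn) = self.decoder(step, (hn, cn))
            preds.append(self.out(step))
        return torch.cat(preds, dim=1)                 # (N, horizon, 2) future (x, y)

x = torch.randn(8, 6, 2)                               # 8 agents, 6 observed frames
adj = proximity_adjacency(x[:, -1, :])
print(GRIPLike()(x, adj).shape)                        # torch.Size([8, 5, 2])
```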
Tweets Can Tell: Activity Recognition using Hybrid Long Short-Term Memory Model
Title | Tweets Can Tell: Activity Recognition using Hybrid Long Short-Term Memory Model |
Authors | Renhao Cui, Gagan Agrawal, Rajiv Ramnath |
Abstract | This paper presents techniques to detect the “offline” activity a person is engaged in when she is tweeting (such as dining, shopping or entertainment), in order to create a dynamic profile of the user, for uses such as better targeting of advertisements. To this end, we propose a hybrid LSTM model for rich contextual learning, along with studies on the effects of applying and combining multiple LSTM based methods with different contextual features. The hybrid model is shown to outperform a set of baselines and state-of-the-art methods. Finally, this paper presents an orthogonal validation with a real-case application. Our model generates an offline activity analysis for the followers of several well-known accounts, which is quite representative of the expected characteristics of these accounts. |
Tasks | Activity Recognition |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1908.02551v1 |
https://arxiv.org/pdf/1908.02551v1.pdf | |
PWC | https://paperswithcode.com/paper/tweets-can-tell-activity-recognition-using |
Repo | https://github.com/renhaocui/activityExtractor |
Framework | none |
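A hedged sketch of what a "hybrid" LSTM of this kind can look like: an LSTM over the tweet tokens whose final state is concatenated with hand-crafted contextual features before classification. The vocabulary size, eight-feature context vector, and six activity classes are invented for illustration, not the paper's setup.

```python
import torch
import torch.nn as nn

class HybridLSTM(nn.Module):
    def __init__(self, vocab=10000, emb=100, hidden=128, ctx_feats=8, classes=6):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden + ctx_feats, classes)

    def forward(self, tokens, ctx):
        _, (h, _) = self.lstm(self.emb(tokens))   # h: (1, B, hidden)
        # Fuse textual and contextual signals before the activity classifier
        return self.head(torch.cat([h[-1], ctx], dim=-1))

model = HybridLSTM()
tokens = torch.randint(0, 10000, (4, 20))   # a batch of 4 tweets, 20 tokens each
ctx = torch.randn(4, 8)                     # e.g., posting time, user statistics
logits = model(tokens, ctx)                 # (4, 6) activity scores
```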
Flow-Motion and Depth Network for Monocular Stereo and Beyond
Title | Flow-Motion and Depth Network for Monocular Stereo and Beyond |
Authors | Kaixuan Wang, Shaojie Shen |
Abstract | We propose a learning-based method that solves monocular stereo and can be extended to fuse depth information from multiple target frames. Given two unconstrained images from a monocular camera with known intrinsic calibration, our network estimates relative camera poses and the depth map of the source image. The core contribution of the proposed method is threefold. First, a network is tailored for static scenes that jointly estimates the optical flow and camera motion. By the joint estimation, the optical flow search space is gradually reduced, resulting in an efficient and accurate flow estimation. Second, a novel triangulation layer is proposed to encode the estimated optical flow and camera motion while avoiding common numerical issues caused by epipolar geometry. Third, beyond two-view depth estimation, we further extend the above networks to fuse depth information from multiple target images and estimate the depth map of the source image. To further benefit the research community, we introduce tools to generate photorealistic structure-from-motion datasets such that deep networks can be well trained and evaluated. The proposed method is compared with previous methods and achieves state-of-the-art results in less time. Images from real-world applications and Google Earth are used to demonstrate the generalization ability of the method. |
Tasks | Calibration, Depth Estimation, Optical Flow Estimation |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05452v1 |
https://arxiv.org/pdf/1909.05452v1.pdf | |
PWC | https://paperswithcode.com/paper/flow-motion-and-depth-network-for-monocular |
Repo | https://github.com/HKUST-Aerial-Robotics/Flow-Motion-Depth |
Framework | none |
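The classical operation underlying the paper's learned triangulation layer is two-view triangulation from flow and relative pose. Below is a standard linear (DLT) version as a reference point; the intrinsics, pose, and example correspondence are made up, and the learned layer itself works on dense feature maps rather than single points.

```python
import numpy as np

def triangulate(p1, p2, K, R, t):
    """DLT triangulation: p1, p2 are pixel coords of one point in two views."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # source camera at origin
    P2 = K @ np.hstack([R, t.reshape(3, 1)])            # target camera
    A = np.stack([
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
        p2[0] * P2[2] - P2[0],
        p2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)                         # least-squares solution
    X = vt[-1]
    return X[:3] / X[3]                                 # 3D point in source frame

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
R = np.eye(3)
t = np.array([-0.1, 0., 0.])                 # camera moved 0.1 m to the right
p1 = np.array([300., 240.])
p2 = p1 + np.array([-25., 0.])               # optical flow of (-25, 0) pixels
print(triangulate(p1, p2, K, R, t))          # z component is the depth: 2.0 m
```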
Meta-learnt priors slow down catastrophic forgetting in neural networks
Title | Meta-learnt priors slow down catastrophic forgetting in neural networks |
Authors | Giacomo Spigler |
Abstract | Current training regimes for deep learning usually involve exposure to a single task / dataset at a time. Here we start from the observation that in this context the trained model is not given any knowledge of anything outside its (single-task) training distribution, and thus has no way to learn parameters (i.e., feature detectors or policies) that could be helpful in solving other tasks and in limiting future interference with the acquired knowledge, and thus catastrophic forgetting. We show that catastrophic forgetting can be mitigated in a meta-learning context, by exposing a neural network to multiple tasks in a sequential manner during training. Finally, we present SeqFOMAML, a meta-learning algorithm that implements these principles, and we evaluate it on sequential learning problems composed of Omniglot and MiniImageNet classification tasks. |
Tasks | Meta-Learning, Omniglot |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04170v2 |
https://arxiv.org/pdf/1909.04170v2.pdf | |
PWC | https://paperswithcode.com/paper/meta-learnt-priors-slow-down-catastrophic |
Repo | https://github.com/spiglerg/pyMeta |
Framework | tf |
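For readers who want the mechanics, here is an illustrative first-order MAML (FOMAML) step, the meta-learning machinery SeqFOMAML builds on. The model, toy regression tasks, and step sizes are placeholders, and, unlike a faithful run, the meta-gradient is evaluated on the support data rather than a held-out query set.

```python
import copy
import torch
import torch.nn as nn

def fomaml_step(model, tasks, inner_lr=0.01, outer_lr=0.001, inner_steps=5):
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for support_x, support_y in tasks:
        fast = copy.deepcopy(model)                   # task-specific copy
        opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                  # inner loop: adapt to the task
            opt.zero_grad()
            nn.functional.mse_loss(fast(support_x), support_y).backward()
            opt.step()
        # First-order approximation: take the gradient at the adapted weights
        # (a faithful run would evaluate this on held-out query data)
        loss = nn.functional.mse_loss(fast(support_x), support_y)
        for g, p in zip(meta_grads, torch.autograd.grad(loss, fast.parameters())):
            g += p
    with torch.no_grad():                             # outer loop: meta-update
        for p, g in zip(model.parameters(), meta_grads):
            p -= outer_lr * g / len(tasks)

model = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
tasks = [(torch.randn(16, 4), torch.randn(16, 1)) for _ in range(3)]
fomaml_step(model, tasks)   # one meta-update over a batch of tasks
```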
Indoor Depth Completion with Boundary Consistency and Self-Attention
Title | Indoor Depth Completion with Boundary Consistency and Self-Attention |
Authors | Yu-Kai Huang, Tsung-Han Wu, Yueh-Cheng Liu, Winston H. Hsu |
Abstract | Depth estimation features are helpful for 3D recognition. Commodity-grade depth cameras are able to capture depth and color images in real time. However, glossy, transparent, or distant surfaces cannot be scanned properly by the sensor. As a result, enhancement and restoration of sensed depth is an important task. Depth completion aims at filling the holes that sensors fail to detect, which is still a complex task for machines to learn. Traditional hand-tuned methods have reached their limits, while neural-network-based methods tend to copy and interpolate the output from surrounding depth values. This leads to blurred boundaries, and the structures of the depth map are lost. Consequently, our main work is to design an end-to-end network that improves completed depth maps while maintaining edge clarity. We utilize a self-attention mechanism, previously used in image inpainting, to extract more useful information in each layer of convolution so that the completed depth map is enhanced. In addition, we propose a boundary consistency concept to enhance the depth map quality and structure. Experimental results validate the effectiveness of our self-attention and boundary consistency scheme, which outperforms previous state-of-the-art depth completion work on the Matterport3D dataset. Our code is publicly available at https://github.com/patrickwu2/Depth-Completion |
Tasks | Depth Completion, Depth Estimation, Image Inpainting |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08344v2 |
https://arxiv.org/pdf/1908.08344v2.pdf | |
PWC | https://paperswithcode.com/paper/indoor-depth-completion-with-boundary |
Repo | https://github.com/patrickwu2/Depth-Completion |
Framework | pytorch |
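The self-attention mechanism borrowed from image inpainting is typically a convolutional self-attention block of the SAGAN flavor; a compact PyTorch version is sketched below. Channel sizes are illustrative, and the paper's boundary-consistency loss is not shown.

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 8, 1)   # 1x1 query projection
        self.k = nn.Conv2d(channels, channels // 8, 1)   # 1x1 key projection
        self.v = nn.Conv2d(channels, channels, 1)        # 1x1 value projection
        self.gamma = nn.Parameter(torch.zeros(1))        # starts as the identity map

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)         # (B, HW, C/8)
        k = self.k(x).flatten(2)                         # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)              # (B, HW, HW) attention map
        v = self.v(x).flatten(2)                         # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                      # residual connection

depth_feats = torch.randn(1, 64, 32, 32)
print(SelfAttention2d(64)(depth_feats).shape)            # torch.Size([1, 64, 32, 32])
```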
OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching
Title | OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching |
Authors | Changhee Won, Jongbin Ryu, Jongwoo Lim |
Abstract | In this paper, we propose a novel end-to-end deep neural network model for omnidirectional depth estimation from a wide-baseline multi-view stereo setup. The images captured with ultra wide field-of-view (FOV) cameras on an omnidirectional rig are processed by the feature extraction module, and then the deep feature maps are warped onto the concentric spheres swept through all candidate depths using the calibrated camera parameters. The 3D encoder-decoder block takes the aligned feature volume to produce the omnidirectional depth estimate with regularization on uncertain regions utilizing the global context information. In addition, we present large-scale synthetic datasets for training and testing omnidirectional multi-view stereo algorithms. Our datasets consist of 11K ground-truth depth maps and 45K fisheye images in four orthogonal directions with various objects and environments. Experimental results show that the proposed method generates excellent results in both synthetic and real-world environments, and it outperforms the prior art and the omnidirectional versions of the state-of-the-art conventional stereo algorithms. |
Tasks | Depth Estimation, Stereo Matching |
Published | 2019-08-17 |
URL | https://arxiv.org/abs/1908.06257v1 |
https://arxiv.org/pdf/1908.06257v1.pdf | |
PWC | https://paperswithcode.com/paper/omnimvs-end-to-end-learning-for |
Repo | https://github.com/matsuren/omnimvs_pytorch |
Framework | pytorch |
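A simplified sphere-sweep sketch of the warping step described above: for each candidate depth, points on a sphere around the rig are projected into one camera and used to sample that camera's feature map. A pinhole projection stands in for the paper's calibrated fisheye model, and all resolutions and depths are illustrative.

```python
import torch
import torch.nn.functional as F

def sphere_sweep(feat, K, R, t, depths, H=32, W=64):
    # Ray directions on an equirectangular (longitude, latitude) grid
    lon = torch.linspace(-torch.pi, torch.pi, W)
    lat = torch.linspace(-torch.pi / 2, torch.pi / 2, H)
    lat, lon = torch.meshgrid(lat, lon, indexing="ij")
    rays = torch.stack([lat.cos() * lon.sin(), lat.sin(), lat.cos() * lon.cos()], -1)
    volume = []
    for d in depths:                                   # one slice per swept sphere
        pts = rays.view(-1, 3) * d                     # points on sphere of radius d
        cam = R @ pts.T + t[:, None]                   # into the camera frame
        uv = (K @ cam).T
        uv = uv[:, :2] / uv[:, 2:].clamp(min=1e-6)     # pixel coordinates
        grid = uv.clone()                              # normalize to [-1, 1]
        grid[:, 0] = uv[:, 0] / (feat.shape[-1] - 1) * 2 - 1
        grid[:, 1] = uv[:, 1] / (feat.shape[-2] - 1) * 2 - 1
        sampled = F.grid_sample(feat, grid.view(1, H, W, 2), align_corners=True)
        volume.append(sampled)
    return torch.stack(volume, dim=2)                  # (1, C, D, H, W) feature volume

feat = torch.randn(1, 16, 48, 64)                      # one camera's feature map
K = torch.tensor([[30., 0., 32.], [0., 30., 24.], [0., 0., 1.]])
vol = sphere_sweep(feat, K, torch.eye(3), torch.zeros(3), depths=[1.0, 2.0, 4.0])
print(vol.shape)                                       # torch.Size([1, 16, 3, 32, 64])
```

In the full model, volumes from all four fisheye cameras are concatenated and fed to the 3D encoder-decoder that regularizes the depth estimate.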
Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations
Title | Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations |
Authors | Lily Xu, Shahrzad Gholami, Sara Mc Carthy, Bistra Dilkina, Andrew Plumptre, Milind Tambe, Rohit Singh, Mustapha Nsubuga, Joshua Mabonga, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, Tom Okello, Eric Enyel |
Abstract | Illegal wildlife poaching threatens ecosystems and drives endangered species toward extinction. However, efforts for wildlife protection are constrained by the limited resources of law enforcement agencies. To help combat poaching, the Protection Assistant for Wildlife Security (PAWS) is a machine learning pipeline that has been developed as a data-driven approach to identify areas at high risk of poaching throughout protected areas and compute optimal patrol routes. In this paper, we take an end-to-end approach to the data-to-deployment pipeline for anti-poaching. In doing so, we address challenges including extreme class imbalance (up to 1:200), bias, and uncertainty in wildlife poaching data to enhance PAWS, and we apply our methodology to three national parks with diverse characteristics. (i) We use Gaussian processes to quantify predictive uncertainty, which we exploit to improve robustness of our prescribed patrols and increase detection of snares by an average of 30%. We evaluate our approach on real-world historical poaching data from Murchison Falls and Queen Elizabeth National Parks in Uganda and, for the first time, Srepok Wildlife Sanctuary in Cambodia. (ii) We present the results of large-scale field tests conducted in Murchison Falls and Srepok Wildlife Sanctuary which confirm that the predictive power of PAWS extends promisingly to multiple parks. This paper is part of an effort to expand PAWS to 800 parks around the world through integration with SMART conservation software. |
Tasks | Gaussian Processes |
Published | 2019-03-08 |
URL | https://arxiv.org/abs/1903.06669v3 |
https://arxiv.org/pdf/1903.06669v3.pdf | |
PWC | https://paperswithcode.com/paper/stay-ahead-of-poachers-illegal-wildlife |
Repo | https://github.com/lily-x/PAWS-public |
Framework | none |
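As a hedged sketch of the uncertainty-aware ranking the abstract describes: fit a Gaussian process to historical patrol observations and rank park cells by an upper confidence bound on predicted risk, so patrols visit cells that are risky and uncertain. The features, kernel, and UCB weight below are illustrative stand-ins, not the PAWS configuration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X_hist = rng.uniform(0, 1, (200, 3))          # e.g., distance to road, water, slope
y_hist = (X_hist[:, 0] < 0.3).astype(float)   # fake historical snare observations

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_hist, y_hist)

X_cells = rng.uniform(0, 1, (500, 3))         # candidate park cells to patrol
mean, std = gp.predict(X_cells, return_std=True)
ucb = mean + 1.0 * std                        # prefer risky *and* uncertain cells
patrol_order = np.argsort(-ucb)[:20]          # top-20 cells for the next patrol
print(patrol_order)
```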
TEASER: Early and Accurate Time Series Classification
Title | TEASER: Early and Accurate Time Series Classification |
Authors | P. Schäfer, U. Leser |
Abstract | Early time series classification (eTSC) is the problem of classifying a time series after as few measurements as possible with the highest possible accuracy. The most critical issue of any eTSC method is to decide when enough data of a time series has been seen to take a decision: waiting for more data points usually makes the classification problem easier but delays the time at which a classification is made; in contrast, earlier classification has to cope with less input data, often leading to inferior accuracy. State-of-the-art eTSC methods compute a fixed optimal decision time, assuming that every time series has the same defined start time (like turning on a machine). However, in many real-life applications measurements start at arbitrary times (like measuring the heartbeats of a patient), implying that the best time for taking a decision varies heavily between time series. We present TEASER, a novel algorithm that models eTSC as a two-tier classification problem: in the first tier, a classifier periodically assesses the incoming time series to compute class probabilities. However, these class probabilities are only used as an output label if a second-tier classifier decides that the predicted label is reliable enough, which can happen after a different number of measurements. In an evaluation using 45 benchmark datasets, TEASER is two to three times earlier in its predictions than its competitors while reaching the same or even higher classification accuracy. We further show TEASER's superior performance using real-life use cases, namely energy monitoring and gait detection. |
Tasks | Time Series, Time Series Classification |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03405v2 |
https://arxiv.org/pdf/1908.03405v2.pdf | |
PWC | https://paperswithcode.com/paper/teaser-early-and-accurate-time-series |
Repo | https://github.com/patrickzib/SFA |
Framework | tf |
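A toy rendering of the two-tier idea: a first-tier classifier scores growing prefixes of a time series, and a second-tier gate accepts the label only once the probabilities look reliable. TEASER learns its gates (one-class SVMs per snapshot); the fixed confidence threshold and synthetic data below are stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
T, n = 100, 300
labels = rng.integers(0, 2, n)
# Class signal that grows over time, so later prefixes are easier to classify
series = rng.normal(0, 1, (n, T)) + labels[:, None] * np.linspace(0, 2, T)

checkpoints = [20, 40, 60, 80, 100]
tier1 = {t: LogisticRegression().fit(series[:, :t], labels) for t in checkpoints}

def classify_early(x, threshold=0.9):
    for t in checkpoints:
        proba = tier1[t].predict_proba(x[:t][None])[0]
        if proba.max() >= threshold:        # second tier: accept if confident
            return int(proba.argmax()), t   # predicted label and decision time
    return int(proba.argmax()), T           # fall back to the full series

print(classify_early(series[0]))
```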
Simplify the Usage of Lexicon in Chinese NER
Title | Simplify the Usage of Lexicon in Chinese NER |
Authors | Minlong Peng, Ruotian Ma, Qi Zhang, Xuanjing Huang |
Abstract | Recently, many works have tried to utilize word lexicons to augment the performance of Chinese named entity recognition (NER). As a representative work in this line, Lattice-LSTM \cite{zhang2018chinese} has achieved new state-of-the-art performance on several benchmark Chinese NER datasets. However, Lattice-LSTM suffers from a complicated model architecture, resulting in low computational efficiency. This heavily limits its application in many industrial areas that require real-time NER responses. In this work, we ask the question: can we simplify the usage of the lexicon and, at the same time, achieve performance comparable to Lattice-LSTM for Chinese NER? Starting from this question and motivated by the idea of Lattice-LSTM, we propose a concise but effective method to incorporate the lexicon information into the vector representations of characters. This way, our method avoids introducing a complicated sequence modeling architecture to model the lexicon information. Instead, it only needs to subtly adjust the character representation layer of the neural sequence model. An experimental study on four benchmark Chinese NER datasets shows that our method achieves much faster inference speed and comparable or better performance than Lattice-LSTM and its followers. It also shows that our method can be easily transferred across different neural architectures. |
Tasks | Chinese Named Entity Recognition, Named Entity Recognition |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05969v1 |
https://arxiv.org/pdf/1908.05969v1.pdf | |
PWC | https://paperswithcode.com/paper/simplify-the-usage-of-lexicon-in-chinese-ner |
Repo | https://github.com/v-mipeng/LexiconAugmentedNER |
Framework | pytorch |
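A schematic of the core trick as we read it: for every character, collect the lexicon words that Begin, continue (Middle), or End at it, or equal it (Single), pool each set's embeddings, and concatenate the result onto the character embedding, leaving the sequence model untouched. The tiny lexicon, random character encoder, and mean pooling are illustrative simplifications.

```python
import torch
import torch.nn as nn

lexicon = {"南京": 0, "南京市": 1, "市长": 2, "长江": 3, "长江大桥": 4, "大桥": 5}
word_emb = nn.Embedding(len(lexicon), 8)

def char_word_sets(sentence):
    """For each character, the lexicon word ids in its B/M/E/S sets."""
    sets = [{"B": [], "M": [], "E": [], "S": []} for _ in sentence]
    for word, idx in lexicon.items():
        start = sentence.find(word)
        while start != -1:
            end = start + len(word) - 1
            if start == end:
                sets[start]["S"].append(idx)
            else:
                sets[start]["B"].append(idx)
                sets[end]["E"].append(idx)
                for i in range(start + 1, end):
                    sets[i]["M"].append(idx)
            start = sentence.find(word, start + 1)
    return sets

def augment(sentence, char_emb_dim=16):
    char_emb = torch.randn(len(sentence), char_emb_dim)  # stand-in char encoder
    feats = []
    for s in char_word_sets(sentence):
        pooled = [word_emb(torch.tensor(s[k])).mean(0) if s[k]
                  else torch.zeros(8) for k in "BMES"]
        feats.append(torch.cat(pooled))                  # one vector per character
    return torch.cat([char_emb, torch.stack(feats)], dim=-1)

print(augment("南京市长江大桥").shape)                    # (7, 16 + 4*8)
```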
NeuronBlocks: Building Your NLP DNN Models Like Playing Lego
Title | NeuronBlocks: Building Your NLP DNN Models Like Playing Lego |
Authors | Ming Gong, Linjun Shou, Wutao Lin, Zhijie Sang, Quanjia Yan, Ze Yang, Feixiang Cheng, Daxin Jiang |
Abstract | Deep Neural Networks (DNN) have been widely employed in industry to address various Natural Language Processing (NLP) tasks. However, many engineers find it a big overhead when they have to choose from multiple frameworks, compare different types of models, and understand various optimization mechanisms. An NLP toolkit for DNN models with both generality and flexibility can greatly improve the productivity of engineers by saving their learning cost and guiding them to find optimal solutions to their tasks. In this paper, we introduce NeuronBlocks\footnote{Code: \url{https://github.com/Microsoft/NeuronBlocks}} \footnote{Demo: \url{https://youtu.be/x6cOpVSZcdo}}, a toolkit encapsulating a suite of neural network modules as building blocks to construct various DNN models with complex architecture. This toolkit empowers engineers to build, train, and test various NLP models through simple configuration of JSON files. The experiments on several NLP datasets such as GLUE, WikiQA and CoNLL-2003 demonstrate the effectiveness of NeuronBlocks. |
Tasks | |
Published | 2019-04-21 |
URL | https://arxiv.org/abs/1904.09535v3 |
https://arxiv.org/pdf/1904.09535v3.pdf | |
PWC | https://paperswithcode.com/paper/neuronblocks-building-your-nlp-dnn-models |
Repo | https://github.com/Microsoft/NeuronBlocks |
Framework | pytorch |
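NeuronBlocks models are assembled from JSON configuration. The snippet below only mimics that pattern with a deliberately minimal, made-up schema to show the "blocks as Lego" idea; the toolkit's real configuration format is richer (see the repo for actual examples).

```python
import json
import torch.nn as nn

# A hypothetical block-list config, NOT NeuronBlocks' actual schema
config = json.loads("""
{
  "blocks": [
    {"type": "Embedding", "num_embeddings": 5000, "embedding_dim": 64},
    {"type": "LSTM", "input_size": 64, "hidden_size": 128, "batch_first": true},
    {"type": "Linear", "in_features": 128, "out_features": 3}
  ]
}
""")

registry = {"Embedding": nn.Embedding, "LSTM": nn.LSTM, "Linear": nn.Linear}

def build(cfg):
    # Each block entry names a module type; the rest of the keys are its kwargs
    return [registry[b.pop("type")](**b) for b in cfg["blocks"]]

blocks = build(config)   # the model is assembled declaratively, "like Lego"
print(blocks)
```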
Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining
Title | Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining |
Authors | Gökberk Koçak, Özgür Akgün, Tias Guns, Ian Miguel |
Abstract | Finding interesting patterns is a challenging task in data mining. Constraint-based mining is a well-known approach to this, and one for which constraint programming has been shown to be a well-suited and generic framework. Dominance programming has been proposed as an extension that can capture an even wider class of constraint-based mining problems by allowing relations between patterns to be compared. In this paper, in addition to specifying a dominance relation, we introduce the ability to specify an incomparability condition. Using these two concepts we devise a generic framework that performs a batch-wise search that avoids checking incomparable solutions. We extend the ESSENCE language and the underlying modelling pipeline to support this. We use the generator itemset mining problem as a test case and give a declarative specification for it. We also present preliminary experimental results on this specific problem class with a CP solver backend, showing that using the incomparability condition during search can improve the efficiency of dominance programming and reduce the need for post-processing to filter dominated solutions. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00505v1 |
https://arxiv.org/pdf/1910.00505v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-improving-solution-dominance-with |
Repo | https://github.com/stacs-cp/ModRef2019-Dominance |
Framework | none |
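The paper's framework is expressed in ESSENCE on top of a CP solver; the plain-Python toy below only illustrates the underlying notions on the test-case problem. A generator itemset is one that no proper subset dominates (i.e., no subset has the same support), while itemsets with different support are incomparable and never prune one another. The four-transaction dataset is made up.

```python
from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
items = sorted(set().union(*transactions))

def support(itemset):
    """Number of transactions containing all of the itemset's items."""
    return sum(itemset <= t for t in transactions)

def dominated(itemset):
    # Dominated iff a proper subset has the same support (not a generator)
    return any(support(set(s)) == support(itemset)
               for r in range(len(itemset))
               for s in combinations(itemset, r))

candidates = [set(s) for r in range(1, len(items) + 1)
              for s in combinations(items, r)]
generators = [s for s in candidates if support(s) > 0 and not dominated(s)]
print(generators)   # itemsets that no same-support subset dominates
```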
Dual-modality seq2seq network for audio-visual event localization
Title | Dual-modality seq2seq network for audio-visual event localization |
Authors | Yan-Bo Lin, Yu-Jhe Li, Yu-Chiang Frank Wang |
Abstract | Audio-visual event localization requires one to identify the event which is both visible and audible in a video (either at a frame or video level). To address this task, we propose a deep neural network named Audio-Visual sequence-to-sequence dual network (AVSDN). By jointly taking both audio and visual features at each time segment as inputs, our proposed model learns global and local event information in a sequence-to-sequence manner, which can be realized in either fully supervised or weakly supervised settings. Empirical results confirm that our proposed method performs favorably against recent deep learning approaches in both settings. |
Tasks | |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07473v1 |
http://arxiv.org/pdf/1902.07473v1.pdf | |
PWC | https://paperswithcode.com/paper/dual-modality-seq2seq-network-for-audio |
Repo | https://github.com/YapengTian/AVE-ECCV18 |
Framework | pytorch |
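A rough PyTorch outline of a dual-modality sequence-to-sequence model in this spirit: per-segment audio and visual features are fused and encoded, and the encoder's global state seeds a decoder that emits per-segment event predictions. All dimensions and the class count are invented for illustration.

```python
import torch
import torch.nn as nn

class DualSeq2Seq(nn.Module):
    def __init__(self, a_dim=128, v_dim=512, hidden=128, classes=29):
        super().__init__()
        self.enc = nn.LSTM(a_dim + v_dim, hidden, batch_first=True)
        self.dec = nn.LSTM(hidden, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, classes)

    def forward(self, audio, visual):
        # Fuse modalities per segment, then encode the whole sequence
        fused, state = self.enc(torch.cat([audio, visual], dim=-1))
        out, _ = self.dec(fused, state)        # global state seeds the decoder
        return self.cls(out)                   # per-segment event logits

audio = torch.randn(2, 10, 128)                # 10 one-second segments per video
visual = torch.randn(2, 10, 512)
print(DualSeq2Seq()(audio, visual).shape)      # torch.Size([2, 10, 29])
```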
Sobolev Independence Criterion
Title | Sobolev Independence Criterion |
Authors | Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Dos Santos |
Abstract | We propose the Sobolev Independence Criterion (SIC), an interpretable dependency measure between a high dimensional random variable X and a response variable Y . SIC decomposes to the sum of feature importance scores and hence can be used for nonlinear feature selection. SIC can be seen as a gradient regularized Integral Probability Metric (IPM) between the joint distribution of the two random variables and the product of their marginals. We use sparsity inducing gradient penalties to promote input sparsity of the critic of the IPM. In the kernel version we show that SIC can be cast as a convex optimization problem by introducing auxiliary variables that play an important role in feature selection as they are normalized feature importance scores. We then present a neural version of SIC where the critic is parameterized as a homogeneous neural network, improving its representation power as well as its interpretability. We conduct experiments validating SIC for feature selection in synthetic and real-world experiments. We show that SIC enables reliable and interpretable discoveries, when used in conjunction with the holdout randomization test and knockoffs to control the False Discovery Rate. Code is available at http://github.com/ibm/sic. |
Tasks | Feature Importance, Feature Selection |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14212v1 |
https://arxiv.org/pdf/1910.14212v1.pdf | |
PWC | https://paperswithcode.com/paper/sobolev-independence-criterion |
Repo | https://github.com/IBM/SIC |
Framework | pytorch |
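A compact sketch of the SIC idea under stated assumptions: train a critic to separate the joint distribution of (X, Y) from the product of marginals (approximated by shuffling Y), with a sparsity-inducing penalty on the critic's input gradients, then read per-feature importance from those gradients. The architecture, penalty weight, and synthetic data are invented; see the repo for the actual method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d = 512, 5
X = torch.randn(n, d)
Y = X[:, 0:1] * 2 + 0.1 * torch.randn(n, 1)      # only feature 0 matters

critic = nn.Sequential(nn.Linear(d + 1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

for _ in range(500):
    perm = torch.randperm(n)
    joint = torch.cat([X, Y], dim=1).requires_grad_(True)
    indep = torch.cat([X, Y[perm]], dim=1)       # product of marginals via shuffle
    gap = critic(joint).mean() - critic(indep).mean()   # IPM-style witness gap
    grads = torch.autograd.grad(critic(joint).sum(), joint, create_graph=True)[0]
    penalty = grads[:, :d].pow(2).mean(0).sqrt().sum()  # group-sparse on X dims
    loss = -gap + 0.1 * penalty
    opt.zero_grad()
    loss.backward()
    opt.step()

joint = torch.cat([X, Y], dim=1).requires_grad_(True)
grads = torch.autograd.grad(critic(joint).sum(), joint)[0]
print(grads[:, :d].pow(2).mean(0))               # importance peaks at feature 0
```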
FCC-GAN: A Fully Connected and Convolutional Net Architecture for GANs
Title | FCC-GAN: A Fully Connected and Convolutional Net Architecture for GANs |
Authors | Sukarna Barua, Sarah Monazam Erfani, James Bailey |
Abstract | Generative Adversarial Networks (GANs) are a powerful class of generative models. Despite their successes, the most appropriate choice of a GAN network architecture is still not well understood. GAN models for image synthesis have adopted a deep convolutional network architecture, which eliminates or minimizes the use of fully connected and pooling layers in favor of convolution layers in the generator and discriminator. In this paper, we demonstrate that a convolutional network architecture utilizing deep fully connected layers and pooling layers can be more effective than the traditional convolution-only architecture, and we propose FCC-GAN, a fully connected and convolutional GAN architecture. Models based on our FCC-GAN architecture both learn faster than the conventional architecture and generate higher-quality samples. We demonstrate the effectiveness and stability of our approach across four popular image datasets. |
Tasks | Image Generation |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02417v2 |
https://arxiv.org/pdf/1905.02417v2.pdf | |
PWC | https://paperswithcode.com/paper/fcc-gan-a-fully-connected-and-convolutional |
Repo | https://github.com/sukarnabarua/fccgan |
Framework | tf |
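An illustrative FCC-GAN-style generator: a stack of deep fully connected layers first, then convolutional upsampling, in contrast to the DCGAN convention of a single projection layer. The layer widths and the 32x32 output are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class FCCGenerator(nn.Module):
    def __init__(self, z_dim=100):
        super().__init__()
        self.fc = nn.Sequential(                    # deep FC stack (the "FCC" part)
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, 128 * 8 * 8), nn.ReLU(),
        )
        self.conv = nn.Sequential(                  # conv upsampling to an image
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.conv(self.fc(z).view(-1, 128, 8, 8))

z = torch.randn(4, 100)                             # a batch of latent codes
print(FCCGenerator()(z).shape)                      # torch.Size([4, 3, 32, 32])
```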
Modeling Semantic Compositionality with Sememe Knowledge
Title | Modeling Semantic Compositionality with Sememe Knowledge |
Authors | Fanchao Qi, Junjie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, Maosong Sun |
Abstract | Semantic compositionality (SC) refers to the phenomenon that the meaning of a complex linguistic unit can be composed of the meanings of its constituents. Most related works focus on using complicated compositionality functions to model SC, while few works consider external knowledge in models. In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC with a confirmatory experiment. Furthermore, we make the first attempt to incorporate sememe knowledge into SC models, and employ the sememe-incorporated models in learning representations of multiword expressions, a typical task of SC. In experiments, we implement our models by incorporating knowledge from HowNet, a well-known sememe knowledge base, and perform both intrinsic and extrinsic evaluations. Experimental results show that our models achieve a significant performance boost compared to the baseline methods that do not consider sememe knowledge. We further conduct quantitative analysis and case studies to demonstrate the effectiveness of applying sememe knowledge in modeling SC. All the code and data of this paper can be obtained at https://github.com/thunlp/Sememe-SC. |
Tasks | multi-word expression embedding, multi-word expression sememe prediction |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04744v1 |
https://arxiv.org/pdf/1907.04744v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-semantic-compositionality-with |
Repo | https://github.com/thunlp/Sememe-SC |
Framework | tf |
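A toy rendering of sememe-aware composition: the representation of a multiword expression is built from its constituents' embeddings plus an aggregate of their sememe embeddings, here a simple mean followed by a linear composition function (the paper's models are more elaborate). The two-character example, sememe inventory, and dimensions are all made up.

```python
import torch
import torch.nn as nn

word_ids = {"铁": 0, "路": 1}                      # constituents of "铁路" (railway)
sememe_ids = {"metal": 0, "route": 1, "transport": 2}
word_sememes = {"铁": ["metal"], "路": ["route", "transport"]}

word_emb = nn.Embedding(len(word_ids), 16)
sememe_emb = nn.Embedding(len(sememe_ids), 16)
compose = nn.Linear(2 * 16, 16)                    # composition function

def mwe_representation(words):
    parts = []
    for w in words:
        wv = word_emb(torch.tensor(word_ids[w]))
        sv = sememe_emb(torch.tensor(
            [sememe_ids[s] for s in word_sememes[w]])).mean(0)
        parts.append(wv + sv)                      # inject sememe knowledge
    return torch.tanh(compose(torch.cat(parts)))

print(mwe_representation(["铁", "路"]).shape)       # torch.Size([16])
```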