Paper Group ANR 682
R$^3$-Net: A Deep Network for Multi-oriented Vehicle Detection in Aerial Images and Videos
Title | R$^3$-Net: A Deep Network for Multi-oriented Vehicle Detection in Aerial Images and Videos |
Authors | Qingpeng Li, Lichao Mou, Qizhi Xu, Yun Zhang, Xiao Xiang Zhu |
Abstract | Vehicle detection is a significant and challenging task in aerial remote sensing applications. Most existing methods detect vehicles with regular rectangle boxes and fail to offer the orientation of vehicles. However, the orientation information is crucial for several practical applications, such as the trajectory and motion estimation of vehicles. In this paper, we propose a novel deep network, called rotatable region-based residual network (R$^3$-Net), to detect multi-oriented vehicles in aerial images and videos. More specifically, R$^3$-Net is utilized to generate rotatable rectangular target boxes in a half coordinate system. First, we use a rotatable region proposal network (R-RPN) to generate rotatable regions of interest (R-RoIs) from feature maps produced by a deep convolutional neural network. Here, a proposed batch averaging rotatable anchor (BAR anchor) strategy is applied to initialize the shape of vehicle candidates. Next, we propose a rotatable detection network (R-DN) for the final classification and regression of the R-RoIs. In R-DN, a novel rotatable position sensitive pooling (R-PS pooling) is designed to keep the position and orientation information simultaneously while downsampling the feature maps of R-RoIs. In our model, R-RPN and R-DN can be trained jointly. We test our network on two open vehicle detection image datasets, namely the DLR 3K Munich Dataset and the VEDAI Dataset, demonstrating the high precision and robustness of our method. In addition, further experiments on aerial videos show the good generalization capability of the proposed method and its potential for vehicle tracking in aerial videos. The demo video is available at https://youtu.be/xCYD-tYudN0. |
Tasks | Motion Estimation |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05560v1 |
http://arxiv.org/pdf/1808.05560v1.pdf | |
PWC | https://paperswithcode.com/paper/r3-net-a-deep-network-for-multi-oriented |
Repo | |
Framework | |
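The abstract above hinges on angle-parameterized boxes. Below is a minimal, hedged sketch of one such building block: converting a rotated box given as (cx, cy, w, h, theta) into its four corner points. The parameterization and angle convention are common choices, not necessarily the paper's half-coordinate-system definition.

```python
# A minimal sketch of one building block behind rotatable detection:
# converting an angle-parameterized box (cx, cy, w, h, theta) into its
# four corner points. The conventions here are common choices, not
# necessarily the exact ones used by R^3-Net.
import numpy as np

def rotated_box_corners(cx, cy, w, h, theta):
    """Return a 4x2 array of corner coordinates of a rotated box.

    theta is in radians, measured counter-clockwise from the x-axis
    (an assumption; the paper defines its own angle coordinate system).
    """
    dx, dy = w / 2.0, h / 2.0
    # Corners of an axis-aligned box centered at the origin.
    corners = np.array([[-dx, -dy], [dx, -dy], [dx, dy], [-dx, dy]])
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])  # 2x2 rotation matrix
    return corners @ rot.T + np.array([cx, cy])

print(rotated_box_corners(50.0, 30.0, 20.0, 8.0, np.pi / 6))
```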
Relational dynamic memory networks
Title | Relational dynamic memory networks |
Authors | Trang Pham, Truyen Tran, Svetha Venkatesh |
Abstract | Neural networks excel in detecting regular patterns but are less successful in representing and manipulating complex data structures, possibly due to the lack of an external memory. This has led to the recent development of a new line of architectures known as Memory-Augmented Neural Networks (MANNs), each of which consists of a neural network that interacts with an external memory matrix. However, this RAM-like memory matrix is unstructured and thus does not naturally encode structured objects. Here we design a new MANN dubbed Relational Dynamic Memory Network (RMDN) to bridge the gap. Like existing MANNs, RMDN has a neural controller but its memory is structured as multi-relational graphs. RMDN uses the memory to represent and manipulate graph-structured data in response to queries; and as a neural network, RMDN is trainable from labeled data. Thus RMDN learns to answer queries about a set of graph-structured objects without explicit programming. We evaluate the capability of RMDN on several important prediction problems, including software vulnerability, molecular bioactivity and chemical-chemical interaction. Results demonstrate the efficacy of the proposed model. |
Tasks | |
Published | 2018-08-10 |
URL | http://arxiv.org/abs/1808.04247v3 |
http://arxiv.org/pdf/1808.04247v3.pdf | |
PWC | https://paperswithcode.com/paper/relational-dynamic-memory-networks |
Repo | |
Framework | |
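As a rough illustration of the memory design described above, the sketch below pairs a graph-structured memory (one slot per node, refreshed by a single round of neighbor averaging) with an attention-based controller read. It is a toy stand-in under stated assumptions, not the authors' RMDN architecture, which supports multiple relation types and learned update functions.

```python
# A minimal sketch (not the authors' implementation) of the core idea:
# a neural controller that reads from a memory whose slots are graph
# nodes, updated by message passing along the graph's edges.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def message_passing_step(node_mem, adj):
    """One round of mean-aggregation over neighbors (a stand-in for the
    relational memory update; the real model handles multiple relations)."""
    deg = adj.sum(axis=1, keepdims=True) + 1e-8
    return 0.5 * node_mem + 0.5 * (adj @ node_mem) / deg

def controller_read(query, node_mem):
    """Attention read: pool node memories weighted by similarity to the query."""
    weights = softmax(node_mem @ query)
    return weights @ node_mem

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node path graph
mem = rng.normal(size=(3, 4))            # one memory slot per node
mem = message_passing_step(mem, adj)
print(controller_read(rng.normal(size=4), mem))
```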
Generative Models for Spear Phishing Posts on Social Media
Title | Generative Models for Spear Phishing Posts on Social Media |
Authors | John Seymour, Philip Tully |
Abstract | Historically, machine learning in computer security has prioritized defense: think intrusion detection systems, malware classification, and botnet traffic identification. Offense can benefit from data just as well. Social networks, with their access to extensive personal data, bot-friendly APIs, colloquial syntax, and prevalence of shortened links, are the perfect venues for spreading machine-generated malicious content. We aim to discover what capabilities an adversary might utilize in such a domain. We present a long short-term memory (LSTM) neural network that learns to socially engineer specific users into clicking on deceptive URLs. The model is trained with word vector representations of social media posts, and in order to make a click-through more likely, it is dynamically seeded with topics extracted from the target’s timeline. We augment the model with clustering to triage high value targets based on their level of social engagement, and measure success of the LSTM’s phishing expedition using click-rates of IP-tracked links. We achieve state of the art success rates, tripling those of historic email attack campaigns, and outperform humans manually performing the same task. |
Tasks | Intrusion Detection, Malware Classification |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.05196v1 |
http://arxiv.org/pdf/1802.05196v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-models-for-spear-phishing-posts-on |
Repo | |
Framework | |
Vehicle Re-identification Using Quadruple Directional Deep Learning Features
Title | Vehicle Re-identification Using Quadruple Directional Deep Learning Features |
Authors | Jianqing Zhu, Huanqiang Zeng, Jingchang Huang, Shengcai Liao, Zhen Lei, Canhui Cai, LiXin Zheng |
Abstract | In order to resist the adverse effect of viewpoint variations and improve vehicle re-identification performance, we design quadruple directional deep learning networks to extract quadruple directional deep learning features (QD-DLF) of vehicle images. The quadruple directional deep learning networks share a similar overall architecture, namely the same basic deep learning architecture but different directional feature pooling layers. Specifically, the basic deep learning architecture is a short and densely connected convolutional neural network that extracts basic feature maps of an input square vehicle image in the first stage. Then, the quadruple directional deep learning networks utilize different directional pooling layers, i.e., a horizontal average pooling (HAP) layer, a vertical average pooling (VAP) layer, a diagonal average pooling (DAP) layer and an anti-diagonal average pooling (AAP) layer, to compress the basic feature maps into horizontal, vertical, diagonal and anti-diagonal directional feature maps, respectively. Finally, these directional feature maps are spatially normalized and concatenated together as a quadruple directional deep learning feature for vehicle re-identification. Extensive experiments on both the VeRi and VehicleID databases show that the proposed QD-DLF approach outperforms multiple state-of-the-art vehicle re-identification methods. |
Tasks | Vehicle Re-Identification |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05163v1 |
http://arxiv.org/pdf/1811.05163v1.pdf | |
PWC | https://paperswithcode.com/paper/vehicle-re-identification-using-quadruple |
Repo | |
Framework | |
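The four directional pooling layers are concrete enough to sketch. The snippet below applies horizontal, vertical, diagonal and anti-diagonal average pooling to a single square feature map and concatenates the normalized results; the orientation conventions and the normalization are assumptions rather than the released implementation.

```python
# A hedged sketch of the four directional average-pooling operations on a
# single square feature map. Whether HAP averages along rows or columns
# is an assumption here, not taken from the paper's code.
import numpy as np

def directional_pools(fmap):
    """fmap: (H, W) square feature map. Returns four 1-D directional
    descriptors: horizontal, vertical, diagonal, anti-diagonal."""
    h, w = fmap.shape
    assert h == w, "the method operates on square inputs"
    hap = fmap.mean(axis=1)                        # average each row
    vap = fmap.mean(axis=0)                        # average each column
    dap = np.array([np.diagonal(fmap, offset=k).mean()
                    for k in range(-(h - 1), w)])  # average each diagonal
    aap = np.array([np.diagonal(np.fliplr(fmap), offset=k).mean()
                    for k in range(-(h - 1), w)])  # average each anti-diagonal
    return hap, vap, dap, aap

hap, vap, dap, aap = directional_pools(np.arange(16, dtype=float).reshape(4, 4))
feature = np.concatenate([v / (np.linalg.norm(v) + 1e-12)
                          for v in (hap, vap, dap, aap)])  # normalize + concatenate
print(feature.shape)
```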
Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation
Title | Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation |
Authors | Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin, Ting Liu |
Abstract | Classic pipeline models for task-oriented dialogue systems require explicitly modeling the dialogue states and hand-crafted action spaces to query a domain-specific knowledge base. Conversely, sequence-to-sequence models learn to map the dialogue history to the response in the current turn without explicit knowledge base querying. In this work, we propose a novel framework that leverages the advantages of both classic pipeline and sequence-to-sequence models. Our framework models a dialogue state as a fixed-size distributed representation and uses this representation to query a knowledge base via an attention mechanism. Experiments on the Stanford Multi-turn Multi-domain Task-oriented Dialogue Dataset show that our framework significantly outperforms other sequence-to-sequence based baseline models on both automatic and human evaluation. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04441v1 |
http://arxiv.org/pdf/1806.04441v1.pdf | |
PWC | https://paperswithcode.com/paper/sequence-to-sequence-learning-for-task |
Repo | |
Framework | |
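A minimal sketch of the querying mechanism described above: a fixed-size dialogue-state vector scores embedded knowledge-base rows through a hypothetical bilinear attention and returns a weighted KB summary for the decoder. It is illustrative only, not the authors' model.

```python
# A minimal sketch (assumptions, not the authors' code) of the central
# mechanism: a fixed-size dialogue-state vector attends over embedded
# knowledge-base rows; the attended KB summary is fed to the decoder.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_kb(state, kb_rows, w):
    """state: (d,) dialogue-state vector; kb_rows: (n, d) embedded KB
    entries; w: (d, d) bilinear attention parameters (hypothetical)."""
    scores = kb_rows @ (w @ state)     # one relevance score per KB row
    alpha = softmax(scores)            # attention distribution over rows
    return alpha @ kb_rows, alpha      # KB summary vector + weights

rng = np.random.default_rng(1)
d, n = 8, 5
summary, alpha = attend_kb(rng.normal(size=d),
                           rng.normal(size=(n, d)),
                           rng.normal(size=(d, d)))
print(alpha.round(3), summary.shape)
```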
Vehicle Re-Identification in Context
Title | Vehicle Re-Identification in Context |
Authors | Aytaç Kanacı, Xiatian Zhu, Shaogang Gong |
Abstract | Existing vehicle re-identification (re-id) evaluation benchmarks consider strongly artificial test scenarios by assuming the availability of high quality images and fine-grained appearance at an almost constant image scale, reminiscent of images required for Automatic Number Plate Recognition, e.g. VeRi-776. Such assumptions are often invalid in realistic vehicle re-id scenarios where arbitrarily changing image resolutions (scales) are the norm. This makes the existing vehicle re-id benchmarks limited for testing the true performance of a re-id method. In this work, we introduce a more realistic and challenging vehicle re-id benchmark, called Vehicle Re-Identification in Context (VRIC). In contrast to existing datasets, VRIC is uniquely characterised by vehicle images subject to more realistic and unconstrained variations in resolution (scale), motion blur, illumination, occlusion, and viewpoint. It contains 60,430 images of 5,622 vehicle identities captured by 60 different cameras at heterogeneous road traffic scenes in both day-time and night-time. |
Tasks | Vehicle Re-Identification |
Published | 2018-09-25 |
URL | http://arxiv.org/abs/1809.09409v2 |
http://arxiv.org/pdf/1809.09409v2.pdf | |
PWC | https://paperswithcode.com/paper/vehicle-re-identification-in-context |
Repo | |
Framework | |
Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots
Title | Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots |
Authors | Yuqian Jiang, Fangkai Yang, Shiqi Zhang, Peter Stone |
Abstract | Task-motion planning (TMP) addresses the problem of efficiently generating executable and low-cost task plans in a discrete space such that the (initially unknown) action costs are determined by motion plans in a corresponding continuous space. However, a task-motion plan can be sensitive to unexpected domain uncertainty and changes, leading to suboptimal behaviors or execution failures. In this paper, we propose a novel framework, TMP-RL, which is an integration of TMP and reinforcement learning (RL) from the execution experience, to solve the problem of robust task-motion planning in dynamic and uncertain domains. TMP-RL features two nested planning-learning loops. In the inner TMP loop, the robot generates a low-cost, feasible task-motion plan by iteratively planning in the discrete space and updating relevant action costs evaluated by the motion planner in continuous space. In the outer loop, the plan is executed, and the robot learns from the execution experience via model-free RL, to further improve its task-motion plans. RL in the outer loop is more accurate to the current domain but also more expensive, and using less costly task and motion planning leads to a jump-start for learning in the real world. Our approach is evaluated on a mobile service robot conducting navigation tasks in an office area. Results show that the TMP-RL approach significantly improves adaptability and robustness (in comparison to TMP methods) and leads to rapid convergence (in comparison to task planning (TP)-RL methods). We also show that TMP-RL can reuse learned values to smoothly adapt to new scenarios during long-term deployments. |
Tasks | Decision Making, Motion Planning |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08955v1 |
http://arxiv.org/pdf/1811.08955v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-task-motion-planning-with |
Repo | |
Framework | |
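The sketch below mirrors only the structure of the two nested loops: an inner step that orders actions by current cost estimates (standing in for task-motion planning) and an outer step that executes actions and updates the estimates from noisy execution costs (standing in for model-free RL). All names, costs, and numbers are illustrative assumptions, not the authors' system.

```python
# A structural sketch (assumptions throughout) of TMP-RL's nested loops,
# reduced to cost estimation: planner estimates seed the cost table and
# execution experience refines it.
import random

waypoints = {"door": 4.0, "hall": 6.0, "desk": 3.0}       # true (unknown) motion costs
motion_planner = lambda a: 0.5 * waypoints[a]              # optimistic geometric estimate
execute = lambda a: waypoints[a] + random.uniform(0, 2)    # noisy execution cost

def tmp_rl(actions, learned, episodes=50, alpha=0.3):
    for _ in range(episodes):
        # Inner loop stand-in: order actions by current cost estimates
        # (learned values override the planner's estimates once available).
        plan = sorted(actions, key=lambda a: learned.get(a, motion_planner(a)))
        # Outer loop: execute the plan and learn from observed costs.
        for a in plan:
            cost = execute(a)
            old = learned.get(a, motion_planner(a))
            learned[a] = old + alpha * (cost - old)        # TD-style running update
    return learned

print(tmp_rl(list(waypoints), {}))
```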
Multi-Agent Deep Reinforcement Learning with Human Strategies
Title | Multi-Agent Deep Reinforcement Learning with Human Strategies |
Authors | Thanh Nguyen, Ngoc Duy Nguyen, Saeid Nahavandi |
Abstract | Deep learning has enabled traditional reinforcement learning methods to deal with high-dimensional problems. However, one of the disadvantages of deep reinforcement learning methods is the limited exploration capacity of learning agents. In this paper, we introduce an approach that integrates human strategies to increase the exploration capacity of multiple deep reinforcement learning agents. We also report the development of our own multi-agent environment called Multiple Tank Defence to simulate the proposed approach. The results show the significant performance improvement of multiple agents that have learned cooperatively with human strategies. This implies that there is a critical need for human intellect teamed with machines to solve complex problems. In addition, the success of this simulation indicates that our multi-agent environment can be used as a testbed platform to develop and validate other multi-agent control algorithms. |
Tasks | |
Published | 2018-06-12 |
URL | https://arxiv.org/abs/1806.04562v2 |
https://arxiv.org/pdf/1806.04562v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-deep-reinforcement-learning-with-1 |
Repo | |
Framework | |
Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization
Title | Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization |
Authors | Minshuo Chen, Lin Yang, Mengdi Wang, Tuo Zhao |
Abstract | Stochastic optimization naturally arises in machine learning. Efficient algorithms with provable guarantees, however, are still largely missing, when the objective function is nonconvex and the data points are dependent. This paper studies this fundamental challenge through a streaming PCA problem for stationary time series data. Specifically, our goal is to estimate the principal component of time series data with respect to the covariance matrix of the stationary distribution. Computationally, we propose a variant of Oja’s algorithm combined with downsampling to control the bias of the stochastic gradient caused by the data dependency. Theoretically, we quantify the uncertainty of our proposed stochastic algorithm based on diffusion approximations. This allows us to prove the asymptotic rate of convergence and further implies near optimal asymptotic sample complexity. Numerical experiments are provided to support our analysis. |
Tasks | Dimensionality Reduction, Stochastic Optimization, Time Series |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02312v4 |
http://arxiv.org/pdf/1803.02312v4.pdf | |
PWC | https://paperswithcode.com/paper/dimensionality-reduction-for-stationary-time |
Repo | |
Framework | |
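Downsampled Oja updates are easy to illustrate. The sketch below runs Oja's algorithm on an AR(1) time series, using only every k-th sample to weaken the dependence between consecutive stochastic gradients; the step size, gap, and process are illustrative choices, not the paper's settings or analysis.

```python
# A hedged sketch of downsampled Oja's algorithm for streaming PCA on a
# dependent (autoregressive) time series: only every gap-th sample is used,
# which weakens the dependence between consecutive updates.
import numpy as np

rng = np.random.default_rng(0)
d, n, gap, eta = 5, 100_000, 10, 0.02

# Stationary AR(1) process x_t = 0.5 * x_{t-1} + noise, with anisotropic noise
# so that the first coordinate has the largest stationary variance.
noise_scale = np.array([2.0, 1.0, 0.5, 0.5, 0.5])
w = rng.normal(size=d)
w /= np.linalg.norm(w)

x = np.zeros(d)
for t in range(n):
    x = 0.5 * x + noise_scale * rng.normal(size=d)
    if t % gap:                     # downsampling: skip dependent samples
        continue
    w += eta * x * (x @ w)          # Oja's rank-one stochastic update
    w /= np.linalg.norm(w)          # project back onto the unit sphere

print(np.abs(w))  # typically dominated by the first coordinate (largest variance)
```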
Deep Communicating Agents for Abstractive Summarization
Title | Deep Communicating Agents for Abstractive Summarization |
Authors | Asli Celikyilmaz, Antoine Bosselut, Xiaodong He, Yejin Choi |
Abstract | We present deep communicating agents in an encoder-decoder architecture to address the challenges of representing a long document for abstractive summarization. With deep communicating agents, the task of encoding a long text is divided across multiple collaborating agents, each in charge of a subsection of the input text. These encoders are connected to a single decoder, trained end-to-end using reinforcement learning to generate a focused and coherent summary. Empirical results demonstrate that multiple communicating encoders lead to a higher quality summary compared to several strong baselines, including those based on a single encoder or multiple non-communicating encoders. |
Tasks | Abstractive Text Summarization |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10357v3 |
http://arxiv.org/pdf/1803.10357v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-communicating-agents-for-abstractive |
Repo | |
Framework | |
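A toy sketch of the encoder-agent idea described above: the document is split into chunks, each chunk is encoded independently (here by a bag-of-embeddings stand-in), the agents exchange one round of messages, and the combined states form the decoder context. The real model uses recurrent encoders, attention, and reinforcement learning; everything below is an assumption-laden simplification.

```python
# A toy sketch (assumptions, not the paper's architecture) of splitting a
# long document across encoder "agents" that communicate before decoding.
import numpy as np

def encode(chunk_tokens, emb):
    """Bag-of-embeddings stand-in for a recurrent encoder."""
    return np.mean([emb[t] for t in chunk_tokens], axis=0)

def communicate(states):
    """Each agent mixes in the mean of the other agents' states."""
    states = np.stack(states)
    total = states.sum(axis=0)
    return [0.5 * s + 0.5 * (total - s) / (len(states) - 1) for s in states]

rng = np.random.default_rng(2)
vocab = {w: rng.normal(size=8) for w in "the cat sat on a mat and then slept".split()}
chunks = [["the", "cat", "sat"], ["on", "a", "mat"], ["and", "then", "slept"]]
agent_states = [encode(c, vocab) for c in chunks]
agent_states = communicate(agent_states)         # one round of message passing
decoder_context = np.concatenate(agent_states)   # would feed a single decoder
print(decoder_context.shape)
```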
Approximate Inference via Weighted Rademacher Complexity
Title | Approximate Inference via Weighted Rademacher Complexity |
Authors | Jonathan Kuck, Ashish Sabharwal, Stefano Ermon |
Abstract | Rademacher complexity is often used to characterize the learnability of a hypothesis class and is known to be related to the class size. We leverage this observation and introduce a new technique for estimating the size of an arbitrary weighted set, defined as the sum of weights of all elements in the set. Our technique provides upper and lower bounds on a novel generalization of Rademacher complexity to the weighted setting in terms of the weighted set size. This generalizes Massart’s Lemma, a known upper bound on the Rademacher complexity in terms of the unweighted set size. We show that the weighted Rademacher complexity can be estimated by solving a randomly perturbed optimization problem, allowing us to derive high-probability bounds on the size of any weighted set. We apply our method to the problems of calculating the partition function of an Ising model and computing propositional model counts (#SAT). Our experiments demonstrate that we can produce tighter bounds than competing methods in both the weighted and unweighted settings. |
Tasks | |
Published | 2018-01-27 |
URL | http://arxiv.org/abs/1801.09028v1 |
http://arxiv.org/pdf/1801.09028v1.pdf | |
PWC | https://paperswithcode.com/paper/approximate-inference-via-weighted-rademacher |
Repo | |
Framework | |
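For reference, the unweighted quantities the abstract generalizes are standard: the empirical Rademacher complexity of a finite set and Massart's lemma bounding it in terms of the (unweighted) set size. The weighted generalization itself is the paper's contribution and is not reproduced here.

```latex
% Empirical Rademacher complexity of a finite set A \subset \mathbb{R}^m,
% and Massart's finite-class lemma.
\[
  \hat{\mathcal{R}}(A)
    \;=\; \mathbb{E}_{\sigma}\!\left[\,\sup_{a \in A}\frac{1}{m}\sum_{i=1}^{m}\sigma_i a_i\right],
  \qquad \sigma_i \ \text{i.i.d.\ uniform on } \{-1,+1\},
\]
\[
  \hat{\mathcal{R}}(A) \;\le\; \max_{a \in A}\lVert a\rVert_2 \,\frac{\sqrt{2\ln\lvert A\rvert}}{m}
  \qquad \text{(Massart's lemma).}
\]
```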
Cross-modality image synthesis from unpaired data using CycleGAN: Effects of gradient consistency loss and training data size
Title | Cross-modality image synthesis from unpaired data using CycleGAN: Effects of gradient consistency loss and training data size |
Authors | Yuta Hiasa, Yoshito Otake, Masaki Takao, Takumi Matsuoka, Kazuma Takashima, Jerry L. Prince, Nobuhiko Sugano, Yoshinobu Sato |
Abstract | CT is commonly used in orthopedic procedures. MRI is used along with CT to identify muscle structures and diagnose osteonecrosis due to its superior soft tissue contrast. However, MRI has poor contrast for bone structures. Clearly, it would be helpful if a corresponding CT were available, as bone boundaries are more clearly seen and CT has standardized (i.e., Hounsfield) units. Therefore, we aim at MR-to-CT synthesis. Although the CycleGAN was successfully applied to unpaired CT and MR images of the head, those images do not have as much variation of intensity pairs as do images in the pelvic region, due to the presence of joints and muscles. In this paper, we extended the CycleGAN approach by adding the gradient consistency loss to improve the accuracy at the boundaries. We conducted two experiments. To evaluate image synthesis, we investigated the dependency of image synthesis accuracy on 1) the amount of training data and 2) the gradient consistency loss. To demonstrate the applicability of our method, we also investigated segmentation accuracy on synthesized images. |
Tasks | Image Generation |
Published | 2018-03-18 |
URL | http://arxiv.org/abs/1803.06629v3 |
http://arxiv.org/pdf/1803.06629v3.pdf | |
PWC | https://paperswithcode.com/paper/cross-modality-image-synthesis-from-unpaired |
Repo | |
Framework | |
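One plausible form of a gradient consistency term, sketched below, is the normalized cross-correlation between spatial gradients of the real and synthesized images, so that sharp, aligned boundaries are rewarded. The Sobel filters and the exact loss form are assumptions, not the authors' definition.

```python
# A hedged sketch of a gradient-consistency term: normalized cross-correlation
# (gradient correlation) between spatial gradients of two images.
import numpy as np
from scipy.ndimage import sobel

def gradient_correlation(a, b, eps=1e-8):
    """Mean normalized cross-correlation of x/y Sobel gradients of two images."""
    def ncc(x, y):
        x = x - x.mean()
        y = y - y.mean()
        return (x * y).sum() / (np.sqrt((x ** 2).sum() * (y ** 2).sum()) + eps)
    return 0.5 * (ncc(sobel(a, axis=0), sobel(b, axis=0)) +
                  ncc(sobel(a, axis=1), sobel(b, axis=1)))

def gc_loss(real, synthesized):
    """Loss to minimize: high gradient correlation -> low loss."""
    return 1.0 - gradient_correlation(real, synthesized)

img = np.random.default_rng(3).normal(size=(64, 64))
print(gc_loss(img, img))          # ~0 for identical images
print(gc_loss(img, img[::-1]))    # larger when gradients disagree
```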
Home Activity Monitoring using Low Resolution Infrared Sensor
Title | Home Activity Monitoring using Low Resolution Infrared Sensor |
Authors | Lili Tao, Timothy Volonakis, Bo Tan, Yanguo Jing, Kevin Chetty, Melvyn Smith |
Abstract | Action monitoring in a home environment provides important information for health monitoring and may serve as input into a smart home environment. Visual analysis using cameras can recognise actions in a complex scene, such as someone's living room. However, despite the huge potential benefits and importance, specifically for health, cameras are not widely accepted because of privacy concerns. This paper recognises human activities using a sensor that retains privacy. The sensor is not only different by being thermal, but it is also of low resolution: 8x8 pixels. The combination of the thermal imaging and the low spatial resolution ensures the privacy of individuals. We present an approach to recognise daily activities using this sensor based on a discrete cosine transform. We evaluate the proposed method on a state-of-the-art dataset and experimentally confirm that our approach outperforms the baseline method. We also introduce a new dataset and evaluate the method on it. Here we show that the sensor is well suited to detecting the occurrence of falls and Activities of Daily Living. Our method achieves an overall accuracy of 87.50% across 7 activities, with a fall detection sensitivity of 100% and specificity of 99.21%. |
Tasks | Home Activity Monitoring |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05416v1 |
http://arxiv.org/pdf/1811.05416v1.pdf | |
PWC | https://paperswithcode.com/paper/home-activity-monitoring-using-low-resolution |
Repo | |
Framework | |
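The feature step lends itself to a short sketch: a 2-D discrete cosine transform of an 8x8 thermal frame, keeping only low-frequency coefficients as a compact, privacy-preserving descriptor. The truncation size is an illustrative choice, not taken from the paper.

```python
# A minimal sketch of a DCT-based descriptor for an 8x8 thermal frame.
import numpy as np
from scipy.fftpack import dct

def dct2(frame):
    """Orthonormal 2-D DCT of a small thermal frame."""
    return dct(dct(frame, axis=0, norm="ortho"), axis=1, norm="ortho")

def low_freq_features(frame, k=4):
    """Keep the top-left k x k block of DCT coefficients (low frequencies)."""
    return dct2(frame)[:k, :k].ravel()

frame = np.random.default_rng(4).normal(size=(8, 8))  # stand-in 8x8 sensor reading
print(low_freq_features(frame).shape)                 # (16,) feature vector
```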
Learning Embeddings of Directed Networks with Text-Associated Nodes—with Applications in Software Package Dependency Networks
Title | Learning Embeddings of Directed Networks with Text-Associated Nodes—with Applications in Software Package Dependency Networks |
Authors | Shudan Zhong, Kexuan Sun, Hong Xu |
Abstract | A network embedding consists of a vector representation for each node in the network. Its usefulness has been shown in many real-world application domains, such as social networks and web networks. Directed networks with text associated with each node, such as software package dependency networks, are commonplace. However, to the best of our knowledge, their embeddings have hitherto not been specifically studied. In this paper, we propose PCTADW-1 and PCTADW-2, two algorithms based on neural networks that learn embeddings of directed networks with text associated with each node. We create two new node-labeled networks of this kind: the package dependency networks of two popular GNU/Linux distributions, Debian and Fedora. We experimentally demonstrate that the embeddings produced by our algorithms result in node classification of better quality than those of various baselines on these two networks. We observe a systematic presence of analogies (similar to those in word embeddings) in the network embeddings of software package dependency networks. To the best of our knowledge, this is the first time that such a systematic presence of analogies has been observed in network and document embeddings. This may potentially open up a new instrument for better understanding networks and documents algorithmically using their embeddings, as well as for better human understanding of network and document embeddings. |
Tasks | Network Embedding, Node Classification, Word Embeddings |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02270v4 |
http://arxiv.org/pdf/1809.02270v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-embeddings-of-directed-networks-with |
Repo | |
Framework | |
Semi-Supervised Event Extraction with Paraphrase Clusters
Title | Semi-Supervised Event Extraction with Paraphrase Clusters |
Authors | James Ferguson, Colin Lockard, Daniel S. Weld, Hannaneh Hajishirzi |
Abstract | Supervised event extraction systems are limited in their accuracy due to the lack of available training data. We present a method for self-training event extraction systems by bootstrapping additional training data. This is done by taking advantage of the occurrence of multiple mentions of the same event instances across newswire articles from multiple sources. If our system can make a high-confidence extraction of some mentions in such a cluster, it can then acquire diverse training examples by adding the other mentions as well. Our experiments show significant performance improvements on multiple event extractors over the ACE 2005 and TAC-KBP 2015 datasets. |
Tasks | |
Published | 2018-08-26 |
URL | http://arxiv.org/abs/1808.08622v1 |
http://arxiv.org/pdf/1808.08622v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-event-extraction-with |
Repo | |
Framework | |
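The bootstrapping loop described above can be sketched schematically: whenever any mention in a cluster of articles reporting the same event is extracted with high confidence, the remaining mentions inherit that label as new training examples. The extractor, threshold, and labels below are placeholders, not the authors' system.

```python
# A schematic sketch of the self-training/bootstrapping idea with a toy
# keyword-based stand-in extractor.
def bootstrap_training_data(clusters, extractor, threshold=0.9):
    """clusters: list of lists of sentences that report the same event."""
    new_examples = []
    for cluster in clusters:
        predictions = [(s, *extractor(s)) for s in cluster]      # (sentence, label, confidence)
        confident = [(s, y) for s, y, p in predictions if p >= threshold]
        if confident:
            _, label = confident[0]
            # Add the remaining, less confident mentions with the confident label.
            new_examples += [(s, label) for s, _, p in predictions if p < threshold]
    return new_examples

toy_extractor = lambda s: ("Attack", 0.95) if "bombing" in s else ("Attack", 0.4)
clusters = [["a bombing struck the market", "the market blast killed three"]]
print(bootstrap_training_data(clusters, toy_extractor))
```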