October 20, 2019

3155 words 15 mins read

Paper Group AWR 354

Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation. An Empirical Model of Large-Batch Training. Marian: Fast Neural Machine Translation in C++. YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers. Session-based Recommendation with Graph Neural Networks. Pitfalls of Graph Neural Network Eval …

Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation

Title Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation
Authors Qingyao Ai, Vahid Azizi, Xu Chen, Yongfeng Zhang
Abstract Providing model-generated explanations in recommender systems is important to user experience. State-of-the-art recommendation algorithms - especially collaborative filtering (CF)-based approaches with shallow or deep models - usually work with various unstructured information sources for recommendation, such as textual reviews, visual images, and various implicit or explicit feedback. Though structured knowledge bases were considered in content-based approaches, they have been largely ignored recently due to the research focus on CF approaches. However, structured knowledge exhibits unique advantages in personalized recommendation systems. When explicit knowledge about users and items is considered for recommendation, the system can provide highly customized recommendations based on users' historical behaviors, and the knowledge is helpful for providing informed explanations regarding the recommended items. A great challenge in using knowledge bases for recommendation is how to integrate large-scale structured data while taking advantage of collaborative filtering for highly accurate performance. Recent achievements in knowledge-base embedding (KBE) shed light on this problem, making it possible to learn user and item representations while preserving the structure of their relationships with external knowledge for explanation. In this work, we propose to leverage knowledge-base embeddings for explainable recommendation. Specifically, we propose a knowledge-base representation learning framework to embed heterogeneous entities for recommendation and, based on the embedded knowledge base, a soft matching algorithm to generate personalized explanations for the recommended items. Experimental results on real-world e-commerce datasets verify the superior recommendation performance and explainability of our approach compared with state-of-the-art baselines.
Tasks Recommendation Systems, Representation Learning
Published 2018-05-09
URL http://arxiv.org/abs/1805.03352v2
PDF http://arxiv.org/pdf/1805.03352v2.pdf
PWC https://paperswithcode.com/paper/180503352
Repo https://github.com/LunaBlack/KGAT-pytorch
Framework pytorch
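
As a rough illustration of the translation-style scoring such KBE methods build on, here is a minimal NumPy sketch: a user embedding translated by a relation vector is matched against item embeddings, and the same distance applied through an auxiliary relation yields the "soft matching" used for explanations. The entity names, dimensions, and relation set below are illustrative assumptions, not the authors' implementation.

```python
# TransE-style scoring sketch; all embeddings are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
dim = 64
user_emb = rng.normal(size=dim)            # embedding of one user
rel_purchase = rng.normal(size=dim)        # "purchase" relation vector
item_embs = rng.normal(size=(100, dim))    # embeddings of 100 candidate items

# Translation assumption: user + relation ~ item for observed triples.
# Rank items by (negative) distance to the translated user vector.
query = user_emb + rel_purchase
scores = -np.linalg.norm(item_embs - query, axis=1)
top_items = np.argsort(scores)[::-1][:5]
print("recommended item ids:", top_items)

# Soft matching for explanation works the same way: the distance, applied
# through an auxiliary relation (e.g. "mentions_word"), finds the
# knowledge-base entity that best links the user to a recommended item.
```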

An Empirical Model of Large-Batch Training

Title An Empirical Model of Large-Batch Training
Authors Sam McCandlish, Jared Kaplan, Dario Amodei, OpenAI Dota Team
Abstract In an increasing number of domains it has been demonstrated that deep learning models can be trained using relatively large batch sizes without sacrificing data efficiency. However, the limits of this massive data parallelism seem to differ from domain to domain, ranging from batches of tens of thousands in ImageNet to batches of millions in RL agents that play the game Dota 2. To our knowledge there is limited conceptual understanding of why these limits to batch size differ or how we might choose the correct batch size in a new domain. In this paper, we demonstrate that a simple and easy-to-measure statistic called the gradient noise scale predicts the largest useful batch size across many domains and applications, including a number of supervised learning datasets (MNIST, SVHN, CIFAR-10, ImageNet, Billion Word), reinforcement learning domains (Atari and Dota), and even generative model training (autoencoders on SVHN). We find that the noise scale increases as the loss decreases over a training run and depends on the model size primarily through improved model performance. Our empirically-motivated theory also describes the tradeoff between compute-efficiency and time-efficiency, and provides a rough model of the benefits of adaptive batch-size training.
Tasks Dota 2
Published 2018-12-14
URL http://arxiv.org/abs/1812.06162v1
PDF http://arxiv.org/pdf/1812.06162v1.pdf
PWC https://paperswithcode.com/paper/an-empirical-model-of-large-batch-training
Repo https://github.com/sarahisyoung/rlpyt
Framework pytorch
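
The paper's simple noise scale is B_simple = tr(Σ)/|G|², where Σ is the per-example gradient covariance and G the true gradient; it can be estimated from gradient norms measured at two batch sizes. A minimal sketch of that two-batch estimator follows; the synthetic gradients are placeholders, and in practice these estimates are noisy and should be averaged over many measurements.

```python
# Two-batch estimator for the gradient noise scale, following the paper's
# unbiased estimates of |G|^2 and tr(Sigma).
import numpy as np

def noise_scale(grad_small, grad_big, b_small, b_big):
    g2_small = np.sum(grad_small ** 2)
    g2_big = np.sum(grad_big ** 2)
    # Unbiased estimate of the true gradient norm^2 ...
    g2 = (b_big * g2_big - b_small * g2_small) / (b_big - b_small)
    # ... and of the trace of the per-example gradient covariance.
    s = (g2_small - g2_big) / (1.0 / b_small - 1.0 / b_big)
    return s / g2  # predicts the largest useful batch size

# Synthetic check: noise shrinks as 1/sqrt(batch size).
rng = np.random.default_rng(0)
true_grad = rng.normal(size=1000)
noisy = lambda b: true_grad + rng.normal(size=1000) / np.sqrt(b)
print(noise_scale(noisy(32), noisy(1024), 32, 1024))
```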

Marian: Fast Neural Machine Translation in C++

Title Marian: Fast Neural Machine Translation in C++
Authors Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch
Abstract We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs. Marian is written entirely in C++. We describe the design of the encoder-decoder framework and demonstrate that a research-friendly toolkit can achieve high training and translation speed.
Tasks Machine Translation
Published 2018-04-01
URL http://arxiv.org/abs/1804.00344v3
PDF http://arxiv.org/pdf/1804.00344v3.pdf
PWC https://paperswithcode.com/paper/marian-fast-neural-machine-translation-in-c
Repo https://github.com/marian-nmt/marian
Framework none
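
Marian itself is C++ and its engine is far more sophisticated, but the core idea named in the abstract, reverse-mode automatic differentiation over a dynamically built computation graph, can be shown in a toy Python sketch. This bears no relation to Marian's actual API; it only illustrates the mechanism.

```python
# Toy reverse-mode autodiff: the graph is recorded while ordinary code runs.
class Var:
    def __init__(self, value):
        self.value, self.parents, self.grad = value, (), 0.0

    def __mul__(self, other):
        out = Var(self.value * other.value)
        # Record local derivatives as the graph is built (dynamically).
        out.parents = ((self, other.value), (other, self.value))
        return out

    def __add__(self, other):
        out = Var(self.value + other.value)
        out.parents = ((self, 1.0), (other, 1.0))
        return out

    def backward(self, seed=1.0):
        # Naive recursive backprop; fine for this small example.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(2.0), Var(3.0)
z = x * y + x          # graph is created by executing ordinary code
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```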

YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers

Title YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers
Authors Jonathan Pedoeem, Rachel Huang
Abstract This paper focuses on YOLO-LITE, a real-time object detection model developed to run on portable devices, such as a laptop or cellphone, lacking a Graphics Processing Unit (GPU). The model was trained first on the PASCAL VOC dataset and then on the COCO dataset, achieving mAPs of 33.81% and 12.26%, respectively. YOLO-LITE runs at about 21 FPS on a non-GPU computer and 10 FPS after being implemented on a website, with only 7 layers and 482 million FLOPS. This speed is 3.8x faster than the fastest state-of-the-art model, SSD MobileNetV1. Based on the original object detection algorithm YOLOv2, YOLO-LITE was designed to be a smaller, faster, and more efficient model, increasing the accessibility of real-time object detection to a variety of devices.
Tasks Object Detection, Real-Time Object Detection
Published 2018-11-14
URL http://arxiv.org/abs/1811.05588v1
PDF http://arxiv.org/pdf/1811.05588v1.pdf
PWC https://paperswithcode.com/paper/yolo-lite-a-real-time-object-detection
Repo https://github.com/StevenBanama/Yolo-lite-Gesture
Framework tf
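
To make the "7 layers" scale concrete, here is a rough PyTorch sketch of a YOLO-style tiny backbone ending in a grid of box/class predictions. The channel widths and the 5-boxes-per-cell head are illustrative assumptions, not the paper's published configuration.

```python
# Tiny YOLO-style detector sketch: a few conv/pool blocks and a 1x1 head.
import torch
import torch.nn as nn

def conv_pool(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1),
        nn.LeakyReLU(0.1),
        nn.MaxPool2d(2),
    )

class TinyYOLO(nn.Module):
    def __init__(self, num_classes=20, boxes=5):
        super().__init__()
        self.features = nn.Sequential(
            conv_pool(3, 16), conv_pool(16, 32), conv_pool(32, 64),
            conv_pool(64, 128), conv_pool(128, 128),
        )
        # One prediction per grid cell: boxes * (x, y, w, h, conf) + classes.
        self.head = nn.Conv2d(128, boxes * 5 + num_classes, 1)

    def forward(self, x):
        return self.head(self.features(x))

out = TinyYOLO()(torch.randn(1, 3, 224, 224))
print(out.shape)  # (1, 45, 7, 7): a 7x7 grid of predictions
```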

Session-based Recommendation with Graph Neural Networks

Title Session-based Recommendation with Graph Neural Networks
Authors Shu Wu, Yuyuan Tang, Yanqiao Zhu, Liang Wang, Xing Xie, Tieniu Tan
Abstract The problem of session-based recommendation aims to predict user actions based on anonymous sessions. Previous methods model a session as a sequence and estimate user representations, besides item representations, to make recommendations. Though they achieve promising results, they are insufficient to obtain accurate user vectors in sessions and neglect complex transitions between items. To obtain accurate item embeddings and take complex item transitions into account, we propose a novel method, Session-based Recommendation with Graph Neural Networks (SR-GNN for brevity). In the proposed method, session sequences are modeled as graph-structured data. Based on the session graph, a GNN can capture complex transitions between items, which are difficult to reveal with conventional sequential methods. Each session is then represented as the composition of the global preference and the current interest of that session using an attention network. Extensive experiments conducted on two real datasets show that SR-GNN consistently outperforms state-of-the-art session-based recommendation methods.
Tasks Session-Based Recommendations
Published 2018-11-01
URL http://arxiv.org/abs/1811.00855v4
PDF http://arxiv.org/pdf/1811.00855v4.pdf
PWC https://paperswithcode.com/paper/session-based-recommendation-with-graph
Repo https://github.com/CRIPAC-DIG/SR-GNN
Framework tf
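
The first step of the method, turning a click session into a directed item graph, is easy to sketch. Below, nodes are the unique items, edges follow consecutive clicks, and adjacency rows are normalized by out-degree; these normalized matrices are what feed the gated GNN update. Variable names are illustrative, not taken from the authors' code, and the in/out normalization is a simplification.

```python
# Building a session graph from one click sequence.
import numpy as np

session = [3, 7, 3, 5, 7]            # a sequence of clicked item ids
items = sorted(set(session))         # unique nodes: [3, 5, 7]
index = {item: i for i, item in enumerate(items)}

n = len(items)
a_out = np.zeros((n, n))
for src, dst in zip(session, session[1:]):
    a_out[index[src], index[dst]] += 1  # edge per consecutive click

# Row-normalize outgoing edges; the transpose gives the incoming view.
deg = a_out.sum(axis=1, keepdims=True)
a_out = np.divide(a_out, deg, out=np.zeros_like(a_out), where=deg > 0)
a_in = a_out.T

print(a_out)  # these matrices drive the per-step gated GNN message passing
```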

Pitfalls of Graph Neural Network Evaluation

Title Pitfalls of Graph Neural Network Evaluation
Authors Oleksandr Shchur, Maximilian Mumme, Aleksandar Bojchevski, Stephan Günnemann
Abstract Semi-supervised node classification in graphs is a fundamental problem in graph mining, and the recently proposed graph neural networks (GNNs) have achieved unparalleled results on this task. Due to their massive success, GNNs have attracted a lot of attention, and many novel architectures have been put forward. In this paper we show that existing evaluation strategies for GNN models have serious shortcomings. We show that using the same train/validation/test splits of the same datasets, as well as making significant changes to the training procedure (e.g. early stopping criteria) precludes a fair comparison of different architectures. We perform a thorough empirical evaluation of four prominent GNN models and show that considering different splits of the data leads to dramatically different rankings of models. Even more importantly, our findings suggest that simpler GNN architectures are able to outperform the more sophisticated ones if the hyperparameters and the training procedure are tuned fairly for all models.
Tasks Node Classification
Published 2018-11-14
URL https://arxiv.org/abs/1811.05868v2
PDF https://arxiv.org/pdf/1811.05868v2.pdf
PWC https://paperswithcode.com/paper/pitfalls-of-graph-neural-network-evaluation
Repo https://github.com/shchur/gnn-benchmark
Framework tf
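
The paper's remedy is procedural rather than architectural: evaluate every model over many random train/validation/test splits and report the spread, instead of a single fixed split. A minimal sketch of that protocol follows; the sklearn classifier is a stand-in for a GNN, used only to show the evaluation loop.

```python
# Multi-split evaluation: report mean and spread across random splits.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
scores = []
for seed in range(10):                       # 10 different random splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    scores.append(model.score(X_te, y_te))

# A single split would hide this variance entirely.
print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```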

Hybrid Loss for Learning Single-Image-based HDR Reconstruction

Title Hybrid Loss for Learning Single-Image-based HDR Reconstruction
Authors Kenta Moriwaki, Ryota Yoshihashi, Rei Kawakami, Shaodi You, Takeshi Naemura
Abstract This paper tackles high-dynamic-range (HDR) image reconstruction given only a single low-dynamic-range (LDR) image as input. While existing methods focus on minimizing the mean-squared-error (MSE) between the target and reconstructed images, we minimize a hybrid loss that consists of perceptual and adversarial losses in addition to an HDR-reconstruction loss. The reconstruction loss, unlike MSE, is more suitable for HDR since it puts more weight on both over- and under-exposed areas, making the reconstruction faithful to the input. The perceptual loss enables the networks to utilize knowledge about objects and image structure for recovering the intensity gradients of saturated and grossly quantized areas. The adversarial loss helps to select the most plausible appearance from multiple solutions. The hybrid loss combining all three losses is calculated in the logarithmic space of image intensity so that the outputs retain a large dynamic range while the learning remains tractable. Comparative experiments with other state-of-the-art methods demonstrate that our method produces a leap in image quality.
Tasks Image Reconstruction, Single-Image-Based Hdr Reconstruction
Published 2018-12-18
URL http://arxiv.org/abs/1812.07134v1
PDF http://arxiv.org/pdf/1812.07134v1.pdf
PWC https://paperswithcode.com/paper/hybrid-loss-for-learning-single-image-based
Repo https://github.com/vinthony/awesome-deep-hdr
Framework none
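
A hedged PyTorch sketch of the loss structure the abstract describes follows: reconstruction, perceptual, and adversarial terms combined in log space. The weights, the L1 distances, and the stand-in feature extractor and discriminator are placeholders, not the paper's tuned components.

```python
# Hybrid HDR loss sketch: all terms computed in log intensity space.
import torch
import torch.nn.functional as F

def hybrid_loss(pred_hdr, target_hdr, feat_net, disc, w_p=0.1, w_a=0.01):
    # log1p keeps large dynamic ranges numerically tame.
    log_pred = torch.log1p(pred_hdr)
    log_tgt = torch.log1p(target_hdr)
    l_rec = F.l1_loss(log_pred, log_tgt)                        # reconstruction
    l_perc = F.l1_loss(feat_net(log_pred), feat_net(log_tgt))   # perceptual
    l_adv = -torch.log(disc(log_pred) + 1e-8).mean()            # fool the critic
    return l_rec + w_p * l_perc + w_a * l_adv

# Dummy stand-ins so the sketch runs: a fixed conv as "perceptual features"
# and a tiny discriminator outputting probabilities.
feat = torch.nn.Conv2d(3, 8, 3, padding=1)
disc = torch.nn.Sequential(torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
                           torch.nn.Linear(3, 1), torch.nn.Sigmoid())
x, y = torch.rand(2, 3, 64, 64) * 10, torch.rand(2, 3, 64, 64) * 10
print(hybrid_loss(x, y, feat, disc).item())
```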

Deep Affinity Network for Multiple Object Tracking

Title Deep Affinity Network for Multiple Object Tracking
Authors ShiJie Sun, Naveed Akhtar, HuanSheng Song, Ajmal Mian, Mubarak Shah
Abstract Multiple Object Tracking (MOT) plays an important role in solving many fundamental problems in video analysis and computer vision. Most MOT methods employ two steps: Object Detection and Data Association. The first step detects objects of interest in every frame of a video, and the second establishes correspondence between the detected objects in different frames to obtain their tracks. Object detection has made tremendous progress in the last few years due to deep learning. However, data association for tracking still relies on hand-crafted constraints such as appearance, motion, spatial proximity, and grouping to compute affinities between the objects in different frames. In this paper, we harness the power of deep learning for data association in tracking by jointly modelling object appearances and their affinities between different frames in an end-to-end fashion. The proposed Deep Affinity Network (DAN) learns compact yet comprehensive features of pre-detected objects at several levels of abstraction, and performs exhaustive pairing permutations of those features in any two frames to infer object affinities. DAN also accounts for multiple objects appearing and disappearing between video frames. We exploit the resulting efficient affinity computations to associate objects in the current frame deep into the previous frames for reliable online tracking. Our technique is evaluated on the popular multiple object tracking challenges MOT15, MOT17 and UA-DETRAC. Comprehensive benchmarking under twelve evaluation metrics demonstrates that our approach is among the best performing techniques on the leaderboard for these challenges. The open source implementation of our work is available at https://github.com/shijieS/SST.git.
Tasks Multiple Object Tracking, Object Detection, Object Tracking
Published 2018-10-28
URL https://arxiv.org/abs/1810.11780v2
PDF https://arxiv.org/pdf/1810.11780v2.pdf
PWC https://paperswithcode.com/paper/deep-affinity-network-for-multiple-object
Repo https://github.com/shijieS/SST
Framework pytorch
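
A minimal sketch of the core association step follows: exhaustively pair object feature vectors from two frames into an affinity matrix, then solve the assignment. DAN learns the features and the affinity network end to end and handles appearing/disappearing objects with extra dummy rows and columns; the cosine similarity and Hungarian solver here are simplifying stand-ins.

```python
# Pairwise affinity + assignment between detections in two frames.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
feats_t = rng.normal(size=(4, 128))      # 4 detections in frame t
feats_t1 = rng.normal(size=(5, 128))     # 5 detections in frame t+1

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

affinity = normalize(feats_t) @ normalize(feats_t1).T   # (4, 5) pair scores
rows, cols = linear_sum_assignment(-affinity)           # maximize affinity
for r, c in zip(rows, cols):
    print(f"track {r} in frame t -> detection {c} in frame t+1")
```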

Fast and accurate object detection in high resolution 4K and 8K video using GPUs

Title Fast and accurate object detection in high resolution 4K and 8K video using GPUs
Authors Vít Růžička, Franz Franchetti
Abstract Machine learning has celebrated many achievements on computer vision tasks such as object detection, but the traditionally used models work with relatively low-resolution images. The resolution of recording devices is gradually increasing, and there is a rising need for new methods of processing high-resolution data. We propose an attention pipeline method that uses a two-stage evaluation of each image or video frame, at rough and refined resolutions, to limit the total number of necessary evaluations. For both stages, we make use of the fast object detection model YOLO v2. Our implementation distributes the work across GPUs. We maintain high accuracy while reaching an average performance of 3-6 fps on 4K video and 2 fps on 8K video.
Tasks Object Detection, Object Detection in High Resolution, Real-Time Object Detection
Published 2018-10-24
URL http://arxiv.org/abs/1810.10551v1
PDF http://arxiv.org/pdf/1810.10551v1.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-object-detection-in-high
Repo https://github.com/previtus/AttentionPipeline
Framework tf
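
The two-stage idea is simple to sketch: run the detector on a downscaled frame to find active regions, then re-run it only on full-resolution crops of those regions. In the sketch below, `detect` is a placeholder for the YOLO v2 model, and the crop margin and scale factor are illustrative assumptions.

```python
# Coarse-to-fine attention pipeline sketch for high-resolution frames.
import numpy as np

def detect(image):
    """Placeholder for YOLO v2: returns (x, y, w, h) boxes in pixels."""
    h, w = image.shape[:2]
    return [(w // 4, h // 4, w // 8, h // 8)]  # one dummy box

def attention_pipeline(frame, scale=8):
    coarse = frame[::scale, ::scale]           # cheap rough-resolution pass
    final = []
    for x, y, w, h in detect(coarse):
        # Map the rough box back to full resolution and crop with a margin.
        x, y, w, h = (v * scale for v in (x, y, w, h))
        x0, y0 = max(x - w, 0), max(y - h, 0)
        crop = frame[y0:y + 2 * h, x0:x + 2 * w]
        for cx, cy, cw, ch in detect(crop):    # refined pass on the crop
            final.append((x0 + cx, y0 + cy, cw, ch))
    return final

frame_4k = np.zeros((2160, 3840, 3), dtype=np.uint8)
print(attention_pipeline(frame_4k))
```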

False Positive Reduction in Lung Computed Tomography Images using Convolutional Neural Networks

Title False Positive Reduction in Lung Computed Tomography Images using Convolutional Neural Networks
Authors Gorkem Polat, Ugur Halici, Yesim Serinagaoglu Dogrusoz
Abstract Recent studies have shown that lung cancer screening using annual low-dose computed tomography (CT) reduces lung cancer mortality by 20% compared to traditional chest radiography. Therefore, CT lung screening has started to be used widely across the world. However, analyzing these images is a serious burden for radiologists. In this study, we propose a novel and simple framework that analyzes CT lung screenings using convolutional neural networks (CNNs) to reduce false positives. Our framework shows that even non-complex architectures are very powerful for classifying 3D nodule data when compared to traditional methods. We also use different fusions in order to show their power and effect on the overall score. 3D CNNs are preferred over 2D CNNs because the data are 3D, and 2D convolution operations may result in information loss. Mini-batches are used to overcome class imbalance. The proposed framework has been validated on the LUNA16 challenge evaluation and achieved a score of 0.786, the average of the sensitivity values at seven predefined false-positive (FP) rates.
Tasks Computed Tomography (CT)
Published 2018-11-04
URL http://arxiv.org/abs/1811.01424v1
PDF http://arxiv.org/pdf/1811.01424v1.pdf
PWC https://paperswithcode.com/paper/false-positive-reduction-in-lung-computed
Repo https://github.com/GorkemP/LUNA16_Challange
Framework tf
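
To show the kind of "non-complex" 3D architecture the paper argues is already effective, here is a hedged PyTorch sketch: a few 3D conv blocks over a CT candidate patch, ending in a two-class (nodule vs. false positive) output. The patch size and channel widths are assumptions, not the paper's exact configuration.

```python
# Small 3D CNN for nodule candidate classification.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
    nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
    nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(),
    nn.Linear(64, 2),                   # nodule vs. false positive
)

patch = torch.randn(8, 1, 32, 32, 32)   # batch of 32^3 CT candidate patches
logits = model(patch)
print(logits.shape)                     # (8, 2)
```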

Diagnostic Classification Of Lung Nodules Using 3D Neural Networks

Title Diagnostic Classification Of Lung Nodules Using 3D Neural Networks
Authors Raunak Dey, Zhongjie Lu, Yi Hong
Abstract Lung cancer is the leading cause of cancer-related death worldwide. Early diagnosis of pulmonary nodules in Computed Tomography (CT) chest scans provides an opportunity for designing effective treatment and making financial and care plans. In this paper, we consider the problem of diagnostic classification between benign and malignant lung nodules in CT images, which aims to learn a direct mapping from 3D images to class labels. To achieve this goal, four two-pathway Convolutional Neural Networks (CNN) are proposed, including a basic 3D CNN, a novel multi-output network, a 3D DenseNet, and an augmented 3D DenseNet with multi-outputs. These four networks are evaluated on the public LIDC-IDRI dataset and outperform most existing methods. In particular, the 3D multi-output DenseNet (MoDenseNet) achieves the state-of-the-art classification accuracy on the task of end-to-end lung nodule diagnosis. In addition, the networks pretrained on the LIDC-IDRI dataset can be further extended to handle smaller datasets using transfer learning. This is demonstrated on our dataset with encouraging prediction accuracy in lung nodule classification.
Tasks Computed Tomography (CT), Lung Nodule Classification, Transfer Learning
Published 2018-03-19
URL http://arxiv.org/abs/1803.07192v1
PDF http://arxiv.org/pdf/1803.07192v1.pdf
PWC https://paperswithcode.com/paper/diagnostic-classification-of-lung-nodules
Repo https://github.com/raun1/Diagnostic-Classification-Of-Lung-Nodules-Using-3D-Neural-Networks
Framework none

Explicit State Tracking with Semi-Supervision for Neural Dialogue Generation

Title Explicit State Tracking with Semi-Supervision for Neural Dialogue Generation
Authors Xisen Jin, Wenqiang Lei, Zhaochun Ren, Hongshen Chen, Shangsong Liang, Yihong Zhao, Dawei Yin
Abstract The task of dialogue generation aims to automatically provide responses given previous utterances. Tracking dialogue states is an important ingredient in dialogue generation for estimating users' intentions. However, the expensive nature of state labeling and the weak interpretability make dialogue state tracking a challenging problem for both task-oriented and non-task-oriented dialogue generation: for generating responses in task-oriented dialogues, state tracking is usually learned from manually annotated corpora, where the human annotation is expensive for training; for generating responses in non-task-oriented dialogues, most existing work neglects explicit state tracking due to the unlimited number of dialogue states. In this paper, we propose the semi-supervised explicit dialogue state tracker (SEDST) for neural dialogue generation. Our approach has two core ingredients: CopyFlowNet and posterior regularization. Specifically, we propose an encoder-decoder architecture, named CopyFlowNet, to represent an explicit dialogue state with a probability distribution over the vocabulary space. To optimize the training procedure, we apply a posterior regularization strategy to integrate indirect supervision. Extensive experiments conducted on both task-oriented and non-task-oriented dialogue corpora demonstrate the effectiveness of our proposed model. Moreover, we find that our proposed semi-supervised dialogue state tracker achieves performance comparable to state-of-the-art supervised learning baselines in the state tracking procedure.
Tasks Dialogue Generation, Dialogue State Tracking
Published 2018-08-31
URL http://arxiv.org/abs/1808.10596v1
PDF http://arxiv.org/pdf/1808.10596v1.pdf
PWC https://paperswithcode.com/paper/explicit-state-tracking-with-semi-supervision
Repo https://github.com/shizhediao/SEDST3
Framework pytorch
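
The "distribution over the vocabulary space" idea behind CopyFlowNet is in the family of copy mechanisms, which can be sketched as mixing a generated softmax with probability mass copied from tokens in the dialogue context. The fixed mixture gate and tiny shapes below are illustrative assumptions, not the paper's architecture.

```python
# Copy-style vocabulary distribution: mix generation and copying.
import torch
import torch.nn.functional as F

vocab_size, ctx_len = 10, 4
gen_logits = torch.randn(vocab_size)        # decoder's generation scores
copy_scores = torch.randn(ctx_len)          # attention over context tokens
context_ids = torch.tensor([2, 5, 5, 9])    # vocabulary ids in the context

p_gen = F.softmax(gen_logits, dim=0)
copy_attn = F.softmax(copy_scores, dim=0)
# Scatter copy attention back onto vocabulary positions.
p_copy = torch.zeros(vocab_size).index_add(0, context_ids, copy_attn)

gate = 0.5                                  # learned in the real model
p_word = gate * p_gen + (1 - gate) * p_copy
print(p_word.sum())                         # a proper distribution (~1.0)
```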

Toward Scalable Neural Dialogue State Tracking Model

Title Toward Scalable Neural Dialogue State Tracking Model
Authors Elnaz Nouri, Ehsan Hosseini-Asl
Abstract The latency of current neural dialogue state tracking models prohibits them from being used efficiently in production systems, despite their highly accurate performance. This paper proposes a new scalable and accurate neural dialogue state tracking model, based on the recently proposed Global-Local Self-Attention encoder (GLAD) model by Zhong et al., which uses global modules to share parameters between estimators for different types (called slots) of dialogue states and local modules to learn slot-specific features. By using only one recurrent network with global conditioning, compared to the (1 + # slots) recurrent networks with global and local conditioning used in the GLAD model, our proposed model reduces the latency of training and inference by 35% on average, while preserving belief state tracking performance: 97.38% on turn request and 88.51% on joint goal accuracy. Evaluation on a multi-domain dataset (MultiWOZ) also demonstrates that our model outperforms GLAD on turn inform and joint goal accuracy.
Tasks Dialogue State Tracking
Published 2018-12-03
URL http://arxiv.org/abs/1812.00899v1
PDF http://arxiv.org/pdf/1812.00899v1.pdf
PWC https://paperswithcode.com/paper/toward-scalable-neural-dialogue-state
Repo https://github.com/elnaaz/GCE-Model
Framework none
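
The sharing idea can be sketched briefly: instead of one recurrent encoder per slot (GLAD's local modules), run a single shared GRU and condition it on a learned slot embedding. Conditioning by concatenation, as shown here, is one simple choice assumed for illustration; it is not necessarily the paper's mechanism.

```python
# One shared recurrent encoder, conditioned on the slot, serves all slots.
import torch
import torch.nn as nn

n_slots, emb_dim, hid = 5, 32, 64
slot_emb = nn.Embedding(n_slots, emb_dim)
shared_gru = nn.GRU(emb_dim + emb_dim, hid, batch_first=True)

def encode(utterance_emb, slot_id):
    # utterance_emb: (batch, seq_len, emb_dim)
    b, t, _ = utterance_emb.shape
    s = slot_emb(torch.tensor([slot_id])).expand(b, t, -1)
    out, _ = shared_gru(torch.cat([utterance_emb, s], dim=-1))
    return out  # one parameter set, reused for every slot

h = encode(torch.randn(2, 7, emb_dim), slot_id=3)
print(h.shape)  # (2, 7, 64)
```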

Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects

Title Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects
Authors Adam R. Kosiorek, Hyunjik Kim, Ingmar Posner, Yee Whye Teh
Abstract We present Sequential Attend, Infer, Repeat (SQAIR), an interpretable deep generative model for videos of moving objects. It can reliably discover and track objects throughout a sequence of frames, and can also generate future frames conditioned on the current frame, thereby simulating the expected motion of objects. This is achieved by explicitly encoding object presence, locations and appearances in the latent variables of the model. SQAIR retains all strengths of its predecessor, Attend, Infer, Repeat (AIR, Eslami et al., 2016), including learning in an unsupervised manner, and addresses its shortcomings. We use a moving multi-MNIST dataset to show the limitations of AIR in detecting overlapping or partially occluded objects, and show how SQAIR overcomes them by leveraging the temporal consistency of objects. Finally, we also apply SQAIR to real-world pedestrian CCTV data, where it learns to reliably detect, track and generate walking pedestrians with no supervision.
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01794v2
PDF http://arxiv.org/pdf/1806.01794v2.pdf
PWC https://paperswithcode.com/paper/sequential-attend-infer-repeat-generative
Repo https://github.com/akosiorek/sqair
Framework tf

Deep Anomaly Detection Using Geometric Transformations

Title Deep Anomaly Detection Using Geometric Transformations
Authors Izhak Golan, Ran El-Yaniv
Abstract We consider the problem of anomaly detection in images, and present a new detection technique. Given a sample of images, all known to belong to a "normal" class (e.g., dogs), we show how to train a deep neural model that can detect out-of-distribution images (i.e., non-dog objects). The main idea behind our scheme is to train a multi-class model to discriminate between dozens of geometric transformations applied to all the given images. The auxiliary expertise learned by the model generates feature detectors that effectively identify, at test time, anomalous images based on the softmax activation statistics of the model when applied to transformed images. We present extensive experiments using the proposed detector, which indicate that our algorithm improves on state-of-the-art methods by a wide margin.
Tasks Anomaly Detection
Published 2018-05-28
URL http://arxiv.org/abs/1805.10917v2
PDF http://arxiv.org/pdf/1805.10917v2.pdf
PWC https://paperswithcode.com/paper/deep-anomaly-detection-using-geometric
Repo https://github.com/izikgo/AnomalyDetectionTransformations
Framework tf
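
The scheme is compact enough to sketch end to end: build a self-labeled task by applying k geometric transformations to every normal image, train a classifier to predict which transformation was applied, and score a test image by the softmax probability assigned to the correct transformation. The tiny model, short training loop, and 4-rotation transformation set below are simplifying assumptions; the paper uses a set of 72 transformations and a much deeper network.

```python
# Self-supervised anomaly detection via transformation prediction.
import numpy as np
import torch
import torch.nn as nn

transforms = [lambda x, k=k: torch.rot90(x, k, dims=(-2, -1)) for k in range(4)]

model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

normal = torch.rand(64, 1, 32, 32)         # training images, all "normal"
for _ in range(20):                        # train the transformation classifier
    k = np.random.randint(4)
    loss = loss_fn(model(transforms[k](normal)),
                   torch.full((64,), k, dtype=torch.long))
    opt.zero_grad(); loss.backward(); opt.step()

def anomaly_score(x):
    # Low average probability of the *correct* transformation => anomalous.
    with torch.no_grad():
        probs = [torch.softmax(model(t(x)), dim=1)[:, k]
                 for k, t in enumerate(transforms)]
    return -torch.stack(probs).mean(dim=0)  # higher score = more anomalous

print(anomaly_score(torch.rand(2, 1, 32, 32)))
```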