January 29, 2020


Paper Group ANR 591

Jointly Aligning and Predicting Continuous Emotion Annotations. Spatio-Semantic ConvNet-Based Visual Place Recognition. Multi-Class Lane Semantic Segmentation using Efficient Convolutional Networks. Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform. Sentiment Dynamics in Social Media News C …

Jointly Aligning and Predicting Continuous Emotion Annotations


Title	Jointly Aligning and Predicting Continuous Emotion Annotations
Authors	Soheil Khorram, Melvin G McInnis, Emily Mower Provost
Abstract	Time-continuous dimensional descriptions of emotions (e.g., arousal, valence) allow researchers to characterize short-time changes and to capture long-term trends in emotion expression. However, continuous emotion labels are generally not synchronized with the input speech signal due to delays caused by reaction-time, which is inherent in human evaluations. To deal with this challenge, we introduce a new convolutional neural network (multi-delay sinc network) that is able to simultaneously align and predict labels in an end-to-end manner. The proposed network is a stack of convolutional layers followed by an aligner network that aligns the speech signal and emotion labels. This network is implemented using a new convolutional layer that we introduce, the delayed sinc layer. It is a time-shifted low-pass (sinc) filter that uses a gradient-based algorithm to learn a single delay. Multiple delayed sinc layers can be used to compensate for a non-stationary delay that is a function of the acoustic space. We test the efficacy of this system on two common emotion datasets, RECOLA and SEWA, and show that this approach obtains state-of-the-art speech-only results by learning time-varying delays while predicting dimensional descriptors of emotions.
Tasks
Published	2019-07-05
URL	https://arxiv.org/abs/1907.03050v2
PDF	https://arxiv.org/pdf/1907.03050v2.pdf
PWC	https://paperswithcode.com/paper/jointly-aligning-and-predicting-continuous
Repo
Framework
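
The delayed sinc layer described in the abstract above is a low-pass (sinc) filter shifted in time by a delay. The following is a minimal sketch of that idea only, not the authors' implementation: the delay is a plain number here rather than a parameter learned by gradient descent, and the names `delayed_sinc_kernel`, `apply_filter`, `cutoff` and `half_width` are assumptions made for illustration.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def delayed_sinc_kernel(delay, cutoff, half_width):
    """Taps h[n] = 2*cutoff*sinc(2*cutoff*(n - delay)) for n in [-half_width, half_width].

    delay is in samples (it may be fractional); cutoff is in cycles per sample.
    """
    return [
        2.0 * cutoff * sinc(2.0 * cutoff * (n - delay))
        for n in range(-half_width, half_width + 1)
    ]


def apply_filter(signal, kernel, half_width):
    """y[t] = sum_n h[n] * x[t - n]; samples outside the signal count as zero."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, h in enumerate(kernel):
            n = i - half_width
            if 0 <= t - n < len(signal):
                acc += h * signal[t - n]
        out.append(acc)
    return out


# An impulse at position 5, filtered with a 3-sample delay.
kernel = delayed_sinc_kernel(delay=3.0, cutoff=0.25, half_width=8)
impulse = [0.0] * 21
impulse[5] = 1.0
shifted = apply_filter(impulse, kernel, half_width=8)
peak_index = shifted.index(max(shifted))  # location of the delayed peak
```

In this sketch the peak of the filtered impulse moves from sample 5 to sample 8, which is the kind of shift an aligner could use to compensate for annotator reaction time.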

Spatio-Semantic ConvNet-Based Visual Place Recognition


Title	Spatio-Semantic ConvNet-Based Visual Place Recognition
Authors	Luis G. Camara, Libor Přeučil
Abstract	We present a Visual Place Recognition system that follows the two-stage format common to image retrieval pipelines. The system encodes images of places by employing the activations of different layers of a pre-trained, off-the-shelf, VGG16 Convolutional Neural Network (CNN) architecture. In the first stage of our method and given a query image of a place, a number of top candidate images is retrieved from a previously stored database of places. In the second stage, we propose an exhaustive comparison of the query image against these candidates by encoding semantic and spatial information in the form of CNN features. Results from our approach outperform by a large margin state-of-the-art visual place recognition methods on five of the most commonly used benchmark datasets. The performance gain is especially remarkable on the most challenging datasets, with more than a twofold recognition improvement with respect to the latest published work.
Tasks	Image Retrieval, Visual Place Recognition
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07671v1
PDF	https://arxiv.org/pdf/1909.07671v1.pdf
PWC	https://paperswithcode.com/paper/spatio-semantic-convnet-based-visual-place
Repo
Framework

Multi-Class Lane Semantic Segmentation using Efficient Convolutional Networks


Title	Multi-Class Lane Semantic Segmentation using Efficient Convolutional Networks
Authors	Shao-Yuan Lo, Hsueh-Ming Hang, Sheng-Wei Chan, Jing-Jhih Lin
Abstract	Lane detection plays an important role in a self-driving vehicle. Several studies leverage a semantic segmentation network to extract robust lane features, but few of them can distinguish different types of lanes. In this paper, we focus on the problem of multi-class lane semantic segmentation. Based on the observation that the lane is a small-size and narrow-width object in a road scene image, we propose two techniques, Feature Size Selection (FSS) and Degressive Dilation Block (DD Block). The FSS allows a network to extract thin lane features using appropriate feature sizes. To acquire fine-grained spatial information, the DD Block is made of a series of dilated convolutions with degressive dilation rates. Experimental results show that the proposed techniques provide a clear improvement in accuracy while achieving the same or faster inference speed compared to the baseline system, and can run in real time on high-resolution images.
Tasks	Lane Detection, Semantic Segmentation
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09438v1
PDF	https://arxiv.org/pdf/1907.09438v1.pdf
PWC	https://paperswithcode.com/paper/multi-class-lane-semantic-segmentation-using
Repo
Framework
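
The abstract above describes the DD Block as a series of dilated convolutions with degressive (decreasing) dilation rates. A rough 1-D sketch of that pattern follows. It is an illustration under assumptions, not the paper's block: the rates `(4, 2, 1)`, the shared 3-tap weights and the function names are hypothetical, and the real block operates on 2-D feature maps inside a network.

```python
def dilated_conv1d(x, weights, dilation):
    """'Same'-size 1-D dilated convolution (cross-correlation) with zero padding."""
    center = len(weights) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def degressive_dilation_stack(x, weights, rates=(4, 2, 1)):
    """Apply dilated convolutions one after another with decreasing rates."""
    for rate in rates:
        x = dilated_conv1d(x, weights, rate)
    return x


# Push an impulse through the stack to see which positions it reaches.
impulse = [0.0] * 31
impulse[15] = 1.0
response = degressive_dilation_stack(impulse, [1.0, 1.0, 1.0])
receptive_field = sum(1 for v in response if v != 0.0)
```

With 3-tap kernels and rates 4, 2, 1, the impulse spreads over 1 + 2 * (4 + 2 + 1) = 15 contiguous positions: the large rate first gathers wide context, and the final rate of 1 fills in neighbouring positions, which suits thin structures such as lane markings.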

Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform


Title	Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform
Authors	Zhenyu Zhao, Radhika Anand, Mallory Wang
Abstract	In machine learning applications for online product offerings and marketing strategies, there are often hundreds or thousands of features available to build such models. Feature selection is an essential method in such applications, serving multiple objectives: improving the prediction accuracy by eliminating irrelevant features, accelerating the model training and prediction speed, reducing the monitoring and maintenance workload for the feature data pipeline, and providing better model interpretation and diagnosis capability. However, selecting an optimal feature subset from a large feature space is considered an NP-complete problem. The mRMR (Minimum Redundancy and Maximum Relevance) feature selection framework addresses this problem by selecting the relevant features while controlling for the redundancy within the selected features. This paper describes the approach to extend, evaluate, and implement the mRMR feature selection methods for classification problems in a marketing machine learning platform at Uber that automates creation and deployment of targeting and personalization models at scale. This study first extends the existing mRMR methods by introducing a non-linear feature redundancy measure and a model-based feature relevance measure. Then an extensive empirical evaluation is performed for eight different feature selection methods, using one synthetic dataset and three real-world marketing datasets at Uber to cover different use cases. Based on the empirical results, the selected mRMR method is implemented in production for the marketing machine learning platform. A description of the production implementation is provided and an online experiment deployed through the platform is discussed.
Tasks	Feature Selection
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05376v1
PDF	https://arxiv.org/pdf/1908.05376v1.pdf
PWC	https://paperswithcode.com/paper/maximum-relevance-and-minimum-redundancy
Repo
Framework
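
The mRMR idea in the abstract above, picking relevant features while penalizing redundancy with features already chosen, can be sketched as a greedy loop. This is a simplified illustration and not the variants evaluated in the paper: it assumes absolute Pearson correlation for both relevance and redundancy and the difference form of the score, and the names `pearson` and `mrmr_select` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def mrmr_select(features, target, k):
    """Greedily pick k feature indices maximizing relevance minus mean redundancy."""
    relevance = [abs(pearson(f, target)) for f in features]
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(features[j], features[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Feature 1 is a noisy copy of feature 0; feature 2 carries independent signal.
features = [
    [2, -2, 2, -2, 2, -2, 2, -2],
    [2.5, -1.5, 2.5, -1.5, 1.5, -2.5, 1.5, -2.5],
    [1, 1, -1, -1, 1, 1, -1, -1],
]
target = [3, -1, 1, -3, 3, -1, 1, -3]
chosen = mrmr_select(features, target, 2)
```

Ranking by relevance alone would keep features 0 and 1, which are nearly duplicates. The redundancy penalty makes the second pick feature 2 instead.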


Sentiment Dynamics in Social Media News Channels


Title	Sentiment Dynamics in Social Media News Channels
Authors	Nagendra Kumar, Rakshita Nagalla, Tanya Marwah, Manish Singh
Abstract	Social media is currently one of the most important means of news communication. Since people are consuming a large fraction of their daily news through social media, most of the traditional news channels are using social media to catch the attention of users. Each news channel has its own strategies to attract more users. In this paper, we analyze how the news channels use sentiment to garner users’ attention in social media. We compare the sentiment of social media news posts of television, radio and print media, to show the differences in the ways these channels cover the news. We also analyze users’ reactions and opinion sentiment on news posts with different sentiments. We perform our experiments on a dataset extracted from Facebook Pages of five popular news channels. Our dataset contains 0.15 million news posts and 1.13 billion user reactions. The results of our experiments show that the sentiment of user opinion has a strong correlation with the sentiment of the news post and the type of information source. Our study also illustrates the differences among the social media news channels of different types of news sources.
Tasks
Published	2019-08-21
URL	https://arxiv.org/abs/1908.08147v1
PDF	https://arxiv.org/pdf/1908.08147v1.pdf
PWC	https://paperswithcode.com/paper/sentiment-dynamics-in-social-media-news
Repo
Framework

Scalable Place Recognition Under Appearance Change for Autonomous Driving


Title	Scalable Place Recognition Under Appearance Change for Autonomous Driving
Authors	Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Thanh-Toan Do, Ian Reid
Abstract	A major challenge in place recognition for autonomous driving is to be robust against appearance changes due to short-term (e.g., weather, lighting) and long-term (seasons, vegetation growth, etc.) environmental variations. A promising solution is to continuously accumulate images to maintain an adequate sample of the conditions and incorporate new changes into the place recognition decision. However, this demands a place recognition technique that is scalable on an ever growing dataset. To this end, we propose a novel place recognition technique that can be efficiently retrained and compressed, such that the recognition of new queries can exploit all available data (including recent changes) without suffering from visible growth in computational cost. Underpinning our method is a novel temporal image matching technique based on Hidden Markov Models. Our experiments show that, compared to state-of-the-art techniques, our method has much greater potential for large-scale place recognition for autonomous driving.
Tasks	Autonomous Driving, Visual Place Recognition
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00178v1
PDF	https://arxiv.org/pdf/1908.00178v1.pdf
PWC	https://paperswithcode.com/paper/scalable-place-recognition-under-appearance
Repo
Framework

Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM


Title	Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM
Authors	Vibhatha Abeykoon, Geoffrey Fox, Minje Kim
Abstract	Understanding the bottlenecks in implementing a stochastic gradient descent (SGD)-based distributed support vector machine (SVM) algorithm is important for training on larger data sets. The communication time needed for model synchronization across the parallel processes is the main bottleneck that causes inefficiency in the training process. The model synchronization is directly affected by the mini-batch size of data processed before the global synchronization. To produce an efficient distributed model, the communication time spent on model synchronization during training has to be as low as possible while retaining a high testing accuracy. The effect of model synchronization frequency on the convergence of the algorithm and the accuracy of the generated model must be well understood to design an efficient distributed model. In this research, we identify the bottlenecks in model synchronization in a parallel stochastic gradient descent (PSGD)-based SVM algorithm with respect to the training model synchronization frequency (MSF). Our research shows that by optimizing the MSF in the data sets that we used, a reduction of 98% in communication time can be gained (16x - 24x speed up) with respect to high-frequency model synchronization. The training model optimization discussed in this paper guarantees a higher accuracy than the sequential algorithm along with faster convergence.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01219v1
PDF	https://arxiv.org/pdf/1905.01219v1.pdf
PWC	https://paperswithcode.com/paper/performance-optimization-on-model
Repo
Framework
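
The trade-off in the abstract above, how often parallel workers synchronize their models, can be illustrated with a small single-process simulation. It is a sketch under assumptions rather than the paper's system: workers run one after another instead of in parallel, synchronization is a plain average of the weight vectors, and the names `hinge_sgd_step`, `parallel_sgd_svm` and `sync_every` are hypothetical.

```python
def hinge_sgd_step(w, x, y, lr, reg):
    """One SGD step on the L2-regularized hinge loss of a linear SVM."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    grad = [reg * wi - (y * xi if margin < 1.0 else 0.0) for wi, xi in zip(w, x)]
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def average(models):
    """Element-wise mean of the workers' weight vectors."""
    dim = len(models[0])
    return [sum(m[i] for m in models) / len(models) for i in range(dim)]


def parallel_sgd_svm(shards, steps, sync_every, lr=0.1, reg=0.01):
    """Simulate workers that train locally and average models every sync_every steps.

    Returns the final averaged weights and the number of synchronization rounds,
    which stands in for communication cost in this sketch.
    """
    dim = len(shards[0][0][0])
    models = [[0.0] * dim for _ in shards]
    sync_count = 0
    for step in range(1, steps + 1):
        for k, shard in enumerate(shards):
            x, y = shard[(step - 1) % len(shard)]
            models[k] = hinge_sgd_step(models[k], x, y, lr, reg)
        if step % sync_every == 0:
            avg = average(models)
            models = [avg[:] for _ in models]
            sync_count += 1
    return average(models), sync_count


# Two workers, each holding a shard of (features, label) pairs; the last feature is a bias term.
shards = [
    [((2.0, 1.0), 1), ((-2.0, 1.0), -1)],
    [((1.5, 1.0), 1), ((-1.5, 1.0), -1)],
]
weights_frequent, syncs_frequent = parallel_sgd_svm(shards, steps=20, sync_every=1)
weights_sparse, syncs_sparse = parallel_sgd_svm(shards, steps=20, sync_every=10)
```

For the same 20 local steps, synchronizing every step costs 20 rounds of communication while synchronizing every 10 steps costs 2. Whether accuracy holds up at the lower frequency is the empirical question the paper studies.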

GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding


Title	GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding
Authors	Zhaocheng Zhu, Shizhen Xu, Meng Qu, Jian Tang
Abstract	Learning continuous representations of nodes has recently attracted growing interest in both academia and industry, due to their simplicity and effectiveness in a variety of applications. Most existing node embedding algorithms and systems are capable of processing networks with hundreds of thousands or a few million nodes. However, how to scale them to networks that have tens of millions or even hundreds of millions of nodes remains a challenging problem. In this paper, we propose GraphVite, a high-performance CPU-GPU hybrid system for training node embeddings, by co-optimizing the algorithm and the system. On the CPU end, augmented edge samples are generated in parallel by random walks in an online fashion on the network, and serve as the training data. On the GPU end, a novel parallel negative sampling is proposed to leverage multiple GPUs to train node embeddings simultaneously, without much data transfer and synchronization. Moreover, an efficient collaboration strategy is proposed to further reduce the synchronization cost between CPUs and GPUs. Experiments on multiple real-world networks show that GraphVite is highly efficient. It takes only about one minute for a network with 1 million nodes and 5 million edges on a single machine with 4 GPUs, and takes around 20 hours for a network with 66 million nodes and 1.8 billion edges. Compared to the current fastest system, GraphVite is about 50 times faster without any sacrifice in performance.
Tasks	Dimensionality Reduction, Knowledge Graph Embedding, Link Prediction, Network Embedding, Node Classification
Published	2019-03-02
URL	http://arxiv.org/abs/1903.00757v1
PDF	http://arxiv.org/pdf/1903.00757v1.pdf
PWC	https://paperswithcode.com/paper/graphvite-a-high-performance-cpu-gpu-hybrid
Repo
Framework

Robust Full-FoV Depth Estimation in Tele-wide Camera System


Title	Robust Full-FoV Depth Estimation in Tele-wide Camera System
Authors	Kai Guo, Seongwook Song, Soonkeun Chang, Tae-ui Kim, Seungmin Han, Irina Kim
Abstract	Tele-wide camera systems with lenses of different Field of View (FoV) have become very popular in recent mobile devices. It is usually difficult to obtain full-FoV depth with traditional stereo-matching methods. Pure Deep Neural Network (DNN) based depth estimation methods can obtain full-FoV depth, but have low robustness in scenarios that are not covered by the training dataset. In this paper, to address the above problems, we propose a hierarchical hourglass network for robust full-FoV depth estimation in a tele-wide camera system, which combines the robustness of traditional stereo-matching methods with the accuracy of DNNs. More specifically, the proposed network comprises three major modules: a single image depth prediction module infers the initial depth from the input color image, a depth propagation module propagates the traditional stereo-matching tele-FoV depth to surrounding regions, and a depth combination module fuses the initial depth with the propagated depth to generate the final output. Each of these modules employs an hourglass model, which is a kind of encoder-decoder structure with skip connections. Experimental results compared with state-of-the-art depth estimation methods demonstrate that our method not only produces robust and better subjective depth quality on wild test images, but also obtains better quantitative results on standard datasets.
Tasks	Depth Estimation, Stereo Matching, Stereo Matching Hand
Published	2019-09-08
URL	https://arxiv.org/abs/1909.03375v2
PDF	https://arxiv.org/pdf/1909.03375v2.pdf
PWC	https://paperswithcode.com/paper/robust-full-fov-depth-estimation-in-tele-wide
Repo
Framework

A Classification Supervised Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids


Title	A Classification Supervised Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids
Authors	Qiuyu Zhu, Ruixin Zhang
Abstract	Classic variational autoencoders (VAEs), which are built on standard function approximators, are used to learn complex data distributions. In particular, VAEs have shown promise on many complex tasks. In this paper, a new autoencoder model, the classification supervised autoencoder (CSAE) based on predefined evenly-distributed class centroids (PEDCC), is proposed. Our method uses the PEDCC of latent variables to train the network to ensure the maximization of inter-class distance and the minimization of intra-class distance. Instead of learning the mean and variance of the latent variable distribution and applying the reparameterization of the VAE, the latent variables of CSAE are used directly for classification and as the input of the decoder. In addition, a new loss function is proposed that incorporates the classification loss. Based on the basic structure of the universal autoencoder, we achieve comprehensively optimal results in encoding, decoding, and classification, together with good model generalization performance. The theoretical advantages are reflected in the experimental results.
Tasks
Published	2019-02-01
URL	https://arxiv.org/abs/1902.00220v3
PDF	https://arxiv.org/pdf/1902.00220v3.pdf
PWC	https://paperswithcode.com/paper/a-classification-supervised-auto-encoder
Repo
Framework

LogicENN: A Neural Based Knowledge Graphs Embedding Model with Logical Rules


Title	LogicENN: A Neural Based Knowledge Graphs Embedding Model with Logical Rules
Authors	Mojtaba Nayyeri, Chengjin Xu, Jens Lehmann, Hamed Shariat Yazdi
Abstract	Knowledge graph embedding models have gained significant attention in AI research. Recent works have shown that the inclusion of background knowledge, such as logical rules, can improve the performance of embeddings in downstream machine learning tasks. However, so far, most existing models do not allow the inclusion of rules. We address the challenge of including rules and present a new neural based embedding model (LogicENN). We prove that LogicENN can learn every ground truth of encoded rules in a knowledge graph. To the best of our knowledge, this has not been proved so far for the neural based family of embedding models. Moreover, we derive formulae for the inclusion of various rules, including (anti-)symmetric, inverse, irreflexive and transitive, implication, composition, equivalence and negation. Our formulation makes it possible to avoid grounding for implication and equivalence relations. Our experiments show that LogicENN outperforms the state-of-the-art models in link prediction.
Tasks	Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07141v1
PDF	https://arxiv.org/pdf/1908.07141v1.pdf
PWC	https://paperswithcode.com/paper/logicenn-a-neural-based-knowledge-graphs
Repo
Framework

A Re-evaluation of Knowledge Graph Completion Methods


Title	A Re-evaluation of Knowledge Graph Completion Methods
Authors	Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar, Yiming Yang
Abstract	Knowledge Graph Completion (KGC) aims at automatically predicting missing links for large-scale knowledge graphs. A vast number of state-of-the-art KGC techniques have been published in top conferences in several research fields including data mining, machine learning, and natural language processing. However, we notice that several recent papers report very high performance which largely outperforms previous state-of-the-art methods. In this paper, we find that this can be attributed to the inappropriate evaluation protocol used by them and propose a simple evaluation protocol to address this problem. The proposed protocol robustly handles bias in the model, which can substantially affect the final results. We conduct extensive experiments and report the performance of several existing methods using our protocol.
Tasks	Knowledge Graph Completion, Knowledge Graphs, Link Prediction
Published	2019-11-10
URL	https://arxiv.org/abs/1911.03903v1
PDF	https://arxiv.org/pdf/1911.03903v1.pdf
PWC	https://paperswithcode.com/paper/a-re-evaluation-of-knowledge-graph-completion
Repo
Framework
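
The abstract above does not spell out the protocol, so the following is only an assumed illustration of how an evaluation protocol can interact with model bias in link prediction: the handling of tied scores. If a model gives many candidates the same score, the rank of the correct candidate depends heavily on where ties are placed. The function name `rank_with_ties` and the three placements shown are choices made for this sketch.

```python
def rank_with_ties(scores, correct_index):
    """Rank of the correct candidate (1 = best) under three tie-handling rules.

    scores holds one score per candidate, higher meaning more plausible.
    """
    target = scores[correct_index]
    higher = sum(1 for s in scores if s > target)
    ties = sum(
        1 for i, s in enumerate(scores) if s == target and i != correct_index
    )
    return {
        "top": higher + 1,  # correct candidate placed first among its ties
        "bottom": higher + ties + 1,  # placed last among its ties
        "expected_random": higher + ties / 2 + 1,  # mean rank under random tie order
    }


# A degenerate model that scores every candidate identically.
degenerate = rank_with_ties([0.5, 0.5, 0.5, 0.5], correct_index=0)
# A model with distinct scores is unaffected by the tie rule.
distinct = rank_with_ties([0.9, 0.5, 0.1], correct_index=1)
```

Under the "top" rule the degenerate model appears to rank the correct candidate first even though it has learned nothing, while the other two rules do not reward it.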

Distance Metric Learned Collaborative Representation Classifier


Title	Distance Metric Learned Collaborative Representation Classifier
Authors	Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal
Abstract	Any generic deep machine learning algorithm is essentially a function fitting exercise, where the network tunes its weights and parameters to learn discriminatory features by minimizing some cost function. Though the network tries to learn the optimal feature space, it seldom tries to learn an optimal distance metric in the cost function, and hence misses out on an additional layer of abstraction. We present a simple, effective way of achieving this by learning a generic Mahalanobis distance in a collaborative loss function in an end-to-end fashion with any standard convolutional network as the feature learner. The proposed method DML-CRC gives state-of-the-art performance on benchmark fine-grained classification datasets CUB Birds, Oxford Flowers and Oxford-IIIT Pets using the VGG-19 deep network. The method is network agnostic and can be used for any similar classification tasks.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01168v2
PDF	https://arxiv.org/pdf/1905.01168v2.pdf
PWC	https://paperswithcode.com/paper/distance-metric-learned-collaborative
Repo
Framework
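
The generic Mahalanobis distance mentioned in the abstract above can be written as d(x, y) = sqrt((x - y)^T M (x - y)) with M positive semi-definite. A common way to keep M valid during learning is to parameterize it as M = L^T L, so that the distance is the Euclidean norm of L(x - y). The sketch below shows only this parameterization, not the collaborative loss or training procedure of DML-CRC, and the function name and the matrix `L` are assumptions.

```python
import math


def mahalanobis(x, y, L):
    """Distance sqrt((x - y)^T L^T L (x - y)), computed as the norm of L (x - y).

    L is a list of rows; in metric learning its entries would be trainable.
    """
    diff = [xi - yi for xi, yi in zip(x, y)]
    projected = [sum(lij * dj for lij, dj in zip(row, diff)) for row in L]
    return math.sqrt(sum(p * p for p in projected))


identity = [[1.0, 0.0], [0.0, 1.0]]
stretch_first_axis = [[2.0, 0.0], [0.0, 1.0]]

euclidean = mahalanobis([0.0, 0.0], [3.0, 4.0], identity)
stretched = mahalanobis([0.0, 0.0], [3.0, 4.0], stretch_first_axis)
```

With the identity matrix the distance reduces to the Euclidean distance. A learned `L` rescales or rotates feature directions so that differences along discriminative directions count for more.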

Augmenting and Tuning Knowledge Graph Embeddings


Title	Augmenting and Tuning Knowledge Graph Embeddings
Authors	Robert Bamler, Farnood Salehi, Stephan Mandt
Abstract	Knowledge graph embeddings rank among the most successful methods for link prediction in knowledge graphs, i.e., the task of completing an incomplete collection of relational facts. A downside of these models is their strong sensitivity to model hyperparameters, in particular regularizers, which have to be extensively tuned to reach good performance [Kadlec et al., 2017]. We propose an efficient method for large scale hyperparameter tuning by interpreting these models in a probabilistic framework. After a model augmentation that introduces per-entity hyperparameters, we use a variational expectation-maximization approach to tune thousands of such hyperparameters with minimal additional cost. Our approach is agnostic to details of the model and results in a new state of the art in link prediction on standard benchmark data.
Tasks	Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction
Published	2019-07-01
URL	https://arxiv.org/abs/1907.01068v1
PDF	https://arxiv.org/pdf/1907.01068v1.pdf
PWC	https://paperswithcode.com/paper/augmenting-and-tuning-knowledge-graph
Repo
Framework

Learning to Generate 6-DoF Grasp Poses with Reachability Awareness


Title	Learning to Generate 6-DoF Grasp Poses with Reachability Awareness
Authors	Xibai Lou, Yang Yang, Changhyun Choi
Abstract	Motivated by the stringent requirements of the unstructured real world, where a plethora of unknown objects reside at arbitrary locations on a surface, we propose a voxel-based deep 3D Convolutional Neural Network (3D CNN) that generates feasible 6-DoF grasp poses in an unrestricted workspace with reachability awareness. Unlike the majority of works that predict if a proposed grasp pose within the restricted workspace will be successful solely based on grasp pose stability, our approach further learns a reachability predictor that evaluates if the grasp pose is reachable or not from the robot’s own experience. To avoid the laborious collection of real training data, we exploit the power of simulation to train our networks on a large-scale synthetic dataset. This work is an early attempt that simultaneously evaluates grasping reachability from learned knowledge while proposing feasible grasp poses with a 3D CNN. Experimental results in both simulation and the real world demonstrate that our approach outperforms several other methods and achieves an 82.5% grasping success rate on unknown objects.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06404v2
PDF	https://arxiv.org/pdf/1910.06404v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-generate-6-dof-grasp-poses-with
Repo
Framework