Paper Group ANR 259
A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval. Towards Practical Visual Search Engine within Elasticsearch. Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships. Online Embedding Compression for Text Classification using Low Rank Matrix Factorization. Semantic proj …
A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
Title | A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval |
Authors | Federico Magliani, Tomaso Fontanini, Andrea Prati |
Abstract | Recent advances in deep learning have improved performance on image retrieval tasks. Through the many convolutional layers of a Convolutional Neural Network (CNN), it is possible to obtain a hierarchy of features from the evaluated image. At every level, the extracted patches are smaller and more representative than those of the previous levels. Following this idea, this paper introduces a new detector applied to the feature maps extracted from a pre-trained CNN. Specifically, this approach increases the number of features in order to improve the performance of aggregation algorithms such as the widely used VLAD embedding. The proposed approach is tested on different public datasets: Holidays, Oxford5k, Paris6k and UKB. |
Tasks | Content-Based Image Retrieval, Image Retrieval |
Published | 2018-08-15 |
URL | http://arxiv.org/abs/1808.05022v1 |
| http://arxiv.org/pdf/1808.05022v1.pdf |
PWC | https://paperswithcode.com/paper/a-dense-depth-representation-for-vlad |
Repo | |
Framework | |
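Since the paper's contribution feeds a standard VLAD pipeline, a minimal sketch of VLAD aggregation may help fix ideas. Here `image_desc` is a stand-in for the local descriptors the proposed dense-depth detector would extract from the CNN feature maps; the detector itself is not reproduced.

```python
# A minimal sketch of VLAD aggregation over local CNN descriptors.
import numpy as np
from sklearn.cluster import KMeans

def vlad_encode(features, codebook):
    """Aggregate (N, D) local descriptors into one K*D VLAD vector."""
    K, D = codebook.cluster_centers_.shape
    assignments = codebook.predict(features)
    vlad = np.zeros((K, D))
    for k in range(K):
        members = features[assignments == k]
        if len(members) > 0:
            # Sum of residuals between descriptors and their centroid.
            vlad[k] = (members - codebook.cluster_centers_[k]).sum(axis=0)
    vlad = vlad.ravel()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))    # power normalization
    return vlad / (np.linalg.norm(vlad) + 1e-12)    # L2 normalization

rng = np.random.default_rng(0)
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(
    rng.normal(size=(1000, 64)))                    # fit on training descriptors
image_desc = rng.normal(size=(200, 64))             # descriptors of one image
print(vlad_encode(image_desc, codebook).shape)      # (1024,)
```

More local features per image, which is the paper's goal, give each centroid more residuals to aggregate, which is why a denser detector feeds directly into retrieval quality.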
Towards Practical Visual Search Engine within Elasticsearch
Title | Towards Practical Visual Search Engine within Elasticsearch |
Authors | Cun Mu, Jun Zhao, Guang Yang, Jing Zhang, Zheng Yan |
Abstract | In this paper, we describe our end-to-end content-based image retrieval system built upon Elasticsearch, a well-known and popular textual search engine. As far as we know, this is the first time such a system has been implemented in eCommerce, and our efforts have turned out to be highly worthwhile. We end up with a novel and exciting visual search solution that is extremely easy to deploy, distribute, scale and monitor in a cost-friendly manner. Moreover, our platform is intrinsically flexible in supporting multimodal searches, where visual and textual information can be jointly leveraged in retrieval. The core idea is to encode image feature vectors into a collection of string tokens in such a way that closer vectors share more string tokens in common. By doing so, we can utilize Elasticsearch to efficiently retrieve similar images based on similarities between encoded string tokens. As part of the development, we propose a novel vector-to-string encoding method, which is shown to substantially outperform previous ones in terms of both precision and latency. First-hand experiences in implementing this Elasticsearch-based platform are extensively addressed, which should be valuable to practitioners also interested in building a visual search engine on top of Elasticsearch. |
Tasks | Content-Based Image Retrieval, Image Retrieval |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.08896v3 |
| http://arxiv.org/pdf/1806.08896v3.pdf |
PWC | https://paperswithcode.com/paper/towards-practical-visual-search-engine-within |
Repo | |
Framework | |
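The core idea, encoding vectors as string tokens so that nearby vectors share tokens, can be sketched with a generic product-quantization-style encoder. This illustrates the principle only; the paper's actual encoding method and token format are not reproduced, and the token strings below are an assumption.

```python
# A sketch of the principle: quantize sub-vectors to cluster ids and emit
# tokens like "sub3_17", so token-set overlap approximates vector proximity.
import numpy as np
from sklearn.cluster import KMeans

def fit_codebooks(vectors, n_sub=8, n_codes=32):
    return [KMeans(n_clusters=n_codes, n_init=4, random_state=0).fit(c)
            for c in np.split(vectors, n_sub, axis=1)]

def to_tokens(vector, codebooks):
    chunks = np.split(vector[None, :], len(codebooks), axis=1)
    return {f"sub{i}_{cb.predict(c)[0]}"
            for i, (cb, c) in enumerate(zip(codebooks, chunks))}

rng = np.random.default_rng(0)
base = rng.normal(size=(5000, 64))                 # image feature vectors
codebooks = fit_codebooks(base)
query = base[0]
near = query + 0.01 * rng.normal(size=64)          # a slightly perturbed copy
print(len(to_tokens(query, codebooks) & to_tokens(near, codebooks)))  # ~8
```

Indexing the tokens as a text field then lets a plain Elasticsearch `match` query rank candidates by shared-token count.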
Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships
Title | Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships |
Authors | Yong Liu, Ruiping Wang, Shiguang Shan, Xilin Chen |
Abstract | Context is important for accurate visual recognition. In this work we propose an object detection algorithm that not only considers object visual appearance, but also makes use of two kinds of context: scene contextual information and object relationships within a single image. Object detection is therefore regarded as both a cognition problem and a reasoning problem when leveraging this structured information. Specifically, this paper formulates object detection as a problem of graph structure inference, where, given an image, the objects are treated as nodes in a graph and relationships between the objects are modeled as edges in that graph. To this end, we present a so-called Structure Inference Network (SIN), a detector that augments a typical detection framework (e.g. Faster R-CNN) with a graphical model which aims to infer object state. Comprehensive experiments on the PASCAL VOC and MS COCO datasets indicate that scene context and object relationships truly improve the performance of object detection, with more desirable and reasonable outputs. |
Tasks | Object Detection |
Published | 2018-06-30 |
URL | http://arxiv.org/abs/1807.00119v1 |
| http://arxiv.org/pdf/1807.00119v1.pdf |
PWC | https://paperswithcode.com/paper/structure-inference-net-object-detection |
Repo | |
Framework | |
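A rough sketch of the structure-inference step, under the assumption of a GRU-based update as the abstract's "graphical model": per-object features are graph nodes, a learned pairwise score weights object-to-object messages, and a global scene feature conditions every node. Dimensions and the edge scoring below are illustrative, not the released architecture.

```python
# Illustrative message passing over detection nodes, not the released model.
import torch
import torch.nn as nn

class StructureInference(nn.Module):
    def __init__(self, dim=256, steps=2):
        super().__init__()
        self.steps = steps
        self.scene_gru = nn.GRUCell(dim, dim)    # node state <- scene context
        self.edge_gru = nn.GRUCell(dim, dim)     # node state <- other objects
        self.edge_score = nn.Linear(2 * dim, 1)  # relationship strength

    def forward(self, nodes, scene):
        h = nodes                                 # (N, dim) object features
        for _ in range(self.steps):
            n = h.size(0)
            pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                               h.unsqueeze(0).expand(n, n, -1)], dim=-1)
            w = torch.softmax(self.edge_score(pairs).squeeze(-1), dim=1)
            messages = w @ h                      # weighted sum over neighbors
            h = self.scene_gru(scene.unsqueeze(0).expand(n, -1), h)
            h = self.edge_gru(messages, h)
        return h  # refined states for the classification/regression heads

sin = StructureInference()
print(sin(torch.randn(5, 256), torch.randn(256)).shape)  # torch.Size([5, 256])
```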
Online Embedding Compression for Text Classification using Low Rank Matrix Factorization
Title | Online Embedding Compression for Text Classification using Low Rank Matrix Factorization |
Authors | Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon |
Abstract | Deep learning models have become the state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low-rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain accuracy loss without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark. |
Tasks | Quantization, Sentence Classification, Text Classification |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00641v1 |
| http://arxiv.org/pdf/1811.00641v1.pdf |
PWC | https://paperswithcode.com/paper/online-embedding-compression-for-text |
Repo | |
Framework | |
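The factorization step itself is standard linear algebra; a minimal sketch with a truncated SVD shows how a vocab-by-dimension embedding matrix factors into two small matrices and what a roughly 90% parameter reduction looks like. The re-training stage and the CALR schedule are not reproduced here.

```python
# A minimal sketch of low-rank embedding compression via truncated SVD.
import numpy as np

def low_rank_compress(E, rank):
    """E: (V, D) embedding matrix -> U (V, rank), W (rank, D);
    the embedding lookup becomes U[token_ids] @ W."""
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]   # singular values folded into U

rng = np.random.default_rng(0)
E = rng.normal(size=(50000, 300))              # trained embedding layer
U_r, W_r = low_rank_compress(E, rank=30)
saved = 1 - (U_r.size + W_r.size) / E.size
print(f"parameters saved: {saved:.1%}")        # ~89.9%
```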
Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings
Title | Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings |
Authors | Gabriel Grand, Idan Asher Blank, Francisco Pereira, Evelina Fedorenko |
Abstract | The words of a language reflect the structure of the human mind, allowing us to transmit thoughts between individuals. However, language can represent only a subset of our rich and detailed cognitive architecture. Here, we ask what kinds of common knowledge (semantic memory) are captured by word meanings (lexical semantics). We examine a prominent computational model that represents words as vectors in a multidimensional space, such that proximity between word-vectors approximates semantic relatedness. Because related words appear in similar contexts, such spaces - called “word embeddings” - can be learned from patterns of lexical co-occurrences in natural language. Despite their popularity, a fundamental concern about word embeddings is that they appear to be semantically “rigid”: inter-word proximity captures only overall similarity, yet human judgments about object similarities are highly context-dependent and involve multiple, distinct semantic features. For example, dolphins and alligators appear similar in size, but differ in intelligence and aggressiveness. Could such context-dependent relationships be recovered from word embeddings? To address this issue, we introduce a powerful, domain-general solution: “semantic projection” of word-vectors onto lines that represent various object features, like size (the line extending from the word “small” to “big”), intelligence (from “dumb” to “smart”), or danger (from “safe” to “dangerous”). This method, which is intuitively analogous to placing objects “on a mental scale” between two extremes, recovers human judgments across a range of object categories and properties. We thus show that word embeddings inherit a wealth of common knowledge from word co-occurrence statistics and can be flexibly manipulated to express context-dependent meanings. |
Tasks | Word Embeddings |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01241v2 |
| http://arxiv.org/pdf/1802.01241v2.pdf |
PWC | https://paperswithcode.com/paper/semantic-projection-recovering-human |
Repo | |
Framework | |
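The method reduces to a dot product with a normalized difference vector, so a short sketch suffices. The toy 2-D vectors and the `vec` lookup below are stand-ins for a real embedding such as GloVe.

```python
# A minimal sketch of semantic projection onto a feature line.
import numpy as np

def semantic_projection(objects, pole_low, pole_high, vec):
    line = vec(pole_high) - vec(pole_low)        # e.g. "small" -> "big"
    line = line / np.linalg.norm(line)
    return {w: float(vec(w) @ line) for w in objects}

toy = {"small": np.array([-1.0, 0.0]), "big": np.array([1.0, 0.0]),
       "dolphin": np.array([0.6, 0.3]), "mouse": np.array([-0.8, 0.2])}
print(semantic_projection(["dolphin", "mouse"], "small", "big", toy.get))
# dolphin lands near the "big" end of the size line, mouse near "small"
```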
Scalable kernel-based variable selection with sparsistency
Title | Scalable kernel-based variable selection with sparsistency |
Authors | Xin He, Junhui Wang, Shaogao Lv |
Abstract | Variable selection is central to high-dimensional data analysis, and various algorithms have been developed. Ideally, a variable selection algorithm should be flexible, scalable, and come with theoretical guarantees, yet most existing algorithms cannot attain these properties at the same time. In this article, a three-step variable selection algorithm is developed, involving kernel-based estimation of the regression function and its gradient functions as well as a hard-thresholding step. Its key advantage is that it makes no explicit model assumption, admits general predictor effects, allows for scalable computation, and attains desirable asymptotic sparsistency. The proposed algorithm can be adapted to any reproducing kernel Hilbert space (RKHS) with different kernel functions, and can be extended to interaction selection with slight modification. Its computational cost is only linear in the data dimension, and can be further improved through parallel computing. The sparsistency of the proposed algorithm is established for general RKHS under mild conditions, including linear and Gaussian kernels as special cases. Its effectiveness is also supported by a variety of simulated and real examples. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09246v2 |
| http://arxiv.org/pdf/1802.09246v2.pdf |
PWC | https://paperswithcode.com/paper/scalable-kernel-based-variable-selection-with |
Repo | |
Framework | |
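A minimal sketch of the three steps under a Gaussian kernel: fit kernel ridge regression, evaluate the closed-form gradient of the fit at the sample points, and hard-threshold the per-variable gradient norms. The bandwidth, regularization and threshold values are illustrative assumptions.

```python
# A sketch of gradient-based variable selection with a Gaussian kernel.
import numpy as np

def gaussian_kernel(X, Y, sigma):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def select_variables(X, y, sigma=1.0, lam=1e-2, tau=0.1):
    n, p = X.shape
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)   # kernel ridge fit
    grad_norm2 = np.zeros(p)
    for j in range(p):
        # d/dx_j of the Gaussian kernel in closed form, evaluated pairwise.
        dK = K * (X[None, :, j] - X[:, None, j]) / sigma ** 2
        grad_norm2[j] = np.mean((dK @ alpha) ** 2)        # avg squared gradient
    return np.where(np.sqrt(grad_norm2) > tau)[0]         # hard thresholding

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=200)
print(select_variables(X, y))  # ideally picks variables 0 and 1
```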
Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification
Title | Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification |
Authors | Yongli Zhu, Chengxi Liu, Kai Sun |
Abstract | This paper presents a study on power grid disturbance classification by Deep Learning (DL). A real synchrophasor set comprising three different types of disturbance events from the Frequency Monitoring Network (FNET) is used. An image embedding technique called the Gramian Angular Field is applied to transform each time series of event data into a two-dimensional image for learning. Two main DL algorithms, i.e. the CNN (Convolutional Neural Network) and the RNN (Recurrent Neural Network), are tested and compared with two widely used data mining tools, the Support Vector Machine and the Decision Tree. The test results demonstrate the superiority of both DL algorithms over the other methods in the application of power system transient disturbance classification. |
Tasks | Time Series |
Published | 2018-12-22 |
URL | http://arxiv.org/abs/1812.09427v1 |
| http://arxiv.org/pdf/1812.09427v1.pdf |
PWC | https://paperswithcode.com/paper/image-embedding-of-pmu-data-for-deep-learning |
Repo | |
Framework | |
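The Gramian Angular Field transform is standard and compact enough to sketch in full: rescale the series to [-1, 1], read the values as cosines of angles, and form the matrix of cosines of summed angles (the GASF variant).

```python
# A minimal sketch of the Gramian Angular (Summation) Field encoding.
import numpy as np

def gramian_angular_field(series):
    x = np.asarray(series, dtype=float)
    # Min-max rescale to [-1, 1] so arccos is defined.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    # GASF entry (i, j) = cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])

t = np.linspace(0, 4 * np.pi, 128)
noise = 0.05 * np.random.default_rng(0).normal(size=128)
img = gramian_angular_field(np.sin(t) + noise)
print(img.shape)  # (128, 128) image, ready for the CNN/RNN classifiers
```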
Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks
Title | Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks |
Authors | Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro |
Abstract | Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities resulting in a tighter generalization bound for two layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes, and could potentially explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks. |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.12076v1 |
| http://arxiv.org/pdf/1805.12076v1.pdf |
PWC | https://paperswithcode.com/paper/towards-understanding-the-role-of-over |
Repo | |
Framework | |
MEMOIR: Multi-class Extreme Classification with Inexact Margin
Title | MEMOIR: Multi-class Extreme Classification with Inexact Margin |
Authors | Anton Belyy, Aleksei Sholokhov |
Abstract | Multi-class classification with a very large number of classes, or extreme classification, is a challenging problem from both statistical and computational perspectives. Most of the classical approaches to multi-class classification, including one-vs-rest or multi-class support vector machines, require the exact estimation of the classifier’s margin at both the training and the prediction steps, making them intractable in extreme classification scenarios. In this paper, we study the impact of computing an approximate margin using approximate nearest neighbor (ANN) search structures combined with locality-sensitive hashing (LSH). This approximation dramatically reduces both the training and the prediction time without a significant loss in performance. We theoretically prove that this approximation does not lead to a significant increase in the risk of the model, and provide empirical evidence over five publicly available large-scale datasets showing that the proposed approach is highly competitive with state-of-the-art approaches in terms of time, memory and performance measures. |
Tasks | |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09863v1 |
| http://arxiv.org/pdf/1811.09863v1.pdf |
PWC | https://paperswithcode.com/paper/memoir-multi-class-extreme-classification |
Repo | |
Framework | |
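A minimal sketch of the approximate-margin idea using generic random-hyperplane LSH over per-class weight vectors: at prediction time only the classes sharing the query's hash bucket are scored exactly. The paper's precise index structure may differ, and a production version would use several hash tables to raise recall.

```python
# A sketch of LSH-bucketed class scoring for extreme classification.
import numpy as np
from collections import defaultdict

class LSHClassIndex:
    def __init__(self, class_weights, n_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        self.W = class_weights                            # (C, D), one row per class
        self.planes = rng.normal(size=(n_bits, class_weights.shape[1]))
        self.buckets = defaultdict(list)
        for c, w in enumerate(class_weights):
            self.buckets[self._hash(w)].append(c)

    def _hash(self, x):
        return tuple((self.planes @ x > 0).astype(int))   # sign pattern

    def predict(self, x):
        candidates = self.buckets.get(self._hash(x), [])
        if not candidates:                                # fall back to exact scan
            candidates = range(len(self.W))
        # Exact margins computed only for the candidate classes.
        return max(candidates, key=lambda c: self.W[c] @ x)

rng = np.random.default_rng(1)
W = rng.normal(size=(10000, 64))                          # 10k classes
index = LSHClassIndex(W)
x = W[1234] + 0.05 * rng.normal(size=64)                  # query near class 1234
print(index.predict(x))
```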
Utilizing Complex-valued Network for Learning to Compare Image Patches
Title | Utilizing Complex-valued Network for Learning to Compare Image Patches |
Authors | Siwen Jiang, Wenxuan Wei, Shihao Guo, Hongguang Fu, Lei Huang |
Abstract | The great achievements of convolutional neural networks (CNNs) in feature and metric learning have attracted many researchers. However, the vast majority of deep network architectures are based on real-valued representations. Research on complex-valued networks has received little attention, owing to the absence of effective models and of a suitable distance for complex-valued vectors. Motivated by recent works showing that complex vectors have a richer representational capacity and that efficient complex-valued building blocks are available, we propose a new approach for learning image descriptors with complex numbers to compare image patches. We also propose a new architecture that learns the image similarity function directly on top of a complex-valued network. We show that our models achieve competitive results on benchmark datasets. We make the source code of our models publicly available. |
Tasks | Metric Learning |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12035v2 |
| http://arxiv.org/pdf/1811.12035v2.pdf |
PWC | https://paperswithcode.com/paper/utilizing-complex-valued-network-for-learning |
Repo | |
Framework | |
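A minimal sketch of the basic complex-valued building block, realized with real tensors via $(A+iB)(x+iy) = (Ax-By) + i(Bx+Ay)$. The full patch-comparison architecture is not reproduced, and the Hermitian-inner-product similarity at the end is one natural choice of distance, not necessarily the paper's.

```python
# A complex-valued linear layer built from two real-valued linear maps.
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.A = nn.Linear(d_in, d_out, bias=False)  # real part of the weights
        self.B = nn.Linear(d_in, d_out, bias=False)  # imaginary part

    def forward(self, x_re, x_im):
        out_re = self.A(x_re) - self.B(x_im)
        out_im = self.B(x_re) + self.A(x_im)
        return out_re, out_im

layer = ComplexLinear(128, 64)
re, im = layer(torch.randn(4, 128), torch.randn(4, 128))
# Magnitude of the Hermitian inner product between two complex descriptors.
sim = ((re[0] @ re[1] + im[0] @ im[1]) ** 2
       + (re[0] @ im[1] - im[0] @ re[1]) ** 2).sqrt()
print(re.shape, im.shape, float(sim))
```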
A Coordinate-Free Construction of Scalable Natural Gradient
Title | A Coordinate-Free Construction of Scalable Natural Gradient |
Authors | Kevin Luk, Roger Grosse |
Abstract | Most neural networks are trained using first-order optimization methods, which are sensitive to the parameterization of the model. Natural gradient descent is invariant to smooth reparameterizations because it is defined in a coordinate-free way, but tractable approximations are typically defined in terms of coordinate systems, and hence may lose the invariance properties. We analyze the invariance properties of the Kronecker-Factored Approximate Curvature (K-FAC) algorithm by constructing the algorithm in a coordinate-free way. We explicitly construct a Riemannian metric under which the natural gradient matches the K-FAC update; invariance to affine transformations of the activations follows immediately. We extend our framework to analyze the invariance properties of K-FAC applied to convolutional networks and recurrent neural networks, as well as metrics other than the usual Fisher metric. |
Tasks | |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10340v1 |
| http://arxiv.org/pdf/1808.10340v1.pdf |
PWC | https://paperswithcode.com/paper/a-coordinate-free-construction-of-scalable |
Repo | |
Framework | |
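For context, a minimal sketch of the K-FAC update the paper analyzes, for a single fully connected layer: the Fisher block is approximated by a Kronecker product of the input second moment A and the backpropagated-gradient second moment G, so preconditioning reduces to two small inverses instead of one huge one. The damping and learning rate are illustrative.

```python
# A sketch of a K-FAC preconditioned step for one fully connected layer.
import numpy as np

def kfac_step(W_grad, acts, grads_out, damping=1e-2, lr=0.1):
    """W_grad: (d_out, d_in) loss gradient w.r.t. the weights.
    acts: (batch, d_in) layer inputs; grads_out: (batch, d_out)
    gradients w.r.t. the layer's pre-activations."""
    n = acts.shape[0]
    A = acts.T @ acts / n + damping * np.eye(acts.shape[1])
    G = grads_out.T @ grads_out / n + damping * np.eye(grads_out.shape[1])
    # (G kron A)^{-1} vec(W_grad)  ==  G^{-1} W_grad A^{-1}
    precond = np.linalg.solve(G, W_grad) @ np.linalg.inv(A)
    return -lr * precond  # weight update

rng = np.random.default_rng(0)
acts, grads_out = rng.normal(size=(32, 20)), rng.normal(size=(32, 10))
W_grad = grads_out.T @ acts / 32
print(kfac_step(W_grad, acts, grads_out).shape)  # (10, 20)
```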
Deep Reinforcement Learning for Dynamic Urban Transportation Problems
Title | Deep Reinforcement Learning for Dynamic Urban Transportation Problems |
Authors | Laura Schultz, Vadim Sokolov |
Abstract | We explore the use of deep learning and deep reinforcement learning for optimization problems in transportation. Many transportation system analysis tasks are formulated as optimization problems, such as optimal control problems in intelligent transportation systems and long-term urban planning. Often, the transportation models used to represent the dynamics of a transportation system involve large data sets with complex input-output interactions and are difficult to use in the context of optimization. Deep learning metamodels can produce a lower-dimensional representation of those relations and make it possible to implement optimization and reinforcement learning algorithms efficiently. In particular, we develop deep learning models for calibrating transportation simulators, and apply reinforcement learning to solve the problem of optimally scheduling travelers on the network. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05310v1 |
| http://arxiv.org/pdf/1806.05310v1.pdf |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-dynamic-urban |
Repo | |
Framework | |
On the Dual Geometry of Laplacian Eigenfunctions
Title | On the Dual Geometry of Laplacian Eigenfunctions |
Authors | Alexander Cloninger, Stefan Steinerberger |
Abstract | We discuss the geometry of Laplacian eigenfunctions $-\Delta \phi = \lambda \phi$ on compact manifolds $(M,g)$ and combinatorial graphs $G=(V,E)$. The ‘dual’ geometry of Laplacian eigenfunctions is well understood on $\mathbb{T}^d$ (identified with $\mathbb{Z}^d$) and $\mathbb{R}^n$ (which is self-dual). The dual geometry plays a tremendous role in various fields of pure and applied mathematics. The purpose of our paper is to point out a notion of similarity between eigenfunctions that allows one to reconstruct that geometry. Our measure of ‘similarity’ $\alpha(\phi_{\lambda}, \phi_{\mu})$ between eigenfunctions $\phi_{\lambda}$ and $\phi_{\mu}$ is given by a global average of local correlations $$ \alpha(\phi_{\lambda}, \phi_{\mu})^2 = \| \phi_{\lambda} \phi_{\mu} \|_{L^2}^{-2}\int_{M}{ \left( \int_{M}{ p(t,x,y)( \phi_{\lambda}(y) - \phi_{\lambda}(x))( \phi_{\mu}(y) - \phi_{\mu}(x)) dy} \right)^2 dx},$$ where $p(t,x,y)$ is the classical heat kernel and $e^{-t \lambda} + e^{-t \mu} = 1$. This notion recovers all classical notions of duality but is equally applicable to other (rough) geometries and graphs; many numerical examples in different continuous and discrete settings illustrate the result. |
Tasks | |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09816v1 |
| http://arxiv.org/pdf/1804.09816v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-dual-geometry-of-laplacian |
Repo | |
Framework | |
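The similarity measure can be evaluated directly on a graph by discretizing the integrals with sums. A worked sketch follows, assuming a cycle graph as the test geometry, the matrix exponential of the graph Laplacian as the heat kernel, and a root-find for the $t$ satisfying $e^{-t\lambda} + e^{-t\mu} = 1$.

```python
# A discrete evaluation of alpha(phi_lambda, phi_mu) on a combinatorial graph.
import numpy as np
from scipy.linalg import expm
from scipy.optimize import brentq

def alpha(phi_l, phi_m, lam, mu, L):
    # Pick t with exp(-t*lam) + exp(-t*mu) = 1 (requires lam, mu > 0).
    t = brentq(lambda s: np.exp(-s * lam) + np.exp(-s * mu) - 1, 1e-9, 1e6)
    P = expm(-t * L)                              # heat kernel p(t, x, y)
    dl = phi_l[None, :] - phi_l[:, None]          # phi_lambda(y) - phi_lambda(x)
    dm = phi_m[None, :] - phi_m[:, None]
    inner = (P * dl * dm).sum(axis=1)             # inner integral over y
    return np.sqrt((inner ** 2).sum()) / np.linalg.norm(phi_l * phi_m)

# Cycle graph on 40 vertices; the constant eigenvector (lambda = 0) is skipped.
n = 40
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A
w, V = np.linalg.eigh(L)
print(alpha(V[:, 1], V[:, 2], w[1], w[2], L))
```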
Review Helpfulness Prediction with Embedding-Gated CNN
Title | Review Helpfulness Prediction with Embedding-Gated CNN |
Authors | Cen Chen, Minghui Qiu, Yinfei Yang, Jun Zhou, Jun Huang, Xiaolong Li, Forrest Bao |
Abstract | Product reviews, predominantly in textual form, significantly help consumers finalize their purchasing decisions. It is therefore important for e-commerce companies to predict review helpfulness so as to present and recommend reviews in a more informative manner. In this work, we introduce a convolutional neural network model that is able to extract abstract features from multi-granularity representations. Inspired by the fact that different words contribute differently to the meaning of a sentence, we learn word-level embedding gates for all the representations. Furthermore, it is common that some product domains/categories have rich user reviews while others do not. To help domains with insufficient data, we integrate our model into a cross-domain relationship learning framework that effectively transfers knowledge from other domains. Extensive experiments show that our model yields better performance than the existing methods. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09896v1 |
| http://arxiv.org/pdf/1808.09896v1.pdf |
PWC | https://paperswithcode.com/paper/review-helpfulness-prediction-with-embedding |
Repo | |
Framework | |
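A minimal sketch of a word-level embedding gate: one learned scalar per vocabulary entry, squashed through a sigmoid and multiplied into the embedding before the convolutional layers. The single-granularity setup and gate parameterization here simplify the paper's multi-granularity model.

```python
# An illustrative embedding-gated input layer for a text CNN.
import torch
import torch.nn as nn

class GatedEmbedding(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.gate = nn.Embedding(vocab_size, 1)   # one scalar gate per word

    def forward(self, token_ids):
        e = self.emb(token_ids)                   # (batch, seq, dim)
        g = torch.sigmoid(self.gate(token_ids))   # (batch, seq, 1)
        return g * e   # down-weight words that matter less for helpfulness

layer = GatedEmbedding(vocab_size=30000, dim=128)
out = layer(torch.randint(0, 30000, (8, 50)))
print(out.shape)  # torch.Size([8, 50, 128]) -> feed to Conv1d over the sequence
```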
Deep Asymmetric Networks with a Set of Node-wise Variant Activation Functions
Title | Deep Asymmetric Networks with a Set of Node-wise Variant Activation Functions |
Authors | Jinhyeok Jang, Hyunjoong Cho, Jaehong Kim, Jaeyeon Lee, Seungjoon Yang |
Abstract | This work presents deep asymmetric networks with a set of node-wise variant activation functions. The nodes’ sensitivities are affected by activation function selections such that the nodes with smaller indices become increasingly more sensitive. As a result, features learned by the nodes are sorted by the node indices in the order of their importance. Asymmetric networks not only learn input features but also the importance of those features. Nodes of lesser importance in asymmetric networks can be pruned to reduce the complexity of the networks, and the pruned networks can be retrained without incurring performance losses. We validate the feature-sorting property using both shallow and deep asymmetric networks as well as deep asymmetric networks transferred from famous networks. |
Tasks | |
Published | 2018-09-11 |
URL | https://arxiv.org/abs/1809.03721v2 |
| https://arxiv.org/pdf/1809.03721v2.pdf |
PWC | https://paperswithcode.com/paper/deep-asymmetric-networks-with-a-set-of-node |
Repo | |
Framework | |
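A minimal sketch of node-wise variant activations: each node's response is scaled by an index-dependent sensitivity that decays with the node index, so low-index nodes carry the important features and high-index nodes can be pruned without retraining from scratch. The geometric decay schedule below is an illustrative assumption, not the paper's exact construction.

```python
# An illustrative asymmetric layer with index-dependent activation slopes.
import torch
import torch.nn as nn

class AsymmetricLayer(nn.Module):
    def __init__(self, d_in, d_out, decay=0.9):
        super().__init__()
        self.decay = decay
        self.linear = nn.Linear(d_in, d_out)
        # Fixed, index-dependent sensitivities: node 0 is most sensitive.
        self.register_buffer(
            "slopes", decay ** torch.arange(d_out, dtype=torch.float))

    def forward(self, x):
        return torch.relu(self.linear(x)) * self.slopes

    def pruned(self, keep):
        """Keep only the `keep` most important (lowest-index) nodes."""
        layer = AsymmetricLayer(self.linear.in_features, keep, self.decay)
        layer.linear.weight.data = self.linear.weight.data[:keep].clone()
        layer.linear.bias.data = self.linear.bias.data[:keep].clone()
        return layer

layer = AsymmetricLayer(64, 32)
small = layer.pruned(8)                  # drop the 24 least important nodes
print(small(torch.randn(4, 64)).shape)   # torch.Size([4, 8])
```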