Paper Group ANR 259
A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval. Towards Practical Visual Search Engine within Elasticsearch. Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships. Online Embedding Compression for Text Classification using Low Rank Matrix Factorization. Semantic proj …
A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
Title | A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval |
Authors | Federico Magliani, Tomaso Fontanini, Andrea Prati |
Abstract | Recent advances in deep learning have improved performance on image retrieval tasks. Through the many convolutional layers of a Convolutional Neural Network (CNN), it is possible to obtain a hierarchy of features from the evaluated image. At every level, the extracted patches are smaller and more representative than those of the previous levels. Following this idea, this paper introduces a new detector applied to the feature maps extracted from a pre-trained CNN. Specifically, this approach increases the number of features in order to improve the performance of aggregation algorithms such as the widely used VLAD embedding. The proposed approach is tested on different public datasets: Holidays, Oxford5k, Paris6k and UKB. |
Tasks | Content-Based Image Retrieval, Image Retrieval |
Published | 2018-08-15 |
URL | http://arxiv.org/abs/1808.05022v1 |
| http://arxiv.org/pdf/1808.05022v1.pdf |
PWC | https://paperswithcode.com/paper/a-dense-depth-representation-for-vlad |
Repo | |
Framework | |
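Since the paper's contribution feeds a standard VLAD pipeline, a minimal sketch of VLAD aggregation may help fix ideas. Here `image_desc` is a stand-in for the local descriptors the proposed dense-depth detector would extract from the CNN feature maps; the detector itself is not reproduced.

```python
# A minimal sketch of VLAD aggregation over local CNN descriptors.
import numpy as np
from sklearn.cluster import KMeans

def vlad_encode(features, codebook):
    """Aggregate (N, D) local descriptors into one K*D VLAD vector."""
    K, D = codebook.cluster_centers_.shape
    assignments = codebook.predict(features)
    vlad = np.zeros((K, D))
    for k in range(K):
        members = features[assignments == k]
        if len(members) > 0:
            # Sum of residuals between descriptors and their centroid.
            vlad[k] = (members - codebook.cluster_centers_[k]).sum(axis=0)
    vlad = vlad.ravel()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))    # power normalization
    return vlad / (np.linalg.norm(vlad) + 1e-12)    # L2 normalization

rng = np.random.default_rng(0)
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(
    rng.normal(size=(1000, 64)))                    # fit on training descriptors
image_desc = rng.normal(size=(200, 64))             # descriptors of one image
print(vlad_encode(image_desc, codebook).shape)      # (1024,)
```

More local features per image, which is the paper's goal, give each centroid more residuals to aggregate, which is why a denser detector feeds directly into retrieval quality.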
Towards Practical Visual Search Engine within Elasticsearch
Title | Towards Practical Visual Search Engine within Elasticsearch |
Authors | Cun Mu, Jun Zhao, Guang Yang, Jing Zhang, Zheng Yan |
Abstract | In this paper, we describe our end-to-end content-based image retrieval system built upon Elasticsearch, a well-known and popular textual search engine. As far as we know, this is the first time such a system has been implemented in eCommerce, and our efforts have turned out to be highly worthwhile. We end up with a novel and exciting visual search solution that is extremely easy to deploy, distribute, scale and monitor in a cost-friendly manner. Moreover, our platform is intrinsically flexible in supporting multimodal searches, where visual and textual information can be jointly leveraged in retrieval. The core idea is to encode image feature vectors into a collection of string tokens in such a way that closer vectors share more string tokens in common. By doing so, we can utilize Elasticsearch to efficiently retrieve similar images based on similarities between encoded string tokens. As part of the development, we propose a novel vector-to-string encoding method, which is shown to substantially outperform previous ones in terms of both precision and latency. First-hand experiences in implementing this Elasticsearch-based platform are extensively addressed, which should be valuable to practitioners also interested in building a visual search engine on top of Elasticsearch. |
Tasks | Content-Based Image Retrieval, Image Retrieval |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.08896v3 |
| http://arxiv.org/pdf/1806.08896v3.pdf |
PWC | https://paperswithcode.com/paper/towards-practical-visual-search-engine-within |
Repo | |
Framework | |
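The core idea, encoding vectors as string tokens so that nearby vectors share tokens, can be sketched with a generic product-quantization-style encoder. This illustrates the principle only; the paper's actual encoding method and token format are not reproduced, and the token strings below are an assumption.

```python
# A sketch of the principle: quantize sub-vectors to cluster ids and emit
# tokens like "sub3_17", so token-set overlap approximates vector proximity.
import numpy as np
from sklearn.cluster import KMeans

def fit_codebooks(vectors, n_sub=8, n_codes=32):
    return [KMeans(n_clusters=n_codes, n_init=4, random_state=0).fit(c)
            for c in np.split(vectors, n_sub, axis=1)]

def to_tokens(vector, codebooks):
    chunks = np.split(vector[None, :], len(codebooks), axis=1)
    return {f"sub{i}_{cb.predict(c)[0]}"
            for i, (cb, c) in enumerate(zip(codebooks, chunks))}

rng = np.random.default_rng(0)
base = rng.normal(size=(5000, 64))                 # image feature vectors
codebooks = fit_codebooks(base)
query = base[0]
near = query + 0.01 * rng.normal(size=64)          # a slightly perturbed copy
print(len(to_tokens(query, codebooks) & to_tokens(near, codebooks)))  # ~8
```

Indexing the tokens as a text field then lets a plain Elasticsearch `match` query rank candidates by shared-token count.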
Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships
Title | Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships |
Authors | Yong Liu, Ruiping Wang, Shiguang Shan, Xilin Chen |
Abstract | Context is important for accurate visual recognition. In this work we propose an object detection algorithm that not only considers object visual appearance, but also makes use of two kinds of context: scene contextual information and object relationships within a single image. Object detection is therefore regarded as both a cognition problem and a reasoning problem when leveraging this structured information. Specifically, this paper formulates object detection as a problem of graph structure inference, where, given an image, the objects are treated as nodes in a graph and relationships between the objects are modeled as edges in that graph. To this end, we present a so-called Structure Inference Network (SIN), a detector that augments a typical detection framework (e.g. Faster R-CNN) with a graphical model which aims to infer object state. Comprehensive experiments on the PASCAL VOC and MS COCO datasets indicate that scene context and object relationships truly improve the performance of object detection, with more desirable and reasonable outputs. |
Tasks | Object Detection |
Published | 2018-06-30 |
URL | http://arxiv.org/abs/1807.00119v1 |
| http://arxiv.org/pdf/1807.00119v1.pdf |
PWC | https://paperswithcode.com/paper/structure-inference-net-object-detection |
Repo | |
Framework | |
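A rough sketch of the structure-inference step, under the assumption of a GRU-based update as the abstract's "graphical model": per-object features are graph nodes, a learned pairwise score weights object-to-object messages, and a global scene feature conditions every node. Dimensions and the edge scoring below are illustrative, not the released architecture.

```python
# Illustrative message passing over detection nodes, not the released model.
import torch
import torch.nn as nn

class StructureInference(nn.Module):
    def __init__(self, dim=256, steps=2):
        super().__init__()
        self.steps = steps
        self.scene_gru = nn.GRUCell(dim, dim)    # node state <- scene context
        self.edge_gru = nn.GRUCell(dim, dim)     # node state <- other objects
        self.edge_score = nn.Linear(2 * dim, 1)  # relationship strength

    def forward(self, nodes, scene):
        h = nodes                                 # (N, dim) object features
        for _ in range(self.steps):
            n = h.size(0)
            pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                               h.unsqueeze(0).expand(n, n, -1)], dim=-1)
            w = torch.softmax(self.edge_score(pairs).squeeze(-1), dim=1)
            messages = w @ h                      # weighted sum over neighbors
            h = self.scene_gru(scene.unsqueeze(0).expand(n, -1), h)
            h = self.edge_gru(messages, h)
        return h  # refined states for the classification/regression heads

sin = StructureInference()
print(sin(torch.randn(5, 256), torch.randn(256)).shape)  # torch.Size([5, 256])
```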
Online Embedding Compression for Text Classification using Low Rank Matrix Factorization
Title | Online Embedding Compression for Text Classification using Low Rank Matrix Factorization |
Authors | Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon |
Abstract | Deep learning models have become the state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low-rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain accuracy loss without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark. |
Tasks | Quantization, Sentence Classification, Text Classification |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00641v1 |
| http://arxiv.org/pdf/1811.00641v1.pdf |
PWC | https://paperswithcode.com/paper/online-embedding-compression-for-text |
Repo | |
Framework | |
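The factorization step itself is standard linear algebra; a minimal sketch with a truncated SVD shows how a vocab-by-dimension embedding matrix factors into two small matrices and what a roughly 90% parameter reduction looks like. The re-training stage and the CALR schedule are not reproduced here.

```python
# A minimal sketch of low-rank embedding compression via truncated SVD.
import numpy as np

def low_rank_compress(E, rank):
    """E: (V, D) embedding matrix -> U (V, rank), W (rank, D);
    the embedding lookup becomes U[token_ids] @ W."""
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]   # singular values folded into U

rng = np.random.default_rng(0)
E = rng.normal(size=(50000, 300))              # trained embedding layer
U_r, W_r = low_rank_compress(E, rank=30)
saved = 1 - (U_r.size + W_r.size) / E.size
print(f"parameters saved: {saved:.1%}")        # ~89.9%
```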
Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings
Title | Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings |
Authors | Gabriel Grand, Idan Asher Blank, Francisco Pereira, Evelina Fedorenko |
Abstract | The words of a language reflect the structure of the human mind, allowing us to transmit thoughts between individuals. However, language can represent only a subset of our rich and detailed cognitive architecture. Here, we ask what kinds of common knowledge (semantic memory) are captured by word meanings (lexical semantics). We examine a prominent computational model that represents words as vectors in a multidimensional space, such that proximity between word-vectors approximates semantic relatedness. Because related words appear in similar contexts, such spaces - called “word embeddings” - can be learned from patterns of lexical co-occurrences in natural language. Despite their popularity, a fundamental concern about word embeddings is that they appear to be semantically “rigid”: inter-word proximity captures only overall similarity, yet human judgments about object similarities are highly context-dependent and involve multiple, distinct semantic features. For example, dolphins and alligators appear similar in size, but differ in intelligence and aggressiveness. Could such context-dependent relationships be recovered from word embeddings? To address this issue, we introduce a powerful, domain-general solution: “semantic projection” of word-vectors onto lines that represent various object features, like size (the line extending from the word “small” to “big”), intelligence (from “dumb” to “smart”), or danger (from “safe” to “dangerous”). This method, which is intuitively analogous to placing objects “on a mental scale” between two extremes, recovers human judgments across a range of object categories and properties. We thus show that word embeddings inherit a wealth of common knowledge from word co-occurrence statistics and can be flexibly manipulated to express context-dependent meanings. |
Tasks | Word Embeddings |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01241v2 |
| http://arxiv.org/pdf/1802.01241v2.pdf |
PWC | https://paperswithcode.com/paper/semantic-projection-recovering-human |
Repo | |
Framework | |
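The method reduces to a dot product with a normalized difference vector, so a short sketch suffices. The toy 2-D vectors and the `vec` lookup below are stand-ins for a real embedding such as GloVe.

```python
# A minimal sketch of semantic projection onto a feature line.
import numpy as np

def semantic_projection(objects, pole_low, pole_high, vec):
    line = vec(pole_high) - vec(pole_low)        # e.g. "small" -> "big"
    line = line / np.linalg.norm(line)
    return {w: float(vec(w) @ line) for w in objects}

toy = {"small": np.array([-1.0, 0.0]), "big": np.array([1.0, 0.0]),
       "dolphin": np.array([0.6, 0.3]), "mouse": np.array([-0.8, 0.2])}
print(semantic_projection(["dolphin", "mouse"], "small", "big", toy.get))
# dolphin lands near the "big" end of the size line, mouse near "small"
```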
Scalable kernel-based variable selection with sparsistency
Title | Scalable kernel-based variable selection with sparsistency |
Authors | Xin He, Junhui Wang, Shaogao Lv |
Abstract | Variable selection is central to high-dimensional data analysis, and various algorithms have been developed. Ideally, a variable selection algorithm should be flexible, scalable, and come with theoretical guarantees, yet most existing algorithms cannot attain these properties at the same time. In this article, a three-step variable selection algorithm is developed, involving kernel-based estimation of the regression function and its gradient functions as well as a hard-thresholding step. Its key advantage is that it makes no explicit model assumption, admits general predictor effects, allows for scalable computation, and attains desirable asymptotic sparsistency. The proposed algorithm can be adapted to any reproducing kernel Hilbert space (RKHS) with different kernel functions, and can be extended to interaction selection with slight modification. Its computational cost is only linear in the data dimension, and can be further improved through parallel computing. The sparsistency of the proposed algorithm is established for general RKHS under mild conditions, including linear and Gaussian kernels as special cases. Its effectiveness is also supported by a variety of simulated and real examples. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09246v2 |
| http://arxiv.org/pdf/1802.09246v2.pdf |
PWC | https://paperswithcode.com/paper/scalable-kernel-based-variable-selection-with |
Repo | |
Framework | |
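A minimal sketch of the three steps under a Gaussian kernel: fit kernel ridge regression, evaluate the closed-form gradient of the fit at the sample points, and hard-threshold the per-variable gradient norms. The bandwidth, regularization and threshold values are illustrative assumptions.

```python
# A sketch of gradient-based variable selection with a Gaussian kernel.
import numpy as np

def gaussian_kernel(X, Y, sigma):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def select_variables(X, y, sigma=1.0, lam=1e-2, tau=0.1):
    n, p = X.shape
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)   # kernel ridge fit
    grad_norm2 = np.zeros(p)
    for j in range(p):
        # d/dx_j of the Gaussian kernel in closed form, evaluated pairwise.
        dK = K * (X[None, :, j] - X[:, None, j]) / sigma ** 2
        grad_norm2[j] = np.mean((dK @ alpha) ** 2)        # avg squared gradient
    return np.where(np.sqrt(grad_norm2) > tau)[0]         # hard thresholding

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=200)
print(select_variables(X, y))  # ideally picks variables 0 and 1
```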
Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification
Title | Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification |
Authors | Yongli Zhu, Chengxi Liu, Kai Sun |
Abstract | This paper presents a study on power grid disturbance classification by Deep Learning (DL). A real synchrophasor set comprising three different types of disturbance events from the Frequency Monitoring Network (FNET) is used. An image embedding technique called the Gramian Angular Field is applied to transform each time series of event data into a two-dimensional image for learning. Two main DL algorithms, i.e. the CNN (Convolutional Neural Network) and the RNN (Recurrent Neural Network), are tested and compared with two widely used data mining tools, the Support Vector Machine and the Decision Tree. The test results demonstrate the superiority of both DL algorithms over the other methods in the application of power system transient disturbance classification. |
Tasks | Time Series |
Published | 2018-12-22 |
URL | http://arxiv.org/abs/1812.09427v1 |
| http://arxiv.org/pdf/1812.09427v1.pdf |
PWC | https://paperswithcode.com/paper/image-embedding-of-pmu-data-for-deep-learning |
Repo | |
Framework | |
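The Gramian Angular Field transform is standard and compact enough to sketch in full: rescale the series to [-1, 1], read the values as cosines of angles, and form the matrix of cosines of summed angles (the GASF variant).

```python
# A minimal sketch of the Gramian Angular (Summation) Field encoding.
import numpy as np

def gramian_angular_field(series):
    x = np.asarray(series, dtype=float)
    # Min-max rescale to [-1, 1] so arccos is defined.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    # GASF entry (i, j) = cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])

t = np.linspace(0, 4 * np.pi, 128)
noise = 0.05 * np.random.default_rng(0).normal(size=128)
img = gramian_angular_field(np.sin(t) + noise)
print(img.shape)  # (128, 128) image, ready for the CNN/RNN classifiers
```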
Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks
Title | Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks |
Authors | Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro |
Abstract | Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities resulting in a tighter generalization bound for two layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes, and could potentially explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks. |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.12076v1 |
| http://arxiv.org/pdf/1805.12076v1.pdf |
PWC | https://paperswithcode.com/paper/towards-understanding-the-role-of-over |
Repo | |
Framework | |
MEMOIR: Multi-class Extreme Classification with Inexact Margin
Title | MEMOIR: Multi-class Extreme Classification with Inexact Margin |
Authors | Anton Belyy, Aleksei Sholokhov |
Abstract | Multi-class classification with a very large number of classes, or extreme classification, is a challenging problem from both statistical and computational perspectives. Most of the classical approaches to multi-class classification, including one-vs-rest or multi-class support vector machines, require the exact estimation of the classifier’s margin at both the training and the prediction steps, making them intractable in extreme classification scenarios. In this paper, we study the impact of computing an approximate margin using approximate nearest neighbor (ANN) search structures combined with locality-sensitive hashing (LSH). This approximation dramatically reduces both the training and the prediction time without a significant loss in performance. We theoretically prove that this approximation does not lead to a significant increase in the risk of the model, and provide empirical evidence over five publicly available large-scale datasets showing that the proposed approach is highly competitive with state-of-the-art approaches in terms of time, memory and performance measures. |
Tasks | |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09863v1 |
| http://arxiv.org/pdf/1811.09863v1.pdf |
PWC | https://paperswithcode.com/paper/memoir-multi-class-extreme-classification |
Repo | |
Framework | |
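A minimal sketch of the approximate-margin idea using generic random-hyperplane LSH over per-class weight vectors: at prediction time only the classes sharing the query's hash bucket are scored exactly. The paper's precise index structure may differ, and a production version would use several hash tables to raise recall.

```python
# A sketch of LSH-bucketed class scoring for extreme classification.
import numpy as np
from collections import defaultdict

class LSHClassIndex:
    def __init__(self, class_weights, n_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        self.W = class_weights                            # (C, D), one row per class
        self.planes = rng.normal(size=(n_bits, class_weights.shape[1]))
        self.buckets = defaultdict(list)
        for c, w in enumerate(class_weights):
            self.buckets[self._hash(w)].append(c)

    def _hash(self, x):
        return tuple((self.planes @ x > 0).astype(int))   # sign pattern

    def predict(self, x):
        candidates = self.buckets.get(self._hash(x), [])
        if not candidates:                                # fall back to exact scan
            candidates = range(len(self.W))
        # Exact margins computed only for the candidate classes.
        return max(candidates, key=lambda c: self.W[c] @ x)

rng = np.random.default_rng(1)
W = rng.normal(size=(10000, 64))                          # 10k classes
index = LSHClassIndex(W)
x = W[1234] + 0.05 * rng.normal(size=64)                  # query near class 1234
print(index.predict(x))
```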
Utilizing Complex-valued Network for Learning to Compare Image Patches
Title | Utilizing Complex-valued Network for Learning to Compare Image Patches |
Authors | Siwen Jiang, Wenxuan Wei, Shihao Guo, Hongguang Fu, Lei Huang |
Abstract | The great achievements of convolutional neural networks (CNNs) in feature and metric learning have attracted many researchers. However, the vast majority of deep network architectures are based on real-valued representations. Research on complex-valued networks has received little attention, owing to the absence of effective models and of a suitable distance for complex-valued vectors. Motivated by recent works showing that complex vectors have a richer representational capacity and that efficient complex-valued building blocks are available, we propose a new approach for learning image descriptors with complex numbers to compare image patches. We also propose a new architecture that learns the image similarity function directly on top of a complex-valued network. We show that our models achieve competitive results on benchmark datasets. We make the source code of our models publicly available. |
Tasks | Metric Learning |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12035v2 |
| http://arxiv.org/pdf/1811.12035v2.pdf |
PWC | https://paperswithcode.com/paper/utilizing-complex-valued-network-for-learning |
Repo | |
Framework | |
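A minimal sketch of the basic complex-valued building block, realized with real tensors via $(A+iB)(x+iy) = (Ax-By) + i(Bx+Ay)$. The full patch-comparison architecture is not reproduced, and the Hermitian-inner-product similarity at the end is one natural choice of distance, not necessarily the paper's.

```python
# A complex-valued linear layer built from two real-valued linear maps.
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.A = nn.Linear(d_in, d_out, bias=False)  # real part of the weights
        self.B = nn.Linear(d_in, d_out, bias=False)  # imaginary part

    def forward(self, x_re, x_im):
        out_re = self.A(x_re) - self.B(x_im)
        out_im = self.B(x_re) + self.A(x_im)
        return out_re, out_im

layer = ComplexLinear(128, 64)
re, im = layer(torch.randn(4, 128), torch.randn(4, 128))
# Magnitude of the Hermitian inner product between two complex descriptors.
sim = ((re[0] @ re[1] + im[0] @ im[1]) ** 2
       + (re[0] @ im[1] - im[0] @ re[1]) ** 2).sqrt()
print(re.shape, im.shape, float(sim))
```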
A Coordinate-Free Construction of Scalable Natural Gradient
Title | A Coordinate-Free Construction of Scalable Natural Gradient |
Authors | Kevin Luk, Roger Grosse |
Abstract | Most neural networks are trained using first-order optimization methods, which are sensitive to the parameterization of the model. Natural gradient descent is invariant to smooth reparameterizations because it is defined in a coordinate-free way, but tractable approximations are typically defined in terms of coordinate systems, and hence may lose the invariance properties. We analyze the invariance properties of the Kronecker-Factored Approximate Curvature (K-FAC) algorithm by constructing the algorithm in a coordinate-free way. We explicitly construct a Riemannian metric under which the natural gradient matches the K-FAC update; invariance to affine transformations of the activations follows immediately. We extend our framework to analyze the invariance properties of K-FAC applied to convolutional networks and recurrent neural networks, as well as metrics other than the usual Fisher metric. |
Tasks | |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10340v1 |
| http://arxiv.org/pdf/1808.10340v1.pdf |
PWC | https://paperswithcode.com/paper/a-coordinate-free-construction-of-scalable |
Repo | |
Framework | |
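For context, a minimal sketch of the K-FAC update the paper analyzes, for a single fully connected layer: the Fisher block is approximated by a Kronecker product of the input second moment A and the backpropagated-gradient second moment G, so preconditioning reduces to two small inverses instead of one huge one. The damping and learning rate are illustrative.

```python
# A sketch of a K-FAC preconditioned step for one fully connected layer.
import numpy as np

def kfac_step(W_grad, acts, grads_out, damping=1e-2, lr=0.1):
    """W_grad: (d_out, d_in) loss gradient w.r.t. the weights.
    acts: (batch, d_in) layer inputs; grads_out: (batch, d_out)
    gradients w.r.t. the layer's pre-activations."""
    n = acts.shape[0]
    A = acts.T @ acts / n + damping * np.eye(acts.shape[1])
    G = grads_out.T @ grads_out / n + damping * np.eye(grads_out.shape[1])
    # (G kron A)^{-1} vec(W_grad)  ==  G^{-1} W_grad A^{-1}
    precond = np.linalg.solve(G, W_grad) @ np.linalg.inv(A)
    return -lr * precond  # weight update

rng = np.random.default_rng(0)
acts, grads_out = rng.normal(size=(32, 20)), rng.normal(size=(32, 10))
W_grad = grads_out.T @ acts / 32
print(kfac_step(W_grad, acts, grads_out).shape)  # (10, 20)
```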
Deep Reinforcement Learning for Dynamic Urban Transportation Problems
Title | Deep Reinforcement Learning for Dynamic Urban Transportation Problems |
Authors | Laura Schultz, Vadim Sokolov |
Abstract | We explore the use of deep learning and deep reinforcement learning for optimization problems in transportation. Many transportation system analysis tasks are formulated as optimization problems, such as optimal control problems in intelligent transportation systems and long-term urban planning. Often, the transportation models used to represent the dynamics of a transportation system involve large data sets with complex input-output interactions and are difficult to use in the context of optimization. Deep learning metamodels can produce a lower-dimensional representation of those relations and make it possible to implement optimization and reinforcement learning algorithms efficiently. In particular, we develop deep learning models for calibrating transportation simulators, and apply reinforcement learning to solve the problem of optimally scheduling travelers on the network. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05310v1 |
| http://arxiv.org/pdf/1806.05310v1.pdf |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-dynamic-urban |
Repo | |
Framework | |
On the Dual Geometry of Laplacian Eigenfunctions
Title | On the Dual Geometry of Laplacian Eigenfunctions |
Authors | Alexander Cloninger, Stefan Steinerberger |
Abstract | We discuss the geometry of Laplacian eigenfunctions $-\Delta \phi = \lambda \phi$ on compact manifolds $(M,g)$ and combinatorial graphs $G=(V,E)$. The ‘dual’ geometry of Laplacian eigenfunctions is well understood on $\mathbb{T}^d$ (identified with $\mathbb{Z}^d$) and $\mathbb{R}^n$ (which is self-dual). The dual geometry plays a tremendous role in various fields of pure and applied mathematics. The purpose of our paper is to point out a notion of similarity between eigenfunctions that allows one to reconstruct that geometry. Our measure of ‘similarity’ $\alpha(\phi_{\lambda}, \phi_{\mu})$ between eigenfunctions $\phi_{\lambda}$ and $\phi_{\mu}$ is given by a global average of local correlations $$ \alpha(\phi_{\lambda}, \phi_{\mu})^2 = \| \phi_{\lambda} \phi_{\mu} \|_{L^2}^{-2}\int_{M}{ \left( \int_{M}{ p(t,x,y)( \phi_{\lambda}(y) - \phi_{\lambda}(x))( \phi_{\mu}(y) - \phi_{\mu}(x)) dy} \right)^2 dx},$$ where $p(t,x,y)$ is the classical heat kernel and $e^{-t \lambda} + e^{-t \mu} = 1$. This notion recovers all classical notions of duality but is equally applicable to other (rough) geometries and graphs; many numerical examples in different continuous and discrete settings illustrate the result. |
Tasks | |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09816v1 |
| http://arxiv.org/pdf/1804.09816v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-dual-geometry-of-laplacian |
Repo | |
Framework | |
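The similarity measure can be evaluated directly on a graph by discretizing the integrals with sums. A worked sketch follows, assuming a cycle graph as the test geometry, the matrix exponential of the graph Laplacian as the heat kernel, and a root-find for the $t$ satisfying $e^{-t\lambda} + e^{-t\mu} = 1$.

```python
# A discrete evaluation of alpha(phi_lambda, phi_mu) on a combinatorial graph.
import numpy as np
from scipy.linalg import expm
from scipy.optimize import brentq

def alpha(phi_l, phi_m, lam, mu, L):
    # Pick t with exp(-t*lam) + exp(-t*mu) = 1 (requires lam, mu > 0).
    t = brentq(lambda s: np.exp(-s * lam) + np.exp(-s * mu) - 1, 1e-9, 1e6)
    P = expm(-t * L)                              # heat kernel p(t, x, y)
    dl = phi_l[None, :] - phi_l[:, None]          # phi_lambda(y) - phi_lambda(x)
    dm = phi_m[None, :] - phi_m[:, None]
    inner = (P * dl * dm).sum(axis=1)             # inner integral over y
    return np.sqrt((inner ** 2).sum()) / np.linalg.norm(phi_l * phi_m)

# Cycle graph on 40 vertices; the constant eigenvector (lambda = 0) is skipped.
n = 40
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A
w, V = np.linalg.eigh(L)
print(alpha(V[:, 1], V[:, 2], w[1], w[2], L))
```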
Review Helpfulness Prediction with Embedding-Gated CNN
Title | Review Helpfulness Prediction with Embedding-Gated CNN |
Authors | Cen Chen, Minghui Qiu, Yinfei Yang, Jun Zhou, Jun Huang, Xiaolong Li, Forrest Bao |
Abstract | Product reviews, predominantly in textual form, significantly help consumers finalize their purchasing decisions. It is therefore important for e-commerce companies to predict review helpfulness so as to present and recommend reviews in a more informative manner. In this work, we introduce a convolutional neural network model that is able to extract abstract features from multi-granularity representations. Inspired by the fact that different words contribute differently to the meaning of a sentence, we learn word-level embedding gates for all the representations. Furthermore, it is common that some product domains/categories have rich user reviews while others do not. To help domains with insufficient data, we integrate our model into a cross-domain relationship learning framework that effectively transfers knowledge from other domains. Extensive experiments show that our model yields better performance than the existing methods. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09896v1 |
| http://arxiv.org/pdf/1808.09896v1.pdf |
PWC | https://paperswithcode.com/paper/review-helpfulness-prediction-with-embedding |
Repo | |
Framework | |
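A minimal sketch of a word-level embedding gate: one learned scalar per vocabulary entry, squashed through a sigmoid and multiplied into the embedding before the convolutional layers. The single-granularity setup and gate parameterization here simplify the paper's multi-granularity model.

```python
# An illustrative embedding-gated input layer for a text CNN.
import torch
import torch.nn as nn

class GatedEmbedding(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.gate = nn.Embedding(vocab_size, 1)   # one scalar gate per word

    def forward(self, token_ids):
        e = self.emb(token_ids)                   # (batch, seq, dim)
        g = torch.sigmoid(self.gate(token_ids))   # (batch, seq, 1)
        return g * e   # down-weight words that matter less for helpfulness

layer = GatedEmbedding(vocab_size=30000, dim=128)
out = layer(torch.randint(0, 30000, (8, 50)))
print(out.shape)  # torch.Size([8, 50, 128]) -> feed to Conv1d over the sequence
```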
Deep Asymmetric Networks with a Set of Node-wise Variant Activation Functions
Title | Deep Asymmetric Networks with a Set of Node-wise Variant Activation Functions |
Authors | Jinhyeok Jang, Hyunjoong Cho, Jaehong Kim, Jaeyeon Lee, Seungjoon Yang |
Abstract | This work presents deep asymmetric networks with a set of node-wise variant activation functions. The nodes’ sensitivities are affected by activation function selections such that the nodes with smaller indices become increasingly more sensitive. As a result, features learned by the nodes are sorted by the node indices in the order of their importance. Asymmetric networks not only learn input features but also the importance of those features. Nodes of lesser importance in asymmetric networks can be pruned to reduce the complexity of the networks, and the pruned networks can be retrained without incurring performance losses. We validate the feature-sorting property using both shallow and deep asymmetric networks as well as deep asymmetric networks transferred from famous networks. |
Tasks | |
Published | 2018-09-11 |
URL | https://arxiv.org/abs/1809.03721v2 |
| https://arxiv.org/pdf/1809.03721v2.pdf |
PWC | https://paperswithcode.com/paper/deep-asymmetric-networks-with-a-set-of-node |
Repo | |
Framework | |
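A minimal sketch of node-wise variant activations: each node's response is scaled by an index-dependent sensitivity that decays with the node index, so low-index nodes carry the important features and high-index nodes can be pruned without retraining from scratch. The geometric decay schedule below is an illustrative assumption, not the paper's exact construction.

```python
# An illustrative asymmetric layer with index-dependent activation slopes.
import torch
import torch.nn as nn

class AsymmetricLayer(nn.Module):
    def __init__(self, d_in, d_out, decay=0.9):
        super().__init__()
        self.decay = decay
        self.linear = nn.Linear(d_in, d_out)
        # Fixed, index-dependent sensitivities: node 0 is most sensitive.
        self.register_buffer(
            "slopes", decay ** torch.arange(d_out, dtype=torch.float))

    def forward(self, x):
        return torch.relu(self.linear(x)) * self.slopes

    def pruned(self, keep):
        """Keep only the `keep` most important (lowest-index) nodes."""
        layer = AsymmetricLayer(self.linear.in_features, keep, self.decay)
        layer.linear.weight.data = self.linear.weight.data[:keep].clone()
        layer.linear.bias.data = self.linear.bias.data[:keep].clone()
        return layer

layer = AsymmetricLayer(64, 32)
small = layer.pruned(8)                  # drop the 24 least important nodes
print(small(torch.randn(4, 64)).shape)   # torch.Size([4, 8])
```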