Paper Group ANR 534
Visual Space Optimization for Zero-shot Learning
Title | Visual Space Optimization for Zero-shot Learning |
Authors | Xinsheng Wang, Shanmin Pang, Jihua Zhu, Zhongyu Li, Zhiqiang Tian, Yaochen Li |
Abstract | Zero-shot learning, which aims to recognize new categories that are not included in the training set, has gained popularity owing to its potential in real-world applications. Zero-shot learning models rely on learning an embedding space, where both semantic descriptions of classes and visual features of instances can be embedded for nearest neighbor search. Recently, most existing works consider the visual space formulated by deep visual features as an ideal choice of the embedding space. However, the discrete distribution of instances in the visual space makes the data structure unremarkable. We argue that optimizing the visual space is crucial as it allows semantic vectors to be embedded into the visual space more effectively. In this work, we propose two strategies to accomplish this purpose. One is the visual prototype based method, which learns a visual prototype for each visual class, so that, in the visual space, a class can be represented by a prototype feature instead of a series of discrete visual features. The other is to optimize the visual feature structure in an intermediate embedding space; for this we devise a multilayer perceptron based algorithm that learns the common intermediate embedding space while making the visual data structure more distinctive. Through extensive experimental evaluation on four benchmark datasets, we demonstrate that optimizing the visual space is beneficial for zero-shot learning. In addition, the proposed prototype based method achieves new state-of-the-art performance. |
Tasks | Zero-Shot Learning |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00330v1 |
https://arxiv.org/pdf/1907.00330v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-space-optimization-for-zero-shot |
Repo | |
Framework | |
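The prototype idea in the abstract above can be illustrated with a minimal sketch. It assumes each prototype is simply the mean of a class's feature vectors and that matching uses Euclidean distance; the paper learns its prototypes and maps semantic vectors into the visual space, which is not reproduced here. The function names `class_prototypes` and `nearest_prototype` are hypothetical.

```python
import math

def class_prototypes(features, labels):
    """Average the feature vectors of each class into one prototype.

    Assumption: a mean vector stands in for the learned prototype.
    """
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def nearest_prototype(x, prototypes):
    """Return the label whose prototype is closest to x (Euclidean distance)."""
    return min(prototypes, key=lambda y: math.dist(x, prototypes[y]))

# Toy 2-D features for two classes (made-up numbers, for illustration only).
features = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["cat", "cat", "dog", "dog"]
prototypes = class_prototypes(features, labels)
print(nearest_prototype([1.0, 1.0], prototypes))
```

Representing a class by one vector turns the nearest neighbor search into a comparison against one point per class rather than against every instance.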
Semantics to Space(S2S): Embedding semantics into spatial space for zero-shot verb-object query inferencing
Title | Semantics to Space(S2S): Embedding semantics into spatial space for zero-shot verb-object query inferencing |
Authors | Sungmin Eum, Heesung Kwon |
Abstract | We present a novel deep zero-shot learning (ZSL) model for inferencing human-object-interaction with verb-object (VO) query. While the previous two-stream ZSL approaches only use the semantic/textual information to be fed into the query stream, we seek to incorporate and embed the semantics into the visual representation stream as well. Our approach is powered by Semantics-to-Space (S2S) architecture where semantics derived from the residing objects are embedded into a spatial space of the visual stream. This architecture allows the co-capturing of the semantic attributes of the human and the objects along with their location/size/silhouette information. To validate, we have constructed a new dataset, Verb-Transferability 60 (VT60). VT60 provides 60 different VO pairs with overlapping verbs tailored for testing two-stream ZSL approaches with VO query. Experimental evaluations show that our approach not only outperforms the state-of-the-art, but also shows the capability of consistently improving performance regardless of which ZSL baseline architecture is used. |
Tasks | Human-Object Interaction Detection, Zero-Shot Learning |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05894v2 |
https://arxiv.org/pdf/1906.05894v2.pdf | |
PWC | https://paperswithcode.com/paper/semantics-to-spaces2s-embedding-semantics |
Repo | |
Framework | |
Degrees of Freedom Analysis of Unrolled Neural Networks
Title | Degrees of Freedom Analysis of Unrolled Neural Networks |
Authors | Morteza Mardani, Qingyun Sun, Vardan Papyan, Shreyas Vasanawala, John Pauly, David Donoho |
Abstract | Unrolled neural networks emerged recently as an effective model for learning inverse maps appearing in image restoration tasks. However, their generalization risk (i.e., test mean-squared-error) and its link to network design and train sample size remain mysterious. Leveraging Stein’s Unbiased Risk Estimator (SURE), this paper analyzes the generalization risk with its bias and variance components for recurrent unrolled networks. We particularly investigate the degrees-of-freedom (DOF) component of SURE, the trace of the end-to-end network Jacobian, to quantify the prediction variance. We prove that DOF is well-approximated by the weighted path sparsity of the network under incoherence conditions on the trained weights. Empirically, we examine the SURE components as a function of train sample size for both recurrent and non-recurrent (with many more parameters) unrolled networks. Our key observations indicate that: 1) DOF increases with train sample size and converges to the generalization risk for both recurrent and non-recurrent schemes; 2) the recurrent network converges significantly faster (with fewer train samples) than the non-recurrent scheme, hence recurrence serves as a regularization for low sample size regimes. |
Tasks | Image Restoration |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.03742v1 |
https://arxiv.org/pdf/1906.03742v1.pdf | |
PWC | https://paperswithcode.com/paper/degrees-of-freedom-analysis-of-unrolled |
Repo | |
Framework | |
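To make the DOF term in the abstract above concrete, here is a minimal sketch that computes the trace of a map's Jacobian by finite differences and plugs it into the standard SURE formula for Gaussian noise with known standard deviation. A soft-thresholding denoiser stands in for the unrolled network; that substitution, and the names `soft_threshold`, `degrees_of_freedom`, and `sure`, are assumptions for illustration only.

```python
def soft_threshold(y, t):
    """Shrink each entry toward zero by t (a simple stand-in denoiser)."""
    return [max(abs(v) - t, 0.0) * (1 if v > 0 else -1) for v in y]

def degrees_of_freedom(f, y, eps=1e-6):
    """Trace of the Jacobian of f at y, via forward finite differences."""
    base = f(y)
    trace = 0.0
    for i in range(len(y)):
        bumped = list(y)
        bumped[i] += eps
        trace += (f(bumped)[i] - base[i]) / eps
    return trace

def sure(f, y, sigma):
    """SURE estimate of the total squared error E||f(y) - x||^2.

    Assumes y = x + Gaussian noise with known standard deviation sigma.
    """
    n = len(y)
    residual = sum((a - b) ** 2 for a, b in zip(f(y), y))
    return residual - n * sigma ** 2 + 2 * sigma ** 2 * degrees_of_freedom(f, y)

# Made-up noisy observation; the threshold value is arbitrary.
y = [3.0, -0.5, 1.2, -2.0]
denoise = lambda v: soft_threshold(v, 1.0)
print(degrees_of_freedom(denoise, y), sure(denoise, y, sigma=1.0))
```

For soft thresholding the DOF equals the number of entries that survive the threshold, which gives a quick sanity check on the finite-difference trace.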
Control-Tutored Reinforcement Learning
Title | Control-Tutored Reinforcement Learning |
Authors | Francesco De Lellis, Fabrizia Auletta, Giovanni Russo, Piero De Lellis, Mario di Bernardo |
Abstract | We introduce a control-tutored reinforcement learning (CTRL) algorithm. The idea is to enhance tabular learning algorithms so as to improve the exploration of the state-space, and substantially reduce learning times by leveraging some limited knowledge of the plant encoded into a tutoring model-based control strategy. We illustrate the benefits of our novel approach and its effectiveness by using the problem of controlling one or more agents to herd and contain within a goal region a set of target free-roving agents in the plane. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06085v1 |
https://arxiv.org/pdf/1912.06085v1.pdf | |
PWC | https://paperswithcode.com/paper/control-tutored-reinforcement-learning |
Repo | |
Framework | |
Joint Concept Matching based Learning for Zero-Shot Recognition
Title | Joint Concept Matching based Learning for Zero-Shot Recognition |
Authors | Wen Tang, Ashkan Panahi, Hamid Krim |
Abstract | Zero-shot learning (ZSL), which aims to recognize unseen object classes by only training on seen object classes, has increasingly been of great interest in machine learning and has registered some successes. Most existing ZSL methods typically learn a projection map between the visual feature space and the semantic space, and are prone to a projection domain shift, primarily due to a large domain gap between seen and unseen classes. In this paper, we propose a novel inductive ZSL model based on projecting both visual and semantic features into a common distinct latent space with class-specific knowledge, and on reconstructing both visual and semantic features by such a distinct common space to narrow the domain shift gap. We show that all these constraints on the latent space, class-specific knowledge, reconstruction of features and their combinations enhance the robustness against the projection domain shift problem, and improve the generalization ability to unseen object classes. Comprehensive experiments on four benchmark datasets demonstrate that our proposed method is superior to state-of-the-art algorithms. |
Tasks | Zero-Shot Learning |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05879v3 |
https://arxiv.org/pdf/1906.05879v3.pdf | |
PWC | https://paperswithcode.com/paper/joint-concept-matching-space-projection |
Repo | |
Framework | |
Forget the Learning Rate, Decay Loss
Title | Forget the Learning Rate, Decay Loss |
Authors | Jiakai Wei |
Abstract | In the usual deep neural network optimization process, the learning rate is the most important hyperparameter, which greatly affects the final convergence. The purpose of the learning rate is to control the step size and gradually reduce the impact of noise on the network. In this paper, we use a fixed learning rate with a method of decaying the loss to control the magnitude of the update. We used image classification, semantic segmentation, and GANs to verify this method. Experiments show that the loss decay strategy can greatly improve the performance of the model. |
Tasks | Image Classification, Semantic Segmentation |
Published | 2019-04-27 |
URL | http://arxiv.org/abs/1905.00094v1 |
http://arxiv.org/pdf/1905.00094v1.pdf | |
PWC | https://paperswithcode.com/paper/190500094 |
Repo | |
Framework | |
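A minimal sketch of the loss-decay idea from the abstract above, under stated assumptions: plain gradient descent on a one-dimensional quadratic, with an exponential decay factor applied to the loss. The schedule, the function name `sgd_with_loss_decay`, and the toy objective are hypothetical and not taken from the paper.

```python
def sgd_with_loss_decay(w, grad_fn, lr=0.1, decay=0.9, steps=20):
    """Gradient descent with a fixed learning rate and a decaying loss scale.

    Scaling the loss by `scale` scales its gradient by the same factor,
    so the update magnitude shrinks even though `lr` never changes.
    """
    for t in range(steps):
        scale = decay ** t          # assumed exponential schedule
        w -= lr * scale * grad_fn(w)  # gradient of (scale * loss)
    return w

# Toy objective: loss(w) = (w - 3)^2, so grad(w) = 2 * (w - 3).
grad = lambda w: 2.0 * (w - 3.0)
print(sgd_with_loss_decay(0.0, grad))
```

For plain gradient descent this is equivalent to a step size of `lr * scale`. With optimizers that normalize by gradient magnitude, such as Adam, a constant factor on the loss largely cancels out, so the equivalence shown here should not be assumed to carry over.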
Network Elastic Net for Identifying Smoking specific gene expression for lung cancer
Title | Network Elastic Net for Identifying Smoking specific gene expression for lung cancer |
Authors | Avinash Barnwal |
Abstract | Survival time in months for non-small cell lung cancer patients depends on which stage of lung cancer is present. Our aim is to identify smoking specific gene expression biomarkers in the prognosis of lung cancer patients. In this paper, we introduce the network elastic net, a generalization of network lasso that allows for simultaneous clustering and regression on graphs. In the network elastic net, we form the network by linking patients with similar numbers of cigarettes smoked per year. We then find suitable clusters among patients based on the coefficients of genes with different survival month structures and show the efficacy of the clusters using stage enrichment. This can be used to identify the stage of cancer using gene expression and smoking behavior of patients without doing any tests. |
Tasks | |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11833v1 |
https://arxiv.org/pdf/1908.11833v1.pdf | |
PWC | https://paperswithcode.com/paper/network-elastic-net-for-identifying-smoking |
Repo | |
Framework | |
A Natural Language-Inspired Multi-label Video Streaming Traffic Classification Method Based on Deep Neural Networks
Title | A Natural Language-Inspired Multi-label Video Streaming Traffic Classification Method Based on Deep Neural Networks |
Authors | Yan Shi, Dezhi Feng, Subir Biswas |
Abstract | This paper presents a deep-learning based traffic classification method for identifying multiple streaming video sources at the same time within an encrypted tunnel. The work defines a novel feature inspired by Natural Language Processing (NLP) that allows existing NLP techniques to help the traffic classification. The feature extraction method is described, and a large dataset containing video streaming and web traffic is created to verify its effectiveness. Results are obtained by applying several NLP methods to show that the proposed method performs well on both binary and multilabel traffic classification problems. We also show the ability to achieve zero-shot learning with the proposed method. |
Tasks | Zero-Shot Learning |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.02679v1 |
https://arxiv.org/pdf/1906.02679v1.pdf | |
PWC | https://paperswithcode.com/paper/190602679 |
Repo | |
Framework | |
BGADAM: Boosting based Genetic-Evolutionary ADAM for Convolutional Neural Network Optimization
Title | BGADAM: Boosting based Genetic-Evolutionary ADAM for Convolutional Neural Network Optimization |
Authors | Jiyang Bai, Jiawei Zhang |
Abstract | Among various optimization algorithms, ADAM can achieve outstanding performance and has been widely used in model learning. ADAM has the advantages of fast convergence with both momentum and adaptive learning rate. For deep neural network learning problems, since their objective functions are nonconvex, ADAM can also get stuck in local optima easily. To resolve such a problem, the genetic evolutionary ADAM (GADAM) algorithm, which combines ADAM and the genetic algorithm, was introduced in recent years. To further maximize the advantages of the GADAM model, we propose to implement the boosting strategy for unit model training in GADAM. In this paper, we introduce a novel optimization algorithm, namely Boosting based GADAM (BGADAM). We will show that adding the boosting strategy to the GADAM model helps unit models jump out of local optima and converge to better solutions. |
Tasks | |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1908.08015v1 |
https://arxiv.org/pdf/1908.08015v1.pdf | |
PWC | https://paperswithcode.com/paper/bgadam-boosting-based-genetic-evolutionary |
Repo | |
Framework | |
Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
Title | Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks |
Authors | Guodong Zhang, James Martens, Roger Grosse |
Abstract | Natural gradient descent has proven effective at mitigating the effects of pathological curvature in neural network optimization, but little is known theoretically about its convergence properties, especially for nonlinear networks. In this work, we analyze for the first time the speed of convergence of natural gradient descent on nonlinear neural networks with squared-error loss. We identify two conditions which guarantee efficient convergence from random initializations: (1) the Jacobian matrix (of the network’s output for all training cases with respect to the parameters) has full row rank, and (2) the Jacobian matrix is stable for small perturbations around the initialization. For two-layer ReLU neural networks, we prove that these two conditions do in fact hold throughout the training, under the assumptions of nondegenerate inputs and overparameterization. We further extend our analysis to more general loss functions. Lastly, we show that K-FAC, an approximate natural gradient descent method, also converges to global minima under the same assumptions, and we give a bound on the rate of this convergence. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10961v2 |
https://arxiv.org/pdf/1905.10961v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-convergence-of-natural-gradient-descent |
Repo | |
Framework | |
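The role of the full-row-rank Jacobian in the abstract above can be seen in a minimal sketch. It assumes a linear model, so the Jacobian is just the data matrix X, with two training cases and three parameters (overparameterized). It uses the pseudo-inverse form of the step, theta - lr * X^T (X X^T)^{-1} (X theta - y), which requires X X^T to be invertible. The helper names and toy data are hypothetical, and the nonlinear analysis in the paper is not reproduced.

```python
def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def solve_2x2(G, r):
    """Solve G @ alpha = r for a 2x2 matrix G using the explicit inverse."""
    (a, b), (c, d) = G
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]

def natural_gradient_step(theta, X, y, lr=1.0):
    """One step theta - lr * X^T (X X^T)^{-1} (X theta - y).

    Assumes exactly two training cases and a full-row-rank X.
    """
    residual = [p - t for p, t in zip(matvec(X, theta), y)]
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]
    alpha = solve_2x2(gram, residual)
    update = matvec([list(col) for col in zip(*X)], alpha)  # X^T @ alpha
    return [t - lr * u for t, u in zip(theta, update)]

# Two training cases, three parameters (made-up data).
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
y = [2.0, 3.0]
theta = natural_gradient_step([0.0, 0.0, 0.0], X, y)
print(matvec(X, theta))
```

In this linear case a single step with `lr=1.0` fits the training targets exactly, because the Gram matrix X X^T is invertible whenever X has full row rank.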
Predicting origin-destination ride-sourcing demand with a spatio-temporal encoder-decoder residual multi-graph convolutional network
Title | Predicting origin-destination ride-sourcing demand with a spatio-temporal encoder-decoder residual multi-graph convolutional network |
Authors | Jintao Ke, Xiaoran Qin, Hai Yang, Zhengfei Zheng, Zheng Zhu, Jieping Ye |
Abstract | With the rapid development of mobile-internet technologies, on-demand ride-sourcing services have become increasingly popular and largely reshaped the way people travel. Demand prediction is one of the most fundamental components in supply-demand management systems of ride-sourcing platforms. With accurate short-term prediction for origin-destination (OD) demand, the platforms can make precise and timely decisions on real-time matching, idle vehicle reallocations, ride-sharing vehicle routing, etc. Compared to zone-based demand prediction, which has been examined by many previous studies, OD-based demand prediction is more challenging. This is mainly due to the complicated spatial and temporal dependencies among the demand of different OD pairs. To overcome this challenge, we propose the Spatio-Temporal Encoder-Decoder Residual Multi-Graph Convolutional network (ST-ED-RMGC), a novel deep learning model for predicting ride-sourcing demand of various OD pairs. Firstly, the model constructs OD graphs, which utilize adjacency matrices to characterize the non-Euclidean pair-wise geographical and semantic correlations among different OD pairs. Secondly, based on the constructed graphs, a residual multi-graph convolutional (RMGC) network is designed to encode the context-aware spatial dependencies, and a long short-term memory (LSTM) network is used to encode the temporal dependencies, into a dense vector space. Finally, we reuse the RMGC networks to decode the compressed vector back to OD graphs and predict the future OD demand. Through extensive experiments on the for-hire-vehicles datasets in Manhattan, New York City, we show that our proposed deep learning framework outperforms the state of the art by a significant margin. |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.09103v1 |
https://arxiv.org/pdf/1910.09103v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-origin-destination-ride-sourcing |
Repo | |
Framework | |
CGaP: Continuous Growth and Pruning for Efficient Deep Learning
Title | CGaP: Continuous Growth and Pruning for Efficient Deep Learning |
Authors | Xiaocong Du, Zheng Li, Yu Cao |
Abstract | Today a canonical approach to reduce the computation cost of Deep Neural Networks (DNNs) is to pre-define an over-parameterized model before training to guarantee the learning capacity, and then prune unimportant learning units (filters and neurons) during training to improve model compactness. We argue it is unnecessary to introduce redundancy at the beginning of the training but then reduce redundancy for the ultimate inference model. In this paper, we propose a Continuous Growth and Pruning (CGaP) scheme to minimize the redundancy from the beginning. CGaP starts the training from a small network seed, then expands the model continuously by reinforcing important learning units, and finally prunes the network to obtain a compact and accurate model. As the growth phase favors important learning units, CGaP provides a clear learning purpose to the pruning phase. Experimental results on representative datasets and DNN architectures demonstrate that CGaP outperforms previous pruning-only approaches that deal with pre-defined structures. For VGG-19 on the CIFAR-100 and SVHN datasets, CGaP reduces the number of parameters by 78.9% and 85.8%, and FLOPs by 53.2% and 74.2%, respectively; for ResNet-110 on CIFAR-10, CGaP reduces the number of parameters by 64.0% and FLOPs by 63.3%. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11533v2 |
https://arxiv.org/pdf/1905.11533v2.pdf | |
PWC | https://paperswithcode.com/paper/cgap-continuous-growth-and-pruning-for |
Repo | |
Framework | |
ViTOR: Learning to Rank Webpages Based on Visual Features
Title | ViTOR: Learning to Rank Webpages Based on Visual Features |
Authors | Bram van den Akker, Ilya Markov, Maarten de Rijke |
Abstract | The visual appearance of a webpage carries valuable information about its quality and can be used to improve the performance of learning to rank (LTR). We introduce the Visual learning TO Rank (ViTOR) model that integrates state-of-the-art visual feature extraction methods by (i) transfer learning from a pre-trained image classification model, and (ii) synthetic saliency heat maps generated from webpage snapshots. Since there is currently no public dataset for the task of LTR with visual features, we also introduce and release the ViTOR dataset, containing visually rich and diverse webpages. The ViTOR dataset consists of visual snapshots, non-visual features and relevance judgments for ClueWeb12 webpages and TREC Web Track queries. We experiment with the proposed ViTOR model on the ViTOR dataset and show that it significantly improves the performance of LTR with visual features. |
Tasks | Image Classification, Learning-To-Rank, Transfer Learning |
Published | 2019-03-07 |
URL | http://arxiv.org/abs/1903.02939v1 |
http://arxiv.org/pdf/1903.02939v1.pdf | |
PWC | https://paperswithcode.com/paper/vitor-learning-to-rank-webpages-based-on |
Repo | |
Framework | |
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces
Title | Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces |
Authors | Barun Patra, Joel Ruben Antony Moniz, Sarthak Garg, Matthew R. Gormley, Graham Neubig |
Abstract | Recent work on bilingual lexicon induction (BLI) has frequently depended either on aligned bilingual lexicons or on distribution matching, often with an assumption about the isometry of the two spaces. We propose a technique to quantitatively estimate this assumption of the isometry between two embedding spaces and empirically show that this assumption weakens as the languages in question become increasingly etymologically distant. We then propose Bilingual Lexicon Induction with Semi-Supervision (BLISS) — a semi-supervised approach that relaxes the isometric assumption while leveraging both limited aligned bilingual lexicons and a larger set of unaligned word embeddings, as well as a novel hubness filtering technique. Our proposed method obtains state of the art results on 15 of 18 language pairs on the MUSE dataset, and does particularly well when the embedding spaces don’t appear to be isometric. In addition, we also show that adding supervision stabilizes the learning procedure, and is effective even with minimal supervision. |
Tasks | Word Embeddings |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06625v1 |
https://arxiv.org/pdf/1908.06625v1.pdf | |
PWC | https://paperswithcode.com/paper/bilingual-lexicon-induction-with-semi-1 |
Repo | |
Framework | |
Generative-Discriminative Complementary Learning
Title | Generative-Discriminative Complementary Learning |
Authors | Yanwu Xu, Mingming Gong, Junxiang Chen, Tongliang Liu, Kun Zhang, Kayhan Batmanghelich |
Abstract | The majority of state-of-the-art deep learning methods are discriminative approaches, which model the conditional distribution of labels given input features. The success of such approaches heavily depends on high-quality labeled instances, which are not easy to obtain, especially as the number of candidate classes increases. In this paper, we study the complementary learning problem. Unlike ordinary labels, complementary labels are easy to obtain because an annotator only needs to provide a yes/no answer to a randomly chosen candidate class for each instance. We propose a generative-discriminative complementary learning method that estimates the ordinary labels by modeling both the conditional (discriminative) and instance (generative) distributions. Our method, which we call Complementary Conditional GAN (CCGAN), improves the accuracy of predicting ordinary labels and can generate high-quality instances in spite of weak supervision. In addition to the extensive empirical studies, we also theoretically show that our model can retrieve the true conditional distribution from the complementarily-labeled data. |
Tasks | |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01612v4 |
https://arxiv.org/pdf/1904.01612v4.pdf | |
PWC | https://paperswithcode.com/paper/generative-discriminative-complementary |
Repo | |
Framework | |