Paper Group ANR 1047
An Exponential Efron-Stein Inequality for Lq Stable Learning Rules. Attributes Guided Feature Learning for Vehicle Re-identification. $S^{2}$-LBI: Stochastic Split Linearized Bregman Iterations for Parsimonious Deep Learning. Cross Domain Knowledge Learning with Dual-branch Adversarial Network for Vehicle Re-identification. Bayesian leave-one-out cross-validation for large data. On Learning to Prove. A deep learning based solution for construction equipment detection: from development to deployment. Forecasting with time series imaging. Deep Learning Inversion of Electrical Resistivity Data. Adaptive Sequential Machine Learning. Transfer Learning in General Lensless Imaging through Scattering Media. Online Selection of CMA-ES Variants. N-gram Statistical Stemmer for Bangla Corpus. Modeling treatment events in disease progression. Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning.
An Exponential Efron-Stein Inequality for Lq Stable Learning Rules
Title | An Exponential Efron-Stein Inequality for Lq Stable Learning Rules |
Authors | Karim Abou-Moustafa, Csaba Szepesvari |
Abstract | There is accumulating evidence in the literature that stability of learning algorithms is a key characteristic that permits a learning algorithm to generalize. Despite various insightful results in this direction, there seems to be an overlooked dichotomy in the type of stability-based generalization bounds found in the literature. On one hand, the literature suggests that exponential generalization bounds for the estimated risk, which are optimal, can only be obtained through stringent, distribution-independent and computationally intractable notions of stability such as uniform stability. On the other hand, weaker notions of stability such as hypothesis stability, although distribution dependent and more amenable to computation, seem to yield only polynomial generalization bounds for the estimated risk, which are suboptimal. In this paper, we address the gap between these two regimes of results. In particular, the main question we address here is \emph{whether it is possible to derive exponential generalization bounds for the estimated risk using a notion of stability that is computationally tractable and distribution dependent, but weaker than uniform stability}. Using recent advances in concentration inequalities, and using a notion of stability that is weaker than uniform stability but distribution dependent and amenable to computation, we derive an exponential tail bound for the concentration of the estimated risk of a hypothesis returned by a general learning rule, where the estimated risk is expressed in terms of either the resubstitution estimate (empirical error) or the deleted (leave-one-out) estimate. As an illustration, we derive exponential tail bounds for ridge regression with unbounded responses, where we show how stability changes with the tail behavior of the response variables. |
Tasks | |
Published | 2019-03-12 |
URL | https://arxiv.org/abs/1903.05457v2 |
https://arxiv.org/pdf/1903.05457v2.pdf | |
PWC | https://paperswithcode.com/paper/an-exponential-efron-stein-inequality-for-lq |
Repo | |
Framework | |
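The deleted (leave-one-out) estimate that the bound above is stated for can be illustrated for ridge regression with a small sketch; this is a generic numpy illustration, not the authors' code, and the regularization parameter `lam` is arbitrary.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression weights: solve (X'X + lam*I) w = X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def deleted_estimate(X, y, lam):
    """Deleted (leave-one-out) estimate of the squared-error risk:
    each point is predicted by the model trained without it."""
    n = X.shape[0]
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        w = ridge_fit(X[mask], y[mask], lam)
        errs.append((X[i] @ w - y[i]) ** 2)
    return float(np.mean(errs))
```

A closed-form leverage-score identity would avoid the n refits for ridge, but the explicit refitting form matches the definition the bound is stated in terms of.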
Attributes Guided Feature Learning for Vehicle Re-identification
Title | Attributes Guided Feature Learning for Vehicle Re-identification |
Authors | Aihua Zheng, Xianmin Lin, Chenglong Li, Ran He, Jin Tang |
Abstract | Vehicle Re-ID has recently attracted enthusiastic attention due to its potential applications in smart cities and urban surveillance. However, it suffers from large intra-class variation caused by view variations and illumination changes, and from inter-class similarity, especially for different identities with similar appearance. To handle these issues, in this paper we propose a novel deep network architecture guided by meaningful attributes, including camera views, vehicle types and colors, for vehicle Re-ID. In particular, our network is trained end-to-end and contains three subnetworks of deep features embedded by the corresponding attributes (i.e., camera view, vehicle type and vehicle color). Moreover, to overcome the shortcoming of limited vehicle images of different views, we design a view-specified generative adversarial network to generate multi-view vehicle images. For network training, we annotate the view labels on the VeRi-776 dataset. Note that one can directly adopt the pre-trained view (as well as type and color) subnetwork on other datasets with only ID information, which demonstrates the generalization of our model. Extensive experiments on the benchmark datasets VeRi-776 and VehicleID suggest that the proposed approach achieves promising performance and yields a new state-of-the-art for vehicle Re-ID. |
Tasks | Vehicle Re-Identification |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.08997v1 |
https://arxiv.org/pdf/1905.08997v1.pdf | |
PWC | https://paperswithcode.com/paper/attributes-guided-feature-learning-for |
Repo | |
Framework | |
$S^{2}$-LBI: Stochastic Split Linearized Bregman Iterations for Parsimonious Deep Learning
Title | $S^{2}$-LBI: Stochastic Split Linearized Bregman Iterations for Parsimonious Deep Learning |
Authors | Yanwei Fu, Donghao Li, Xinwei Sun, Shun Zhang, Yizhou Wang, Yuan Yao |
Abstract | This paper proposes a novel Stochastic Split Linearized Bregman Iteration ($S^{2}$-LBI) algorithm to efficiently train deep networks. $S^{2}$-LBI introduces an iterative regularization path with structural sparsity, combining the computational efficiency of LBI with model selection consistency in learning structural sparsity. The computed solution path intrinsically enables us to enlarge or simplify a network, which theoretically benefits from the dynamics of the $S^{2}$-LBI algorithm. The experimental results validate $S^{2}$-LBI on the MNIST and CIFAR-10 datasets. For example, on MNIST we can either boost a network with only 1.5K parameters (1 convolutional layer of 5 filters, and 1 FC layer) to 98.40% recognition accuracy, or remove $82.5\%$ of the parameters of the LeNet-5 network while still achieving 98.47% recognition accuracy. In addition, we also have learning results on ImageNet, which will be added in the next version of our report. |
Tasks | Model Selection |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10873v1 |
http://arxiv.org/pdf/1904.10873v1.pdf | |
PWC | https://paperswithcode.com/paper/s2-lbi-stochastic-split-linearized-bregman |
Repo | |
Framework | |
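The deterministic Linearized Bregman Iteration that $S^{2}$-LBI builds on can be sketched for sparse least squares as follows; the stochastic gradients and variable splitting of $S^{2}$-LBI are omitted, and the hyperparameters `kappa` and `alpha` are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lbi(X, y, kappa=10.0, alpha=1e-2, iters=2000):
    """Plain Linearized Bregman Iteration for sparse least squares.
    The auxiliary variable z accumulates negative gradients; the
    iterate w = kappa * soft_threshold(z, 1) stays sparse, tracing
    a regularization path in which coordinates enter one by one."""
    n, d = X.shape
    z = np.zeros(d)
    w = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ w - y) / n
        z -= alpha * grad
        w = kappa * soft_threshold(z, 1.0)
    return w
```

Early stopping along this path plays the role of model selection: truncating the iteration yields sparser models, which is the mechanism the paper exploits to enlarge or simplify a network.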
Cross Domain Knowledge Learning with Dual-branch Adversarial Network for Vehicle Re-identification
Title | Cross Domain Knowledge Learning with Dual-branch Adversarial Network for Vehicle Re-identification |
Authors | Jinjia Peng, Huibing Wang, Xianping Fu |
Abstract | The widespread popularization of vehicles has facilitated people's lives over the last decades. However, the emergence of a large number of vehicles poses the critical but challenging problem of vehicle re-identification (reID). To date, for most vehicle reID algorithms, both the training and testing processes are conducted on the same annotated datasets under supervision. However, even a well-trained model will still suffer a severe performance drop due to the domain bias between the training dataset and real-world scenes. To address this problem, this paper proposes a domain adaptation framework for vehicle reID (DAVR), which narrows the cross-domain bias by fully exploiting labeled data from the source domain to adapt to the target domain. DAVR develops an image-to-image translation network named Dual-branch Adversarial Network (DAN), which translates images from the (well-labeled) source domain into the style of the (unlabeled) target domain without any annotation while preserving identity information from the source domain. The generated images, with more realistic target-domain styles, are then employed to train the vehicle reID model through a proposed attention-based feature learning model. Through the proposed framework, the trained reID model has better domain adaptation ability for various real-world scenes. Comprehensive experimental results demonstrate that our proposed DAVR achieves excellent performance on both the VehicleID and VeRi-776 datasets. |
Tasks | Domain Adaptation, Image-to-Image Translation, Vehicle Re-Identification |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1905.00006v1 |
http://arxiv.org/pdf/1905.00006v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-knowledge-learning-with-dual |
Repo | |
Framework | |
Bayesian leave-one-out cross-validation for large data
Title | Bayesian leave-one-out cross-validation for large data |
Authors | Måns Magnusson, Michael Riis Andersen, Johan Jonasson, Aki Vehtari |
Abstract | Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose combining approximate inference techniques with probability-proportional-to-size sampling (PPS) for fast LOO model evaluation on large datasets. We provide both theoretical and empirical results showing good properties for large data. |
Tasks | Model Selection |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10679v1 |
http://arxiv.org/pdf/1904.10679v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-leave-one-out-cross-validation-for |
Repo | |
Framework | |
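The probability-proportional-to-size sampling idea can be sketched with a Hansen-Hurwitz-style estimator of the total LOO criterion; the interface below (a cheap per-point `proxy` score and an expensive `exact_loo_fn`) is a hypothetical simplification of the paper's combination with approximate inference.

```python
import numpy as np

def pps_loo_estimate(proxy, exact_loo_fn, m, rng):
    """Estimate the total LOO criterion over n points by evaluating
    the expensive exact_loo_fn on only m points drawn with probability
    proportional to a cheap per-point proxy (Hansen-Hurwitz estimator:
    total ~= mean of y_i / p_i over the sampled points)."""
    n = len(proxy)
    p = np.abs(proxy) / np.abs(proxy).sum()
    idx = rng.choice(n, size=m, replace=True, p=p)
    return float(np.mean([exact_loo_fn(i) / p[i] for i in idx]))
```

When the proxy correlates well with the exact per-point values, the ratio y_i / p_i is nearly constant and the estimator's variance is small even for m much smaller than n.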
On Learning to Prove
Title | On Learning to Prove |
Authors | Daniel Huang |
Abstract | In this paper, we consider the problem of learning a first-order theorem prover that uses a representation of beliefs in mathematical claims to construct proofs. The inspiration for doing so comes from the practices of human mathematicians, where "plausible reasoning" is applied in addition to deductive reasoning to find proofs. Towards this end, we introduce a representation of beliefs that assigns probabilities to the exhaustive and mutually exclusive first-order possibilities found in Hintikka's theory of distributive normal forms. The representation supports Bayesian update, induces a distribution on statements that does not enforce that logically equivalent statements are assigned the same probability, and suggests an embedding of statements into an associated Hilbert space. We then examine conjecturing as model selection and an alternating-turn game of determining consistency. The game is amenable (in principle) to self-play training to learn beliefs and derive a prover that is complete when logical omniscience is attained and sound when beliefs are reasonable. The representation has super-exponential space requirements as a function of quantifier depth, so the ideas in this paper should be taken as theoretical. We comment on how abstractions can be used to control the space requirements at the cost of completeness. |
Tasks | Model Selection |
Published | 2019-04-24 |
URL | https://arxiv.org/abs/1904.11099v3 |
https://arxiv.org/pdf/1904.11099v3.pdf | |
PWC | https://paperswithcode.com/paper/on-learning-to-prove |
Repo | |
Framework | |
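The Bayesian update the representation supports reduces, over a finite set of exhaustive and mutually exclusive possibilities, to normalized pointwise multiplication; the snippet below is only that finite toy case, not Hintikka's distributive normal forms.

```python
def bayes_update(prior, likelihood):
    """Bayesian update over a finite set of exhaustive, mutually
    exclusive possibilities: posterior is proportional to
    prior * likelihood, renormalized to sum to one."""
    post = [p * l for p, l in zip(prior, likelihood)]
    z = sum(post)
    return [p / z for p in post]
```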
A deep learning based solution for construction equipment detection: from development to deployment
Title | A deep learning based solution for construction equipment detection: from development to deployment |
Authors | Saeed Arabi, Arya Haghighat, Anuj Sharma |
Abstract | This paper aims to provide researchers and engineering professionals with a practical and comprehensive deep learning based solution for detecting construction equipment, from the very first step of its development to the last one, deployment, with the focus on the latter. The first phase of solution development involved data preparation, model selection, model training, and model evaluation. The second phase of the study comprises model optimization, application-specific embedded system selection, and economic analysis. Several embedded systems were proposed and compared. The review of the results confirms superior real-time performance of the solutions, with a consistent accuracy rate above 90%. The current study validates the practicality of deep learning based object detection solutions for construction scenarios. Moreover, the detailed knowledge presented in this study can be employed for several purposes, such as safety monitoring, productivity assessment, and managerial decisions. |
Tasks | Model Selection, Object Detection |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.09021v1 |
http://arxiv.org/pdf/1904.09021v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-based-solution-for |
Repo | |
Framework | |
Forecasting with time series imaging
Title | Forecasting with time series imaging |
Authors | Xixi Li, Yanfei Kang, Feng Li |
Abstract | Feature-based time series representations have attracted substantial attention in a wide range of time series analysis methods. Recently, the use of time series features for forecast model averaging has been an emerging research focus in the forecasting community. Nonetheless, most of the existing approaches depend on the manual choice of an appropriate set of features. Exploiting machine learning methods to automatically extract features from time series has become crucially important in state-of-the-art time series analysis. In this paper, we introduce an automated approach to extract time series features based on images. Time series are first transformed into recurrence plots, from which local features can be extracted using computer vision algorithms. The extracted features are used for forecast model averaging. Our experiments show that forecasting based on automatically extracted features, with less human intervention and a more comprehensive view of the raw time series data, yields performance comparable to the best methods proposed in the largest forecasting competition dataset (M4). |
Tasks | Model Selection, Time Series, Time Series Analysis |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08064v2 |
https://arxiv.org/pdf/1904.08064v2.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-with-time-series-imaging |
Repo | |
Framework | |
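The recurrence-plot construction underlying the imaging step can be sketched as a thresholded pairwise-distance matrix; a scalar series and a fixed threshold `eps` are simplifying assumptions here (the paper then extracts image features from such plots with computer vision algorithms).

```python
import numpy as np

def recurrence_plot(x, eps):
    """Binary recurrence matrix R[i, j] = 1 iff |x_i - x_j| <= eps,
    turning a 1-D series into an image whose texture encodes the
    series' recurrence structure."""
    d = np.abs(x[:, None] - x[None, :])
    return (d <= eps).astype(np.uint8)
```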
Deep Learning Inversion of Electrical Resistivity Data
Title | Deep Learning Inversion of Electrical Resistivity Data |
Authors | Bin Liu, Qian Guo, Shucai Li, Benchao Liu, Yuxiao Ren, Yonghao Pang, Lanbo Liu, Peng Jiang |
Abstract | The inverse problem of electrical resistivity surveys (ERS) is difficult because of its nonlinear and ill-posed nature. For this task, traditional linear inversion methods still face challenges such as sub-optimal approximation and initial model selection. Inspired by the remarkable nonlinear mapping ability of deep learning approaches, in this paper we propose to build the mapping from apparent resistivity data (input) to the resistivity model (output) directly with convolutional neural networks (CNNs). However, the vertically varying characteristic of patterns in the apparent resistivity data may cause ambiguity when using CNNs with their weight-sharing and effective-receptive-field properties. To address this potential issue, we supply an additional tier feature map to the CNN to help it become aware of the relationship between input and output. Based on the prevalent U-Net architecture, we design our network (ERSInvNet), which can be trained end-to-end and achieves real-time inference at test time. We further introduce a depth weighting function and a smoothness constraint into the loss function to improve inversion accuracy in the deep region and suppress false anomalies. Four groups of experiments are considered to demonstrate the feasibility and efficiency of the proposed methods. According to the comprehensive qualitative analysis and quantitative comparison, ERSInvNet with the tier feature map, smoothness constraint and depth weighting function together achieves the best performance. |
Tasks | Model Selection |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05265v1 |
http://arxiv.org/pdf/1904.05265v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-inversion-of-electrical |
Repo | |
Framework | |
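The two extra loss terms (depth weighting and smoothness) can be sketched in numpy as follows; the exact weighting function in ERSInvNet is not given here, so the power-law weight with exponent `beta` is an assumption for illustration.

```python
import numpy as np

def depth_weighted_mse(pred, target, beta=1.0):
    """Depth-weighted data term plus a first-difference smoothness
    penalty, in the spirit of the two loss terms described above.
    Rows index depth, so deeper rows receive larger weight, pushing
    the network to fit the deep region it would otherwise neglect."""
    depth = np.arange(1, pred.shape[0] + 1, dtype=float)
    w = depth[:, None] ** beta                    # weight grows with depth
    data_term = np.mean(w * (pred - target) ** 2)
    smooth_term = (np.mean(np.diff(pred, axis=0) ** 2)
                   + np.mean(np.diff(pred, axis=1) ** 2))
    return data_term, smooth_term
```

The total loss would combine the two terms with a trade-off coefficient; the smoothness penalty is what suppresses isolated false anomalies in the inverted model.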
Adaptive Sequential Machine Learning
Title | Adaptive Sequential Machine Learning |
Authors | Craig Wilson, Yuheng Bu, Venugopal Veeravalli |
Abstract | A framework previously introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers is extended and applied to machine learning problems such as regression and classification. The stochastic optimization problems arising in these machine learning problems are solved using algorithms such as stochastic gradient descent (SGD). A method based on estimates of the change in the minimizers and properties of the optimization algorithm is introduced for adaptively selecting the number of samples at each time step, to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the exact minimizer, does not exceed a target level. A bound is developed to show that the estimate of the change in the minimizers is non-trivial provided that the excess risk is small enough. Extensions relevant to the machine learning setting are considered, including a cost-based approach to selecting the number of samples under a cost budget over a fixed horizon, and an approach to applying cross-validation for model selection. Finally, experiments with synthetic and real data are used to validate the algorithms. |
Tasks | Model Selection, Stochastic Optimization |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02773v1 |
http://arxiv.org/pdf/1904.02773v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-sequential-machine-learning |
Repo | |
Framework | |
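The sample-size rule can be caricatured with a toy model in which the statistical error decays like c/sqrt(n) and the estimated drift of the minimizer consumes part of the excess-risk budget; both the constants and the square-root rate below are illustrative, not the paper's.

```python
import math

def samples_needed(target_risk, drift_estimate, c_stat=1.0, c_opt=1.0):
    """Toy sample-size rule: pick the smallest n such that the
    statistical error c_stat / sqrt(n) fits inside the part of the
    excess-risk budget left after the drift term c_opt * drift."""
    budget = target_risk - c_opt * drift_estimate
    if budget <= 0:
        raise ValueError("target excess risk unreachable for this drift")
    return math.ceil((c_stat / budget) ** 2)
```

Larger estimated drift leaves a smaller budget for statistical error and hence demands more samples, which is the qualitative behavior of the adaptive scheme described above.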
Transfer Learning in General Lensless Imaging through Scattering Media
Title | Transfer Learning in General Lensless Imaging through Scattering Media |
Authors | Yukuan Yang, Lei Deng, Peng Jiao, Yansong Chua, Jing Pei, Cheng Ma, Guoqi Li |
Abstract | Recently, deep neural networks (DNNs) have been successfully introduced to the field of lensless imaging through scattering media. By solving an inverse problem in computational imaging, DNNs can overcome several shortcomings of conventional methods for lensless imaging through scattering media, namely high cost, poor quality, complex control, and poor anti-interference. However, a large number of training samples on various datasets have to be collected, and a DNN trained on one dataset generally performs poorly in recovering images from another dataset. The underlying reason is that lensless imaging through scattering media is a high-dimensional regression problem for which it is difficult to obtain an analytical solution. In this work, transfer learning is proposed to address this issue. Our main idea is to train a DNN on a relatively complex dataset using a large number of training samples and fine-tune the last few layers using very few samples from other datasets. Instead of the thousands of samples required to train from scratch, transfer learning alleviates the problem of costly data acquisition. Specifically, considering the difference in sample sizes and the similarity among datasets, we propose two DNN architectures, namely LISMU-FCN and LISMU-OCN, and a balance loss function designed to balance smoothness and sharpness. LISMU-FCN, with far fewer parameters, can achieve imaging across similar datasets, while LISMU-OCN can achieve imaging across significantly different datasets. Moreover, we establish a set of simulation algorithms that closely match the real experiments, which is of great significance and practical value for research on lensless scattering imaging. In summary, this work provides a new solution for lensless imaging through scattering media using transfer learning in DNNs. |
Tasks | Transfer Learning |
Published | 2019-12-28 |
URL | https://arxiv.org/abs/1912.12419v1 |
https://arxiv.org/pdf/1912.12419v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-in-general-lensless-imaging |
Repo | |
Framework | |
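The fine-tune-the-last-few-layers recipe can be sketched with a two-layer toy model in which the pretrained first layer is frozen and only the linear head is retrained on the new data; this is a generic numpy illustration, not the LISMU architectures.

```python
import numpy as np

def fine_tune_last_layer(W1, W2, X, y, lr=0.1, steps=300):
    """Transfer-learning sketch: keep the pretrained first-layer
    weights W1 frozen and retrain only the last linear layer W2 by
    gradient descent on the new dataset (X, y)."""
    H = np.tanh(X @ W1)                  # frozen feature extractor
    for _ in range(steps):
        grad = H.T @ (H @ W2 - y) / len(X)
        W2 = W2 - lr * grad              # only the head is updated
    return W2
```

Because only the head is trained, far fewer samples are needed than for training from scratch, which is the data-acquisition saving the abstract emphasizes.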
Online Selection of CMA-ES Variants
Title | Online Selection of CMA-ES Variants |
Authors | Diederick Vermetten, Sander van Rijn, Thomas Bäck, Carola Doerr |
Abstract | In the field of evolutionary computation, one of the most challenging topics is algorithm selection. Knowing which heuristics to use for which optimization problem is key to obtaining high-quality solutions. We aim to extend this research topic by taking a first step towards a selection method for adaptive CMA-ES algorithms. We build upon the theoretical work done by van Rijn \textit{et al.} [PPSN'18], in which the potential of switching between different CMA-ES variants was quantified in the context of a modular CMA-ES framework. We demonstrate in this work that their proposed approach is not very reliable, in that implementing the suggested adaptive configurations does not yield the predicted performance gains. We propose a revised approach, which results in a more robust fit between predicted and actual performance. The adaptive CMA-ES approach obtains performance gains on 18 out of 24 tested functions of the BBOB benchmark, with stable advantages of up to 23%. An analysis of module activation indicates which modules are most crucial for the different phases of optimizing each of the 24 benchmark problems. The module activation also suggests that additional gains are possible when including the (B)IPOP modules, which we have excluded for the present work. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07801v1 |
http://arxiv.org/pdf/1904.07801v1.pdf | |
PWC | https://paperswithcode.com/paper/online-selection-of-cma-es-variants |
Repo | |
Framework | |
N-gram Statistical Stemmer for Bangla Corpus
Title | N-gram Statistical Stemmer for Bangla Corpus |
Authors | Rabeya Sadia, Md Ataur Rahman, Md Hanif Seddiqui |
Abstract | Stemming is a process that trims inflected words to their stem or root form. It is useful for enhancing retrieval effectiveness, especially for text search, in order to solve mismatch problems. Previous research on Bangla stemming mostly relied on eliminating multiple suffixes from a single word through a recursive rule-based procedure to recover a progressively more applicable root. Our proposed system enhances the aforementioned work by implementing one of the stemming algorithms called N-gram stemming. Using an association measure called the Dice coefficient, related sets of words are clustered depending on their character structure, and the shortest word in a cluster may be taken as the stem. We additionally analyzed Affinity Propagation clustering with coefficient similarity as well as with median similarity. Our results indicate that N-gram stemming techniques are effective in general, giving us around 87% accurate clusters. |
Tasks | |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11612v1 |
https://arxiv.org/pdf/1912.11612v1.pdf | |
PWC | https://paperswithcode.com/paper/n-gram-statistical-stemmer-for-bangla-corpus |
Repo | |
Framework | |
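The Dice-coefficient association measure over character n-grams can be written directly; the clustering step and the choice of the shortest cluster member as the stem would sit on top of this measure.

```python
def char_ngrams(word, n=2):
    """Set of character n-grams of a word (bigrams by default)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def dice(a, b, n=2):
    """Dice coefficient over character n-gram sets:
    2 * |A & B| / (|A| + |B|), in [0, 1], used to decide whether
    two word forms are related enough to share a cluster."""
    A, B = char_ngrams(a, n), char_ngrams(b, n)
    if not A or not B:
        return 0.0
    return 2 * len(A & B) / (len(A) + len(B))
```

For example, `dice("playing", "played")` shares the bigrams {pl, la, ay} out of 6 and 5 bigrams respectively, giving 6/11; a threshold on this score groups inflected forms around a common stem.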
Modeling treatment events in disease progression
Title | Modeling treatment events in disease progression |
Authors | Guanyang Wang, Yumeng Zhang, Yong Deng, Xuxin Huang, Łukasz Kidziński |
Abstract | The ability to quantify and predict the progression of a disease is fundamental for selecting an appropriate treatment. Many clinical metrics cannot be acquired frequently, either because of their cost (e.g. MRI, gait analysis) or because they are inconvenient or harmful to the patient (e.g. biopsy, x-ray). In such scenarios, in order to estimate individual trajectories of disease progression, it is advantageous to leverage similarities between patients, i.e. the covariance of trajectories, and find a latent representation of progression. Most existing methods for estimating trajectories do not account for events in between observations, which dramatically decreases their adequacy for clinical practice. In this study, we develop a machine learning framework named Coordinatewise-Soft-Impute (CSI) for analyzing disease progression from sparse observations in the presence of confounding events. CSI is guaranteed to converge to the global minimum of the corresponding optimization problem. Experimental results also demonstrate the effectiveness of CSI on both simulated and real datasets. |
Tasks | |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10705v1 |
https://arxiv.org/pdf/1905.10705v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-treatment-events-in-disease |
Repo | |
Framework | |
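A plain Soft-Impute loop, the base algorithm that Coordinatewise-Soft-Impute (CSI) extends, can be sketched as alternating imputation and singular-value soft-thresholding; the coordinate-wise handling of treatment events is omitted here.

```python
import numpy as np

def soft_impute(M, mask, lam=0.05, iters=300):
    """Plain Soft-Impute for low-rank matrix completion: repeatedly
    fill the missing entries with the current low-rank estimate,
    then soft-threshold the singular values by lam. CSI adds
    coordinatewise handling of treatment events on top of this idea."""
    Z = np.zeros_like(M)
    for _ in range(iters):
        filled = np.where(mask, M, Z)        # keep observed, impute the rest
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z = (U * np.maximum(s - lam, 0.0)) @ Vt
    return Z
```

In the disease-progression setting, rows would index patients and columns time points, with the low-rank structure encoding the shared latent progression.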
Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning
Title | Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning |
Authors | Yang Liu, Yunan Luo, Yuanyi Zhong, Xi Chen, Qiang Liu, Jian Peng |
Abstract | Recent advances in deep reinforcement learning algorithms have shown great potential and success in solving many challenging real-world problems, including the game of Go and robotic applications. Usually, these algorithms need a carefully designed reward function to guide training at each time step. However, in the real world it is non-trivial to design such a reward function, and the only signal available is usually obtained at the end of a trajectory, also known as the episodic reward or return. In this work, we introduce a new algorithm for temporal credit assignment, which learns to decompose the episodic return back to each time step in the trajectory using deep neural networks. With this learned reward signal, learning efficiency can be substantially improved for episodic reinforcement learning. In particular, we find that expressive language models such as the Transformer can be adopted to learn the importance and the dependency of states in the trajectory, thereby providing high-quality and interpretable learned reward signals. We have performed extensive experiments on a set of MuJoCo continuous locomotion control tasks with only episodic returns and demonstrated the effectiveness of our algorithm. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13420v1 |
https://arxiv.org/pdf/1905.13420v1.pdf | |
PWC | https://paperswithcode.com/paper/sequence-modeling-of-temporal-credit |
Repo | |
Framework | |
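The return-decomposition idea can be sketched with a linear stand-in for the sequence model: find per-step rewards whose episode sums match the episodic returns by least squares. The linear featurization below is an assumption for illustration; the paper uses expressive models such as the Transformer.

```python
import numpy as np

def decompose_returns(step_features, episode_returns, lam=1e-3):
    """Least-squares return decomposition: fit a linear per-step reward
    r(s) = phi(s) @ w so that, for each episode, the predicted step
    rewards sum to the observed episodic return. step_features is a
    list of (T_i, d) arrays, one per episode."""
    # One row per episode: the summed features of its steps.
    F = np.stack([phi.sum(axis=0) for phi in step_features])
    d = F.shape[1]
    w = np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ episode_returns)
    return w
```

Given the fitted `w`, `phi @ w` yields a dense per-step reward for each trajectory, which is the learned signal that replaces the sparse episodic return during training.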