Paper Group ANR 963
forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction. Tutorial: Deriving the Standard Variational Autoencoder (VAE) Loss Function. Improving Back-Translation with Uncertainty-based Confidence Estimation. Concise Fuzzy System Modeling Integrating Soft Subspace Clustering and Sparse Learning. A prop …
forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction
Title | forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction |
Authors | Yunchuan Kong, Tianwei Yu |
Abstract | A unique challenge in predictive model building for omics data has been the small number of samples $(n)$ versus the large number of features $(p)$. This “$n\ll p$” property brings difficulties for disease outcome classification using deep learning techniques. Sparse learning by incorporating external gene network information such as the graph-embedded deep feedforward network (GEDFN) model has been a solution to this issue. However, such methods require an existing feature graph, and potential mis-specification of the feature graph can be harmful to classification and feature selection. To address this limitation and develop a robust classification model without relying on external knowledge, we propose a \underline{for}est \underline{g}raph-\underline{e}mbedded deep feedforward \underline{net}work (forgeNet) model, to integrate the GEDFN architecture with a forest feature graph extractor, so that the feature graph can be learned in a supervised manner and specifically constructed for a given prediction task. To validate the method’s capability, we experimented with the forgeNet model on both synthetic and real datasets. The resulting high classification accuracy suggests that the method is a valuable addition to sparse deep learning models for omics data. |
Tasks | Feature Selection, Sparse Learning |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09889v1 |
https://arxiv.org/pdf/1905.09889v1.pdf | |
PWC | https://paperswithcode.com/paper/forgenet-a-graph-deep-neural-network-model |
Repo | |
Framework | |
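As a rough illustration (a sketch, not the paper's code), the forest feature graph extractor can be read as: fit a random forest, then connect each internal node's split feature to the split features of its children, so that features co-occurring along decision paths become neighbors. A minimal version assuming scikit-learn; the resulting adjacency matrix would then play the role of the external gene network in a GEDFN-style model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def forest_feature_graph(X, y, n_trees=500):
    """Sketch: build a feature adjacency matrix from a fitted random
    forest by linking each split feature to its children's split features."""
    rf = RandomForestClassifier(n_estimators=n_trees).fit(X, y)
    p = X.shape[1]
    A = np.zeros((p, p))
    for est in rf.estimators_:
        t = est.tree_
        for node in range(t.node_count):
            f = t.feature[node]
            if f < 0:                      # leaf node: no split feature
                continue
            for child in (t.children_left[node], t.children_right[node]):
                g = t.feature[child]
                if g >= 0:                 # child is also a split node
                    A[f, g] = A[g, f] = 1  # undirected feature edge
    return A                               # supervised, task-specific graph
```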
Tutorial: Deriving the Standard Variational Autoencoder (VAE) Loss Function
Title | Tutorial: Deriving the Standard Variational Autoencoder (VAE) Loss Function |
Authors | Stephen Odaibo |
Abstract | In Bayesian machine learning, the posterior distribution is typically computationally intractable, hence variational inference is often required. In this approach, an evidence lower bound on the log likelihood of data is maximized during training. Variational Autoencoders (VAE) are one important example where variational inference is utilized. In this tutorial, we derive the variational lower bound loss function of the standard variational autoencoder. We do so in the instance of a Gaussian latent prior and Gaussian approximate posterior, under which assumptions the Kullback-Leibler term in the variational lower bound has a closed-form solution. We derive essentially everything we use along the way; everything from Bayes’ theorem to the Kullback-Leibler divergence. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08956v1 |
https://arxiv.org/pdf/1907.08956v1.pdf | |
PWC | https://paperswithcode.com/paper/tutorial-deriving-the-standard-variational |
Repo | |
Framework | |
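For reference, a worked statement of the result the tutorial derives (standard, so stated here without proof): with a diagonal Gaussian approximate posterior $q(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))$ and standard normal prior $p(z) = \mathcal{N}(0, I)$ in $d$ latent dimensions, the Kullback-Leibler term has the closed form

$$D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big) = \frac{1}{2} \sum_{j=1}^{d} \left( \mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1 \right),$$

and the VAE training loss is the negative evidence lower bound $-\mathbb{E}_{q(z \mid x)}[\log p(x \mid z)] + D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big)$.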
Improving Back-Translation with Uncertainty-based Confidence Estimation
Title | Improving Back-Translation with Uncertainty-based Confidence Estimation |
Authors | Shuo Wang, Yang Liu, Chao Wang, Huanbo Luan, Maosong Sun |
Abstract | While back-translation is simple and effective in exploiting abundant monolingual corpora to improve low-resource neural machine translation (NMT), the synthetic bilingual corpora generated by NMT models trained on limited authentic bilingual data are inevitably noisy. In this work, we propose to quantify the confidence of NMT model predictions based on model uncertainty. With word- and sentence-level confidence measures based on uncertainty, it is possible for back-translation to better cope with noise in synthetic bilingual corpora. Experiments on Chinese-English and English-German translation tasks show that uncertainty-based confidence estimation significantly improves the performance of back-translation. |
Tasks | Low-Resource Neural Machine Translation, Machine Translation |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00157v1 |
https://arxiv.org/pdf/1909.00157v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-back-translation-with-uncertainty |
Repo | |
Framework | |
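A minimal sketch of uncertainty-based confidence, assuming the uncertainty comes from K stochastic forward passes (e.g., Monte Carlo dropout) over a synthetic sentence; the mean/variance combination below is illustrative, not the paper's exact formula.

```python
import numpy as np

def confidence(prob_samples):
    """prob_samples: shape (K, T), the probability each of K stochastic
    passes assigns to the emitted target word at each of T positions.
    Low disagreement across passes signals a confident prediction."""
    mean = prob_samples.mean(axis=0)      # (T,) average word probability
    var = prob_samples.var(axis=0)        # (T,) cross-pass disagreement
    word_conf = mean * np.exp(-var)       # word-level confidence (illustrative)
    sent_conf = float(word_conf.mean())   # sentence-level confidence
    return word_conf, sent_conf

# Usage idea: weight each synthetic pair's training loss by sent_conf,
# and down-weight individual noisy target words by word_conf.
```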
Concise Fuzzy System Modeling Integrating Soft Subspace Clustering and Sparse Learning
Title | Concise Fuzzy System Modeling Integrating Soft Subspace Clustering and Sparse Learning |
Authors | Peng Xu, Zhaohong Deng, Chen Cui, Te Zhang, Kup-Sze Choi, Gu Suhang, Jun Wang, ShiTong Wang |
Abstract | The superior interpretability and uncertainty modeling ability of the Takagi-Sugeno-Kang fuzzy system (TSK FS) make it possible to describe complex nonlinear systems intuitively and efficiently. However, classical TSK FS usually adopts the whole feature space of the data for model construction, which can result in lengthy rules for high-dimensional data and lead to degeneration in interpretability. Furthermore, for highly nonlinear modeling tasks, it is usually necessary to use a large number of rules, which further weakens the clarity and interpretability of TSK FS. To address these issues, a concise zero-order TSK FS construction method, called ESSC-SL-CTSK-FS, is proposed in this paper by integrating the techniques of enhanced soft subspace clustering (ESSC) and sparse learning (SL). In this method, ESSC is used to generate the antecedents and various sparse subspaces for different fuzzy rules, whereas SL is used to optimize the consequent parameters of the fuzzy rules, based on which the number of fuzzy rules can be effectively reduced. Finally, the proposed ESSC-SL-CTSK-FS method is used to construct concise zero-order TSK FS that can explain the scenes in high-dimensional data modeling more clearly and easily. Experiments are conducted on various real-world datasets to confirm the advantages of the proposed method. |
Tasks | Sparse Learning |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10683v1 |
http://arxiv.org/pdf/1904.10683v1.pdf | |
PWC | https://paperswithcode.com/paper/concise-fuzzy-system-modeling-integrating |
Repo | |
Framework | |
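A compact sketch of the zero-order TSK pipeline under stand-in components: plain k-means takes the place of ESSC for the antecedents, and Lasso takes the place of the paper's SL step for the consequents. Rules whose consequent is shrunk to zero can be dropped, which is how the rule base stays concise.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Lasso

def fit_zero_order_tsk(X, y, n_rules=10, gamma=1.0, alpha=0.01):
    """Zero-order TSK sketch: cluster centers define Gaussian antecedents;
    a sparse linear fit on normalized firing strengths gives consequents."""
    centers = KMeans(n_clusters=n_rules, n_init=10).fit(X).cluster_centers_
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    firing = np.exp(-gamma * d2)                  # rule firing strengths
    firing /= firing.sum(axis=1, keepdims=True)   # normalize per sample
    reg = Lasso(alpha=alpha).fit(firing, y)       # sparse consequents
    kept = np.flatnonzero(reg.coef_)              # rules that survive pruning
    return centers[kept], reg.coef_[kept]
```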
A proposed method to extract maximum possible power in the shortest time on solar PV arrays under partial shadings using metaheuristic algorithms
Title | A proposed method to extract maximum possible power in the shortest time on solar PV arrays under partial shadings using metaheuristic algorithms |
Authors | Reza Hedayati Majdabadi, Saeed Sharifian Khortoomi |
Abstract | The increasing use of fossil fuels to produce energy causes environmental problems and has pushed society toward renewable energies, including solar energy. In recent years, one of the most popular ways to generate energy is using photovoltaic (PV) arrays to produce solar power. Skyscrapers and changing weather conditions cast shade on these PV arrays, which reduces power generation. Various methods such as TCT and Sudoku patterns have been proposed to improve power generation for partially shaded PV arrays, but these methods have shortcomings such as not generating maximum power and being designed for a specific dimension of PV array. We therefore propose a metaheuristic-algorithm-based approach to extract the maximum possible power in the shortest possible time. In this paper, five algorithms that perform well on most search problems are chosen from different groups of metaheuristic algorithms, and four different standard shading patterns are used for a more realistic analysis. Results show that the proposed method achieves better results in maximum power generation compared to the TCT arrangement (18.53%) and the Sudoku arrangement (4.93%). The results also show that GWO is the fastest metaheuristic algorithm to reach maximum output power in PV arrays under partial shading. Thus, the authors believe that metaheuristic algorithms provide an efficient, reliable, and fast solution to the partially shaded PV array problem. |
Tasks | |
Published | 2019-03-15 |
URL | http://arxiv.org/abs/1903.06413v1 |
http://arxiv.org/pdf/1903.06413v1.pdf | |
PWC | https://paperswithcode.com/paper/a-proposed-method-to-extract-maximum-possible |
Repo | |
Framework | |
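A toy sketch of the optimization loop, with heavy caveats: the fitness below is a deliberately crude series-string proxy for array power under shading, and plain random permutation search stands in for the five metaheuristics (GWO, etc.) the paper evaluates; any of them plugs into the same evaluate-and-keep-best loop.

```python
import numpy as np

def array_power(arrangement, irradiance):
    """Toy fitness: treat each column as a series string whose current is
    limited by its most-shaded panel, so power is proxied by the sum of
    per-column minima. Real PV electrical models are far more detailed."""
    shaded = irradiance.ravel()[arrangement].reshape(irradiance.shape)
    return shaded.min(axis=0).sum()

def search(irradiance, iters=2000, seed=0):
    """Random permutation search as a stand-in for GWO/PSO/etc."""
    rng = np.random.default_rng(seed)
    best, best_p = None, -np.inf
    for _ in range(iters):
        cand = rng.permutation(irradiance.size)   # candidate panel arrangement
        p = array_power(cand, irradiance)
        if p > best_p:
            best, best_p = cand, p
    return best, best_p

# Example shading pattern: a 4x4 array with one shaded corner.
G = np.full((4, 4), 1000.0); G[2:, 2:] = 300.0
print(search(G)[1])
```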
Sparse Learning in reproducing kernel Hilbert space
Title | Sparse Learning in reproducing kernel Hilbert space |
Authors | Xin He, Junhui Wang |
Abstract | Sparse learning aims to learn the sparse structure of the true target function from the collected data, which plays a crucial role in high-dimensional data analysis. This article proposes a unified and universal method for learning the sparsity of M-estimators within a rich family of loss functions in a reproducing kernel Hilbert space (RKHS). The family of loss functions considered is very rich, including most of those commonly used in the literature. More importantly, the proposed method is motivated by some nice properties of the induced RKHS, is computationally efficient for large-scale data, and can be further accelerated through parallel computing. The asymptotic estimation and selection consistencies of the proposed method are established for a general loss function under mild conditions. The method works for general loss functions, admits general dependence structures, allows efficient computation, and comes with theoretical guarantees. Its superior performance is also supported by a variety of simulated examples and a real application in the human breast cancer study (GSE20194). |
Tasks | Sparse Learning |
Published | 2019-01-03 |
URL | http://arxiv.org/abs/1901.00615v1 |
http://arxiv.org/pdf/1901.00615v1.pdf | |
PWC | https://paperswithcode.com/paper/sparse-learning-in-reproducing-kernel-hilbert |
Repo | |
Framework | |
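A sketch of one common route to sparsity in an RKHS, assuming an RBF kernel ridge fit: score each feature by the empirical norm of the fitted function's partial derivative, then keep features with large scores. The paper's actual M-estimator and penalty differ; this only illustrates the derivative-based mechanism.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def rkhs_feature_scores(X, y, gamma=0.5, alpha=0.1):
    """Fit f(x) = sum_l c_l k(x, x_l) in an RBF RKHS, then score feature j
    by the empirical norm of df/dx_j; large scores mark features the
    fitted function actually depends on."""
    model = KernelRidge(kernel="rbf", gamma=gamma, alpha=alpha).fit(X, y)
    c = model.dual_coef_                        # (n,) expansion coefficients
    diffs = X[:, None, :] - X[None, :, :]       # (n, n, p) pairwise x_i - x_l
    K = np.exp(-gamma * (diffs ** 2).sum(-1))   # RBF Gram matrix
    # d k(x_i, x_l) / d x_{ij} = -2 * gamma * (x_ij - x_lj) * k(x_i, x_l)
    grads = -2 * gamma * np.einsum("il,ilj,l->ij", K, diffs, c)
    return np.sqrt((grads ** 2).mean(axis=0))   # per-feature importance
```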
The Nonstochastic Control Problem
Title | The Nonstochastic Control Problem |
Authors | Elad Hazan, Sham M. Kakade, Karan Singh |
Abstract | We consider the problem of controlling an unknown linear dynamical system in the presence of (nonstochastic) adversarial perturbations and adversarial convex loss functions. In contrast to classical control, the a priori determination of an optimal controller here is hindered by the latter’s dependence on the yet unknown perturbations and costs. Instead, we measure regret against an optimal linear policy in hindsight, and give the first efficient algorithm that guarantees a sublinear regret bound, scaling as $T^{2/3}$, in this setting. |
Tasks | |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12178v2 |
https://arxiv.org/pdf/1911.12178v2.pdf | |
PWC | https://paperswithcode.com/paper/the-nonstochastic-control-problem |
Repo | |
Framework | |
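A heavily simplified sketch of the disturbance-action policy class this line of work uses: $u_t = -K x_t + \sum_i M_i w_{t-i}$, with the $M_i$ adjusted by online gradient steps on the observed quadratic cost. The paper's algorithm additionally identifies the unknown $(A, B)$ in an exploration phase and uses counterfactual losses over the controller's memory; both are omitted here.

```python
import numpy as np

def dac_sketch(A, B, K, T=1000, H=5, lr=0.01, seed=0):
    """Toy disturbance-action controller on a simulated linear system."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    x = np.zeros(n)
    M = np.zeros((H, m, n))                 # disturbance-action weights
    w_hist = [np.zeros(n) for _ in range(H)]
    avg_cost = 0.0
    for t in range(T):
        u = -K @ x + sum(M[i] @ w_hist[-1 - i] for i in range(H))
        w = rng.uniform(-0.1, 0.1, size=n)  # stand-in for adversarial noise
        avg_cost += (x @ x + u @ u) / T     # quadratic cost c(x, u)
        # crude gradient of u'u in M_i: 2 * outer(u, w_{t-i}); this ignores
        # the effect of M on future states, which the full algorithm handles
        for i in range(H):
            M[i] -= lr * 2 * np.outer(u, w_hist[-1 - i])
        x = A @ x + B @ u + w
        w_hist.append(w); w_hist.pop(0)
    return avg_cost

# e.g. dac_sketch(A=0.9*np.eye(2), B=np.eye(2), K=0.5*np.eye(2))
```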
Comb Convolution for Efficient Convolutional Architecture
Title | Comb Convolution for Efficient Convolutional Architecture |
Authors | Dandan Li, Yuan Zhou, Shuwei Huo, Sun-Yuan Kung |
Abstract | Convolutional neural networks (CNNs) inherently suffer from massively redundant computation (FLOPs) due to the dense connection pattern between feature maps and convolution kernels. Recent research has investigated the sparse relationship between channels; however, it ignored the spatial relationship within a channel. In this paper, we present a novel convolutional operator, namely comb convolution, to exploit the intra-channel sparse relationship among neurons. The proposed operator eliminates nearly 50% of connections by inserting uniform mappings into standard convolutions, removing about half of the spatial connections in a convolutional layer. Notably, our work is orthogonal and complementary to existing methods that reduce channel-wise redundancy; thus, it has great potential to further increase efficiency by integrating comb convolution into existing architectures. Experimental results demonstrate that by simply replacing standard convolutions with comb convolutions in state-of-the-art CNN architectures (e.g., VGGNets, Xception, and SE-Net), we can achieve a 50% FLOPs reduction while still maintaining accuracy. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00387v1 |
https://arxiv.org/pdf/1911.00387v1.pdf | |
PWC | https://paperswithcode.com/paper/comb-convolution-for-efficient-convolutional |
Repo | |
Framework | |
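A small numpy sketch of the comb pattern as one plausible reading: convolve only at checkerboard positions and pass the input through unchanged elsewhere, which removes roughly half the spatial connections. Whether the paper alternates per pixel, per row, or per column is an assumption here, and the identity branch requires matching input/output channel counts.

```python
import numpy as np

def comb_conv2d(x, w):
    """3x3 convolution at checkerboard positions, identity mapping at the
    rest. x: (C, H, W) feature map; w: (C, C, 3, 3) kernels."""
    C, H, W = x.shape
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = x.copy()                          # identity mapping by default
    for i in range(H):
        for j in range(W):
            if (i + j) % 2 == 0:            # convolve on the checkerboard
                patch = pad[:, i:i + 3, j:j + 3]
                out[:, i, j] = np.tensordot(w, patch,
                                            axes=([1, 2, 3], [0, 1, 2]))
    return out

# e.g. comb_conv2d(np.ones((8, 16, 16)), np.ones((8, 8, 3, 3)))
```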
Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing
Title | Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing |
Authors | Jesus A. De Loera, Jamie Haddock, Anna Ma, Deanna Needell |
Abstract | Machine learning algorithms typically rely on optimization subroutines and are well-known to provide very effective outcomes for many types of problems. Here, we flip the reliance and ask the reverse question: can machine learning algorithms lead to more effective outcomes for optimization problems? Our goal is to train machine learning methods to automatically improve the performance of optimization and signal processing algorithms. As a proof of concept, we use our approach to improve two popular data processing subroutines in data science: stochastic gradient descent and greedy methods in compressed sensing. We provide experimental results that demonstrate the answer is “yes”, machine learning algorithms do lead to more effective outcomes for optimization problems, and show the future potential for this research direction. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13404v2 |
https://arxiv.org/pdf/1905.13404v2.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-algorithm-selection-and-parameter |
Repo | |
Framework | |
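A self-contained toy of the "learn to tune" recipe on synthetic least-squares problems: offline, each problem gets cheap spectral features and a known-good step size (here the classical $1/\sigma_{\max}^2$ rule stands in for the grid-searched values the paper would use); a regressor then maps features to a step size for new problems.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def problem_features(A):
    """Cheap spectral features of a least-squares instance min_x ||Ax - b||^2."""
    s = np.linalg.svd(A, compute_uv=False)
    return [A.shape[0], A.shape[1], s[0], s[-1], s[0] / max(s[-1], 1e-12)]

rng = np.random.default_rng(0)
feats, step_sizes = [], []
for _ in range(50):                              # synthetic "past problems"
    A = rng.normal(size=(60, 20)) * rng.uniform(0.5, 5.0)
    feats.append(problem_features(A))
    # classical 1/L rule stands in for the grid-searched best step size
    step_sizes.append(1.0 / np.linalg.svd(A, compute_uv=False)[0] ** 2)

model = RandomForestRegressor(random_state=0).fit(feats, step_sizes)
A_new = rng.normal(size=(60, 20))
lr_suggested = model.predict([problem_features(A_new)])[0]
```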
A Sequential Set Generation Method for Predicting Set-Valued Outputs
Title | A Sequential Set Generation Method for Predicting Set-Valued Outputs |
Authors | Tian Gao, Jie Chen, Vijil Chenthamarakshan, Michael Witbrock |
Abstract | Consider a general machine learning setting where the output is a set of labels or sequences. This output set is unordered and its size varies with the input. Whereas multi-label classification methods seem a natural first resort, they are not readily applicable to set-valued outputs because of the growth rate of the output space and because conventional sequence generation does not reflect the order-free nature of sets. In this paper, we propose a unified framework, sequential set generation (SSG), that can handle output sets of labels and sequences. SSG is a meta-algorithm that leverages any probabilistic learning method for label or sequence prediction, but employs a proper regularization such that a new label or sequence is generated repeatedly until the full set is produced. Though SSG is sequential in nature, it does not penalize the ordering of the appearance of the set elements and can be applied to a variety of set output problems, such as a set of classification labels or sequences. We perform experiments with both benchmark and synthetic data sets and demonstrate SSG’s strong performance over baseline methods. |
Tasks | Multi-Label Classification |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05153v1 |
http://arxiv.org/pdf/1903.05153v1.pdf | |
PWC | https://paperswithcode.com/paper/a-sequential-set-generation-method-for |
Repo | |
Framework | |
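A minimal sketch of the SSG meta-loop as the abstract describes it: repeatedly ask any base predictor for a new element conditioned on the input and the set produced so far, until it emits a stop symbol. The paper's regularization, which makes the base model respect this protocol during training, is not captured here.

```python
def generate_set(predict_next, x, max_size=20, stop="<STOP>"):
    """Generate an unordered, variable-size output set element by element.
    Order never matters: the output is a set, and duplicates are avoided
    by conditioning on what was already produced."""
    produced = set()
    for _ in range(max_size):
        y = predict_next(x, frozenset(produced))  # any probabilistic model
        if y == stop:
            break
        produced.add(y)
    return produced

# Toy base predictor: emit divisors of x not yet produced, then stop.
print(generate_set(
    lambda x, s: next((k for k in range(1, 10) if x % k == 0 and k not in s),
                      "<STOP>"), 12))             # -> {1, 2, 3, 4, 6}
```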
Prediction and optimization of NaV1.7 inhibitors based on machine learning methods
Title | Prediction and optimization of NaV1.7 inhibitors based on machine learning methods |
Authors | Weikaixin Kong, Xinyu Tu, Zhengwei Xie, Zhuo Huang |
Abstract | We used machine learning methods to predict NaV1.7 inhibitors and found that the RF-CDK model performed best on the imbalanced dataset. Using the RF-CDK model for drug screening, we obtained the effective compound K1, which we verified using the cell patch-clamp method. However, because the model evaluation in this article is not comprehensive enough, much research work remains, such as comparison with other existing methods. The target protein has multiple active sites and requires further study; more detailed models are needed to account for this biological process and to compare against the current results, which this article fails to do. We therefore wish to withdraw this article. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1912.05903v2 |
https://arxiv.org/pdf/1912.05903v2.pdf | |
PWC | https://paperswithcode.com/paper/prediction-and-optimization-of-nav17 |
Repo | |
Framework | |
The True Sample Complexity of Identifying Good Arms
Title | The True Sample Complexity of Identifying Good Arms |
Authors | Julian Katz-Samuels, Kevin Jamieson |
Abstract | We consider two multi-armed bandit problems with $n$ arms: (i) given an $\epsilon > 0$, identify an arm with mean that is within $\epsilon$ of the largest mean and (ii) given a threshold $\mu_0$ and integer $k$, identify $k$ arms with means larger than $\mu_0$. Existing lower bounds and algorithms for the PAC framework suggest that both of these problems require $\Omega(n)$ samples. However, we argue that these definitions not only conflict with how these algorithms are used in practice, but also disagree with the intuition that says (i) requires only $\Theta(\frac{n}{m})$ samples where $m = \lvert\{ i : \mu_i > \max_{i \in [n]} \mu_i - \epsilon \}\rvert$ and (ii) requires $\Theta(\frac{n}{m}k)$ samples where $m = \lvert\{ i : \mu_i > \mu_0 \}\rvert$. We provide definitions that formalize these intuitions, obtain lower bounds that match the above sample complexities, and develop explicit, practical algorithms that achieve nearly matching upper bounds. |
Tasks | |
Published | 2019-06-15 |
URL | https://arxiv.org/abs/1906.06594v1 |
https://arxiv.org/pdf/1906.06594v1.pdf | |
PWC | https://paperswithcode.com/paper/the-true-sample-complexity-of-identifying |
Repo | |
Framework | |
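A toy simulation of the $\Theta(\frac{n}{m})$ intuition from the abstract: when $m$ of the $n$ arms are good, uniformly random draws hit a good arm after about $n/m$ attempts in expectation (a geometric variable with success probability $m/n$); verifying that an arm is actually good still costs additional pulls, which the paper's algorithms account for.

```python
import numpy as np

def draws_until_good(n=1000, m=50, trials=2000, seed=0):
    """Average number of uniformly random arm draws until a 'good' arm
    (one of the first m indices) is hit; should be roughly n / m."""
    rng = np.random.default_rng(seed)
    good = set(range(m))
    draws = []
    for _ in range(trials):
        count = 0
        while True:
            count += 1
            if int(rng.integers(n)) in good:
                break
        draws.append(count)
    return np.mean(draws)

print(draws_until_good())   # prints roughly 20 = n / m for these defaults
```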
Introduction to Concentration Inequalities
Title | Introduction to Concentration Inequalities |
Authors | Kumar Abhishek, Sneha Maheshwari, Sujit Gujar |
Abstract | In this report, we aim to exemplify concentration inequalities and provide easy-to-understand proofs for them. Our focus is on the inequalities that are helpful in the design and analysis of machine learning algorithms. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.02884v1 |
https://arxiv.org/pdf/1910.02884v1.pdf | |
PWC | https://paperswithcode.com/paper/introduction-to-concentration-inequalities |
Repo | |
Framework | |
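As a concrete instance of the kind of bound such a report covers (Hoeffding's inequality is assumed here as the canonical example; the abstract does not list the report's exact contents): for i.i.d. random variables $X_1, \dots, X_n$ taking values in $[a, b]$,

$$\Pr\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mathbb{E}[X_1] \right| \ge t \right) \le 2 \exp\left( -\frac{2 n t^2}{(b-a)^2} \right),$$

the workhorse behind generalization bounds and the confidence intervals used in bandit algorithms.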
IIITM Face: A Database for Facial Attribute Detection in Constrained and Simulated Unconstrained Environments
Title | IIITM Face: A Database for Facial Attribute Detection in Constrained and Simulated Unconstrained Environments |
Authors | Raj Kuwar Gupta, Shresth Verma, KV Arya, Soumya Agarwal, Prince Gupta |
Abstract | This paper addresses the challenges of face attribute detection specifically in the Indian context. While there are numerous face datasets captured in unconstrained environments, none of them captures emotions in different face orientations. Moreover, people of Indian ethnicity are under-represented in these datasets, since the datasets have been scraped from popular search engines. As a result, the performance of state-of-the-art techniques can’t be evaluated on Indian faces. In this work, we introduce a new dataset, IIITM Face, for the scientific community to address these challenges. Our dataset includes 107 participants who exhibit 6 emotions in 3 different face orientations. Each of these images is further labelled with attributes such as gender, presence of moustache, beard or eyeglasses, clothes worn by the subjects, and the density of their hair. Moreover, the images are captured in high resolution with specific background colors which can be easily replaced by cluttered backgrounds to simulate ‘in the Wild’ behaviour. We demonstrate this by constructing IIITM Face-SUE. Both IIITM Face and IIITM Face-SUE have been benchmarked across key multi-label metrics for the research community to compare their results. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01219v1 |
https://arxiv.org/pdf/1910.01219v1.pdf | |
PWC | https://paperswithcode.com/paper/iiitm-face-a-database-for-facial-attribute |
Repo | |
Framework | |
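The background swap the abstract mentions is essentially chroma keying; a minimal sketch, assuming the studio background color is known (the paper gives no specific recipe, so `bg_color` and `tol` here are assumptions).

```python
import numpy as np

def replace_background(img, clutter, bg_color, tol=30):
    """Replace pixels near the known studio background color with pixels
    from a cluttered scene of the same size (simple chroma keying)."""
    dist = np.abs(img.astype(int) - np.array(bg_color)).sum(axis=2)
    mask = dist < tol                  # True where the studio backdrop shows
    out = img.copy()
    out[mask] = clutter[mask]          # paste in the cluttered background
    return out
```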
A Deep Neural Network’s Loss Surface Contains Every Low-dimensional Pattern
Title | A Deep Neural Network’s Loss Surface Contains Every Low-dimensional Pattern |
Authors | Wojciech Marian Czarnecki, Simon Osindero, Razvan Pascanu, Max Jaderberg |
Abstract | The work “Loss Landscape Sightseeing with Multi-Point Optimization” (Skorokhodov and Burtsev, 2019) demonstrated that one can empirically find arbitrary 2D binary patterns inside the loss surfaces of popular neural networks. In this paper we prove that: (i) this is a general property of deep universal approximators; and (ii) this property holds for arbitrary smooth patterns, for other dimensionalities, for every dataset, and for any neural network that is sufficiently deep and wide. Our analysis predicts not only the existence of all such low-dimensional patterns, but also two other properties that were observed empirically: (i) that it is easy to find these patterns; and (ii) that they transfer to other datasets (e.g., a test set). |
Tasks | |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07559v2 |
https://arxiv.org/pdf/1912.07559v2.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-neural-networks-loss-surface-contains |
Repo | |
Framework | |
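To make the setting concrete: the "patterns" live on a 2D affine slice of parameter space. Rendering the loss over such a slice gives the image, and pattern-finding in the sense of Skorokhodov and Burtsev (2019) means optimizing the two direction vectors so the image matches a target bitmap; the present paper proves such directions always exist for sufficiently deep and wide networks. A minimal slice-renderer sketch:

```python
import numpy as np

def loss_on_slice(loss, theta0, d1, d2, grid=21, scale=1.0):
    """Evaluate loss on the affine plane theta0 + u*d1 + v*d2, returning
    a (grid, grid) image of the 2D loss-surface slice."""
    us = np.linspace(-scale, scale, grid)
    return np.array([[loss(theta0 + u * d1 + v * d2) for v in us]
                     for u in us])

# Toy usage with a quadratic "loss" over 10 parameters:
theta0 = np.zeros(10)
img = loss_on_slice(lambda th: float(th @ th), theta0,
                    np.eye(10)[0], np.eye(10)[1])
```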