Paper Group ANR 468
PILAE: A Non-gradient Descent Learning Scheme for Deep Feedforward Neural Networks
Title | PILAE: A Non-gradient Descent Learning Scheme for Deep Feedforward Neural Networks |
Authors | P. Guo, X. L. Zhou, K. Wang |
Abstract | In this work, a non-gradient descent learning scheme is proposed for deep feedforward neural networks (DNN). As is well known, autoencoders can be used as the building blocks of the multi-layer perceptron (MLP) deep neural network. The MLP is therefore taken as an example to illustrate the proposed scheme of the pseudoinverse learning algorithm for autoencoder (PILAE) training. The PILAE with low rank approximation is a non-gradient based learning algorithm: the encoder weight matrix is set to be the low rank approximation of the pseudoinverse of the input matrix, while the decoder weight matrix is calculated by the pseudoinverse learning algorithm. It is worth noting that only a few network structure hyperparameters need to be tuned. Hence, the proposed algorithm can be regarded as a quasi-automated training algorithm which can be utilized in the autonomous machine learning research field. The experimental results show that the proposed learning scheme for DNN can achieve better performance when considering the tradeoff between training efficiency and classification accuracy. |
Tasks | |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01545v1 |
http://arxiv.org/pdf/1811.01545v1.pdf | |
PWC | https://paperswithcode.com/paper/pilae-a-non-gradient-descent-learning-scheme |
Repo | |
Framework | |
Behavioral-clinical phenotyping with type 2 diabetes self-monitoring data
Title | Behavioral-clinical phenotyping with type 2 diabetes self-monitoring data |
Authors | Matthew E. Levine, David J. Albers, Marissa Burgermaster, Patricia G. Davidson, Arlene M. Smaldone, Lena Mamykina |
Abstract | Objective: To evaluate unsupervised clustering methods for identifying individual-level behavioral-clinical phenotypes that relate personal biomarkers and behavioral traits in type 2 diabetes (T2DM) self-monitoring data. Materials and Methods: We used hierarchical clustering (HC) to identify groups of meals with similar nutrition and glycemic impact for 6 individuals with T2DM who collected self-monitoring data. We evaluated clusters on: 1) correspondence to gold standards generated by certified diabetes educators (CDEs) for 3 participants; 2) face validity, rated by CDEs; and 3) impact on CDEs’ ability to identify patterns for another 3 participants. Results: Gold standard (GS) included 9 patterns across 3 participants. Of these, all 9 were re-discovered using HC: 4 GS patterns were consistent with patterns identified by HC (over 50% of meals in a cluster followed the pattern); another 5 were included as sub-groups in broader clusters. 50% (9/18) of clusters were rated over 3 on a 5-point Likert scale for validity, significance, and being actionable. After reviewing clusters, CDEs identified patterns that were more consistent with data (70% reduction in contradictions between patterns and participants’ records). Discussion: Hierarchical clustering of blood glucose and macronutrient consumption appears suitable for discovering behavioral-clinical phenotypes in T2DM. Most clusters corresponded to gold standard and were rated positively by CDEs for face validity. Cluster visualizations helped CDEs identify more robust patterns in nutrition and glycemic impact, creating new possibilities for visual analytic solutions. Conclusion: Machine learning methods can use diabetes self-monitoring data to create personalized behavioral-clinical phenotypes, which may prove useful for delivering personalized medicine. |
Tasks | |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08761v1 |
http://arxiv.org/pdf/1802.08761v1.pdf | |
PWC | https://paperswithcode.com/paper/behavioral-clinical-phenotyping-with-type-2 |
Repo | |
Framework | |
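The clustering step described in the abstract above can be sketched as follows. This is a minimal illustration only: the two meal features (carbohydrate grams and post-meal glucose change), the sample values, the average-linkage rule, and the fixed number of clusters are all assumptions, since the abstract does not state which linkage or stopping rule the authors used.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b):
    """Mean pairwise Euclidean distance between two clusters of points."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per meal
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Hypothetical meals: (carbohydrates in g, glucose change in mg/dL).
meals = [(15, 10), (20, 12), (18, 8), (80, 60), (90, 70), (85, 65)]
clusters = agglomerative(meals, n_clusters=2)
```

With these made-up values the low-carbohydrate and high-carbohydrate meals end up in separate clusters, which is the kind of grouping a reviewer could then inspect for a pattern.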
Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation
Title | Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation |
Authors | Nikolay Bogoychev, Marcin Junczys-Dowmunt, Kenneth Heafield, Alham Fikri Aji |
Abstract | In order to extract the best possible performance from asynchronous stochastic gradient descent, one must increase the mini-batch size and scale the learning rate accordingly. In order to achieve further speedup, we introduce a technique that delays gradient updates, effectively increasing the mini-batch size. Unfortunately, increasing the mini-batch size worsens the stale gradient problem in asynchronous stochastic gradient descent (SGD), which leads to poor model convergence. We introduce local optimizers which mitigate the stale gradient problem, and together with fine-tuning our momentum we are able to train a shallow machine translation system 27% faster than an optimized baseline with negligible penalty in BLEU. |
Tasks | Machine Translation |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08859v2 |
http://arxiv.org/pdf/1808.08859v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-asynchronous-stochastic-gradient |
Repo | |
Framework | |
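The idea of delaying gradient updates to enlarge the effective mini-batch can be sketched on a toy problem. This is a single-worker illustration under stated assumptions: the one-parameter least-squares model, the data, and the names `grad` and `sgd_delayed` are hypothetical, and the asynchronous workers and local optimizers from the abstract are not modeled.

```python
def grad(w, batch):
    """Gradient of the mean of 0.5 * (w*x - y)^2 over a mini-batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def sgd_delayed(w, batches, lr, delay):
    """Accumulate gradients over `delay` mini-batches before each update."""
    acc, count = 0.0, 0
    for batch in batches:
        acc += grad(w, batch)  # gradient at the current, not-yet-updated w
        count += 1
        if count == delay:
            w -= lr * acc / delay  # one update with the averaged gradient
            acc, count = 0.0, 0
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
small_batches = [data[:2], data[2:]]

# Two delayed small batches behave like one batch of twice the size.
w_delayed = sgd_delayed(0.0, small_batches, lr=0.01, delay=2)
w_big = sgd_delayed(0.0, [data], lr=0.01, delay=1)
```

Because the small batches have equal size, averaging their gradients at the same parameter value gives the same step as a single batch twice as large, so `w_delayed` and `w_big` coincide.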
REST: Real-to-Synthetic Transform for Illumination Invariant Camera Localization
Title | REST: Real-to-Synthetic Transform for Illumination Invariant Camera Localization |
Authors | Sota Shoman, Tomohiro Mashita, Alexander Plopski, Photchara Ratsamee, Yuki Uranishi, Haruo Takemura |
Abstract | Accurate camera localization is an essential part of tracking systems. However, localization results are greatly affected by illumination. Including data collected under various lighting conditions can improve the robustness of the localization algorithm to lighting variation. However, this is very tedious and time consuming. By using synthesized images it is possible to easily accumulate a large variety of views under varying illumination and weather conditions. Despite continuously improving processing power and rendering algorithms, synthesized images do not perfectly match real images of the same scene, i.e. there exists a gap between real and synthesized images that also affects the accuracy of camera localization. To reduce the impact of this gap, we introduce “REal-to-Synthetic Transform (REST).” REST is an autoencoder-like network that converts real features to their synthetic counterpart. The converted features can then be matched against the accumulated database for robust camera localization. In our experiments REST improved feature matching accuracy under variable lighting conditions by approximately 30%. Moreover, our system outperforms state of the art CNN-based camera localization methods trained with synthetic images. We believe our method could be used to initialize local tracking and to simplify data accumulation for lighting robust localization. |
Tasks | Camera Localization |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09448v1 |
http://arxiv.org/pdf/1803.09448v1.pdf | |
PWC | https://paperswithcode.com/paper/rest-real-to-synthetic-transform-for |
Repo | |
Framework | |
Satellite Image Forgery Detection and Localization Using GAN and One-Class Classifier
Title | Satellite Image Forgery Detection and Localization Using GAN and One-Class Classifier |
Authors | Sri Kalyan Yarlagadda, David Güera, Paolo Bestagini, Fengqing Maggie Zhu, Stefano Tubaro, Edward J. Delp |
Abstract | Current satellite imaging technology enables shooting high-resolution pictures of the ground. Like any other kind of digital image, overhead pictures can also be easily forged. However, common image forensic techniques are often developed for consumer camera images, which strongly differ in their nature from satellite ones (e.g., compression schemes, post-processing, sensors, etc.). Therefore, many accurate state-of-the-art forensic algorithms are bound to fail if blindly applied to overhead image analysis. Development of novel forensic tools for satellite images is paramount to assess their authenticity and integrity. In this paper, we propose an algorithm for satellite image forgery detection and localization. Specifically, we consider the scenario in which pixels within a region of a satellite image are replaced to add or remove an object from the scene. Our algorithm works under the assumption that no forged images are available for training. Using a generative adversarial network (GAN), we learn a feature representation of pristine satellite images. A one-class support vector machine (SVM) is trained on these features to determine their distribution. Finally, image forgeries are detected as anomalies. The proposed algorithm is validated against different kinds of satellite images containing forgeries of different size and shape. |
Tasks | One-class classifier |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04881v1 |
http://arxiv.org/pdf/1802.04881v1.pdf | |
PWC | https://paperswithcode.com/paper/satellite-image-forgery-detection-and |
Repo | |
Framework | |
Syntactico-Semantic Reasoning using PCFG, MEBN & PP Attachment Ambiguity
Title | Syntactico-Semantic Reasoning using PCFG, MEBN & PP Attachment Ambiguity |
Authors | Shrinivasan R Patnaik Patnaikuni, Dr. Sachin R Gengaje |
Abstract | Probabilistic context free grammars (PCFG) have been the core of probabilistic reasoning based parsers for several years, especially in the context of NLP. Multi-entity Bayesian networks (MEBN), a first-order logic probabilistic reasoning methodology, are a widely adopted method for uncertainty reasoning. Further, upper ontologies like Probabilistic Ontology Web Language (PR-OWL), built using MEBN, take care of probabilistic ontologies which model and capture the uncertainties inherent in the domain’s semantic information. The paper attempts to establish a link between probabilistic reasoning in PCFG and MEBN by proposing a formal description of PCFG driven by MEBN, leading to the usage of PR-OWL modeled ontologies in PCFG parsers. Furthermore, the paper outlines an approach to resolve prepositional phrase (PP) attachment ambiguity using the proposed mapping between PCFG and MEBN. |
Tasks | |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07607v2 |
http://arxiv.org/pdf/1809.07607v2.pdf | |
PWC | https://paperswithcode.com/paper/syntactico-semantic-reasoning-using-pcfg-mebn |
Repo | |
Framework | |
A New SVDD-Based Multivariate Non-parametric Process Capability Index
Title | A New SVDD-Based Multivariate Non-parametric Process Capability Index |
Authors | Deovrat Kakde, Arin Chaudhuri, Diana Shaw |
Abstract | Process capability index (PCI) is a commonly used statistic to measure the ability of a process to operate within the given specifications or to produce products which meet the required quality specifications. PCI can be univariate or multivariate depending upon the number of process specifications or quality characteristics of interest. Most PCIs make distributional assumptions which are often unrealistic in practice. This paper proposes a new multivariate non-parametric process capability index. This index can be used when the distribution of the process or quality parameters is either unknown or does not follow commonly used distributions such as multivariate normal. |
Tasks | |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05561v1 |
http://arxiv.org/pdf/1811.05561v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-svdd-based-multivariate-non-parametric |
Repo | |
Framework | |
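For context on the distributional assumptions mentioned in the abstract above, the classical univariate indices Cp and Cpk can be computed as below. This sketch shows only the textbook normal-theory indices, not the SVDD-based index the paper proposes; the sample data and specification limits are made up for illustration.

```python
from statistics import mean, stdev


def cp(data, lsl, usl):
    """Potential capability: specification width over six sample std devs."""
    return (usl - lsl) / (6 * stdev(data))


def cpk(data, lsl, usl):
    """Actual capability: distance from the mean to the nearer limit."""
    mu, sigma = mean(data), stdev(data)
    return min(usl - mu, mu - lsl) / (3 * sigma)


# Hypothetical measurements with mean 10 and sample standard deviation 2.
measurements = [8.0, 10.0, 12.0]
centered_cp = cp(measurements, lsl=4.0, usl=16.0)
off_center_cpk = cpk(measurements, lsl=4.0, usl=14.0)
```

Both indices interpret the spread as roughly three standard deviations either side of the mean, which is only meaningful when the data are approximately normal; that is the kind of assumption a non-parametric index avoids.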
An AI-driven Malfunction Detection Concept for NFV Instances in 5G
Title | An AI-driven Malfunction Detection Concept for NFV Instances in 5G |
Authors | Julian Ahrens, Mathias Strufe, Lia Ahrens, Hans D. Schotten |
Abstract | Efficient network management is one of the key challenges of the constantly growing and increasingly complex wide area networks (WAN). The paradigm shift towards virtualized (NFV) and software defined networks (SDN) in the next generation of mobile networks (5G), as well as the latest scientific insights in the field of Artificial Intelligence (AI), enable the transition from today's manually managed networks to fully autonomic and dynamic self-organized networks (SON). This helps to meet the KPIs and at the same time reduce operational costs (OPEX). In this paper, an AI-driven concept is presented for malfunction detection in NFV applications with the help of semi-supervised learning. For this purpose, a profile of the application under test is created. This profile is then used as a reference to detect abnormal behaviour. For example, if there is a bug in the updated version of the app, it is now possible to react autonomously and roll back the NFV app to a previous version in order to avoid network outages. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05796v1 |
http://arxiv.org/pdf/1804.05796v1.pdf | |
PWC | https://paperswithcode.com/paper/an-ai-driven-malfunction-detection-concept |
Repo | |
Framework | |
Greedy Graph Searching for Vascular Tracking in Angiographic Image Sequences
Title | Greedy Graph Searching for Vascular Tracking in Angiographic Image Sequences |
Authors | Huihui Fang, Jian Yang, Jianjun Zhu, Danni Ai, Yong Huang, Yurong Jiang, Hong Song, Yongtian Wang |
Abstract | Vascular tracking of angiographic image sequences is one of the most clinically important tasks in the diagnostic assessment and interventional guidance of cardiac disease. However, this task can be challenging to accomplish because of unsatisfactory angiography image quality and complex vascular structures. Thus, this study proposed a new greedy graph search-based method for vascular tracking. Each vascular branch is separated from the vasculature and is tracked independently. Then, all branches are combined using topology optimization, thereby resulting in complete vasculature tracking. A gray-based image registration method was applied to determine the tracking range, and the deformation field between two consecutive frames was calculated. The vascular branch was described using a vascular centerline extraction method with multi-probability fusion-based topology optimization. An undirected acyclic graph establishment technique was introduced. A greedy search method was proposed to acquire all possible paths in the graph that might match the tracked vascular branch. The final tracking result was selected by branch matching using dynamic time warping with a DAISY descriptor. The solution to the problem reflected both the spatial and textural information between successive frames. Experimental results demonstrated that the proposed method was effective and robust for vascular tracking, attaining an F1 score of 0.89 on a single branch dataset and 0.88 on a vessel tree dataset. This approach provided a universal solution to address the problem of filamentary structure tracking. |
Tasks | Image Registration |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.09940v1 |
http://arxiv.org/pdf/1805.09940v1.pdf | |
PWC | https://paperswithcode.com/paper/greedy-graph-searching-for-vascular-tracking |
Repo | |
Framework | |
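The branch-matching step in the abstract above relies on dynamic time warping (DTW). A minimal sketch of the standard DTW recurrence is given below, assuming each point along a branch is summarized by a single number; the paper compares DAISY descriptors instead, so the scalar features and the absolute-difference cost are simplifications chosen for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local mismatch between two samples
            # Extend the cheapest of: repeat b[j-1], repeat a[i-1], or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A branch profile and a stretched copy of it align with zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

Because DTW allows one sample to match several samples in the other sequence, a candidate path that is locally stretched or compressed between frames can still match the tracked branch closely.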
MEBN-RM: A Mapping between Multi-Entity Bayesian Network and Relational Model
Title | MEBN-RM: A Mapping between Multi-Entity Bayesian Network and Relational Model |
Authors | Cheol Young Park, Kathryn Blackmond Laskey |
Abstract | Multi-Entity Bayesian Network (MEBN) is a knowledge representation formalism combining Bayesian Networks (BN) with First-Order Logic (FOL). MEBN has sufficient expressive power for general-purpose knowledge representation and reasoning. Developing a MEBN model to support a given application is a challenge, requiring definition of entities, relationships, random variables, conditional dependence relationships, and probability distributions. When available, data can be invaluable both to improve performance and to streamline development. By far the most common format for available data is the relational database (RDB). Relational databases describe and organize data according to the Relational Model (RM). Developing a MEBN model from data stored in an RDB therefore requires mapping between the two formalisms. This paper presents MEBN-RM, a set of mapping rules between key elements of MEBN and RM. We identify links between the two languages (RM and MEBN) and define four levels of mapping from elements of RM to elements of MEBN. These definitions are implemented in the MEBN-RM algorithm, which converts a relational schema in RM to a partial MEBN model. Through this research, the software has been released as a MEBN-RM open-source software tool. The method is illustrated through two example use cases using MEBN-RM to develop MEBN models: a Critical Infrastructure Defense System and a Smart Manufacturing System. |
Tasks | |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02455v2 |
http://arxiv.org/pdf/1806.02455v2.pdf | |
PWC | https://paperswithcode.com/paper/mebn-rm-a-mapping-between-multi-entity |
Repo | |
Framework | |
Task-generalizable Adversarial Attack based on Perceptual Metric
Title | Task-generalizable Adversarial Attack based on Perceptual Metric |
Authors | Muzammal Naseer, Salman H. Khan, Shafin Rahman, Fatih Porikli |
Abstract | Deep neural networks (DNNs) can be easily fooled by adding human imperceptible perturbations to the images. These perturbed images are known as ‘adversarial examples’ and pose a serious threat to security and safety critical systems. A litmus test for the strength of adversarial examples is their transferability across different DNN models in a black box setting (i.e. when the target model’s architecture and parameters are not known to the attacker). Current attack algorithms that seek to enhance adversarial transferability work on the decision level, i.e. generate perturbations that alter the network decisions. This leads to two key limitations: (a) An attack is dependent on the task-specific loss function (e.g. softmax cross-entropy for object recognition) and therefore does not generalize beyond its original task. (b) The adversarial examples are specific to the network architecture and demonstrate poor transferability to other network architectures. We propose a novel approach to create adversarial examples that can broadly fool different networks on multiple tasks. Our approach is based on the following intuition: “Perceptual metrics based on neural network features are highly generalizable and show excellent performance in measuring and stabilizing input distortions. Therefore an ideal attack that creates maximum distortions in the network feature space should realize highly transferable examples”. We report extensive experiments to show how adversarial examples generalize across multiple networks for classification, object detection and segmentation tasks. |
Tasks | Adversarial Attack, Object Detection, Object Recognition |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.09020v3 |
http://arxiv.org/pdf/1811.09020v3.pdf | |
PWC | https://paperswithcode.com/paper/distorting-neural-representations-to-generate |
Repo | |
Framework | |
Accelerating SGD with momentum for over-parameterized learning
Title | Accelerating SGD with momentum for over-parameterized learning |
Authors | Chaoyue Liu, Mikhail Belkin |
Abstract | Nesterov SGD is widely used for training modern neural networks and other machine learning models. Yet, its advantages over SGD have not been theoretically clarified. Indeed, as we show in our paper, both theoretically and empirically, Nesterov SGD with any parameter selection does not in general provide acceleration over ordinary SGD. Furthermore, Nesterov SGD may diverge for step sizes that ensure convergence of ordinary SGD. This is in contrast to the classical results in the deterministic scenario, where the same step size ensures accelerated convergence of Nesterov’s method over optimal gradient descent. To address the non-acceleration issue, we introduce a compensation term to Nesterov SGD. The resulting algorithm, which we call MaSS, converges for the same step sizes as SGD. We prove that MaSS obtains an accelerated convergence rate over SGD for any mini-batch size in the linear setting. For full batch, the convergence rate of MaSS matches the well-known accelerated rate of Nesterov’s method. We also analyze the practically important question of the dependence of the convergence rate and optimal hyper-parameters on the mini-batch size, demonstrating three distinct regimes: linear scaling, diminishing returns and saturation. Experimental evaluation of MaSS for several standard architectures of deep networks, including ResNet and convolutional networks, shows improved performance over SGD, Nesterov SGD and Adam. |
Tasks | |
Published | 2018-10-31 |
URL | https://arxiv.org/abs/1810.13395v5 |
https://arxiv.org/pdf/1810.13395v5.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-stochastic-training-for-over |
Repo | |
Framework | |
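The classical deterministic result that the abstract above contrasts with can be illustrated on a small quadratic. This sketch covers only full-gradient Nesterov momentum versus plain gradient descent; it does not implement MaSS, whose compensation term is not specified in the abstract. The objective, starting point, and the standard momentum setting for strongly convex quadratics are illustrative choices.

```python
def gradient(x, curvatures):
    """Gradient of f(x) = 0.5 * sum(c_i * x_i^2)."""
    return [c * xi for c, xi in zip(curvatures, x)]


def gradient_descent(x, curvatures, step, iters):
    for _ in range(iters):
        g = gradient(x, curvatures)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def nesterov(x, curvatures, step, momentum, iters):
    prev = list(x)
    for _ in range(iters):
        # Look-ahead point: extrapolate along the last displacement.
        y = [xi + momentum * (xi - pi) for xi, pi in zip(x, prev)]
        g = gradient(y, curvatures)
        prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x


curvatures = [1.0, 100.0]  # smallest and largest curvature: mu = 1, L = 100
step = 1.0 / max(curvatures)  # the same step size 1/L for both methods
kappa = max(curvatures) / min(curvatures)
momentum = (kappa ** 0.5 - 1) / (kappa ** 0.5 + 1)

x_gd = gradient_descent([1.0, 1.0], curvatures, step, 100)
x_nag = nesterov([1.0, 1.0], curvatures, step, momentum, 100)
```

With exact gradients and the same step size, the Nesterov iterate ends up far closer to the minimizer at the origin along the low-curvature direction than the gradient descent iterate. The abstract's point is that this advantage does not carry over automatically once the gradients are stochastic.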
Intermediate Level Adversarial Attack for Enhanced Transferability
Title | Intermediate Level Adversarial Attack for Enhanced Transferability |
Authors | Qian Huang, Zeqi Gu, Isay Katsman, Horace He, Pian Pawakapan, Zhiqiu Lin, Serge Belongie, Ser-Nam Lim |
Abstract | Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples may be overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box transfer attacks to other target models. This leads us to introduce the Intermediate Level Attack (ILA), which attempts to fine-tune an existing adversarial example for greater black-box transferability by increasing its perturbation on a pre-specified layer of the source model. We show that our method can effectively achieve this goal and that we can decide a nearly-optimal layer of the source model to perturb without any knowledge of the target models. |
Tasks | Adversarial Attack |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08458v1 |
http://arxiv.org/pdf/1811.08458v1.pdf | |
PWC | https://paperswithcode.com/paper/intermediate-level-adversarial-attack-for |
Repo | |
Framework | |
Testability of high-dimensional linear models with non-sparse structures
Title | Testability of high-dimensional linear models with non-sparse structures |
Authors | Jelena Bradic, Jianqing Fan, Yinchu Zhu |
Abstract | Understanding statistical inference under possibly non-sparse high-dimensional models has gained much interest recently. For a given component of the regression coefficient, we show that the difficulty of the problem depends on the sparsity of the corresponding row of the precision matrix of the covariates, not the sparsity of the regression coefficients. We develop new concepts of uniform and essentially uniform non-testability that allow the study of limitations of tests across a broad set of alternatives. Uniform non-testability identifies a collection of alternatives such that the power of any test, against any alternative in the group, is asymptotically at most equal to the nominal size. Implications of the new constructions include new minimax testability results that, in sharp contrast to the current results, do not depend on the sparsity of the regression parameters. We identify new tradeoffs between testability and feature correlation. In particular, we show that, in models with weak feature correlations, minimax lower bound can be attained by a test whose power has the $\sqrt{n}$ rate, regardless of the size of the model sparsity. |
Tasks | |
Published | 2018-02-26 |
URL | https://arxiv.org/abs/1802.09117v3 |
https://arxiv.org/pdf/1802.09117v3.pdf | |
PWC | https://paperswithcode.com/paper/testability-of-high-dimensional-linear-models |
Repo | |
Framework | |
Never look back - A modified EnKF method and its application to the training of neural networks without back propagation
Title | Never look back - A modified EnKF method and its application to the training of neural networks without back propagation |
Authors | Eldad Haber, Felix Lucka, Lars Ruthotto |
Abstract | In this work, we present a new derivative-free optimization method and investigate its use for training neural networks. Our method is motivated by the Ensemble Kalman Filter (EnKF), which has been used successfully for solving optimization problems that involve large-scale, highly nonlinear dynamical systems. A key benefit of the EnKF method is that it requires only the evaluation of the forward propagation but not its derivatives. Hence, in the context of neural networks, it alleviates the need for back propagation and reduces the memory consumption dramatically. However, the method is not a pure “black-box” global optimization heuristic as it efficiently utilizes the structure of typical learning problems. Promising first results of the EnKF for training deep neural networks have been presented recently by Kovachki and Stuart. We propose an important modification of the EnKF that enables us to prove convergence of our method to the minimizer of a strongly convex function. Our method also bears similarity with implicit filtering and we demonstrate its potential for minimizing highly oscillatory functions using a simple example. Further, we provide numerical examples that demonstrate the potential of our method for training deep neural networks. |
Tasks | |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08034v2 |
http://arxiv.org/pdf/1805.08034v2.pdf | |
PWC | https://paperswithcode.com/paper/never-look-back-a-modified-enkf-method-and |
Repo | |
Framework | |
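The derivative-free character of ensemble Kalman methods, as described in the abstract above, can be sketched with the basic ensemble Kalman inversion update for a scalar parameter. This is the plain textbook update, not the modified method proposed in the paper; the linear forward model, the target value, the initial ensemble, and the noise variance are hypothetical.

```python
from statistics import fmean


def forward(u):
    """Hypothetical forward model; only evaluations are used, no derivatives."""
    return 2.0 * u


def eki_step(ensemble, y, noise_var):
    """One ensemble Kalman inversion update for a scalar parameter."""
    preds = [forward(u) for u in ensemble]
    u_bar, g_bar = fmean(ensemble), fmean(preds)
    # Empirical cross-covariance and prediction variance from the ensemble.
    c_ug = fmean((u - u_bar) * (g - g_bar) for u, g in zip(ensemble, preds))
    c_gg = fmean((g - g_bar) ** 2 for g in preds)
    gain = c_ug / (c_gg + noise_var)
    # Move every member toward parameters whose prediction matches the data.
    return [u + gain * (y - g) for u, g in zip(ensemble, preds)]


ensemble = [0.0, 1.0, 3.0]
for _ in range(5):
    ensemble = eki_step(ensemble, y=4.0, noise_var=0.1)
estimate = fmean(ensemble)  # should approach 2.0, since forward(2.0) == 4.0
```

The ensemble statistics stand in for the derivative of the forward model, which is why no back propagation is needed; the price is one forward evaluation per ensemble member at every step.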