January 31, 2020


Paper Group ANR 53



MUTE: Data-Similarity Driven Multi-hot Target Encoding for Neural Network Design

Title MUTE: Data-Similarity Driven Multi-hot Target Encoding for Neural Network Design
Authors Mayoore S. Jaiswal, Bumboo Kang, Jinho Lee, Minsik Cho
Abstract Target encoding is an effective technique to deliver better performance for conventional machine learning methods, and recently, for deep neural networks as well. However, existing target encoding approaches require a significant increase in learning capacity, and thus demand higher computation power and more training data. In this paper, we present a novel and efficient target encoding scheme, MUTE, which improves both the generalizability and robustness of a target model by understanding the inter-class characteristics of a target dataset. By extracting the confusion level between the target classes in a dataset, MUTE strategically optimizes the Hamming distances among target encodings. Such optimized target encodings offer higher classification strength for neural network models with negligible computation overhead and without increasing the model size. When MUTE is applied to popular image classification networks and datasets, our experimental results show that it offers better generalization and defense against noise and adversarial attacks than existing solutions.
Tasks Image Classification
Published 2019-10-15
URL https://arxiv.org/abs/1910.07042v1
PDF https://arxiv.org/pdf/1910.07042v1.pdf
PWC https://paperswithcode.com/paper/mute-data-similarity-driven-multi-hot-target
Repo
Framework
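
The heart of MUTE is assigning each class a multi-hot code whose Hamming distance to other codes grows with how easily the classes are confused. The sketch below is a minimal greedy illustration of that idea, not the paper's algorithm; the confusion matrix, code length, and candidate-pool heuristic are all assumptions.

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two binary code vectors."""
    return int(np.sum(a != b))

def mute_codes(confusion, code_len=16, n_candidates=2000, seed=0):
    """Greedy multi-hot code assignment: the more two classes are
    confused, the further apart their codes are pushed in Hamming
    distance. `confusion[i, j]` ~ how often class i is predicted as j."""
    rng = np.random.default_rng(seed)
    k = confusion.shape[0]
    w = (confusion + confusion.T) / 2.0      # symmetric confusion weights
    np.fill_diagonal(w, 0.0)

    pool = rng.integers(0, 2, size=(n_candidates, code_len))
    codes = np.zeros((k, code_len), dtype=int)
    assigned = []
    for c in np.argsort(-w.sum(axis=1)):     # most-confused classes first
        if not assigned:
            codes[c] = pool[0]
        else:
            # Pick the candidate code with the largest confusion-weighted
            # distance to the codes already assigned.
            scores = [sum(w[c, a] * hamming(cand, codes[a]) for a in assigned)
                      for cand in pool]
            codes[c] = pool[int(np.argmax(scores))]
        assigned.append(c)
    return codes

# Example: a 4-class dataset where classes 0 and 1 are often confused.
conf = np.array([[90, 8, 1, 1], [7, 91, 1, 1], [1, 1, 97, 1], [1, 1, 2, 96]])
print(mute_codes(conf))
```

A network would then presumably be trained with per-bit sigmoid outputs and binary cross-entropy against these codes instead of one-hot targets, with a test image assigned to the class whose code is nearest.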

Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds

Title Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds
Authors Zihao Liu, Xiaowei Xu, Tao Liu, Qi Liu, Yanzhi Wang, Yiyu Shi, Wujie Wen, Meiping Huang, Haiyun Yuan, Jian Zhuang
Abstract Cloud-based medical image analysis has become popular recently due to the high computational complexity of various deep neural network (DNN) based frameworks and the increasingly large volume of medical images that need to be processed. It has been demonstrated that, for medical images, transmission from local sites to the cloud is much more expensive than the computation in the cloud itself. To address this, 3D image compression techniques have been widely applied to reduce the data traffic. However, most existing image compression techniques are developed around human vision, i.e., they are designed to minimize distortions that can be perceived by human eyes. In this paper we use deep learning based medical image segmentation as a vehicle and demonstrate that, interestingly, machines and humans view compression quality differently. Medical images compressed with good quality w.r.t. human vision may result in inferior segmentation accuracy. We then design a machine vision oriented 3D image compression framework tailored for segmentation using DNNs. Our method automatically extracts and retains the image features that are most important to the segmentation. Comprehensive experiments on widely adopted segmentation frameworks with the HVSMR 2016 challenge dataset show that our method can achieve significantly higher segmentation accuracy at the same compression rate, or a much better compression rate under the same segmentation accuracy, when compared with the existing JPEG 2000 method. To the best of the authors’ knowledge, this is the first machine vision guided medical image compression framework for segmentation in the clouds.
Tasks Image Compression, Medical Image Segmentation, Semantic Segmentation
Published 2019-04-09
URL http://arxiv.org/abs/1904.08487v1
PDF http://arxiv.org/pdf/1904.08487v1.pdf
PWC https://paperswithcode.com/paper/190408487
Repo
Framework
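
The claim that human-oriented compression can silently hurt segmentation is easy to probe with a toy loop: compress a volume at several rates and track segmentation overlap. The snippet below is such a probe, with a uniform quantizer standing in for a real codec and a threshold standing in for the segmentation DNN; both are placeholders, not the paper's method.

```python
import numpy as np

def dice(pred, truth):
    """Dice overlap between two binary masks."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + 1e-8)

def quantize(volume, levels):
    """Crude stand-in for a lossy codec: uniform intensity quantization.
    Fewer levels ~ higher compression rate."""
    lo, hi = volume.min(), volume.max()
    q = np.round((volume - lo) / (hi - lo + 1e-8) * (levels - 1))
    return q / (levels - 1) * (hi - lo) + lo

def segment(volume, threshold=0.5):
    """Placeholder segmenter; in the paper a DNN plays this role."""
    return volume > threshold

rng = np.random.default_rng(0)
vol = rng.random((32, 32, 32))
truth = segment(vol)  # pretend labels come from the uncompressed volume
for levels in (256, 16, 4, 2):
    d = dice(segment(quantize(vol, levels)), truth)
    print(f"{levels:3d} intensity levels -> Dice {d:.3f}")
```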

Audits as Evidence: Experiments, Ensembles, and Enforcement

Title Audits as Evidence: Experiments, Ensembles, and Enforcement
Authors Patrick Kline, Christopher Walters
Abstract We develop tools for utilizing correspondence experiments to detect illegal discrimination by individual employers. Employers violate US employment law if their propensity to contact applicants depends on protected characteristics such as race or sex. We establish identification of higher moments of the causal effects of protected characteristics on callback rates as a function of the number of fictitious applications sent to each job ad. These moments are used to bound the fraction of jobs that illegally discriminate. Applying our results to three experimental datasets, we find evidence of significant employer heterogeneity in discriminatory behavior, with the standard deviation of gaps in job-specific callback probabilities across protected groups averaging roughly twice the mean gap. In a recent experiment manipulating racially distinctive names, we estimate that at least 85% of jobs that contact both of two white applications and neither of two black applications are engaged in illegal discrimination. To assess the tradeoff between type I and II errors presented by these patterns, we consider the performance of a series of decision rules for investigating suspicious callback behavior under a simple two-type model that rationalizes the experimental data. Though, in our preferred specification, only 17% of employers are estimated to discriminate on the basis of race, we find that an experiment sending 10 applications to each job would enable accurate detection of 7-10% of discriminators while falsely accusing fewer than 0.2% of non-discriminators. A minimax decision rule acknowledging partial identification of the joint distribution of callback rates yields higher error rates but more investigations than our baseline two-type model. Our results suggest illegal labor market discrimination can be reliably monitored with relatively small modifications to existing audit designs.
Tasks
Published 2019-07-15
URL https://arxiv.org/abs/1907.06622v2
PDF https://arxiv.org/pdf/1907.06622v2.pdf
PWC https://paperswithcode.com/paper/audits-as-evidence-experiments-ensembles-and
Repo
Framework
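
The detection logic can be illustrated with a small Monte Carlo version of the two-type model: simulate jobs that either do or do not discriminate, send a fixed number of fictitious applications per group, and flag jobs that contact every white applicant and no black applicant. All parameter values below are illustrative, not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n_jobs=200_000, share_disc=0.17, p_white=0.10,
             p_black_if_disc=0.0, apps_per_group=2):
    """Two-type model: non-discriminating jobs call back both groups at
    p_white; discriminating jobs call back black applicants at a lower
    rate. Flag a job that contacts ALL white and NO black applications."""
    is_disc = rng.random(n_jobs) < share_disc
    p_black = np.where(is_disc, p_black_if_disc, p_white)
    white = rng.binomial(apps_per_group, p_white, n_jobs)
    black = rng.binomial(apps_per_group, p_black, n_jobs)
    flagged = (white == apps_per_group) & (black == 0)
    return flagged[is_disc].mean(), flagged[~is_disc].mean()

tp, fp = simulate()
print(f"discriminators flagged:     {tp:.4f}")
print(f"non-discriminators flagged: {fp:.4f}")
```

With only two applications per group the flag fires rarely, which is one way to see why the paper evaluates designs that send more applications to each job.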

Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning

Title Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning
Authors Artur d’Avila Garcez, Marco Gori, Luis C. Lamb, Luciano Serafini, Michael Spranger, Son N. Tran
Abstract Current advances in Artificial Intelligence and machine learning in general, and deep learning in particular, have had unprecedented impact not only across research communities but also over popular media channels. However, concerns about the interpretability and accountability of AI have been raised by influential thinkers. Despite the recent impact of AI, several works have identified the need for principled knowledge representation and reasoning mechanisms integrated with deep learning-based systems to provide sound and explainable models. Neural-symbolic computing aims at integrating, as foreseen by Valiant, the two most fundamental cognitive abilities: the ability to learn from the environment, and the ability to reason from what has been learned. Neural-symbolic computing has been an active topic of research for many years, reconciling the advantages of robust learning in neural networks with the reasoning and interpretability of symbolic representations. In this paper, we survey recent accomplishments of neural-symbolic computing as a principled methodology for integrated machine learning and reasoning. We illustrate the effectiveness of the approach by outlining the main characteristics of the methodology: principled integration of neural learning with symbolic knowledge representation and reasoning, allowing for the construction of explainable AI systems. The insights provided by neural-symbolic computing shed new light on the increasingly prominent need for interpretable and accountable AI systems.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1905.06088v1
PDF https://arxiv.org/pdf/1905.06088v1.pdf
PWC https://paperswithcode.com/paper/neural-symbolic-computing-an-effective
Repo
Framework

On Transformations in Stochastic Gradient MCMC

Title On Transformations in Stochastic Gradient MCMC
Authors Soma Yokoi, Takuma Otsuka, Issei Sato
Abstract Stochastic gradient Langevin dynamics (SGLD) is a computationally efficient sampler for Bayesian posterior inference given a large-scale dataset. Although SGLD is designed for unbounded random variables, many practical models incorporate variables with boundaries, such as non-negative ones or those in a finite interval. To bridge this gap, we consider mapping unbounded samples into the target interval. This paper reveals that several mapping approaches commonly used in the literature produce erroneous samples, from both theoretical and empirical perspectives. We show that a change of random variable using an invertible Lipschitz mapping function overcomes this pitfall and attains weak convergence. Experiments demonstrate its efficacy for widely used models with bounded latent variables, including Bayesian non-negative matrix factorization and binary neural networks.
Tasks
Published 2019-03-07
URL https://arxiv.org/abs/1903.02750v2
PDF https://arxiv.org/pdf/1903.02750v2.pdf
PWC https://paperswithcode.com/paper/on-transformations-in-stochastic-gradient
Repo
Framework
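
The fix the abstract describes amounts to running Langevin dynamics on an unbounded variable and pushing it through a smooth invertible map, while keeping the log-Jacobian in the potential. Below is a minimal sketch for a non-negative target, assuming a Gamma(2, 1) density and a softplus map; it uses the full gradient (the minibatch variant would replace it with a stochastic estimate) and is an illustration of the change-of-variables idea, not the paper's exact construction.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    return np.logaddexp(0.0, x)   # numerically stable log(1 + e^x)

def sgld_nonneg(a=2.0, b=1.0, step=1e-3, n_steps=100_000, seed=0):
    """Langevin sampling of a non-negative target, Gamma(a, b), via the
    change of variables theta = softplus(x): x is sampled unbounded, and
    the log-Jacobian log sigmoid(x) is kept in the potential."""
    rng = np.random.default_rng(seed)
    x, samples = 0.0, []
    for t in range(n_steps):
        theta, dtheta = softplus(x), sigmoid(x)
        # d/dx [ (a-1) log theta - b theta + log sigmoid(x) ]
        grad = (a - 1.0) * dtheta / theta - b * dtheta + (1.0 - sigmoid(x))
        x += 0.5 * step * grad + np.sqrt(step) * rng.standard_normal()
        if t > n_steps // 2:
            samples.append(softplus(x))
    return np.array(samples)

s = sgld_nonneg()
print(f"sample mean {s.mean():.2f}  (Gamma(2, 1) has mean 2.0)")
```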

Reducing Uncertainty in Undersampled MRI Reconstruction with Active Acquisition

Title Reducing Uncertainty in Undersampled MRI Reconstruction with Active Acquisition
Authors Zizhao Zhang, Adriana Romero, Matthew J. Muckley, Pascal Vincent, Lin Yang, Michal Drozdzal
Abstract The goal of MRI reconstruction is to restore a high-fidelity image from partially observed measurements. This partial view naturally induces reconstruction uncertainty that can only be reduced by acquiring additional measurements. In this paper, we present a novel method for MRI reconstruction that, at inference time, dynamically selects the measurements to take and iteratively refines the prediction in order to best reduce the reconstruction error and, thus, its uncertainty. We validate our method on a large-scale knee MRI dataset, as well as on ImageNet. Results show that (1) our system successfully outperforms active acquisition baselines; (2) our uncertainty estimates correlate with error maps; and (3) our ResNet-based architecture surpasses standard pixel-to-pixel models in the task of MRI reconstruction. The proposed method not only produces high-quality reconstructions but also paves the way towards more applicable solutions for accelerating MRI.
Tasks
Published 2019-02-08
URL http://arxiv.org/abs/1902.03051v1
PDF http://arxiv.org/pdf/1902.03051v1.pdf
PWC https://paperswithcode.com/paper/reducing-uncertainty-in-undersampled-mri
Repo
Framework
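
A toy version of the active-acquisition loop: reconstruct from the k-space columns measured so far, score the unmeasured columns, and acquire the highest-scoring one. The zero-filled reconstruction and the energy-based score below are crude stand-ins for the paper's learned reconstructor and uncertainty estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))            # stand-in for a ground-truth slice
kspace = np.fft.fft2(img)

measured = np.zeros(64, dtype=bool)   # which k-space columns were acquired
measured[:4] = measured[-4:] = True   # start from low frequencies

def reconstruct(mask):
    """Zero-filled inverse FFT; the paper uses a learned reconstructor."""
    return np.abs(np.fft.ifft2(np.where(mask[None, :], kspace, 0)))

for step in range(16):
    recon = reconstruct(measured)
    # Crude uncertainty proxy: energy the current estimate places in
    # columns that were never actually measured.
    score = np.abs(np.fft.fft2(recon)).sum(axis=0)
    score[measured] = -np.inf
    nxt = int(np.argmax(score))
    measured[nxt] = True              # "acquire" the selected column
    err = np.mean((reconstruct(measured) - img) ** 2)
    print(f"acquired column {nxt:2d}  MSE {err:.6f}")
```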

An anomaly prediction framework for financial IT systems using hybrid machine learning methods

Title An anomaly prediction framework for financial IT systems using hybrid machine learning methods
Authors Jingwen Wang, Jingxin Liu, Juntao Pu, Qinghong Yang, Zhongchen Miao, Jian Gao, You Song
Abstract In the financial field, a robust software system is of vital importance to ensure the smooth operation of financial transactions. However, many financial corporations still depend on operators to identify and eliminate system failures when financial software systems break down. This traditional approach is time-consuming and extremely inefficient. To improve the efficiency and accuracy of system failure detection, and thereby reduce the impact of system failures on financial services, we propose a novel machine learning-based framework to predict the occurrence of system exceptions and failures in a financial software system. In particular, we first extract rich information from system logs and eliminate noise in the data. The cleaned data is then fed into our proposed anomaly prediction framework, which consists of three modules: a key performance indicator (KPI) prediction module, an anomaly identification module, and a severity classification module. Notably, we design a hierarchical architecture of alarm classifiers to alleviate the influence of the class-imbalance problem on overall performance. Empirically, the experimental results demonstrate the superior performance of our proposed method on a real-world financial software system log dataset.
Tasks Time Series, Time Series Prediction
Published 2019-07-30
URL https://arxiv.org/abs/1907.12778v3
PDF https://arxiv.org/pdf/1907.12778v3.pdf
PWC https://paperswithcode.com/paper/an-alarm-prediction-framework-for-financial
Repo
Framework
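
The three-module pipeline can be mirrored in miniature: forecast each KPI, flag large residuals, and grade the flagged points by magnitude. In the sketch below a moving average stands in for the KPI prediction module and simple thresholds stand in for the learned anomaly and severity classifiers; all of these are placeholders, not the paper's models.

```python
import numpy as np

def forecast_kpi(series, window=12):
    """Module 1: one-step KPI forecast; a moving average stands in for
    the paper's learned predictor."""
    return np.convolve(series, np.ones(window) / window, mode="valid")[:-1]

def identify_anomalies(series, preds, k=3.0):
    """Module 2: flag points whose residual exceeds k robust stdevs."""
    actual = series[len(series) - len(preds):]
    resid = actual - preds
    mad = np.median(np.abs(resid - np.median(resid))) + 1e-9
    return np.abs(resid) > k * 1.4826 * mad, resid

def classify_severity(resid, flags):
    """Module 3: crude severity tiers from residual magnitude."""
    sev = np.full(resid.shape, "normal", dtype=object)
    scale = np.abs(resid)[flags]
    if scale.size:
        cut = np.median(scale)
        sev[flags] = np.where(np.abs(resid)[flags] > 2 * cut,
                              "critical", "warning")
    return sev

rng = np.random.default_rng(0)
kpi = 100 + np.cumsum(rng.normal(0, 1, 500))
kpi[400] += 40                              # inject a failure spike
preds = forecast_kpi(kpi)
flags, resid = identify_anomalies(kpi, preds)
print(classify_severity(resid, flags)[np.flatnonzero(flags)])
```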

Leveraging Two Reference Functions in Block Bregman Proximal Gradient Descent for Non-convex and Non-Lipschitz Problems

Title Leveraging Two Reference Functions in Block Bregman Proximal Gradient Descent for Non-convex and Non-Lipschitz Problems
Authors Tianxiang Gao, Songtao Lu, Jia Liu, Chris Chu
Abstract In signal processing and data analytics applications, there is a wide class of non-convex problems whose objective functions do not satisfy the common global Lipschitz-continuous-gradient assumption (e.g., the nonnegative matrix factorization (NMF) problem). Recently, problems of this type with certain special structures have been solved by Bregman proximal gradient (BPG). This inspires us to propose a new Block-wise two-reference Bregman proximal gradient (B2B) method, which adopts two reference functions so that a closed-form solution to the Bregman projection is obtained. Based on relative smoothness, we prove the global convergence of the proposed algorithms for various block selection rules. In particular, we establish a global convergence rate of $O(\frac{\sqrt{s}}{\sqrt{k}})$ for the greedy and randomized block updating rules of B2B, which is $O(\sqrt{s})$ times faster than the cyclic variant, i.e., $O(\frac{s}{\sqrt{k}})$, where $s$ is the number of blocks and $k$ is the number of iterations. Multiple numerical results are provided to illustrate the superiority of the proposed B2B compared to state-of-the-art works in solving NMF problems.
Tasks
Published 2019-12-16
URL https://arxiv.org/abs/1912.07527v1
PDF https://arxiv.org/pdf/1912.07527v1.pdf
PWC https://paperswithcode.com/paper/leveraging-two-reference-functions-in-block
Repo
Framework
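
For orientation, the standard Bregman machinery the abstract builds on is summarized below; this is the generic relative-smoothness setup and the single-reference BPG step, not the paper's two-reference scheme.

```latex
% Bregman distance generated by a reference function h:
\[
  D_h(y, x) \;=\; h(y) - h(x) - \langle \nabla h(x),\, y - x \rangle .
\]
% f is L-smooth relative to h when L h - f is convex, i.e.
\[
  f(y) \;\le\; f(x) + \langle \nabla f(x),\, y - x \rangle + L\, D_h(y, x).
\]
% One Bregman proximal gradient step on the current block x:
\[
  x^{k+1} \;=\; \arg\min_{x}\; \langle \nabla f(x^k),\, x \rangle + g(x)
                + \tfrac{1}{\lambda}\, D_h(x, x^k),
\]
% which reduces to the ordinary proximal gradient step when
% h(x) = ||x||^2 / 2, since D_h is then the squared Euclidean distance.
```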

Mechanisms of Artistic Creativity in Deep Learning Neural Networks

Title Mechanisms of Artistic Creativity in Deep Learning Neural Networks
Authors Lonce Wyse
Abstract The generative capabilities of deep learning neural networks (DNNs) have been attracting increasing attention, both for the remarkable artifacts they produce and for the vast conceptual difference between how they are programmed and what they do. DNNs are ‘black boxes’ in which high-level behavior is not explicitly programmed, but emerges from the complex interactions of thousands or millions of simple computational elements. Their behavior is often described in anthropomorphic terms that can be misleading, seem magical, or stoke fears of an imminent singularity in which machines become ‘more’ than human. In this paper, we examine 5 distinct behavioral characteristics associated with creativity, and provide an example of a mechanism from generative deep learning architectures that gives rise to each of these characteristics. All 5 emerge from machinery built for purposes other than the creative characteristics they exhibit, mostly classification. These mechanisms of creative generative capability thus demonstrate a deep kinship to computational perceptual processes. By understanding how these different behaviors arise, we hope, on the one hand, to take the magic out of anthropomorphic descriptions and, on the other, to build a deeper appreciation of machinic forms of creativity on their own terms that will allow us to nurture their further development.
Tasks
Published 2019-06-30
URL https://arxiv.org/abs/1907.00321v1
PDF https://arxiv.org/pdf/1907.00321v1.pdf
PWC https://paperswithcode.com/paper/mechanisms-of-artistic-creativity-in-deep
Repo
Framework

Computational Limitations in Robust Classification and Win-Win Results

Title Computational Limitations in Robust Classification and Win-Win Results
Authors Akshay Degwekar, Preetum Nakkiran, Vinod Vaikuntanathan
Abstract We continue the study of statistical/computational tradeoffs in learning robust classifiers, following the recent work of Bubeck, Lee, Price, and Razenshteyn, who showed examples of classification tasks where (a) an efficient robust classifier exists, in the small-perturbation regime; (b) a non-robust classifier can be learned efficiently; but (c) it is computationally hard to learn a robust classifier, assuming the hardness of factoring large numbers. The question of whether a robust classifier for their task exists in the large-perturbation regime seems related to important open questions in computational number theory. In this work, we extend their work in three directions. First, we demonstrate classification tasks where computationally efficient robust classification is impossible, even when computationally unbounded robust classifiers exist. For this, we rely on the existence of average-case hard functions. Second, we show hard-to-robustly-learn classification tasks in the large-perturbation regime. Namely, we show that even though an efficient classifier that is robust to large perturbations exists, it is computationally hard to learn any non-trivial robust classifier. Our first construction relies on the existence of one-way functions, and the second on the hardness of the learning parity with noise problem. In the latter setting, not only does a non-robust classifier exist, but also an efficient algorithm that generates fresh new labeled samples given access to polynomially many training examples (termed generation by Kearns et al. (1994)). Third, we show that any such counterexample implies the existence of cryptographic primitives such as one-way functions. This leads us to a win-win scenario: either we can learn an efficient robust classifier, or we can construct new instances of cryptographic primitives.
Tasks
Published 2019-02-04
URL https://arxiv.org/abs/1902.01086v2
PDF https://arxiv.org/pdf/1902.01086v2.pdf
PWC https://paperswithcode.com/paper/computational-limitations-in-robust
Repo
Framework

3FabRec: Fast Few-shot Face alignment by Reconstruction

Title 3FabRec: Fast Few-shot Face alignment by Reconstruction
Authors Bjoern Browatzki, Christian Wallraven
Abstract Current supervised frameworks for facial landmark detection require a large amount of training data and, due to their massive number of parameters, may suffer from overfitting to specific datasets. We introduce a semi-supervised method whose crucial idea is to first generate implicit knowledge about face appearance and shape from the large amounts of unlabeled face images available today. In a first, unsupervised stage, we train an adversarial autoencoder to reconstruct faces via a low-dimensional latent face-representation vector. In a second, supervised stage, we augment the generator-decoder pipeline with interleaved transfer layers in order to reconstruct both the face and a probabilistic landmark heatmap. We show that this framework (3FabRec) achieves state-of-the-art performance on popular benchmarks such as 300-W, AFLW, and WFLW. Importantly, due to the power of the implicit face representation, our framework achieves impressive landmark localization accuracy from only a few percent of the training data, down to as few as 10 images. As the interleaved layers add only a small number of parameters to the encoder, inference runs at several hundred FPS on a GPU.
Tasks Face Alignment, Facial Landmark Detection
Published 2019-11-24
URL https://arxiv.org/abs/1911.10448v1
PDF https://arxiv.org/pdf/1911.10448v1.pdf
PWC https://paperswithcode.com/paper/3fabrec-fast-few-shot-face-alignment-by
Repo
Framework
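
The two-stage design can be read off as: an autoencoder learned without labels, plus small interleaved layers that tap the decoder's intermediate features to emit landmark heatmaps. The sketch below is a toy stand-in for that wiring (the layer sizes, the 68-landmark head, and the tap point are all assumptions), not the 3FabRec architecture.

```python
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    """Stage 1 (unsupervised): encode faces to a latent vector and decode.
    A toy stand-in for the paper's adversarial autoencoder."""
    def __init__(self, latent=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.Flatten(), nn.Linear(64 * 16 * 16, latent))
        self.fc = nn.Linear(latent, 64 * 16 * 16)
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU())
        self.dec2 = nn.ConvTranspose2d(32, 3, 4, 2, 1)
        # Stage 2 (supervised): a small interleaved layer that taps the
        # decoder's intermediate features to emit landmark heatmaps.
        self.heatmap = nn.Conv2d(32, 68, 3, padding=1)

    def forward(self, x):
        z = self.enc(x)
        h = self.fc(z).view(-1, 64, 16, 16)
        mid = self.dec1(h)
        recon = torch.sigmoid(self.dec2(mid))
        heat = self.heatmap(mid)          # 68 landmark heatmaps at 32x32
        return recon, heat

x = torch.randn(2, 3, 64, 64)
recon, heat = TinyAE()(x)
print(recon.shape, heat.shape)            # (2,3,64,64) (2,68,32,32)
```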

Assessing Regulatory Risk in Personal Financial Advice Documents: a Pilot Study

Title Assessing Regulatory Risk in Personal Financial Advice Documents: a Pilot Study
Authors Wanita Sherchan, Simon Harris, Sue Ann Chen, Nebula Alam, Khoi-Nguyen Tran, Adam J. Makarucha, Christopher J. Butler
Abstract Assessing the regulatory compliance of personal financial advice is currently a complex manual process. In Australia, only 5%-15% of advice documents are audited annually, and 75% of these are found to be non-compliant (ASIC 2018b). This paper describes a pilot with an Australian government regulation agency in which Artificial Intelligence (AI) models based on techniques such as natural language processing (NLP), machine learning, and deep learning were developed to methodically characterise the regulatory risk status of personal financial advice documents. The solution provides traffic-light ratings of advice documents for various risk factors, enabling comprehensive coverage of documents in the review and allowing rapid identification of documents that are at high risk of non-compliance with government regulations. This pilot serves as a case study of public-private partnership in developing AI systems for government and the public sector.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.12580v1
PDF https://arxiv.org/pdf/1910.12580v1.pdf
PWC https://paperswithcode.com/paper/assessing-regulatory-risk-in-personal
Repo
Framework
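
One plausible reading of the traffic-light rating is a per-risk-factor classifier whose non-compliance probability is bucketed into green/amber/red. The sketch below illustrates that reading with a TF-IDF plus logistic regression pipeline on invented snippets; the model, features, labels, and thresholds are all assumptions, not the pilot's system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled advice snippets; real training data would be audited documents.
docs = ["fee disclosure complete and product replacement justified",
        "no statement of client objectives, switch advice unexplained",
        "risk profile recorded, insurance needs analysed",
        "commission not disclosed, best-interest duty not addressed"]
labels = [0, 1, 0, 1]                      # 1 = non-compliant on this factor

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(docs, labels)

def traffic_light(document, amber=0.4, red=0.7):
    """Map the model's non-compliance probability to a review rating."""
    p = clf.predict_proba([document])[0, 1]
    return "red" if p >= red else "amber" if p >= amber else "green"

print(traffic_light("product switch recommended without fee disclosure"))
```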

Defects Mitigation in Resistive Crossbars for Analog Vector Matrix Multiplication

Title Defects Mitigation in Resistive Crossbars for Analog Vector Matrix Multiplication
Authors Fan Zhang, Miao Hu
Abstract With storage and computation happening in the same place, computing in resistive crossbars minimizes data movement and avoids the memory bottleneck issue, leading to ultra-high energy efficiency for data-intensive applications. However, defects in crossbars severely affect computing accuracy. Existing solutions include re-training with defects and redundant designs, but they have limitations in practical implementations. In this work, we introduce row shuffling and output compensation to mitigate defects without re-training or redundant resistive crossbars. We also analyze the coupling effects of defects and circuit parasitics. Moreover, we study different combinations of methods to achieve the best trade-off between cost and performance. Our proposed methods can rescue up to 10% of defects in a ResNet-20 application without performance degradation.
Tasks
Published 2019-12-17
URL https://arxiv.org/abs/1912.07829v1
PDF https://arxiv.org/pdf/1912.07829v1.pdf
PWC https://paperswithcode.com/paper/defects-mitigation-in-resistive-crossbars-for
Repo
Framework
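
Both mitigations are easy to simulate: row shuffling permutes which logical weight row lands on which physical crossbar row so that stuck cells hit less important weights, and output compensation digitally adds back the contribution lost at known-defective cells. The sketch below assumes stuck-at-zero defects and an exactly known defect map, which makes compensation look perfect; in practice the map and cell values are estimates. The importance ranking is a hypothetical heuristic, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
W = rng.normal(0, 1, (n, n))               # logical weight matrix
defects = rng.random((n, n)) < 0.05        # stuck-at-zero physical cells
x = rng.normal(0, 1, n)
ideal = W.T @ x

def vmm(perm, compensate=False):
    """Physical row p stores logical row perm[p]; stuck cells read as 0.
    Compensation digitally adds back the known lost contributions."""
    W_phys, x_phys = W[perm], x[perm]
    y = np.where(defects, 0.0, W_phys).T @ x_phys
    if compensate:
        y += np.where(defects, W_phys, 0.0).T @ x_phys
    return y

identity = np.arange(n)
# Row shuffling: place high-magnitude logical rows on the physical rows
# with the fewest stuck cells (a simple greedy heuristic).
shuffled = np.empty(n, dtype=int)
shuffled[np.argsort(defects.sum(axis=1))] = np.argsort(-np.abs(W).sum(axis=1))

for name, perm, comp in [("no mitigation", identity, False),
                         ("row shuffling", shuffled, False),
                         ("shuffling + compensation", shuffled, True)]:
    err = np.linalg.norm(vmm(perm, comp) - ideal) / np.linalg.norm(ideal)
    print(f"{name:25s} relative error {err:.4f}")
```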

Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles

Title Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles
Authors Jelena Fiosina, Maksims Fiosins, Stefan Bonn
Abstract The lack of well-structured annotations in a growing amount of RNA expression data complicates data interoperability and reusability. Commonly used text-mining methods extract annotations from existing unstructured data descriptions and often provide inaccurate output that requires manual curation. Automatic data-based augmentation (generation of annotations on the basis of expression data) can considerably improve annotation quality and has not been well studied. We formulate automatic augmentation of small RNA-seq expression data as a classification problem and investigate deep learning (DL) and random forest (RF) approaches to solve it. We generate tissue and sex annotations from small RNA-seq expression data for tissues and cell lines of Homo sapiens. We validate our approach on 4243 annotated small RNA-seq samples from the Small RNA Expression Atlas (SEA) database. The average prediction accuracy is 98% for tissue groups (DL), 96.5% for tissues (DL), and 77% for sex (DL). The “one dataset out” average accuracy for tissue-group prediction is 83% (DL) and 59% (RF). On average, DL provides better results than RF and considerably improves classification performance on ‘unseen’ datasets.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.11943v1
PDF https://arxiv.org/pdf/1909.11943v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-and-random-forest-based
Repo
Framework
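
The "one dataset out" protocol is the interesting evaluation detail: entire studies are held out, so the classifier is scored on batch effects it never saw. A minimal sketch with synthetic profiles and a random forest follows; the data and labels are fabricated purely to show the split.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Toy stand-in for sRNA-seq profiles: samples x expression features,
# grouped by originating dataset (the unit held out during validation).
n, d = 300, 50
X = rng.normal(0, 1, (n, d))
tissue = rng.integers(0, 4, n)              # fake tissue-group labels
X[np.arange(n), tissue] += 3.0              # make the labels learnable
dataset_id = rng.integers(0, 10, n)         # which study a sample came from

# "One dataset out": hold out all samples of one study at a time, so the
# model is scored on a batch it never saw (harder than a random split).
accs = []
for ds in np.unique(dataset_id):
    test = dataset_id == ds
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[~test], tissue[~test])
    accs.append(clf.score(X[test], tissue[test]))
print(f"one-dataset-out accuracy: {np.mean(accs):.2f}")
```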

A framework for the extraction of Deep Neural Networks by leveraging public data

Title A framework for the extraction of Deep Neural Networks by leveraging public data
Authors Soham Pal, Yash Gupta, Aditya Shukla, Aditya Kanade, Shirish Shevade, Vinod Ganapathy
Abstract Machine learning models trained on confidential datasets are increasingly being deployed for profit. Machine Learning as a Service (MLaaS) has made such models easily accessible to end-users. Prior work has developed model extraction attacks, in which an adversary extracts an approximation of an MLaaS model by making black-box queries to it. However, none of these works satisfies all three essential criteria for practical model extraction: (1) the ability to work on deep learning models, (2) no requirement of domain knowledge, and (3) the ability to work with a limited query budget. We design a model extraction framework that makes use of active learning and large public datasets to satisfy all three. We demonstrate that it is possible to use this framework to steal deep classifiers trained on a variety of datasets from image and text domains. By querying a model via black-box access for its top prediction, our framework improves performance over a uniform noise baseline by 4.70x on average for image tasks and 2.11x for text tasks, while using only 30% (30,000 samples) of the public dataset at its disposal.
Tasks Active Learning
Published 2019-05-22
URL https://arxiv.org/abs/1905.09165v1
PDF https://arxiv.org/pdf/1905.09165v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-the-extraction-of-deep-neural
Repo
Framework
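
The framework's loop is recognizable from active learning: label a seed set by querying the victim's top prediction, then repeatedly query the public-pool points the substitute model is least certain about. The sketch below implements that loop with a hidden linear rule as the "victim" and logistic regression as the substitute; both stand in for the DNNs in the paper, and all sizes are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def victim_top1(X):
    """Black-box MLaaS model returning only its top prediction.
    Here a hidden linear rule stands in for the deployed DNN."""
    w = np.array([1.5, -2.0, 0.5, 1.0])
    return (X @ w > 0).astype(int)

public = rng.normal(0, 1, (30_000, 4))      # unlabeled public data pool

# Seed: a small random batch labeled by querying the victim.
idx = rng.choice(len(public), 100, replace=False)
X_l, y_l = public[idx], victim_top1(public[idx])
pool = np.setdiff1d(np.arange(len(public)), idx)

substitute = LogisticRegression().fit(X_l, y_l)
for _ in range(5):                           # active-learning rounds
    # Query the points the substitute is least certain about.
    p = substitute.predict_proba(public[pool])[:, 1]
    pick = pool[np.argsort(np.abs(p - 0.5))[:100]]
    X_l = np.vstack([X_l, public[pick]])
    y_l = np.concatenate([y_l, victim_top1(public[pick])])
    pool = np.setdiff1d(pool, pick)
    substitute.fit(X_l, y_l)

agree = np.mean(substitute.predict(public) == victim_top1(public))
print(f"substitute agrees with victim on {agree:.1%} of the pool")
```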