July 28, 2019

Paper Group ANR 440

Distribution-Free One-Pass Learning. Attacking the Madry Defense Model with $L_1$-based Adversarial Examples. Boosted Multiple Kernel Learning for First-Person Activity Recognition. Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research. Blind estimation of white Gaussian noise variance in highly textured i …

Distribution-Free One-Pass Learning


Title	Distribution-Free One-Pass Learning
Authors	Peng Zhao, Zhi-Hua Zhou
Abstract	In many large-scale machine learning applications, data are accumulated with time, and thus, an appropriate model should be able to update in an online paradigm. Moreover, as the whole data volume is unknown when constructing the model, it is desired to scan each data item only once with a storage independent of the data volume. It is also noteworthy that the underlying distribution may change during the data accumulation procedure. To handle such tasks, in this paper we propose DFOP, a distribution-free one-pass learning approach. This approach works well when distribution change occurs during data accumulation, without requiring prior knowledge about the change. Every data item can be discarded once it has been scanned. Besides, a theoretical guarantee shows that the estimation error, under a mild assumption, decreases until convergence with high probability. The performance of DFOP for both regression and classification is validated in experiments.
Tasks
Published	2017-06-08
URL	http://arxiv.org/abs/1706.02471v1
PDF	http://arxiv.org/pdf/1706.02471v1.pdf
PWC	https://paperswithcode.com/paper/distribution-free-one-pass-learning
Repo
Framework
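
The one-pass setting described in the abstract (each item scanned once, constant storage, tolerance to distribution change) can be illustrated with a much simpler estimator than DFOP itself. The sketch below is not the authors' algorithm: it is a running mean with an exponential forgetting factor, and the class name `ForgettingMean` and the `forgetting` parameter are assumptions made for illustration.

```python
class ForgettingMean:
    """One-pass mean estimate with exponential forgetting (illustrative only).

    Storage is two floats regardless of how many items are seen, and each
    item can be discarded right after update() returns.
    """

    def __init__(self, forgetting=0.9):
        if not 0.0 < forgetting <= 1.0:
            raise ValueError("forgetting must be in (0, 1]")
        self.forgetting = forgetting
        self.weighted_sum = 0.0  # discounted sum of the items seen so far
        self.weight = 0.0        # discounted count of the items seen so far

    def update(self, x):
        """Fold one item into the estimate and return the current estimate."""
        self.weighted_sum = self.forgetting * self.weighted_sum + x
        self.weight = self.forgetting * self.weight + 1.0
        return self.weighted_sum / self.weight


# Hypothetical stream whose mean shifts from 1.0 to 5.0 halfway through.
stream = [1.0] * 200 + [5.0] * 200

adaptive = ForgettingMean(forgetting=0.9)
plain = ForgettingMean(forgetting=1.0)  # forgetting=1.0 is the ordinary running mean
for item in stream:
    adaptive_estimate = adaptive.update(item)
    plain_estimate = plain.update(item)

print(adaptive_estimate, plain_estimate)
```

With `forgetting` below 1, old items are discounted geometrically, so the estimate tracks the most recent regime; with `forgetting=1.0` every item counts equally and the estimate averages over both regimes.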

Attacking the Madry Defense Model with $L_1$-based Adversarial Examples


Title	Attacking the Madry Defense Model with $L_1$-based Adversarial Examples
Authors	Yash Sharma, Pin-Yu Chen
Abstract	The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal $L_\infty$ distortion $\epsilon = 0.3$. This discourages the use of attacks which are not optimized on the $L_\infty$ distortion metric. Our experimental results demonstrate that by relaxing the $L_\infty$ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion. These results call into question the use of $L_\infty$ as a sole measure for visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
Tasks
Published	2017-10-30
URL	http://arxiv.org/abs/1710.10733v4
PDF	http://arxiv.org/pdf/1710.10733v4.pdf
PWC	https://paperswithcode.com/paper/attacking-the-madry-defense-model-with-l_1
Repo
Framework
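
To make the contrast between distortion metrics in the abstract concrete, here is a small sketch that computes the $L_1$, $L_2$ and $L_\infty$ distortion of two perturbations. It does not implement EAD or the Madry model; the 16-pixel image and both perturbations are made-up values chosen only to show that a sparse change can exceed an $L_\infty$ budget of 0.3 while having a much smaller $L_1$ distortion than a dense change that stays within the budget.

```python
def distortion_norms(original, perturbed):
    """Return (L1, L2, Linf) distortion between two equal-length pixel lists."""
    if len(original) != len(perturbed):
        raise ValueError("inputs must have the same length")
    diffs = [abs(p - o) for o, p in zip(original, perturbed)]
    l1 = sum(diffs)
    l2 = sum(d * d for d in diffs) ** 0.5
    linf = max(diffs, default=0.0)
    return l1, l2, linf


# Hypothetical flattened image with pixel values in [0, 1].
image = [0.5] * 16

# Dense perturbation: every pixel shifted by 0.25 (inside an Linf budget of 0.3).
dense = [v + 0.25 for v in image]

# Sparse perturbation: a single pixel shifted by 0.5 (outside that budget).
sparse = list(image)
sparse[3] = 1.0

print("dense :", distortion_norms(image, dense))
print("sparse:", distortion_norms(image, sparse))
```

An attack scored only on $L_\infty$ would accept the dense perturbation and reject the sparse one, even though the sparse one changes far less of the image in the $L_1$ sense.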

Boosted Multiple Kernel Learning for First-Person Activity Recognition


Title	Boosted Multiple Kernel Learning for First-Person Activity Recognition
Authors	Fatih Ozkan, Mehmet Ali Arabaci, Elif Surer, Alptekin Temizel
Abstract	Activity recognition from first-person (ego-centric) videos has recently gained attention due to the increasing ubiquity of wearable cameras. There has been a surge of efforts adapting existing feature descriptors and designing new descriptors for first-person videos. An effective activity recognition system requires selection and use of complementary features and appropriate kernels for each feature. In this study, we propose a data-driven framework for first-person activity recognition which effectively selects and combines features and their respective kernels during training. Our experimental results show that using Multiple Kernel Learning (MKL) and Boosted MKL for first-person activity recognition improves results in comparison to the state of the art. In addition, these techniques enable the expansion of the framework with new features in an efficient and convenient way.
Tasks	Activity Recognition
Published	2017-02-22
URL	http://arxiv.org/abs/1702.06799v2
PDF	http://arxiv.org/pdf/1702.06799v2.pdf
PWC	https://paperswithcode.com/paper/boosted-multiple-kernel-learning-for-first
Repo
Framework

Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research


Title	Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research
Authors	Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs
Abstract	Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an entirely unsupervised manner using a contextual window and doing so much faster than previous methods. Each word is projected into vector space such that similar meaning words such as “strong” and “powerful” are projected into the same general Euclidean space. Open questions about these embeddings include their utility across classification tasks and the optimal properties and source of documents to construct broadly functional embeddings. In this work, we demonstrate the usefulness of pre-trained embeddings for classification in our task and demonstrate that custom word embeddings, built in the domain and for the tasks, can improve performance over word embeddings learnt on more general data including news articles or Wikipedia.
Tasks	Text Classification, Word Embeddings
Published	2017-05-17
URL	http://arxiv.org/abs/1705.06262v2
PDF	http://arxiv.org/pdf/1705.06262v2.pdf
PWC	https://paperswithcode.com/paper/utility-of-general-and-specific-word
Repo
Framework

Blind estimation of white Gaussian noise variance in highly textured images


Title	Blind estimation of white Gaussian noise variance in highly textured images
Authors	Mykola Ponomarenko, Nikolay Gapon, Viacheslav Voronin, Karen Egiazarian
Abstract	In this paper, a new method of blind estimation of noise variance in a single highly textured image is proposed. An input image is divided into 8x8 blocks and a discrete cosine transform (DCT) is performed for each block. A subset of the 64 DCT coefficients with the lowest energy, calculated across all blocks, is selected for further analysis. For these DCT coefficients, a robust estimate of noise variance is calculated. Based on the obtained estimate, blocks with very large values of local variance, calculated only for the selected DCT coefficients, are excluded from further analysis. These two steps (estimation of noise variance and exclusion of blocks) are iteratively repeated three times. For the verification of the proposed method, a new noise-free test image database, TAMPERE17, consisting of many highly textured images is designed. It is shown for this database and different values of noise variance from the set {25, 49, 100, 225} that the proposed method provides approximately two times lower estimation root mean square error than other methods.
Tasks
Published	2017-11-29
URL	http://arxiv.org/abs/1711.10792v1
PDF	http://arxiv.org/pdf/1711.10792v1.pdf
PWC	https://paperswithcode.com/paper/blind-estimation-of-white-gaussian-noise
Repo
Framework
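
The first stage of the pipeline in the abstract (8x8 block DCT, selection of the lowest-energy coefficients across all blocks, robust variance estimate) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the abstract does not say which robust estimator is used or how many coefficients are kept, so the median-absolute-deviation estimator and the `keep=16` default are assumptions, and the iterative block-exclusion step is omitted.

```python
import math
import random
from statistics import median

N = 8  # block size stated in the abstract

# Orthonormal DCT-II basis, C[k][n]; orthonormality preserves noise variance.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def dct2(block):
    """2-D orthonormal DCT of an N x N block given as a list of rows."""
    tmp = [[sum(C[k][n] * block[n][j] for n in range(N)) for j in range(N)]
           for k in range(N)]
    return [[sum(tmp[k][n] * C[l][n] for n in range(N)) for l in range(N)]
            for k in range(N)]


def estimate_noise_variance(image, keep=16):
    """Estimate noise variance from the `keep` lowest-energy DCT coefficients.

    `keep` and the MAD-based estimator are assumptions made for this sketch.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for r in range(0, height - N + 1, N):
        for c in range(0, width - N + 1, N):
            coeffs = dct2([row[c:c + N] for row in image[r:r + N]])
            blocks.append([coeffs[k][l] for k in range(N) for l in range(N)])
    # Energy of each of the 64 coefficient positions, summed over all blocks.
    energy = [sum(b[i] ** 2 for b in blocks) for i in range(N * N)]
    selected = sorted(range(N * N), key=energy.__getitem__)[:keep]
    magnitudes = [abs(b[i]) for b in blocks for i in selected]
    sigma = 1.4826 * median(magnitudes)  # MAD scale factor for Gaussian data
    return sigma ** 2


# Hypothetical test input: a flat grey image plus Gaussian noise of variance 100.
rng = random.Random(0)
noisy = [[128.0 + rng.gauss(0.0, 10.0) for _ in range(64)] for _ in range(64)]
print(estimate_noise_variance(noisy))
```

Because the coefficients are picked for having low energy, this single-pass estimate tends to come out somewhat below the true variance on pure noise; the block-exclusion iterations described in the abstract are not reproduced here.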

Normative theory of visual receptive fields


Title	Normative theory of visual receptive fields
Authors	Tony Lindeberg
Abstract	This article gives an overview of a normative computational theory of visual receptive fields, by which idealized functional models of early spatial, spatio-chromatic and spatio-temporal receptive fields can be derived in an axiomatic way based on structural properties of the environment in combination with assumptions about the internal structure of a vision system to guarantee consistent handling of image representations over multiple spatial and temporal scales. Interestingly, this theory leads to predictions about visual receptive field shapes with qualitatively very good similarity to biological receptive fields measured in the retina, the LGN and the primary visual cortex (V1) of mammals.
Tasks
Published	2017-01-23
URL	http://arxiv.org/abs/1701.06333v4
PDF	http://arxiv.org/pdf/1701.06333v4.pdf
PWC	https://paperswithcode.com/paper/normative-theory-of-visual-receptive-fields
Repo
Framework

Partial Face Detection in the Mobile Domain


Title	Partial Face Detection in the Mobile Domain
Authors	Upal Mahbub, Sayantan Sarkar, Rama Chellappa
Abstract	Generic face detection algorithms do not perform well in the mobile domain due to the significant presence of occluded and partially visible faces. One promising technique to handle the challenge of partial faces is to design face detectors based on facial segments. In this paper, two different approaches to facial segment-based face detection are discussed, namely, proposal-based detection and detection by end-to-end regression. Methods that follow the first approach rely on generating face proposals that contain facial segment information. The three detectors following this approach, namely Facial Segment-based Face Detector (FSFD), SegFace and DeepSegFace, discussed in this paper, perform binary classification on each proposal based on features learned from facial segments. The process of proposal generation, however, needs to be handled separately, which can be very time consuming, and is not truly necessary given the nature of the active authentication problem. Hence a novel algorithm, Deep Regression-based User Image Detector (DRUID), is proposed, which shifts from the classification to the regression paradigm, thus obviating the need for proposal generation. DRUID has a unique network architecture with customized loss functions, is trained using a relatively small amount of data by utilizing a novel data augmentation scheme, and is fast since it outputs the bounding boxes of a face and its segments in a single pass. Being robust to occlusion by design, the facial segment-based face detection methods, especially DRUID, show superior performance over other state-of-the-art face detectors in terms of precision-recall and ROC curve on two mobile face datasets.
Tasks	Data Augmentation, Face Detection
Published	2017-04-07
URL	http://arxiv.org/abs/1704.02117v1
PDF	http://arxiv.org/pdf/1704.02117v1.pdf
PWC	https://paperswithcode.com/paper/partial-face-detection-in-the-mobile-domain
Repo
Framework

UPSET and ANGRI : Breaking High Performance Image Classifiers


Title	UPSET and ANGRI : Breaking High Performance Image Classifiers
Authors	Sayantan Sarkar, Ankan Bansal, Upal Mahbub, Rama Chellappa
Abstract	In this paper, targeted fooling of high performance image classifiers is achieved by developing two novel attack methods. The first method generates universal perturbations for target classes and the second generates image specific perturbations. Extensive experiments are conducted on MNIST and CIFAR10 datasets to provide insights about the proposed algorithms and show their effectiveness.
Tasks
Published	2017-07-04
URL	http://arxiv.org/abs/1707.01159v1
PDF	http://arxiv.org/pdf/1707.01159v1.pdf
PWC	https://paperswithcode.com/paper/upset-and-angri-breaking-high-performance
Repo
Framework

An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks


Title	An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks
Authors	Qinglong Wang, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles
Abstract	Rule extraction from black-box models is critical in domains that require model validation before implementation, as can be the case in credit scoring and medical diagnosis. Though already a challenging problem in statistical learning in general, the difficulty is even greater when highly non-linear, recursive models, such as recurrent neural networks (RNNs), are fit to data. Here, we study the extraction of rules from second-order recurrent neural networks trained to recognize the Tomita grammars. We show that production rules can be stably extracted from trained RNNs and that in certain cases the rules outperform the trained RNNs.
Tasks	Medical Diagnosis
Published	2017-09-29
URL	http://arxiv.org/abs/1709.10380v5
PDF	http://arxiv.org/pdf/1709.10380v5.pdf
PWC	https://paperswithcode.com/paper/an-empirical-evaluation-of-rule-extraction
Repo
Framework

SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences


Title	SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences
Authors	René Schuster, Oliver Wasenmüller, Georg Kuschk, Christian Bailer, Didier Stricker
Abstract	While most scene flow methods use either variational optimization or a strong rigid motion assumption, we show for the first time that scene flow can also be estimated by dense interpolation of sparse matches. To this end, we find sparse matches across two stereo image pairs that are detected without any prior regularization and perform dense interpolation preserving geometric and motion boundaries by using edge information. A few iterations of variational energy minimization are performed to refine our results, which are thoroughly evaluated on the KITTI benchmark and additionally compared to state-of-the-art on MPI Sintel. For application in an automotive context, we further show that an optional ego-motion model helps to boost performance and blends smoothly into our approach to produce a segmentation of the scene into static and dynamic parts.
Tasks
Published	2017-10-27
URL	http://arxiv.org/abs/1710.10096v1
PDF	http://arxiv.org/pdf/1710.10096v1.pdf
PWC	https://paperswithcode.com/paper/sceneflowfields-dense-interpolation-of-sparse
Repo
Framework

On the importance of normative data in speech-based assessment


Title	On the importance of normative data in speech-based assessment
Authors	Zeinab Noorian, Chloé Pou-Prom, Frank Rudzicz
Abstract	Data sets for identifying Alzheimer’s disease (AD) are often relatively sparse, which limits their ability to train generalizable models. Here, we augment such a data set, DementiaBank, with each of two normative data sets, the Wisconsin Longitudinal Study and Talk2Me, each of which employs a speech-based picture-description assessment. Through minority class oversampling with ADASYN, we outperform state-of-the-art results in binary classification of people with and without AD in DementiaBank. This work highlights the effectiveness of combining sparse and difficult-to-acquire patient data with relatively large and easily accessible normative datasets.
Tasks
Published	2017-11-30
URL	http://arxiv.org/abs/1712.00069v1
PDF	http://arxiv.org/pdf/1712.00069v1.pdf
PWC	https://paperswithcode.com/paper/on-the-importance-of-normative-data-in-speech
Repo
Framework
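
The abstract names ADASYN for minority class oversampling without describing it. The sketch below is a simplified reading of the general ADASYN idea (generate more synthetic minority points near minority samples that are surrounded by majority samples); it is not the authors' pipeline, and the function name `adasyn`, its parameters, and the toy 2-D points are assumptions made for illustration. It needs at least two minority points and Python 3.8+ for `math.dist`.

```python
import math
import random


def adasyn(minority, majority, n_new, k=3, seed=0):
    """Return roughly n_new synthetic minority points (simplified ADASYN sketch)."""
    rng = random.Random(seed)
    everything = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    minority_pool = [(p, 1) for p in minority]

    def nearest(point, pool, count):
        # k nearest labelled points to `point`, excluding the point itself.
        others = [item for item in pool if item[0] is not point]
        others.sort(key=lambda item: math.dist(point, item[0]))
        return others[:count]

    # Difficulty of each minority point: share of majority points among its
    # k nearest neighbours in the full data set.
    ratios = []
    for x in minority:
        neighbours = nearest(x, everything, k)
        ratios.append(sum(1 for _, label in neighbours if label == 0) / k)

    total = sum(ratios)
    if total == 0:
        weights = [1 / len(minority)] * len(minority)  # no hard points: spread evenly
    else:
        weights = [r / total for r in ratios]

    synthetic = []
    for x, w in zip(minority, weights):
        for _ in range(round(w * n_new)):
            # Interpolate towards a random nearby minority point.
            partner = rng.choice(nearest(x, minority_pool, k))[0]
            lam = rng.random()
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, partner)))
    return synthetic


# Hypothetical toy data: four minority points, five majority points.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
majority = [(0.5, 0.5), (2.0, 2.0), (2.0, 1.0), (1.0, 2.0), (3.0, 3.0)]
for point in adasyn(minority, majority, n_new=8):
    print(point)
```

Because per-point counts are rounded, the number of generated points can differ slightly from `n_new` on other inputs. Every synthetic point lies on a segment between two existing minority points, so no values outside the minority class's range are invented.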

Fast Optimization of Wildfire Suppression Policies with SMAC


Title	Fast Optimization of Wildfire Suppression Policies with SMAC
Authors	Sean McGregor, Rachel Houtman, Claire Montgomery, Ronald Metoyer, Thomas G. Dietterich
Abstract	Managers of US National Forests must decide what policy to apply for dealing with lightning-caused wildfires. Conflicts among stakeholders (e.g., timber companies, home owners, and wildlife biologists) have often led to spirited political debates and even violent eco-terrorism. One way to transform these conflicts into multi-stakeholder negotiations is to provide a high-fidelity simulation environment in which stakeholders can explore the space of alternative policies and understand the tradeoffs therein. Such an environment needs to support fast optimization of MDP policies so that users can adjust reward functions and analyze the resulting optimal policies. This paper assesses the suitability of SMAC, a black-box empirical function optimization algorithm, for rapid optimization of MDP policies. The paper describes five reward function components and four stakeholder constituencies. It then introduces a parameterized class of policies that can be easily understood by the stakeholders. SMAC is applied to find the optimal policy in this class for the reward functions of each of the stakeholder constituencies. The results confirm that SMAC is able to rapidly find good policies that make sense from the domain perspective. Because the full-fidelity forest fire simulator is far too expensive to support interactive optimization, SMAC is applied to a surrogate model constructed from a modest number of runs of the full-fidelity simulator. To check the quality of the SMAC-optimized policies, the policies are evaluated on the full-fidelity simulator. The results confirm that the surrogate value estimates are valid. This is the first successful optimization of wildfire management policies using a full-fidelity simulation. The same methodology should be applicable to other contentious natural resource management problems where high-fidelity simulation is extremely expensive.
Tasks
Published	2017-03-28
URL	http://arxiv.org/abs/1703.09391v1
PDF	http://arxiv.org/pdf/1703.09391v1.pdf
PWC	https://paperswithcode.com/paper/fast-optimization-of-wildfire-suppression
Repo
Framework

Medical Text Classification using Convolutional Neural Networks


Title	Medical Text Classification using Convolutional Neural Networks
Authors	Mark Hughes, Irene Li, Spyros Kotoulas, Toyotaro Suzumura
Abstract	We present an approach to automatically classify clinical text at a sentence level. We use deep convolutional neural networks to represent complex features. We train the network on a dataset providing a broad categorization of health information. Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%.
Tasks	Text Classification
Published	2017-04-22
URL	http://arxiv.org/abs/1704.06841v1
PDF	http://arxiv.org/pdf/1704.06841v1.pdf
PWC	https://paperswithcode.com/paper/medical-text-classification-using
Repo
Framework

Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study


Title	Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
Authors	Samuel Ritter, David G. T. Barrett, Adam Santoro, Matt M. Botvinick
Abstract	Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.
Tasks	One-Shot Learning
Published	2017-06-26
URL	http://arxiv.org/abs/1706.08606v2
PDF	http://arxiv.org/pdf/1706.08606v2.pdf
PWC	https://paperswithcode.com/paper/cognitive-psychology-for-deep-neural-networks
Repo
Framework

Improving Neural Parsing by Disentangling Model Combination and Reranking Effects


Title	Improving Neural Parsing by Disentangling Model Combination and Reranking Effects
Authors	Daniel Fried, Mitchell Stern, Dan Klein
Abstract	Recent work has proposed several generative neural models for constituency parsing that achieve state-of-the-art results. Since direct search in these generative models is difficult, they have primarily been used to rescore candidate outputs from base parsers in which decoding is more straightforward. We first present an algorithm for direct search in these generative models. We then demonstrate that the rescoring results are at least partly due to implicit model combination rather than reranking effects. Finally, we show that explicit model combination can improve performance even further, resulting in new state-of-the-art numbers on the PTB of 94.25 F1 when training only on gold data and 94.66 F1 when using external data.
Tasks	Constituency Parsing
Published	2017-07-10
URL	http://arxiv.org/abs/1707.03058v1
PDF	http://arxiv.org/pdf/1707.03058v1.pdf
PWC	https://paperswithcode.com/paper/improving-neural-parsing-by-disentangling
Repo
Framework