Paper Group ANR 440
Distribution-Free One-Pass Learning
Title | Distribution-Free One-Pass Learning |
Authors | Peng Zhao, Zhi-Hua Zhou |
Abstract | In many large-scale machine learning applications, data are accumulated over time, and thus an appropriate model should be able to update in an online fashion. Moreover, as the total data volume is unknown when constructing the model, it is desirable to scan each data item only once, with storage that is independent of the data volume. It is also noteworthy that the underlying distribution may change during the data accumulation procedure. To handle such tasks, in this paper we propose DFOP, a distribution-free one-pass learning approach. This approach works well when distribution change occurs during data accumulation, without requiring prior knowledge about the change. Every data item can be discarded once it has been scanned. In addition, a theoretical guarantee shows that the estimation error, under a mild assumption, decreases until convergence with high probability. The performance of DFOP for both regression and classification is validated in experiments. |
Tasks | |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02471v1 |
http://arxiv.org/pdf/1706.02471v1.pdf | |
PWC | https://paperswithcode.com/paper/distribution-free-one-pass-learning |
Repo | |
Framework | |
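The one-pass setting described in the abstract above (scan each item once, constant storage, tolerate distribution change) can be illustrated with a small sketch. This is not the authors' DFOP implementation: it is a scalar recursive least squares update with an exponential forgetting factor, chosen here as an assumption because it has the same one-pass, constant-memory shape. The class name `OnePassScalarRLS`, the forgetting value 0.9, and the synthetic stream are all hypothetical.

```python
class OnePassScalarRLS:
    """Scalar recursive least squares for y ~ w * x with exponential forgetting.

    Illustrative only: the state is two floats, so storage does not grow
    with the number of items seen, and each item is used exactly once.
    """

    def __init__(self, forgetting=0.9, initial_p=1000.0):
        self.forgetting = forgetting  # assumed value; smaller forgets faster
        self.w = 0.0                  # current weight estimate
        self.p = initial_p            # inverse of the discounted sum of x^2

    def update(self, x, y):
        # Gain that weighs the new item against the discounted history.
        gain = self.p * x / (self.forgetting + x * self.p * x)
        # Correct the estimate by the prediction error on this item.
        self.w += gain * (y - self.w * x)
        # Discount old information so the model can track a changed target.
        self.p = (self.p - gain * x * self.p) / self.forgetting
        return self.w


def run_drifting_stream():
    """Stream 400 noise-free items whose true weight flips from 2 to -1."""
    model = OnePassScalarRLS(forgetting=0.9)
    for i in range(400):
        x = 1.0 + (i % 3)
        true_w = 2.0 if i < 200 else -1.0
        model.update(x, true_w * x)  # the item is discarded after this call
    return model.w
```

Calling `run_drifting_stream()` returns the weight estimate after the change point; with a forgetting factor below 1 the estimate follows the new target rather than averaging over both regimes.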
Attacking the Madry Defense Model with $L_1$-based Adversarial Examples
Title | Attacking the Madry Defense Model with $L_1$-based Adversarial Examples |
Authors | Yash Sharma, Pin-Yu Chen |
Abstract | The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal $L_\infty$ distortion $\epsilon = 0.3$. This discourages the use of attacks which are not optimized on the $L_\infty$ distortion metric. Our experimental results demonstrate that by relaxing the $L_\infty$ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion. These results call into question the use of $L_\infty$ as the sole measure of visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples. |
Tasks | |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.10733v4 |
http://arxiv.org/pdf/1710.10733v4.pdf | |
PWC | https://paperswithcode.com/paper/attacking-the-madry-defense-model-with-l_1 |
Repo | |
Framework | |
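The abstract above turns on the fact that the $L_\infty$ and $L_1$ distortion metrics can rank the same perturbations differently. The sketch below only illustrates that disagreement; it is not the EAD attack, and the two perturbations are made-up examples on a flattened 784-pixel (MNIST-sized) image. The threshold 0.3 is the competition's $\epsilon$ from the abstract; everything else is an assumption.

```python
import math


def l1(delta):
    """Sum of absolute per-pixel changes."""
    return sum(abs(d) for d in delta)


def l2(delta):
    """Euclidean length of the perturbation."""
    return math.sqrt(sum(d * d for d in delta))


def linf(delta):
    """Largest absolute change to any single pixel."""
    return max(abs(d) for d in delta)


def example_perturbations(num_pixels=784):
    """Two hypothetical perturbations of a flattened 28x28 image."""
    # Dense: every pixel moved by exactly the competition limit 0.3.
    dense = [0.3] * num_pixels
    # Sparse: only ten pixels changed, each by a large amount.
    sparse = [0.9] * 10 + [0.0] * (num_pixels - 10)
    return dense, sparse


dense, sparse = example_perturbations()
for name, delta in (("dense", dense), ("sparse", sparse)):
    print(name, "L1 =", round(l1(delta), 3),
          "L2 =", round(l2(delta), 3),
          "Linf =", round(linf(delta), 3))
```

The sparse perturbation violates the $L_\infty \le 0.3$ constraint while having a far smaller $L_1$ norm than the dense one, which is the kind of mismatch between metrics the abstract refers to.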
Boosted Multiple Kernel Learning for First-Person Activity Recognition
Title | Boosted Multiple Kernel Learning for First-Person Activity Recognition |
Authors | Fatih Ozkan, Mehmet Ali Arabaci, Elif Surer, Alptekin Temizel |
Abstract | Activity recognition from first-person (ego-centric) videos has recently gained attention due to the increasing ubiquity of wearable cameras. There has been a surge of efforts adapting existing feature descriptors and designing new descriptors for first-person videos. An effective activity recognition system requires selection and use of complementary features and appropriate kernels for each feature. In this study, we propose a data-driven framework for first-person activity recognition which effectively selects and combines features and their respective kernels during training. Our experimental results show that the use of Multiple Kernel Learning (MKL) and Boosted MKL in the first-person activity recognition problem yields improved results in comparison to the state of the art. In addition, these techniques enable the expansion of the framework with new features in an efficient and convenient way. |
Tasks | Activity Recognition |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06799v2 |
http://arxiv.org/pdf/1702.06799v2.pdf | |
PWC | https://paperswithcode.com/paper/boosted-multiple-kernel-learning-for-first |
Repo | |
Framework | |
Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research
Title | Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research |
Authors | Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs |
Abstract | Conventional text classification models make a bag-of-words assumption, reducing text to word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an entirely unsupervised manner using a contextual window, and do so much faster than previous methods. Each word is projected into a vector space such that words with similar meanings, such as “strong” and “powerful”, are projected into the same general region of Euclidean space. Open questions about these embeddings include their utility across classification tasks and the optimal properties and source of documents to construct broadly functional embeddings. In this work, we demonstrate the usefulness of pre-trained embeddings for classification in our task and demonstrate that custom word embeddings, built in the domain and for the tasks, can improve performance over word embeddings learnt on more general data including news articles or Wikipedia. |
Tasks | Text Classification, Word Embeddings |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06262v2 |
http://arxiv.org/pdf/1705.06262v2.pdf | |
PWC | https://paperswithcode.com/paper/utility-of-general-and-specific-word |
Repo | |
Framework | |
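The abstract above relies on the idea that similar words sit close together in the embedding space and that embeddings can feed a classifier. The following sketch shows one common way to measure that closeness (cosine similarity) and one simple way to turn word vectors into a document feature (averaging). The three-dimensional vectors are invented for illustration and are not from word2vec or from the paper; `TOY_EMBEDDINGS` and `document_vector` are hypothetical names.

```python
import math

# Made-up 3-dimensional vectors; real embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "strong": [0.90, 0.80, 0.10],
    "powerful": [0.85, 0.75, 0.20],
    "banana": [0.10, 0.00, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def document_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


near = cosine_similarity(TOY_EMBEDDINGS["strong"], TOY_EMBEDDINGS["powerful"])
far = cosine_similarity(TOY_EMBEDDINGS["strong"], TOY_EMBEDDINGS["banana"])
print("strong vs powerful:", round(near, 3))
print("strong vs banana:", round(far, 3))
```

A vector returned by `document_vector` could then be passed to any standard classifier; whether averaging is what the authors did is not stated in the abstract.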
Blind estimation of white Gaussian noise variance in highly textured images
Title | Blind estimation of white Gaussian noise variance in highly textured images |
Authors | Mykola Ponomarenko, Nikolay Gapon, Viacheslav Voronin, Karen Egiazarian |
Abstract | In this paper, a new method of blind estimation of noise variance in a single highly textured image is proposed. An input image is divided into 8x8 blocks and a discrete cosine transform (DCT) is performed for each block. A subset of the 64 DCT coefficients with the lowest energy, calculated over all blocks, is selected for further analysis. For these DCT coefficients, a robust estimate of noise variance is calculated. Based on the obtained estimate, blocks with very large values of local variance, calculated only for the selected DCT coefficients, are excluded from further analysis. These two steps (estimation of noise variance and exclusion of blocks) are iteratively repeated three times. For the verification of the proposed method, a new noise-free test image database, TAMPERE17, consisting of many highly textured images is designed. It is shown for this database and different values of noise variance from the set {25, 49, 100, 225} that the proposed method provides an estimation root mean square error approximately two times lower than that of other methods. |
Tasks | |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10792v1 |
http://arxiv.org/pdf/1711.10792v1.pdf | |
PWC | https://paperswithcode.com/paper/blind-estimation-of-white-gaussian-noise |
Repo | |
Framework | |
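The first steps of the pipeline described in the abstract above (8x8 blocks, per-block DCT, selection of the lowest-energy coefficients, robust variance estimate) can be sketched as follows. This is a simplified illustration, not the authors' method: the iterative block-exclusion step is omitted, the number of kept coefficients (`keep=16`) is an assumption, and the robust estimator used here (scaled median of absolute values) is a common choice that the abstract does not confirm. The synthetic ramp image is only a sanity check and is not highly textured.

```python
import math
import random
from statistics import median

N = 8  # block size from the abstract

# Orthonormal DCT-II matrix, so white noise keeps its variance per coefficient.
DCT = [[(math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N))
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        for x in range(N)] for u in range(N)]


def dct2(block):
    """2-D DCT of an 8x8 block, computed as DCT * block * DCT^T."""
    tmp = [[sum(DCT[u][k] * block[k][j] for k in range(N))
            for j in range(N)] for u in range(N)]
    return [[sum(tmp[u][k] * DCT[v][k] for k in range(N))
             for v in range(N)] for u in range(N)]


def estimate_noise_variance(image, keep=16):
    """Estimate noise variance from the `keep` lowest-energy DCT coefficients."""
    height, width = len(image), len(image[0])
    blocks = []  # one flattened 64-coefficient list per block
    for r in range(0, height - N + 1, N):
        for c in range(0, width - N + 1, N):
            block = [row[c:c + N] for row in image[r:r + N]]
            d = dct2(block)
            blocks.append([d[u][v] for u in range(N) for v in range(N)])
    # Energy of each coefficient position, accumulated over all blocks.
    energy = [sum(b[i] ** 2 for b in blocks) for i in range(N * N)]
    lowest = sorted(range(N * N), key=energy.__getitem__)[:keep]
    # Robust scale: 1.4826 * median(|x|) estimates sigma for zero-mean Gaussians.
    samples = [abs(b[i]) for b in blocks for i in lowest]
    sigma = 1.4826 * median(samples)
    return sigma ** 2


def make_noisy_ramp(size=128, sigma=5.0, seed=0):
    """Synthetic test image: a linear ramp plus white Gaussian noise."""
    rng = random.Random(seed)
    return [[0.5 * x + 0.25 * y + rng.gauss(0.0, sigma)
             for x in range(size)] for y in range(size)]
```

For example, `estimate_noise_variance(make_noisy_ramp(sigma=5.0))` should land in the neighbourhood of the true variance 25. Picking the lowest-energy coefficients biases the estimate somewhat downward, which is one reason a full method would need more than this sketch.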
Normative theory of visual receptive fields
Title | Normative theory of visual receptive fields |
Authors | Tony Lindeberg |
Abstract | This article gives an overview of a normative computational theory of visual receptive fields, by which idealized functional models of early spatial, spatio-chromatic and spatio-temporal receptive fields can be derived in an axiomatic way based on structural properties of the environment in combination with assumptions about the internal structure of a vision system to guarantee consistent handling of image representations over multiple spatial and temporal scales. Interestingly, this theory leads to predictions about visual receptive field shapes with qualitatively very good similarity to biological receptive fields measured in the retina, the LGN and the primary visual cortex (V1) of mammals. |
Tasks | |
Published | 2017-01-23 |
URL | http://arxiv.org/abs/1701.06333v4 |
http://arxiv.org/pdf/1701.06333v4.pdf | |
PWC | https://paperswithcode.com/paper/normative-theory-of-visual-receptive-fields |
Repo | |
Framework | |
Partial Face Detection in the Mobile Domain
Title | Partial Face Detection in the Mobile Domain |
Authors | Upal Mahbub, Sayantan Sarkar, Rama Chellappa |
Abstract | Generic face detection algorithms do not perform well in the mobile domain due to the significant presence of occluded and partially visible faces. One promising technique to handle the challenge of partial faces is to design face detectors based on facial segments. In this paper, two different approaches to facial segment-based face detection are discussed, namely, proposal-based detection and detection by end-to-end regression. Methods that follow the first approach rely on generating face proposals that contain facial segment information. The three detectors following this approach discussed in this paper, namely Facial Segment-based Face Detector (FSFD), SegFace and DeepSegFace, perform binary classification on each proposal based on features learned from facial segments. The process of proposal generation, however, needs to be handled separately, which can be very time consuming, and is not truly necessary given the nature of the active authentication problem. Hence a novel algorithm, Deep Regression-based User Image Detector (DRUID), is proposed, which shifts from the classification to the regression paradigm, thus obviating the need for proposal generation. DRUID has a unique network architecture with customized loss functions, is trained using a relatively small amount of data by utilizing a novel data augmentation scheme, and is fast since it outputs the bounding boxes of a face and its segments in a single pass. Being robust to occlusion by design, the facial segment-based face detection methods, especially DRUID, show superior performance over other state-of-the-art face detectors in terms of precision-recall and ROC curves on two mobile face datasets. |
Tasks | Data Augmentation, Face Detection |
Published | 2017-04-07 |
URL | http://arxiv.org/abs/1704.02117v1 |
http://arxiv.org/pdf/1704.02117v1.pdf | |
PWC | https://paperswithcode.com/paper/partial-face-detection-in-the-mobile-domain |
Repo | |
Framework | |
UPSET and ANGRI : Breaking High Performance Image Classifiers
Title | UPSET and ANGRI : Breaking High Performance Image Classifiers |
Authors | Sayantan Sarkar, Ankan Bansal, Upal Mahbub, Rama Chellappa |
Abstract | In this paper, targeted fooling of high performance image classifiers is achieved by developing two novel attack methods. The first method generates universal perturbations for target classes and the second generates image specific perturbations. Extensive experiments are conducted on MNIST and CIFAR10 datasets to provide insights about the proposed algorithms and show their effectiveness. |
Tasks | |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.01159v1 |
http://arxiv.org/pdf/1707.01159v1.pdf | |
PWC | https://paperswithcode.com/paper/upset-and-angri-breaking-high-performance |
Repo | |
Framework | |
An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks
Title | An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks |
Authors | Qinglong Wang, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles |
Abstract | Rule extraction from black-box models is critical in domains that require model validation before implementation, as can be the case in credit scoring and medical diagnosis. Though already a challenging problem in statistical learning in general, the difficulty is even greater when highly non-linear, recursive models, such as recurrent neural networks (RNNs), are fit to data. Here, we study the extraction of rules from second-order recurrent neural networks trained to recognize the Tomita grammars. We show that production rules can be stably extracted from trained RNNs and that in certain cases the rules outperform the trained RNNs. |
Tasks | Medical Diagnosis |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1709.10380v5 |
http://arxiv.org/pdf/1709.10380v5.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-evaluation-of-rule-extraction |
Repo | |
Framework | |
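To make the notion of "rules" for a Tomita grammar concrete, the sketch below writes one such grammar as an explicit finite-state transition table, which is the usual form of the automata that rule-extraction methods try to recover from a trained RNN. It assumes the definition commonly listed as Tomita grammar 4 (binary strings that do not contain `000`); the abstract above does not define the grammars, and the table here is hand-written, not extracted from any network.

```python
# States 0, 1, 2 count the current run of trailing zeros; state 3 is a
# rejecting sink entered once "000" has been seen.
TRANSITIONS = {
    (0, "0"): 1, (0, "1"): 0,
    (1, "0"): 2, (1, "1"): 0,
    (2, "0"): 3, (2, "1"): 0,
    (3, "0"): 3, (3, "1"): 3,
}
ACCEPTING = {0, 1, 2}


def accepts(string):
    """Return True if the binary string contains no '000' substring."""
    state = 0
    for symbol in string:
        try:
            state = TRANSITIONS[(state, symbol)]
        except KeyError:
            raise ValueError(f"not a binary symbol: {symbol!r}")
    return state in ACCEPTING


for s in ["", "0010010", "1000", "110001"]:
    print(repr(s), accepts(s))
```

An extracted rule set can be compared against a trained network by running both on the same strings and counting disagreements with the ground-truth grammar.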
SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences
Title | SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences |
Authors | René Schuster, Oliver Wasenmüller, Georg Kuschk, Christian Bailer, Didier Stricker |
Abstract | While most scene flow methods use either variational optimization or a strong rigid motion assumption, we show for the first time that scene flow can also be estimated by dense interpolation of sparse matches. To this end, we find sparse matches across two stereo image pairs that are detected without any prior regularization and perform dense interpolation preserving geometric and motion boundaries by using edge information. A few iterations of variational energy minimization are performed to refine our results, which are thoroughly evaluated on the KITTI benchmark and additionally compared to state-of-the-art on MPI Sintel. For application in an automotive context, we further show that an optional ego-motion model helps to boost performance and blends smoothly into our approach to produce a segmentation of the scene into static and dynamic parts. |
Tasks | |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10096v1 |
http://arxiv.org/pdf/1710.10096v1.pdf | |
PWC | https://paperswithcode.com/paper/sceneflowfields-dense-interpolation-of-sparse |
Repo | |
Framework | |
On the importance of normative data in speech-based assessment
Title | On the importance of normative data in speech-based assessment |
Authors | Zeinab Noorian, Chloé Pou-Prom, Frank Rudzicz |
Abstract | Data sets for identifying Alzheimer’s disease (AD) are often relatively sparse, which limits their ability to train generalizable models. Here, we augment such a data set, DementiaBank, with each of two normative data sets, the Wisconsin Longitudinal Study and Talk2Me, each of which employs a speech-based picture-description assessment. Through minority class oversampling with ADASYN, we outperform state-of-the-art results in binary classification of people with and without AD in DementiaBank. This work highlights the effectiveness of combining sparse and difficult-to-acquire patient data with relatively large and easily accessible normative datasets. |
Tasks | |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1712.00069v1 |
http://arxiv.org/pdf/1712.00069v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-importance-of-normative-data-in-speech |
Repo | |
Framework | |
Fast Optimization of Wildfire Suppression Policies with SMAC
Title | Fast Optimization of Wildfire Suppression Policies with SMAC |
Authors | Sean McGregor, Rachel Houtman, Claire Montgomery, Ronald Metoyer, Thomas G. Dietterich |
Abstract | Managers of US National Forests must decide what policy to apply for dealing with lightning-caused wildfires. Conflicts among stakeholders (e.g., timber companies, home owners, and wildlife biologists) have often led to spirited political debates and even violent eco-terrorism. One way to transform these conflicts into multi-stakeholder negotiations is to provide a high-fidelity simulation environment in which stakeholders can explore the space of alternative policies and understand the tradeoffs therein. Such an environment needs to support fast optimization of MDP policies so that users can adjust reward functions and analyze the resulting optimal policies. This paper assesses the suitability of SMAC, a black-box empirical function optimization algorithm, for rapid optimization of MDP policies. The paper describes five reward function components and four stakeholder constituencies. It then introduces a parameterized class of policies that can be easily understood by the stakeholders. SMAC is applied to find the optimal policy in this class for the reward functions of each of the stakeholder constituencies. The results confirm that SMAC is able to rapidly find good policies that make sense from the domain perspective. Because the full-fidelity forest fire simulator is far too expensive to support interactive optimization, SMAC is applied to a surrogate model constructed from a modest number of runs of the full-fidelity simulator. To check the quality of the SMAC-optimized policies, the policies are evaluated on the full-fidelity simulator. The results confirm that the surrogate value estimates are valid. This is the first successful optimization of wildfire management policies using a full-fidelity simulation. The same methodology should be applicable to other contentious natural resource management problems where high-fidelity simulation is extremely expensive. |
Tasks | |
Published | 2017-03-28 |
URL | http://arxiv.org/abs/1703.09391v1 |
http://arxiv.org/pdf/1703.09391v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-optimization-of-wildfire-suppression |
Repo | |
Framework | |
Medical Text Classification using Convolutional Neural Networks
Title | Medical Text Classification using Convolutional Neural Networks |
Authors | Mark Hughes, Irene Li, Spyros Kotoulas, Toyotaro Suzumura |
Abstract | We present an approach to automatically classify clinical text at the sentence level. We use deep convolutional neural networks to represent complex features. We train the network on a dataset providing a broad categorization of health information. Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%. |
Tasks | Text Classification |
Published | 2017-04-22 |
URL | http://arxiv.org/abs/1704.06841v1 |
http://arxiv.org/pdf/1704.06841v1.pdf | |
PWC | https://paperswithcode.com/paper/medical-text-classification-using |
Repo | |
Framework | |
Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
Title | Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study |
Authors | Samuel Ritter, David G. T. Barrett, Adam Santoro, Matt M. Botvinick |
Abstract | Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning. |
Tasks | One-Shot Learning |
Published | 2017-06-26 |
URL | http://arxiv.org/abs/1706.08606v2 |
http://arxiv.org/pdf/1706.08606v2.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-psychology-for-deep-neural-networks |
Repo | |
Framework | |
Improving Neural Parsing by Disentangling Model Combination and Reranking Effects
Title | Improving Neural Parsing by Disentangling Model Combination and Reranking Effects |
Authors | Daniel Fried, Mitchell Stern, Dan Klein |
Abstract | Recent work has proposed several generative neural models for constituency parsing that achieve state-of-the-art results. Since direct search in these generative models is difficult, they have primarily been used to rescore candidate outputs from base parsers in which decoding is more straightforward. We first present an algorithm for direct search in these generative models. We then demonstrate that the rescoring results are at least partly due to implicit model combination rather than reranking effects. Finally, we show that explicit model combination can improve performance even further, resulting in new state-of-the-art numbers on the PTB of 94.25 F1 when training only on gold data and 94.66 F1 when using external data. |
Tasks | Constituency Parsing |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.03058v1 |
http://arxiv.org/pdf/1707.03058v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-neural-parsing-by-disentangling |
Repo | |
Framework | |