Paper Group ANR 349
On the Effectiveness of Low Frequency Perturbations
Title | On the Effectiveness of Low Frequency Perturbations |
Authors | Yash Sharma, Gavin Weiguang Ding, Marcus Brubaker |
Abstract | Carefully crafted, often imperceptible, adversarial perturbations have been shown to cause state-of-the-art models to yield extremely inaccurate outputs, rendering them unsuitable for safety-critical application domains. In addition, recent work has shown that constraining the attack space to a low frequency regime is particularly effective. Yet, it remains unclear whether this is due to generally constraining the attack search space or specifically removing high frequency components from consideration. By systematically controlling the frequency components of the perturbation, evaluating against the top-placing defense submissions in the NeurIPS 2017 competition, we empirically show that performance improvements in both the white-box and black-box transfer settings are yielded only when low frequency components are preserved. In fact, the defended models based on adversarial training are roughly as vulnerable to low frequency perturbations as undefended models, suggesting that the purported robustness of state-of-the-art ImageNet defenses is reliant upon adversarial perturbations being high frequency in nature. We do find that under $\ell_\infty$ $\epsilon=16/255$, the competition distortion bound, low frequency perturbations are indeed perceptible. This questions the use of the $\ell_\infty$-norm, in particular, as a distortion metric, and, in turn, suggests that explicitly considering the frequency space is promising for learning robust models which better align with human perception. |
Tasks | |
Published | 2019-02-28 |
URL | https://arxiv.org/abs/1903.00073v2 |
https://arxiv.org/pdf/1903.00073v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-effectiveness-of-low-frequency |
Repo | |
Framework | |
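To make the frequency-controlled attack setup concrete, here is a minimal sketch, not taken from the paper, of projecting an adversarial perturbation onto low-frequency DCT components; the image size, the keep fraction, and the re-clipping to the 16/255 budget are illustrative choices.

```python
# Hypothetical sketch: keep only low-frequency DCT components of a perturbation,
# in the spirit of the paper's frequency-constrained attacks.
import numpy as np
from scipy.fft import dctn, idctn

def low_frequency_project(delta, keep_fraction=0.25):
    """Zero out all but the lowest `keep_fraction` of DCT frequencies per channel."""
    out = np.empty_like(delta)
    h, w, c = delta.shape
    kh, kw = max(1, int(h * keep_fraction)), max(1, int(w * keep_fraction))
    for ch in range(c):
        coeffs = dctn(delta[..., ch], norm="ortho")
        mask = np.zeros_like(coeffs)
        mask[:kh, :kw] = 1.0          # keep only the top-left (low-frequency) block
        out[..., ch] = idctn(coeffs * mask, norm="ortho")
    return out

# Example: project a random perturbation, then re-clip to an l_inf ball of 16/255.
delta = np.random.uniform(-16 / 255, 16 / 255, size=(224, 224, 3))
delta_lf = np.clip(low_frequency_project(delta), -16 / 255, 16 / 255)
```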
ConCORDe-Net: Cell Count Regularized Convolutional Neural Network for Cell Detection in Multiplex Immunohistochemistry Images
Title | ConCORDe-Net: Cell Count Regularized Convolutional Neural Network for Cell Detection in Multiplex Immunohistochemistry Images |
Authors | Yeman Brhane Hagos, Priya Lakshmi Narayanan, Ayse U. Akarca, Teresa Marafioti, Yinyin Yuan |
Abstract | In digital pathology, cell detection and classification are often prerequisites to quantify cell abundance and explore tissue spatial heterogeneity. However, these tasks are particularly challenging for multiplex immunohistochemistry (mIHC) images due to high levels of variability in staining, expression intensity, and inherent noise as a result of preprocessing artefacts. We proposed a deep learning method to detect and classify cells in mIHC whole-tumour slide images of breast cancer. Inspired by inception-v3, we developed Cell COunt RegularizeD Convolutional neural Network (ConCORDe-Net), which integrates conventional dice overlap and a new cell count loss function for optimizing cell detection, followed by a multi-stage convolutional neural network for cell classification. In total, 20447 cells, belonging to five cell classes, were annotated by experts from 175 patches extracted from 6 whole-tumour mIHC images. These patches were randomly split into training, validation and testing sets. Using ConCORDe-Net, we obtained a cell detection F1 score of 0.873, which is the best score compared to three state-of-the-art methods. In particular, ConCORDe-Net excels at detecting closely located and weakly stained cells compared to other methods. Incorporating cell count loss in the objective function regularizes the network to learn weak gradient boundaries and separate weakly stained cells from background artefacts. Moreover, a cell classification accuracy of 96.5% was achieved. These results support that incorporating problem-specific knowledge such as cell count into deep learning-based cell detection architectures improves the robustness of the algorithm. |
Tasks | |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00907v1 |
https://arxiv.org/pdf/1908.00907v1.pdf | |
PWC | https://paperswithcode.com/paper/concorde-net-cell-count-regularized |
Repo | |
Framework | |
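A minimal sketch of the loss-design idea, not ConCORDe-Net's exact formulation: a soft-Dice term is combined with a cell-count regulariser, where the predicted count is crudely approximated by summing the probability map and dividing by a hypothetical typical cell area.

```python
# Simplified stand-in for a dice + cell-count objective; `cell_area` is an
# invented constant, not a parameter from the paper.
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def count_loss(pred, true_count, cell_area=25.0):
    # Approximate the predicted count as (sum of probabilities) / (typical cell area).
    pred_count = np.sum(pred) / cell_area
    return (pred_count - true_count) ** 2

def concorde_style_loss(pred, target, true_count, count_weight=0.1):
    return soft_dice_loss(pred, target) + count_weight * count_loss(pred, true_count)

# Toy usage: a 64x64 probability map against a binary mask with 3 annotated cells.
pred = np.random.rand(64, 64)
target = (np.random.rand(64, 64) > 0.98).astype(float)
print(concorde_style_loss(pred, target, true_count=3))
```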
Compact Approximation for Polynomial of Covariance Feature
Title | Compact Approximation for Polynomial of Covariance Feature |
Authors | Yusuke Mukuta, Tatsuaki Machida, Tatsuya Harada |
Abstract | Covariance pooling is a feature pooling method with good classification accuracy. Because covariance features consist of second-order statistics, the scales of the feature elements vary. Therefore, normalizing covariance features using a matrix square root improves performance. When pooling methods are applied to local features extracted from CNN models, the accuracy increases when the pooling function is back-propagatable and the feature-extraction model is learned in an end-to-end manner. Recently, an iterative polynomial approximation method for the matrix square root of a covariance feature was proposed, and resulted in faster and more stable training than methods based on singular-value decomposition. In this paper, we propose an extension of compact bilinear pooling, which is a compact approximation of the standard covariance feature, to polynomials of the covariance feature. Subsequently, we apply the proposed approximation to the polynomial corresponding to the matrix square root to obtain a compact approximation for the square root of the covariance feature. Our method approximates a higher-dimensional polynomial of a covariance by the weighted sum of the approximate features corresponding to a pair of local features, based on the similarity of the local features. We apply our method to standard fine-grained image recognition datasets and demonstrate that the proposed method shows comparable accuracy with fewer dimensions than the original feature. |
Tasks | Fine-Grained Image Recognition |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01851v1 |
https://arxiv.org/pdf/1906.01851v1.pdf | |
PWC | https://paperswithcode.com/paper/compact-approximation-for-polynomial-of |
Repo | |
Framework | |
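For context, the sketch below shows Tensor Sketch-based compact bilinear pooling, the standard construction that the paper extends to polynomials of the covariance feature; the sketch dimension and hashing are illustrative, not the authors' configuration.

```python
# Hedged sketch of compact bilinear pooling via Count Sketch + FFT.
import numpy as np

rng = np.random.default_rng(0)

def make_sketch_params(d, D):
    h = rng.integers(0, D, size=d)         # hash each input dimension to an output bin
    s = rng.choice([-1.0, 1.0], size=d)    # random signs
    return h, s

def count_sketch(x, h, s, D):
    y = np.zeros(D)
    np.add.at(y, h, s * x)
    return y

def compact_bilinear(features, D=512):
    """features: (n_locations, d) local CNN features; returns a D-dim descriptor."""
    d = features.shape[1]
    p1, p2 = make_sketch_params(d, D), make_sketch_params(d, D)
    pooled = np.zeros(D)
    for x in features:
        f1 = np.fft.fft(count_sketch(x, *p1, D))
        f2 = np.fft.fft(count_sketch(x, *p2, D))
        pooled += np.real(np.fft.ifft(f1 * f2))   # circular convolution of the two sketches
    return pooled

descriptor = compact_bilinear(rng.standard_normal((196, 256)))
```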
Low radiation tomographic reconstruction with and without template information
Title | Low radiation tomographic reconstruction with and without template information |
Authors | Preeti Gopal, Sharat Chandran, Imants Svalbe, Ajit Rajwade |
Abstract | Low-dose tomography is highly preferred in medical procedures for its reduced radiation risk when compared to standard-dose Computed Tomography (CT). However, the lower the intensity of X-rays, the higher the acquisition noise and hence the reconstructions suffer from artefacts. A large body of work has focussed on improving the algorithms to minimize these artefacts. In this work, we propose two new techniques, rescaled non-linear least squares and Poisson-Gaussian convolution, that reconstruct the underlying image making use of an accurate or near-accurate statistical model of the noise in the projections. We also propose a reconstruction method when prior knowledge of the underlying object is available in the form of templates. This is applicable to longitudinal studies wherein the same object is scanned multiple times to observe the changes that evolve in it over time. Our results on 3D data show that prior information can be used to compensate for the low-dose artefacts, and we demonstrate that it is possible to simultaneously prevent the prior from adversely biasing the reconstructions of new changes in the test object, via a method called "re-irradiation". Additionally, we also present two techniques for automated tuning of the regularization parameters for tomographic inversion. |
Tasks | Computed Tomography (CT) |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.11022v1 |
https://arxiv.org/pdf/1912.11022v1.pdf | |
PWC | https://paperswithcode.com/paper/low-radiation-tomographic-reconstruction-with |
Repo | |
Framework | |
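A toy, hedged illustration of the noise-aware reconstruction idea: weighted least squares on log-transformed Poisson counts with a non-negativity constraint. The forward model here is a random matrix standing in for a real projection operator, and this estimator is not the paper's rescaled non-linear least squares or Poisson-Gaussian convolution.

```python
# Noise-aware weighted least squares for low-dose projections (illustrative only).
import numpy as np

rng = np.random.default_rng(1)

# Toy linear forward model A (rows = rays, cols = pixels); a real system would use a Radon transform.
n_rays, n_pixels, I0 = 400, 100, 1e4
A = rng.random((n_rays, n_pixels)) * 0.05
x_true = rng.random(n_pixels)

counts = rng.poisson(I0 * np.exp(-A @ x_true))        # low-dose photon counts
p = -np.log(np.maximum(counts, 1) / I0)                # log-transformed projections
w = np.maximum(counts, 1).astype(float)                # weights ~ inverse variance of p

def reconstruct(A, p, w, lam=1e-3, iters=500):
    lip = np.linalg.norm(np.sqrt(w)[:, None] * A, 2) ** 2 + lam   # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (w * (A @ x - p)) + lam * x       # weighted data term + small Tikhonov prior
        x = np.maximum(x - grad / lip, 0.0)            # projected gradient step (non-negativity)
    return x

x_hat = reconstruct(A, p, w)
```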
Towards Universal End-to-End Affect Recognition from Multilingual Speech by ConvNets
Title | Towards Universal End-to-End Affect Recognition from Multilingual Speech by ConvNets |
Authors | Dario Bertero, Onno Kampman, Pascale Fung |
Abstract | We propose an end-to-end affect recognition approach using a Convolutional Neural Network (CNN) that handles multiple languages, with applications to emotion and personality recognition from speech. We lay the foundation of a universal model that is trained on multiple languages at once. As affect is shared across all languages, we are able to leverage shared information between languages and improve the overall performance for each one. We obtained an average improvement of 12.8% on emotion and 10.1% on personality when compared with the same model trained on each language only. It is end-to-end because we directly take narrow-band raw waveforms as input. This allows us to accept as input audio recorded from any source and to avoid the overhead and information loss of feature extraction. It outperforms a similar CNN using spectrograms as input by 12.8% for emotion and 6.3% for personality, based on F-scores. Analysis of the network parameters and layer activations shows that the network learns and extracts significant features in the first layer, in particular pitch, energy and contour variations. Subsequent convolutional layers instead capture language-specific representations through the analysis of supra-segmental features. Our model represents an important step towards the development of a fully universal affect recognizer, able to recognize additional descriptors, such as stress, and towards future integration into affective interactive systems. |
Tasks | |
Published | 2019-01-19 |
URL | http://arxiv.org/abs/1901.06486v1 |
http://arxiv.org/pdf/1901.06486v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-universal-end-to-end-affect |
Repo | |
Framework | |
Large scale Lasso with windowed active set for convolutional spike sorting
Title | Large scale Lasso with windowed active set for convolutional spike sorting |
Authors | Laurent Dragoni, Rémi Flamary, Karim Lounici, Patricia Reynaud-Bouret |
Abstract | Spike sorting is a fundamental preprocessing step in neuroscience that is central to accessing simultaneous but distinct neuronal activities and therefore to better understanding the animal or even human brain. However, numerical complexity limits studies that require processing large-scale datasets in terms of the number of electrodes, neurons, spikes and the length of the recorded signals. We propose in this work a novel active set algorithm aimed at solving the Lasso for a classical convolutional model. Our algorithm can be implemented efficiently on parallel architectures and has linear complexity w.r.t. the temporal dimension, which ensures scaling and will open the door to online spike sorting. We provide theoretical results about the complexity of the algorithm and illustrate it in numerical experiments, along with results on the accuracy of spike recovery and robustness to the regularization parameter. |
Tasks | |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12077v1 |
https://arxiv.org/pdf/1906.12077v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-lasso-with-windowed-active-set |
Repo | |
Framework | |
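The sketch below illustrates a generic active-set strategy for the Lasso (grow the active set with the most violating coordinate, then re-solve on the active set by coordinate descent); it is not the paper's windowed, parallel convolutional algorithm, but it shows the mechanism an active set exploits when the solution is sparse.

```python
# Generic active-set Lasso sketch (illustrative, dense design matrix).
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def active_set_lasso(X, y, lam, max_outer=50, inner_iters=100, tol=1e-6):
    n, d = X.shape
    beta = np.zeros(d)
    active = []
    col_norms = np.sum(X ** 2, axis=0)
    for _ in range(max_outer):
        grad = X.T @ (y - X @ beta)                    # correlation of residual with each atom
        violations = np.abs(grad) - lam
        violations[active] = -np.inf                   # only consider inactive coordinates
        j = int(np.argmax(violations))
        if violations[j] <= tol:                       # KKT conditions hold: done
            break
        active.append(j)
        for _ in range(inner_iters):                   # coordinate descent restricted to the active set
            for k in active:
                r_k = y - X @ beta + X[:, k] * beta[k]
                beta[k] = soft_threshold(X[:, k] @ r_k, lam) / col_norms[k]
    return beta

# Toy usage: sparse recovery from a random design.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50); beta_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
beta_hat = active_set_lasso(X, X @ beta_true + 0.1 * rng.standard_normal(200), lam=5.0)
```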
On the Detection of Digital Face Manipulation
Title | On the Detection of Digital Face Manipulation |
Authors | Hao Dang, Feng Liu, Joel Stehouwer, Xiaoming Liu, Anil Jain |
Abstract | Detecting manipulated facial images and videos is an increasingly important topic in digital media forensics. As advanced face synthesis and manipulation methods are made available, new types of fake face representations are being created which have raised significant concerns for their use in social media. Hence, it is crucial to detect manipulated face images and localize manipulated regions. Instead of simply using multi-task learning to simultaneously detect manipulated images and predict the manipulated mask (regions), we propose to utilize an attention mechanism to process and improve the feature maps for the classification task. The learned attention maps highlight the informative regions to further improve the binary classification (genuine face v. fake face), and also visualize the manipulated regions. To enable our study of manipulated face detection and localization, we collect a large-scale database that contains numerous types of facial forgeries. With this dataset, we perform a thorough analysis of data-driven fake face detection. We show that the use of an attention mechanism improves facial forgery detection and manipulated region localization. |
Tasks | Face Detection, Face Generation, Multi-Task Learning |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01717v3 |
https://arxiv.org/pdf/1910.01717v3.pdf | |
PWC | https://paperswithcode.com/paper/on-the-detection-of-digital-face-manipulation |
Repo | |
Framework | |
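A simplified sketch, not the paper's architecture, of how an attention map can gate feature maps for real/fake classification while doubling as a localisation of manipulated regions; shapes and the linear classification head are invented for illustration.

```python
# Attention-gated features for binary real/fake classification (illustrative).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_refine(features, attn_logits):
    """features: (C, H, W) feature maps; attn_logits: (H, W) attention logits."""
    attn = sigmoid(attn_logits)                    # (H, W) soft map of manipulated regions
    refined = features * attn[None, :, :]          # gate every channel by the attention map
    return refined, attn

def classify(refined, w, b):
    pooled = refined.mean(axis=(1, 2))             # global average pooling -> (C,)
    return sigmoid(pooled @ w + b)                 # probability the face is manipulated

rng = np.random.default_rng(0)
feat, logits = rng.standard_normal((64, 14, 14)), rng.standard_normal((14, 14))
refined, attn_map = attention_refine(feat, logits)
p_fake = classify(refined, rng.standard_normal(64), 0.0)
```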
Renyi Differentially Private ADMM for Non-Smooth Regularized Optimization
Title | Renyi Differentially Private ADMM for Non-Smooth Regularized Optimization |
Authors | Chen Chen, Jaewoo Lee |
Abstract | In this paper we consider the problem of minimizing composite objective functions consisting of a convex differentiable loss function plus a non-smooth regularization term, such as $L_1$ norm or nuclear norm, under Rényi differential privacy (RDP). To solve the problem, we propose two stochastic alternating direction method of multipliers (ADMM) algorithms: ssADMM based on gradient perturbation and mpADMM based on output perturbation. Both algorithms decompose the original problem into sub-problems that have closed-form solutions. The first algorithm, ssADMM, applies the recent privacy amplification result for RDP to reduce the amount of noise to add. The second algorithm, mpADMM, numerically computes the sensitivity of ADMM variable updates and releases the updated parameter vector at the end of each epoch. We compare the performance of our algorithms with several baseline algorithms on both real and simulated datasets. Experimental results show that, in high privacy regimes (small $\epsilon$), ssADMM and mpADMM outperform other baseline algorithms in terms of classification and feature selection performance, respectively. |
Tasks | Feature Selection |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08180v2 |
https://arxiv.org/pdf/1909.08180v2.pdf | |
PWC | https://paperswithcode.com/paper/renyi-differentially-private-admm-based-l1 |
Repo | |
Framework | |
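A hedged sketch of the gradient-perturbation idea behind ssADMM, reduced here to a noisy proximal-gradient step for an L1-regularised logistic loss; the gradient clipping bound and noise scale are placeholders, not a calibrated RDP budget, and this is not the paper's exact update.

```python
# Gradient perturbation + closed-form L1 prox, as a stand-in for the ssADMM idea.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def private_l1_step(theta, X, y, lam, lr, sigma, clip, rng):
    # Per-example logistic-loss gradients, clipped to bound sensitivity.
    margins = y * (X @ theta)
    per_ex = (-y / (1.0 + np.exp(margins)))[:, None] * X
    norms = np.linalg.norm(per_ex, axis=1, keepdims=True)
    per_ex *= np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    grad = per_ex.mean(axis=0) + rng.normal(0.0, sigma * clip / len(y), size=theta.shape)
    return soft_threshold(theta - lr * grad, lr * lam)   # closed-form prox of the L1 term

rng = np.random.default_rng(0)
X, y = rng.standard_normal((500, 20)), rng.choice([-1.0, 1.0], size=500)
theta = np.zeros(20)
for _ in range(100):
    theta = private_l1_step(theta, X, y, lam=0.01, lr=0.1, sigma=1.0, clip=1.0, rng=rng)
```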
Minimum-Margin Active Learning
Title | Minimum-Margin Active Learning |
Authors | Heinrich Jiang, Maya Gupta |
Abstract | We present a new active sampling method we call min-margin which trains multiple learners on bootstrap samples and then chooses the examples to label based on the candidates’ minimum margin amongst the bootstrapped models. This extends standard margin sampling in a way that increases its diversity in a supervised manner as it arises from the model uncertainty. We focus on the one-shot batch active learning setting, and show theoretically and through extensive experiments on a broad set of problems that min-margin outperforms other methods, particularly as batch size grows. |
Tasks | Active Learning |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1906.00025v1 |
https://arxiv.org/pdf/1906.00025v1.pdf | |
PWC | https://paperswithcode.com/paper/190600025 |
Repo | |
Framework | |
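Min-margin selection itself is easy to state in code; the sketch below assumes each bootstrapped model exposes class-probability predictions for the unlabeled pool, which is all the criterion needs.

```python
# Min-margin batch selection across bootstrapped models (illustrative).
import numpy as np

def margins(probs):
    """probs: (n_candidates, n_classes) -> margin between the two largest probabilities."""
    part = np.sort(probs, axis=1)
    return part[:, -1] - part[:, -2]

def min_margin_select(prob_list, batch_size):
    """prob_list: list of (n_candidates, n_classes) arrays, one per bootstrapped model."""
    per_model = np.stack([margins(p) for p in prob_list])   # (n_models, n_candidates)
    min_margin = per_model.min(axis=0)                       # smallest margin across models
    return np.argsort(min_margin)[:batch_size]               # most ambiguous examples first

rng = np.random.default_rng(0)
fake_probs = [rng.dirichlet(np.ones(5), size=1000) for _ in range(10)]  # stand-in model outputs
to_label = min_margin_select(fake_probs, batch_size=32)
```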
My Approach = Your Apparatus? Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections
Title | My Approach = Your Apparatus? Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections |
Authors | Julian Risch, Ralf Krestel |
Abstract | Comparative text mining extends from genre analysis and political bias detection to the revelation of cultural and geographic differences, through to the search for prior art across patents and scientific papers. These applications use cross-collection topic modeling for the exploration, clustering, and comparison of large sets of documents, such as digital libraries. However, topic modeling on documents from different collections is challenging because of domain-specific vocabulary. We present a cross-collection topic model combined with automatic domain term extraction and phrase segmentation. This model distinguishes collection-specific and collection-independent words based on information entropy and reveals commonalities and differences of multiple text collections. We evaluate our model on patents, scientific papers, newspaper articles, forum posts, and Wikipedia articles. In comparison to state-of-the-art cross-collection topic modeling, our model achieves up to 13% higher topic coherence, up to 4% lower perplexity, and up to 31% higher document classification accuracy. More importantly, our approach is the first topic model that ensures disjunct general and specific word distributions, resulting in clear-cut topic representations. |
Tasks | Document Classification |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11240v1 |
https://arxiv.org/pdf/1911.11240v1.pdf | |
PWC | https://paperswithcode.com/paper/my-approach-your-apparatus-entropy-based |
Repo | |
Framework | |
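A small sketch of the entropy criterion the abstract describes: a word spread evenly across collections (high entropy) is treated as collection-independent, while a word concentrated in one collection (low entropy) is collection-specific. The threshold below is illustrative, not the paper's.

```python
# Entropy-based split of the vocabulary into general vs. collection-specific words.
import numpy as np

def word_entropy(counts):
    """counts: (n_collections,) occurrences of one word; returns entropy in bits."""
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def split_vocabulary(count_matrix, vocab, threshold_bits=1.0):
    """count_matrix: (n_words, n_collections); returns (general_words, specific_words)."""
    general, specific = [], []
    for word, counts in zip(vocab, count_matrix):
        (general if word_entropy(counts) >= threshold_bits else specific).append(word)
    return general, specific

counts = np.array([[40, 38, 42], [90, 2, 1]])          # toy counts across three collections
print(split_vocabulary(counts, ["approach", "apparatus"]))
```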
Active Learning Solution on Distributed Edge Computing
Title | Active Learning Solution on Distributed Edge Computing |
Authors | Jia Qian, Sayantan Sengupta, Lars Kai Hansen |
Abstract | Industry 4.0 becomes possible through the convergence between Operational and Information Technologies. All the requirements to realize this convergence are integrated on the Fog Platform. The Fog Platform is introduced between the cloud server and the edge devices because the unprecedented generation of data would otherwise burden the cloud server and lead to unacceptable latency. In this new paradigm, we divide the computation tasks and push them down to the edge devices. Furthermore, local computing (at the edge side) may improve privacy and trust. To address these problems, we present a new method in which we decompose data aggregation and processing by dividing them intelligently between edge devices and fog nodes. We apply active learning on the edge devices and federated learning on the fog node, which significantly reduces the data samples needed to train the model as well as the communication cost. To show the effectiveness of the proposed method, we implemented it and evaluated its performance on an image classification task. In addition, we consider two settings, massively distributed and non-massively distributed, and offer the corresponding solutions for each. |
Tasks | Active Learning, Image Classification |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10718v1 |
https://arxiv.org/pdf/1906.10718v1.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-solution-on-distributed-edge |
Repo | |
Framework | |
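An illustrative sketch, not the paper's exact protocol, combining margin-based sample selection on the edge devices with federated averaging of the local models on the fog node; the model is represented as a flat parameter vector for simplicity.

```python
# Edge-side active learning + fog-side federated averaging (illustrative).
import numpy as np

def select_uncertain(probs, budget):
    """probs: (n, n_classes) local predictions; return indices with the smallest margin."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return np.argsort(top2[:, 1] - top2[:, 0])[:budget]

def federated_average(weight_list, sample_counts):
    """Weighted average of edge-device parameter vectors, computed on the fog node."""
    counts = np.asarray(sample_counts, dtype=float)
    stacked = np.stack(weight_list)
    return (stacked * (counts / counts.sum())[:, None]).sum(axis=0)

rng = np.random.default_rng(0)
edge_weights = [rng.standard_normal(100) for _ in range(5)]     # stand-in local models
global_weights = federated_average(edge_weights, sample_counts=[120, 80, 200, 60, 140])
picked = select_uncertain(rng.dirichlet(np.ones(10), size=500), budget=20)
```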
Meta-Learning Deep Energy-Based Memory Models
Title | Meta-Learning Deep Energy-Based Memory Models |
Authors | Sergey Bartunov, Jack W Rae, Simon Osindero, Timothy P Lillicrap |
Abstract | We study the problem of learning associative memory – a system which is able to retrieve a remembered pattern based on its distorted or incomplete version. Attractor networks provide a sound model of associative memory: patterns are stored as attractors of the network dynamics and associative retrieval is performed by running the dynamics starting from a query pattern until it converges to an attractor. In such models the dynamics are often implemented as an optimization procedure that minimizes an energy function, such as in the classical Hopfield network. In general it is difficult to derive a writing rule for a given dynamics and energy that is both compressive and fast. Thus, most research in energy-based memory has been limited either to tractable energy models not expressive enough to handle complex high-dimensional objects such as natural images, or to models that do not offer fast writing. We present a novel meta-learning approach to energy-based memory models (EBMM) that allows one to use an arbitrary neural architecture as an energy model and quickly store patterns in its weights. We demonstrate experimentally that our EBMM approach can build compressed memories for synthetic and natural data, and is capable of associative retrieval that outperforms existing memory systems in terms of the reconstruction error and compression rate. |
Tasks | Meta-Learning |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02720v1 |
https://arxiv.org/pdf/1910.02720v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-deep-energy-based-memory-models-1 |
Repo | |
Framework | |
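For intuition, here is retrieval by energy descent on a classical Hopfield energy E(x) = -0.5 x^T W x, a far simpler energy model than the paper's meta-learned EBMM; it shows the attractor-network mechanics of writing patterns and reading them back from a corrupted query.

```python
# Classical Hopfield write/read as a toy attractor-memory example.
import numpy as np

def write_patterns(patterns):
    """Hebbian writing rule: patterns is (n_patterns, dim) with +/-1 entries."""
    W = patterns.T @ patterns / patterns.shape[1]
    np.fill_diagonal(W, 0.0)
    return W

def retrieve(W, query, sweeps=10, rng=None):
    """Associative retrieval: descend E(x) = -0.5 x^T W x starting from the query."""
    rng = rng or np.random.default_rng()
    x = query.copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(x)):      # asynchronous updates never increase the energy
            x[i] = 1.0 if W[i] @ x >= 0 else -1.0
    return x

rng = np.random.default_rng(0)
patterns = rng.choice([-1.0, 1.0], size=(5, 200))
W = write_patterns(patterns)
noisy = patterns[0] * np.where(rng.random(200) < 0.2, -1.0, 1.0)   # flip ~20% of the bits
overlap = patterns[0] @ retrieve(W, noisy, rng=rng) / 200           # close to 1.0 on success
```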
Large-scale 6D Object Pose Estimation Dataset for Industrial Bin-Picking
Title | Large-scale 6D Object Pose Estimation Dataset for Industrial Bin-Picking |
Authors | Kilian Kleeberger, Christian Landgraf, Marco F. Huber |
Abstract | In this paper, we introduce a new public dataset for 6D object pose estimation and instance segmentation for industrial bin-picking. The dataset comprises both synthetic and real-world scenes. For both, point clouds, depth images, and annotations comprising the 6D pose (position and orientation), a visibility score, and a segmentation mask for each object are provided. Along with the raw data, a method for precisely annotating real-world scenes is proposed. To the best of our knowledge, this is the first public dataset for 6D object pose estimation and instance segmentation for bin-picking containing sufficiently annotated data for learning-based approaches. Furthermore, it is one of the largest public datasets for object pose estimation in general. The dataset is publicly available at http://www.bin-picking.ai/en/dataset.html. |
Tasks | 6D Pose Estimation using RGB, Instance Segmentation, Pose Estimation, Semantic Segmentation |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.12125v1 |
https://arxiv.org/pdf/1912.12125v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-6d-object-pose-estimation-dataset |
Repo | |
Framework | |
Learning Where to Look While Tracking Instruments in Robot-assisted Surgery
Title | Learning Where to Look While Tracking Instruments in Robot-assisted Surgery |
Authors | Mobarakol Islam, Yueyuan Li, Hongliang Ren |
Abstract | Directing task-specific attention while tracking instruments in surgery holds great potential for robot-assisted intervention. For this purpose, we propose an end-to-end trainable multitask learning (MTL) model for real-time surgical instrument segmentation and attention prediction. Our model is designed with a weight-shared encoder and two task-oriented decoders and optimized for the joint tasks. We introduce a batch-Wasserstein (bW) loss and construct a soft attention module to refine the distinctive visual region for efficient saliency learning. For multitask optimization, it is always challenging to obtain convergence of both tasks in the same epoch. We deal with this problem by adopting a 'poly' loss weight and two phases of training. We further propose a novel way to generate task-aware saliency maps and scanpaths of the instruments on the MICCAI robotic instrument segmentation dataset. Compared to state-of-the-art segmentation and saliency models, our model outperforms them on most of the evaluation metrics. |
Tasks | |
Published | 2019-06-29 |
URL | https://arxiv.org/abs/1907.00214v1 |
https://arxiv.org/pdf/1907.00214v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-where-to-look-while-tracking |
Repo | |
Framework | |
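The 'poly' loss-weight idea mentioned in the abstract can be sketched in a few lines; the exponent and the direction of the ramp below are illustrative assumptions, not the authors' settings.

```python
# Poly-style weight schedule for balancing two task losses during training (illustrative).
def poly_weight(epoch, max_epochs, power=0.9):
    """Decays from 1.0 to 0.0 over training following (1 - t)^power."""
    return (1.0 - epoch / max_epochs) ** power

def joint_loss(seg_loss, saliency_loss, epoch, max_epochs):
    w = poly_weight(epoch, max_epochs)
    return w * seg_loss + (1.0 - w) * saliency_loss   # shift emphasis between the two tasks

print([round(poly_weight(e, 100), 3) for e in (0, 25, 50, 75, 99)])
```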
What you get is what you see: Decomposing Epistemic Planning using Functional STRIPS
Title | What you get is what you see: Decomposing Epistemic Planning using Functional STRIPS |
Authors | Guang Hu, Tim Miller, Nir Lipovetzky |
Abstract | Epistemic planning — planning with knowledge and belief — is essential in many multi-agent and human-agent interaction domains. Most state-of-the-art epistemic planners solve this problem by compiling to propositional classical planning, for example, generating all possible knowledge atoms, or compiling epistemic formulae to normal forms. However, these methods become computationally infeasible as problems grow. In this paper, we decompose epistemic planning by delegating reasoning about epistemic formulae to an external solver. We do this by modelling the problem using *functional STRIPS*, which is more expressive than standard STRIPS and supports the use of external, black-box functions within action models. Exploiting recent work that demonstrates the relationship between what an agent 'sees' and what it knows, we allow modellers to provide new implementations of external functions. These define what agents see in their environment, allowing new epistemic logics to be defined without changing the planner. As a result, it increases the capability and flexibility of the epistemic model itself, and avoids the exponential pre-compilation step. We ran evaluations on well-known epistemic planning benchmarks to compare with an existing state-of-the-art planner, and on new scenarios based on different external functions. The results show that our planner scales significantly better than the state-of-the-art planner against which we compared, and can express problems more succinctly. |
Tasks | |
Published | 2019-03-28 |
URL | http://arxiv.org/abs/1903.11777v2 |
http://arxiv.org/pdf/1903.11777v2.pdf | |
PWC | https://paperswithcode.com/paper/what-you-get-is-what-you-see-decomposing |
Repo | |
Framework | |
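The external-function idea can be illustrated with a toy 'sees' predicate written as an ordinary Python function that a functional STRIPS planner could call as a black box; the state layout, parameter names, and viewing-cone rule are invented for this example, not taken from the paper.

```python
# Toy external visibility function: a range-and-facing check in 2D (illustrative).
import math

def sees(agent_pos, agent_dir, target_pos, fov_deg=90.0, max_range=5.0):
    """True if target_pos falls inside the agent's viewing cone."""
    dx, dy = target_pos[0] - agent_pos[0], target_pos[1] - agent_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return True
    if dist > max_range:
        return False
    angle = math.degrees(math.atan2(dy, dx)) - agent_dir
    angle = (angle + 180.0) % 360.0 - 180.0            # wrap to (-180, 180]
    return abs(angle) <= fov_deg / 2.0

# An agent at (0, 0) facing east (0 degrees) sees a target at (3, 1) but not one behind it.
print(sees((0, 0), 0.0, (3, 1)), sees((0, 0), 0.0, (-3, 0)))
```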