January 29, 2020

3296 words 16 mins read

Paper Group ANR 502

Winograd Convolution for DNNs: Beyond linear polynomials. Artificial neural networks in action for an automated cell-type classification of biological neural networks. Object Viewpoint Classification Based 3D Bounding Box Estimation for Autonomous Vehicles. Adversarial Privacy Preservation under Attribute Inference Attack. Deep Learning Techniques …

Winograd Convolution for DNNs: Beyond linear polynomials

Title Winograd Convolution for DNNs: Beyond linear polynomials
Authors Barbara Barabasz, David Gregg
Abstract Winograd convolution is widely used in deep neural networks (DNNs). Existing work for DNNs considers only the subset of Winograd algorithms that are equivalent to Toom-Cook convolution. We investigate a wider range of Winograd algorithms for DNNs and show that these additional algorithms can significantly improve floating point (FP) accuracy in many cases. We present results for three FP formats: fp32, fp16 and bf16 (a truncated form of fp32) using 2000 inputs from the ImageNet dataset. We found that in fp16 this approach gives up to 6.5 times better image recognition accuracy in one important case, while maintaining the same number of elementwise multiplication operations in the innermost loop. In bf16 the convolution can be computed using 5% fewer innermost-loop multiplications than with currently used Winograd algorithms, while keeping image recognition accuracy the same as for the direct convolution method.
Tasks
Published 2019-05-13
URL https://arxiv.org/abs/1905.05233v2
PDF https://arxiv.org/pdf/1905.05233v2.pdf
PWC https://paperswithcode.com/paper/winograd-convolution-for-dnns-beyond-linear
Repo
Framework
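
The Toom-Cook/Winograd family discussed in this abstract can be illustrated with the classic F(2,3) 1-D case: two outputs of a 3-tap correlation computed with four elementwise multiplications instead of six. Below is a minimal NumPy sketch using the standard (Toom-Cook-equivalent) F(2,3) transform matrices, not the paper's higher-degree variants:

```python
import numpy as np

# Standard Winograd F(2,3) transforms: Y = A^T [(G g) * (B^T d)]
BT = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=np.float32)
G = np.array([[1,    0,   0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0,    0,   1]], dtype=np.float32)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23(d, g):
    """Two outputs of the valid correlation of a length-4 input tile d with a 3-tap filter g,
    using only 4 elementwise multiplications in the transformed domain."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
g = np.array([0.5, -1.0, 2.0], dtype=np.float32)
print(winograd_f23(d, g))                 # Winograd result
print(np.correlate(d, g, mode='valid'))   # direct result, should match
```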

Artificial neural networks in action for an automated cell-type classification of biological neural networks

Title Artificial neural networks in action for an automated cell-type classification of biological neural networks
Authors Eirini Troullinou, Grigorios Tsagkatakis, Spyridon Chavlis, Gergely Turi, Wen-Ke Li, Attila Losonczy, Panagiotis Tsakalides, Panayiota Poirazi
Abstract In this work we address the problem of neuronal cell-type classification, and we employ a real-world dataset of raw neuronal activity measurements obtained with calcium imaging techniques. While neuronal cell-type classification is a crucial step in understanding the function of neuronal circuits, and a systematic classification of neurons is therefore much needed, it still remains a challenge. In recent years, several approaches have been employed for reliable neuronal cell-type recognition, such as immunohistochemical (IHC) analysis and feature extraction algorithms based on several characteristics of neuronal cells. These methods, however, demand considerable human intervention and observation, are time-consuming, and, in the case of the feature extraction algorithms, it is not obvious which features best define a neuronal cell class. In this work we examine three different deep learning models aimed at automated neuronal cell-type classification and compare their performance. Experimental analysis demonstrates the efficacy and capabilities of each of the proposed schemes.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.09977v1
PDF https://arxiv.org/pdf/1911.09977v1.pdf
PWC https://paperswithcode.com/paper/artificial-neural-networks-in-action-for-an
Repo
Framework

Object Viewpoint Classification Based 3D Bounding Box Estimation for Autonomous Vehicles

Title Object Viewpoint Classification Based 3D Bounding Box Estimation for Autonomous Vehicles
Authors Zhou Lingtao, Fang Jiaojiao, Liu Guizhong
Abstract 3D object detection is one of the most important tasks for the perception systems of autonomous vehicles. With the significant success in the field of 2D object detection, several monocular image based 3D object detection algorithms have been proposed based on advanced 2D object detectors and the geometric constraints between the 2D and 3D bounding boxes. In this paper, we propose a novel method for determining the configuration of the 2D-3D geometric constraints, built on the well-known 2D-3D two-stage object detection framework. First, we discretize the viewpoints from which the camera observes the object into 16 categories according to the observation relationship between the camera and the object. Second, we design a viewpoint classifier by integrating a new sub-branch into the existing multi-branch CNN. The configuration of the geometric constraints between the 2D and 3D bounding boxes can then be determined according to the output of this classifier. Extensive experiments on the KITTI dataset show that our method not only improves computational efficiency but also increases the overall precision of the model, especially for orientation angle estimation.
Tasks 3D Object Detection, Autonomous Vehicles, Object Detection
Published 2019-09-03
URL https://arxiv.org/abs/1909.01025v1
PDF https://arxiv.org/pdf/1909.01025v1.pdf
PWC https://paperswithcode.com/paper/object-viewpoint-classification-based-3d
Repo
Framework
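
As an illustration of the viewpoint-discretization step described in this abstract, a small helper might look like the sketch below. The paper does not specify the exact binning here, so the 16 equal-width azimuth bins are an assumption:

```python
import math

def viewpoint_bin(observation_angle_rad: float, num_bins: int = 16) -> int:
    """Map the camera-to-object observation angle to one of num_bins discrete
    viewpoint classes. Equal-width azimuth bins are an illustrative assumption."""
    theta = observation_angle_rad % (2 * math.pi)        # wrap to [0, 2*pi)
    return int(theta / (2 * math.pi) * num_bins) % num_bins

# Example: an object observed at roughly 100 degrees falls into bin 4 of 16.
print(viewpoint_bin(math.radians(100)))
```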

Adversarial Privacy Preservation under Attribute Inference Attack

Title Adversarial Privacy Preservation under Attribute Inference Attack
Authors Han Zhao, Jianfeng Chi, Yuan Tian, Geoffrey J. Gordon
Abstract With the prevalence of machine learning services, crowdsourced data containing sensitive information poses substantial privacy challenges. Existing work focused on protecting against membership inference attacks under the rigorous framework of differential privacy is vulnerable to attribute inference attacks. In light of the current gap between theory and practice, we develop a novel theoretical framework for privacy preservation under attribute inference attacks. Within our framework, we propose a minimax optimization formulation to protect the given attribute and analyze its privacy guarantees against arbitrary adversaries. On the other hand, the privacy constraint may cripple utility when the protected attribute is correlated with the target variable. To this end, we also prove an information-theoretic lower bound that precisely characterizes the fundamental trade-off between utility and privacy. Empirically, we conduct extensive experiments to corroborate our privacy guarantee and validate the inherent trade-offs in different privacy preservation algorithms. Our experimental results indicate that adversarial representation learning approaches achieve the best trade-off in terms of privacy preservation and utility maximization.
Tasks Inference Attack, Representation Learning
Published 2019-06-19
URL https://arxiv.org/abs/1906.07902v2
PDF https://arxiv.org/pdf/1906.07902v2.pdf
PWC https://paperswithcode.com/paper/adversarial-task-specific-privacy
Repo
Framework
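
One common way to realize a minimax objective like the one in this abstract is DANN-style gradient reversal: the encoder minimizes the task loss while maximizing the adversary's loss on the protected attribute. The PyTorch sketch below is illustrative only; the module sizes, the toy data, and the reversal trick itself are assumptions, not the paper's exact construction:

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU())   # shared representation
task_head = nn.Linear(32, 2)                             # target-variable prediction
adversary = nn.Linear(32, 2)                             # protected-attribute prediction
opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(task_head.parameters()) +
                       list(adversary.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(128, 64)            # toy batch (placeholder for real features)
y = torch.randint(0, 2, (128,))     # target labels
a = torch.randint(0, 2, (128,))     # protected attribute

for step in range(100):
    z = encoder(x)
    loss_task = ce(task_head(z), y)
    # The adversary learns to predict the attribute; the reversed gradient pushes the
    # encoder to remove attribute information, giving the minimax trade-off.
    loss_adv = ce(adversary(GradReverse.apply(z, 1.0)), a)
    opt.zero_grad()
    (loss_task + loss_adv).backward()
    opt.step()
```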

Deep Learning Techniques for Improving Digital Gait Segmentation

Title Deep Learning Techniques for Improving Digital Gait Segmentation
Authors Matteo Gadaleta, Giulia Cisotto, Michele Rossi, Rana Zia Ur Rehman, Lynn Rochester, Silvia Del Din
Abstract Wearable technology for the automatic detection of gait events has recently gained growing interest, enabling advanced analyses that were previously limited to specialist centres and equipment (e.g., instrumented walkways). In this study, we present a novel method based on dilated convolutions for accurate detection of gait events (initial and final foot contacts) from wearable inertial sensors. A rich dataset was used to validate the method, featuring 71 people with Parkinson’s disease (PD) and 67 healthy control subjects. Multiple sensors were considered, one located on the fifth lumbar vertebra and two on the ankles. The aims of this study were: (i) to apply deep learning (DL) techniques to wearable sensor data for gait segmentation and quantification in older adults and in people with PD; (ii) to validate the proposed technique for measuring gait against a traditional gold-standard laboratory reference and a widely used algorithm based on wavelet transforms (WT); (iii) to assess the performance of DL methods in estimating high-level gait characteristics, with a focus on stride-, stance- and swing-related features. The results showed high reliability of the proposed approach, which achieves temporal errors considerably smaller than WT, in particular for the detection of final contacts, with an inter-quartile range below 70 ms in the worst case. This study shows encouraging results and paves the way for further research addressing the effectiveness and generalization of data-driven learning systems for accurate event detection in challenging conditions.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.04281v1
PDF https://arxiv.org/pdf/1907.04281v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-techniques-for-improving
Repo
Framework
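
A minimal PyTorch sketch of the kind of dilated temporal convolution described in this abstract, framing gait-event detection as per-sample classification of inertial signals into {none, initial contact, final contact}. The channel counts, depth, and three-class framing are assumptions for illustration, not the authors' exact architecture:

```python
import torch
from torch import nn

class DilatedGaitNet(nn.Module):
    """Stacked 1-D convolutions with exponentially growing dilation; padding equal to the
    dilation keeps the temporal resolution, so every sample receives an event score."""
    def __init__(self, in_channels=6, hidden=32, num_classes=3, levels=5):
        super().__init__()
        layers, ch = [], in_channels
        for i in range(levels):
            d = 2 ** i
            layers += [nn.Conv1d(ch, hidden, kernel_size=3, dilation=d, padding=d),
                       nn.ReLU()]
            ch = hidden
        self.backbone = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, num_classes, kernel_size=1)

    def forward(self, x):                    # x: (batch, channels, time)
        return self.head(self.backbone(x))   # (batch, num_classes, time)

net = DilatedGaitNet()
imu = torch.randn(2, 6, 1000)   # e.g. two windows of tri-axial accelerometer + gyroscope
print(net(imu).shape)           # torch.Size([2, 3, 1000])
```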

Limitations and Biases in Facial Landmark Detection – An Empirical Study on Older Adults with Dementia

Title Limitations and Biases in Facial Landmark Detection – An Empirical Study on Older Adults with Dementia
Authors Azin Asgarian, Shun Zhao, Ahmed B. Ashraf, M. Erin Browne, Kenneth M. Prkachin, Alex Mihailidis, Thomas Hadjistavropoulos, Babak Taati
Abstract Accurate facial expression analysis is an essential step in various clinical applications that involve physical and mental health assessments of older adults (e.g. diagnosis of pain or depression). Although remarkable progress has been achieved toward developing robust facial landmark detection methods, state-of-the-art methods still face many challenges when encountering uncontrolled environments, different ranges of facial expressions, and different demographics of the population. A recent study revealed that the health status of individuals can also affect the performance of facial landmark detection methods on front views of faces. In this work, we investigate this matter in a much greater context using seven facial landmark detection methods. We perform our evaluation not only on frontal faces but also on profile faces and in various regions of the face. Our results shed light on the limitations of existing methods and the challenges of applying them in clinical settings by indicating: 1) a significant difference between the performance of state-of-the-art methods when tested on profile or frontal faces of individuals with vs. without dementia; 2) insights into the existing bias for all regions of the face; and 3) the presence of this bias despite re-training/fine-tuning with various configurations of six datasets.
Tasks Facial Landmark Detection
Published 2019-05-17
URL https://arxiv.org/abs/1905.07446v1
PDF https://arxiv.org/pdf/1905.07446v1.pdf
PWC https://paperswithcode.com/paper/limitations-and-biases-in-facial-landmark
Repo
Framework

Formal Verification of Decision-Tree Ensemble Model and Detection of its Violating-input-value Ranges

Title Formal Verification of Decision-Tree Ensemble Model and Detection of its Violating-input-value Ranges
Authors Naoto Sato, Hironobu Kuruma, Yuichiroh Nakagawa, Hideto Ogawa
Abstract As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train it so that it returns the correct output value for every input value. Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect the input values that lead to malfunctions of the system (failures) during development and to take appropriate measures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process input values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition on the input values that lead to malfunctions of the system. Given that necessity, in this paper we propose a method for formally verifying a DTEM and, according to the result of the verification, if an input value leading to a failure is found, extracting the range in which such an input value exists. The proposed method can comprehensively extract the range in which the input values leading to failures exist; therefore, by creating an input filter based on that range, it is possible to prevent failures from occurring in the system. In this paper, the algorithm of the proposed method is described, and the results of a case study using a dataset of house prices are presented. On the basis of those results, the feasibility of the proposed method is demonstrated, and its scalability is evaluated.
Tasks
Published 2019-04-26
URL http://arxiv.org/abs/1904.11753v1
PDF http://arxiv.org/pdf/1904.11753v1.pdf
PWC https://paperswithcode.com/paper/formal-verification-of-decision-tree-ensemble
Repo
Framework
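
The core of the range-extraction idea above can be illustrated on a single decision tree: enumerate root-to-leaf paths, intersect the threshold intervals along each path, and report the input boxes whose leaf prediction violates a given property. The dict-based toy tree and the non-negativity property below are illustrative assumptions, not the paper's algorithm for full ensembles:

```python
import math

# Toy tree: internal nodes split on feature index f at threshold t (go left if x[f] <= t).
tree = {"f": 0, "t": 100.0,
        "left":  {"leaf": 250.0},
        "right": {"f": 1, "t": 3.0,
                  "left":  {"leaf": 480.0},
                  "right": {"leaf": -50.0}}}   # a clearly suspicious prediction

def violating_ranges(node, violates, box=None, n_features=2):
    """Return (box, prediction) pairs for every leaf whose output violates the property."""
    if box is None:
        box = [(-math.inf, math.inf)] * n_features
    if "leaf" in node:
        return [(box, node["leaf"])] if violates(node["leaf"]) else []
    f, t = node["f"], node["t"]
    lo, hi = box[f]
    left_box = box[:f] + [(lo, min(hi, t))] + box[f + 1:]
    right_box = box[:f] + [(max(lo, t), hi)] + box[f + 1:]
    return (violating_ranges(node["left"], violates, left_box, n_features) +
            violating_ranges(node["right"], violates, right_box, n_features))

# Property: the predicted house price must be non-negative.
for rect, pred in violating_ranges(tree, violates=lambda y: y < 0):
    print("violating input box:", rect, "-> prediction", pred)
```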

Texture retrieval using periodically extended and adaptive curvelets

Title Texture retrieval using periodically extended and adaptive curvelets
Authors Hasan Al-Marzouqi, Yuting Hu, Ghassan AlRegib
Abstract Image retrieval is an important problem in the area of multimedia processing. This paper presents two new curvelet-based algorithms for texture retrieval that are suitable for use in constrained-memory devices. The developed algorithms are tested on three publicly available texture datasets: CUReT, Mondial-Marmi, and STex-fabric. Our experiments confirm the effectiveness of the proposed system. Furthermore, a weighted version of the retrieval algorithm is introduced and shown to achieve promising results in the classification of seismic activities.
Tasks Image Retrieval
Published 2019-05-24
URL https://arxiv.org/abs/1905.09976v1
PDF https://arxiv.org/pdf/1905.09976v1.pdf
PWC https://paperswithcode.com/paper/texture-retrieval-using-periodically-extended
Repo
Framework
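
Curvelet-based retrieval of the kind described in this entry typically summarizes each subband with a few statistics and ranks database images by a distance between feature vectors. The sketch below uses mean/standard-deviation features and an (optionally weighted) L1 distance as a stand-in for the paper's descriptors and similarity measure; the random arrays stand in for curvelet coefficients:

```python
import numpy as np

def subband_features(subbands):
    """Concatenate the mean and std of each (curvelet) subband into one descriptor."""
    return np.array([s for band in subbands for s in (band.mean(), band.std())])

def retrieve(query_feat, db_feats, weights=None, top_k=5):
    """Rank database images by (optionally weighted) L1 distance to the query."""
    w = np.ones_like(query_feat) if weights is None else weights
    dists = np.abs(db_feats - query_feat) @ w
    return np.argsort(dists)[:top_k]

# Toy example: 3 'images', each with 4 random subbands standing in for curvelet coefficients.
rng = np.random.default_rng(0)
db = np.stack([subband_features([rng.normal(size=(8, 8)) for _ in range(4)])
               for _ in range(3)])
print(retrieve(db[1], db, top_k=2))   # the query's own entry ranks first
```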

Open Domain Web Keyphrase Extraction Beyond Language Modeling

Title Open Domain Web Keyphrase Extraction Beyond Language Modeling
Authors Lee Xiong, Chuan Hu, Chenyan Xiong, Daniel Campos, Arnold Overwijk
Abstract This paper studies keyphrase extraction in real-world scenarios where documents come from diverse domains and have varying content quality. We curate and release OpenKP, a large-scale open-domain keyphrase extraction dataset with nearly one hundred thousand web documents and expert keyphrase annotations. To handle the variations in domain and content quality, we develop BLING-KPE, a neural keyphrase extraction model that goes beyond language understanding by using visual presentations of documents and weak supervision from search queries. Experimental results on OpenKP confirm the effectiveness of BLING-KPE and the contributions of its neural architecture, visual features, and search-log weak supervision. Zero-shot evaluations on DUC-2001 demonstrate the improved generalization ability of learning from open-domain data compared to a specific domain.
Tasks Language Modelling
Published 2019-11-06
URL https://arxiv.org/abs/1911.02671v1
PDF https://arxiv.org/pdf/1911.02671v1.pdf
PWC https://paperswithcode.com/paper/open-domain-web-keyphrase-extraction-beyond-1
Repo
Framework

An Efficient Multi-Domain Framework for Image-to-Image Translation

Title An Efficient Multi-Domain Framework for Image-to-Image Translation
Authors Ye Lin, Keren Fu, Shenggui Ling, Cheng Peng
Abstract Many approaches have been proposed in recent years to tackle unsupervised image-to-image translation. However, they mainly focus on one-to-one mappings, making it difficult to handle more general and practical problems such as multi-domain translation. To address issues such as the large cost of training time and resources in translation between any number of domains, we propose a general framework called the multi-domain translator (MDT), which extends bi-directional image-to-image translation. For efficiency, MDT is designed with only one domain-shared encoder, together with several domain-specific decoders that transform an image into multiple domains without knowing the input domain label. Moreover, we propose to employ two constraints, namely a reconstruction loss and an identity loss, to further improve the generation. Experiments are conducted on different databases for several multi-domain translation tasks. Both qualitative and quantitative results demonstrate the effectiveness and efficiency of the proposed MDT compared with state-of-the-art models.
Tasks Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published 2019-11-28
URL https://arxiv.org/abs/1911.12552v1
PDF https://arxiv.org/pdf/1911.12552v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-multi-domain-framework-for-image
Repo
Framework
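
A schematic PyTorch sketch of the architectural idea described in this abstract: one shared encoder, one decoder per domain, and reconstruction plus identity losses. Layer shapes and loss weights are placeholders rather than the paper's configuration, and the adversarial part of a full translation framework is omitted for brevity:

```python
import torch
from torch import nn

class MDTSketch(nn.Module):
    def __init__(self, num_domains=3, ch=3, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(                       # domain-shared encoder
            nn.Conv2d(ch, hidden, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.decoders = nn.ModuleList([                     # one decoder per target domain
            nn.Sequential(nn.ConvTranspose2d(hidden, ch, 4, stride=2, padding=1), nn.Tanh())
            for _ in range(num_domains)])

    def forward(self, x, domain):
        return self.decoders[domain](self.encoder(x))

model = MDTSketch()
l1 = nn.L1Loss()
x = torch.rand(4, 3, 64, 64)                 # toy batch of images from source domain 0
src = 0
fake = model(x, domain=1)                    # translate to domain 1
loss_identity = l1(model(x, domain=src), x)  # mapping to its own domain should be identity
loss_recon = l1(model(fake, domain=src), x)  # translate back to the source domain
loss = loss_recon + loss_identity            # adversarial terms would be added here
loss.backward()
```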

Nonlinear System Identification via Tensor Completion

Title Nonlinear System Identification via Tensor Completion
Authors Nikos Kargas, Nicholas D. Sidiropoulos
Abstract Function approximation from input and output data pairs constitutes a fundamental problem in supervised learning. Deep neural networks are currently the most popular method for learning to mimic the input-output relationship of a general nonlinear system, as they have proven to be very effective in approximating complex highly nonlinear functions. In this work, we show that identifying a general nonlinear function $y = f(x_1,\ldots,x_N)$ from input-output examples can be formulated as a tensor completion problem, and that under certain conditions provably correct nonlinear system identification is possible. Specifically, we model the interactions between the $N$ input variables and the scalar output of a system by a single $N$-way tensor, and set up a weighted low-rank tensor completion problem with smoothness regularization, which we tackle using a block coordinate descent algorithm. We extend our method to the multi-output setting and the case of partially observed data, which cannot be readily handled by neural networks. Finally, we demonstrate the effectiveness of the approach on several regression tasks, including standard benchmarks and a challenging student grade prediction task.
Tasks
Published 2019-06-13
URL https://arxiv.org/abs/1906.05746v3
PDF https://arxiv.org/pdf/1906.05746v3.pdf
PWC https://paperswithcode.com/paper/nonlinear-system-identification-via-tensor
Repo
Framework
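
The masked low-rank CP objective at the heart of this formulation can be sketched with plain gradient descent on the factor matrices. The paper uses block coordinate descent with smoothness regularization; the simplified NumPy version below (step size, iteration count, and the toy rank-2 example) is an assumption for illustration only:

```python
import numpy as np

def cp_complete(T, W, rank=3, lr=0.2, iters=3000, seed=0):
    """Fit T ~ sum_r a_r (x) b_r (x) c_r using only the entries where the 0/1 mask W is 1,
    by gradient descent on the CP factor matrices."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A, B, C = (rng.normal(scale=0.3, size=(n, rank)) for n in (I, J, K))
    nobs = W.sum()
    for _ in range(iters):
        R = W * (np.einsum('ir,jr,kr->ijk', A, B, C) - T) / nobs   # masked mean residual
        A = A - lr * np.einsum('ijk,jr,kr->ir', R, B, C)
        B = B - lr * np.einsum('ijk,ir,kr->jr', R, A, C)
        C = C - lr * np.einsum('ijk,ir,jr->kr', R, A, B)
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# Toy check: a rank-2 tensor with roughly 60% of its entries observed.
rng = np.random.default_rng(1)
A0, B0, C0 = rng.normal(size=(5, 2)), rng.normal(size=(6, 2)), rng.normal(size=(7, 2))
T_true = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
W = (rng.random(T_true.shape) < 0.6).astype(float)
T_hat = cp_complete(T_true * W, W, rank=2)
print(np.abs((T_hat - T_true)[W == 0]).mean())   # error on the unobserved entries
```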

Incremental Binarization On Recurrent Neural Networks For Single-Channel Source Separation

Title Incremental Binarization On Recurrent Neural Networks For Single-Channel Source Separation
Authors Sunwoo Kim, Mrinmoy Maity, Minje Kim
Abstract This paper proposes a Bitwise Gated Recurrent Unit (BGRU) network for the single-channel source separation task. Recurrent Neural Networks (RNNs) require several sets of weights within their cells, which significantly increases the computational cost compared to fully-connected networks. To mitigate this increased computation, we focus on the GRU cells and quantize the feedforward procedure with binarized values and bitwise operations. The BGRU network is trained in two stages. The real-valued weights are pretrained and transferred to the bitwise network, and are then incrementally binarized to minimize the potential loss that can occur from a sudden introduction of quantization. As the proposed binarization technique turns only a few randomly chosen parameters into their binary versions at a time, it gives the network training procedure a chance to gently adapt to the partly quantized version of the network. It eventually achieves full binarization by incrementally increasing the amount of binarization over the iterations. Our experiments show that the proposed BGRU method produces source separation results better than those of a real-valued fully connected network, with 11-12 dB mean Signal-to-Distortion Ratio (SDR). A fully binarized BGRU still outperforms a Bitwise Neural Network (BNN) by 1-2 dB, even with fewer layers.
Tasks Quantization
Published 2019-08-23
URL https://arxiv.org/abs/1908.08898v1
PDF https://arxiv.org/pdf/1908.08898v1.pdf
PWC https://paperswithcode.com/paper/incremental-binarization-on-recurrent-neural
Repo
Framework
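
The incremental-binarization schedule described above can be sketched independently of the GRU itself: at each stage a growing, randomly chosen fraction of the pretrained real-valued weights is replaced by its sign, so the network adapts gradually to the quantized version. A NumPy illustration, where the per-tensor scaling and the 25%/50% schedule are assumptions:

```python
import numpy as np

def incremental_binarize(weights, frac, mask=None, rng=None):
    """Binarize an additional random subset of weights so that `frac` of all entries are
    binary (+/- the mean magnitude); entries that are already binarized stay binarized."""
    rng = rng or np.random.default_rng(0)
    if mask is None:
        mask = np.zeros(weights.shape, dtype=bool)
    target = int(frac * weights.size)
    candidates = np.flatnonzero(~mask.ravel())
    extra = rng.choice(candidates, size=max(0, target - mask.sum()), replace=False)
    mask.ravel()[extra] = True
    scale = np.abs(weights).mean()          # simple per-tensor scaling (assumption)
    out = weights.copy()
    out[mask] = scale * np.sign(weights[mask])
    return out, mask

W = np.random.default_rng(0).normal(size=(4, 4))
Wq, m = incremental_binarize(W, frac=0.25)            # 25% of the weights binarized ...
Wq, m = incremental_binarize(Wq, frac=0.50, mask=m)   # ... then 50%, and so on per stage
print(m.mean())                                       # fraction of binarized entries: 0.5
```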

Direct Energy-resolving CT Imaging via Energy-integrating CT images using a Unified Generative Adversarial Network

Title Direct Energy-resolving CT Imaging via Energy-integrating CT images using a Unified Generative Adversarial Network
Authors Lisha Yao, Sui Li, Manman Zhu, Dong Zeng, Zhaoying Bian, Jianhua Ma
Abstract Energy-resolving computed tomography (ErCT) has the ability to acquire energy-dependent measurements simultaneously and to provide quantitative material information with an improved contrast-to-noise ratio. However, an ErCT imaging system is usually equipped with an advanced photon counting detector, which is expensive and technically complex. Therefore, clinical ErCT scanners are not yet commercially available and are at various stages of completion, which makes ErCT images less accessible to researchers. In this work, we investigate producing ErCT images directly from existing energy-integrating CT (EiCT) images via a deep neural network. Specifically, unlike other networks that produce ErCT images at one specific energy, this model employs a unified generative adversarial network (uGAN) to concurrently train on EiCT and ErCT datasets at different energies, and then performs image-to-image translation from existing EiCT images to multiple ErCT image outputs at various energy bins. In this study, the present uGAN generates ErCT images at 70 keV, 90 keV, 110 keV, and 130 keV simultaneously from EiCT images at 140 kVp. We evaluate the uGAN model on a set of over 1380 CT image slices and show that it can produce promising ErCT estimation results compared with the ground truth, both qualitatively and quantitatively.
Tasks Image-to-Image Translation
Published 2019-10-14
URL https://arxiv.org/abs/1910.06154v1
PDF https://arxiv.org/pdf/1910.06154v1.pdf
PWC https://paperswithcode.com/paper/direct-energy-resolving-ct-imaging-via-energy
Repo
Framework

Localizing Occluders with Compositional Convolutional Networks

Title Localizing Occluders with Compositional Convolutional Networks
Authors Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille
Abstract Compositional convolutional networks are generative compositional models of neural network features that achieve state-of-the-art results when classifying partially occluded objects, even when they have not been exposed to occluded objects during training. In this work, we study the performance of CompositionalNets at localizing occluders in images. We show that the original model is not able to localize occluders well. We propose to overcome this limitation by modeling the feature activations as a mixture of von Mises-Fisher distributions, which also allows for end-to-end training of CompositionalNets. Our experimental results demonstrate that the proposed extensions increase the model’s performance at localizing occluders as well as at classifying partially occluded objects.
Tasks
Published 2019-11-18
URL https://arxiv.org/abs/1911.08571v1
PDF https://arxiv.org/pdf/1911.08571v1.pdf
PWC https://paperswithcode.com/paper/localizing-occluders-with-compositional
Repo
Framework
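
The occluder-localization idea above scores each feature vector under a mixture of von Mises-Fisher (vMF) distributions; positions whose features are poorly explained by the object's vMF kernels are candidate occluder locations. A toy sketch of the mixture responsibilities, where the shared concentration parameter, equal mixture weights, and the omission of the dimension-dependent normalizer (which cancels in the softmax under these assumptions) are simplifications:

```python
import numpy as np

def vmf_responsibilities(features, mus, kappa=20.0):
    """features: (N, D) feature vectors; mus: (K, D) vMF mean directions.
    Returns (N, K) posterior responsibilities under equal mixture weights and shared kappa."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    m = mus / np.linalg.norm(mus, axis=1, keepdims=True)
    logits = kappa * f @ m.T                      # vMF log-likelihood up to a constant
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
mus = rng.normal(size=(4, 16))        # 4 learned vMF kernels (toy values)
feats = rng.normal(size=(10, 16))     # 10 feature vectors from a CNN layer
resp = vmf_responsibilities(feats, mus)
print(resp.shape, resp.sum(axis=1))   # (10, 4); each row sums to 1
```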

An efficient branch-and-cut algorithm for approximately submodular function maximization

Title An efficient branch-and-cut algorithm for approximately submodular function maximization
Authors Naoya Uematsu, Shunji Umetani, Yoshinobu Kawahara
Abstract When approaching problems in computer science, we often encounter situations in which a subset of a finite set maximizing some utility function needs to be selected. Some such utility functions are known to be approximately submodular. For the problem of maximizing an approximately submodular function (the ASFM problem), a greedy algorithm quickly finds good feasible solutions for many instances while guaranteeing a ($1-e^{-\gamma}$)-approximation ratio for a given submodular ratio $\gamma$. However, we still encounter applications that require more accurate or exactly optimal solutions within a reasonable computation time. In this paper, we present an efficient branch-and-cut algorithm for the non-decreasing ASFM problem based on its binary integer programming (BIP) formulation with an exponential number of constraints. To this end, we first derive a BIP formulation of the ASFM problem and then develop an improved constraint generation algorithm that starts from a reduced BIP problem with a small subset of constraints and repeatedly solves the reduced BIP problem while adding a promising set of constraints at each iteration. Moreover, we incorporate it into a branch-and-cut algorithm to attain good upper bounds while solving a smaller number of nodes of a search tree. Computational results for three types of well-known benchmark instances show that our algorithm performs better than conventional exact algorithms.
Tasks
Published 2019-04-26
URL http://arxiv.org/abs/1904.12682v1
PDF http://arxiv.org/pdf/1904.12682v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-branch-and-cut-algorithm-for
Repo
Framework
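
For contrast with the exact branch-and-cut approach, the greedy baseline mentioned in the abstract (which carries the ($1-e^{-\gamma}$) guarantee for a non-decreasing approximately submodular function in the standard cardinality-constrained setting) is only a few lines. The coverage-style utility below is an arbitrary stand-in, not one of the paper's benchmark instances:

```python
def greedy_max(ground_set, f, k):
    """Greedy maximization of a set function f under the cardinality constraint |S| <= k."""
    S = set()
    for _ in range(k):
        best = max((e for e in ground_set if e not in S),
                   key=lambda e: f(S | {e}) - f(S), default=None)
        if best is None or f(S | {best}) <= f(S):   # stop when no element adds value
            break
        S.add(best)
    return S

# Toy coverage utility: each element covers a set of items; f(S) = number of items covered.
covers = {1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'c', 'd', 'e'}, 4: {'a'}}
f = lambda S: len(set().union(*(covers[e] for e in S))) if S else 0
print(greedy_max(covers.keys(), f, k=2))   # prints {1, 3}: together they cover all five items
```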