October 19, 2019

2734 words 13 mins read

Paper Group ANR 273

A Brain-Inspired Trust Management Model to Assure Security in a Cloud based IoT Framework for Neuroscience Applications. Using Automatic Generation of Relaxation Constraints to Improve the Preimage Attack on 39-step MD4. Skeptical Deep Learning with Distribution Correction. Heterogeneous Bitwidth Binarization in Convolutional Neural Networks. Multi …

A Brain-Inspired Trust Management Model to Assure Security in a Cloud based IoT Framework for Neuroscience Applications

Title A Brain-Inspired Trust Management Model to Assure Security in a Cloud based IoT Framework for Neuroscience Applications
Authors Mufti Mahmud, M. Shamim Kaiser, M. Mostafizur Rahman, M. Arifur Rahman, Antesar Shabut, Shamim Al-Mamun, Amir Hussain
Abstract The rapidly growing popularity of the Internet of Things (IoT) and cloud computing permits neuroscientists to collect multilevel and multichannel brain data to better understand brain functions, diagnose diseases, and devise treatments. To ensure secure and reliable data communication between end-to-end (E2E) devices supported by current IoT and cloud infrastructure, trust management is needed at the IoT and user ends. This paper introduces a neuro-fuzzy, brain-inspired trust management model (TMM) to secure IoT devices and relay nodes, and to ensure data reliability. The proposed TMM assesses a node's trustworthiness from node behavioral trust and data trust, estimated using an Adaptive Neuro-Fuzzy Inference System and a weighted-additive method, respectively. NS2 simulation results confirm that the proposed TMM is more robust and accurate than existing fuzzy-based TMMs in identifying malicious nodes in the communication network. With the growing usage of cloud-based IoT frameworks in neuroscience research, integrating the proposed TMM into the existing infrastructure will assure secure and reliable data communication among E2E devices.
Tasks
Published 2018-01-11
URL http://arxiv.org/abs/1801.03984v1
PDF http://arxiv.org/pdf/1801.03984v1.pdf
PWC https://paperswithcode.com/paper/a-brain-inspired-trust-management-model-to
Repo
Framework
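The two trust components above are easy to picture in code. Below is a minimal sketch of the weighted-additive part of the model; the metric names, weights, and threshold are illustrative assumptions, not values from the paper, and the ANFIS-based behavioral trust is omitted.

```python
def data_trust(metrics: dict, weights: dict) -> float:
    """Weighted-additive data trust: a convex combination of per-metric
    scores in [0, 1]. Metric names and weights are illustrative only."""
    total = sum(weights.values())
    return sum(weights[k] * metrics[k] for k in weights) / total

# Hypothetical data-quality metrics for one relay node, each in [0, 1].
metrics = {"consistency": 0.90, "timeliness": 0.70, "accuracy": 0.85}
weights = {"consistency": 0.40, "timeliness": 0.20, "accuracy": 0.40}

trust = data_trust(metrics, weights)
print(f"data trust = {trust:.3f}")
print("flag as malicious" if trust < 0.5 else "treat as trustworthy")
```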

Using Automatic Generation of Relaxation Constraints to Improve the Preimage Attack on 39-step MD4

Title Using Automatic Generation of Relaxation Constraints to Improve the Preimage Attack on 39-step MD4
Authors Gribanova Irina, Semenov Alexander
Abstract In this paper we construct a preimage attack on a truncated variant of the MD4 hash function. Specifically, we study the MD4-39 function defined by the first 39 steps of the MD4 algorithm. We suggest a new attack on MD4-39 that develops the ideas proposed by H. Dobbertin in 1998. Namely, special relaxation constraints are introduced to simplify the equations corresponding to the problem of finding a preimage for an arbitrary MD4-39 hash value. The equations supplemented with the relaxation constraints are then reduced to the Boolean satisfiability problem (SAT) and solved using state-of-the-art SAT solvers. We show that the effectiveness of a set of relaxation constraints can be evaluated using a black-box function of a special kind. Thus, we suggest an automatic method for generating relaxation constraints by applying black-box optimization to this function. The proposed method made it possible to find new relaxation constraints that yield a SAT-based preimage attack on MD4-39 that significantly outperforms previously known attacks.
Tasks
Published 2018-02-20
URL http://arxiv.org/abs/1802.06940v1
PDF http://arxiv.org/pdf/1802.06940v1.pdf
PWC https://paperswithcode.com/paper/using-automatic-generation-of-relaxation
Repo
Framework
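The search for good relaxation constraints is driven by a black-box objective, e.g., how quickly a SAT solver finishes on the relaxed preimage equations. A schematic of that outer loop is below; `solve_with_constraints` is a hypothetical stand-in for encoding MD4-39 plus the constraints to SAT and timing a solver (replaced by a dummy cost so the sketch runs), and the (1+1) evolutionary step is just one plausible black-box optimizer, not necessarily the paper's.

```python
import random

def solve_with_constraints(constraints, target):
    """Hypothetical: encode the MD4-39 preimage equations for `target`,
    add the relaxation `constraints`, run a SAT solver, return its runtime.
    A dummy cost stands in here so the sketch is runnable end to end."""
    return sum(constraints) + random.random()

def objective(constraints, targets):
    # Black-box fitness: mean solving cost over a sample of hash targets.
    return sum(solve_with_constraints(constraints, t) for t in targets) / len(targets)

def one_plus_one_ea(n_bits=16, iters=200, seed=0):
    """(1+1) EA over bit vectors choosing which relaxation constraints to use."""
    rng = random.Random(seed)
    targets = [rng.getrandbits(128) for _ in range(5)]
    best = [rng.randint(0, 1) for _ in range(n_bits)]
    best_cost = objective(best, targets)
    for _ in range(iters):
        cand = [b ^ (rng.random() < 1.0 / n_bits) for b in best]  # flip ~1 bit
        cost = objective(cand, targets)
        if cost <= best_cost:
            best, best_cost = cand, cost
    return best, best_cost

print(one_plus_one_ea())
```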

Skeptical Deep Learning with Distribution Correction

Title Skeptical Deep Learning with Distribution Correction
Authors Mingxiao An, Yongzhou Chen, Qi Liu, Chuanren Liu, Guangyi Lv, Fangzhao Wu, Jianhui Ma
Abstract Recently, deep neural networks have been successfully used for various classification tasks, especially for problems with massive, perfectly labeled training data. However, it is often costly to obtain large-scale credible labels in real-world applications. One solution is to make supervised learning robust to imperfectly labeled input. In this paper, we develop a distribution correction approach that allows deep neural networks to avoid overfitting imperfect training data. Specifically, we treat the noisy input as samples from an incorrect distribution, which is automatically corrected during our training process. We test our approach on several classification datasets with deliberately generated noisy labels. The results show significantly higher prediction and recovery accuracy with our approach compared to alternative methods.
Tasks
Published 2018-11-09
URL http://arxiv.org/abs/1811.03821v3
PDF http://arxiv.org/pdf/1811.03821v3.pdf
PWC https://paperswithcode.com/paper/skeptical-deep-learning-with-distribution
Repo
Framework
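One standard way to realize the "samples from an incorrect distribution" view is forward correction: model the observed labels as the clean class posterior pushed through a label-noise transition matrix and train against the corrected distribution. The PyTorch sketch below is a generic illustration of that idea under a symmetric-noise assumption, not the paper's exact correction procedure.

```python
import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_labels, T):
    """Cross-entropy against p(noisy y | x) = p(clean y | x) @ T,
    where T[i, j] = P(observed label j | true label i)."""
    p_clean = F.softmax(logits, dim=1)   # model's clean-class posterior
    p_noisy = p_clean @ T                # push through the noise model
    return F.nll_loss(torch.log(p_noisy + 1e-12), noisy_labels)

# Toy example: 3 classes with 20% symmetric label noise (an assumption).
num_classes, noise = 3, 0.2
T = torch.full((num_classes, num_classes), noise / (num_classes - 1))
T.fill_diagonal_(1.0 - noise)

logits = torch.randn(8, num_classes, requires_grad=True)
labels = torch.randint(0, num_classes, (8,))
loss = forward_corrected_loss(logits, labels, T)
loss.backward()
print(loss.item())
```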

Heterogeneous Bitwidth Binarization in Convolutional Neural Networks

Title Heterogeneous Bitwidth Binarization in Convolutional Neural Networks
Authors Josh Fromm, Shwetak Patel, Matthai Philipose
Abstract Recent work has shown that fast, compact low-bitwidth neural networks can be surprisingly accurate. These networks use homogeneous binarization: all parameters in each layer or (more commonly) the whole model have the same low bitwidth (e.g., 2 bits). However, modern hardware allows efficient designs where each arithmetic instruction can have a custom bitwidth, motivating heterogeneous binarization, where every parameter in the network may have a different bitwidth. In this paper, we show that it is feasible and useful to select bitwidths at parameter granularity during training. For instance, heterogeneously quantized versions of modern networks such as AlexNet and MobileNet, with the right mix of 1-, 2-, and 3-bit parameters averaging just 1.4 bits, can equal the accuracy of homogeneous 2-bit versions of these networks. Further, we provide analyses showing that heterogeneously binarized systems yield FPGA- and ASIC-based implementations that are correspondingly more efficient in both circuit area and energy than their homogeneous counterparts.
Tasks
Published 2018-05-25
URL http://arxiv.org/abs/1805.10368v2
PDF http://arxiv.org/pdf/1805.10368v2.pdf
PWC https://paperswithcode.com/paper/heterogeneous-bitwidth-binarization-in
Repo
Framework
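A common mechanism behind multi-bit binarization is residual binarization: each additional bit contributes another signed scale fitted to the remaining residual, so a parameter's bit budget is the number of residual passes applied to it. The numpy sketch below quantizes a tensor whose entries get 1, 2, or 3 bits in the 1.4-bit average mix mentioned in the abstract; the mechanics are an assumption for illustration, not the paper's training algorithm.

```python
import numpy as np

def residual_binarize(w, bits):
    """Quantize each entry of `w` with its own bit budget via residual
    binarization: q = sum_k alpha_k * sign(r_k), one term per bit."""
    q = np.zeros_like(w)
    residual = w.copy()
    for k in range(bits.max()):
        active = bits > k                        # entries still owed a bit
        if not active.any():
            break
        alpha = np.abs(residual[active]).mean()  # shared scale for this pass
        step = np.where(active, alpha * np.sign(residual), 0.0)
        q += step
        residual -= step
    return q

rng = np.random.default_rng(0)
w = rng.normal(size=1000)
# Mix of 1-, 2-, and 3-bit parameters averaging 1.4 bits, as in the abstract.
bits = rng.choice([1, 2, 3], size=w.shape, p=[0.7, 0.2, 0.1])
q = residual_binarize(w, bits)
print("avg bits:", bits.mean(), " MSE:", np.mean((w - q) ** 2))
```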

Multi-Task Learning for Argumentation Mining in Low-Resource Settings

Title Multi-Task Learning for Argumentation Mining in Low-Resource Settings
Authors Claudia Schulz, Steffen Eger, Johannes Daxenberger, Tobias Kahse, Iryna Gurevych
Abstract We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification. Our results show that MTL performs particularly well (and better than single-task learning) when little training data is available for the main task, a common scenario in AM. Our findings challenge previous assumptions that conceptualizations across AM datasets are divergent and that MTL is difficult for semantic or higher-level tasks.
Tasks Multi-Task Learning
Published 2018-04-11
URL http://arxiv.org/abs/1804.04083v3
PDF http://arxiv.org/pdf/1804.04083v3.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-for-argumentation-mining
Repo
Framework
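The standard low-resource MTL setup for tagging tasks like argument component identification is hard parameter sharing: one shared encoder, one output head per task, with auxiliary-task batches updating the shared weights. A minimal PyTorch sketch of that pattern follows; dimensions, task names, and label counts are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SharedTagger(nn.Module):
    """Hard parameter sharing: one BiLSTM encoder, one head per task."""
    def __init__(self, vocab, task_labels, emb=64, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(2 * hid, n) for task, n in task_labels.items()})

    def forward(self, tokens, task):
        h, _ = self.encoder(self.embed(tokens))
        return self.heads[task](h)  # per-token logits for the chosen task

# Hypothetical label counts: 5 argument-component tags, 9 auxiliary NER tags.
model = SharedTagger(vocab=1000, task_labels={"am": 5, "ner": 9})
tokens = torch.randint(0, 1000, (4, 12))    # batch of token ids
main_logits = model(tokens, "am")           # main task: argumentation mining
aux_logits = model(tokens, "ner")           # auxiliary task shares the encoder
print(main_logits.shape, aux_logits.shape)  # (4, 12, 5) and (4, 12, 9)
```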

Marian: Cost-effective High-Quality Neural Machine Translation in C++

Title Marian: Cost-effective High-Quality Neural Machine Translation in C++
Authors Marcin Junczys-Dowmunt, Kenneth Heafield, Hieu Hoang, Roman Grundkiewicz, Anthony Aue
Abstract This paper describes the submissions of the “Marian” team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with the new averaging attention networks, a recently introduced faster Transformer variant, we create a number of high-quality, high-performance models on the GPU and CPU, dominating the Pareto frontier for this shared task.
Tasks Machine Translation
Published 2018-05-30
URL http://arxiv.org/abs/1805.12096v1
PDF http://arxiv.org/pdf/1805.12096v1.pdf
PWC https://paperswithcode.com/paper/marian-cost-effective-high-quality-neural
Repo
Framework
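Of the optimizations listed, teacher-student training is the easiest to sketch: a small, fast student is trained to match a large teacher's output distribution. Below is a generic token-level knowledge-distillation loss in PyTorch, shown only to illustrate the idea; it is not Marian's C++ implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student token
    distributions; logit shapes are (batch, seq, vocab)."""
    t = temperature
    teacher = F.softmax(teacher_logits / t, dim=-1)
    student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(student, teacher, reduction="batchmean") * (t * t)

teacher_logits = torch.randn(2, 7, 100)                       # frozen teacher
student_logits = torch.randn(2, 7, 100, requires_grad=True)   # small student
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(loss.item())
```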

Attention-based Temporal Weighted Convolutional Neural Network for Action Recognition

Title Attention-based Temporal Weighted Convolutional Neural Network for Action Recognition
Authors Jinliang Zang, Le Wang, Ziyi Liu, Qilin Zhang, Zhenxing Niu, Gang Hua, Nanning Zheng
Abstract Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporating temporal information into CNNs are still being actively explored in the recent literature. Motivated by the popular recurrent attention models in natural language processing, we propose the Attention-based Temporal Weighted CNN (ATW), which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is simply implemented as temporal weighting, yet it effectively boosts the recognition performance of video representations. Moreover, each stream in the proposed ATW framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with backpropagation. Our experiments show that the proposed attention mechanism contributes substantially to the performance gains by focusing on the more discriminative snippets, i.e., the more relevant video segments.
Tasks Temporal Action Localization
Published 2018-03-19
URL http://arxiv.org/abs/1803.07179v1
PDF http://arxiv.org/pdf/1803.07179v1.pdf
PWC https://paperswithcode.com/paper/attention-based-temporal-weighted
Repo
Framework
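The temporal weighting itself can be written in a few lines: a learned attention weight per snippet, normalized with a softmax over time, reweights each snippet's class scores before they are pooled into a video-level prediction. A minimal PyTorch sketch (layer sizes and snippet counts are placeholders, not the paper's architecture):

```python
import torch
import torch.nn as nn

class TemporalAttentionPool(nn.Module):
    """Video-level score = attention-weighted sum of per-snippet scores."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.score = nn.Linear(feat_dim, num_classes)  # per-snippet classifier
        self.attn = nn.Linear(feat_dim, 1)             # per-snippet weight

    def forward(self, snippets):                       # (batch, T, feat_dim)
        w = torch.softmax(self.attn(snippets), dim=1)  # (batch, T, 1)
        s = self.score(snippets)                       # (batch, T, classes)
        return (w * s).sum(dim=1)                      # (batch, classes)

pool = TemporalAttentionPool(feat_dim=128, num_classes=101)
video = torch.randn(4, 3, 128)   # e.g., 3 snippet features per video
print(pool(video).shape)         # torch.Size([4, 101])
```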

A Self-Organizing Tensor Architecture for Multi-View Clustering

Title A Self-Organizing Tensor Architecture for Multi-View Clustering
Authors Lifang He, Chun-ta Lu, Yong Chen, Jiawei Zhang, Linlin Shen, Philip S. Yu, Fei Wang
Abstract In many real-world applications, data are often unlabeled and comprised of different representations/views which often provide information complementary to each other. Although several multi-view clustering methods have been proposed, most of them routinely assume one weight for one view of features, and thus inter-view correlations are only considered at the view-level. These approaches, however, fail to explore the explicit correlations between features across multiple views. In this paper, we introduce a tensor-based approach to incorporate the higher-order interactions among multiple views as a tensor structure. Specifically, we propose a multi-linear multi-view clustering (MMC) method that can efficiently explore the full-order structural information among all views and reveal the underlying subspace structure embedded within the tensor. Extensive experiments on real-world datasets demonstrate that our proposed MMC algorithm clearly outperforms other related state-of-the-art methods.
Tasks
Published 2018-10-18
URL http://arxiv.org/abs/1810.07874v1
PDF http://arxiv.org/pdf/1810.07874v1.pdf
PWC https://paperswithcode.com/paper/a-self-organizing-tensor-architecture-for
Repo
Framework
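The "tensor structure" here is typically the outer product of per-view feature vectors, so an entry of an order-3 tensor encodes a joint interaction among one feature from each of three views. A small numpy illustration of building such a full-order interaction tensor and probing it with a mode unfolding (a generic construction, not the exact MMC optimization):

```python
import numpy as np

rng = np.random.default_rng(0)
# One sample observed in three views with different feature dimensions.
v1, v2, v3 = rng.normal(size=5), rng.normal(size=4), rng.normal(size=3)

# Order-3 interaction tensor: X[i, j, k] = v1[i] * v2[j] * v3[k].
X = np.einsum("i,j,k->ijk", v1, v2, v3)
print(X.shape)  # (5, 4, 3)

# Subspace structure can be probed by unfolding a mode and taking the
# leading singular vector of the resulting matrix.
mode0 = X.reshape(5, -1)
u, s, vt = np.linalg.svd(mode0, full_matrices=False)
print("leading mode-0 factor:", u[:, 0].round(3))
```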

Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities

Title Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities
Authors Yin Tat Lee, Zhao Song, Santosh S. Vempala
Abstract Sampling logconcave functions arising in statistics and machine learning has been a subject of intensive study. Recent developments include analyses for Langevin dynamics and Hamiltonian Monte Carlo (HMC). While both approaches have dimension-independent bounds for the underlying $\mathit{continuous}$ processes under sufficiently strong smoothness conditions, the resulting discrete algorithms have complexity and number of function evaluations growing with the dimension. Motivated by this problem, in this paper we give a general algorithm for solving multivariate ordinary differential equations whose solution is close to the span of a known basis of functions (e.g., polynomials or piecewise polynomials). The resulting algorithm has polylogarithmic depth and essentially tight runtime: it is nearly linear in the size of the representation of the solution. We apply this to the sampling problem to obtain a nearly linear implementation of HMC for a broad class of smooth, strongly logconcave densities, with the number of iterations (parallel depth) and gradient evaluations being $\mathit{polylogarithmic}$ in the dimension (rather than polynomial as in previous work). This class includes the widely used loss function for logistic regression with incoherent weight matrices and has been the subject of much recent study. We also give a faster algorithm with $\mathit{polylogarithmic}$ depth for the more general and standard class of strongly convex functions with Lipschitz gradient. These results are based on (1) an improved contraction bound for the exact HMC process and (2) logarithmic bounds on the degree of polynomials that approximate solutions of the differential equations arising in implementing HMC.
Tasks
Published 2018-12-15
URL http://arxiv.org/abs/1812.06243v1
PDF http://arxiv.org/pdf/1812.06243v1.pdf
PWC https://paperswithcode.com/paper/algorithmic-theory-of-odes-and-sampling-from
Repo
Framework
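For context, the discrete HMC baseline the paper accelerates is the leapfrog integrator: alternating half-steps on momentum and full steps on position, followed by a Metropolis accept/reject to correct discretization error. A textbook numpy sketch on a strongly logconcave toy target follows; the paper's contribution, an ODE solver with polylogarithmic depth and gradient evaluations, is not shown here.

```python
import numpy as np

def hmc_step(x, log_p, log_p_grad, eps=0.1, L=20, rng=None):
    """One HMC transition: leapfrog integration plus Metropolis accept."""
    if rng is None:
        rng = np.random.default_rng()
    p = rng.normal(size=x.shape)
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * eps * log_p_grad(x_new)       # half momentum step
    for _ in range(L - 1):
        x_new += eps * p_new                     # full position step
        p_new += eps * log_p_grad(x_new)         # full momentum step
    x_new += eps * p_new
    p_new += 0.5 * eps * log_p_grad(x_new)       # final half momentum step
    # Metropolis correction for the discretization error.
    h_old = -log_p(x) + 0.5 * p @ p
    h_new = -log_p(x_new) + 0.5 * p_new @ p_new
    return x_new if np.log(rng.random()) < h_old - h_new else x

# Strongly logconcave toy target: a standard Gaussian in 10 dimensions.
log_p = lambda x: -0.5 * x @ x
log_p_grad = lambda x: -x

rng = np.random.default_rng(0)
x, samples = np.zeros(10), []
for _ in range(500):
    x = hmc_step(x, log_p, log_p_grad, rng=rng)
    samples.append(x.copy())
print("sample variance (should be ~1):", np.var(samples).round(2))
```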

DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks

Title DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks
Authors Unaiza Ahsan, Chen Sun, Irfan Essa
Abstract We propose an action recognition framework using Generative Adversarial Networks. Our model involves training a deep convolutional generative adversarial network (DCGAN) using a large video activity dataset without label information. Then we use the trained discriminator from the GAN model as an unsupervised pre-training step and fine-tune the trained discriminator model on a labeled dataset to recognize human activities. We determine good network architectural and hyperparameter settings for using the discriminator from DCGAN as a trained model to learn useful representations for action recognition. Our semi-supervised framework using only appearance information achieves superior or comparable performance to the current state-of-the-art semi-supervised action recognition methods on two challenging video activity datasets: UCF101 and HMDB51.
Tasks Temporal Action Localization
Published 2018-01-22
URL http://arxiv.org/abs/1801.07230v1
PDF http://arxiv.org/pdf/1801.07230v1.pdf
PWC https://paperswithcode.com/paper/discrimnet-semi-supervised-action-recognition
Repo
Framework
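The recipe in the abstract reduces to: train a DCGAN without labels, keep the discriminator's convolutional trunk as a pretrained feature extractor, then fine-tune a classifier head on labeled videos. A schematic PyTorch sketch of the fine-tuning stage (the tiny trunk, input size, and class count are placeholders, not the paper's architecture):

```python
import torch
import torch.nn as nn

# Stand-in for a DCGAN discriminator trunk pretrained without labels.
trunk = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
# trunk.load_state_dict(...)  # in practice: weights from GAN pretraining

classifier = nn.Linear(128, 101)  # e.g., 101 action classes for UCF101

# Fine-tune the head (and optionally the trunk) on labeled frames.
opt = torch.optim.SGD(classifier.parameters(), lr=0.01)
frames = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 101, (8,))
logits = classifier(trunk(frames))
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()
opt.step()
print(loss.item())
```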

ZerNet: Convolutional Neural Networks on Arbitrary Surfaces via Zernike Local Tangent Space Estimation

Title ZerNet: Convolutional Neural Networks on Arbitrary Surfaces via Zernike Local Tangent Space Estimation
Authors Zhiyu Sun, Ethan Rooke, Jerome Charton, Yusen He, Jia Lu, Stephen Baek
Abstract In this paper, we propose a novel formulation to extend CNNs to two-dimensional (2D) manifolds using orthogonal basis functions, called Zernike polynomials. In many areas, geometric features play a key role in understanding scientific phenomena. Thus, an ability to codify geometric features into a mathematical quantity can be critical. Recently, convolutional neural networks (CNNs) have demonstrated the promising capability of extracting and codifying features from visual information. However, the progress has been concentrated in computer vision applications where there exists an inherent grid-like structure. In contrast, many geometry processing problems are defined on curved surfaces, and the generalization of CNNs is not quite trivial. The difficulties are rooted in the lack of key ingredients such as the canonical grid-like representation, the notion of consistent orientation, and a compatible local topology across the domain. In this paper, we prove that the convolution of two functions can be represented as a simple dot product between Zernike polynomial coefficients; and the rotation of a convolution kernel is essentially a set of 2-by-2 rotation matrices applied to the coefficients. As such, the key contribution of this work resides in a concise but rigorous mathematical generalization of the CNN building blocks.
Tasks
Published 2018-12-03
URL https://arxiv.org/abs/1812.01082v3
PDF https://arxiv.org/pdf/1812.01082v3.pdf
PWC https://paperswithcode.com/paper/zernet-convolutional-neural-networks-on
Repo
Framework
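The rotation result has a compact concrete form: rotating a kernel by angle theta multiplies each pair of degree-m Zernike coefficients (a_nm, b_nm) by a 2-by-2 rotation matrix through m*theta, leaving m = 0 terms invariant. A small numpy illustration of that coefficient-space rotation (the coefficient values are arbitrary):

```python
import numpy as np

def rotate_zernike_coeffs(coeffs, theta):
    """Rotate a function expressed in Zernike coefficients by angle theta.
    `coeffs` maps (n, m) with m >= 0 to the pair (a_nm, b_nm); each pair is
    multiplied by a 2x2 rotation through m * theta. For m == 0 the matrix
    is the identity, so rotationally symmetric terms are unchanged."""
    out = {}
    for (n, m), (a, b) in coeffs.items():
        c, s = np.cos(m * theta), np.sin(m * theta)
        R = np.array([[c, -s], [s, c]])
        out[(n, m)] = tuple(R @ np.array([a, b]))
    return out

coeffs = {(0, 0): (1.0, 0.0), (1, 1): (0.5, -0.2), (2, 2): (0.3, 0.1)}
rotated = rotate_zernike_coeffs(coeffs, np.pi / 4)
print(rotated[(1, 1)])   # (0.5, -0.2) rotated through pi/4
print(rotated[(0, 0)])   # unchanged: rotationally symmetric term
```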

Dependency Grammar Induction with a Neural Variational Transition-based Parser

Title Dependency Grammar Induction with a Neural Variational Transition-based Parser
Authors Bowen Li, Jianpeng Cheng, Yang Liu, Frank Keller
Abstract Dependency grammar induction is the task of learning dependency syntax without annotated training data. Traditional graph-based models with global inference achieve state-of-the-art results on this task but they require $O(n^3)$ run time. Transition-based models enable faster inference with $O(n)$ time complexity, but their performance still lags behind. In this work, we propose a neural transition-based parser for dependency grammar induction, whose inference procedure utilizes rich neural features with $O(n)$ time complexity. We train the parser with an integration of variational inference, posterior regularization and variance reduction techniques. The resulting framework outperforms previous unsupervised transition-based dependency parsers and achieves performance comparable to graph-based models, both on the English Penn Treebank and on the Universal Dependency Treebank. In an empirical comparison, we show that our approach substantially increases parsing speed over graph-based models.
Tasks Dependency Grammar Induction
Published 2018-11-14
URL http://arxiv.org/abs/1811.05889v1
PDF http://arxiv.org/pdf/1811.05889v1.pdf
PWC https://paperswithcode.com/paper/dependency-grammar-induction-with-a-neural
Repo
Framework
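The O(n) inference comes from a shift-reduce automaton: a stack, a buffer, and a constant-work action set. The sketch below runs an arc-standard transition sequence; in the paper the actions come from a neural policy trained variationally, whereas here they are supplied by hand just to show the machinery.

```python
def arc_standard_parse(words, actions):
    """Run an arc-standard transition sequence in O(n).
    Actions: 'SH' (shift), 'LA' (left-arc), 'RA' (right-arc).
    Returns dependency arcs as (head, dependent) index pairs."""
    stack, buffer, arcs = [], list(range(len(words))), []
    for act in actions:
        if act == "SH":
            stack.append(buffer.pop(0))
        elif act == "LA":                  # top is head of second-from-top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif act == "RA":                  # second-from-top is head of top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

# "the cat sleeps": the <- cat, cat <- sleeps
words = ["the", "cat", "sleeps"]
actions = ["SH", "SH", "LA", "SH", "LA"]
print(arc_standard_parse(words, actions))  # [(1, 0), (2, 1)]
```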

Fully Associative Patch-based 1-to-N Matcher for Face Recognition

Title Fully Associative Patch-based 1-to-N Matcher for Face Recognition
Authors Lingfeng Zhang, Ioannis A. Kakadiaris
Abstract This paper focuses on improving face recognition performance with a patch-based 1-to-N signature matcher that learns correlations between different facial patches. A Fully Associative Patch-based Signature Matcher (FAPSM) is proposed so that the local matching identity of each patch contributes to the global matching identities of all the patches. The proposed matcher consists of three steps. First, based on the signature, the local matching identity and the corresponding matching score of each patch are computed. Then, a fully associative weight matrix is learned to obtain the global matching identities and scores of all the patches. Finally, l1-regularized weighting is applied to combine the global matching identities of the patches into a final matching identity. The proposed matcher has been integrated with the UR2D system for evaluation. The experimental results indicate that the proposed matcher achieves better performance than the current UR2D system, improving Rank-1 accuracy by 3% on the UHDB31 dataset and 0.55% on the IJB-A dataset.
Tasks Face Recognition
Published 2018-05-15
URL http://arxiv.org/abs/1805.06306v1
PDF http://arxiv.org/pdf/1805.06306v1.pdf
PWC https://paperswithcode.com/paper/fully-associative-patch-based-1-to-n-matcher
Repo
Framework
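The three steps can be pictured as: local per-patch scores, a learned association matrix letting every patch inform every other patch's identity, and an l1-style weight vector fusing the per-patch results. A toy numpy sketch of that pipeline, with random stand-ins for the learned matrix and weights:

```python
import numpy as np

rng = np.random.default_rng(0)
P, N = 5, 100                      # 5 facial patches, gallery of 100 identities

local = rng.random((P, N))         # step 1: local matching scores per patch
W = rng.random((P, P))
W /= W.sum(axis=1, keepdims=True)  # step 2: fully associative weights (learned
global_scores = W @ local          # in the paper); each patch's scores are now
                                   # informed by all other patches

u = rng.random(P)                  # step 3: per-patch fusion weights; in the
u /= np.abs(u).sum()               # paper these come from l1-regularized training
final_scores = u @ global_scores
print("predicted identity:", final_scores.argmax())
```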

Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients?

Title Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients?
Authors Boris Hanin
Abstract We give a rigorous analysis of the statistical behavior of gradients in a randomly initialized fully connected network N with ReLU activations. Our results show that the empirical variance of the squares of the entries in the input-output Jacobian of N is exponential in a simple architecture-dependent constant beta, given by the sum of the reciprocals of the hidden layer widths. When beta is large, the gradients computed by N at initialization vary wildly. Our approach complements the mean field theory analysis of random networks. From this point of view, we rigorously compute finite width corrections to the statistics of gradients at the edge of chaos.
Tasks
Published 2018-01-11
URL http://arxiv.org/abs/1801.03744v3
PDF http://arxiv.org/pdf/1801.03744v3.pdf
PWC https://paperswithcode.com/paper/which-neural-net-architectures-give-rise-to
Repo
Framework
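The constant is simple enough to compute directly: beta is the sum of the reciprocals of the hidden-layer widths, so narrow-deep and wide-deep networks of the same depth behave very differently. A two-line illustration:

```python
def beta(hidden_widths):
    """Sum of reciprocals of hidden-layer widths; large beta predicts
    wildly varying gradients at initialization (Hanin, 2018)."""
    return sum(1.0 / w for w in hidden_widths)

# Same depth (20 hidden layers), very different widths.
print(beta([10] * 20))    # 2.0  -> gradients vary wildly
print(beta([1000] * 20))  # 0.02 -> gradients well behaved
```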

Extrapolation in NLP

Title Extrapolation in NLP
Authors Jeff Mitchell, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel
Abstract We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.
Tasks
Published 2018-05-17
URL http://arxiv.org/abs/1805.06648v1
PDF http://arxiv.org/pdf/1805.06648v1.pdf
PWC https://paperswithcode.com/paper/extrapolation-in-nlp
Repo
Framework