October 16, 2019

Paper Group ANR 1047

Transformationally Identical and Invariant Convolutional Neural Networks by Combining Symmetric Operations or Input Vectors

Title Transformationally Identical and Invariant Convolutional Neural Networks by Combining Symmetric Operations or Input Vectors
Authors ShihChung B. Lo, Matthew T. Freedman, Seong K. Mun
Abstract Transformationally invariant processors constructed by transformed input vectors or operators have been suggested and applied to many applications. In this study, two forms of transformationally identical processing, one combining the results of all sub-processes with corresponding transformations at one of the processing steps and the other combining them at the beginning step, were found to be equivalent under a given condition. This property can be applied to most convolutional neural network (CNN) systems. Specifically, a transformationally identical CNN can be constructed by arranging internally symmetric operations in parallel with the same transformation family that includes a flatten layer with weight sharing among their corresponding transformation elements. Other transformationally identical CNNs can be constructed by averaging transformed input vectors of the family at the input layer followed by an ordinary CNN process or by a set of symmetric operations. Interestingly, we found that both types of transformationally identical CNN systems are mathematically equivalent by either applying an averaging operation to corresponding elements of all sub-channels before the activation function or without using a non-linear activation function.
Tasks
Published 2018-07-30
URL http://arxiv.org/abs/1807.11156v3
PDF http://arxiv.org/pdf/1807.11156v3.pdf
PWC https://paperswithcode.com/paper/transformationally-identical-and-invariant
Repo
Framework
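
The input-averaging construction described in the abstract can be illustrated with a minimal sketch. It assumes the transformation family is the four 90-degree rotations of a square input and uses a hypothetical `toy_network` function as a stand-in for an ordinary CNN; neither choice comes from the paper.

```python
# Sketch: average an input over an assumed transformation family (the four
# 90-degree rotations) before passing it to an arbitrary downstream function.

def rot90(image):
    """Rotate a square 2-D list clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_family(image):
    """Return the image rotated by 0, 90, 180 and 270 degrees."""
    family = [image]
    for _ in range(3):
        family.append(rot90(family[-1]))
    return family


def symmetrize(image):
    """Average corresponding elements over the rotation family."""
    family = rotation_family(image)
    size = len(image)
    return [
        [sum(member[r][c] for member in family) / len(family) for c in range(size)]
        for r in range(size)
    ]


def toy_network(image):
    """Hypothetical stand-in for an ordinary CNN.

    The position-dependent weights make it sensitive to rotation on its own.
    """
    return sum(
        (3 * r + c + 1) * value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
    )


image = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
rotated = rot90(image)

# Both inputs produce the same averaged input, hence the same output.
out_original = toy_network(symmetrize(image))
out_rotated = toy_network(symmetrize(rotated))
```

Because rotating the input only permutes the members of the family, the average is unchanged, so any function applied afterwards returns an identical result for the original and the rotated input.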

GhostVLAD for set-based face recognition

Title GhostVLAD for set-based face recognition
Authors Yujie Zhong, Relja Arandjelović, Andrew Zisserman
Abstract The objective of this paper is to learn a compact representation of image sets for template-based face recognition. We make the following contributions: first, we propose a network architecture which aggregates and embeds the face descriptors produced by deep convolutional neural networks into a compact fixed-length representation. This compact representation requires minimal memory storage and enables efficient similarity computation. Second, we propose a novel GhostVLAD layer that includes ghost clusters that do not contribute to the aggregation. We show that a quality weighting on the input faces emerges automatically such that informative images contribute more than those with low quality, and that the ghost clusters enhance the network’s ability to deal with poor quality images. Third, we explore how input feature dimension, number of clusters and different training techniques affect the recognition performance. Given this analysis, we train a network that far exceeds the state-of-the-art on the IJB-B face recognition dataset. This is currently one of the most challenging public benchmarks, and we surpass the state-of-the-art on both the identification and verification protocols.
Tasks Face Recognition, Face Verification
Published 2018-10-23
URL http://arxiv.org/abs/1810.09951v1
PDF http://arxiv.org/pdf/1810.09951v1.pdf
PWC https://paperswithcode.com/paper/ghostvlad-for-set-based-face-recognition
Repo
Framework
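
The idea of ghost clusters that take part in assignment but not in aggregation can be sketched as follows. This is a simplified illustration, not the authors' layer: the soft assignment is assumed to be a softmax over negative squared distances, only a single global L2 normalisation is applied, and the centres and descriptors are made-up values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ghostvlad_aggregate(descriptors, centers, num_ghost, alpha=1.0):
    """Aggregate a set of descriptors into one fixed-length vector.

    `centers` lists the real cluster centres first, then `num_ghost` ghost
    centres. Soft assignment runs over all centres, but only the real
    clusters contribute residuals to the output.
    """
    num_real = len(centers) - num_ghost
    dim = len(centers[0])
    aggregated = [[0.0] * dim for _ in range(num_real)]
    for x in descriptors:
        # Assumed assignment rule: softmax over negative squared distances.
        logits = [
            -alpha * sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers
        ]
        weights = softmax(logits)
        for k in range(num_real):  # ghost clusters (k >= num_real) are dropped
            for d in range(dim):
                aggregated[k][d] += weights[k] * (x[d] - centers[k][d])
    flat = [v for row in aggregated for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


centers = [[0.0, 0.0], [4.0, 4.0], [10.0, 10.0]]  # two real centres, one ghost
face_set = [[0.5, 0.0], [4.0, 3.5], [9.9, 10.1]]  # last descriptor sits near the ghost
template = ghostvlad_aggregate(face_set, centers, num_ghost=1)
without_last = ghostvlad_aggregate(face_set[:2], centers, num_ghost=1)
```

The output length depends only on the number of real clusters and the descriptor dimension, not on the set size. In this toy setup the descriptor assigned almost entirely to the ghost centre leaves the template essentially unchanged, which mirrors the down-weighting effect the abstract describes.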

Pathological Voice Classification Using Mel-Cepstrum Vectors and Support Vector Machine

Title Pathological Voice Classification Using Mel-Cepstrum Vectors and Support Vector Machine
Authors Maryam Pishgar, Fazle Karim, Somshubra Majumdar, Houshang Darabi
Abstract Vocal disorders have affected several patients all over the world. Due to the inherent difficulty of diagnosing vocal disorders without sophisticated equipment and trained personnel, a number of patients remain undiagnosed. To alleviate the monetary cost of diagnosis, there has been a recent growth in the use of data analysis to accurately detect and diagnose individuals for a fraction of the cost. We propose a cheap, efficient and accurate model to diagnose whether a patient suffers from one of three vocal disorders on the FEMH 2018 challenge.
Tasks
Published 2018-12-19
URL http://arxiv.org/abs/1812.07729v1
PDF http://arxiv.org/pdf/1812.07729v1.pdf
PWC https://paperswithcode.com/paper/pathological-voice-classification-using-mel
Repo
Framework

Inferring the ground truth through crowdsourcing

Title Inferring the ground truth through crowdsourcing
Authors Jean Pierre Char
Abstract Universally valid ground truth is almost impossible to obtain or would come at a very high cost. For supervised learning without universally valid ground truth, a recommended approach is applying crowdsourcing: gathering a large data set annotated by multiple individuals of possibly varying expertise levels and inferring the ground truth data to be used as labels to train the classifier. Nevertheless, due to the sensitivity of the problem at hand (e.g. mitosis detection in breast cancer histology images), the obtained data needs verification and proper assessment before being used for classifier training. Even in the context of organic computing systems, an indisputable ground truth might not always exist. Therefore, it should be inferred through the aggregation and verification of the local knowledge of each autonomous agent.
Tasks Mitosis Detection
Published 2018-07-31
URL http://arxiv.org/abs/1807.11836v1
PDF http://arxiv.org/pdf/1807.11836v1.pdf
PWC https://paperswithcode.com/paper/inferring-the-ground-truth-through
Repo
Framework
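
As a small illustration of inferring labels from several annotators, the sketch below uses plain majority voting with an agreement score. Majority voting is a common baseline chosen here for brevity, not necessarily the aggregation method studied in the paper; the item names, labels and the 0.7 review threshold are hypothetical.

```python
from collections import Counter


def majority_vote(annotations):
    """Infer one label per item from several annotators' labels.

    `annotations` maps an item id to the list of labels it received.
    Returns a dict mapping item id to (label, agreement), where agreement is
    the share of annotators who chose the winning label. Ties are resolved
    in favour of the label that was seen first.
    """
    inferred = {}
    for item, labels in annotations.items():
        label, count = Counter(labels).most_common(1)[0]
        inferred[item] = (label, count / len(labels))
    return inferred


# Hypothetical annotations for three image patches.
annotations = {
    "patch_1": ["mitosis", "mitosis", "non-mitosis"],
    "patch_2": ["non-mitosis", "non-mitosis", "non-mitosis"],
    "patch_3": ["mitosis", "non-mitosis", "mitosis", "mitosis"],
}
labels = majority_vote(annotations)

# Items with low agreement could be routed to expert verification.
needs_review = [item for item, (_, agreement) in labels.items() if agreement < 0.7]
```

The agreement score gives a simple handle for the verification step the abstract calls for: labels with weak consensus can be checked before they are used for training.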

Fading of collective attention shapes the evolution of linguistic variants

Title Fading of collective attention shapes the evolution of linguistic variants
Authors Diego E Shalom, Mariano Sigman, Gabriel Mindlin, Marcos A Trevisan
Abstract Language change involves the competition between alternative linguistic forms (1). The spontaneous evolution of these forms typically results in monotonic growths or decays (2, 3) like in winner-take-all attractor behaviors. In the case of the Spanish past subjunctive, the spontaneous evolution of its two competing forms (ending in -ra and -se) was perturbed by the appearance of the Royal Spanish Academy in 1713, which enforced the spelling of both forms as perfectly interchangeable variants (4), at a moment in which the -ra form was dominant (5). Time series extracted from a massive corpus of books (6) reveal that this regulation in fact produced a transient renewed interest in the old form -se which, once faded, left the -ra form again as the dominant form up to the present day. We show that the time series are successfully explained by a two-dimensional linear model that integrates an imitative and a novelty component. The model reveals that the temporal scale over which collective attention fades is in inverse proportion to the verb frequency. The integration of the two basic mechanisms of imitation and attention to novelty allows us to understand diverse competing objects, with lifetimes that range from hours for memes and news (7, 8) to decades for verbs, suggesting the existence of a general mechanism underlying cultural evolution.
Tasks Time Series
Published 2018-11-20
URL http://arxiv.org/abs/1811.08465v4
PDF http://arxiv.org/pdf/1811.08465v4.pdf
PWC https://paperswithcode.com/paper/fading-of-collective-attention-shapes-the
Repo
Framework

Evolutionary-Neural Hybrid Agents for Architecture Search

Title Evolutionary-Neural Hybrid Agents for Architecture Search
Authors Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Marin Georgiev, Andrea Gesmundo
Abstract Neural Architecture Search has shown potential to automate the design of neural networks. Deep Reinforcement Learning based agents can learn complex architectural patterns, as well as explore a vast and compositional search space. On the other hand, evolutionary algorithms offer higher sample efficiency, which is critical for such a resource intensive application. In order to capture the best of both worlds, we propose a class of Evolutionary-Neural hybrid agents (Evo-NAS). We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks. On a high-complexity architecture search space for image classification, the Evo-NAS agent surpasses the accuracy achieved by commonly used agents with only 1/3 of the search cost.
Tasks Image Classification, Neural Architecture Search, Text Classification
Published 2018-11-24
URL https://arxiv.org/abs/1811.09828v4
PDF https://arxiv.org/pdf/1811.09828v4.pdf
PWC https://paperswithcode.com/paper/evolutionary-neural-hybrid-agents-for
Repo
Framework

Context-Dependent Diffusion Network for Visual Relationship Detection

Title Context-Dependent Diffusion Network for Visual Relationship Detection
Authors Zhen Cui, Chunyan Xu, Wenming Zheng, Jian Yang
Abstract Visual relationship detection can bridge the gap between computer vision and natural language for scene understanding of images. Different from pure object recognition tasks, the relation triplets of subject-predicate-object lie on an extreme diversity space, such as person-behind-person and car-behind-building, while suffering from the problem of combinatorial explosion. In this paper, we propose a context-dependent diffusion network (CDDN) framework to deal with visual relationship detection. To capture the interactions of different object instances, two types of graphs, word semantic graph and visual scene graph, are constructed to encode global context interdependency. The semantic graph is built through language priors to model semantic correlations across objects, whilst the visual scene graph defines the connections of scene objects so as to utilize the surrounding scene information. For the graph-structured data, we design a diffusion network to adaptively aggregate information from contexts, which can effectively learn latent representations of visual relationships and well cater to visual relationship detection in view of its isomorphic invariance to graphs. Experiments on two widely-used datasets demonstrate that our proposed method is more effective and achieves the state-of-the-art performance.
Tasks Object Recognition, Scene Understanding
Published 2018-09-11
URL http://arxiv.org/abs/1809.06213v1
PDF http://arxiv.org/pdf/1809.06213v1.pdf
PWC https://paperswithcode.com/paper/context-dependent-diffusion-network-for
Repo
Framework

Towards Optimal Transport with Global Invariances

Title Towards Optimal Transport with Global Invariances
Authors David Alvarez-Melis, Stefanie Jegelka, Tommi S. Jaakkola
Abstract Many problems in machine learning involve calculating correspondences between sets of objects, such as point clouds or images. Discrete optimal transport provides a natural and successful approach to such tasks whenever the two sets of objects can be represented in the same space, or at least distances between them can be directly evaluated. Unfortunately neither requirement is likely to hold when object representations are learned from data. Indeed, automatically derived representations such as word embeddings are typically fixed only up to some global transformations, for example, reflection or rotation. As a result, pairwise distances across two such instances are ill-defined without specifying their relative transformation. In this work, we propose a general framework for optimal transport in the presence of latent global transformations. We cast the problem as a joint optimization over transport couplings and transformations chosen from a flexible class of invariances, propose algorithms to solve it, and show promising results in various tasks, including a popular unsupervised word translation benchmark.
Tasks Word Embeddings
Published 2018-06-25
URL http://arxiv.org/abs/1806.09277v2
PDF http://arxiv.org/pdf/1806.09277v2.pdf
PWC https://paperswithcode.com/paper/towards-optimal-transport-with-global
Repo
Framework

Improving part-of-speech tagging via multi-task learning and character-level word representations

Title Improving part-of-speech tagging via multi-task learning and character-level word representations
Authors Daniil Anastasyev, Ilya Gusev, Eugene Indenbom
Abstract In this paper, we explore ways to improve POS-tagging using various types of auxiliary losses and different word representations. As a baseline, we utilized a BiLSTM tagger, which is able to achieve state-of-the-art results on sequence labelling tasks. We developed a new method for character-level word representation using a feedforward neural network. Such a representation gave us better results in terms of speed and performance of the model. We also applied a novel technique of pretraining such word representations with existing word vectors. Finally, we designed a new variant of auxiliary loss for sequence labelling tasks: an additional prediction of the neighbour labels. Such a loss forces a model to learn the dependencies inside a sequence of labels and accelerates the process of training. We test these methods on English and Russian languages.
Tasks Multi-Task Learning, Part-Of-Speech Tagging
Published 2018-07-02
URL http://arxiv.org/abs/1807.00818v1
PDF http://arxiv.org/pdf/1807.00818v1.pdf
PWC https://paperswithcode.com/paper/improving-part-of-speech-tagging-via-multi
Repo
Framework
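
The neighbour-label auxiliary loss mentioned in the abstract can be sketched as a main cross-entropy term plus weighted terms for the previous and next tags. The function names, the `aux_weight` parameter and the exact way the terms are combined are assumptions made for illustration; the paper's formulation may differ.

```python
import math


def cross_entropy(probabilities, target):
    """Negative log-likelihood of the target index under a distribution."""
    return -math.log(probabilities[target])


def tagging_loss(current, previous, following, tags, aux_weight=0.5):
    """Mean per-token loss: main tag loss plus neighbour-label auxiliary losses.

    current[i], previous[i] and following[i] are predicted distributions made
    at position i for the tags at positions i, i - 1 and i + 1 respectively.
    """
    total = 0.0
    for i, tag in enumerate(tags):
        total += cross_entropy(current[i], tag)
        if i > 0:
            total += aux_weight * cross_entropy(previous[i], tags[i - 1])
        if i + 1 < len(tags):
            total += aux_weight * cross_entropy(following[i], tags[i + 1])
    return total / len(tags)


# Hypothetical three-token sentence with two possible tags and uniform predictions.
uniform = [[0.5, 0.5]] * 3
tags = [0, 1, 0]
loss = tagging_loss(uniform, uniform, uniform, tags)
main_only = tagging_loss(uniform, uniform, uniform, tags, aux_weight=0.0)
```

With `aux_weight=0.0` the function reduces to the ordinary tagging loss, so the auxiliary terms can be switched off to compare training behaviour.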

QuasarNET: Human-level spectral classification and redshifting with Deep Neural Networks

Title QuasarNET: Human-level spectral classification and redshifting with Deep Neural Networks
Authors Nicolas Busca, Christophe Balland
Abstract We introduce QuasarNET, a deep convolutional neural network that performs classification and redshift estimation of astrophysical spectra with human-expert accuracy. We pose these two tasks as a feature detection problem: presence or absence of spectral features determines the class, and their wavelength determines the redshift, very much like human experts proceed. When run on BOSS data to identify quasars through their emission lines, QuasarNET defines a sample 99.51 ± 0.03% pure and 99.52 ± 0.03% complete, well above the requirements of many analyses using these data. QuasarNET significantly reduces the problem of line-confusion that induces catastrophic redshift failures to below 0.2%. We also extend QuasarNET to classify spectra with broad absorption line (BAL) features, achieving an accuracy of 98.0 ± 0.4% for recognizing BAL and 97.0 ± 0.2% for rejecting non-BAL quasars. QuasarNET is trained on data of low signal-to-noise and medium resolution, typical of current and future astrophysical surveys, and could be easily applied to classify spectra from current and upcoming surveys such as eBOSS, DESI and 4MOST.
Tasks
Published 2018-08-29
URL http://arxiv.org/abs/1808.09955v1
PDF http://arxiv.org/pdf/1808.09955v1.pdf
PWC https://paperswithcode.com/paper/quasarnet-human-level-spectral-classification
Repo
Framework

Augmenting Compositional Models for Knowledge Base Completion Using Gradient Representations

Title Augmenting Compositional Models for Knowledge Base Completion Using Gradient Representations
Authors Matthias Lalisse, Paul Smolensky
Abstract Neural models of Knowledge Base data have typically employed compositional representations of graph objects: entity and relation embeddings are systematically combined to evaluate the truth of a candidate Knowledge Base entry. Using a model inspired by Harmonic Grammar, we propose to tokenize triplet embeddings by subjecting them to a process of optimization with respect to learned well-formedness conditions on Knowledge Base triplets. The resulting model, known as Gradient Graphs, leads to sizable improvements when implemented as a companion to compositional models. Also, we show that the “supracompositional” triplet token embeddings it produces have interpretable properties that prove helpful in performing inference on the resulting triplet representations.
Tasks Knowledge Base Completion
Published 2018-11-02
URL https://arxiv.org/abs/1811.01062v2
PDF https://arxiv.org/pdf/1811.01062v2.pdf
PWC https://paperswithcode.com/paper/augmenting-compositional-models-for-knowledge
Repo
Framework

Approximate FPGA-based LSTMs under Computation Time Constraints

Title Approximate FPGA-based LSTMs under Computation Time Constraints
Authors Michalis Rizakis, Stylianos I. Venieris, Alexandros Kouris, Christos-Savvas Bouganis
Abstract Recurrent Neural Networks and in particular Long Short-Term Memory (LSTM) networks have demonstrated state-of-the-art accuracy in several emerging Artificial Intelligence tasks. However, the models are becoming increasingly demanding in terms of computational and memory load. Emerging latency-sensitive applications including mobile robots and autonomous vehicles often operate under stringent computation time constraints. In this paper, we address the challenge of deploying computationally demanding LSTMs at a constrained time budget by introducing an approximate computing scheme that combines iterative low-rank compression and pruning, along with a novel FPGA-based LSTM architecture. Combined in an end-to-end framework, the approximation method’s parameters are optimised and the architecture is configured to address the problem of high-performance LSTM execution in time-constrained applications. Quantitative evaluation on a real-life image captioning application indicates that the proposed methods required up to 6.5x less time to achieve the same application-level accuracy compared to a baseline method, while achieving an average of 25x higher accuracy under the same computation time constraints.
Tasks Autonomous Vehicles, Image Captioning
Published 2018-01-07
URL http://arxiv.org/abs/1801.02190v1
PDF http://arxiv.org/pdf/1801.02190v1.pdf
PWC https://paperswithcode.com/paper/approximate-fpga-based-lstms-under
Repo
Framework
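
To make the combination of low-rank compression and pruning concrete, here is a rough sketch that peels off rank-one factors by power iteration and keeps only the largest-magnitude entries of each factor. It is an illustrative simplification under assumed choices (power iteration, per-factor magnitude pruning, a toy weight matrix) and should not be read as the authors' algorithm or their FPGA architecture.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def leading_factor(matrix, iterations=100):
    """Estimate the leading singular triplet (sigma, u, v) by power iteration."""
    columns = [list(col) for col in zip(*matrix)]  # transpose of the matrix
    v = [1.0] * len(columns)
    sigma, u = 0.0, [0.0] * len(matrix)
    for _ in range(iterations):
        u = matvec(matrix, v)
        scale = norm(u) or 1.0
        u = [x / scale for x in u]
        v = matvec(columns, u)
        sigma = norm(v)
        v = [x / (sigma or 1.0) for x in v]
    return sigma, u, v


def prune(vector, keep):
    """Keep the `keep` largest-magnitude entries and zero the rest."""
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    kept = set(order[:keep])
    return [x if i in kept else 0.0 for i, x in enumerate(vector)]


def compress(matrix, rank, keep):
    """Iteratively extract `rank` pruned rank-one factors from the matrix."""
    residual = [row[:] for row in matrix]
    factors = []
    for _ in range(rank):
        sigma, u, v = leading_factor(residual)
        u, v = prune(u, keep), prune(v, keep)
        factors.append((sigma, u, v))
        residual = [
            [w - sigma * ui * vj for w, vj in zip(row, v)]
            for row, ui in zip(residual, u)
        ]
    return factors


def reconstruct(factors, rows, cols):
    """Sum the rank-one terms back into a dense matrix."""
    approx = [[0.0] * cols for _ in range(rows)]
    for sigma, u, v in factors:
        for r in range(rows):
            for c in range(cols):
                approx[r][c] += sigma * u[r] * v[c]
    return approx


weights = [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]  # hypothetical rank-one weight matrix
factors = compress(weights, rank=1, keep=3)     # keep=3 leaves the factors unpruned
approx = reconstruct(factors, rows=2, cols=3)
max_error = max(abs(a - w) for ra, rw in zip(approx, weights) for a, w in zip(ra, rw))
sparse = compress(weights, rank=1, keep=1)      # aggressive pruning of each factor
```

Lowering `rank` or `keep` reduces the number of multiply-accumulate operations needed per matrix-vector product at the cost of approximation error, which is the kind of trade-off a computation time budget would steer.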

Label Embedding with Partial Heterogeneous Contexts

Title Label Embedding with Partial Heterogeneous Contexts
Authors Yaxin Shi, Donna Xu, Yuangang Pan, Ivor W. Tsang, Shirui Pan
Abstract Label embedding plays an important role in many real-world applications. To enhance the label relatedness captured by the embeddings, multiple contexts can be adopted. However, these contexts are heterogeneous and often partially observed in practical tasks, imposing significant challenges to capture the overall relatedness among labels. In this paper, we propose a general Partial Heterogeneous Context Label Embedding (PHCLE) framework to address these challenges. Categorizing heterogeneous contexts into two groups, relational context and descriptive context, we design tailor-made matrix factorization formulae to effectively exploit the label relatedness in each context. With a shared embedding principle across heterogeneous contexts, the label relatedness is selectively aligned in a shared space. Due to our elegant formulation, PHCLE overcomes the partial context problem and can nicely incorporate more contexts, neither of which can be tackled with existing multi-context label embedding methods. An effective alternative optimization algorithm is further derived to solve the sparse matrix factorization problem. Experimental results demonstrate that the label embeddings obtained with PHCLE achieve superb performance in the image classification task and exhibit good interpretability in the downstream label similarity analysis and image understanding tasks.
Tasks Image Classification
Published 2018-05-03
URL http://arxiv.org/abs/1805.01199v2
PDF http://arxiv.org/pdf/1805.01199v2.pdf
PWC https://paperswithcode.com/paper/label-embedding-with-partial-heterogeneous
Repo
Framework

Winograd Schema - Knowledge Extraction Using Narrative Chains

Title Winograd Schema - Knowledge Extraction Using Narrative Chains
Authors Vatsal Mahajan
Abstract The Winograd Schema Challenge (WSC) is a test of machine intelligence, designed to be an improvement on the Turing test. A Winograd Schema consists of a sentence and a corresponding question. To successfully answer these questions, one requires the use of commonsense knowledge and reasoning. This work focuses on extracting commonsense knowledge which can be used to generate answers for the Winograd Schema Challenge. Commonsense knowledge is extracted based on events (or actions) and their participants, called Event-Based Conditional Commonsense (ECC). I propose an approach using Narrative Event Chains [Chambers et al., 2008] to extract ECC knowledge. These are stored in templates, to be later used for answering the WSC questions. This approach works well with respect to a subset of WSC tasks.
Tasks Common Sense Reasoning
Published 2018-01-08
URL http://arxiv.org/abs/1801.02281v1
PDF http://arxiv.org/pdf/1801.02281v1.pdf
PWC https://paperswithcode.com/paper/winograd-schema-knowledge-extraction-using
Repo
Framework

LaVAN: Localized and Visible Adversarial Noise

Title LaVAN: Localized and Visible Adversarial Noise
Authors Danny Karmon, Daniel Zoran, Yoav Goldberg
Abstract Most works on adversarial examples for deep-learning based image classifiers use noise that, while small, covers the entire image. We explore the case where the noise is allowed to be visible but confined to a small, localized patch of the image, without covering any of the main object(s) in the image. We show that it is possible to generate localized adversarial noises that cover only 2% of the pixels in the image, none of them over the main object, and that are transferable across images and locations, and successfully fool a state-of-the-art Inception v3 model with very high success rates.
Tasks
Published 2018-01-08
URL http://arxiv.org/abs/1801.02608v2
PDF http://arxiv.org/pdf/1801.02608v2.pdf
PWC https://paperswithcode.com/paper/lavan-localized-and-visible-adversarial-noise
Repo
Framework
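
The constraint that the noise is visible but confined to a small patch can be sketched as a masked replacement of pixels. The snippet only shows how a patch is placed and how much of the image it covers; it does not optimise the patch, and the image size, patch size and position are hypothetical values chosen so that coverage stays under 2%.

```python
def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at (top, left).

    Equivalent to (1 - m) * x + m * delta with a binary mask m that is 1
    inside the patch region and 0 elsewhere.
    """
    out = [row[:] for row in image]
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            out[top + r][left + c] = value
    return out


def coverage(image, patch):
    """Fraction of image pixels covered by the patch."""
    return (len(patch) * len(patch[0])) / (len(image) * len(image[0]))


image = [[0.5] * 50 for _ in range(50)]  # hypothetical 50x50 grayscale image
patch = [[1.0] * 7 for _ in range(7)]    # hypothetical 7x7 visible noise patch
patched = apply_patch(image, patch, top=0, left=43)  # corner, away from the centre

changed = sum(a != b for ra, rb in zip(image, patched) for a, b in zip(ra, rb))
fraction = coverage(image, patch)
```

Only the pixels under the mask change, so the main object elsewhere in the image is left untouched; an actual attack would additionally optimise the patch values against a classifier.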