October 16, 2019

2889 words 14 mins read

Paper Group ANR 1107

Texture Synthesis Guided Deep Hashing for Texture Image Retrieval. Neural Conditional Gradients. Learning Cross-lingual Distributed Logical Representations for Semantic Parsing. WEBCA: Weakly-Electric-Fish Bioinspired Cognitive Architecture. Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information. Beyond Context: Ex …

Texture Synthesis Guided Deep Hashing for Texture Image Retrieval

Title Texture Synthesis Guided Deep Hashing for Texture Image Retrieval
Authors Ayan Kumar Bhunia, Perla Sai Raj Kishore, Pranay Mukherjee, Abhirup Das, Partha Pratim Roy
Abstract With the large-scale explosion of images and videos over the internet, efficient hashing methods have been developed to facilitate memory- and time-efficient retrieval of similar images. However, none of the existing works uses hashing to address texture image retrieval, mostly because of the lack of sufficiently large texture image databases. Our work addresses this problem by developing a novel deep learning architecture that generates binary hash codes for input texture images. For this, we first pre-train a Texture Synthesis Network (TSN) which takes a texture patch as input and outputs an enlarged view of the texture by injecting newer texture content. This indicates that the TSN encodes the learnt texture-specific information in its intermediate layers. In the next stage, a second network gathers the multi-scale feature representations from the TSN’s intermediate layers using channel-wise attention and combines them progressively into a dense continuous representation, which is finally converted into a binary hash code with the help of individual and pairwise label information. The new enlarged texture patches also serve as data augmentation, alleviating the problem of insufficient texture data, and are used to train the second stage of the network. Experiments on three public texture image retrieval datasets indicate the superiority of our texture synthesis guided hashing approach over current state-of-the-art methods.
Tasks Data Augmentation, Image Retrieval, Texture Image Retrieval, Texture Synthesis
Published 2018-11-04
URL https://arxiv.org/abs/1811.01401v5
PDF https://arxiv.org/pdf/1811.01401v5.pdf
PWC https://paperswithcode.com/paper/texture-synthesis-guided-deep-hashing-for
Repo
Framework
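
A minimal sketch of the second-stage hashing head described in the abstract above, assuming a squeeze-and-excitation style channel-wise attention over a pretrained TSN feature map and a tanh relaxation that is binarized at test time; the layer sizes and the exact attention form are illustrative assumptions, not the authors' design.

```python
# Hypothetical sketch: channel-wise attention over a frozen TSN feature map,
# followed by a hashing head producing a continuous code binarized at test time.
import torch
import torch.nn as nn

class ChannelAttentionHash(nn.Module):
    def __init__(self, channels=256, hash_bits=64):
        super().__init__()
        # Assumed squeeze-and-excitation style gate for channel-wise attention.
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 8), nn.ReLU(),
            nn.Linear(channels // 8, channels), nn.Sigmoid(),
        )
        self.fc = nn.Linear(channels, hash_bits)

    def forward(self, feat):                      # feat: (B, C, H, W) from a TSN layer
        pooled = feat.mean(dim=(2, 3))            # global average pooling -> (B, C)
        weights = self.gate(pooled)               # per-channel attention weights
        attended = (feat * weights[:, :, None, None]).mean(dim=(2, 3))
        return torch.tanh(self.fc(attended))      # binarize later with torch.sign
```

During training, the individual and pairwise label losses mentioned in the abstract would be applied to these continuous codes before binarization.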

Neural Conditional Gradients

Title Neural Conditional Gradients
Authors Patrick Schramowski, Christian Bauckhage, Kristian Kersting
Abstract The move from hand-designed to learned optimizers in machine learning has been quite successful for both gradient-based and gradient-free optimizers. When facing a constrained problem, however, maintaining feasibility typically requires a projection step, which might be computationally expensive and not differentiable. We show how the design of projection-free convex optimization algorithms can be cast as a learning problem based on Frank-Wolfe Networks: recurrent networks implementing the Frank-Wolfe algorithm, also known as conditional gradients. This allows them to learn to exploit structure when, e.g., optimizing over rank-1 matrices. Our LSTM-learned optimizers outperform hand-designed ones as well as learned but unconstrained ones. We demonstrate this for training support vector machines and softmax classifiers.
Tasks
Published 2018-03-12
URL http://arxiv.org/abs/1803.04300v2
PDF http://arxiv.org/pdf/1803.04300v2.pdf
PWC https://paperswithcode.com/paper/neural-conditional-gradients
Repo
Framework
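
Since the paper unrolls the Frank-Wolfe (conditional gradient) iteration into a recurrent network, the hand-designed version of that iteration is worth spelling out. The sketch below runs it on the probability simplex, where the linear minimization oracle is simply a vertex; a learned variant would replace the fixed step rule with an LSTM-predicted update. The quadratic example objective is an illustrative assumption.

```python
# Classical Frank-Wolfe (conditional gradient) iteration on the unit simplex.
import numpy as np

def frank_wolfe_simplex(grad_f, x0, steps=100):
    """Projection-free minimization over the probability simplex.

    grad_f: callable returning the gradient of the objective at x.
    """
    x = x0.copy()
    for t in range(steps):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0              # linear minimization oracle: best simplex vertex
        gamma = 2.0 / (t + 2.0)            # standard step-size schedule
        x = x + gamma * (s - x)            # convex update keeps x feasible, no projection
    return x

# Example: minimize ||Ax - b||^2 over the simplex.
A = np.random.randn(20, 5); b = np.random.randn(20)
x = frank_wolfe_simplex(lambda x: 2 * A.T @ (A @ x - b), np.ones(5) / 5)
```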

Learning Cross-lingual Distributed Logical Representations for Semantic Parsing

Title Learning Cross-lingual Distributed Logical Representations for Semantic Parsing
Authors Yanyan Zou, Wei Lu
Abstract With the development of several multilingual datasets used for semantic parsing, recent research efforts have looked into the problem of learning semantic parsers in a multilingual setup. However, how to improve the performance of a monolingual semantic parser for a specific language by leveraging data annotated in different languages remains a research question that is under-explored. In this work, we present a study to show how learning distributed representations of the logical forms from data annotated in different languages can be used for improving the performance of a monolingual semantic parser. We extend two existing monolingual semantic parsers to incorporate such cross-lingual distributed logical representations as features. Experiments show that our proposed approach is able to yield improved semantic parsing results on the standard multilingual GeoQuery dataset.
Tasks Semantic Parsing
Published 2018-06-14
URL http://arxiv.org/abs/1806.05461v1
PDF http://arxiv.org/pdf/1806.05461v1.pdf
PWC https://paperswithcode.com/paper/learning-cross-lingual-distributed-logical
Repo
Framework

WEBCA: Weakly-Electric-Fish Bioinspired Cognitive Architecture

Title WEBCA: Weakly-Electric-Fish Bioinspired Cognitive Architecture
Authors Amit Kumar Mishra
Abstract Neuroethology has been an active field of study for more than a century now. Among the most interesting species studied so far, the weakly electric fish is a fascinating one. It performs communication, echo-location and inter-species detection efficiently with an interesting configuration of sensors, neurons and a simple brain. In this paper we propose a cognitive architecture inspired by the way these fish handle and process information. We believe that it is easier to understand and mimic the neural architecture of a simpler species than that of a human. Hence, the proposed architecture is expected both to help research in cognitive robotics and to help understand more complicated brains like those of human beings.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11401v1
PDF http://arxiv.org/pdf/1806.11401v1.pdf
PWC https://paperswithcode.com/paper/webca-weakly-electric-fish-bioinspired
Repo
Framework

Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information

Title Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information
Authors Masaki Asada, Makoto Miwa, Yutaka Sasaki
Abstract We propose a novel neural method to extract drug-drug interactions (DDIs) from texts using external drug molecular structure information. We encode textual drug pairs with convolutional neural networks and their molecular pairs with graph convolutional networks (GCNs), and then we concatenate the outputs of these two networks. In the experiments, we show that GCNs can predict DDIs from the molecular structures of drugs with high accuracy, and that the molecular information can enhance text-based DDI extraction by 2.39 percentage points in F-score on the DDIExtraction 2013 shared task data set.
Tasks
Published 2018-05-15
URL http://arxiv.org/abs/1805.05593v1
PDF http://arxiv.org/pdf/1805.05593v1.pdf
PWC https://paperswithcode.com/paper/enhancing-drug-drug-interaction-extraction
Repo
Framework
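
A minimal sketch of the two-branch idea described above: a text CNN over word embeddings of the sentence containing the drug pair, a single graph convolution over each drug's molecular graph, and concatenation of the outputs for classification. Layer sizes, atom-feature dimensions, and the one-layer GCN are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch: text CNN + molecular GCN, outputs concatenated for DDI classes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, emb_dim=100, filters=128):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, filters, kernel_size=3, padding=1)

    def forward(self, tokens):               # tokens: (B, L, emb_dim) word embeddings
        h = F.relu(self.conv(tokens.transpose(1, 2)))
        return h.max(dim=2).values           # max-over-time pooling -> (B, filters)

class GCNLayer(nn.Module):
    def __init__(self, in_dim=75, out_dim=128):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, atom_feats, adj):      # atom_feats: (B, N, in_dim), adj: (B, N, N) normalized
        return F.relu(adj @ self.lin(atom_feats)).mean(dim=1)  # mean over atoms -> (B, out_dim)

class DDIClassifier(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.text, self.mol = TextCNN(), GCNLayer()
        self.out = nn.Linear(128 * 3, n_classes)                # text + two drug graphs

    def forward(self, tokens, feats1, adj1, feats2, adj2):
        z = torch.cat([self.text(tokens),
                       self.mol(feats1, adj1),
                       self.mol(feats2, adj2)], dim=1)
        return self.out(z)
```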

Beyond Context: Exploring Semantic Similarity for Tiny Face Detection

Title Beyond Context: Exploring Semantic Similarity for Tiny Face Detection
Authors Yue Xi, Jiangbin Zheng, Xiangjian He, Wenjing Jia, Hanhui Li
Abstract Tiny face detection aims to find faces with high degrees of variability in scale, resolution and occlusion in cluttered scenes. Because very little information is available on tiny faces, it is not sufficient to detect them merely from the information presented inside the tiny bounding boxes or their context. In this paper, we propose to exploit the semantic similarity among all predicted targets in each image to boost current face detectors. To this end, we present a novel framework to model semantic similarity as pairwise constraints within a metric learning scheme, and then refine our predictions with the semantic similarity by utilizing graph cut techniques. Experiments conducted on three widely used benchmark datasets demonstrate the improvement over the state of the art gained by applying this idea.
Tasks Face Detection, Metric Learning, Semantic Similarity, Semantic Textual Similarity
Published 2018-03-05
URL http://arxiv.org/abs/1803.01555v1
PDF http://arxiv.org/pdf/1803.01555v1.pdf
PWC https://paperswithcode.com/paper/beyond-context-exploring-semantic-similarity
Repo
Framework
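
A minimal sketch of pairwise constraints within a metric-learning scheme as described in the abstract: embeddings of predicted boxes with the same label are pulled together, mixed pairs are pushed apart by a margin. The contrastive form and margin value are assumptions; the graph-cut refinement step is not shown.

```python
# Hypothetical sketch: contrastive pairwise-constraint loss over box embeddings.
import torch
import torch.nn.functional as F

def pairwise_constraint_loss(embeds, labels, margin=1.0):
    """embeds: (N, D) embeddings of predicted boxes; labels: (N,) 0/1 face indicator."""
    d = torch.cdist(embeds, embeds)                   # pairwise Euclidean distances
    same = (labels[:, None] == labels[None, :]).float()
    pull = same * d.pow(2)                            # same-label pairs: small distance
    push = (1 - same) * F.relu(margin - d).pow(2)     # mixed pairs: at least `margin` apart
    return (pull + push).mean()
```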

Neural Network Architecture for Credibility Assessment of Textual Claims

Title Neural Network Architecture for Credibility Assessment of Textual Claims
Authors Rajat Singh, Nurendra Choudhary, Ishita Bindlish, Manish Shrivastava
Abstract Text articles with false claims, especially news, have recently become a growing problem for Internet users. These articles are in wide circulation and readers face difficulty discerning fact from fiction. Previous work on credibility assessment has focused on factual analysis and linguistic features. The task’s main challenge is the distinction between the features of true and false articles. In this paper, we propose a novel approach called Credibility Outcome (CREDO) which aims at scoring the credibility of an article in an open-domain setting. CREDO consists of different modules for capturing various features responsible for the credibility of an article. These features include the credibility of the article’s source and author, the semantic similarity between the article and related credible articles retrieved from a knowledge base, and the sentiments conveyed by the article. A neural network architecture learns the contribution of each of these modules to the overall credibility of an article. Experiments on the Snopes dataset reveal that CREDO outperforms state-of-the-art approaches based on linguistic features.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2018-03-28
URL http://arxiv.org/abs/1803.10547v2
PDF http://arxiv.org/pdf/1803.10547v2.pdf
PWC https://paperswithcode.com/paper/neural-network-architecture-for-credibility
Repo
Framework

No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

Title No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference
Authors Shumeet Baluja, David Marwood, Michele Covell, Nick Johnston
Abstract For successful deployment of deep neural networks on highly resource-constrained devices (hearing aids, earbuds, wearables), we must simplify the types of operations and the memory/power resources used during inference. Completely avoiding inference-time floating-point operations is one of the simplest ways to design networks for these highly constrained environments. By discretizing both our in-network non-linearities and our network weights, we can move to simple, compact networks without floating-point operations, without multiplications, and without any non-linear function computations. Our approach allows us to explore the spectrum of possible networks, ranging from fully continuous versions down to networks with bi-level weights and activations. Our results show that discretization can be done without loss of performance and that we can train a network that will successfully operate without floating point, without multiplication, and with less RAM on both regression tasks (autoencoding) and multi-class classification tasks (ImageNet). The memory needed to deploy our discretized networks is less than one third of that of an equivalent architecture that uses floating-point operations.
Tasks
Published 2018-09-24
URL http://arxiv.org/abs/1809.09244v2
PDF http://arxiv.org/pdf/1809.09244v2.pdf
PWC https://paperswithcode.com/paper/no-multiplication-no-floating-point-no
Repo
Framework
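
A minimal sketch of the general ingredient behind such networks: bi-level (+/-1) weights trained with a straight-through estimator, so inference needs no multiplications or floating-point weights. The paper's exact discretization scheme differs in its details; this is only the standard textbook form.

```python
# Hypothetical sketch: binary weights with a straight-through estimator (STE).
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return torch.sign(w)                 # {-1, +1} weights at inference time

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                      # straight-through: pass gradients unchanged

class BinaryLinear(nn.Linear):
    def forward(self, x):
        wb = BinarizeSTE.apply(self.weight)  # full-precision weights kept only for training
        return nn.functional.linear(x, wb, self.bias)

layer = BinaryLinear(784, 256)
y = layer(torch.randn(8, 784))               # inference uses only +/-1 weights (adds/subtracts)
```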

A Unified Framework for Multi-View Multi-Class Object Pose Estimation

Title A Unified Framework for Multi-View Multi-Class Object Pose Estimation
Authors Chi Li, Jin Bai, Gregory D. Hager
Abstract One core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.
Tasks Pose Estimation
Published 2018-03-21
URL http://arxiv.org/abs/1803.08103v2
PDF http://arxiv.org/pdf/1803.08103v2.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-multi-view-multi
Repo
Framework
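
A minimal sketch of the combined classification-and-regression idea over a discretized rotation space: the head picks a cell of the tessellation and regresses a residual within it. The bin count, residual parameterization, and loss weighting are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: rotation-bin classification plus per-bin residual regression.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinAndRegressHead(nn.Module):
    def __init__(self, feat_dim=512, n_bins=60):
        super().__init__()
        self.cls = nn.Linear(feat_dim, n_bins)        # which cell of the SE(3) tessellation
        self.reg = nn.Linear(feat_dim, n_bins * 3)    # per-bin residual (e.g., axis-angle offset)

    def forward(self, feat, bin_gt, res_gt):          # bin_gt: (B,) long, res_gt: (B, 3)
        logits = self.cls(feat)
        res = self.reg(feat).view(feat.shape[0], -1, 3)
        res_pred = res[torch.arange(feat.shape[0]), bin_gt]   # residual of the ground-truth bin
        return F.cross_entropy(logits, bin_gt) + F.smooth_l1_loss(res_pred, res_gt)
```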

Power Normalizing Second-order Similarity Network for Few-shot Learning

Title Power Normalizing Second-order Similarity Network for Few-shot Learning
Authors Hongguang Zhang, Piotr Koniusz
Abstract Second- and higher-order statistics of data points have played an important role in advancing the state of the art on several computer vision problems such as fine-grained image and scene recognition. However, these statistics need to be passed via an appropriate pooling scheme to obtain the best performance. Power Normalizations are non-linear activation units which enjoy probability-inspired derivations and can be applied in CNNs. In this paper, we propose a similarity learning network leveraging second-order information and Power Normalizations. To this end, we propose several formulations capturing second-order statistics and derive a sigmoid-like Power Normalizing function to demonstrate its interpretability. Our model is trained end-to-end to learn the similarity between the support set and query images for the problem of one- and few-shot learning. Evaluations on the Omniglot, miniImagenet and Open MIC datasets demonstrate that this network obtains state-of-the-art results on several few-shot learning protocols.
Tasks Few-Shot Learning, Omniglot, Scene Recognition
Published 2018-11-10
URL http://arxiv.org/abs/1811.04167v1
PDF http://arxiv.org/pdf/1811.04167v1.pdf
PWC https://paperswithcode.com/paper/power-normalizing-second-order-similarity
Repo
Framework
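
A minimal sketch of second-order pooling followed by a sigmoid-like power normalization of the kind the abstract describes; the eta value and the way support and query descriptors are compared are assumptions, and the learned relation head on top is omitted.

```python
# Hypothetical sketch: second-order descriptor with element-wise power normalization.
import torch

def second_order_descriptor(feats, eta=10.0):
    """feats: (N, D) local feature vectors from a CNN backbone."""
    M = feats.T @ feats / feats.shape[0]             # autocorrelation matrix, (D, D)
    return 2.0 / (1.0 + torch.exp(-eta * M)) - 1.0   # sigmoid-like normalization to (-1, 1)

query = second_order_descriptor(torch.randn(49, 64))
support = second_order_descriptor(torch.randn(49, 64))
similarity = -(query - support).pow(2).sum()         # e.g., feed the difference to a relation head
```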

Classifier-Guided Visual Correction of Noisy Labels for Image Classification Tasks

Title Classifier-Guided Visual Correction of Noisy Labels for Image Classification Tasks
Authors Alex Bäuerle, Heiko Neumann, Timo Ropinski
Abstract Training data plays an essential role in modern applications of machine learning. However, gathering labeled training data is time-consuming. Therefore, labeling is often outsourced to less experienced users, or completely automated. This can introduce errors, which compromise valuable training data, and lead to suboptimal training results. We thus propose a novel approach that uses the power of pretrained classifiers to visually guide users to noisy labels, and lets them interactively check error candidates to iteratively improve the training dataset. To systematically investigate training data, we propose a categorization of labeling errors into three different types based on an analysis of potential pitfalls in label acquisition processes. For each of these types, we present approaches to detect, reason about, and resolve error candidates, and we propose metrics and visual guidance techniques to support machine learning users. Our approach has been used to spot errors in well-known machine learning benchmark datasets, and we tested its usability during a user evaluation. While initially developed for images, the techniques presented in this paper are independent of the classification algorithm, and can also be extended to many other types of training data.
Tasks Image Classification
Published 2018-08-09
URL https://arxiv.org/abs/1808.03114v2
PDF https://arxiv.org/pdf/1808.03114v2.pdf
PWC https://paperswithcode.com/paper/training-de-confusion-an-interactive-network
Repo
Framework
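
A minimal sketch of the classifier-guided part of the idea: samples whose assigned label receives a very low predicted probability from a pretrained classifier are surfaced as error candidates for interactive review. The threshold and ranking are illustrative; the visual guidance and the three-type error taxonomy are not shown.

```python
# Hypothetical sketch: rank samples whose given label the classifier finds implausible.
import numpy as np

def label_error_candidates(probs, labels, threshold=0.05):
    """probs: (N, C) softmax outputs of a pretrained classifier; labels: (N,) assigned labels."""
    assigned_prob = probs[np.arange(len(labels)), labels]
    suspects = np.where(assigned_prob < threshold)[0]          # likely mislabeled samples
    return suspects[np.argsort(assigned_prob[suspects])]       # most suspicious first
```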

Learning Dynamics from Kinematics: Estimating 2D Foot Pressure Maps from Video Frames

Title Learning Dynamics from Kinematics: Estimating 2D Foot Pressure Maps from Video Frames
Authors Christopher Funk, Savinay Nagendra, Jesse Scott, Bharadwaj Ravichandran, John H. Challis, Robert T. Collins, Yanxi Liu
Abstract Pose stability analysis is key to understanding locomotion and control of body equilibrium, with applications in numerous fields such as kinesiology, medicine, and robotics. In biomechanics, Center of Pressure (CoP) is used in studies of human postural control and gait. We propose and validate a novel approach to learn CoP from the pose of a human body to aid stability analysis. More specifically, we propose an end-to-end deep learning architecture to regress foot pressure heatmaps, and hence the CoP locations, from 2D human pose derived from video. We have collected a set of long (5+ minute) choreographed Taiji (Tai Chi) sequences from multiple subjects with synchronized foot pressure and video data. The derived human pose data and corresponding foot pressure maps are used jointly to train a convolutional neural network with residual architecture, named PressNET. Cross-subject validation results show promising performance of PressNET, significantly outperforming the baseline method of K-Nearest Neighbors. Furthermore, we demonstrate that our computation of center of pressure (CoP) from PressNET is not only significantly more accurate than that obtained from the baseline approach but also meets the expectations of corresponding lab-based stability measurements in kinesiology.
Tasks Motion Capture
Published 2018-11-30
URL https://arxiv.org/abs/1811.12607v4
PDF https://arxiv.org/pdf/1811.12607v4.pdf
PWC https://paperswithcode.com/paper/learning-dynamics-from-kinematics-estimating
Repo
Framework

Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks

Title Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks
Authors Christoph Wick, Christian Reul, Frank Puppe
Abstract This paper proposes a combination of a convolutional and an LSTM network to improve the accuracy of OCR on early printed books. While the standard model of line-based OCR uses a single LSTM layer, we use a CNN- and pooling-layer combination in front of an LSTM layer. Due to the higher number of trainable parameters, the network requires a large number of training examples to unleash its full performance. Thereby, the error is reduced by a factor of up to 44%, yielding a character error rate (CER) of 1% and below. To further improve the results, we use a voting mechanism to achieve CERs below 0.5%. The runtime of the deep model for training and prediction of a book behaves very similarly to that of a shallow network.
Tasks Optical Character Recognition
Published 2018-02-27
URL http://arxiv.org/abs/1802.10033v1
PDF http://arxiv.org/pdf/1802.10033v1.pdf
PWC https://paperswithcode.com/paper/improving-ocr-accuracy-on-early-printed-books-1
Repo
Framework
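
A minimal sketch of the CNN/pooling-plus-LSTM line recognizer the abstract describes, emitting per-timestep character log-probabilities suitable for a CTC-style loss; the layer sizes, input line height, and single conv block are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch: conv + pooling front-end feeding a bidirectional LSTM line recognizer.
import torch
import torch.nn as nn

class ConvLSTMOCR(nn.Module):
    def __init__(self, n_chars=80, height=48):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(64 * (height // 2), 200, bidirectional=True, batch_first=True)
        self.out = nn.Linear(400, n_chars + 1)      # +1 for the CTC blank label

    def forward(self, line_img):                    # line_img: (B, 1, H, W), H == height
        f = self.cnn(line_img)                      # (B, 64, H/2, W/2)
        b, c, h, w = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # one feature vector per column
        h_out, _ = self.lstm(seq)
        return self.out(h_out).log_softmax(-1)      # per-timestep character log-probs
```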

Sampling-based Bayesian Inference with gradient uncertainty

Title Sampling-based Bayesian Inference with gradient uncertainty
Authors Chanwoo Park, Jae Myung Kim, Seok Hyeon Ha, Jungwoo Lee
Abstract Deep neural networks (NNs) have achieved impressive performance, often exceeding human performance, on many computer vision tasks. However, one of the most challenging issues that still remains is that NNs are overconfident in their predictions, which can be very harmful in safety-critical applications. In this paper, we show that predictive uncertainty can be efficiently estimated when we incorporate the concept of gradient uncertainty into posterior sampling. The proposed method is tested on two different datasets, MNIST for in-distribution confusing examples and notMNIST for out-of-distribution data. We show that our method is able to efficiently represent predictive uncertainty on both datasets.
Tasks Bayesian Inference
Published 2018-12-08
URL https://arxiv.org/abs/1812.03285v2
PDF https://arxiv.org/pdf/1812.03285v2.pdf
PWC https://paperswithcode.com/paper/sampling-based-bayesian-inference-with
Repo
Framework
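
For context, a minimal sketch of stochastic gradient Langevin dynamics, the kind of posterior sampling such methods build on; the gradient-uncertainty term the paper adds is omitted, and the step size is an illustrative assumption.

```python
# Hypothetical sketch: one SGLD step = gradient step + Gaussian noise scaled by the step size.
import torch

def sgld_step(params, loss_fn, lr=1e-4):
    """params: list of tensors with requires_grad=True; loss_fn: returns the negative log posterior."""
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            noise = torch.randn_like(p) * (2.0 * lr) ** 0.5
            p.add_(-lr * g + noise)          # iterates approximate samples from the posterior
    return loss.item()
```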

Progressive Weight Pruning of Deep Neural Networks using ADMM

Title Progressive Weight Pruning of Deep Neural Networks using ADMM
Authors Shaokai Ye, Tianyun Zhang, Kaiqi Zhang, Jiayu Li, Kaidi Xu, Yunfei Yang, Fuxun Yu, Jian Tang, Makan Fardad, Sijia Liu, Xiang Chen, Xue Lin, Yanzhi Wang
Abstract Deep neural networks (DNNs), although achieving human-level performance in many domains, have very large model sizes that hinder their broader application on edge computing devices. Extensive research has been conducted on DNN model compression and pruning. However, most previous work took heuristic approaches. This work proposes a progressive weight pruning approach based on ADMM (Alternating Direction Method of Multipliers), a powerful technique for non-convex optimization problems with potentially combinatorial constraints. Motivated by dynamic programming, the proposed method reaches extremely high pruning rates by using partial prunings with moderate pruning rates. It thereby resolves the accuracy degradation and long convergence time problems that arise when pursuing extremely high pruning ratios. It achieves up to a 34x pruning rate on the ImageNet dataset and a 167x pruning rate on the MNIST dataset, significantly higher than those reported in the literature. Under the same number of epochs, the proposed method also achieves faster convergence and higher compression rates. The code and pruned DNN models are released at bit.ly/2zxdlss
Tasks Model Compression
Published 2018-10-17
URL http://arxiv.org/abs/1810.07378v2
PDF http://arxiv.org/pdf/1810.07378v2.pdf
PWC https://paperswithcode.com/paper/progressive-weight-pruning-of-deep-neural
Repo
Framework
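
A minimal sketch of a single ADMM round for pruning one weight tensor: a gradient step on the loss augmented with a quadratic term pulling W toward the auxiliary variable Z, a Euclidean projection of W + U onto the k-largest-magnitude constraint to get Z, and a dual update of U. The hyper-parameters and the unstructured top-k constraint are illustrative assumptions; the paper's progressive schedule chains several such rounds with increasing sparsity.

```python
# Hypothetical sketch: one ADMM pruning round for a single weight tensor.
import torch

def project_topk(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    thresh = w.flatten().abs().topk(k).values.min()
    return torch.where(w.abs() >= thresh, w, torch.zeros_like(w))

def admm_round(W, Z, U, grad_loss, rho=1e-3, lr=1e-2, k=1000):
    W = W - lr * (grad_loss + rho * (W - Z + U))   # step on loss + rho/2 * ||W - Z + U||^2
    Z = project_topk(W + U, k)                     # projection onto the sparsity constraint
    U = U + W - Z                                  # scaled dual update
    return W, Z, U
```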