October 16, 2019

2889 words 14 mins read

Paper Group ANR 1107

Texture Synthesis Guided Deep Hashing for Texture Image Retrieval. Neural Conditional Gradients. Learning Cross-lingual Distributed Logical Representations for Semantic Parsing. WEBCA: Weakly-Electric-Fish Bioinspired Cognitive Architecture. Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information. Beyond Context: Ex …

Texture Synthesis Guided Deep Hashing for Texture Image Retrieval

Title Texture Synthesis Guided Deep Hashing for Texture Image Retrieval
Authors Ayan Kumar Bhunia, Perla Sai Raj Kishore, Pranay Mukherjee, Abhirup Das, Partha Pratim Roy
Abstract With the large-scale explosion of images and videos over the internet, efficient hashing methods have been developed to facilitate memory- and time-efficient retrieval of similar images. However, none of the existing works uses hashing to address texture image retrieval, mostly because of the lack of sufficiently large texture image databases. Our work addresses this problem by developing a novel deep learning architecture that generates binary hash codes for input texture images. For this, we first pre-train a Texture Synthesis Network (TSN) which takes a texture patch as input and outputs an enlarged view of the texture by injecting newer texture content. This indicates that the TSN encodes the learnt texture-specific information in its intermediate layers. In the next stage, a second network gathers the multi-scale feature representations from the TSN’s intermediate layers using channel-wise attention and combines them progressively into a dense continuous representation, which is finally converted into a binary hash code with the help of individual and pairwise label information. The new enlarged texture patches also serve as data augmentation, alleviating the problem of insufficient texture data, and are used to train the second stage of the network. Experiments on three public texture image retrieval datasets indicate the superiority of our texture synthesis guided hashing approach over current state-of-the-art methods.
Tasks Data Augmentation, Image Retrieval, Texture Image Retrieval, Texture Synthesis
Published 2018-11-04
URL https://arxiv.org/abs/1811.01401v5
PDF https://arxiv.org/pdf/1811.01401v5.pdf
PWC https://paperswithcode.com/paper/texture-synthesis-guided-deep-hashing-for
Repo
Framework
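
A minimal sketch of the second-stage hashing head described in the abstract above, assuming a squeeze-and-excitation style channel-wise attention over a pretrained TSN feature map and a tanh relaxation that is binarized at test time; the layer sizes and the exact attention form are illustrative assumptions, not the authors' design.

```python
# Hypothetical sketch: channel-wise attention over a frozen TSN feature map,
# followed by a hashing head producing a continuous code binarized at test time.
import torch
import torch.nn as nn

class ChannelAttentionHash(nn.Module):
    def __init__(self, channels=256, hash_bits=64):
        super().__init__()
        # Assumed squeeze-and-excitation style gate for channel-wise attention.
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 8), nn.ReLU(),
            nn.Linear(channels // 8, channels), nn.Sigmoid(),
        )
        self.fc = nn.Linear(channels, hash_bits)

    def forward(self, feat):                      # feat: (B, C, H, W) from a TSN layer
        pooled = feat.mean(dim=(2, 3))            # global average pooling -> (B, C)
        weights = self.gate(pooled)               # per-channel attention weights
        attended = (feat * weights[:, :, None, None]).mean(dim=(2, 3))
        return torch.tanh(self.fc(attended))      # binarize later with torch.sign
```

During training, the individual and pairwise label losses mentioned in the abstract would be applied to these continuous codes before binarization.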

Neural Conditional Gradients

Title Neural Conditional Gradients
Authors Patrick Schramowski, Christian Bauckhage, Kristian Kersting
Abstract The move from hand-designed to learned optimizers in machine learning has been quite successful for both gradient-based and gradient-free optimizers. When facing a constrained problem, however, maintaining feasibility typically requires a projection step, which might be computationally expensive and not differentiable. We show how the design of projection-free convex optimization algorithms can be cast as a learning problem based on Frank-Wolfe Networks: recurrent networks implementing the Frank-Wolfe algorithm, also known as conditional gradients. This allows them to learn to exploit structure when, e.g., optimizing over rank-1 matrices. Our LSTM-learned optimizers outperform hand-designed ones as well as learned but unconstrained ones. We demonstrate this for training support vector machines and softmax classifiers.
Tasks
Published 2018-03-12
URL http://arxiv.org/abs/1803.04300v2
PDF http://arxiv.org/pdf/1803.04300v2.pdf
PWC https://paperswithcode.com/paper/neural-conditional-gradients
Repo
Framework
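
Since the paper unrolls the Frank-Wolfe (conditional gradient) iteration into a recurrent network, the hand-designed version of that iteration is worth spelling out. The sketch below runs it on the probability simplex, where the linear minimization oracle is simply a vertex; a learned variant would replace the fixed step rule with an LSTM-predicted update. The quadratic example objective is an illustrative assumption.

```python
# Classical Frank-Wolfe (conditional gradient) iteration on the unit simplex.
import numpy as np

def frank_wolfe_simplex(grad_f, x0, steps=100):
    """Projection-free minimization over the probability simplex.

    grad_f: callable returning the gradient of the objective at x.
    """
    x = x0.copy()
    for t in range(steps):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0              # linear minimization oracle: best simplex vertex
        gamma = 2.0 / (t + 2.0)            # standard step-size schedule
        x = x + gamma * (s - x)            # convex update keeps x feasible, no projection
    return x

# Example: minimize ||Ax - b||^2 over the simplex.
A = np.random.randn(20, 5); b = np.random.randn(20)
x = frank_wolfe_simplex(lambda x: 2 * A.T @ (A @ x - b), np.ones(5) / 5)
```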

Learning Cross-lingual Distributed Logical Representations for Semantic Parsing

Title Learning Cross-lingual Distributed Logical Representations for Semantic Parsing
Authors Yanyan Zou, Wei Lu
Abstract With the development of several multilingual datasets used for semantic parsing, recent research efforts have looked into the problem of learning semantic parsers in a multilingual setup. However, how to improve the performance of a monolingual semantic parser for a specific language by leveraging data annotated in different languages remains a research question that is under-explored. In this work, we present a study to show how learning distributed representations of the logical forms from data annotated in different languages can be used for improving the performance of a monolingual semantic parser. We extend two existing monolingual semantic parsers to incorporate such cross-lingual distributed logical representations as features. Experiments show that our proposed approach is able to yield improved semantic parsing results on the standard multilingual GeoQuery dataset.
Tasks Semantic Parsing
Published 2018-06-14
URL http://arxiv.org/abs/1806.05461v1
PDF http://arxiv.org/pdf/1806.05461v1.pdf
PWC https://paperswithcode.com/paper/learning-cross-lingual-distributed-logical
Repo
Framework

WEBCA: Weakly-Electric-Fish Bioinspired Cognitive Architecture

Title WEBCA: Weakly-Electric-Fish Bioinspired Cognitive Architecture
Authors Amit Kumar Mishra
Abstract Neuroethology has been an active field of study for more than a century now. Among the most interesting species studied so far, the weakly electric fish is a fascinating one. It performs communication, echo-location and inter-species detection efficiently with an interesting configuration of sensors, neurons and a simple brain. In this paper we propose a cognitive architecture inspired by the way these fish handle and process information. We believe that it is easier to understand and mimic the neural architecture of a simpler species than that of a human. Hence, the proposed architecture is expected both to help research in cognitive robotics and to help understand more complicated brains like those of human beings.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11401v1
PDF http://arxiv.org/pdf/1806.11401v1.pdf
PWC https://paperswithcode.com/paper/webca-weakly-electric-fish-bioinspired
Repo
Framework

Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information

Title Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information
Authors Masaki Asada, Makoto Miwa, Yutaka Sasaki
Abstract We propose a novel neural method to extract drug-drug interactions (DDIs) from texts using external drug molecular structure information. We encode textual drug pairs with convolutional neural networks and their molecular pairs with graph convolutional networks (GCNs), and then we concatenate the outputs of these two networks. In the experiments, we show that GCNs can predict DDIs from the molecular structures of drugs with high accuracy, and that the molecular information can enhance text-based DDI extraction by 2.39 percentage points in F-score on the DDIExtraction 2013 shared task data set.
Tasks
Published 2018-05-15
URL http://arxiv.org/abs/1805.05593v1
PDF http://arxiv.org/pdf/1805.05593v1.pdf
PWC https://paperswithcode.com/paper/enhancing-drug-drug-interaction-extraction
Repo
Framework
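
A minimal sketch of the two-branch idea described above: a text CNN over word embeddings of the sentence containing the drug pair, a single graph convolution over each drug's molecular graph, and concatenation of the outputs for classification. Layer sizes, atom-feature dimensions, and the one-layer GCN are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch: text CNN + molecular GCN, outputs concatenated for DDI classes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, emb_dim=100, filters=128):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, filters, kernel_size=3, padding=1)

    def forward(self, tokens):               # tokens: (B, L, emb_dim) word embeddings
        h = F.relu(self.conv(tokens.transpose(1, 2)))
        return h.max(dim=2).values           # max-over-time pooling -> (B, filters)

class GCNLayer(nn.Module):
    def __init__(self, in_dim=75, out_dim=128):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, atom_feats, adj):      # atom_feats: (B, N, in_dim), adj: (B, N, N) normalized
        return F.relu(adj @ self.lin(atom_feats)).mean(dim=1)  # mean over atoms -> (B, out_dim)

class DDIClassifier(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.text, self.mol = TextCNN(), GCNLayer()
        self.out = nn.Linear(128 * 3, n_classes)                # text + two drug graphs

    def forward(self, tokens, feats1, adj1, feats2, adj2):
        z = torch.cat([self.text(tokens),
                       self.mol(feats1, adj1),
                       self.mol(feats2, adj2)], dim=1)
        return self.out(z)
```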

Beyond Context: Exploring Semantic Similarity for Tiny Face Detection

Title Beyond Context: Exploring Semantic Similarity for Tiny Face Detection
Authors Yue Xi, Jiangbin Zheng, Xiangjian He, Wenjing Jia, Hanhui Li
Abstract Tiny face detection aims to find faces with high degrees of variability in scale, resolution and occlusion in cluttered scenes. Because very little information is available on tiny faces, it is not sufficient to detect them merely from the information presented inside the tiny bounding boxes or their context. In this paper, we propose to exploit the semantic similarity among all predicted targets in each image to boost current face detectors. To this end, we present a novel framework to model semantic similarity as pairwise constraints within a metric learning scheme, and then refine our predictions with the semantic similarity by utilizing graph cut techniques. Experiments conducted on three widely used benchmark datasets demonstrate the improvement over the state of the art gained by applying this idea.
Tasks Face Detection, Metric Learning, Semantic Similarity, Semantic Textual Similarity
Published 2018-03-05
URL http://arxiv.org/abs/1803.01555v1
PDF http://arxiv.org/pdf/1803.01555v1.pdf
PWC https://paperswithcode.com/paper/beyond-context-exploring-semantic-similarity
Repo
Framework
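
A minimal sketch of pairwise constraints within a metric-learning scheme as described in the abstract: embeddings of predicted boxes with the same label are pulled together, mixed pairs are pushed apart by a margin. The contrastive form and margin value are assumptions; the graph-cut refinement step is not shown.

```python
# Hypothetical sketch: contrastive pairwise-constraint loss over box embeddings.
import torch
import torch.nn.functional as F

def pairwise_constraint_loss(embeds, labels, margin=1.0):
    """embeds: (N, D) embeddings of predicted boxes; labels: (N,) 0/1 face indicator."""
    d = torch.cdist(embeds, embeds)                   # pairwise Euclidean distances
    same = (labels[:, None] == labels[None, :]).float()
    pull = same * d.pow(2)                            # same-label pairs: small distance
    push = (1 - same) * F.relu(margin - d).pow(2)     # mixed pairs: at least `margin` apart
    return (pull + push).mean()
```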

Neural Network Architecture for Credibility Assessment of Textual Claims

Title Neural Network Architecture for Credibility Assessment of Textual Claims
Authors Rajat Singh, Nurendra Choudhary, Ishita Bindlish, Manish Shrivastava
Abstract Text articles with false claims, especially news, have recently become a growing problem for Internet users. These articles are in wide circulation and readers face difficulty discerning fact from fiction. Previous work on credibility assessment has focused on factual analysis and linguistic features. The task’s main challenge is the distinction between the features of true and false articles. In this paper, we propose a novel approach called Credibility Outcome (CREDO) which aims at scoring the credibility of an article in an open-domain setting. CREDO consists of different modules for capturing various features responsible for the credibility of an article. These features include the credibility of the article’s source and author, the semantic similarity between the article and related credible articles retrieved from a knowledge base, and the sentiments conveyed by the article. A neural network architecture learns the contribution of each of these modules to the overall credibility of an article. Experiments on the Snopes dataset reveal that CREDO outperforms state-of-the-art approaches based on linguistic features.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2018-03-28
URL http://arxiv.org/abs/1803.10547v2
PDF http://arxiv.org/pdf/1803.10547v2.pdf
PWC https://paperswithcode.com/paper/neural-network-architecture-for-credibility
Repo
Framework

No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

Title No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference
Authors Shumeet Baluja, David Marwood, Michele Covell, Nick Johnston
Abstract For successful deployment of deep neural networks on highly resource-constrained devices (hearing aids, earbuds, wearables), we must simplify the types of operations and the memory/power resources used during inference. Completely avoiding inference-time floating-point operations is one of the simplest ways to design networks for these highly constrained environments. By discretizing both our in-network non-linearities and our network weights, we can move to simple, compact networks without floating-point operations, without multiplications, and without any non-linear function computations. Our approach allows us to explore the spectrum of possible networks, ranging from fully continuous versions down to networks with bi-level weights and activations. Our results show that discretization can be done without loss of performance and that we can train a network that will successfully operate without floating point, without multiplication, and with less RAM on both regression tasks (autoencoding) and multi-class classification tasks (ImageNet). The memory needed to deploy our discretized networks is less than one third of that of an equivalent architecture that uses floating-point operations.
Tasks
Published 2018-09-24
URL http://arxiv.org/abs/1809.09244v2
PDF http://arxiv.org/pdf/1809.09244v2.pdf
PWC https://paperswithcode.com/paper/no-multiplication-no-floating-point-no
Repo
Framework
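
A minimal sketch of the general ingredient behind such networks: bi-level (+/-1) weights trained with a straight-through estimator, so inference needs no multiplications or floating-point weights. The paper's exact discretization scheme differs in its details; this is only the standard textbook form.

```python
# Hypothetical sketch: binary weights with a straight-through estimator (STE).
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return torch.sign(w)                 # {-1, +1} weights at inference time

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                      # straight-through: pass gradients unchanged

class BinaryLinear(nn.Linear):
    def forward(self, x):
        wb = BinarizeSTE.apply(self.weight)  # full-precision weights kept only for training
        return nn.functional.linear(x, wb, self.bias)

layer = BinaryLinear(784, 256)
y = layer(torch.randn(8, 784))               # inference uses only +/-1 weights (adds/subtracts)
```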

A Unified Framework for Multi-View Multi-Class Object Pose Estimation

Title A Unified Framework for Multi-View Multi-Class Object Pose Estimation
Authors Chi Li, Jin Bai, Gregory D. Hager
Abstract One core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.
Tasks Pose Estimation
Published 2018-03-21
URL http://arxiv.org/abs/1803.08103v2
PDF http://arxiv.org/pdf/1803.08103v2.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-multi-view-multi
Repo
Framework
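
A minimal sketch of the combined classification-and-regression idea over a discretized rotation space: the head picks a cell of the tessellation and regresses a residual within it. The bin count, residual parameterization, and loss weighting are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: rotation-bin classification plus per-bin residual regression.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinAndRegressHead(nn.Module):
    def __init__(self, feat_dim=512, n_bins=60):
        super().__init__()
        self.cls = nn.Linear(feat_dim, n_bins)        # which cell of the SE(3) tessellation
        self.reg = nn.Linear(feat_dim, n_bins * 3)    # per-bin residual (e.g., axis-angle offset)

    def forward(self, feat, bin_gt, res_gt):          # bin_gt: (B,) long, res_gt: (B, 3)
        logits = self.cls(feat)
        res = self.reg(feat).view(feat.shape[0], -1, 3)
        res_pred = res[torch.arange(feat.shape[0]), bin_gt]   # residual of the ground-truth bin
        return F.cross_entropy(logits, bin_gt) + F.smooth_l1_loss(res_pred, res_gt)
```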

Power Normalizing Second-order Similarity Network for Few-shot Learning

Title Power Normalizing Second-order Similarity Network for Few-shot Learning
Authors Hongguang Zhang, Piotr Koniusz
Abstract Second- and higher-order statistics of data points have played an important role in advancing the state of the art on several computer vision problems such as fine-grained image and scene recognition. However, these statistics need to be passed via an appropriate pooling scheme to obtain the best performance. Power Normalizations are non-linear activation units which enjoy probability-inspired derivations and can be applied in CNNs. In this paper, we propose a similarity learning network leveraging second-order information and Power Normalizations. To this end, we propose several formulations capturing second-order statistics and derive a sigmoid-like Power Normalizing function to demonstrate its interpretability. Our model is trained end-to-end to learn the similarity between the support set and query images for the problem of one- and few-shot learning. Evaluations on the Omniglot, miniImagenet and Open MIC datasets demonstrate that this network obtains state-of-the-art results on several few-shot learning protocols.
Tasks Few-Shot Learning, Omniglot, Scene Recognition
Published 2018-11-10
URL http://arxiv.org/abs/1811.04167v1
PDF http://arxiv.org/pdf/1811.04167v1.pdf
PWC https://paperswithcode.com/paper/power-normalizing-second-order-similarity
Repo
Framework
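
A minimal sketch of second-order pooling followed by a sigmoid-like power normalization of the kind the abstract describes; the eta value and the way support and query descriptors are compared are assumptions, and the learned relation head on top is omitted.

```python
# Hypothetical sketch: second-order descriptor with element-wise power normalization.
import torch

def second_order_descriptor(feats, eta=10.0):
    """feats: (N, D) local feature vectors from a CNN backbone."""
    M = feats.T @ feats / feats.shape[0]             # autocorrelation matrix, (D, D)
    return 2.0 / (1.0 + torch.exp(-eta * M)) - 1.0   # sigmoid-like normalization to (-1, 1)

query = second_order_descriptor(torch.randn(49, 64))
support = second_order_descriptor(torch.randn(49, 64))
similarity = -(query - support).pow(2).sum()         # e.g., feed the difference to a relation head
```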

Classifier-Guided Visual Correction of Noisy Labels for Image Classification Tasks

Title Classifier-Guided Visual Correction of Noisy Labels for Image Classification Tasks
Authors Alex Bäuerle, Heiko Neumann, Timo Ropinski
Abstract Training data plays an essential role in modern applications of machine learning. However, gathering labeled training data is time-consuming. Therefore, labeling is often outsourced to less experienced users, or completely automated. This can introduce errors, which compromise valuable training data, and lead to suboptimal training results. We thus propose a novel approach that uses the power of pretrained classifiers to visually guide users to noisy labels, and lets them interactively check error candidates to iteratively improve the training dataset. To systematically investigate training data, we propose a categorization of labeling errors into three different types based on an analysis of potential pitfalls in label acquisition processes. For each of these types, we present approaches to detect, reason about, and resolve error candidates, and we propose metrics and visual guidance techniques to support machine learning users. Our approach has been used to spot errors in well-known machine learning benchmark datasets, and we tested its usability during a user evaluation. While initially developed for images, the techniques presented in this paper are independent of the classification algorithm, and can also be extended to many other types of training data.
Tasks Image Classification
Published 2018-08-09
URL https://arxiv.org/abs/1808.03114v2
PDF https://arxiv.org/pdf/1808.03114v2.pdf
PWC https://paperswithcode.com/paper/training-de-confusion-an-interactive-network
Repo
Framework
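
A minimal sketch of the classifier-guided part of the idea: samples whose assigned label receives a very low predicted probability from a pretrained classifier are surfaced as error candidates for interactive review. The threshold and ranking are illustrative; the visual guidance and the three-type error taxonomy are not shown.

```python
# Hypothetical sketch: rank samples whose given label the classifier finds implausible.
import numpy as np

def label_error_candidates(probs, labels, threshold=0.05):
    """probs: (N, C) softmax outputs of a pretrained classifier; labels: (N,) assigned labels."""
    assigned_prob = probs[np.arange(len(labels)), labels]
    suspects = np.where(assigned_prob < threshold)[0]          # likely mislabeled samples
    return suspects[np.argsort(assigned_prob[suspects])]       # most suspicious first
```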

Learning Dynamics from Kinematics: Estimating 2D Foot Pressure Maps from Video Frames

Title Learning Dynamics from Kinematics: Estimating 2D Foot Pressure Maps from Video Frames
Authors Christopher Funk, Savinay Nagendra, Jesse Scott, Bharadwaj Ravichandran, John H. Challis, Robert T. Collins, Yanxi Liu
Abstract Pose stability analysis is key to understanding locomotion and control of body equilibrium, with applications in numerous fields such as kinesiology, medicine, and robotics. In biomechanics, Center of Pressure (CoP) is used in studies of human postural control and gait. We propose and validate a novel approach to learn CoP from the pose of a human body to aid stability analysis. More specifically, we propose an end-to-end deep learning architecture to regress foot pressure heatmaps, and hence the CoP locations, from 2D human pose derived from video. We have collected a set of long (5+ minute) choreographed Taiji (Tai Chi) sequences from multiple subjects with synchronized foot pressure and video data. The derived human pose data and corresponding foot pressure maps are used jointly to train a convolutional neural network with residual architecture, named PressNET. Cross-subject validation results show promising performance of PressNET, significantly outperforming the baseline method of K-Nearest Neighbors. Furthermore, we demonstrate that our computation of center of pressure (CoP) from PressNET is not only significantly more accurate than that obtained from the baseline approach but also meets the expectations of corresponding lab-based stability measurements in kinesiology.
Tasks Motion Capture
Published 2018-11-30
URL https://arxiv.org/abs/1811.12607v4
PDF https://arxiv.org/pdf/1811.12607v4.pdf
PWC https://paperswithcode.com/paper/learning-dynamics-from-kinematics-estimating
Repo
Framework

Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks

Title Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks
Authors Christoph Wick, Christian Reul, Frank Puppe
Abstract This paper proposes a combination of a convolutional and an LSTM network to improve the accuracy of OCR on early printed books. While the standard model of line-based OCR uses a single LSTM layer, we use a CNN- and pooling-layer combination in front of an LSTM layer. Due to the higher number of trainable parameters, the network requires a large number of training examples to unleash its full performance. Thereby, the error is reduced by a factor of up to 44%, yielding a character error rate (CER) of 1% and below. To further improve the results, we use a voting mechanism to achieve CERs below 0.5%. The runtime of the deep model for training and prediction of a book behaves very similarly to that of a shallow network.
Tasks Optical Character Recognition
Published 2018-02-27
URL http://arxiv.org/abs/1802.10033v1
PDF http://arxiv.org/pdf/1802.10033v1.pdf
PWC https://paperswithcode.com/paper/improving-ocr-accuracy-on-early-printed-books-1
Repo
Framework
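
A minimal sketch of the CNN/pooling-plus-LSTM line recognizer the abstract describes, emitting per-timestep character log-probabilities suitable for a CTC-style loss; the layer sizes, input line height, and single conv block are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch: conv + pooling front-end feeding a bidirectional LSTM line recognizer.
import torch
import torch.nn as nn

class ConvLSTMOCR(nn.Module):
    def __init__(self, n_chars=80, height=48):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(64 * (height // 2), 200, bidirectional=True, batch_first=True)
        self.out = nn.Linear(400, n_chars + 1)      # +1 for the CTC blank label

    def forward(self, line_img):                    # line_img: (B, 1, H, W), H == height
        f = self.cnn(line_img)                      # (B, 64, H/2, W/2)
        b, c, h, w = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # one feature vector per column
        h_out, _ = self.lstm(seq)
        return self.out(h_out).log_softmax(-1)      # per-timestep character log-probs
```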

Sampling-based Bayesian Inference with gradient uncertainty

Title Sampling-based Bayesian Inference with gradient uncertainty
Authors Chanwoo Park, Jae Myung Kim, Seok Hyeon Ha, Jungwoo Lee
Abstract Deep neural networks (NNs) have achieved impressive performance, often exceeding human performance, on many computer vision tasks. However, one of the most challenging issues that still remains is that NNs are overconfident in their predictions, which can be very harmful in safety-critical applications. In this paper, we show that predictive uncertainty can be efficiently estimated when we incorporate the concept of gradient uncertainty into posterior sampling. The proposed method is tested on two different datasets, MNIST for in-distribution confusing examples and notMNIST for out-of-distribution data. We show that our method is able to efficiently represent predictive uncertainty on both datasets.
Tasks Bayesian Inference
Published 2018-12-08
URL https://arxiv.org/abs/1812.03285v2
PDF https://arxiv.org/pdf/1812.03285v2.pdf
PWC https://paperswithcode.com/paper/sampling-based-bayesian-inference-with
Repo
Framework
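
For context, a minimal sketch of stochastic gradient Langevin dynamics, the kind of posterior sampling such methods build on; the gradient-uncertainty term the paper adds is omitted, and the step size is an illustrative assumption.

```python
# Hypothetical sketch: one SGLD step = gradient step + Gaussian noise scaled by the step size.
import torch

def sgld_step(params, loss_fn, lr=1e-4):
    """params: list of tensors with requires_grad=True; loss_fn: returns the negative log posterior."""
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            noise = torch.randn_like(p) * (2.0 * lr) ** 0.5
            p.add_(-lr * g + noise)          # iterates approximate samples from the posterior
    return loss.item()
```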

Progressive Weight Pruning of Deep Neural Networks using ADMM

Title Progressive Weight Pruning of Deep Neural Networks using ADMM
Authors Shaokai Ye, Tianyun Zhang, Kaiqi Zhang, Jiayu Li, Kaidi Xu, Yunfei Yang, Fuxun Yu, Jian Tang, Makan Fardad, Sijia Liu, Xiang Chen, Xue Lin, Yanzhi Wang
Abstract Deep neural networks (DNNs), although achieving human-level performance in many domains, have very large model sizes that hinder their broader application on edge computing devices. Extensive research has been conducted on DNN model compression and pruning. However, most previous work took heuristic approaches. This work proposes a progressive weight pruning approach based on ADMM (Alternating Direction Method of Multipliers), a powerful technique for non-convex optimization problems with potentially combinatorial constraints. Motivated by dynamic programming, the proposed method reaches extremely high pruning rates by using partial prunings with moderate pruning rates. It thereby resolves the accuracy degradation and long convergence time problems that arise when pursuing extremely high pruning ratios. It achieves up to a 34x pruning rate on the ImageNet dataset and a 167x pruning rate on the MNIST dataset, significantly higher than those reported in the literature. Under the same number of epochs, the proposed method also achieves faster convergence and higher compression rates. The code and pruned DNN models are released at bit.ly/2zxdlss
Tasks Model Compression
Published 2018-10-17
URL http://arxiv.org/abs/1810.07378v2
PDF http://arxiv.org/pdf/1810.07378v2.pdf
PWC https://paperswithcode.com/paper/progressive-weight-pruning-of-deep-neural
Repo
Framework
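
A minimal sketch of a single ADMM round for pruning one weight tensor: a gradient step on the loss augmented with a quadratic term pulling W toward the auxiliary variable Z, a Euclidean projection of W + U onto the k-largest-magnitude constraint to get Z, and a dual update of U. The hyper-parameters and the unstructured top-k constraint are illustrative assumptions; the paper's progressive schedule chains several such rounds with increasing sparsity.

```python
# Hypothetical sketch: one ADMM pruning round for a single weight tensor.
import torch

def project_topk(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    thresh = w.flatten().abs().topk(k).values.min()
    return torch.where(w.abs() >= thresh, w, torch.zeros_like(w))

def admm_round(W, Z, U, grad_loss, rho=1e-3, lr=1e-2, k=1000):
    W = W - lr * (grad_loss + rho * (W - Z + U))   # step on loss + rho/2 * ||W - Z + U||^2
    Z = project_topk(W + U, k)                     # projection onto the sparsity constraint
    U = U + W - Z                                  # scaled dual update
    return W, Z, U
```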