October 18, 2019

2918 words 14 mins read

Paper Group ANR 432

Tri-axial Self-Attention for Concurrent Activity Recognition. Surface Networks via General Covers. State-Denoised Recurrent Neural Networks. Texture Segmentation Based Video Compression Using Convolutional Neural Networks. Q-Map: Clinical Concept Mining from Clinical Documents. Sparsity in Variational Autoencoders. Acquisition and use of knowledge …

Tri-axial Self-Attention for Concurrent Activity Recognition

Title Tri-axial Self-Attention for Concurrent Activity Recognition
Authors Yanyi Zhang, Xinyu Li, Kaixiang Huang, Yehan Wang, Shuhong Chen, Ivan Marsic
Abstract We present a system for concurrent activity recognition. To extract features associated with different activities, we propose a feature-to-activity attention that maps the extracted global features to sub-features associated with individual activities. To model the temporal associations of individual activities, we propose a transformer-network encoder that models independent temporal associations for each activity. To make the concurrent activity prediction aware of the potential associations between activities, we propose self-attention with an association mask. Our system achieved state-of-the-art or comparable performance on three commonly used concurrent activity detection datasets. Our visualizations demonstrate that our system is able to locate the important spatial-temporal features for final decision making. We also show that our system can be applied to general multilabel classification problems.
Tasks Action Detection, Activity Detection, Activity Prediction, Activity Recognition, Concurrent Activity Recognition, Decision Making
Published 2018-12-06
URL http://arxiv.org/abs/1812.02817v1
PDF http://arxiv.org/pdf/1812.02817v1.pdf
PWC https://paperswithcode.com/paper/tri-axial-self-attention-for-concurrent
Repo
Framework
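The association-masked self-attention described in the abstract can be sketched in NumPy. This is a minimal single-head illustration of masking attention between activities; the shapes, the mask convention, and the single-head formulation are assumptions, not the authors' exact design.

```python
import numpy as np

def masked_self_attention(X, assoc_mask):
    """Scaled dot-product self-attention across activities, with an
    association mask that blocks attention between unrelated activities."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                     # (n, n) pairwise scores
    scores = np.where(assoc_mask == 1, scores, -1e9)  # block non-associated pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ X                                # attended activity features

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))          # 4 activities, 8-dim sub-features each
mask = np.eye(4, dtype=int)          # every activity attends to itself...
mask[0, 1] = mask[1, 0] = 1          # ...and activities 0 and 1 are associated
out = masked_self_attention(X, mask)
```

Activities whose mask row allows only the diagonal pass through unchanged, which is the intended effect of restricting attention to associated activities.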

Surface Networks via General Covers

Title Surface Networks via General Covers
Authors Niv Haim, Nimrod Segol, Heli Ben-Hamu, Haggai Maron, Yaron Lipman
Abstract Developing deep learning techniques for geometric data is an active and fruitful research area. This paper tackles the problem of sphere-type surface learning by developing a novel surface-to-image representation. Using this representation we are able to quickly adapt successful CNN models to the surface setting. The surface-image representation is based on a covering map from the image domain to the surface. Namely, the map wraps around the surface several times, making sure that every part of the surface is well represented in the image. Differently from previous surface-to-image representations, we provide a low distortion coverage of all surface parts in a single image. Specifically, for the use case of learning spherical signals, our representation provides a low distortion alternative to several popular spherical parameterizations used in deep learning. We have used the surface-to-image representation to apply standard CNN architectures to 3D models as well as spherical signals. We show that our method achieves state of the art or comparable results on the tasks of shape retrieval, shape classification and semantic shape segmentation.
Tasks
Published 2018-12-27
URL https://arxiv.org/abs/1812.10705v3
PDF https://arxiv.org/pdf/1812.10705v3.pdf
PWC https://paperswithcode.com/paper/surface-networks-via-general-covers
Repo
Framework

State-Denoised Recurrent Neural Networks

Title State-Denoised Recurrent Neural Networks
Authors Michael C. Mozer, Denis Kazakov, Robert V. Lindsey
Abstract Recurrent neural networks (RNNs) are difficult to train on sequence processing tasks, not only because input noise may be amplified through feedback, but also because any inaccuracy in the weights has similar consequences as input noise. We describe a method for denoising the hidden state during training to achieve more robust representations, thereby improving generalization performance. Attractor dynamics are incorporated into the hidden state to "clean up" representations at each step of a sequence. The attractor dynamics are trained through an auxiliary denoising loss to recover previously experienced hidden states from noisy versions of those states. This state-denoised recurrent neural network (SDRNN) performs multiple steps of internal processing for each external sequence step. On a range of tasks, we show that the SDRNN outperforms a generic RNN as well as a variant of the SDRNN with attractor dynamics on the hidden state but without the auxiliary loss. We argue that attractor dynamics, and the corresponding connectivity constraints, are an essential component of the deep learning arsenal and should be invoked not only for recurrent networks but also for improving deep feedforward nets and intertask transfer.
Tasks Denoising
Published 2018-05-22
URL http://arxiv.org/abs/1805.08394v2
PDF http://arxiv.org/pdf/1805.08394v2.pdf
PWC https://paperswithcode.com/paper/state-denoised-recurrent-neural-networks
Repo
Framework
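The "clean up" step can be illustrated with a toy attractor network. This sketch uses a binary Hopfield-style sign update as a stand-in for the paper's continuous, loss-trained attractor dynamics; the outer-product weight rule is an assumption for illustration only.

```python
import numpy as np

def attractor_cleanup(h_noisy, W, n_steps=10):
    """Iterate attractor dynamics so a noisy hidden state settles onto a
    stored state. The sign update is a Hopfield-style stand-in for the
    paper's continuous dynamics trained with an auxiliary denoising loss."""
    h = h_noisy.copy()
    for _ in range(n_steps):
        h = np.sign(W @ h)   # pull the state toward the nearest attractor
    return h

rng = np.random.default_rng(1)
target = np.sign(rng.normal(size=16))       # +/-1 pattern standing in for a clean state
W = np.outer(target, target) / len(target)  # outer-product rule stores the pattern

noisy = target + 0.5 * rng.normal(size=16)  # corrupted hidden state
cleaned = attractor_cleanup(noisy, W)       # recovers the stored pattern
```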

Texture Segmentation Based Video Compression Using Convolutional Neural Networks

Title Texture Segmentation Based Video Compression Using Convolutional Neural Networks
Authors Chichen Fu, Di Chen, Edward J. Delp, Zoe Liu, Fengqing Zhu
Abstract There has been a growing interest in using different approaches to improve the coding efficiency of modern video codec in recent years as demand for web-based video consumption increases. In this paper, we propose a model-based approach that uses texture analysis/synthesis to reconstruct blocks in texture regions of a video to achieve potential coding gains using the AV1 codec developed by the Alliance for Open Media (AOM). The proposed method uses convolutional neural networks to extract texture regions in a frame, which are then reconstructed using a global motion model. Our preliminary results show an increase in coding efficiency while maintaining satisfactory visual quality.
Tasks Texture Classification, Video Compression
Published 2018-02-08
URL http://arxiv.org/abs/1802.02992v1
PDF http://arxiv.org/pdf/1802.02992v1.pdf
PWC https://paperswithcode.com/paper/texture-segmentation-based-video-compression
Repo
Framework

Q-Map: Clinical Concept Mining from Clinical Documents

Title Q-Map: Clinical Concept Mining from Clinical Documents
Authors Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala
Abstract Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, and image analytics. Most of the data in the field are well structured and available in numerical or categorical formats that can be used for experiments directly. On the opposite end of the spectrum, however, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature: discharge summaries, clinical notes, and procedural notes are written in human narrative format and have neither a relational model nor any standard grammatical structure. An important step in utilizing these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm indexed on curated knowledge sources, which is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.
Tasks Information Retrieval, Survival Analysis
Published 2018-04-30
URL http://arxiv.org/abs/1804.11149v2
PDF http://arxiv.org/pdf/1804.11149v2.pdf
PWC https://paperswithcode.com/paper/q-map-clinical-concept-mining-from-clinical
Repo
Framework
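The dictionary-indexed string matching at the core of such a system can be sketched as an n-gram lookup against a curated term index. The terms, concept codes, and `mine_concepts` helper below are illustrative placeholders, not the paper's API or actual UMLS content.

```python
import re

# Toy curated knowledge source: surface strings -> concept codes.
# Terms and codes are illustrative placeholders, not real UMLS content.
CONCEPTS = {
    "myocardial infarction": "KB:001",
    "diabetes mellitus": "KB:002",
    "hypertension": "KB:003",
}

def mine_concepts(text, concepts=CONCEPTS, max_ngram=3):
    """Match the text's word n-grams against the indexed term dictionary,
    trying longer phrases first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    for n in range(max_ngram, 0, -1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in concepts:
                found.append((phrase, concepts[phrase]))
    return found

note = "Patient has a history of hypertension and diabetes mellitus."
matches = mine_concepts(note)
```

Because the dictionary lookup is a hash-table probe, the scan stays fast even with a large curated index, which is the property the abstract emphasizes.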

Sparsity in Variational Autoencoders

Title Sparsity in Variational Autoencoders
Authors Andrea Asperti
Abstract When working in high-dimensional latent spaces, the internal encoding of data in Variational Autoencoders becomes naturally sparse. We discuss this known but controversial phenomenon, sometimes referred to as overpruning to emphasize the under-use of the model's capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, greatly reducing the risk of overfitting. In particular, it is a major methodological guide for the correct tuning of the model's capacity: progressively augmenting it to attain sparsity, or conversely reducing the dimension of the network by removing links to zeroed-out neurons. The degree of sparsity crucially depends on the network architecture: for instance, convolutional networks typically show less sparsity, likely due to the tighter relation of features to different spatial regions of the input.
Tasks
Published 2018-12-18
URL http://arxiv.org/abs/1812.07238v3
PDF http://arxiv.org/pdf/1812.07238v3.pdf
PWC https://paperswithcode.com/paper/sparsity-in-variational-autoencoders
Repo
Framework
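The degree of sparsity the abstract describes is commonly measured via the per-dimension KL term of the variational posterior: a latent dimension whose average KL to the prior is near zero has effectively been pruned. A minimal sketch, assuming a diagonal-Gaussian posterior; the 0.01-nat threshold is an assumption, not from the paper.

```python
import numpy as np

def inactive_fraction(mu, logvar, kl_threshold=0.01):
    """Fraction of latent dimensions whose average KL to the N(0, 1) prior
    is negligible, i.e. dimensions the encoder has effectively zeroed out.

    mu, logvar: (n_samples, latent_dim) diagonal-Gaussian posterior params.
    """
    kl_per_dim = 0.5 * (mu**2 + np.exp(logvar) - logvar - 1)  # analytic KL, per dim
    avg_kl = kl_per_dim.mean(axis=0)
    return float((avg_kl < kl_threshold).mean())

rng = np.random.default_rng(2)
n, d = 256, 10
mu = np.zeros((n, d))
logvar = np.zeros((n, d))            # posterior equals prior: all dims inactive
mu[:, :3] = rng.normal(size=(n, 3))  # three dimensions actually carry signal
frac = inactive_fraction(mu, logvar)  # 0.7 of the latent space is unused
```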

Acquisition and use of knowledge over a restricted domain by intelligent agents

Title Acquisition and use of knowledge over a restricted domain by intelligent agents
Authors Juliao Braga, Nizam Omar, Luciana F. Thome
Abstract This short paper describes an architecture for the acquisition and use of knowledge by intelligent agents over a restricted domain of the Internet infrastructure. The proposed architecture is added to an intelligent-agent deployment model over a server that is very useful to Internet Autonomous System administrators. Such servers, which are heavily dependent on arbitrary and occasional updates by humans, become unreliable. This is a position paper that proposes three research questions that are still in progress.
Tasks
Published 2018-05-06
URL http://arxiv.org/abs/1805.02241v1
PDF http://arxiv.org/pdf/1805.02241v1.pdf
PWC https://paperswithcode.com/paper/acquisition-and-use-of-knowledge-over-a
Repo
Framework

Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

Title Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems
Authors Zihao Xiao, Jianfei Chen, Jun Zhu
Abstract Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). To date, their training has been implemented on general-purpose computers (GPCs), which are flexible in programming but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models. The first SNN is a batch algorithm that combines the conventional collapsed Gibbs sampling (CGS) algorithm with an inference SNN to train LDA. The other two SNNs are online algorithms targeting both energy- and storage-limited environments. The two online algorithms are equivalent to training LDA using maximum-a-posteriori estimation and maximizing the semi-collapsed likelihood, respectively. They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable with the GPC algorithms, while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.
Tasks Stochastic Optimization, Topic Models
Published 2018-04-10
URL http://arxiv.org/abs/1804.03578v1
PDF http://arxiv.org/pdf/1804.03578v1.pdf
PWC https://paperswithcode.com/paper/towards-training-probabilistic-topic-models
Repo
Framework
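The conventional collapsed Gibbs sampling baseline that the first SNN builds on can be sketched in a few lines. This is textbook CGS for LDA, not the paper's spiking implementation; the corpus and hyperparameters are illustrative.

```python
import numpy as np

def lda_cgs(docs, n_topics, n_words, n_iters=100, alpha=0.1, beta=0.01, seed=0):
    """Plain collapsed Gibbs sampling for LDA. docs is a list of word-id
    lists; returns the topic-word count matrix."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))   # doc-topic counts
    nkw = np.zeros((n_topics, n_words))     # topic-word counts
    nk = np.zeros(n_topics)                 # tokens per topic
    z = []                                  # topic assignment for every token
    for d, doc in enumerate(docs):
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, t in zip(doc, zd):
            ndk[d, t] += 1; nkw[t, w] += 1; nk[t] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                 # remove current assignment
                ndk[d, t] -= 1; nkw[t, w] -= 1; nk[t] -= 1
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_words * beta)
                t = rng.choice(n_topics, p=p / p.sum())  # resample topic
                z[d][i] = t
                ndk[d, t] += 1; nkw[t, w] += 1; nk[t] += 1
    return nkw

docs = [[0, 1, 0, 1, 0, 1], [2, 3, 2, 3, 2, 3]]  # two toy documents, vocab of 4
topic_word = lda_cgs(docs, n_topics=2, n_words=4)
```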
Adversarial Learning for Fine-grained Image Search

Title Adversarial Learning for Fine-grained Image Search
Authors Kevin Lin, Fan Yang, Qiaosong Wang, Robinson Piramuthu
Abstract Fine-grained image search is still a challenging problem due to the difficulty in capturing subtle differences regardless of pose variations of objects from fine-grained categories. In practice, a dynamic inventory with new fine-grained categories adds another dimension to this challenge. In this work, we propose an end-to-end network, called FGGAN, that learns discriminative representations by implicitly learning a geometric transformation from multi-view images for fine-grained image search. We integrate a generative adversarial network (GAN) that can automatically handle complex view and pose variations by converting them to a canonical view without any predefined transformations. Moreover, in an open-set scenario, our network is able to better match images from unseen and unknown fine-grained categories. Extensive experiments on two public datasets and a newly collected dataset have demonstrated the outstanding robust performance of the proposed FGGAN in both closed-set and open-set scenarios, providing as much as 10% relative improvement compared to baselines.
Tasks Image Retrieval
Published 2018-07-06
URL http://arxiv.org/abs/1807.02247v1
PDF http://arxiv.org/pdf/1807.02247v1.pdf
PWC https://paperswithcode.com/paper/adversarial-learning-for-fine-grained-image
Repo
Framework

SCK: A sparse coding based key-point detector

Title SCK: A sparse coding based key-point detector
Authors Thanh Hong-Phuoc, Yifeng He, Ling Guan
Abstract All current popular hand-crafted key-point detectors, such as Harris corner, MSER, SIFT, and SURF, rely on some specific pre-designed structures for the detection of corners, blobs, or junctions in an image. In this paper, a novel sparse coding based key-point detector which requires no particular pre-designed structures is presented. The key-point detector is based on measuring the complexity level of each block in an image to decide where a key-point should be. The complexity level of a block is defined as the total number of non-zero components of a sparse representation of that block. Generally, a block constructed from more components is more complex and has greater potential to be a good key-point. Experimental results on the Webcam and EF datasets [1, 2] show that the proposed detector achieves significantly higher repeatability than hand-crafted features, and even outperforms the matching scores of the state-of-the-art learning-based detector.
Tasks
Published 2018-02-07
URL http://arxiv.org/abs/1802.02647v5
PDF http://arxiv.org/pdf/1802.02647v5.pdf
PWC https://paperswithcode.com/paper/sck-a-sparse-coding-based-key-point-detector
Repo
Framework
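The block-complexity measure (number of non-zero components in a sparse code) can be approximated with a greedy matching-pursuit sketch. The dictionary, tolerance, and pursuit algorithm below are stand-ins for the paper's actual sparse coding step.

```python
import numpy as np

def sparse_complexity(block, dictionary, tol=1e-6, max_atoms=None):
    """Count the atoms a greedy matching pursuit needs to represent a block:
    a stand-in for the paper's 'number of non-zero sparse components'."""
    residual = np.asarray(block, dtype=float).copy()
    max_atoms = max_atoms or dictionary.shape[1]
    used = 0
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        corr = dictionary.T @ residual                    # correlation per atom
        k = np.argmax(np.abs(corr))
        residual = residual - corr[k] * dictionary[:, k]  # peel off best atom
        used += 1
    return used

D = np.eye(8)                                 # orthonormal toy dictionary
flat_block = 3.0 * D[:, 0]                    # 1 atom: low complexity
busy_block = D[:, 0] + 2 * D[:, 3] - D[:, 5]  # 3 atoms: higher complexity
```

Blocks needing more atoms score as more complex and would be favored as key-point candidates.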

KGCleaner : Identifying and Correcting Errors Produced by Information Extraction Systems

Title KGCleaner : Identifying and Correcting Errors Produced by Information Extraction Systems
Authors Ankur Padia, Frank Ferraro, Tim Finin
Abstract KGCleaner is a framework to identify and correct errors in data produced and delivered by an information extraction system. These tasks have been understudied, and KGCleaner is the first to address both. We introduce a multi-task model that jointly learns to predict whether an extracted relation is credible and to repair it if not. We evaluate our approach and other models as instances of our framework on two collections: a Wikidata corpus of nearly 700K facts and 5M fact-relevant sentences, and a collection of 30K facts from the 2015 TAC Knowledge Base Population task. For credibility classification, a simple, parameter-efficient shallow neural network can achieve an absolute performance gain of 30 $F_1$ points on Wikidata and comparable performance on TAC. For the repair task, a significant performance gain (more than twofold) can be obtained depending on the nature of the dataset and the models.
Tasks Knowledge Base Population
Published 2018-08-14
URL http://arxiv.org/abs/1808.04816v2
PDF http://arxiv.org/pdf/1808.04816v2.pdf
PWC https://paperswithcode.com/paper/kgcleaner-identifying-and-correcting-errors
Repo
Framework

Image declipping with deep networks

Title Image declipping with deep networks
Authors Shachar Honig, Michael Werman
Abstract We present a deep network to recover pixel values lost to clipping. The clipped area of the image is typically a uniform area of minimum or maximum brightness, losing image detail and color fidelity. The degree to which the clipping is visually noticeable depends on the amount by which values were clipped, and the extent of the clipped area. Clipping may occur in any (or all) of the pixel’s color channels. Although clipped pixels are common and occur to some degree in almost every image we tested, current automatic solutions have only partial success in repairing clipped pixels and work only in limited cases such as only with overexposure (not under-exposure) and when some of the color channels are not clipped. Using neural networks and their ability to model natural images allows our neural network, DeclipNet, to reconstruct data in clipped regions producing state of the art results.
Tasks Image Declipping
Published 2018-11-15
URL http://arxiv.org/abs/1811.06277v1
PDF http://arxiv.org/pdf/1811.06277v1.pdf
PWC https://paperswithcode.com/paper/image-declipping-with-deep-networks
Repo
Framework
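Locating clipped pixels, the regions a model like DeclipNet must reconstruct, is straightforward. A minimal sketch assuming 8-bit channels; the saturation thresholds are parameters, not values from the paper.

```python
import numpy as np

def clipped_mask(img, low=0, high=255):
    """Boolean per-channel mask of pixels saturated at the sensor limits,
    i.e. the regions a declipping model would be asked to reconstruct."""
    return (img <= low) | (img >= high)

img = np.array([[[255, 128,   0],   # red overexposed, blue underexposed
                 [ 40,  40,  40]]], dtype=np.uint8)
mask = clipped_mask(img)
```

The mask is per channel, matching the abstract's point that clipping may hit any (or all) of a pixel's color channels.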

XJTLUIndoorLoc: A New Fingerprinting Database for Indoor Localization and Trajectory Estimation Based on Wi-Fi RSS and Geomagnetic Field

Title XJTLUIndoorLoc: A New Fingerprinting Database for Indoor Localization and Trajectory Estimation Based on Wi-Fi RSS and Geomagnetic Field
Authors Zhenghang Zhong, Zhe Tang, Xiangxing Li, Tiancheng Yuan, Yang Yang, Meng Wei, Yuanyuan Zhang, Renzhi Sheng, Naomi Grant, Chongfeng Ling, Xintao Huan, Kyeong Soo Kim, Sanghyuk Lee
Abstract In this paper, we present a new location fingerprinting database comprised of Wi-Fi received signal strength (RSS) and geomagnetic field intensity measured with multiple devices at a multi-floor building in Xi'an Jiaotong-Liverpool University, Suzhou, China. We also provide preliminary results of localization and trajectory estimation based on convolutional neural network (CNN) and long short-term memory (LSTM) network with this database. For localization, we map RSS data for a reference point to an image-like, two-dimensional array and then apply CNN which is popular in image and video analysis and recognition. For trajectory estimation, we use a modified random way point model to efficiently generate continuous step traces imitating human walking and train a stacked two-layer LSTM network with the generated data to remember the changing pattern of geomagnetic field intensity against (x,y) coordinates. Experimental results demonstrate the usefulness of our new database and the feasibility of the CNN and LSTM-based localization and trajectory estimation with the database.
Tasks
Published 2018-10-17
URL http://arxiv.org/abs/1810.07377v1
PDF http://arxiv.org/pdf/1810.07377v1.pdf
PWC https://paperswithcode.com/paper/xjtluindoorloc-a-new-fingerprinting-database
Repo
Framework
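The RSS-to-image mapping for the CNN can be sketched as reshaping a per-access-point fingerprint into a square array. The padding value and row-major layout are assumptions; the paper's exact mapping may differ.

```python
import numpy as np

def rss_to_image(rss, side=None, missing=-110.0):
    """Reshape a Wi-Fi RSS fingerprint (one reading per access point) into
    a square 2D array for a CNN, padding absent cells with a floor value."""
    rss = np.asarray(rss, dtype=float)
    side = side or int(np.ceil(np.sqrt(len(rss))))
    img = np.full(side * side, missing)
    img[: len(rss)] = rss                 # row-major fill; rest stays padded
    return img.reshape(side, side)

fingerprint = [-45, -60, -72, -80, -55]   # RSS in dBm for 5 access points
img = rss_to_image(fingerprint)           # 3x3 array, 4 cells padded
```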

Complexity of Training ReLU Neural Network

Title Complexity of Training ReLU Neural Network
Authors Digvijay Boob, Santanu S. Dey, Guanghui Lan
Abstract In this paper, we explore some basic questions on the complexity of training neural networks with the ReLU activation function. We show that it is NP-hard to train a two-hidden-layer feedforward ReLU neural network. If the dimension d of the data is fixed, then we show that there exists a polynomial-time algorithm for the same training problem. We also show that if sufficient over-parameterization is provided in the first hidden layer of the ReLU neural network, then there is a polynomial-time algorithm which finds weights such that the output of the over-parameterized ReLU neural network matches the output of the given data.
Tasks
Published 2018-09-27
URL http://arxiv.org/abs/1809.10787v1
PDF http://arxiv.org/pdf/1809.10787v1.pdf
PWC https://paperswithcode.com/paper/complexity-of-training-relu-neural-network
Repo
Framework

struc2gauss: Structure Preserving Network Embedding via Gaussian Embedding

Title struc2gauss: Structure Preserving Network Embedding via Gaussian Embedding
Authors Yulong Pei, Xin Du, Jianpeng Zhang, George Fletcher, Mykola Pechenizkiy
Abstract Network embedding (NE) plays a principal role in network mining, due to its ability to map nodes into efficient low-dimensional embedding vectors. However, two major limitations exist in state-of-the-art NE methods: structure preservation and uncertainty modeling. Almost all previous methods represent a node as a point in space and focus on local structural information, i.e., neighborhood information. However, neighborhood information does not capture global structural information, and point-vector representations fail to model the uncertainty of node representations. In this paper, we propose a new NE framework, struc2gauss, which learns node representations in the space of Gaussian distributions and performs network embedding based on global structural information. struc2gauss first employs a given node similarity metric to measure global structural information, then generates structural context for nodes, and finally learns node representations via Gaussian embedding. Different structural similarity measures of networks and energy functions of Gaussian embedding are investigated. Experiments conducted on both synthetic and real-world data sets demonstrate that struc2gauss effectively captures global structural information, while state-of-the-art network embedding methods fail to do so; it outperforms other methods on the structure-based clustering task and provides more information on the uncertainties of node representations.
Tasks Network Embedding
Published 2018-05-25
URL http://arxiv.org/abs/1805.10043v1
PDF http://arxiv.org/pdf/1805.10043v1.pdf
PWC https://paperswithcode.com/paper/struc2gauss-structure-preserving-network
Repo
Framework
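Gaussian-embedding methods typically score node pairs with the KL divergence between their distributions, whose asymmetry and variance terms carry the uncertainty information the abstract mentions. A minimal sketch for diagonal Gaussians; the specific energy function used by struc2gauss may differ.

```python
import numpy as np

def kl_diag_gauss(mu1, var1, mu2, var2):
    """KL( N(mu1, diag(var1)) || N(mu2, diag(var2)) ). The asymmetry and the
    variance terms let a Gaussian embedding express a node's uncertainty,
    not just its position."""
    return 0.5 * np.sum(
        var1 / var2 + (mu2 - mu1) ** 2 / var2 - 1.0 + np.log(var2 / var1)
    )

mu = np.array([0.0, 0.0])
var = np.array([1.0, 1.0])
same = kl_diag_gauss(mu, var, mu, var)        # 0: identical embeddings
far = kl_diag_gauss(mu, var, mu + 2.0, var)   # grows as the means separate
```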