Paper Group ANR 322
Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip
Title | Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip |
Authors | Feiwen Zhu, Jeff Pool, Michael Andersch, Jeremy Appleyard, Fung Xie |
Abstract | Recurrent Neural Networks (RNNs) are powerful tools for solving sequence-based problems, but their efficacy and execution time are dependent on the size of the network. Following recent work in simplifying these networks with model pruning and a novel mapping of work onto GPUs, we design an efficient implementation for sparse RNNs. We investigate several optimizations and tradeoffs: Lamport timestamps, wide memory loads, and a bank-aware weight layout. With these optimizations, we achieve speedups of over 6x over the next best algorithm for a hidden layer of size 2304, batch size of 4, and a density of 30%. Further, our technique allows for models of over 5x the size to fit on a GPU for a speedup of 2x, enabling larger networks to help advance the state-of-the-art. We perform case studies on NMT and speech recognition tasks in the appendix, accelerating their recurrent layers by up to 3x. |
Tasks | Speech Recognition |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10223v1 |
PDF | http://arxiv.org/pdf/1804.10223v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-persistent-rnns-squeezing-large |
Repo | |
Framework | |
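The core computational win is easy to see even off-GPU. Below is a minimal NumPy/SciPy sketch — not the paper's persistent-kernel implementation, which keeps the pruned weights resident in on-chip registers and uses Lamport timestamps for synchronization — showing how 30% density shrinks both the per-step work and the weight footprint that must stay on-chip.

```python
# A CPU-side illustration of a pruned recurrent layer at the abstract's
# sizes (hidden 2304, batch 4, density 30%). Not the authors' GPU kernel.
import numpy as np
from scipy.sparse import random as sparse_random

hidden, batch, density = 2304, 4, 0.30
W = sparse_random(hidden, hidden, density=density, format="csr",
                  dtype=np.float32)                      # pruned recurrent weights
U = (np.random.randn(hidden, hidden) * 0.01).astype(np.float32)  # input weights
h = np.zeros((hidden, batch), dtype=np.float32)
x = np.random.randn(hidden, batch).astype(np.float32)

for t in range(16):                                      # a short sequence
    # The sparse matmul touches only ~density * hidden^2 weights per step;
    # a persistent kernel keeps W on-chip instead of re-reading it from
    # DRAM at every time step.
    h = np.tanh(W @ h + U @ x)

print(h.shape, f"{W.nnz} nonzeros vs {hidden * hidden} dense")
```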
Efficient active learning of sparse halfspaces
Title | Efficient active learning of sparse halfspaces |
Authors | Chicheng Zhang |
Abstract | We study the problem of efficient PAC active learning of homogeneous linear classifiers (halfspaces) in $\mathbb{R}^d$, where the goal is to learn a halfspace with low error using as few label queries as possible. Under the extra assumption that there is a $t$-sparse halfspace that performs well on the data ($t \ll d$), we would like our active learning algorithm to be {\em attribute efficient}, i.e. to have label requirements sublinear in $d$. In this paper, we provide a computationally efficient algorithm that achieves this goal. Under certain distributional assumptions on the data, our algorithm achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac 1 \epsilon))$. In contrast, existing algorithms in this setting are either computationally inefficient, or subject to label requirements polynomial in $d$ or $\frac 1 \epsilon$. |
Tasks | Active Learning |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02350v2 |
PDF | http://arxiv.org/pdf/1805.02350v2.pdf |
PWC | https://paperswithcode.com/paper/efficient-active-learning-of-sparse |
Repo | |
Framework | |
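To make the two ingredients concrete, here is a toy sketch — a caricature, not Zhang's algorithm or its guarantees — combining margin-based querying (labels are requested only for points near the current boundary) with hard thresholding (the iterate is kept $t$-sparse, which is what lets label complexity scale with $t$ rather than $d$):

```python
# Toy margin-based active learner with hard thresholding. All constants
# (margin schedule, query budget) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, t, n = 100, 5, 2000
w_star = np.zeros(d); w_star[:t] = 1.0 / np.sqrt(t)   # hidden t-sparse target
X = rng.standard_normal((n, d))

def label(x):                        # the oracle we pay for each query
    return np.sign(x @ w_star)

w = np.zeros(d); w[0] = 1.0
margin, queries = 0.5, 0
for _ in range(20):
    near = np.abs(X @ w) < margin    # query only near the boundary
    for x in X[near][:50]:
        y = label(x); queries += 1
        if y * (x @ w) <= 0:         # perceptron-style update on mistakes
            w = w + y * x
    keep = np.argsort(np.abs(w))[-t:]              # hard-threshold to t-sparse
    w_sparse = np.zeros(d); w_sparse[keep] = w[keep]
    w = w_sparse / (np.linalg.norm(w_sparse) + 1e-12)
    margin *= 0.8                    # shrink the query region each round

err = np.mean(np.sign(X @ w) != np.sign(X @ w_star))
print(f"queries={queries}, error={err:.3f}")
```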
SQuantizer: Simultaneous Learning for Both Sparse and Low-precision Neural Networks
Title | SQuantizer: Simultaneous Learning for Both Sparse and Low-precision Neural Networks |
Authors | Mi Sun Park, Xiaofan Xu, Cormac Brick |
Abstract | Deep neural networks have achieved state-of-the-art accuracies in a wide range of computer vision, speech recognition, and machine translation tasks. However, the limits of memory bandwidth and computational power constrain the range of devices capable of deploying these modern networks. To address this problem, we propose SQuantizer, a new training method that jointly optimizes for both sparse and low-precision neural networks while maintaining high accuracy and providing a high compression rate. This approach brings sparsification and low-bit quantization into a single training pass, employing these techniques in an order demonstrated to be optimal. Our method achieves state-of-the-art accuracies using 4-bit and 2-bit precision for ResNet18, MobileNet-v2 and ResNet50, even with a high degree of sparsity. Compression rates of 18x for ResNet18, 17x for ResNet50, and 9x for MobileNet-v2 are obtained when SQuantizing both weights and activations, within 1% and 2% accuracy loss for the ResNets and MobileNet-v2, respectively. An extension of these techniques to object detection also demonstrates high accuracy on YOLO-v3. Additionally, our method allows for fast single-pass training, which is important for rapid prototyping and neural architecture search techniques. Finally, extensive results from this simultaneous training approach allow us to draw some useful insights into the relative merits of sparsity and quantization. |
Tasks | Machine Translation, Neural Architecture Search, Object Detection, Quantization, Speech Recognition |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08301v2 |
PDF | http://arxiv.org/pdf/1812.08301v2.pdf |
PWC | https://paperswithcode.com/paper/squantizer-simultaneous-learning-for-both |
Repo | |
Framework | |
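A hedged sketch of the weight transform the abstract describes — sparsify, then quantize the survivors, in one pass. This shows the forward computation only; the actual method trains through this transform (typically via a straight-through estimator), and the exact thresholds and quantizer below are assumptions:

```python
# Magnitude pruning followed by symmetric uniform quantization of the
# surviving weights. Illustrative only, not the authors' exact scheme.
import numpy as np

def squantize(w, sparsity=0.5, bits=4):
    """Prune the smallest-|w| fraction, then quantize the rest to `bits`."""
    thresh = np.quantile(np.abs(w).ravel(), sparsity)   # pruning threshold
    mask = np.abs(w) > thresh
    survivors = w * mask
    # symmetric uniform quantizer with 2^(bits-1) - 1 positive levels
    scale = np.max(np.abs(survivors)) / (2 ** (bits - 1) - 1)
    q = np.round(survivors / scale) * scale
    return q * mask, mask

w = np.random.randn(64, 64).astype(np.float32)
wq, mask = squantize(w, sparsity=0.5, bits=4)
print(f"density={mask.mean():.2f}, distinct levels={np.unique(wq).size}")
```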
Using Neural Networks to Generate Information Maps for Mobile Sensors
Title | Using Neural Networks to Generate Information Maps for Mobile Sensors |
Authors | Louis Dressel, Mykel J. Kochenderfer |
Abstract | Target localization is a critical task for mobile sensors and has many applications. However, generating informative trajectories for these sensors is a challenging research problem. A common method uses information maps that estimate the value of taking measurements from any point in the sensor state space. These information maps are used to generate trajectories; for example, a trajectory might be designed so its distribution of measurements matches the distribution of the information map. Regardless of the trajectory generation method, generating information maps as new observations are made is critical. However, it can be challenging to compute these maps in real-time. We propose using convolutional neural networks to generate information maps from a target estimate and sensor model in real-time. Simulations show that maps are accurately rendered while offering orders of magnitude reduction in computation time. |
Tasks | |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.10012v1 |
PDF | http://arxiv.org/pdf/1809.10012v1.pdf |
PWC | https://paperswithcode.com/paper/using-neural-networks-to-generate-information |
Repo | |
Framework | |
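As an illustration of the setup (the framework, channel counts, and grid size below are assumptions, not the paper's architecture), a small fully convolutional network can map a gridded target belief plus sensor-state channels to a per-cell information value in a single forward pass, replacing an expensive per-cell information computation:

```python
# Sketch: CNN from belief/sensor grids to an information map. Hypothetical
# layer sizes; the point is the single fast forward pass.
import torch
import torch.nn as nn

class InfoMapNet(nn.Module):
    def __init__(self, in_channels=2):           # e.g. belief grid + sensor mask
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),                  # one info value per grid cell
        )

    def forward(self, x):
        return self.net(x)

belief = torch.rand(1, 2, 64, 64)                 # batch of belief + sensor channels
info_map = InfoMapNet()(belief)                   # one forward pass
print(info_map.shape)                             # torch.Size([1, 1, 64, 64])
```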
Probabilistic Attribute Tree in Convolutional Neural Networks for Facial Expression Recognition
Title | Probabilistic Attribute Tree in Convolutional Neural Networks for Facial Expression Recognition |
Authors | Jie Cai, Zibo Meng, Ahmed Shehab Khan, Zhiyuan Li, James O’Reilly, Yan Tong |
Abstract | In this paper, we propose a novel Probabilistic Attribute Tree-CNN (PAT-CNN) to explicitly deal with the large intra-class variations caused by identity-related attributes, e.g., age, race, and gender. Specifically, a novel PAT module with an associated PAT loss is proposed to learn features in a hierarchical tree structure organized according to attributes, where the final features are less affected by the attributes. Then, expression-related features are extracted from leaf nodes. Samples are probabilistically assigned to tree nodes at different levels such that expression-related features can be learned from all samples weighted by probabilities. We further propose a semi-supervised strategy to learn the PAT-CNN from limited attribute-annotated samples to make the best use of available data. Experimental results on five facial expression datasets have demonstrated that the proposed PAT-CNN outperforms the baseline models by explicitly modeling attributes. More impressively, the PAT-CNN using a single model achieves the best performance for faces in the wild on the SFEW dataset, compared with the state-of-the-art methods using an ensemble of hundreds of CNNs. |
Tasks | Facial Expression Recognition |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.07067v1 |
PDF | http://arxiv.org/pdf/1812.07067v1.pdf |
PWC | https://paperswithcode.com/paper/probabilistic-attribute-tree-in-convolutional |
Repo | |
Framework | |
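The probabilistic assignment can be sketched compactly. In the toy code below (shapes, a single attribute branch, and linear heads are all assumptions, not the PAT module itself), every sample contributes to every tree node's expression loss, weighted by its soft attribute probability, so no sample is hard-routed away from any node:

```python
# Soft attribute assignment weighting per-node expression losses.
import torch
import torch.nn.functional as F

batch, feat_dim, n_nodes, n_classes = 8, 128, 3, 7   # hypothetical sizes
features = torch.randn(batch, feat_dim)
attr_logits = torch.randn(batch, n_nodes)            # predicted attribute branch
node_heads = [torch.nn.Linear(feat_dim, n_classes) for _ in range(n_nodes)]
labels = torch.randint(0, n_classes, (batch,))

p_node = F.softmax(attr_logits, dim=1)               # soft assignment to nodes
loss = 0.0
for k, head in enumerate(node_heads):
    per_sample = F.cross_entropy(head(features), labels, reduction="none")
    loss = loss + (p_node[:, k] * per_sample).mean() # probability-weighted loss
print(float(loss))
```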
Assessing a mobile-based deep learning model for plant disease surveillance
Title | Assessing a mobile-based deep learning model for plant disease surveillance |
Authors | Amanda Ramcharan, Peter McCloskey, Kelsee Baranowski, Neema Mbilinyi, Latifa Mrisho, Mathias Ndalahwa, James Legg, David Hughes |
Abstract | Convolutional neural network models (CNNs) have made major advances in computer vision tasks in the last five years. Given the challenge of collecting real-world datasets, most studies report performance metrics based on available research datasets. In scenarios where CNNs are to be deployed on images or videos from mobile devices, models are presented with new challenges due to lighting, angle, and camera specifications, which are not accounted for in research datasets. It is essential for assessment to also be conducted on real-world datasets if such models are to be reliably integrated with products and services in society. Plant disease datasets can be used to test CNNs in real time and gain insight into real-world performance. We train a CNN object detection model to identify foliar symptoms of diseases (or lack thereof) in cassava (Manihot esculenta Crantz). We then deploy the model on a mobile app and test its performance on mobile images and video of 720 diseased leaflets in an agricultural field in Tanzania. Within each disease category we test two levels of symptom severity, mild and pronounced, to assess the model's performance for early detection of symptoms. At both severities we see a decrease in the F-1 score for real-world images and video. The F-1 score dropped by 32% for pronounced symptoms in real-world images (the closest data to the training data) due to a drop in model recall. If the potential of smartphone CNNs is to be realized, our data suggest it is crucial to consider tuning precision and recall in order to achieve the desired performance in real-world settings. In addition, the varied performance across input types (image or video) is an important consideration for the design of CNNs in real-world applications. |
Tasks | Object Detection |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.08692v1 |
PDF | http://arxiv.org/pdf/1805.08692v1.pdf |
PWC | https://paperswithcode.com/paper/assessing-a-mobile-based-deep-learning-model |
Repo | |
Framework | |
Layer Flexible Adaptive Computational Time for Recurrent Neural Networks
Title | Layer Flexible Adaptive Computational Time for Recurrent Neural Networks |
Authors | Lida Zhang, Diego Klabjan |
Abstract | Deep recurrent neural networks perform well on sequence data and are the model of choice. However, deciding the number of layers is a daunting task, especially since tasks within a sequence can differ in difficulty and thus in computational needs. We propose a layer flexible recurrent neural network with adaptive computation time, and expand it to a sequence to sequence model. Contrary to the adaptive computation time model, our model has a dynamic number of transmission states which vary by step and sequence. We evaluate the model on a financial data set and Wikipedia language modeling. Experimental results show a performance improvement of 8% to 12% and indicate the model’s ability to dynamically change the number of layers. |
Tasks | Language Modelling |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02335v4 |
PDF | https://arxiv.org/pdf/1812.02335v4.pdf |
PWC | https://paperswithcode.com/paper/layer-flexible-adaptive-computational-time |
Repo | |
Framework | |
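For readers unfamiliar with the mechanism being extended, here is a compact sketch of Graves-style adaptive computation time, on which the paper builds (the layer-flexible transmission states of this model are more involved): each time step applies the cell a variable number of times, halting once the accumulated halting probability crosses $1-\epsilon$.

```python
# Minimal ACT-style halting loop over a GRU cell. Sizes are illustrative.
import torch
import torch.nn as nn

hidden, eps, max_ponder = 32, 0.01, 8
cell = nn.GRUCell(hidden, hidden)
halt = nn.Linear(hidden, 1)

def act_step(x, h):
    total_p, weighted_h, steps = 0.0, torch.zeros_like(h), 0
    while steps < max_ponder:
        h = cell(x, h)
        p = torch.sigmoid(halt(h)).mean().item()     # halting probability
        steps += 1
        if total_p + p >= 1 - eps or steps == max_ponder:
            weighted_h = weighted_h + (1 - total_p) * h   # spend the remainder
            break
        weighted_h = weighted_h + p * h
        total_p += p
    return weighted_h, steps

x, h = torch.randn(1, hidden), torch.zeros(1, hidden)
h, n_layers = act_step(x, h)
print("layer applications this step:", n_layers)
```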
On the Metric Distortion of Embedding Persistence Diagrams into separable Hilbert spaces
Title | On the Metric Distortion of Embedding Persistence Diagrams into separable Hilbert spaces |
Authors | Mathieu Carriere, Ulrich Bauer |
Abstract | Persistence diagrams are important descriptors in Topological Data Analysis. Due to the nonlinearity of the space of persistence diagrams equipped with their {\em diagram distances}, most of the recent attempts at using persistence diagrams in machine learning have been done through kernel methods, i.e., embeddings of persistence diagrams into Reproducing Kernel Hilbert Spaces, in which all computations can be performed easily. Since persistence diagrams enjoy theoretical stability guarantees for the diagram distances, the {\em metric properties} of the feature map, i.e., the relationship between the Hilbert distance and the diagram distances, are of central interest for understanding if the persistence diagram guarantees carry over to the embedding. In this article, we study the possibility of embedding persistence diagrams into separable Hilbert spaces, with bi-Lipschitz maps. In particular, we show that for several stable embeddings into infinite-dimensional Hilbert spaces defined in the literature, any lower bound must depend on the cardinalities of the persistence diagrams, and that when the Hilbert space is finite dimensional, finding a bi-Lipschitz embedding is impossible, even when restricting the persistence diagrams to have bounded cardinalities. |
Tasks | Topological Data Analysis |
Published | 2018-06-19 |
URL | https://arxiv.org/abs/1806.06924v3 |
PDF | https://arxiv.org/pdf/1806.06924v3.pdf |
PWC | https://paperswithcode.com/paper/on-the-metric-distortion-of-embedding |
Repo | |
Framework | |
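For reference, the property at stake can be stated in one line. A feature map $\Phi$ from persistence diagrams into a Hilbert space $H$ is bi-Lipschitz with respect to a diagram distance $d$ if there are constants $0 < c \le C$ such that for all diagrams $D_1, D_2$:

$$c \, d(D_1, D_2) \;\le\; \|\Phi(D_1) - \Phi(D_2)\|_H \;\le\; C \, d(D_1, D_2).$$

The paper's negative results say that for several stable embeddings the lower constant $c$ must degrade with the diagrams' cardinalities, and that no such pair $(c, C)$ exists at all when $H$ is finite-dimensional.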
Markov Property in Generative Classifiers
Title | Markov Property in Generative Classifiers |
Authors | Gherardo Varando, Concha Bielza, Pedro Larrañaga, Eva Riccomagno |
Abstract | We show that, for generative classifiers, conditional independence corresponds to linear constraints for the induced discrimination functions. Discrimination functions of undirected Markov network classifiers can thus be characterized by sets of linear constraints. These constraints are represented by a second order finite difference operator over functions of categorical variables. As an application we study the expressive power of generative classifiers under the undirected Markov property and we present a general method to combine discriminative and generative classifiers. |
Tasks | |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04759v1 |
PDF | http://arxiv.org/pdf/1811.04759v1.pdf |
PWC | https://paperswithcode.com/paper/markov-property-in-generative-classifiers |
Repo | |
Framework | |
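To make the linear-constraint correspondence concrete (a minimal two-class instance, not the paper's general statement): for categorical variables $X, Y, Z$, write the discrimination function as $f(x,y,z) = \log p(x,y,z,c_1) - \log p(x,y,z,c_0)$. If $X \perp Y \mid Z$ holds in both class-conditional distributions, then $f$ decomposes as $a(x,z) + b(y,z)$, so its mixed second-order finite difference vanishes:

$$f(x,y,z) - f(x',y,z) - f(x,y',z) + f(x',y',z) = 0 \quad \text{for all } x, x', y, y', z,$$

and constraints of this form are exactly linear in $f$.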
Single Stream Parallelization of Recurrent Neural Networks for Low Power and Fast Inference
Title | Single Stream Parallelization of Recurrent Neural Networks for Low Power and Fast Inference |
Authors | Wonyong Sung, Jinhwan Park |
Abstract | As neural network algorithms show high performance in many applications, their efficient inference on mobile and embedded systems is of great interest. When a single-stream recurrent neural network (RNN) is executed for a personal user on an embedded system, it demands a large number of DRAM accesses, because the network size is usually much larger than the cache size and the weights of an RNN are used only once at each time step. We overcome this problem by parallelizing the algorithm and executing multiple time steps at a time. This approach also reduces the power consumption by lowering the number of DRAM accesses. QRNN (Quasi-Recurrent Neural Network) and SRU (Simple Recurrent Unit) based recurrent neural networks are used for the implementation. Experiments with the SRU showed speed-ups of about 300% and 930% when the number of time steps executed at once is 4 and 16, respectively, on an ARM CPU based system. |
Tasks | |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1803.11389v1 |
PDF | http://arxiv.org/pdf/1803.11389v1.pdf |
PWC | https://paperswithcode.com/paper/single-stream-parallelization-of-recurrent |
Repo | |
Framework | |
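A sketch of why QRNN/SRU-style cells admit multi-time-step execution: their expensive weight multiplications do not depend on the previous hidden state, so k steps of input can be multiplied by the weights in one batched matmul (one pass over the weights in DRAM), leaving only cheap elementwise recurrence to run sequentially. Dimensions below are illustrative, not the paper's configuration.

```python
# Batch the matmuls of k time steps, then run the elementwise recurrence.
import numpy as np

d, k = 512, 16                                   # hidden size, steps per batch
Wx, Wf = (np.random.randn(d, d).astype(np.float32) * 0.01 for _ in range(2))
x = np.random.randn(k, d).astype(np.float32)     # k consecutive inputs

# One pass over the weights covers all k steps (instead of k passes):
xt = x @ Wx.T                                    # candidate values, all steps
ft = 1 / (1 + np.exp(-(x @ Wf.T)))               # forget gates, all steps

c = np.zeros(d, dtype=np.float32)
for t in range(k):                               # elementwise recurrence only
    c = ft[t] * c + (1 - ft[t]) * xt[t]          # SRU-style cell update
print(c.shape)
```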
Weakly Supervised Instance Segmentation Using Hybrid Network
Title | Weakly Supervised Instance Segmentation Using Hybrid Network |
Authors | Shisha Liao, Yongqing Sun, Chenqiang Gao, Pranav Shenoy K P, Song Mu, Jun Shimamura, Atsushi Sagata |
Abstract | Weakly-supervised instance segmentation, which can greatly reduce the labor and time cost of pixel-level mask annotation, has attracted increasing attention in recent years. The commonly used pipeline first utilizes conventional image segmentation methods to automatically generate initial masks and then uses them to train an off-the-shelf segmentation network in an iterative way. However, the initially generated masks usually contain a notable proportion of invalid masks, mainly caused by small object instances. Directly using these initial masks to train a segmentation model is harmful to performance. To address this problem, we propose a hybrid network in this paper. In our architecture, a principal segmentation network handles the normal samples with valid generated masks. In addition, a complementary branch is added to handle the small and dim objects without valid masks. Experimental results indicate that our method achieves significant performance improvements on both small and large object instances, and outperforms all state-of-the-art methods. |
Tasks | Instance Segmentation, Semantic Segmentation, Weakly-supervised instance segmentation |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.04831v1 |
PDF | http://arxiv.org/pdf/1812.04831v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-instance-segmentation-using |
Repo | |
Framework | |
Training of photonic neural networks through in situ backpropagation
Title | Training of photonic neural networks through in situ backpropagation |
Authors | Tyler W. Hughes, Momchil Minkov, Yu Shi, Shanhui Fan |
Abstract | Recently, integrated optics has gained interest as a hardware platform for implementing machine learning algorithms. Of particular interest are artificial neural networks, since matrix-vector multiplications, which are used heavily in artificial neural networks, can be done efficiently in photonic circuits. The training of an artificial neural network is a crucial step in its application. However, there is currently no efficient protocol for training these networks on the integrated photonics platform. In this work, we introduce a method that enables highly efficient, in situ training of a photonic neural network. We use adjoint variable methods to derive the photonic analogue of the backpropagation algorithm, which is the standard method for computing gradients of conventional neural networks. We further show how these gradients may be obtained exactly by performing intensity measurements within the device. As an application, we demonstrate the training of a numerically simulated photonic artificial neural network. Beyond the training of photonic machine learning implementations, our method may also be of broad interest to experimental sensitivity analysis of photonic systems and the optimization of reconfigurable optics platforms. |
Tasks | |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.09943v1 |
PDF | http://arxiv.org/pdf/1805.09943v1.pdf |
PWC | https://paperswithcode.com/paper/training-of-photonic-neural-networks-through |
Repo | |
Framework | |
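For orientation, a generic adjoint-variable gradient (conventions and signs vary; this is a sketch of the standard pattern, not the paper's exact derivation): for a steady-state linear system $A(\phi)\, e = b$ and a real-valued objective $L(e)$,

$$\frac{dL}{d\phi} = \mathfrak{Re}\!\left[ e_{\mathrm{aj}}^{\mathsf{T}} \, \frac{dA}{d\phi} \, e \right], \qquad \text{where} \quad A^{\mathsf{T}} e_{\mathrm{aj}} = -\left(\frac{\partial L}{\partial e}\right)^{\mathsf{T}}.$$

Two field solutions, the forward field $e$ and the adjoint field $e_{\mathrm{aj}}$, yield the gradient with respect to every parameter $\phi$ at once, which is what makes reading gradients off in-device intensity measurements plausible.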
Staging Human-computer Dialogs: An Application of the Futamura Projections
Title | Staging Human-computer Dialogs: An Application of the Futamura Projections |
Authors | Brandon M. Williams, Saverio Perugini |
Abstract | We demonstrate an application of the Futamura Projections to human-computer interaction, and particularly to staging human-computer dialogs. Specifically, by providing staging analogs to the classical Futamura Projections, we demonstrate that the Futamura Projections can be applied to the staging of human-computer dialogs in addition to the execution of programs. |
Tasks | |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05536v1 |
PDF | http://arxiv.org/pdf/1811.05536v1.pdf |
PWC | https://paperswithcode.com/paper/staging-human-computer-dialogs-an-application |
Repo | |
Framework | |
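For readers new to the Futamura projections, the three classical projections can be sketched with Python callables (a toy illustration: this `mix` just closes over its static argument rather than emitting residual code, and it is not the paper's dialog-staging machinery):

```python
# The three Futamura projections with a toy partial evaluator `mix`.
def mix(program, static):
    """Toy partial evaluator: specialize `program` to its static input."""
    return lambda dynamic: program(static, dynamic)

def interp(source, data):
    """A trivial 'interpreter': here, source is itself a Python function."""
    return source(data)

square = lambda n: n * n

target = mix(interp, square)      # 1st projection: a 'compiled' program
compiler = mix(mix, interp)       # 2nd projection: a 'compiler'
cogen = mix(mix, mix)             # 3rd projection: a compiler generator

print(target(7))                  # 49, via the specialized program
print(compiler(square)(7))        # 49, via the generated 'compiler'
print(cogen(interp)(square)(7))   # 49, via the compiler generator
```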
Proximity Full-Text Search by Means of Additional Indexes with Multi-component Keys: In Pursuit of Optimal Performance
Title | Proximity Full-Text Search by Means of Additional Indexes with Multi-component Keys: In Pursuit of Optimal Performance |
Authors | Alexander B. Veretennikov |
Abstract | Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains query terms near each other, especially if the query terms are frequently occurring words. For each word in a text, we use additional indexes to store information about nearby words that are at distances from the given word of less than or equal to the MaxDistance parameter. We showed that additional indexes with three-component keys can be used to improve the average query execution time by up to 94.7 times if the queries consist of high-frequency occurring words. In this paper, we present a new search algorithm with even more performance gains. We consider several strategies for selecting multi-component key indexes for a specific query and compare these strategies with the optimal strategy. We also present the results of search experiments, which show that three-component key indexes enable much faster searches in comparison with two-component key indexes. This is a pre-print of a contribution “Veretennikov A.B. (2019) Proximity Full-Text Search by Means of Additional Indexes with Multi-component Keys: In Pursuit of Optimal Performance.” published in “Manolopoulos Y., Stupnikov S. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2018. Communications in Computer and Information Science, vol 1003” published by Springer, Cham. This book constitutes the refereed proceedings of the 20th International Conference on Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2018, held in Moscow, Russia, in October 2018. The 9 revised full papers presented together with three invited papers were carefully reviewed and selected from 54 submissions. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-23584-0_7. |
Tasks | Information Retrieval |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.07640v2 |
PDF | https://arxiv.org/pdf/1812.07640v2.pdf |
PWC | https://paperswithcode.com/paper/proximity-full-text-search-by-means-of |
Repo | |
Framework | |
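A small sketch of the indexing idea (the in-memory layout below is an assumption, not the author's on-disk structure, and the paper's main results concern three-component keys): for every word occurrence, record the words within MaxDistance of it under a multi-component key, so proximity queries over frequently occurring terms avoid scanning full postings lists.

```python
# Two-component proximity index: (w1, w2) -> positions where w2 occurs
# within MAX_DISTANCE of w1. Illustrative layout only.
from collections import defaultdict

MAX_DISTANCE = 5

def build_proximity_index(tokens):
    index = defaultdict(list)
    for i, w1 in enumerate(tokens):
        lo = max(0, i - MAX_DISTANCE)
        hi = min(len(tokens), i + MAX_DISTANCE + 1)
        for j in range(lo, hi):
            if j != i:
                index[(w1, tokens[j])].append((i, j))
    return index

text = "to be or not to be that is the question to be".split()
idx = build_proximity_index(text)
print(idx[("to", "be")])          # all near co-occurrences of 'to' and 'be'
```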
PatientEG Dataset: Bringing Event Graph Model with Temporal Relations to Electronic Medical Records
Title | PatientEG Dataset: Bringing Event Graph Model with Temporal Relations to Electronic Medical Records |
Authors | Xuli Liu, Jihao Jin, Qi Wang, Tong Ruan, Yangming Zhou, Daqi Gao, Yichao Yin |
Abstract | Medical activities, such as diagnoses, medicine treatments, and laboratory tests, as well as temporal relations between these activities, are the basic concepts in clinical research. However, the existing relational data model for electronic medical records (EMRs) lacks explicit and accurate semantic definitions of these concepts. This makes query construction inconvenient and query execution inefficient, as multi-table join queries are frequently required. In this paper, we propose a patient event graph (PatientEG) model to capture the characteristics of EMRs. We respectively define five types of medical entities, five types of medical events, and five types of temporal relations. Based on the proposed model, we also construct a PatientEG dataset with 191,294 events, 3,429 distinct entities, and 545,993 temporal relations using EMRs from Shanghai Shuguang hospital. To help normalize entity values, which contain synonyms, hyponyms, and abbreviations, we link them with the Chinese biomedical knowledge graph. With the help of the PatientEG dataset, we are able to conveniently perform complex queries for clinical research such as auxiliary diagnosis and therapeutic effectiveness analysis. In addition, we provide a SPARQL endpoint to access the PatientEG dataset, and the dataset is also publicly available online. We also list several illustrative SPARQL queries on our website. |
Tasks | |
Published | 2018-12-24 |
URL | http://arxiv.org/abs/1812.09905v1 |
PDF | http://arxiv.org/pdf/1812.09905v1.pdf |
PWC | https://paperswithcode.com/paper/patienteg-dataset-bringing-event-graph-model |
Repo | |
Framework | |