Paper Group ANR 473
An On-chip Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses. Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis. Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness. $k$-Nearest Neighbor Augmented Neural Networks for Text Classification. Cros …
An On-chip Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses
Title | An On-chip Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses |
Authors | Aditya Shukla, Udayan Ganguly |
Abstract | Spiking neural networks (SNNs) are being explored in an attempt to mimic the brain's capability to learn and recognize at low power. A crossbar architecture, with a highly scalable Resistive RAM (RRAM) array serving as synaptic weights and neuronal drivers in the periphery, is an attractive option for SNNs. Recognition (akin to reading the synaptic weight) requires a small-amplitude bias applied across the RRAM to minimize conductance change. Learning (akin to writing or updating the synaptic weight) requires large-amplitude bias pulses to produce a conductance change. The contradictory bias-amplitude requirements for performing reading and writing simultaneously and asynchronously, akin to biology, are a major challenge. Solutions suggested in the literature rely on time-division multiplexing of read and write operations based on clocks, or on approximations that ignore reading when it coincides with writing. In this work, we overcome this challenge and present a clock-less approach wherein reading and writing are performed in different frequency domains. This enables learning and recognition simultaneously on an SNN. We validate our scheme in a SPICE circuit simulator by translating a two-layered feed-forward Iris-classifying SNN to demonstrate software-equivalent performance. The system performance is not adversely affected by a voltage dependence of conductance in realistic RRAMs, despite departing from linearity. Overall, our approach enables direct implementation of biological SNN algorithms in hardware. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02699v2 |
http://arxiv.org/pdf/1709.02699v2.pdf | |
PWC | https://paperswithcode.com/paper/an-on-chip-trainable-and-clock-less-spiking |
Repo | |
Framework | |
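The abstract above reports software-equivalent performance for a two-layered feed-forward Iris-classifying SNN. Below is a minimal non-spiking stand-in for that software baseline, assuming a conventional two-layer network trained with scikit-learn; the spiking dynamics, learning rule, and RRAM crossbar from the paper are not modeled here.

```python
# Hypothetical software baseline for the two-layer Iris classifier referenced
# in the abstract; the actual SNN and RRAM circuit are not reproduced.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Scale inputs; in a hardware analogue these would be encoded as spike rates.
scaler = StandardScaler().fit(X_train)

# One hidden layer -> a "two-layered feed-forward" network (hidden + output).
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(scaler.transform(X_train), y_train)
print("Iris test accuracy:", clf.score(scaler.transform(X_test), y_test))
```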
Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis
Title | Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis |
Authors | Athanasios D. Balomenos, Elias S. Manolakos |
Abstract | Cell segmentation and tracking allow us to extract a plethora of cell attributes from bacterial time-lapse cell movies, thus promoting computational modeling and simulation of biological processes down to the single-cell level. However, to successfully analyze complex cell movies, imaging multiple interacting bacterial clones as they grow and merge to generate overcrowded bacterial communities with thousands of cells in the field of view, segmentation results should be near perfect to warrant good tracking results. We introduce here a fully automated closed-loop bio-inspired computational strategy that exploits prior knowledge about the expected structure of a colony's lineage tree to locate and correct segmentation errors in analyzed movie frames. We show that this correction strategy is effective, resulting in improved cell tracking and consequently trustworthy deep colony lineage trees. Our image analysis approach has the unique capability to keep tracking cells even after clonal subpopulations merge in the movie. This enables the reconstruction of the complete Forest of Lineage Trees (FLT) representation of evolving multi-clonal bacterial communities. Moreover, the percentage of valid cell trajectories extracted from the image analysis almost doubles after segmentation correction. This plethora of trustworthy data extracted from a complex cell movie analysis enables single-cell analytics as a tool for addressing compelling questions for human health, such as understanding the role of single-cell stochasticity in antibiotics resistance without losing sight of the inter-cellular interactions and microenvironment effects that may shape it. |
Tasks | Cell Segmentation |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07359v1 |
http://arxiv.org/pdf/1706.07359v1.pdf | |
PWC | https://paperswithcode.com/paper/reconstructing-the-forest-of-lineage-trees-of |
Repo | |
Framework | |
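As a hedged illustration of the prior knowledge exploited above — that a colony's lineage tree should contain only binary cell divisions — here is a small sketch that flags tree nodes violating that expectation. The function name and error criterion are assumptions for illustration, not the paper's actual correction algorithm.

```python
# Hypothetical lineage-tree consistency check: in a division-only lineage tree
# every cell should have 0 children (still alive / left the field of view) or
# exactly 2 children (a division). Any other out-degree suggests a
# segmentation or tracking error worth revisiting in the corresponding frames.
from collections import defaultdict

def find_suspect_cells(edges):
    """edges: list of (parent_id, child_id) pairs produced by the tracker."""
    children = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        nodes.update((parent, child))
    return {n: children[n] for n in nodes if len(children[n]) not in (0, 2)}

# Toy forest: cell 1 divides into 2 and 3; cell 3 appears to have 3 children,
# which is biologically implausible and likely an over-segmentation artefact.
edges = [(1, 2), (1, 3), (3, 4), (3, 5), (3, 6)]
print(find_suspect_cells(edges))   # -> {3: [4, 5, 6]}
```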
Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness
Title | Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness |
Authors | Shuo Liu, Zheng Liu |
Abstract | Object detection is critical for automatic military operations. However, the performance of current object detection algorithms falls short of the requirements of military scenarios. This is mainly because objects are hard to detect due to their indistinguishable appearance and the dramatic changes in object size, which is determined by the distance to the detection sensors. Recent advances in deep learning have achieved promising results in many challenging tasks. The state of the art in object detection is represented by convolutional neural networks (CNNs), such as the Fast R-CNN algorithm. These CNN-based methods improve detection performance significantly on several public generic object detection datasets. However, their performance on detecting small or indistinguishable objects in visible-spectrum images is still insufficient. In this study, we propose a novel detection algorithm for military objects by fusing multi-channel CNNs. We combine spatial, temporal and thermal information by generating a three-channel image, and the channels are fused as CNN feature maps in an unsupervised manner. The backbone of our object detection framework is the Fast R-CNN algorithm, and we utilize a cross-domain transfer learning technique to fine-tune the CNN model on the generated multi-channel images. In the experiments, we validated the proposed method with images from the SENSIAC (Military Sensing Information Analysis Centre) database and compared it with the state of the art. The experimental results demonstrate the effectiveness of the proposed method in terms of both accuracy and computational efficiency. |
Tasks | Object Detection, Transfer Learning |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1712.00075v1 |
http://arxiv.org/pdf/1712.00075v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-channel-cnn-based-object-detection-for |
Repo | |
Framework | |
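A rough numpy sketch of the three-channel image construction described above, stacking spatial, temporal, and thermal information into one image a Fast R-CNN-style detector can consume. The exact channel definitions (grayscale frame, frame difference, thermal frame) are assumptions, not the paper's pipeline.

```python
# Hypothetical multi-channel image composition: one grayscale (spatial) channel,
# one frame-difference (temporal) channel, and one thermal channel are stacked
# into a single 3-channel image for a CNN-based detector.
import numpy as np

def make_multichannel(gray_prev, gray_curr, thermal):
    spatial = gray_curr.astype(np.float32)
    temporal = np.abs(gray_curr.astype(np.float32) - gray_prev.astype(np.float32))
    thermal = thermal.astype(np.float32)

    def norm(c):
        # Normalize each channel to [0, 255] so no modality dominates.
        return 255.0 * (c - c.min()) / (c.max() - c.min() + 1e-8)

    return np.stack([norm(spatial), norm(temporal), norm(thermal)], axis=-1)

rng = np.random.default_rng(0)
frame_prev, frame_curr = rng.random((2, 240, 320))
thermal_frame = rng.random((240, 320))
image = make_multichannel(frame_prev, frame_curr, thermal_frame)
print(image.shape)   # (240, 320, 3)
```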
$k$-Nearest Neighbor Augmented Neural Networks for Text Classification
Title | $k$-Nearest Neighbor Augmented Neural Networks for Text Classification |
Authors | Zhiguo Wang, Wael Hamza, Linfeng Song |
Abstract | In recent years, many deep-learning based models have been proposed for text classification. These models fit the training set well from a statistical point of view, but they lack the capacity to utilize instance-level information from individual instances in the training set. In this work, we propose to enhance neural network models by allowing them to leverage information from the $k$-nearest neighbors (kNN) of the input text. Our model employs a neural network that encodes texts into text embeddings. Moreover, we utilize the $k$-nearest neighbors of the input text as an external memory to capture instance-level information from the training set. The final prediction is made based on features from both the neural network encoder and the kNN memory. Experimental results on several standard benchmark datasets show that our model outperforms the baseline model on all the datasets, and it even beats a very deep neural network model (with 29 layers) on several datasets. Our model also shows superior performance when training instances are scarce and when the training set is severely unbalanced, and it leverages techniques such as semi-supervised training and transfer learning quite well. |
Tasks | Text Classification, Transfer Learning |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07863v1 |
http://arxiv.org/pdf/1708.07863v1.pdf | |
PWC | https://paperswithcode.com/paper/k-nearest-neighbor-augmented-neural-networks |
Repo | |
Framework | |
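A minimal sketch of the kNN-as-external-memory idea described above, assuming TF-IDF embeddings and a simple blend of classifier probabilities with neighbor label votes; the paper's neural encoder and learned fusion are not reproduced, and the tiny corpus below is purely illustrative.

```python
# Hypothetical kNN-augmented classifier: a base model's class probabilities are
# blended with the label distribution of the k nearest training texts, which
# acts as an instance-level "external memory" over the training set.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

train_texts = ["great movie", "terrible plot", "loved the acting", "boring and slow"]
train_labels = np.array([1, 0, 1, 0])

vec = TfidfVectorizer().fit(train_texts)
X_train = vec.transform(train_texts)

base = LogisticRegression().fit(X_train, train_labels)
memory = NearestNeighbors(n_neighbors=2).fit(X_train)

def predict(text, alpha=0.5):
    x = vec.transform([text])
    p_model = base.predict_proba(x)[0]                 # encoder-side evidence
    _, idx = memory.kneighbors(x)                      # instance-level evidence
    votes = np.bincount(train_labels[idx[0]], minlength=2) / idx.shape[1]
    return alpha * p_model + (1 - alpha) * votes       # simple late fusion

print(predict("slow and boring movie"))
```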
Crossmatching variable objects with the Gaia data
Title | Crossmatching variable objects with the Gaia data |
Authors | Lorenzo Rimoldini, Krzysztof Nienartowicz, Maria Süveges, Jonathan Charnas, Leanne P. Guy, Grégory Jevardat de Fombelle, Berry Holl, Isabelle Lecoeur-Taïbi, Nami Mowlavi, Diego Ordóñez-Blanco, Laurent Eyer |
Abstract | Tens of millions of new variable objects are expected to be identified in over a billion time series from the Gaia mission. Crossmatching known variable sources with those from Gaia is crucial to incorporate current knowledge, understand how these objects appear in the Gaia data, train supervised classifiers to recognise known classes, and validate the results of the Variability Processing and Analysis Coordination Unit (CU7) within the Gaia Data Analysis and Processing Consortium (DPAC). The method employed by CU7 to crossmatch variables for the first Gaia data release includes a binary classifier to take into account positional uncertainties, proper motion, targeted variability signals, and artefacts present in the early calibration of the Gaia data. Crossmatching with a classifier makes it possible to automate all those decisions which are typically made during visual inspection. The classifier can be trained with objects characterized by a variety of attributes to ensure similarity in multiple dimensions (astrometry, photometry, time-series features), with no need for a priori transformations to compare different photometric bands, or for predictive models of the motion of objects to compare positions. Other advantages as well as some disadvantages of the method are discussed. Implementation steps from the training to the assessment of the crossmatch classifier and selection of results are described. |
Tasks | Calibration, Time Series |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04165v1 |
http://arxiv.org/pdf/1702.04165v1.pdf | |
PWC | https://paperswithcode.com/paper/crossmatching-variable-objects-with-the-gaia |
Repo | |
Framework | |
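A hedged sketch of crossmatching framed as binary classification, as described above. The pairwise features and the synthetic training pairs below are assumptions for illustration, not the CU7 attribute set or training data.

```python
# Hypothetical crossmatch classifier: each candidate pair (known variable,
# Gaia source) is described by a few similarity features, and a random forest
# decides match vs. non-match in place of hand-tuned positional cuts.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500

# Assumed pairwise features: angular separation (arcsec), magnitude difference,
# and difference of one variability feature (e.g. log-period).
def make_pairs(match):
    sep = rng.exponential(0.2 if match else 2.0, n)
    dmag = rng.normal(0.0, 0.1 if match else 1.0, n)
    dvar = rng.normal(0.0, 0.05 if match else 0.8, n)
    return np.column_stack([sep, np.abs(dmag), np.abs(dvar)])

X = np.vstack([make_pairs(True), make_pairs(False)])
y = np.concatenate([np.ones(n), np.zeros(n)])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
candidate = np.array([[0.3, 0.05, 0.02]])   # one candidate pair to score
print("match probability:", clf.predict_proba(candidate)[0, 1])
```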
Collective Vertex Classification Using Recursive Neural Network
Title | Collective Vertex Classification Using Recursive Neural Network |
Authors | Qiongkai Xu, Qing Wang, Chenchen Xu, Lizhen Qu |
Abstract | Collective classification of vertices is the task of assigning categories to each vertex in a graph based on both vertex attributes and link structure. However, some existing approaches do not use the features of neighbouring vertices properly, due to the noise introduced by these features. In this paper, we propose a graph-based recursive neural network framework for collective vertex classification. In this framework, we generate hidden representations from both the attributes of vertices and the representations of neighbouring vertices via recursive neural networks. Under this framework, we explore two types of recursive neural units: the naive recursive neural unit and the long short-term memory unit. We have conducted experiments on four real-world network datasets. The experimental results show that our framework with the long short-term memory unit achieves better results and outperforms several competitive baseline methods. |
Tasks | |
Published | 2017-01-24 |
URL | http://arxiv.org/abs/1701.06751v1 |
http://arxiv.org/pdf/1701.06751v1.pdf | |
PWC | https://paperswithcode.com/paper/collective-vertex-classification-using |
Repo | |
Framework | |
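A toy numpy sketch of the naive recursive neural unit mentioned above: a vertex's hidden state is computed from its own attributes and the already-computed states of its neighbours. The update rule below follows the usual naive-RNN form and is an assumption, not the paper's exact parameterisation.

```python
# Hypothetical naive recursive unit:
#   h_v = tanh(W x_v + U * mean(h_u for u in N(v)) + b)
# Vertices are processed in an order where the needed neighbour states already
# exist, here simply the index order of a small toy graph.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 4, 8
W = rng.normal(0, 0.1, (d_hid, d_in))
U = rng.normal(0, 0.1, (d_hid, d_hid))
b = np.zeros(d_hid)

features = rng.normal(0, 1, (5, d_in))             # attributes of 5 vertices
neighbours = {0: [], 1: [0], 2: [0, 1], 3: [2], 4: [2, 3]}

h = np.zeros((5, d_hid))
for v in range(5):
    if neighbours[v]:
        h_nbr = h[neighbours[v]].mean(axis=0)      # aggregate neighbour states
    else:
        h_nbr = np.zeros(d_hid)
    h[v] = np.tanh(W @ features[v] + U @ h_nbr + b)

print(h.shape)   # (5, 8) hidden representations, ready for a softmax classifier
```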
Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks
Title | Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks |
Authors | Zhuolin Jiang, Viktor Rozgic, Sancar Adali |
Abstract | Infrared (IR) imaging has the potential to enable more robust action recognition systems than visible-spectrum cameras due to its lower sensitivity to lighting conditions and appearance variability. While action recognition on videos collected from visible-spectrum imaging has received much attention, action recognition in IR videos is significantly less explored. Our objective is to exploit imaging data in this modality for the action recognition task. In this work, we propose a novel two-stream 3D convolutional neural network (CNN) architecture by introducing a discriminative code layer and a corresponding discriminative code loss function. The proposed network processes IR image sequences and IR-based optical flow field sequences. We pretrain the 3D CNN model on the visible-spectrum Sports-1M action dataset and finetune it on the Infrared Action Recognition (InfAR) dataset. To the best of our knowledge, this is the first application of the 3D CNN to action recognition in the IR domain. We conduct an elaborate analysis of different fusion schemes (weighted average, single- and double-layer neural nets) applied to different 3D CNN outputs. Experimental results demonstrate that our approach can achieve state-of-the-art average precision (AP) on the InfAR dataset: (1) the proposed two-stream 3D CNN achieves the best reported AP of 77.5%, and (2) our 3D CNN model applied to the optical flow fields achieves the best reported single-stream AP of 75.42%. |
Tasks | Optical Flow Estimation, Temporal Action Localization |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06709v1 |
http://arxiv.org/pdf/1705.06709v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-spatiotemporal-features-for-infrared |
Repo | |
Framework | |
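One of the fusion schemes analysed above is a weighted average of the two streams' class scores. A minimal sketch, assuming per-class softmax scores from each stream; the weight and class count below are placeholders rather than values from the paper.

```python
# Hypothetical two-stream late fusion: class scores from the IR-image stream
# and the optical-flow stream are combined with a convex weight before argmax.
import numpy as np

def fuse(scores_ir, scores_flow, w_ir=0.5):
    """scores_*: (n_clips, n_classes) softmax outputs of each 3D-CNN stream."""
    fused = w_ir * scores_ir + (1.0 - w_ir) * scores_flow
    return fused.argmax(axis=1)

rng = np.random.default_rng(0)
ir = rng.dirichlet(np.ones(12), size=4)      # 4 clips, 12 InfAR-like classes
flow = rng.dirichlet(np.ones(12), size=4)
print(fuse(ir, flow, w_ir=0.4))
```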
Deep Edge-Aware Saliency Detection
Title | Deep Edge-Aware Saliency Detection |
Authors | Jing Zhang, Yuchao Dai, Fatih Porikli, Mingyi He |
Abstract | There has been profound progress in visual saliency thanks to deep learning architectures; however, three major challenges still hinder detection performance for scenes with complex compositions, multiple salient objects, and salient objects of diverse scales. In particular, the output maps of existing methods remain low in spatial resolution, causing blurred edges due to stride and pooling operations; networks often neglect descriptive statistical and handcrafted priors that could complement saliency detection results; and deep features at different layers remain largely unexploited, waiting to be effectively fused to handle multi-scale salient objects. In this paper, we tackle these issues with a new fully convolutional neural network that jointly learns salient edges and saliency labels in an end-to-end fashion. Our framework first employs convolutional layers that reformulate the detection task as a dense labeling problem, then integrates handcrafted saliency features in a hierarchical manner into lower and higher levels of the deep network to leverage the available information for a multi-scale response, and finally refines the saliency map through dilated convolutions by imposing context. In this way, the salient edge priors are efficiently incorporated and the output resolution is significantly improved while keeping memory requirements low, leading to cleaner and sharper object boundaries. Extensive experimental analyses on ten benchmarks demonstrate that our framework achieves consistently superior performance and attains robustness for complex scenes in comparison to very recent state-of-the-art approaches. |
Tasks | Saliency Detection |
Published | 2017-08-15 |
URL | http://arxiv.org/abs/1708.04366v1 |
http://arxiv.org/pdf/1708.04366v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-edge-aware-saliency-detection |
Repo | |
Framework | |
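The refinement stage above uses dilated convolutions to grow the receptive field without losing resolution. A small numpy sketch of a single 2-D dilated convolution (valid padding), purely to illustrate the operation; it is not the paper's refinement network.

```python
# Hypothetical dilated 2-D convolution: the 3x3 kernel taps are spaced
# `dilation` pixels apart, so the receptive field grows without extra
# parameters and without pooling away spatial resolution.
import numpy as np

def dilated_conv2d(image, kernel, dilation=2):
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dilation + 1          # effective kernel extent
    eff_w = (kw - 1) * dilation + 1
    out_h = image.shape[0] - eff_h + 1
    out_w = image.shape[1] - eff_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

img = np.arange(64.0).reshape(8, 8)
k = np.ones((3, 3)) / 9.0                     # simple averaging kernel
print(dilated_conv2d(img, k, dilation=2).shape)   # (4, 4)
```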
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Title | Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets |
Authors | Wei-Lun Chao, Hexiang Hu, Fei Sha |
Abstract | Visual question answering (Visual QA) has attracted a lot of attention lately, seen essentially as a form of (visual) Turing test that artificial intelligence should strive to achieve. In this paper, we study a crucial component of this task: how can we design good datasets for the task? We focus on the design of multiple-choice based datasets where the learner has to select the right answer from a set of candidates including the target (i.e., the correct one) and the decoys (i.e., the incorrect ones). Through careful analysis of the results attained by state-of-the-art learning models and human annotators on existing datasets, we show that the design of the decoy answers has a significant impact on how and what the learning models learn from the datasets. In particular, the resulting learner can ignore the visual information, the question, or both while still doing well on the task. Inspired by this, we propose automatic procedures to remedy such design deficiencies. We apply the procedures to reconstruct decoy answers for two popular Visual QA datasets as well as to create a new Visual QA dataset from the Visual Genome project, resulting in the largest dataset for this task. Extensive empirical studies show that the design deficiencies have been alleviated in the remedied datasets and that the performance on them is likely a more faithful indicator of the differences among learning models. The datasets are released and publicly available via http://www.teds.usc.edu/website_vqa/. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07121v2 |
http://arxiv.org/pdf/1704.07121v2.pdf | |
PWC | https://paperswithcode.com/paper/being-negative-but-constructively-lessons |
Repo | |
Framework | |
Some observations on computer lip-reading: moving from the dream to the reality
Title | Some observations on computer lip-reading: moving from the dream to the reality |
Authors | Helen L. Bear, Gari Owen, Richard Harvey, Barry-John Theobald |
Abstract | In the quest for greater computer lip-reading performance there are a number of tacit assumptions which are either present in the datasets (high resolution for example) or in the methods (recognition of spoken visual units called visemes for example). Here we review these and other assumptions and show the surprising result that computer lip-reading is not heavily constrained by video resolution, pose, lighting and other practical factors. However, the working assumption that visemes, which are the visual equivalent of phonemes, are the best unit for recognition does need further examination. We conclude that visemes, which were defined over a century ago, are unlikely to be optimal for a modern computer lip-reading system. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01084v1 |
http://arxiv.org/pdf/1710.01084v1.pdf | |
PWC | https://paperswithcode.com/paper/some-observations-on-computer-lip-reading |
Repo | |
Framework | |
Towards an Inferential Lexicon of Event Selecting Predicates for French
Title | Towards an Inferential Lexicon of Event Selecting Predicates for French |
Authors | Ingrid Falk, Fabienne Martin |
Abstract | We present a manually constructed seed lexicon encoding the inferential profiles of French event selecting predicates across different uses. The inferential profile (Karttunen, 1971a) of a verb is designed to capture the inferences triggered by the use of this verb in context. It reflects the influence of the clause-embedding verb on the factuality of the event described by the embedded clause. The resource developed provides evidence for the following three hypotheses: (i) French implicative verbs have an aspect dependent profile (their inferential profile varies with outer aspect), while factive verbs have an aspect independent profile (they keep the same inferential profile with both imperfective and perfective aspect); (ii) implicativity decreases with imperfective aspect: the inferences triggered by French implicative verbs combined with perfective aspect are often weakened when the same verbs are combined with imperfective aspect; (iii) implicativity decreases with an animate (deep) subject: the inferences triggered by a verb which is implicative with an inanimate subject are weakened when the same verb is used with an animate subject. The resource additionally shows that verbs with different inferential profiles display clearly distinct sub-categorisation patterns. In particular, verbs that have both factive and implicative readings are shown to prefer infinitival clauses in their implicative reading, and tensed clauses in their factive reading. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01095v1 |
http://arxiv.org/pdf/1710.01095v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-inferential-lexicon-of-event |
Repo | |
Framework | |
Personalized Classifier Ensemble Pruning Framework for Mobile Crowdsourcing
Title | Personalized Classifier Ensemble Pruning Framework for Mobile Crowdsourcing |
Authors | Shaowei Wang, Liusheng Huang, Pengzhan Wang, Hongli Xu, Wei Yang |
Abstract | Ensemble learning has been widely employed by mobile applications, ranging from environmental sensing to activity recognition. One of the fundamental issues in ensemble learning is the trade-off between classification accuracy and computational cost, which is the goal of ensemble pruning. During crowdsourcing, the centralized aggregator releases ensemble learning models to a large number of mobile participants for task evaluation or as the crowdsourced learning results, while different participants may seek different levels of the accuracy-cost trade-off. However, most existing ensemble pruning approaches consider only one identical level of this trade-off. In this study, we present an efficient ensemble pruning framework for personalized accuracy-cost trade-offs via multi-objective optimization. Specifically, for the commonly used linear-combination style of the trade-off, we provide an objective-mixture optimization to further reduce the number of ensemble candidates. Experimental results show that our framework is highly efficient for personalized ensemble pruning and achieves much better pruning performance with objective-mixture optimization when compared to state-of-the-art approaches. |
Tasks | |
Published | 2017-01-25 |
URL | http://arxiv.org/abs/1701.07166v1 |
http://arxiv.org/pdf/1701.07166v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-classifier-ensemble-pruning |
Repo | |
Framework | |
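A hedged sketch of the linear-combination trade-off mentioned above: each participant supplies a weight between error and cost, and a greedy search keeps only the classifiers worth their cost for that weight. The greedy strategy and uniform cost model are assumptions, not the paper's multi-objective or objective-mixture optimization.

```python
# Hypothetical personalized ensemble pruning with the linear trade-off
#   objective(S) = error(S) + lam * cost(S),
# where lam encodes one participant's accuracy-vs-cost preference.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_tr, y_tr, X_val, y_val = X[:400], y[:400], X[400:], y[400:]

ensemble = BaggingClassifier(DecisionTreeClassifier(), n_estimators=20,
                             random_state=0).fit(X_tr, y_tr)
members = ensemble.estimators_
cost_per_member = 1.0 / len(members)          # assumed uniform inference cost

def error(subset):
    votes = np.mean([m.predict(X_val) for m in subset], axis=0) >= 0.5
    return np.mean(votes != y_val)

def prune(lam):
    chosen = []
    while True:
        best = min((m for m in members if m not in chosen), default=None,
                   key=lambda m: error(chosen + [m]))
        if best is None:
            break
        new_obj = error(chosen + [best]) + lam * cost_per_member * (len(chosen) + 1)
        old_obj = (error(chosen) + lam * cost_per_member * len(chosen)
                   if chosen else 1.0)
        if new_obj >= old_obj:                # adding this member is not worth it
            break
        chosen.append(best)
    return chosen

for lam in (0.05, 0.5):                       # two "participants" with different needs
    subset = prune(lam)
    print(f"lambda={lam}: {len(subset)} members, error={error(subset):.3f}")
```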
A Bayesian Filtering Algorithm for Gaussian Mixture Models
Title | A Bayesian Filtering Algorithm for Gaussian Mixture Models |
Authors | Adrian G. Wills, Johannes Hendriks, Christopher Renton, Brett Ninness |
Abstract | A Bayesian filtering algorithm is developed for a class of state-space systems that can be modelled via Gaussian mixtures. In general, the exact solution to this filtering problem involves an exponential growth in the number of mixture terms and this is handled here by utilising a Gaussian mixture reduction step after both the time and measurement updates. In addition, a square-root implementation of the unified algorithm is presented and this algorithm is profiled on several simulated systems. This includes the state estimation for two non-linear systems that are strictly outside the class considered in this paper. |
Tasks | |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05495v1 |
http://arxiv.org/pdf/1705.05495v1.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-filtering-algorithm-for-gaussian |
Repo | |
Framework | |
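A compact numpy sketch of the filtering recursion described above for a linear-Gaussian state-space model whose prior is a Gaussian mixture: each component receives a Kalman time and measurement update, weights are re-scaled by the measurement likelihood, and a crude reduction step keeps the mixture small. The pruning-based reduction and the toy model are stand-ins for the paper's mixture reduction and square-root implementation.

```python
# Hypothetical Gaussian-mixture filter for x_k = A x_{k-1} + w,  y_k = C x_k + v.
# Each component is propagated with standard Kalman updates; component weights
# are multiplied by the measurement likelihood; low-weight components are then
# pruned (a simple stand-in for proper mixture reduction).
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])        # constant-velocity dynamics
C = np.array([[1.0, 0.0]])                    # position measurement
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])

def gm_filter_step(weights, means, covs, y, max_components=3):
    new_w, new_m, new_P = [], [], []
    for w, m, P in zip(weights, means, covs):
        # Time update.
        m = A @ m
        P = A @ P @ A.T + Q
        # Measurement update.
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        innov = y - C @ m
        lik = np.exp(-0.5 * innov.T @ np.linalg.inv(S) @ innov) / np.sqrt(
            (2 * np.pi) ** len(y) * np.linalg.det(S))
        new_w.append(w * float(lik))
        new_m.append(m + K @ innov)
        new_P.append((np.eye(2) - K @ C) @ P)
    # Reduction: keep the heaviest components and renormalize their weights.
    order = np.argsort(new_w)[::-1][:max_components]
    w = np.array(new_w)[order]
    return w / w.sum(), [new_m[i] for i in order], [new_P[i] for i in order]

# Two-component prior and a single position measurement.
weights = [0.5, 0.5]
means = [np.array([0.0, 1.0]), np.array([5.0, -1.0])]
covs = [np.eye(2), np.eye(2)]
weights, means, covs = gm_filter_step(weights, means, covs, np.array([0.8]))
print(weights)
```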
A Brownian Motion Model and Extreme Belief Machine for Modeling Sensor Data Measurements
Title | A Brownian Motion Model and Extreme Belief Machine for Modeling Sensor Data Measurements |
Authors | Robert A. Murphy |
Abstract | As the title suggests, we will describe (and justify through the presentation of some of the relevant mathematics) prediction methodologies for sensor measurements. This exposition will mainly be concerned with the mathematics related to modeling the sensor measurements. |
Tasks | |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00207v2 |
http://arxiv.org/pdf/1704.00207v2.pdf | |
PWC | https://paperswithcode.com/paper/a-brownian-motion-model-and-extreme-belief |
Repo | |
Framework | |
Mining Deep And-Or Object Structures via Cost-Sensitive Question-Answer-Based Active Annotations
Title | Mining Deep And-Or Object Structures via Cost-Sensitive Question-Answer-Based Active Annotations |
Authors | Quanshi Zhang, Ying Nian Wu, Hao Zhang, Song-Chun Zhu |
Abstract | This paper presents a cost-sensitive active Question-Answering (QA) framework for learning a nine-layer And-Or graph (AOG) from web images. The AOG explicitly represents object categories, poses/viewpoints, parts, and detailed structures within the parts in a compositional hierarchy. The QA framework is designed to minimize an overall risk, which trades off the loss and query costs. The loss is defined for nodes in all layers of the AOG, including the generative loss (measuring the likelihood of the images) and the discriminative loss (measuring the fitness to human answers). The cost comprises both the human labor of answering questions and the computational cost of model learning. The cost-sensitive QA framework iteratively selects different storylines of questions to update different nodes in the AOG. Experiments showed that our method required much less human supervision (e.g., labeling parts on 3–10 training objects for each category) and achieved better performance than baseline methods. |
Tasks | Question Answering |
Published | 2017-08-13 |
URL | http://arxiv.org/abs/1708.03911v3 |
http://arxiv.org/pdf/1708.03911v3.pdf | |
PWC | https://paperswithcode.com/paper/mining-deep-and-or-object-structures-via-cost |
Repo | |
Framework | |
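A toy sketch of the risk trade-off above: at each round the framework asks the question whose expected loss reduction most exceeds its labeling cost. The scoring below is a deliberate simplification; the paper's risk combines generative and discriminative losses over AOG nodes, which is not modeled here, and the candidate list is hypothetical.

```python
# Hypothetical cost-sensitive question selection: pick the question (annotation
# "storyline") maximizing expected loss reduction minus its labeling cost, and
# stop when no remaining question is worth asking or the budget is exhausted.
def select_questions(candidates, budget):
    """candidates: list of (name, expected_loss_reduction, human_cost)."""
    asked, spent = [], 0.0
    remaining = list(candidates)
    while remaining:
        # Net value of each remaining question under the loss-plus-cost risk.
        name, gain, cost = max(remaining, key=lambda q: q[1] - q[2])
        if gain - cost <= 0 or spent + cost > budget:
            break
        asked.append(name)
        spent += cost
        remaining.remove((name, gain, cost))
    return asked, spent

candidates = [
    ("label part on new object", 0.30, 0.10),
    ("confirm detected pose", 0.05, 0.08),
    ("annotate detailed structure", 0.20, 0.15),
]
print(select_questions(candidates, budget=0.3))
```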