Paper Group ANR 473
An On-chip Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses. Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis. Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness. $k$-Nearest Neighbor Augmented Neural Networks for Text Classification. Cros …
An On-chip Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses
Title | An On-chip Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses |
Authors | Aditya Shukla, Udayan Ganguly |
Abstract | Spiking neural networks (SNNs) are being explored in an attempt to mimic the brain's capability to learn and recognize at low power. A crossbar architecture, with a highly scalable Resistive RAM (RRAM) array serving as synaptic weights and neuronal drivers in the periphery, is an attractive option for SNNs. Recognition (akin to reading the synaptic weight) requires a small-amplitude bias applied across the RRAM to minimize conductance change. Learning (akin to writing or updating the synaptic weight) requires large-amplitude bias pulses to produce a conductance change. The contradictory bias-amplitude requirements for performing reading and writing simultaneously and asynchronously, akin to biology, are a major challenge. Solutions suggested in the literature rely on time-division multiplexing of read and write operations based on clocks, or on approximations that ignore reading when it coincides with writing. In this work, we overcome this challenge and present a clock-less approach wherein reading and writing are performed in different frequency domains. This enables learning and recognition simultaneously on an SNN. We validate our scheme in a SPICE circuit simulator by translating a two-layered feed-forward Iris-classifying SNN to demonstrate software-equivalent performance. The system performance is not adversely affected by a voltage dependence of conductance in realistic RRAMs, despite departing from linearity. Overall, our approach enables direct implementation of biological SNN algorithms in hardware. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02699v2 |
http://arxiv.org/pdf/1709.02699v2.pdf | |
PWC | https://paperswithcode.com/paper/an-on-chip-trainable-and-clock-less-spiking |
Repo | |
Framework | |
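The abstract above reports software-equivalent performance for a two-layered feed-forward Iris-classifying SNN. Below is a minimal non-spiking stand-in for that software baseline, assuming a conventional two-layer network trained with scikit-learn; the spiking dynamics, learning rule, and RRAM crossbar from the paper are not modeled here.

```python
# Hypothetical software baseline for the two-layer Iris classifier referenced
# in the abstract; the actual SNN and RRAM circuit are not reproduced.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Scale inputs; in a hardware analogue these would be encoded as spike rates.
scaler = StandardScaler().fit(X_train)

# One hidden layer -> a "two-layered feed-forward" network (hidden + output).
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(scaler.transform(X_train), y_train)
print("Iris test accuracy:", clf.score(scaler.transform(X_test), y_test))
```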
Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis
Title | Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis |
Authors | Athanasios D. Balomenos, Elias S. Manolakos |
Abstract | Cell segmentation and tracking allow us to extract a plethora of cell attributes from bacterial time-lapse cell movies, thus promoting computational modeling and simulation of biological processes down to the single-cell level. However, to successfully analyze complex cell movies, imaging multiple interacting bacterial clones as they grow and merge to generate overcrowded bacterial communities with thousands of cells in the field of view, segmentation results should be near perfect to warrant good tracking results. We introduce here a fully automated closed-loop bio-inspired computational strategy that exploits prior knowledge about the expected structure of a colony's lineage tree to locate and correct segmentation errors in analyzed movie frames. We show that this correction strategy is effective, resulting in improved cell tracking and consequently trustworthy deep colony lineage trees. Our image analysis approach has the unique capability to keep tracking cells even after clonal subpopulations merge in the movie. This enables the reconstruction of the complete Forest of Lineage Trees (FLT) representation of evolving multi-clonal bacterial communities. Moreover, the percentage of valid cell trajectories extracted from the image analysis almost doubles after segmentation correction. This plethora of trustworthy data extracted from a complex cell movie analysis enables single-cell analytics as a tool for addressing compelling questions for human health, such as understanding the role of single-cell stochasticity in antibiotics resistance without losing sight of the inter-cellular interactions and microenvironment effects that may shape it. |
Tasks | Cell Segmentation |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07359v1 |
http://arxiv.org/pdf/1706.07359v1.pdf | |
PWC | https://paperswithcode.com/paper/reconstructing-the-forest-of-lineage-trees-of |
Repo | |
Framework | |
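As a hedged illustration of the prior knowledge exploited above — that a colony's lineage tree should contain only binary cell divisions — here is a small sketch that flags tree nodes violating that expectation. The function name and error criterion are assumptions for illustration, not the paper's actual correction algorithm.

```python
# Hypothetical lineage-tree consistency check: in a division-only lineage tree
# every cell should have 0 children (still alive / left the field of view) or
# exactly 2 children (a division). Any other out-degree suggests a
# segmentation or tracking error worth revisiting in the corresponding frames.
from collections import defaultdict

def find_suspect_cells(edges):
    """edges: list of (parent_id, child_id) pairs produced by the tracker."""
    children = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        nodes.update((parent, child))
    return {n: children[n] for n in nodes if len(children[n]) not in (0, 2)}

# Toy forest: cell 1 divides into 2 and 3; cell 3 appears to have 3 children,
# which is biologically implausible and likely an over-segmentation artefact.
edges = [(1, 2), (1, 3), (3, 4), (3, 5), (3, 6)]
print(find_suspect_cells(edges))   # -> {3: [4, 5, 6]}
```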
Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness
Title | Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness |
Authors | Shuo Liu, Zheng Liu |
Abstract | Object detection is critical for automatic military operations. However, the performance of current object detection algorithms falls short of the requirements of military scenarios. This is mainly because objects are hard to detect due to their indistinguishable appearance and the dramatic changes in object size, which is determined by the distance to the detection sensors. Recent advances in deep learning have achieved promising results in many challenging tasks. The state of the art in object detection is represented by convolutional neural networks (CNNs), such as the Fast R-CNN algorithm. These CNN-based methods improve detection performance significantly on several public generic object detection datasets. However, their performance on detecting small or indistinguishable objects in visible-spectrum images is still insufficient. In this study, we propose a novel detection algorithm for military objects by fusing multi-channel CNNs. We combine spatial, temporal and thermal information by generating a three-channel image, and the channels are fused as CNN feature maps in an unsupervised manner. The backbone of our object detection framework is the Fast R-CNN algorithm, and we utilize a cross-domain transfer learning technique to fine-tune the CNN model on the generated multi-channel images. In the experiments, we validated the proposed method with images from the SENSIAC (Military Sensing Information Analysis Centre) database and compared it with the state of the art. The experimental results demonstrate the effectiveness of the proposed method in terms of both accuracy and computational efficiency. |
Tasks | Object Detection, Transfer Learning |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1712.00075v1 |
http://arxiv.org/pdf/1712.00075v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-channel-cnn-based-object-detection-for |
Repo | |
Framework | |
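A rough numpy sketch of the three-channel image construction described above, stacking spatial, temporal, and thermal information into one image a Fast R-CNN-style detector can consume. The exact channel definitions (grayscale frame, frame difference, thermal frame) are assumptions, not the paper's pipeline.

```python
# Hypothetical multi-channel image composition: one grayscale (spatial) channel,
# one frame-difference (temporal) channel, and one thermal channel are stacked
# into a single 3-channel image for a CNN-based detector.
import numpy as np

def make_multichannel(gray_prev, gray_curr, thermal):
    spatial = gray_curr.astype(np.float32)
    temporal = np.abs(gray_curr.astype(np.float32) - gray_prev.astype(np.float32))
    thermal = thermal.astype(np.float32)

    def norm(c):
        # Normalize each channel to [0, 255] so no modality dominates.
        return 255.0 * (c - c.min()) / (c.max() - c.min() + 1e-8)

    return np.stack([norm(spatial), norm(temporal), norm(thermal)], axis=-1)

rng = np.random.default_rng(0)
frame_prev, frame_curr = rng.random((2, 240, 320))
thermal_frame = rng.random((240, 320))
image = make_multichannel(frame_prev, frame_curr, thermal_frame)
print(image.shape)   # (240, 320, 3)
```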
$k$-Nearest Neighbor Augmented Neural Networks for Text Classification
Title | $k$-Nearest Neighbor Augmented Neural Networks for Text Classification |
Authors | Zhiguo Wang, Wael Hamza, Linfeng Song |
Abstract | In recent years, many deep-learning based models have been proposed for text classification. These models fit the training set well from a statistical point of view, but they lack the capacity to utilize instance-level information from individual instances in the training set. In this work, we propose to enhance neural network models by allowing them to leverage information from the $k$-nearest neighbors (kNN) of the input text. Our model employs a neural network that encodes texts into text embeddings. Moreover, we utilize the $k$-nearest neighbors of the input text as an external memory to capture instance-level information from the training set. The final prediction is made based on features from both the neural network encoder and the kNN memory. Experimental results on several standard benchmark datasets show that our model outperforms the baseline model on all the datasets, and it even beats a very deep neural network model (with 29 layers) on several datasets. Our model also shows superior performance when training instances are scarce and when the training set is severely unbalanced, and it leverages techniques such as semi-supervised training and transfer learning quite well. |
Tasks | Text Classification, Transfer Learning |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07863v1 |
http://arxiv.org/pdf/1708.07863v1.pdf | |
PWC | https://paperswithcode.com/paper/k-nearest-neighbor-augmented-neural-networks |
Repo | |
Framework | |
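A minimal sketch of the kNN-as-external-memory idea described above, assuming TF-IDF embeddings and a simple blend of classifier probabilities with neighbor label votes; the paper's neural encoder and learned fusion are not reproduced, and the tiny corpus below is purely illustrative.

```python
# Hypothetical kNN-augmented classifier: a base model's class probabilities are
# blended with the label distribution of the k nearest training texts, which
# acts as an instance-level "external memory" over the training set.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

train_texts = ["great movie", "terrible plot", "loved the acting", "boring and slow"]
train_labels = np.array([1, 0, 1, 0])

vec = TfidfVectorizer().fit(train_texts)
X_train = vec.transform(train_texts)

base = LogisticRegression().fit(X_train, train_labels)
memory = NearestNeighbors(n_neighbors=2).fit(X_train)

def predict(text, alpha=0.5):
    x = vec.transform([text])
    p_model = base.predict_proba(x)[0]                 # encoder-side evidence
    _, idx = memory.kneighbors(x)                      # instance-level evidence
    votes = np.bincount(train_labels[idx[0]], minlength=2) / idx.shape[1]
    return alpha * p_model + (1 - alpha) * votes       # simple late fusion

print(predict("slow and boring movie"))
```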
Crossmatching variable objects with the Gaia data
Title | Crossmatching variable objects with the Gaia data |
Authors | Lorenzo Rimoldini, Krzysztof Nienartowicz, Maria Süveges, Jonathan Charnas, Leanne P. Guy, Grégory Jevardat de Fombelle, Berry Holl, Isabelle Lecoeur-Taïbi, Nami Mowlavi, Diego Ordóñez-Blanco, Laurent Eyer |
Abstract | Tens of millions of new variable objects are expected to be identified in over a billion time series from the Gaia mission. Crossmatching known variable sources with those from Gaia is crucial to incorporate current knowledge, understand how these objects appear in the Gaia data, train supervised classifiers to recognise known classes, and validate the results of the Variability Processing and Analysis Coordination Unit (CU7) within the Gaia Data Analysis and Processing Consortium (DPAC). The method employed by CU7 to crossmatch variables for the first Gaia data release includes a binary classifier to take into account positional uncertainties, proper motion, targeted variability signals, and artefacts present in the early calibration of the Gaia data. Crossmatching with a classifier makes it possible to automate all those decisions which are typically made during visual inspection. The classifier can be trained with objects characterized by a variety of attributes to ensure similarity in multiple dimensions (astrometry, photometry, time-series features), with no need for a priori transformations to compare different photometric bands, or for predictive models of the motion of objects to compare positions. Other advantages as well as some disadvantages of the method are discussed. Implementation steps from the training to the assessment of the crossmatch classifier and selection of results are described. |
Tasks | Calibration, Time Series |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04165v1 |
http://arxiv.org/pdf/1702.04165v1.pdf | |
PWC | https://paperswithcode.com/paper/crossmatching-variable-objects-with-the-gaia |
Repo | |
Framework | |
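A hedged sketch of crossmatching framed as binary classification, as described above. The pairwise features and the synthetic training pairs below are assumptions for illustration, not the CU7 attribute set or training data.

```python
# Hypothetical crossmatch classifier: each candidate pair (known variable,
# Gaia source) is described by a few similarity features, and a random forest
# decides match vs. non-match in place of hand-tuned positional cuts.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500

# Assumed pairwise features: angular separation (arcsec), magnitude difference,
# and difference of one variability feature (e.g. log-period).
def make_pairs(match):
    sep = rng.exponential(0.2 if match else 2.0, n)
    dmag = rng.normal(0.0, 0.1 if match else 1.0, n)
    dvar = rng.normal(0.0, 0.05 if match else 0.8, n)
    return np.column_stack([sep, np.abs(dmag), np.abs(dvar)])

X = np.vstack([make_pairs(True), make_pairs(False)])
y = np.concatenate([np.ones(n), np.zeros(n)])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
candidate = np.array([[0.3, 0.05, 0.02]])   # one candidate pair to score
print("match probability:", clf.predict_proba(candidate)[0, 1])
```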
Collective Vertex Classification Using Recursive Neural Network
Title | Collective Vertex Classification Using Recursive Neural Network |
Authors | Qiongkai Xu, Qing Wang, Chenchen Xu, Lizhen Qu |
Abstract | Collective classification of vertices is the task of assigning categories to each vertex in a graph based on both vertex attributes and link structure. However, some existing approaches do not use the features of neighbouring vertices properly, due to the noise introduced by these features. In this paper, we propose a graph-based recursive neural network framework for collective vertex classification. In this framework, we generate hidden representations from both the attributes of vertices and the representations of neighbouring vertices via recursive neural networks. Under this framework, we explore two types of recursive neural units: the naive recursive neural unit and the long short-term memory unit. We have conducted experiments on four real-world network datasets. The experimental results show that our framework with the long short-term memory unit achieves better results and outperforms several competitive baseline methods. |
Tasks | |
Published | 2017-01-24 |
URL | http://arxiv.org/abs/1701.06751v1 |
http://arxiv.org/pdf/1701.06751v1.pdf | |
PWC | https://paperswithcode.com/paper/collective-vertex-classification-using |
Repo | |
Framework | |
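A toy numpy sketch of the naive recursive neural unit mentioned above: a vertex's hidden state is computed from its own attributes and the already-computed states of its neighbours. The update rule below follows the usual naive-RNN form and is an assumption, not the paper's exact parameterisation.

```python
# Hypothetical naive recursive unit:
#   h_v = tanh(W x_v + U * mean(h_u for u in N(v)) + b)
# Vertices are processed in an order where the needed neighbour states already
# exist, here simply the index order of a small toy graph.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 4, 8
W = rng.normal(0, 0.1, (d_hid, d_in))
U = rng.normal(0, 0.1, (d_hid, d_hid))
b = np.zeros(d_hid)

features = rng.normal(0, 1, (5, d_in))             # attributes of 5 vertices
neighbours = {0: [], 1: [0], 2: [0, 1], 3: [2], 4: [2, 3]}

h = np.zeros((5, d_hid))
for v in range(5):
    if neighbours[v]:
        h_nbr = h[neighbours[v]].mean(axis=0)      # aggregate neighbour states
    else:
        h_nbr = np.zeros(d_hid)
    h[v] = np.tanh(W @ features[v] + U @ h_nbr + b)

print(h.shape)   # (5, 8) hidden representations, ready for a softmax classifier
```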
Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks
Title | Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks |
Authors | Zhuolin Jiang, Viktor Rozgic, Sancar Adali |
Abstract | Infrared (IR) imaging has the potential to enable more robust action recognition systems than visible-spectrum cameras due to its lower sensitivity to lighting conditions and appearance variability. While action recognition on videos collected from visible-spectrum imaging has received much attention, action recognition in IR videos is significantly less explored. Our objective is to exploit imaging data in this modality for the action recognition task. In this work, we propose a novel two-stream 3D convolutional neural network (CNN) architecture by introducing a discriminative code layer and a corresponding discriminative code loss function. The proposed network processes IR image sequences and IR-based optical flow field sequences. We pretrain the 3D CNN model on the visible-spectrum Sports-1M action dataset and finetune it on the Infrared Action Recognition (InfAR) dataset. To the best of our knowledge, this is the first application of the 3D CNN to action recognition in the IR domain. We conduct an elaborate analysis of different fusion schemes (weighted average, single- and double-layer neural nets) applied to different 3D CNN outputs. Experimental results demonstrate that our approach can achieve state-of-the-art average precision (AP) on the InfAR dataset: (1) the proposed two-stream 3D CNN achieves the best reported AP of 77.5%, and (2) our 3D CNN model applied to the optical flow fields achieves the best reported single-stream AP of 75.42%. |
Tasks | Optical Flow Estimation, Temporal Action Localization |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06709v1 |
http://arxiv.org/pdf/1705.06709v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-spatiotemporal-features-for-infrared |
Repo | |
Framework | |
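One of the fusion schemes analysed above is a weighted average of the two streams' class scores. A minimal sketch, assuming per-class softmax scores from each stream; the weight and class count below are placeholders rather than values from the paper.

```python
# Hypothetical two-stream late fusion: class scores from the IR-image stream
# and the optical-flow stream are combined with a convex weight before argmax.
import numpy as np

def fuse(scores_ir, scores_flow, w_ir=0.5):
    """scores_*: (n_clips, n_classes) softmax outputs of each 3D-CNN stream."""
    fused = w_ir * scores_ir + (1.0 - w_ir) * scores_flow
    return fused.argmax(axis=1)

rng = np.random.default_rng(0)
ir = rng.dirichlet(np.ones(12), size=4)      # 4 clips, 12 InfAR-like classes
flow = rng.dirichlet(np.ones(12), size=4)
print(fuse(ir, flow, w_ir=0.4))
```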
Deep Edge-Aware Saliency Detection
Title | Deep Edge-Aware Saliency Detection |
Authors | Jing Zhang, Yuchao Dai, Fatih Porikli, Mingyi He |
Abstract | There has been profound progress in visual saliency thanks to deep learning architectures; however, three major challenges still hinder detection performance for scenes with complex compositions, multiple salient objects, and salient objects of diverse scales. In particular, the output maps of existing methods remain low in spatial resolution, causing blurred edges due to stride and pooling operations; networks often neglect descriptive statistical and handcrafted priors that could complement saliency detection results; and deep features at different layers remain largely unexploited, waiting to be effectively fused to handle multi-scale salient objects. In this paper, we tackle these issues with a new fully convolutional neural network that jointly learns salient edges and saliency labels in an end-to-end fashion. Our framework first employs convolutional layers that reformulate the detection task as a dense labeling problem, then integrates handcrafted saliency features in a hierarchical manner into lower and higher levels of the deep network to leverage the available information for a multi-scale response, and finally refines the saliency map through dilated convolutions by imposing context. In this way, the salient edge priors are efficiently incorporated and the output resolution is significantly improved while keeping memory requirements low, leading to cleaner and sharper object boundaries. Extensive experimental analyses on ten benchmarks demonstrate that our framework achieves consistently superior performance and attains robustness for complex scenes in comparison to very recent state-of-the-art approaches. |
Tasks | Saliency Detection |
Published | 2017-08-15 |
URL | http://arxiv.org/abs/1708.04366v1 |
http://arxiv.org/pdf/1708.04366v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-edge-aware-saliency-detection |
Repo | |
Framework | |
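The refinement stage above uses dilated convolutions to grow the receptive field without losing resolution. A small numpy sketch of a single 2-D dilated convolution (valid padding), purely to illustrate the operation; it is not the paper's refinement network.

```python
# Hypothetical dilated 2-D convolution: the 3x3 kernel taps are spaced
# `dilation` pixels apart, so the receptive field grows without extra
# parameters and without pooling away spatial resolution.
import numpy as np

def dilated_conv2d(image, kernel, dilation=2):
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dilation + 1          # effective kernel extent
    eff_w = (kw - 1) * dilation + 1
    out_h = image.shape[0] - eff_h + 1
    out_w = image.shape[1] - eff_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

img = np.arange(64.0).reshape(8, 8)
k = np.ones((3, 3)) / 9.0                     # simple averaging kernel
print(dilated_conv2d(img, k, dilation=2).shape)   # (4, 4)
```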
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Title | Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets |
Authors | Wei-Lun Chao, Hexiang Hu, Fei Sha |
Abstract | Visual question answering (Visual QA) has attracted a lot of attention lately, seen essentially as a form of (visual) Turing test that artificial intelligence should strive to achieve. In this paper, we study a crucial component of this task: how can we design good datasets for the task? We focus on the design of multiple-choice based datasets where the learner has to select the right answer from a set of candidates including the target (i.e., the correct one) and the decoys (i.e., the incorrect ones). Through careful analysis of the results attained by state-of-the-art learning models and human annotators on existing datasets, we show that the design of the decoy answers has a significant impact on how and what the learning models learn from the datasets. In particular, the resulting learner can ignore the visual information, the question, or both while still doing well on the task. Inspired by this, we propose automatic procedures to remedy such design deficiencies. We apply the procedures to reconstruct decoy answers for two popular Visual QA datasets as well as to create a new Visual QA dataset from the Visual Genome project, resulting in the largest dataset for this task. Extensive empirical studies show that the design deficiencies have been alleviated in the remedied datasets and that the performance on them is likely a more faithful indicator of the differences among learning models. The datasets are released and publicly available via http://www.teds.usc.edu/website_vqa/. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07121v2 |
http://arxiv.org/pdf/1704.07121v2.pdf | |
PWC | https://paperswithcode.com/paper/being-negative-but-constructively-lessons |
Repo | |
Framework | |
Some observations on computer lip-reading: moving from the dream to the reality
Title | Some observations on computer lip-reading: moving from the dream to the reality |
Authors | Helen L. Bear, Gari Owen, Richard Harvey, Barry-John Theobald |
Abstract | In the quest for greater computer lip-reading performance there are a number of tacit assumptions which are either present in the datasets (high resolution for example) or in the methods (recognition of spoken visual units called visemes for example). Here we review these and other assumptions and show the surprising result that computer lip-reading is not heavily constrained by video resolution, pose, lighting and other practical factors. However, the working assumption that visemes, which are the visual equivalent of phonemes, are the best unit for recognition does need further examination. We conclude that visemes, which were defined over a century ago, are unlikely to be optimal for a modern computer lip-reading system. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01084v1 |
http://arxiv.org/pdf/1710.01084v1.pdf | |
PWC | https://paperswithcode.com/paper/some-observations-on-computer-lip-reading |
Repo | |
Framework | |
Towards an Inferential Lexicon of Event Selecting Predicates for French
Title | Towards an Inferential Lexicon of Event Selecting Predicates for French |
Authors | Ingrid Falk, Fabienne Martin |
Abstract | We present a manually constructed seed lexicon encoding the inferential profiles of French event selecting predicates across different uses. The inferential profile (Karttunen, 1971a) of a verb is designed to capture the inferences triggered by the use of this verb in context. It reflects the influence of the clause-embedding verb on the factuality of the event described by the embedded clause. The resource developed provides evidence for the following three hypotheses: (i) French implicative verbs have an aspect dependent profile (their inferential profile varies with outer aspect), while factive verbs have an aspect independent profile (they keep the same inferential profile with both imperfective and perfective aspect); (ii) implicativity decreases with imperfective aspect: the inferences triggered by French implicative verbs combined with perfective aspect are often weakened when the same verbs are combined with imperfective aspect; (iii) implicativity decreases with an animate (deep) subject: the inferences triggered by a verb which is implicative with an inanimate subject are weakened when the same verb is used with an animate subject. The resource additionally shows that verbs with different inferential profiles display clearly distinct sub-categorisation patterns. In particular, verbs that have both factive and implicative readings are shown to prefer infinitival clauses in their implicative reading, and tensed clauses in their factive reading. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01095v1 |
http://arxiv.org/pdf/1710.01095v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-inferential-lexicon-of-event |
Repo | |
Framework | |
Personalized Classifier Ensemble Pruning Framework for Mobile Crowdsourcing
Title | Personalized Classifier Ensemble Pruning Framework for Mobile Crowdsourcing |
Authors | Shaowei Wang, Liusheng Huang, Pengzhan Wang, Hongli Xu, Wei Yang |
Abstract | Ensemble learning has been widely employed by mobile applications, ranging from environmental sensing to activity recognition. One of the fundamental issues in ensemble learning is the trade-off between classification accuracy and computational cost, which is the goal of ensemble pruning. During crowdsourcing, the centralized aggregator releases ensemble learning models to a large number of mobile participants for task evaluation or as the crowdsourced learning results, while different participants may seek different levels of the accuracy-cost trade-off. However, most existing ensemble pruning approaches consider only one identical level of this trade-off. In this study, we present an efficient ensemble pruning framework for personalized accuracy-cost trade-offs via multi-objective optimization. Specifically, for the commonly used linear-combination style of the trade-off, we provide an objective-mixture optimization to further reduce the number of ensemble candidates. Experimental results show that our framework is highly efficient for personalized ensemble pruning and achieves much better pruning performance with objective-mixture optimization when compared to state-of-the-art approaches. |
Tasks | |
Published | 2017-01-25 |
URL | http://arxiv.org/abs/1701.07166v1 |
http://arxiv.org/pdf/1701.07166v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-classifier-ensemble-pruning |
Repo | |
Framework | |
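A hedged sketch of the linear-combination trade-off mentioned above: each participant supplies a weight between error and cost, and a greedy search keeps only the classifiers worth their cost for that weight. The greedy strategy and uniform cost model are assumptions, not the paper's multi-objective or objective-mixture optimization.

```python
# Hypothetical personalized ensemble pruning with the linear trade-off
#   objective(S) = error(S) + lam * cost(S),
# where lam encodes one participant's accuracy-vs-cost preference.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_tr, y_tr, X_val, y_val = X[:400], y[:400], X[400:], y[400:]

ensemble = BaggingClassifier(DecisionTreeClassifier(), n_estimators=20,
                             random_state=0).fit(X_tr, y_tr)
members = ensemble.estimators_
cost_per_member = 1.0 / len(members)          # assumed uniform inference cost

def error(subset):
    votes = np.mean([m.predict(X_val) for m in subset], axis=0) >= 0.5
    return np.mean(votes != y_val)

def prune(lam):
    chosen = []
    while True:
        best = min((m for m in members if m not in chosen), default=None,
                   key=lambda m: error(chosen + [m]))
        if best is None:
            break
        new_obj = error(chosen + [best]) + lam * cost_per_member * (len(chosen) + 1)
        old_obj = (error(chosen) + lam * cost_per_member * len(chosen)
                   if chosen else 1.0)
        if new_obj >= old_obj:                # adding this member is not worth it
            break
        chosen.append(best)
    return chosen

for lam in (0.05, 0.5):                       # two "participants" with different needs
    subset = prune(lam)
    print(f"lambda={lam}: {len(subset)} members, error={error(subset):.3f}")
```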
A Bayesian Filtering Algorithm for Gaussian Mixture Models
Title | A Bayesian Filtering Algorithm for Gaussian Mixture Models |
Authors | Adrian G. Wills, Johannes Hendriks, Christopher Renton, Brett Ninness |
Abstract | A Bayesian filtering algorithm is developed for a class of state-space systems that can be modelled via Gaussian mixtures. In general, the exact solution to this filtering problem involves an exponential growth in the number of mixture terms and this is handled here by utilising a Gaussian mixture reduction step after both the time and measurement updates. In addition, a square-root implementation of the unified algorithm is presented and this algorithm is profiled on several simulated systems. This includes the state estimation for two non-linear systems that are strictly outside the class considered in this paper. |
Tasks | |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05495v1 |
http://arxiv.org/pdf/1705.05495v1.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-filtering-algorithm-for-gaussian |
Repo | |
Framework | |
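A compact numpy sketch of the filtering recursion described above for a linear-Gaussian state-space model whose prior is a Gaussian mixture: each component receives a Kalman time and measurement update, weights are re-scaled by the measurement likelihood, and a crude reduction step keeps the mixture small. The pruning-based reduction and the toy model are stand-ins for the paper's mixture reduction and square-root implementation.

```python
# Hypothetical Gaussian-mixture filter for x_k = A x_{k-1} + w,  y_k = C x_k + v.
# Each component is propagated with standard Kalman updates; component weights
# are multiplied by the measurement likelihood; low-weight components are then
# pruned (a simple stand-in for proper mixture reduction).
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])        # constant-velocity dynamics
C = np.array([[1.0, 0.0]])                    # position measurement
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])

def gm_filter_step(weights, means, covs, y, max_components=3):
    new_w, new_m, new_P = [], [], []
    for w, m, P in zip(weights, means, covs):
        # Time update.
        m = A @ m
        P = A @ P @ A.T + Q
        # Measurement update.
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        innov = y - C @ m
        lik = np.exp(-0.5 * innov.T @ np.linalg.inv(S) @ innov) / np.sqrt(
            (2 * np.pi) ** len(y) * np.linalg.det(S))
        new_w.append(w * float(lik))
        new_m.append(m + K @ innov)
        new_P.append((np.eye(2) - K @ C) @ P)
    # Reduction: keep the heaviest components and renormalize their weights.
    order = np.argsort(new_w)[::-1][:max_components]
    w = np.array(new_w)[order]
    return w / w.sum(), [new_m[i] for i in order], [new_P[i] for i in order]

# Two-component prior and a single position measurement.
weights = [0.5, 0.5]
means = [np.array([0.0, 1.0]), np.array([5.0, -1.0])]
covs = [np.eye(2), np.eye(2)]
weights, means, covs = gm_filter_step(weights, means, covs, np.array([0.8]))
print(weights)
```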
A Brownian Motion Model and Extreme Belief Machine for Modeling Sensor Data Measurements
Title | A Brownian Motion Model and Extreme Belief Machine for Modeling Sensor Data Measurements |
Authors | Robert A. Murphy |
Abstract | As the title suggests, we will describe (and justify through the presentation of some of the relevant mathematics) prediction methodologies for sensor measurements. This exposition will mainly be concerned with the mathematics related to modeling the sensor measurements. |
Tasks | |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00207v2 |
http://arxiv.org/pdf/1704.00207v2.pdf | |
PWC | https://paperswithcode.com/paper/a-brownian-motion-model-and-extreme-belief |
Repo | |
Framework | |
Mining Deep And-Or Object Structures via Cost-Sensitive Question-Answer-Based Active Annotations
Title | Mining Deep And-Or Object Structures via Cost-Sensitive Question-Answer-Based Active Annotations |
Authors | Quanshi Zhang, Ying Nian Wu, Hao Zhang, Song-Chun Zhu |
Abstract | This paper presents a cost-sensitive active Question-Answering (QA) framework for learning a nine-layer And-Or graph (AOG) from web images. The AOG explicitly represents object categories, poses/viewpoints, parts, and detailed structures within the parts in a compositional hierarchy. The QA framework is designed to minimize an overall risk, which trades off the loss and query costs. The loss is defined for nodes in all layers of the AOG, including the generative loss (measuring the likelihood of the images) and the discriminative loss (measuring the fitness to human answers). The cost comprises both the human labor of answering questions and the computational cost of model learning. The cost-sensitive QA framework iteratively selects different storylines of questions to update different nodes in the AOG. Experiments showed that our method required much less human supervision (e.g., labeling parts on 3–10 training objects for each category) and achieved better performance than baseline methods. |
Tasks | Question Answering |
Published | 2017-08-13 |
URL | http://arxiv.org/abs/1708.03911v3 |
http://arxiv.org/pdf/1708.03911v3.pdf | |
PWC | https://paperswithcode.com/paper/mining-deep-and-or-object-structures-via-cost |
Repo | |
Framework | |
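A toy sketch of the risk trade-off above: at each round the framework asks the question whose expected loss reduction most exceeds its labeling cost. The scoring below is a deliberate simplification; the paper's risk combines generative and discriminative losses over AOG nodes, which is not modeled here, and the candidate list is hypothetical.

```python
# Hypothetical cost-sensitive question selection: pick the question (annotation
# "storyline") maximizing expected loss reduction minus its labeling cost, and
# stop when no remaining question is worth asking or the budget is exhausted.
def select_questions(candidates, budget):
    """candidates: list of (name, expected_loss_reduction, human_cost)."""
    asked, spent = [], 0.0
    remaining = list(candidates)
    while remaining:
        # Net value of each remaining question under the loss-plus-cost risk.
        name, gain, cost = max(remaining, key=lambda q: q[1] - q[2])
        if gain - cost <= 0 or spent + cost > budget:
            break
        asked.append(name)
        spent += cost
        remaining.remove((name, gain, cost))
    return asked, spent

candidates = [
    ("label part on new object", 0.30, 0.10),
    ("confirm detected pose", 0.05, 0.08),
    ("annotate detailed structure", 0.20, 0.15),
]
print(select_questions(candidates, budget=0.3))
```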