January 27, 2020

Paper Group ANR 1156

Spiking neural networks trained with backpropagation for low power neuromorphic implementation of voice activity detection


Title	Spiking neural networks trained with backpropagation for low power neuromorphic implementation of voice activity detection
Authors	Flavio Martinelli, Giorgia Dellaferrera, Pablo Mainar, Milos Cernak
Abstract	Recent advances in Voice Activity Detection (VAD) are driven by artificial and Recurrent Neural Networks (RNNs); however, using a VAD system in battery-operated devices requires further power efficiency. This can be achieved by neuromorphic hardware, which enables Spiking Neural Networks (SNNs) to perform inference at very low energy consumption. Spiking networks are characterized by their ability to process information efficiently, in a sparse cascade of binary events in time called spikes. However, a big performance gap separates artificial from spiking networks, mostly due to a lack of powerful SNN training algorithms. To overcome this problem we exploit an SNN model that can be recast into an RNN-like model and trained with known deep learning techniques. We describe an SNN training procedure that achieves low spiking activity and pruning algorithms to remove 85% of the network connections with no performance loss. The model achieves state-of-the-art performance with a fraction of the power consumption of other methods.
Tasks	Action Detection, Activity Detection
Published	2019-10-22
URL	https://arxiv.org/abs/1910.09993v1
PDF	https://arxiv.org/pdf/1910.09993v1.pdf
PWC	https://paperswithcode.com/paper/spiking-neural-networks-trained-with
Repo
Framework

Deep Learning Based Video System for Accurate and Real-Time Parking Measurement


Title	Deep Learning Based Video System for Accurate and Real-Time Parking Measurement
Authors	Bill Yang Cai, Ricardo Alvarez, Michelle Sit, Fábio Duarte, Carlo Ratti
Abstract	Parking spaces are costly to build, parking payments are difficult to enforce, and drivers waste an excessive amount of time searching for empty lots. Accurate quantification would inform developers and municipalities in space allocation and design, while real-time measurements would provide drivers and parking enforcement with information that saves time and resources. In this paper, we propose an accurate and real-time video system for future Internet of Things (IoT) and smart cities applications. Using recent developments in deep convolutional neural networks (DCNNs) and a novel vehicle tracking filter, we combine information across multiple image frames in a video sequence to remove noise introduced by occlusions and detection failures. We demonstrate that our system achieves higher accuracy than pure image-based instance segmentation, and is comparable in performance to industry benchmark systems that utilize more expensive sensors such as radar. Furthermore, our system shows significant potential in its scalability to a city-wide scale and also in the richness of its output that goes beyond traditional binary occupancy statistics.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2019-02-20
URL	http://arxiv.org/abs/1902.07401v1
PDF	http://arxiv.org/pdf/1902.07401v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-video-system-for-accurate
Repo
Framework

Analysis and development of an automatic eCall for motorcycles: a one-class cepstrum approach


Title	Analysis and development of an automatic eCall for motorcycles: a one-class cepstrum approach
Authors	Simone Gelmini, Giulio Panzani, Sergio Savaresi
Abstract	The automatic dial of an emergency call - eCall - in response to a road accident is a feature that is gaining interest in the intelligent vehicle community. It indirectly increases the driving safety of road vehicles, but presents the technical challenge of developing an algorithm which triggers the emergency call only when needed, a non-trivial task for two-wheeled vehicles due to their complex dynamics. In the present work, we propose an eCall algorithm that detects these anomalies in the data time series using cepstral analysis. The main advantage of the proposed approach is the direct focus on the data dynamics, overcoming the limits of approaches based on the analysis of the instantaneous value of some combination of signals. The algorithm is calibrated and tested against real driving data of ten different drivers, including seven real crash events, and its performance is compared with known methods.
Tasks	Time Series
Published	2019-07-19
URL	https://arxiv.org/abs/1907.09453v1
PDF	https://arxiv.org/pdf/1907.09453v1.pdf
PWC	https://paperswithcode.com/paper/analysis-and-development-of-an-automatic
Repo
Framework
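
The abstract above relies on cepstral analysis without spelling out the computation. As a rough illustration only (not the authors' eCall detector), the real cepstrum of a signal window is the inverse Fourier transform of its log-magnitude spectrum. The sketch below uses a naive DFT and a made-up toy window (an impulse plus an echo); the names `dft`, `real_cepstrum`, `window`, and `delay` are all assumptions introduced for this example.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short windows)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(window, eps=1e-12):
    """Real cepstrum: inverse DFT of the log-magnitude spectrum."""
    spectrum = dft(window)
    # eps guards against log(0) for spectral bins that are exactly zero.
    log_magnitude = [math.log(abs(x) + eps) for x in spectrum]
    return [c.real for c in dft(log_magnitude, inverse=True)]


# Hypothetical toy window: a unit impulse plus a half-amplitude echo
# arriving `delay` samples later.
n, delay = 64, 8
window = [0.0] * n
window[0] = 1.0
window[delay] = 0.5

cepstrum = real_cepstrum(window)
# The largest positive-quefrency coefficient should sit at the echo delay.
peak = max(range(1, n // 2 + 1), key=lambda k: cepstrum[k])
print("dominant quefrency:", peak)
```

Periodic structure in a signal shows up as a peak at the matching quefrency, which is why cepstral features describe the dynamics of a time series rather than its instantaneous values. How the paper turns such features into a one-class crash detector is not stated in the abstract and is not sketched here.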

Multiplicative modulations in hue-selective cells enhance unique hue representation


Title	Multiplicative modulations in hue-selective cells enhance unique hue representation
Authors	Paria Mehrani, Andrei Mouraviev, John K. Tsotsos
Abstract	There is still much to understand about the color processing mechanisms in the brain and the transformation from cone-opponent representations to perceptual hues. Moreover, it is unclear which area(s) in the brain represent unique hues. We propose a hierarchical model inspired by the neuronal mechanisms in the brain for local hue representation, which reveals the contributions of each visual cortical area in hue representation. Local hue encoding is achieved through incrementally increasing processing nonlinearities beginning with cone input. Besides employing nonlinear rectifications, we propose multiplicative modulations as a form of nonlinearity. Our simulation results indicate that multiplicative modulations have significant contributions in encoding of hues along intermediate directions in the MacLeod-Boynton diagram and that model V4 neurons have the capacity to encode unique hues. Additionally, responses of our model neurons resemble those of biological color cells, suggesting that our model provides a novel formulation of the brain’s color processing pathway.
Tasks
Published	2019-07-03
URL	https://arxiv.org/abs/1907.02116v1
PDF	https://arxiv.org/pdf/1907.02116v1.pdf
PWC	https://paperswithcode.com/paper/multiplicative-modulations-in-hue-selective
Repo
Framework

End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans


Title	End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans
Authors	Armen Avetisyan, Angela Dai, Matthias Nießner
Abstract	We present a novel, end-to-end approach to align CAD models to a 3D scan of a scene, enabling transformation of a noisy, incomplete 3D scan to a compact, CAD reconstruction with clean, complete object geometry. Our main contribution lies in formulating a differentiable Procrustes alignment that is paired with a symmetry-aware dense object correspondence prediction. To simultaneously align CAD models to all the objects of a scanned scene, our approach detects object locations, then predicts symmetry-aware dense object correspondences between scan and CAD geometry in a unified object space, as well as a nearest neighbor CAD model, both of which are then used to inform a differentiable Procrustes alignment. Our approach operates in a fully-convolutional fashion, enabling alignment of CAD models to the objects of a scan in a single forward pass. This enables our method to outperform state-of-the-art approaches by $19.04\%$ for CAD model alignment to scans, with $\approx 250\times$ faster runtime than previous data-driven approaches.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04201v1
PDF	https://arxiv.org/pdf/1906.04201v1.pdf
PWC	https://paperswithcode.com/paper/end-to-end-cad-model-retrieval-and-9dof
Repo
Framework
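
The Procrustes alignment named in the abstract above finds the transform that best maps one set of corresponding points onto another in the least-squares sense. The paper works in 3D with 9 degrees of freedom and makes the step differentiable; the sketch below is a deliberately simplified 2D, rotation-plus-translation analogue with a closed-form solution, meant only to show the idea. The function name `procrustes_2d` and the sample points are assumptions made for this example.

```python
import math


def procrustes_2d(source, target):
    """Least-squares rotation and translation mapping source onto target.

    Both arguments are equal-length lists of (x, y) pairs; point i of
    `source` is assumed to correspond to point i of `target`.
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n

    # Accumulate dot and cross products of the centred point pairs.
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(source, target):
        ax, ay, bx, by = ax - sx, ay - sy, bx - tx, by - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    # The angle maximising cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation: target centroid minus the rotated source centroid.
    shift = (tx - (c * sx - s * sy), ty - (s * sx + c * sy))
    return theta, shift


# Hypothetical correspondences: rotate by 30 degrees, then shift by (2, -1).
angle = math.pi / 6
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-1.0, 0.5)]
tgt = [
    (math.cos(angle) * x - math.sin(angle) * y + 2.0,
     math.sin(angle) * x + math.cos(angle) * y - 1.0)
    for x, y in src
]
theta, shift = procrustes_2d(src, tgt)
print(math.degrees(theta), shift)
```

In 3D the rotation is usually recovered from a singular value decomposition of the cross-covariance matrix rather than from a single angle, and scale terms are estimated as well; neither is shown here.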

Evolving the Hearthstone Meta


Title	Evolving the Hearthstone Meta
Authors	Fernando de Mesentier Silva, Rodrigo Canaan, Scott Lee, Matthew C. Fontaine, Julian Togelius, Amy K. Hoover
Abstract	Balancing an ever-growing strategic game of high complexity, such as Hearthstone, is a complex task. The target of making strategies diverse and customizable results in a delicate, intricate system. Tuning over 2000 cards to generate the desired outcome without disrupting the existing environment becomes a laborious challenge. In this paper, we discuss the impacts that changes to existing cards can have on strategy in Hearthstone. By analyzing the win rate on match-ups across different decks, being played by different strategies, we propose to compare their performance before and after changes are made to improve or worsen different cards. Then, using an evolutionary algorithm, we search for a combination of changes to the card attributes that cause the decks to approach equal, 50% win rates. We then expand our evolutionary algorithm to a multi-objective solution to search for this result, while making the minimum amount of changes, and as a consequence disruption, to the existing cards. Lastly, we propose and evaluate metrics to serve as heuristics with which to decide which cards to target with balance changes.
Tasks
Published	2019-07-02
URL	https://arxiv.org/abs/1907.01623v1
PDF	https://arxiv.org/pdf/1907.01623v1.pdf
PWC	https://paperswithcode.com/paper/evolving-the-hearthstone-meta
Repo
Framework

Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates


Title	Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates
Authors	Dmitry Kovalev, Konstantin Mishchenko, Peter Richtárik
Abstract	We present two new remarkably simple stochastic second-order methods for minimizing the average of a very large number of sufficiently smooth and strongly convex functions. The first is a stochastic variant of Newton’s method (SN), and the second is a stochastic variant of cubically regularized Newton’s method (SCN). We establish local linear-quadratic convergence results. Unlike existing stochastic variants of second order methods, which require the evaluation of a large number of gradients and/or Hessians in each iteration to guarantee convergence, our methods do not have this shortcoming. For instance, the simplest variants of our methods in each iteration need to compute the gradient and Hessian of a single randomly selected function only. In contrast to most existing stochastic Newton and quasi-Newton methods, our approach guarantees local convergence faster than with first-order oracle and adapts to the problem’s curvature. Interestingly, our method is not unbiased, so our theory provides new intuition for designing new stochastic methods.
Tasks
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01597v1
PDF	https://arxiv.org/pdf/1912.01597v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-newton-and-cubic-newton-methods
Repo
Framework
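
The abstract above says the simplest variant touches the gradient and Hessian of only one randomly chosen function per iteration, but does not give the update rule. One plausible reading, sketched below for scalar problems, keeps a stored second-order model for every function, minimizes the sum of those models, and then refreshes the model of a single random function at the new point. This is an illustration under that assumption, and the paper's exact algorithm may differ. The test functions f_i(x) = 0.5 (x - b_i)^2 + log(1 + exp(a_i x)) and the values in `PARAMS` are made up for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad(a, b, x):
    """Derivative of f(x) = 0.5 * (x - b)**2 + log(1 + exp(a * x))."""
    return (x - b) + a * sigmoid(a * x)


def hess(a, x):
    """Second derivative of the same function (always at least 1)."""
    s = sigmoid(a * x)
    return 1.0 + a * a * s * (1.0 - s)


def stochastic_newton(params, x0=0.0, steps=200, seed=0):
    """Minimise the average of the f_i, refreshing one model per step."""
    rng = random.Random(seed)
    # Stored model of f_i around its anchor w_i: curvature h_i and
    # r_i = h_i * w_i - f_i'(w_i). All anchors start at x0.
    h = [hess(a, x0) for a, _ in params]
    r = [hess(a, x0) * x0 - grad(a, b, x0) for a, b in params]
    h_sum, r_sum = sum(h), sum(r)
    for _ in range(steps):
        x = r_sum / h_sum  # minimiser of the summed quadratic models
        i = rng.randrange(len(params))
        a, b = params[i]
        # Only f_i's gradient and Hessian are evaluated in this step.
        new_h = hess(a, x)
        new_r = new_h * x - grad(a, b, x)
        h_sum += new_h - h[i]
        r_sum += new_r - r[i]
        h[i], r[i] = new_h, new_r
    return r_sum / h_sum


# Hypothetical (a_i, b_i) pairs defining five smooth, strongly convex terms.
PARAMS = [(0.5, -1.0), (1.0, 0.3), (-0.8, 1.2), (0.2, 0.0), (-0.4, -0.5)]
x_star = stochastic_newton(PARAMS)
residual = sum(grad(a, b, x_star) for a, b in PARAMS)
print(x_star, residual)
```

Because the stored models are stale for every function except the one just refreshed, the step is not an unbiased estimate of a full Newton step, which matches the abstract's remark that the method is not unbiased.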

Towards an Integrative Educational Recommender for Lifelong Learners


Title	Towards an Integrative Educational Recommender for Lifelong Learners
Authors	Sahan Bulathwela, Maria Perez-Ortiz, Emine Yilmaz, John Shawe-Taylor
Abstract	One of the most ambitious use cases of computer-assisted learning is to build a recommendation system for lifelong learning. Most recommender algorithms exploit similarities between content and users, overlooking the need to leverage sensible learning trajectories for the learner. Lifelong learning thus presents unique challenges, requiring scalable and transparent models that can account for learner knowledge and content novelty simultaneously, while also retaining accurate learner representations for long periods of time. We attempt to build a novel educational recommender that relies on an integrative approach combining multiple drivers of learner engagement. Our first step towards this goal is TrueLearn, which models content novelty and background knowledge of learners and achieves promising performance while retaining a human interpretable learner model.
Tasks
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01592v1
PDF	https://arxiv.org/pdf/1912.01592v1.pdf
PWC	https://paperswithcode.com/paper/towards-an-integrative-educational
Repo
Framework

Grounded Human-Object Interaction Hotspots from Video (Extended Abstract)


Title	Grounded Human-Object Interaction Hotspots from Video (Extended Abstract)
Authors	Tushar Nagarajan, Christoph Feichtenhofer, Kristen Grauman
Abstract	Learning how to interact with objects is an important step towards embodied visual intelligence, but existing techniques suffer from heavy supervision or sensing requirements. We propose an approach to learn human-object interaction “hotspots” directly from video. Rather than treat affordances as a manually supervised semantic segmentation task, our approach learns about interactions by watching videos of real human behavior and anticipating afforded actions. Given a novel image or video, our model infers a spatial hotspot map indicating how an object would be manipulated in a potential interaction, even if the object is currently at rest. Through results with both first and third person video, we show the value of grounding affordances in real human-object interactions. Not only are our weakly supervised hotspots competitive with strongly supervised affordance methods, but they can also anticipate object interaction for novel object categories. Project page: http://vision.cs.utexas.edu/projects/interaction-hotspots/
Tasks	Human-Object Interaction Detection, Semantic Segmentation
Published	2019-06-03
URL	https://arxiv.org/abs/1906.01963v1
PDF	https://arxiv.org/pdf/1906.01963v1.pdf
PWC	https://paperswithcode.com/paper/grounded-human-object-interaction-hotspots-1
Repo
Framework

Towards Robust Toxic Content Classification


Title	Towards Robust Toxic Content Classification
Authors	Keita Kurita, Anna Belova, Antonios Anastasopoulos
Abstract	Toxic content detection aims to identify content that can offend or harm its recipients. Automated classifiers of toxic content need to be robust against adversaries who deliberately try to bypass filters. We propose a method of generating realistic model-agnostic attacks using a lexicon of toxic tokens, which attempts to mislead toxicity classifiers by diluting the toxicity signal either by obfuscating toxic tokens through character-level perturbations, or by injecting non-toxic distractor tokens. We show that these realistic attacks reduce the detection recall of state-of-the-art neural toxicity detectors, including those using ELMo and BERT, by more than 50% in some cases. We explore two approaches for defending against such attacks. First, we examine the effect of training on synthetically noised data. Second, we propose the Contextual Denoising Autoencoder (CDAE): a method for learning robust representations that uses character-level and contextual information to denoise perturbed tokens. We show that the two approaches are complementary, improving robustness to both character-level perturbations and distractors, recovering a considerable portion of the lost accuracy. Finally, we analyze the robustness characteristics of the most competitive methods and outline practical considerations for improving toxicity detectors.
Tasks	Denoising
Published	2019-12-14
URL	https://arxiv.org/abs/1912.06872v1
PDF	https://arxiv.org/pdf/1912.06872v1.pdf
PWC	https://paperswithcode.com/paper/towards-robust-toxic-content-classification
Repo
Framework
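
The abstract above describes two ways of diluting the toxicity signal: character-level perturbation of tokens found in a lexicon, and injection of non-toxic distractor tokens. The sketch below illustrates both ideas with placeholder words. It is not the authors' attack generator; the perturbation used here (swapping two adjacent interior characters), the function names, and the `LEXICON` and `DISTRACTORS` lists are all assumptions made for the example.

```python
import random

# Placeholder vocabularies, invented for this illustration.
LEXICON = {"rudeword", "meanword"}
DISTRACTORS = ["lovely", "kind", "thanks"]


def swap_adjacent(token, rng):
    """Swap two neighbouring interior characters of a token."""
    if len(token) < 4:
        return token
    i = rng.randrange(1, len(token) - 2)
    chars = list(token)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def obfuscate(text, lexicon, seed=0):
    """Perturb every token that appears in the lexicon; keep the rest."""
    rng = random.Random(seed)
    return " ".join(
        swap_adjacent(tok, rng) if tok.lower() in lexicon else tok
        for tok in text.split()
    )


def add_distractors(text, distractors, count=3, seed=0):
    """Insert `count` distractor tokens at random positions."""
    rng = random.Random(seed)
    tokens = text.split()
    for _ in range(count):
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(distractors))
    return " ".join(tokens)


text = "that was a rudeword and a meanword indeed"
obfuscated = obfuscate(text, LEXICON)
diluted = add_distractors(text, DISTRACTORS)
print(obfuscated)
print(diluted)
```

A robustness evaluation in this spirit would compare a classifier's recall on the original text with its recall on the perturbed versions.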

AVDNet: A Small-Sized Vehicle Detection Network for Aerial Visual Data


Title	AVDNet: A Small-Sized Vehicle Detection Network for Aerial Visual Data
Authors	Murari Mandal, Manal Shah, Prashant Meena, Sanhita Devi, Santosh Kumar Vipparthi
Abstract	Detection of small-sized targets in aerial views is a challenging task due to the smallness of vehicle size, complex background, and monotonic object appearances. In this letter, we propose a one-stage vehicle detection network (AVDNet) to robustly detect small-sized vehicles in aerial scenes. In AVDNet, we introduced ConvRes residual blocks at multiple scales to alleviate the problem of vanishing features for smaller objects caused by the inclusion of deeper convolutional layers. These residual blocks, along with an enlarged output feature map, ensure the robust representation of the salient features for small-sized objects. Furthermore, we proposed a recurrent-feature aware visualization (RFAV) technique to analyze the network behavior. We also created a new airborne image data set (ABD) by annotating 1396 new objects in 79 aerial images for our experiments. The effectiveness of AVDNet is validated on VEDAI, DLR-3K, DOTA, and the combined (VEDAI, DLR-3K, DOTA, and ABD) data set. Experimental results demonstrate the significant performance improvement of the proposed method over state-of-the-art detection techniques in terms of mAP, computation, and space complexity.
Tasks
Published	2019-07-17
URL	https://arxiv.org/abs/1907.07477v1
PDF	https://arxiv.org/pdf/1907.07477v1.pdf
PWC	https://paperswithcode.com/paper/avdnet-a-small-sized-vehicle-detection
Repo
Framework

MRI Reconstruction Using Deep Bayesian Inference


Title	MRI Reconstruction Using Deep Bayesian Inference
Authors	GuanXiong Luo, Na Zhao, Wenhao Jiang, Peng Cao
Abstract	Purpose: To develop a deep learning-based Bayesian inference for MRI reconstruction. Methods: We modeled the MRI reconstruction problem with Bayes’s theorem, following the recently proposed PixelCNN++ method. The image reconstruction from incomplete k-space measurement was obtained by maximizing the posterior probability. A generative network was utilized as the image prior, which was computationally tractable, and the k-space data fidelity was enforced by using an equality constraint. Stochastic backpropagation was utilized to calculate the descent gradient in the process of maximum a posteriori estimation, and a projected subgradient method was used to impose the equality constraint. In contrast to the other deep learning reconstruction methods, the proposed one used the likelihood of the prior as the training loss and the objective function in reconstruction to improve the image quality. Results: The proposed method showed an improved performance in preserving image details and reducing aliasing artifacts, compared with GRAPPA, $\ell_1$-ESPRiT, and MODL, a state-of-the-art deep learning reconstruction method. The proposed method generally achieved more than 5 dB peak signal-to-noise ratio improvement for compressed sensing and parallel imaging reconstructions compared with the other methods. Conclusion: The Bayesian inference significantly improved the reconstruction performance, compared with the conventional $\ell_1$-sparsity prior in compressed sensing reconstruction tasks. More importantly, the proposed reconstruction framework can be generalized for most MRI reconstruction scenarios.
Tasks	Bayesian Inference, Image Reconstruction
Published	2019-09-03
URL	https://arxiv.org/abs/1909.01127v1
PDF	https://arxiv.org/pdf/1909.01127v1.pdf
PWC	https://paperswithcode.com/paper/mri-reconstruction-using-deep-bayesian
Repo
Framework

Peanut Maturity Classification using Hyperspectral Imagery


Title	Peanut Maturity Classification using Hyperspectral Imagery
Authors	Sheng Zou, Yu-Chien Tseng, Alina Zare, Diane Rowland, Barry Tillman, Seung-Chul Yoon
Abstract	Seed maturity in peanut (Arachis hypogaea L.) determines economic return to a producer because of its impact on seed weight (yield), and critically influences seed vigor and other quality characteristics. During seed development, the inner mesocarp layer of the pericarp (hull) transitions in color from white to black as the seed matures. The maturity assessment process involves the removal of the exocarp of the hull and visually categorizing the mesocarp color into varying color classes from immature (white, yellow, orange) to mature (brown and black). This visual color classification is time-consuming because the exocarp must be manually removed. In addition, the visual classification process involves human assessment of colors, which leads to large variability of color classification from observer to observer. A more objective, digital imaging approach to peanut maturity is needed, optimally without the requirement of removal of the hull’s exocarp. This study examined the use of a hyperspectral imaging (HSI) process to determine pod maturity with intact pericarps. The HSI method leveraged spectral differences between mature and immature pods within a classification algorithm to identify the mature and immature pods. The results showed a high classification accuracy with consistency using samples from different years and cultivars. In addition, the proposed method was capable of estimating a continuous-valued, pixel-level maturity value for individual peanut pods, allowing for a valuable tool that can be utilized in seed quality research. This new method solves issues of labor intensity and subjective error that all current methods of peanut maturity determination have.
Tasks
Published	2019-10-20
URL	https://arxiv.org/abs/1910.11122v2
PDF	https://arxiv.org/pdf/1910.11122v2.pdf
PWC	https://paperswithcode.com/paper/peanut-maturity-classification-using
Repo
Framework

Multiple Graph Adversarial Learning


Title	Multiple Graph Adversarial Learning
Authors	Bo Jiang, Ziyan Zhang, Jin Tang, Bin Luo
Abstract	Recently, Graph Convolutional Networks (GCNs) have been widely studied for graph-structured data representation and learning. However, in many real applications, data come with multiple graphs, and it is non-trivial to adapt GCNs to deal with data representation with multiple graph structures. One main challenge for multi-graph representation is how to exploit both structure information of each individual graph and correlation information across multiple graphs simultaneously. In this paper, we propose a novel Multiple Graph Adversarial Learning (MGAL) framework for multi-graph representation and learning. MGAL aims to learn an optimal structure-invariant and consistent representation for multiple graphs in a common subspace via a novel adversarial learning framework, which thus incorporates both structure information of intra-graph and correlation information of inter-graphs simultaneously. Based on MGAL, we then provide a unified network for semi-supervised learning tasks. Promising experimental results demonstrate the effectiveness of the MGAL model.
Tasks
Published	2019-01-22
URL	http://arxiv.org/abs/1901.07439v1
PDF	http://arxiv.org/pdf/1901.07439v1.pdf
PWC	https://paperswithcode.com/paper/multiple-graph-adversarial-learning
Repo
Framework

Augmented Hard Example Mining for Generalizable Person Re-Identification


Title	Augmented Hard Example Mining for Generalizable Person Re-Identification
Authors	Masato Tamura, Tomokazu Murakami
Abstract	Although the performance of person re-identification (Re-ID) has been much improved by using sophisticated training methods and large-scale labelled datasets, many existing methods make the impractical assumption that information of a target domain can be utilized during training. In practice, a Re-ID system often starts running as soon as it is deployed, hence training with data from a target domain is unrealistic. To make Re-ID systems more practical, methods have been proposed that achieve high performance without information of a target domain. However, they need cumbersome tuning for training and unusual operations for testing. In this paper, we propose augmented hard example mining, which can be easily integrated into a common Re-ID training process and can utilize sophisticated models without any network modification. The method discovers hard examples on the basis of classification probabilities, and to make the examples harder, various types of augmentation are applied to the examples. Among those examples, excessively augmented ones are eliminated by a classification-based selection process. Extensive analysis shows that our method successfully selects effective examples and achieves state-of-the-art performance on publicly available benchmark datasets.
Tasks	Person Re-Identification
Published	2019-10-11
URL	https://arxiv.org/abs/1910.05280v1
PDF	https://arxiv.org/pdf/1910.05280v1.pdf
PWC	https://paperswithcode.com/paper/augmented-hard-example-mining-for
Repo
Framework
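
The abstract above outlines a pipeline: pick hard examples by classification probability, augment them, then discard augmented samples that have been distorted too far. The sketch below illustrates only the two selection steps on made-up numbers. The ranking criterion (lowest true-class probability), the `min_prob` threshold, and both function names are assumptions for this example; the abstract does not state the paper's exact criteria.

```python
def mine_hard_examples(probabilities, labels, k):
    """Return indices of the k samples with the lowest true-class probability."""
    true_class_prob = [p[y] for p, y in zip(probabilities, labels)]
    ranked = sorted(range(len(labels)), key=lambda i: true_class_prob[i])
    return ranked[:k]


def filter_over_augmented(aug_probabilities, labels, min_prob=0.05):
    """Keep augmented samples whose true-class probability stays above min_prob.

    The threshold is a hypothetical stand-in for a classification-based
    selection rule.
    """
    return [
        i
        for i, (p, y) in enumerate(zip(aug_probabilities, labels))
        if p[y] >= min_prob
    ]


# Made-up classifier outputs for four samples over two identities.
probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.55, 0.45]]
labels = [0, 0, 1, 1]
hard = mine_hard_examples(probs, labels, k=2)

# Made-up classifier outputs after augmenting the two hard samples.
aug_probs = [[0.3, 0.7], [0.99, 0.01]]
aug_labels = [labels[i] for i in hard]
kept = filter_over_augmented(aug_probs, aug_labels)
print(hard, kept)
```

In a training loop, the surviving augmented samples would be added to the batch alongside the original data; the augmentation step itself is omitted here.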