January 28, 2020

Paper Group ANR 939

A Strong and Robust Baseline for Text-Image Matching. Lattice Map Spiking Neural Networks (LM-SNNs) for Clustering and Classifying Image Data. A Rigorous Theory of Conditional Mean Embeddings. Metrics for Learning in Topological Persistence. Visual-Textual Association with Hardest and Semi-Hard Negative Pairs Mining for Person Search. Semi-supervis …

A Strong and Robust Baseline for Text-Image Matching


Title	A Strong and Robust Baseline for Text-Image Matching
Authors	Fangyu Liu, Rongtian Ye
Abstract	We review the current schemes of text-image matching models and propose improvements for both training and inference. First, we empirically show limitations of two popular losses (sum-margin and max-margin loss) widely used in training text-image embeddings and propose a trade-off: a kNN-margin loss which 1) utilizes information from hard negatives and 2) is robust to noise as all of the $K$ hardest samples are taken into account, tolerating pseudo negatives and outliers. Second, we advocate the use of Inverted Softmax (IS) and Cross-modal Local Scaling (CSLS) during inference to mitigate the so-called hubness problem in high-dimensional embedding space, enhancing scores of all metrics by a large margin.
Tasks
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01205v1
PDF	https://arxiv.org/pdf/1906.01205v1.pdf
PWC	https://paperswithcode.com/paper/a-strong-and-robust-baseline-for-text-image
Repo
Framework
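
The abstract above describes the kNN-margin loss only in words, so the following is a minimal NumPy sketch under stated assumptions rather than the authors' implementation. It assumes a square similarity matrix whose diagonal holds the matching pairs; the function name `knn_margin_loss`, the defaults `k=3` and `margin=0.2`, and the normalisation by batch size are all illustrative choices.

```python
import numpy as np

def knn_margin_loss(sim, k=3, margin=0.2):
    """Hinge loss over the k hardest negatives of every image and every caption.

    sim[i, j] is the similarity of image i and caption j; the diagonal holds
    the matching (positive) pairs. Names and defaults are illustrative.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    if sim.shape != (n, n) or n < 2:
        raise ValueError("sim must be a square matrix with at least 2 rows")
    k = max(1, min(k, n - 1))
    pos = np.diag(sim)
    off_diag = ~np.eye(n, dtype=bool)
    # Hinge cost of each negative caption for image i (rows) ...
    cost_i2t = np.maximum(0.0, margin + sim - pos[:, None]) * off_diag
    # ... and of each negative image for caption j (columns).
    cost_t2i = np.maximum(0.0, margin + sim - pos[None, :]) * off_diag
    # The hinge is monotone in similarity, so the k largest costs belong to
    # the k hardest negatives.
    top_i2t = np.sort(cost_i2t, axis=1)[:, -k:]
    top_t2i = np.sort(cost_t2i, axis=0)[-k:, :]
    return float(top_i2t.sum() + top_t2i.sum()) / n
```

In this sketch, `k=1` reduces to a max-margin loss over the single hardest negative and `k=n-1` to a sum-margin loss over all negatives, which is the trade-off the abstract refers to.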

Lattice Map Spiking Neural Networks (LM-SNNs) for Clustering and Classifying Image Data


Title	Lattice Map Spiking Neural Networks (LM-SNNs) for Clustering and Classifying Image Data
Authors	Hananel Hazan, Daniel J. Saunders, Darpan T. Sanghavi, Hava Siegelmann, Robert Kozma
Abstract	Spiking neural networks (SNNs) with a lattice architecture are introduced in this work, combining several desirable properties of SNNs and self-organized maps (SOMs). Networks are trained with biologically motivated, unsupervised learning rules to obtain a self-organized grid of filters via cooperative and competitive excitatory-inhibitory interactions. Several inhibition strategies are developed and tested, such as (i) incrementally increasing inhibition level over the course of network training, and (ii) switching the inhibition level from low to high (two-level) after an initial training segment. During the labeling phase, the spiking activity generated by data with known labels is used to assign neurons to categories of data, which are then used to evaluate the network’s classification ability on a held-out set of test data. Several biologically plausible evaluation rules are proposed and compared, including a population-level confidence rating, and an $n$-gram inspired method. The effectiveness of the proposed self-organized learning mechanism is tested using the MNIST benchmark dataset, as well as using images produced by playing the Atari Breakout game.
Tasks
Published	2019-06-04
URL	https://arxiv.org/abs/1906.11826v1
PDF	https://arxiv.org/pdf/1906.11826v1.pdf
PWC	https://paperswithcode.com/paper/lattice-map-spiking-neural-networks-lm-snns
Repo
Framework

A Rigorous Theory of Conditional Mean Embeddings


Title	A Rigorous Theory of Conditional Mean Embeddings
Authors	Ilja Klebanov, Ingmar Schuster, T. J. Sullivan
Abstract	Conditional mean embeddings (CME) have proven themselves to be a powerful tool in many machine learning applications. They allow the efficient conditioning of probability distributions within the corresponding reproducing kernel Hilbert spaces (RKHSs) by providing a linear-algebraic relation for the kernel mean embeddings of the respective joint and conditional probability distributions. Both centred and uncentred covariance operators have been used to define CMEs in the existing literature. In this paper, we develop a mathematically rigorous theory for both variants, discuss the merits and problems of each, and significantly weaken the conditions for applicability of CMEs. In the course of this, we demonstrate a beautiful connection to Gaussian conditioning in Hilbert spaces.
Tasks
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00671v3
PDF	https://arxiv.org/pdf/1912.00671v3.pdf
PWC	https://paperswithcode.com/paper/a-rigorous-theory-of-conditional-mean
Repo
Framework
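
For readers unfamiliar with how CMEs are used in practice, here is a small sketch of the regularised empirical estimator that is standard in the CME literature. It is not a construction from this paper, whose contribution is the theory of when such relations hold. The Gaussian kernel, the bandwidth, the regularisation value `reg`, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.5):
    """Gaussian kernel matrix between two 1-D sample arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def cme_weights(x_train, x_query, reg=1e-3, bandwidth=0.5):
    """Weights beta(x) = (K + n*reg*I)^{-1} k(x) of the regularised estimate."""
    n = x_train.shape[0]
    K = gaussian_kernel(x_train, x_train, bandwidth)
    k_query = gaussian_kernel(x_train, x_query, bandwidth)
    return np.linalg.solve(K + n * reg * np.eye(n), k_query)

def conditional_expectation(f_of_y, x_train, x_query, **kwargs):
    """Estimate E[f(Y) | X = x] as sum_i beta_i(x) f(y_i)."""
    return cme_weights(x_train, x_query, **kwargs).T @ f_of_y

# Toy data with Y = 2X, so E[Y | X = 0.3] should be close to 0.6.
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x
est = conditional_expectation(y, x, np.array([0.3]))
```

The weights depend only on the inputs, so the same `cme_weights` output can be reused to estimate the conditional expectation of any function of Y.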

Metrics for Learning in Topological Persistence


Title	Metrics for Learning in Topological Persistence
Authors	Henri Riihimäki, José Licón-Saláiz
Abstract	Persistent homology analysis provides means to capture the connectivity structure of data sets in various dimensions. On the mathematical level, by defining a metric between the objects that persistence attaches to data sets, we can stabilize invariants characterizing these objects. We outline how so-called contour functions induce relevant metrics for stabilizing the rank invariant. On the practical level, the stable ranks are used as fingerprints for data. Different choices of contour lead to different stable ranks, and topological learning is then the question of finding the optimal contour. We outline our analysis pipeline and show how it can enhance classification of physical activities data. As our main application we study how stable ranks and contours provide robust descriptors of spatial patterns of atmospheric cloud fields.
Tasks
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04436v1
PDF	https://arxiv.org/pdf/1906.04436v1.pdf
PWC	https://paperswithcode.com/paper/metrics-for-learning-in-topological
Repo
Framework

Visual-Textual Association with Hardest and Semi-Hard Negative Pairs Mining for Person Search


Title	Visual-Textual Association with Hardest and Semi-Hard Negative Pairs Mining for Person Search
Authors	Jing Ge, Guangyu Gao, Zhen Liu
Abstract	Searching for persons in large-scale image databases using a natural language description as the query is a practical and important application in video surveillance. Intuitively, for person search, the core issue should be visual-textual association, which is still an extremely challenging task, due to the contradiction between the high abstraction of textual description and the intuitive expression of visual images. However, for this task, while positive image-text pairs are always well provided, most existing methods do not tackle this problem effectively by mining more reasonable negative pairs. In this paper, we propose a novel visual-textual association approach with visual and textual attention, and cross-modality hardest and semi-hard negative pair mining. In order to evaluate the effectiveness and feasibility of the proposed approach, we conduct extensive experiments on a typical person search dataset, CUHK-PEDES, in which our approach achieves a top-1 score of 55.32% as a new state-of-the-art. Besides, we also evaluate the semi-hard pair mining approach on the COCO caption dataset, and validate the effectiveness and complementarity of the methods.
Tasks	Person Search
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03083v1
PDF	https://arxiv.org/pdf/1912.03083v1.pdf
PWC	https://paperswithcode.com/paper/visual-textual-association-with-hardest-and
Repo
Framework
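
The abstract does not define "hardest" and "semi-hard" precisely, so the sketch below uses the common triplet-loss convention as an assumption: the hardest negative is the most similar non-matching item, and a semi-hard negative is less similar than the positive but by less than a margin. The function name `mine_negatives` and the margin value are hypothetical.

```python
import numpy as np

def mine_negatives(sim, margin=0.2):
    """Return, per query row, the hardest negative and the semi-hard negatives.

    sim[i, j] is the similarity of text query i and image j; sim[i, i] is the
    positive pair. "Semi-hard" follows the common triplet-loss convention:
    less similar than the positive, but by less than `margin`.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    pos = np.diag(sim)[:, None]
    is_negative = ~np.eye(n, dtype=bool)
    # Hardest negative: the most similar non-matching item.
    hardest = np.where(is_negative, sim, -np.inf).argmax(axis=1)
    # Semi-hard negatives: inside the margin band just below the positive.
    semi_hard_mask = is_negative & (sim < pos) & (sim > pos - margin)
    semi_hard = [np.flatnonzero(row) for row in semi_hard_mask]
    return hardest, semi_hard
```

Swapping the roles of rows and columns (passing `sim.T`) gives the image-to-text direction, which is one way to read "cross-modality" mining.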

Semi-supervised Approach to Soft Sensor Modeling for Fault Detection in Industrial Systems with Multiple Operation Modes


Title	Semi-supervised Approach to Soft Sensor Modeling for Fault Detection in Industrial Systems with Multiple Operation Modes
Authors	Shun Takeuchi, Takuya Nishino, Takahiro Saito, Isamu Watanabe
Abstract	In industrial systems, certain process variables that need to be monitored for detecting faults are often difficult or impossible to measure. Soft sensor techniques are widely used to estimate such difficult-to-measure process variables from easy-to-measure ones. Soft sensor modeling requires training datasets including the information of various states such as operation modes, but the fault dataset with the target variable is insufficient as the training dataset. This paper describes a semi-supervised approach to soft sensor modeling to incorporate an incomplete dataset without the target variable in the training dataset. To incorporate the incomplete dataset, we consider the properties of processes at transition points between operation modes in the system. The regression coefficients of the operation modes are estimated under constraint conditions obtained from the information on the mode transitions. In a case study, this constrained soft sensor modeling was used to predict refrigerant leaks in air-conditioning systems with heating and cooling operation modes. The results show that this modeling method is promising for soft sensors in a system with multiple operation modes.
Tasks	Fault Detection, Sensor Modeling
Published	2019-02-22
URL	http://arxiv.org/abs/1902.09426v1
PDF	http://arxiv.org/pdf/1902.09426v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-approach-to-soft-sensor
Repo
Framework

Learning Energy-Based Models in High-Dimensional Spaces with Multi-scale Denoising Score Matching


Title	Learning Energy-Based Models in High-Dimensional Spaces with Multi-scale Denoising Score Matching
Authors	Zengyi Li, Yubei Chen, Friedrich T. Sommer
Abstract	Energy-Based Models (EBMs) assign unnormalized log-probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, Bayesian reasoning, and many more. But training of EBMs using standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentially alleviates this problem. In particular, denoising score matching (Vincent, 2011) has been successfully used to train EBMs. Using noisy data samples with one fixed noise level, these models learn fast and yield good results in data denoising (Saremi and Hyvärinen, 2019). However, demonstrations of such models in high quality sample synthesis of high dimensional data were lacking. Recently, Song and Ermon (2019) have shown that a generative model trained by denoising score matching accomplishes excellent sample synthesis, when trained with data samples corrupted with multiple levels of noise. Here we provide analysis and empirical evidence showing that training with multiple noise levels is necessary when the data dimension is high. Leveraging this insight, we propose a novel EBM trained with multi-scale denoising score matching. Our model exhibits data generation performance comparable to state-of-the-art techniques such as GANs, and sets a new baseline for EBMs. The proposed model also provides density information and performs well in an image inpainting task.
Tasks	Denoising, Image Inpainting, Outlier Detection
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07762v2
PDF	https://arxiv.org/pdf/1910.07762v2.pdf
PWC	https://paperswithcode.com/paper/annealed-denoising-score-matching-learning-1
Repo
Framework
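
To make the idea of denoising score matching at several noise levels concrete, here is a generic sketch. It is not the exact objective of this paper: it uses the textbook Gaussian-corruption target and a `sigma**2` weighting per level, both of which are assumptions, and the name `multiscale_dsm_loss` is hypothetical. A toy 1-D Gaussian is used so that the optimal score is known in closed form.

```python
import numpy as np

def multiscale_dsm_loss(score_fn, x, sigmas, seed=0):
    """Average denoising score matching loss over several noise levels.

    score_fn(x_noisy, sigma) should return the model score, i.e. the negative
    gradient of the energy at x_noisy. Each level is weighted by sigma**2 so
    that levels are on a comparable scale (an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    losses = []
    for sigma in sigmas:
        noise = rng.standard_normal(x.shape)
        x_noisy = x + sigma * noise
        # Score of the Gaussian corruption kernel: (x - x_noisy) / sigma^2.
        target = (x - x_noisy) / sigma**2
        residual = score_fn(x_noisy, sigma) - target
        losses.append(sigma**2 * np.mean(np.sum(residual**2, axis=-1)))
    return float(np.mean(losses))

# Toy check on 1-D standard normal data, where the noisy marginal is
# N(0, 1 + sigma^2) and its score is -x / (1 + sigma^2).
data = np.random.default_rng(1).standard_normal((20000, 1))
sigmas = [0.1, 0.5, 1.0]
good = multiscale_dsm_loss(lambda z, s: -z / (1.0 + s**2), data, sigmas)
bad = multiscale_dsm_loss(lambda z, s: -3.0 * z, data, sigmas)
```

The loss for the closed-form score (`good`) comes out well below the loss for the deliberately wrong score (`bad`), which is the property a trained energy model would exploit.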

Neural Network for NILM Based on Operational State Change Classification


Title	Neural Network for NILM Based on Operational State Change Classification
Authors	Peng Xiao, Samuel Cheng
Abstract	Energy disaggregation in a non-intrusive way estimates appliance-level electricity consumption from a single meter that measures the whole-house electricity demand. Recently, with the ongoing growth of energy data, many data-driven deep learning architectures have been applied to solve the non-intrusive energy disaggregation problem. However, most proposed methods try to estimate the on-off state or the power consumption of an appliance, which requires not only a large number of parameters, but also hyper-parameter optimization prior to training and even preprocessing of energy data for a specified appliance. In this paper, instead of estimating the on-off state or power consumption, we adapt a neural network to estimate the operational state change of an appliance. Our proposed solution is more feasible across various appliances and has lower complexity compared to previous methods. The simulated experiments on the low-sample-rate dataset REDD show the competitive performance of the designed method with respect to two other benchmark methods, the Hidden Markov Model-based and Graph Signal Processing-based approaches.
Tasks
Published	2019-02-05
URL	http://arxiv.org/abs/1902.02675v2
PDF	http://arxiv.org/pdf/1902.02675v2.pdf
PWC	https://paperswithcode.com/paper/neural-network-for-nilm-based-on-operational
Repo
Framework
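
The abstract does not spell out how an "operational state change" is encoded, so the following is only one plausible reading: a two-state appliance whose power trace is thresholded and then differenced. The threshold value and the function name `state_change_labels` are hypothetical.

```python
import numpy as np

def state_change_labels(power, on_threshold=50.0):
    """Label each sample +1 (switch on), -1 (switch off) or 0 (no change)."""
    # Assumed on/off rule: the appliance is on when power reaches the threshold.
    on = (np.asarray(power, dtype=float) >= on_threshold).astype(int)
    # Differencing the on/off sequence leaves non-zero entries only at changes.
    return np.diff(on, prepend=on[0])
```

For a trace such as `[0, 2, 150, 160, 155, 3, 0, 140]` watts, this yields a label at each switching event and zeros elsewhere, so the classification target is sparse compared with a per-sample on-off or power estimate.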

Estimating localized complexity of white-matter wiring with GANs


Title	Estimating localized complexity of white-matter wiring with GANs
Authors	Haraldur T. Hallgrimsson, Richika Sharan, Scott T. Grafton, Ambuj K. Singh
Abstract	In-vivo examination of the physical connectivity of axonal projections through the white matter of the human brain is made possible by diffusion weighted magnetic resonance imaging (dMRI). Analysis of dMRI commonly considers derived scalar metrics such as fractional anisotropy as proxies for “white matter integrity,” and differences of such measures have been observed as significantly correlating with various neurological diagnoses and clinical measures such as executive function, presence of multiple sclerosis, and genetic similarity. The analysis of such voxel measures is confounded in areas of more complicated fiber wiring due to crossing, kissing, and dispersing fibers. Recently, Volz et al. introduced a simple probabilistic measure of the count of distinct fiber populations within a voxel, which was shown to reduce variance in group comparisons. We propose a complementary measure that considers the complexity of a voxel in the context of its local region, with an aim to quantify the localized wiring complexity of every part of white matter. This allows, for example, identification of particularly ambiguous regions of the brain for tractographic approaches of modeling global wiring connectivity. Our method builds on recent advances in image inpainting, in which the task is to plausibly fill in a missing region of an image. Our proposed method builds on a Bayesian estimate of heteroscedastic aleatoric uncertainty of a region of white matter by inpainting it from its context. We define the localized wiring complexity of white matter as how accurately and confidently a well-trained model can predict the missing patch. In our results, we observe low aleatoric uncertainty along major neuronal pathways which increases at junctions and towards cortex boundaries. This directly quantifies the difficulty of lesion inpainting of dMRI images at all parts of white matter.
Tasks	Image Inpainting
Published	2019-10-02
URL	https://arxiv.org/abs/1910.04868v2
PDF	https://arxiv.org/pdf/1910.04868v2.pdf
PWC	https://paperswithcode.com/paper/estimating-localized-complexity-of-white
Repo
Framework

Spike-Train Level Backpropagation for Training Deep Recurrent Spiking Neural Networks


Title	Spike-Train Level Backpropagation for Training Deep Recurrent Spiking Neural Networks
Authors	Wenrui Zhang, Peng Li
Abstract	Spiking neural networks (SNNs) well support spatiotemporal learning and energy-efficient event-driven hardware neuromorphic processors. As an important class of SNNs, recurrent spiking neural networks (RSNNs) possess great computational power. However, the practical application of RSNNs is severely limited by challenges in training. Biologically-inspired unsupervised learning has limited capability in boosting the performance of RSNNs. On the other hand, existing backpropagation (BP) methods suffer from high complexity of unrolling in time, vanishing and exploding gradients, and approximate differentiation of discontinuous spiking activities when applied to RSNNs. To enable supervised training of RSNNs under a well-defined loss function, we present a novel Spike-Train level RSNNs Backpropagation (ST-RSBP) algorithm for training deep RSNNs. The proposed ST-RSBP directly computes the gradient of a rate-coded loss function defined at the output layer of the network w.r.t. the tunable parameters. The scalability of ST-RSBP is achieved by the proposed spike-train level computation during which temporal effects of the SNN are captured in both the forward and backward pass of BP. Our ST-RSBP algorithm can be broadly applied to RSNNs with a single recurrent layer or deep RSNNs with multiple feed-forward and recurrent layers. Based upon challenging speech and image datasets including TI46, N-TIDIGITS, Fashion-MNIST and MNIST, ST-RSBP is able to train RSNNs with an accuracy surpassing that of the current state-of-the-art SNN BP algorithms and conventional non-spiking deep learning models.
Tasks
Published	2019-08-18
URL	https://arxiv.org/abs/1908.06378v3
PDF	https://arxiv.org/pdf/1908.06378v3.pdf
PWC	https://paperswithcode.com/paper/spike-train-level-backpropagation-for
Repo
Framework

Relay: A High-Level Compiler for Deep Learning


Title	Relay: A High-Level Compiler for Deep Learning
Authors	Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Logan Weber, Josh Pollock, Luis Vega, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock
Abstract	Frameworks for writing, compiling, and optimizing deep learning (DL) models have recently enabled progress in areas like computer vision and natural language processing. Extending these frameworks to accommodate the rapidly diversifying landscape of DL models and hardware platforms presents challenging tradeoffs between expressivity, composability, and portability. We present Relay, a new compiler framework for DL. Relay’s functional, statically typed intermediate representation (IR) unifies and generalizes existing DL IRs to express state-of-the-art models. The introduction of Relay’s expressive IR requires careful design of domain-specific optimizations, addressed via Relay’s extension mechanisms. Using these extension mechanisms, Relay supports a unified compiler that can target a variety of hardware platforms. Our evaluation demonstrates Relay’s competitive performance for a broad class of models and devices (CPUs, GPUs, and emerging accelerators). Relay’s design demonstrates how a unified IR can provide expressivity, composability, and portability without compromising performance.
Tasks
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08368v2
PDF	https://arxiv.org/pdf/1904.08368v2.pdf
PWC	https://paperswithcode.com/paper/relay-a-high-level-ir-for-deep-learning
Repo
Framework

Extracting Aspects Hierarchies using Rhetorical Structure Theory


Title	Extracting Aspects Hierarchies using Rhetorical Structure Theory
Authors	Łukasz Augustyniak, Tomasz Kajdanowicz, Przemysław Kazienko
Abstract	We propose a novel approach to generate aspect hierarchies that proved to be consistently correct compared with human-generated hierarchies. We present an unsupervised technique using Rhetorical Structure Theory and graph analysis. We evaluated our approach based on 100,000 reviews from Amazon and achieved an astonishing 80% coverage compared with human-generated hierarchies coded in ConceptNet. The method could be easily extended with a sentiment analysis model and used to describe sentiment on different levels of aspect granularity. Hence, besides the flat aspect structure, we can differentiate between aspects and describe if the charging aspect is related to battery or price.
Tasks	Sentiment Analysis
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01800v1
PDF	https://arxiv.org/pdf/1909.01800v1.pdf
PWC	https://paperswithcode.com/paper/extracting-aspects-hierarchies-using
Repo
Framework

A Single-shot Object Detector with Feature Aggragation and Enhancement


Title	A Single-shot Object Detector with Feature Aggragation and Enhancement
Authors	Weiqiang Li, Guizhong Liu
Abstract	For many real applications, it is equally important to detect objects accurately and quickly. In this paper, we propose an accurate and efficient single shot object detector with feature aggregation and enhancement (FAENet). Our motivation is to enhance and exploit the shallow and deep feature maps of the whole network simultaneously. To achieve it we introduce a pair of novel feature aggregation modules and two feature enhancement blocks, and integrate them into the original structure of SSD. Extensive experiments on both the PASCAL VOC and MS COCO datasets demonstrate that the proposed method achieves much higher accuracy than SSD. In addition, our method performs better than the state-of-the-art one-stage detector RefineDet on small objects and can run at a faster speed.
Tasks
Published	2019-02-08
URL	https://arxiv.org/abs/1902.02923v2
PDF	https://arxiv.org/pdf/1902.02923v2.pdf
PWC	https://paperswithcode.com/paper/a-single-shot-object-detector-with-feature
Repo
Framework

Extracting a Discriminative Structural Sub-Network for ASD Screening using the Evolutionary Algorithm


Title	Extracting a Discriminative Structural Sub-Network for ASD Screening using the Evolutionary Algorithm
Authors	M. Amin, F. Safaei, N. S. Ghaderian
Abstract	Autism spectrum disorder (ASD) is one of the most significant neurological disorders that disrupt a person’s social communication skills. The progression and development of neuroimaging technologies has made structural network construction of brain regions possible. In this paper, after finding the discriminative sub-network using the evolutionary algorithm, the simple features of the sub-network lead us to diagnose autism in various subjects with plausible accuracy (76% on average). This method yields substantially better results compared to previous research. Thus, this method may be used as an accurate aid in autism screening.
Tasks
Published	2019-10-26
URL	https://arxiv.org/abs/1911.05484v1
PDF	https://arxiv.org/pdf/1911.05484v1.pdf
PWC	https://paperswithcode.com/paper/extracting-a-discriminative-structural-sub
Repo
Framework

Optimization Methods for Interpretable Differentiable Decision Trees in Reinforcement Learning


Title	Optimization Methods for Interpretable Differentiable Decision Trees in Reinforcement Learning
Authors	Andrew Silva, Taylor Killian, Ivan Dario Jimenez Rodriguez, Sung-Hyun Son, Matthew Gombolay
Abstract	Decision trees are ubiquitous in machine learning for their ease of use and interpretability. Yet, these models are not typically employed in reinforcement learning as they cannot be updated online via stochastic gradient descent. We overcome this limitation by allowing for a gradient update over the entire tree that improves sample complexity and affords interpretable policy extraction. First, we include theoretical motivation on the need for policy-gradient learning by examining the properties of gradient descent over differentiable decision trees. Second, we demonstrate that our approach equals or outperforms a neural network on all domains and can learn discrete decision trees online with average rewards up to 7x higher than a batch-trained decision tree. Third, we conduct a user study to quantify the interpretability of a decision tree, rule list, and a neural network with statistically significant results ($p < 0.001$).
Tasks
Published	2019-03-22
URL	https://arxiv.org/abs/1903.09338v4
PDF	https://arxiv.org/pdf/1903.09338v4.pdf
PWC	https://paperswithcode.com/paper/interpretable-reinforcement-learning-via
Repo
Framework
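
As a rough illustration of what makes a decision tree differentiable, the sketch below implements a single soft decision node with two leaves. It is a generic construction, not the architecture from the paper: the sigmoid routing, the `temperature` parameter, the leaf parameterisation, and the `discretize` rule are all assumptions chosen for brevity.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def soft_tree_policy(x, weights, bias, leaf_logits, temperature=1.0):
    """Action probabilities of a depth-1 differentiable decision tree."""
    # Soft routing: probability of taking the left branch.
    p_left = 1.0 / (1.0 + np.exp(-temperature * (weights @ x + bias)))
    left, right = softmax(leaf_logits[0]), softmax(leaf_logits[1])
    # Mix the two leaf distributions by the routing probability.
    return p_left * left + (1.0 - p_left) * right

def discretize(weights, bias):
    """Collapse the soft node to a crisp rule on its most influential feature.

    Returns (feature index, threshold, go_left_if_greater).
    """
    j = int(np.argmax(np.abs(weights)))
    return j, -bias / weights[j], bool(weights[j] > 0)
```

Because every operation is smooth in `weights`, `bias`, and `leaf_logits`, the policy can be trained with gradient methods, and a rule such as "feature 0 > 0.5" can be read off afterwards with `discretize`.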