January 30, 2020

Paper Group ANR 223

Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited


Title	Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited
Authors	S. E. Marzen, J. P. Crutchfield
Abstract	Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any finite-state automaton in theory, and some workers demonstrated that this can hold in practice. We test the capability of generalized linear models, RCs, and Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic processes generated by a large suite of probabilistic deterministic finite-state automata (PDFA). PDFAs provide an excellent performance benchmark in that they can be systematically enumerated, the randomness and correlation structure of their generated processes are exactly known, and their optimal memory-limited predictors are easily computed. Unsurprisingly, LSTMs outperform RCs, which outperform generalized linear models. Surprisingly, each of these methods can fall short of the maximal predictive accuracy by as much as 50% after training and, when optimized, tend to fall short of the maximal predictive accuracy by ~5%, even though previously available methods achieve maximal predictive accuracy with orders-of-magnitude less data. Thus, despite the representational universality of RCs and RNNs, using them can engender a surprising predictive gap for simple stimuli. One concludes that there is an important and underappreciated role for methods that infer “causal states” or “predictive state representations”.
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07663v1
PDF	https://arxiv.org/pdf/1910.07663v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-deterministic-finite-automata
Repo
Framework
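
To make the benchmark concrete, the following is a minimal sketch of sampling from a PDFA and computing the best achievable next-symbol accuracy. The two-state machine, its probabilities, and the function names are illustrative assumptions rather than anything taken from the paper, and the stationary distribution is supplied by hand.

```python
import random

# Hypothetical two-state PDFA: state -> list of (symbol, probability, next_state).
# "Deterministic" means the pair (state, symbol) fixes the next state.
PDFA = {
    "A": [("0", 0.5, "A"), ("1", 0.5, "B")],
    "B": [("0", 1.0, "A")],
}

def sample(pdfa, start, length, rng):
    """Generate a symbol sequence by walking the automaton."""
    state, out = start, []
    for _ in range(length):
        r, acc = rng.random(), 0.0
        for symbol, prob, nxt in pdfa[state]:
            acc += prob
            if r < acc:
                break
        out.append(symbol)
        state = nxt
    return "".join(out)

def optimal_accuracy(pdfa, stationary):
    """Next-symbol accuracy of a predictor that always knows the current state."""
    return sum(pi * max(p for _, p, _ in pdfa[s]) for s, pi in stationary.items())

sequence = sample(PDFA, "A", 20, random.Random(0))
# Stationary distribution of this machine, worked out by hand: A = 2/3, B = 1/3.
ceiling = optimal_accuracy(PDFA, {"A": 2 / 3, "B": 1 / 3})
```

In this toy machine a "1" is always followed by a "0", so a predictor that tracks the state reaches an accuracy of 2/3, which is the kind of ceiling a learned model would be compared against.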

Instance-Aware Representation Learning and Association for Online Multi-Person Tracking


Title	Instance-Aware Representation Learning and Association for Online Multi-Person Tracking
Authors	Hefeng Wu, Yafei Hu, Keze Wang, Hanhui Li, Lin Nie, Hui Cheng
Abstract	Multi-Person Tracking (MPT) is often addressed within the detection-to-association paradigm. In such approaches, human detections are first extracted in every frame and person trajectories are then recovered by a procedure of data association (usually offline). However, their performance usually degrades in the presence of detection errors, mutual interactions and occlusions. In this paper, we present a deep learning based MPT approach that learns instance-aware representations of tracked persons and robustly infers their states online. Specifically, we design a multi-branch neural network (MBN), which predicts the classification confidences and locations of all targets by taking a batch of candidate regions as input. In our MBN architecture, each branch (instance-subnet) corresponds to an individual to be tracked and new branches can be dynamically created for handling newly appearing persons. Then, based on the output of MBN, we construct a joint association matrix that represents meaningful states of tracked persons (e.g., being tracked or disappearing from the scene) and solve it by using the efficient Hungarian algorithm. Moreover, we allow the instance-subnets to be updated during tracking by mining hard examples online, accounting for person appearance variations over time. We comprehensively evaluate our framework on a popular MPT benchmark, demonstrating its excellent performance in comparison with recent online MPT methods.
Tasks	Representation Learning
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12409v1
PDF	https://arxiv.org/pdf/1905.12409v1.pdf
PWC	https://paperswithcode.com/paper/instance-aware-representation-learning-and
Repo
Framework
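
The association step can be pictured as a minimum-cost assignment between tracked persons and candidate regions. The sketch below is not the Hungarian algorithm itself: it is a brute-force search over permutations that finds the same optimum on small square matrices. The cost values and the name `best_assignment` are hypothetical.

```python
from itertools import permutations

def best_assignment(cost):
    """Return (total_cost, pairs) minimising the summed cost of a square matrix.

    Brute force over permutations: fine for a handful of tracks, but
    factorial in the matrix size, unlike the polynomial Hungarian algorithm.
    """
    n = len(cost)
    best = None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best is None or total < best[0]:
            best = (total, list(enumerate(perm)))
    return best

# Hypothetical costs (for example 1 - confidence):
# rows are tracked persons, columns are candidate regions.
cost = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
    [0.8, 0.6, 0.3],
]
total, pairs = best_assignment(cost)
```

For realistic numbers of targets, a polynomial-time solver would replace the permutation loop.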

An Investigation into Neural Net Optimization via Hessian Eigenvalue Density


Title	An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Authors	Behrooz Ghorbani, Shankar Krishnan, Ying Xiao
Abstract	To understand the dynamics of optimization in deep neural networks, we develop a tool to study the evolution of the entire Hessian spectrum throughout the optimization process. Using this, we study a number of hypotheses concerning smoothness, curvature, and sharpness in the deep learning literature. We then thoroughly analyze a crucial structural feature of the spectra: in non-batch normalized networks, we observe the rapid appearance of large isolated eigenvalues in the spectrum, along with a surprising concentration of the gradient in the corresponding eigenspaces. In batch normalized networks, these two effects are almost absent. We characterize these effects, and explain how they affect optimization speed through both theory and experiments. As part of this work, we adapt advanced tools from numerical linear algebra that allow scalable and accurate estimation of the entire Hessian spectrum of ImageNet-scale neural networks; this technique may be of independent interest in other applications.
Tasks
Published	2019-01-29
URL	http://arxiv.org/abs/1901.10159v1
PDF	http://arxiv.org/pdf/1901.10159v1.pdf
PWC	https://paperswithcode.com/paper/an-investigation-into-neural-net-optimization
Repo
Framework

A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes


Title	A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes
Authors	Lichao Mou, Yuansheng Hua, Xiao Xiang Zhu
Abstract	Most current semantic segmentation approaches fall back on deep convolutional neural networks (CNNs). However, their use of convolution operations with local receptive fields causes failures in modeling contextual spatial relations. Prior works have sought to address this issue by using graphical models or spatial propagation modules in networks. But such models often fail to capture long-range spatial relationships between entities, which leads to spatially fragmented predictions. Moreover, recent works have demonstrated that channel-wise information also plays a pivotal part in CNNs. In this work, we introduce two simple yet effective network units, the spatial relation module and the channel relation module, to learn and reason about global relationships between any two spatial positions or feature maps, and then produce relation-augmented feature representations. The spatial and channel relation modules are general and extensible, and can be used in a plug-and-play fashion with the existing fully convolutional network (FCN) framework. We evaluate relation module-equipped networks on semantic segmentation tasks using two aerial image datasets, which fundamentally depend on long-range spatial relational reasoning. The networks achieve very competitive results, bringing significant improvements over baselines.
Tasks	Relational Reasoning, Semantic Segmentation
Published	2019-04-11
URL	http://arxiv.org/abs/1904.05730v2
PDF	http://arxiv.org/pdf/1904.05730v2.pdf
PWC	https://paperswithcode.com/paper/a-relation-augmented-fully-convolutional
Repo
Framework

FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition


Title	FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition
Authors	Qingqing Wang, Wenjing Jia, Xiangjian He, Yue Lu, Michael Blumenstein, Ye Huang
Abstract	Scene text recognition has recently been widely treated as a sequence-to-sequence prediction problem, where traditional fully-connected-LSTM (FC-LSTM) has played a critical role. Due to the limitation of FC-LSTM, existing methods have to convert 2-D feature maps into 1-D sequential feature vectors, severely damaging the valuable spatial and structural information of text images. In this paper, we argue that scene text recognition is essentially a spatiotemporal prediction problem for its 2-D image inputs, and propose a convolution LSTM (ConvLSTM)-based scene text recognizer, namely, FACLSTM, i.e., Focused Attention ConvLSTM, where the spatial correlation of pixels is fully leveraged when performing sequential prediction with LSTM. In particular, the attention mechanism is incorporated into an efficient ConvLSTM structure via convolutional operations, and additional character center masks are generated to help focus attention on the right feature areas. The experimental results on benchmark datasets IIIT5K, SVT and CUTE demonstrate that our proposed FACLSTM performs competitively on regular, low-resolution and noisy text images, and outperforms the state-of-the-art approaches on curved text by large margins.
Tasks	Scene Text Recognition
Published	2019-04-20
URL	https://arxiv.org/abs/1904.09405v2
PDF	https://arxiv.org/pdf/1904.09405v2.pdf
PWC	https://paperswithcode.com/paper/190409405
Repo
Framework

Graph based Dynamic Segmentation of Generic Objects in 3D


Title	Graph based Dynamic Segmentation of Generic Objects in 3D
Authors	Xiao Lin, Josep R. Casas, Montse Pardàs
Abstract	We propose a novel 3D segmentation method for RGB-D stream data to handle the 3D object segmentation task in a generic scenario with frequent object interactions. The method is generic, requires no initialization, and makes two main contributions. First, a novel tree structure representation for the point cloud of the scene is proposed. Second, a dynamic management mechanism for connected component splits and merges exploits the tree structure representation.
Tasks	Semantic Segmentation
Published	2019-04-17
URL	http://arxiv.org/abs/1904.08518v1
PDF	http://arxiv.org/pdf/1904.08518v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-dynamic-segmentation-of-generic
Repo
Framework

Learning the helix topology of musical pitch


Title	Learning the helix topology of musical pitch
Authors	Vincent Lostanlen, Sripathi Sridhar, Brian McFee, Andrew Farnsworth, Juan Pablo Bello
Abstract	To explain the consonance of octaves, music psychologists represent pitch as a helix where azimuth and axial coordinate correspond to pitch class and pitch height respectively. This article addresses the problem of discovering this helical structure from unlabeled audio data. We measure Pearson correlations in the constant-Q transform (CQT) domain to build a K-nearest neighbor graph between frequency subbands. Then, we run the Isomap manifold learning algorithm to represent this graph in a three-dimensional space in which straight lines approximate graph geodesics. Experiments on isolated musical notes demonstrate that the resulting manifold resembles a helix which makes a full turn at every octave. A circular shape is also found in English speech, but not in urban noise. We discuss the impact of various design choices on the visualization: instrumentarium, loudness mapping function, and number of neighbors K.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.10246v2
PDF	https://arxiv.org/pdf/1910.10246v2.pdf
PWC	https://paperswithcode.com/paper/learning-the-helix-topology-of-musical-pitch
Repo
Framework
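
The helix that the abstract describes can be written down directly, which helps show what the learned manifold is expected to resemble. This is a sketch of the target geometry only, not of the paper's Isomap pipeline; the function name and the 440 Hz reference are assumptions.

```python
import math

def helix_coordinates(frequency_hz, reference_hz=440.0):
    """Map a frequency to (x, y, z) on a unit-radius pitch helix.

    z is pitch height in octaves relative to the reference;
    the angle in the (x, y) plane encodes pitch class.
    """
    height = math.log2(frequency_hz / reference_hz)  # axial coordinate
    azimuth = 2.0 * math.pi * (height % 1.0)         # one full turn per octave
    return (math.cos(azimuth), math.sin(azimuth), height)

a4 = helix_coordinates(440.0)
a5 = helix_coordinates(880.0)  # one octave above a4
```

Notes an octave apart share the same (x, y) position and differ by exactly one unit along the axis, which is the sense in which the helix makes a full turn at every octave.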

MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning


Title	MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning
Authors	Jeremy Charlier, Gaston Ormazabal, Radu State, Jean Hilger
Abstract	Reinforcement learning has become one of the best approaches to train a computer game emulator capable of human level performance. In a reinforcement learning approach, an optimal value function is learned across a set of actions, or decisions, that leads to a set of states giving different rewards, with the objective to maximize the overall reward. A policy assigns to each state-action pair an expected return. We call an optimal policy a policy for which the value function is optimal. QLBS, Q-Learner in the Black-Scholes(-Merton) Worlds, applies the reinforcement learning concepts, and notably, the popular Q-learning algorithm, to the financial stochastic model of Black, Scholes and Merton. It is, however, specifically optimized for the geometric Brownian motion and the vanilla options. Its range of application is, therefore, limited to vanilla option pricing within financial markets. We propose MQLV, Modified Q-Learner for the Vasicek model, a new reinforcement learning approach that determines the optimal policy of money management based on the aggregated financial transactions of the clients. It unlocks new frontiers to establish personalized credit card limits or to fulfill bank loan applications, targeting the retail banking industry. MQLV extends the simulation to mean reverting stochastic diffusion processes and it uses a digital function, a Heaviside step function expressed in its discrete form, to estimate the probability of a future event such as a payment default. In our experiments, we first show the similarities between a set of historical financial transactions and Vasicek generated transactions and, then, we underline the potential of MQLV on generated Monte Carlo simulations. Finally, MQLV is the first Q-learning Vasicek-based methodology addressing transparent decision making processes in retail banking.
Tasks	Decision Making, Q-Learning, Time Series
Published	2019-05-24
URL	https://arxiv.org/abs/1905.12567v3
PDF	https://arxiv.org/pdf/1905.12567v3.pdf
PWC	https://paperswithcode.com/paper/190512567
Repo
Framework
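
The two simulation ingredients named in the abstract, a mean-reverting Vasicek process and a Heaviside "digital" function for event probabilities, can be sketched as below. It uses a plain Euler-Maruyama discretisation and a Monte Carlo average; it does not include the Q-learning part, and all parameter values and function names are hypothetical.

```python
import math
import random

def simulate_vasicek(x0, a, b, sigma, dt, steps, rng):
    """Euler-Maruyama path of dX = a (b - X) dt + sigma dW."""
    x, path = x0, [x0]
    for _ in range(steps):
        x += a * (b - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def heaviside(value):
    """Discrete step function: 1 when value >= 0, else 0."""
    return 1.0 if value >= 0.0 else 0.0

def event_probability(threshold, n_paths, rng, **params):
    """Monte Carlo estimate of P(X_T <= threshold) via the step function."""
    hits = sum(
        heaviside(threshold - simulate_vasicek(rng=rng, **params)[-1])
        for _ in range(n_paths)
    )
    return hits / n_paths

rng = random.Random(42)
params = dict(x0=0.0, a=1.0, b=1.0, sigma=0.3, dt=0.1, steps=50)
p_low = event_probability(0.8, 200, rng, **params)
```

Here `p_low` estimates how often the simulated quantity ends at or below 0.8, the kind of threshold event (such as a payment default) that the digital function is meant to capture.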

Nested Conformal Prediction and the Generalized Jackknife+


Title	Nested Conformal Prediction and the Generalized Jackknife+
Authors	Arun K. Kuchibhotla, Aaditya K. Ramdas
Abstract	We provide an alternate unified framework for conformal prediction, which is a framework to provide assumption-free prediction intervals. Instead of beginning by choosing a conformity score, our framework starts with a sequence of nested sets $\{\mathcal{F}_t(x)\}_{t\in\mathcal{T}}$ for some ordered set $\mathcal{T}$ that specifies all potential prediction sets. We show that most proposed conformity scores in the literature, including several based on quantiles, straightforwardly result in nested families. Then, we argue that what conformal prediction does is find a mapping $\alpha \mapsto t(\alpha)$, meaning that it calibrates or rescales $\mathcal{T}$ to $[0,1]$. Nestedness is a natural and intuitive requirement because the optimal prediction sets (e.g., level sets of conditional densities) are also nested, but we also formally prove that nested sets are universal, meaning that any conformal prediction method can be represented in our framework. Finally, to demonstrate its utility, we show how to develop the full conformal, split conformal, cross-conformal and the recent jackknife+ methods within our nested framework, thus immediately generalizing the latter two classes of methods to new settings. Specifically, we prove the validity of the leave-one-out, $K$-fold, subsampling and bootstrap variants of the latter two methods for any nested family.
Tasks
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10562v1
PDF	https://arxiv.org/pdf/1910.10562v1.pdf
PWC	https://paperswithcode.com/paper/nested-conformal-prediction-and-the
Repo
Framework
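
As an illustration of the nested-set view, consider the simplest family $\mathcal{F}_t(x) = [\hat{\mu}(x) - t, \hat{\mu}(x) + t]$ with $t \ge 0$. The sketch below applies ordinary split conformal calibration to pick $t$; the predictor, the data, and the function names are assumptions for the example, and the paper's more general constructions are not covered.

```python
import math

def calibrate_threshold(predict, calibration, alpha):
    """Pick t so that F_t(x) = [predict(x) - t, predict(x) + t] is calibrated.

    For each calibration pair the score is the smallest t whose set
    contains y, i.e. |y - predict(x)|. The threshold is the
    ceil((1 - alpha)(n + 1))-th smallest score.
    """
    scores = sorted(abs(y - predict(x)) for x, y in calibration)
    n = len(scores)
    k = math.ceil((1.0 - alpha) * (n + 1))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]

def prediction_interval(predict, x, t):
    """Return the nested set F_t(x) as a (low, high) pair."""
    centre = predict(x)
    return (centre - t, centre + t)

# Hypothetical fitted model and held-out calibration data.
predict = lambda x: 2.0 * x
calibration = [(float(i), 3.0 * i) for i in range(1, 10)]  # residuals 1..9
t = calibrate_threshold(predict, calibration, alpha=0.25)
interval = prediction_interval(predict, 10.0, t)
```

Because the sets grow with $t$, choosing the threshold amounts to choosing which member of the nested family to report.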

Iterative subtraction method for Feature Ranking


Title	Iterative subtraction method for Feature Ranking
Authors	Paul Glaysher, Judith M. Katzy, Sitong An
Abstract	Training features used to analyse physical processes are often highly correlated and determining which ones are most important for the classification is a non-trivial task. For the use case of a search for a top-quark pair produced in association with a Higgs boson decaying to bottom-quarks at the LHC, we compare feature ranking methods for a classification BDT. Ranking methods, such as the BDT Selection Frequency commonly used in High Energy Physics and the Permutational Performance, are compared with the computationally expensive Iterative Addition and Iterative Removal procedures; the latter was found to be the most performant.
Tasks
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05718v1
PDF	https://arxiv.org/pdf/1906.05718v1.pdf
PWC	https://paperswithcode.com/paper/iterative-subtraction-method-for-feature
Repo
Framework
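
One plausible reading of an iterative removal procedure is sketched below: at each step, drop the feature whose absence hurts the score least, and rank features by how late they are dropped. The `score` callable stands in for retraining and evaluating a classifier on a feature subset; it, the toy weights, and the function name are hypothetical and not taken from the paper.

```python
def iterative_removal(features, score):
    """Rank features from most to least important by greedy removal."""
    remaining = list(features)
    removed = []
    while len(remaining) > 1:
        # Drop the feature whose absence leaves the highest score.
        candidate = max(
            remaining,
            key=lambda f: score([g for g in remaining if g != f]),
        )
        remaining.remove(candidate)
        removed.append(candidate)
    removed.extend(remaining)
    return removed[::-1]  # last survivor first

# Toy stand-in for "train a BDT on this subset and measure performance".
weights = {"a": 3.0, "b": 1.0, "c": 2.0}
toy_score = lambda subset: sum(weights[f] for f in subset)
ranking = iterative_removal(["a", "b", "c"], toy_score)
```

Each step re-evaluates every remaining feature, so the number of score evaluations grows quadratically with the number of features, which is why such procedures are computationally expensive.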

Anatomically Consistent Segmentation of Organs at Risk in MRI with Convolutional Neural Networks


Title	Anatomically Consistent Segmentation of Organs at Risk in MRI with Convolutional Neural Networks
Authors	Pawel Mlynarski, Hervé Delingette, Hamza Alghamdi, Pierre-Yves Bondiau, Nicholas Ayache
Abstract	Planning of radiotherapy involves accurate segmentation of a large number of organs at risk, i.e. organs for which irradiation doses should be minimized to avoid important side effects of the therapy. We propose a deep learning method for segmentation of organs at risk inside the brain region, from Magnetic Resonance (MR) images. Our system performs segmentation of eight structures: eye, lens, optic nerve, optic chiasm, pituitary gland, hippocampus, brainstem and brain. We propose an efficient algorithm to train neural networks for an end-to-end segmentation of multiple and non-exclusive classes, addressing problems related to computational costs and missing ground truth segmentations for a subset of classes. We enforce anatomical consistency of the result in a postprocessing step; in particular, we introduce a graph-based algorithm for segmentation of the optic nerves, enforcing the connectivity between the eyes and the optic chiasm. We report cross-validated quantitative results on a database of 44 contrast-enhanced T1-weighted MRIs with provided segmentations of the considered organs at risk, which were originally used for radiotherapy planning. In addition, the segmentations produced by our model on an independent test set of 50 MRIs are evaluated by an experienced radiotherapist in order to qualitatively assess their accuracy. The mean distances between produced segmentations and the ground truth ranged from 0.1 mm to 0.7 mm across different organs. A vast majority (96%) of the produced segmentations were found acceptable for radiotherapy planning.
Tasks
Published	2019-07-03
URL	https://arxiv.org/abs/1907.02003v1
PDF	https://arxiv.org/pdf/1907.02003v1.pdf
PWC	https://paperswithcode.com/paper/anatomically-consistent-segmentation-of
Repo
Framework

Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling


Title	Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling
Authors	Maxwell Crouse, Ibrahim Abdelaziz, Cristina Cornelio, Veronika Thost, Lingfei Wu, Kenneth Forbus, Achille Fokoue
Abstract	Recent advances in the integration of deep learning with automated theorem proving have centered around the representation of logical formulae as inputs to deep learning systems. In particular, there has been a growing interest in adapting structure-aware neural methods to work with the underlying graph representations of logical expressions. While more effective than character and token-level approaches, such methods have often made representational trade-offs that limited their ability to capture key structural properties of their inputs. In this work we propose a novel, LSTM-based approach for embedding logical formulae that is designed to overcome the representational limitations of prior approaches. Our proposed architecture works for logics of different expressivity; e.g., first-order and higher-order logic. We evaluate our approach on two standard datasets and show that the proposed architecture improves the performance of premise selection and proof step classification significantly compared to state-of-the-art.
Tasks	Automated Theorem Proving
Published	2019-11-15
URL	https://arxiv.org/abs/1911.06904v2
PDF	https://arxiv.org/pdf/1911.06904v2.pdf
PWC	https://paperswithcode.com/paper/improving-graph-neural-network
Repo
Framework

DV3+HED+: A DCNNs-based Framework to Monitor Temporary Works and ESAs in Railway Construction Project Using VHR Satellite Images


Title	DV3+HED+: A DCNNs-based Framework to Monitor Temporary Works and ESAs in Railway Construction Project Using VHR Satellite Images
Authors	Rui Guo, Ronghua Liu, Na Li, Wei Liu
Abstract	Current VHR (Very High Resolution) satellite images enable the detailed monitoring of the earth and can capture the ongoing works of railway construction. In this paper, we present an integrated framework applied to monitoring the railway construction in China, using QuickBird, GF-2 and Google Earth VHR satellite images. We also construct a novel DCNNs-based (Deep Convolutional Neural Networks) semantic segmentation network to label the temporary works such as borrow & spoil area, camp, beam yard and ESAs (Environmental Sensitive Areas) such as resident houses throughout the whole railway construction project using VHR satellite images. In addition, we employ an HED edge detection sub-network to refine the boundary details and an attention cross entropy loss function to address the class imbalance problem. Our semantic segmentation network is trained on 572 VHR true color images, and tested on 15 QuickBird true color images along the Ruichang-Jiujiang railway during 2015-2017. The experimental results show that, compared with the existing state-of-the-art approach, our approach brings clear improvements, with an overall accuracy of more than 80%.
Tasks	Edge Detection, Semantic Segmentation
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11080v1
PDF	https://arxiv.org/pdf/1908.11080v1.pdf
PWC	https://paperswithcode.com/paper/dv3hed-a-dcnns-based-framework-to-monitor
Repo
Framework

Contour Detection in Cassini ISS images based on Hierarchical Extreme Learning Machine and Dense Conditional Random Field


Title	Contour Detection in Cassini ISS images based on Hierarchical Extreme Learning Machine and Dense Conditional Random Field
Authors	Xiqi Yang, Qingfeng Zhang, Zhan Li
Abstract	In Cassini ISS (Imaging Science Subsystem) images, contour detection is often performed on disk-resolved objects to accurately locate their centers. Thus, contour detection is a key problem. Traditional edge detection methods, such as Canny and Roberts, often extract the contour with too much interior detail and noise. Although the deep convolutional neural network has been applied successfully in many image tasks, such as classification and object detection, it needs more time and computing resources. In this paper, a contour detection algorithm based on H-ELM (Hierarchical Extreme Learning Machine) and DenseCRF (Dense Conditional Random Field) is proposed for Cassini ISS images. The experimental results show that this algorithm performs better than traditional machine learning methods such as SVM and ELM, and even better than a deep convolutional neural network. The extracted contour is also closer to the actual contour. Moreover, it can be trained and tested quickly on a PC with a general-purpose configuration, so it can be applied to contour detection for Cassini ISS images.
Tasks	Contour Detection, Edge Detection, Object Detection
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08279v1
PDF	https://arxiv.org/pdf/1908.08279v1.pdf
PWC	https://paperswithcode.com/paper/contour-detection-in-cassini-iss-images-based
Repo
Framework

Quantifying the effect of representations on task complexity


Title	Quantifying the effect of representations on task complexity
Authors	Julian Zilly, Lorenz Hetzel, Andrea Censi, Emilio Frazzoli
Abstract	We examine the influence of input data representations on learning complexity. For learning, we posit that each model implicitly uses a candidate model distribution for unexplained variations in the data, its noise model. If the model distribution is not well aligned to the true distribution, then even relevant variations will be treated as noise. Crucially, however, the alignment of model and true distribution can be changed, albeit implicitly, by changing data representations. “Better” representations can better align the model to the true distribution, making it easier to approximate the input-output relationship in the data without discarding useful data variations. To quantify this alignment effect of data representations on the difficulty of a learning task, we make use of an existing task complexity score and show its connection to the representation-dependent information coding length of the input. Empirically, we extract the necessary statistics from a linear regression approximation and show that these are sufficient to predict relative learning performance outcomes of different data representations and neural network types obtained when utilizing an extensive neural network architecture search. We conclude that to ensure better learning outcomes, representations may need to be tailored to both task and model to align with the implicit distribution of model and task.
Tasks
Published	2019-12-19
URL	https://arxiv.org/abs/1912.09399v1
PDF	https://arxiv.org/pdf/1912.09399v1.pdf
PWC	https://paperswithcode.com/paper/quantifying-the-effect-of-representations-on
Repo
Framework