Paper Group AWR 157
Multi-View Stereo 3D Edge Reconstruction
Title | Multi-View Stereo 3D Edge Reconstruction |
Authors | Andrea Bignoli, Andrea Romanoni, Matteo Matteucci |
Abstract | This paper presents a novel method for the reconstruction of 3D edges in multi-view stereo scenarios. Previous research in the field typically relied on video sequences and limited the reconstruction process to either straight line-segments, or edge-points, i.e., 3D points that correspond to image edges. We instead propose a system, denoted as EdgeGraph3D, able to recover both straight and curved 3D edges from an unordered image sequence. A second contribution of this work is a graph-based representation for 2D edges that allows the identification of the most structurally significant edges detected in an image. We integrate EdgeGraph3D in a multi-view stereo reconstruction pipeline and analyze the benefits provided by 3D edges to the accuracy of the recovered surfaces. We evaluate the effectiveness of our approach on multiple datasets from two different collections in the multi-view stereo literature. Experimental results demonstrate the ability of EdgeGraph3D to work in presence of strong illumination changes and reflections, which are usually detrimental to the effectiveness of classical photometric reconstruction systems. |
Tasks | |
Published | 2018-01-17 |
URL | http://arxiv.org/abs/1801.05606v1 |
PDF | http://arxiv.org/pdf/1801.05606v1.pdf |
PWC | https://paperswithcode.com/paper/multi-view-stereo-3d-edge-reconstruction |
Repo | https://github.com/abignoli/EdgeGraph3D |
Framework | none |
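The abstract mentions a graph-based representation of 2D edges but does not describe its construction, so the following is only a minimal sketch of one plausible reading: nodes are edge pixels from a Canny map and graph edges connect 8-neighbouring edge pixels. OpenCV, NetworkX, the `edge_graph` function, and its thresholds are illustrative assumptions, not the authors' implementation.

```python
import cv2
import networkx as nx
import numpy as np

def edge_graph(image_path, low=50, high=150):
    """Toy 2D edge graph: nodes are Canny edge pixels, edges join 8-connected neighbours."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    edge_map = cv2.Canny(img, low, high)                 # binary edge map
    ys, xs = np.nonzero(edge_map)
    g = nx.Graph()
    g.add_nodes_from(zip(map(int, xs), map(int, ys)))
    for x, y in list(g.nodes):
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx or dy) and (x + dx, y + dy) in g:
                    g.add_edge((x, y), (x + dx, y + dy))
    return g

# One crude proxy for "structurally significant" edges: the largest connected chains.
# chains = sorted(nx.connected_components(edge_graph("view_01.png")), key=len, reverse=True)
```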
An Optical Flow-Based Approach for Minimally-Divergent Velocimetry Data Interpolation
Title | An Optical Flow-Based Approach for Minimally-Divergent Velocimetry Data Interpolation |
Authors | Berkay Kanberoglu, Dhritiman Das, Priya Nair, Pavan Turaga, David Frakes |
Abstract | Three-dimensional (3D) biomedical image sets are often acquired with in-plane pixel spacings that are far less than the out-of-plane spacings between images. The resultant anisotropy, which can be detrimental in many applications, can be decreased using image interpolation. Optical flow and/or other registration-based interpolators have proven useful in such interpolation roles in the past. When acquired images are comprised of signals that describe the flow velocity of fluids, additional information is available to guide the interpolation process. In this paper, we present an optical-flow based framework for image interpolation that also minimizes resultant divergence in the interpolated data. |
Tasks | Optical Flow Estimation |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08882v1 |
PDF | http://arxiv.org/pdf/1812.08882v1.pdf |
PWC | https://paperswithcode.com/paper/an-optical-flow-based-approach-for-minimally |
Repo | https://github.com/berk-github/OF_interp |
Framework | none |
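The abstract says the interpolator combines an optical-flow/registration criterion with a penalty on the divergence of the interpolated velocity data, but does not reproduce the objective. A generic, hedged form of such an energy, with assumed symbols ($I_k$, $I_{k+1}$ neighbouring slices, $\mathbf{u}$ the estimated displacement field, $\mathbf{v}$ the interpolated velocity field, $\lambda$ a trade-off weight), could look like:

```latex
E(\mathbf{u}, \mathbf{v}) \;=\;
  \sum_{\mathbf{x}} \big\| I_{k+1}\big(\mathbf{x} + \mathbf{u}(\mathbf{x})\big) - I_{k}(\mathbf{x}) \big\|^2
  \;+\; \lambda \sum_{\mathbf{x}} \big( \nabla \cdot \mathbf{v}(\mathbf{x}) \big)^2
```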
Computing Kantorovich-Wasserstein Distances on $d$-dimensional histograms using $(d+1)$-partite graphs
Title | Computing Kantorovich-Wasserstein Distances on $d$-dimensional histograms using $(d+1)$-partite graphs |
Authors | Gennaro Auricchio, Federico Bassetti, Stefano Gualandi, Marco Veneroni |
Abstract | This paper presents a novel method to compute the exact Kantorovich-Wasserstein distance between a pair of $d$-dimensional histograms having $n$ bins each. We prove that this problem is equivalent to an uncapacitated minimum cost flow problem on a $(d+1)$-partite graph with $(d+1)n$ nodes and $dn^{\frac{d+1}{d}}$ arcs, whenever the cost is separable along the principal $d$-dimensional directions. We show numerically the benefits of our approach by computing the Kantorovich-Wasserstein distance of order 2 among two sets of instances: gray scale images and $d$-dimensional biomedical histograms. On these types of instances, our approach is competitive with state-of-the-art optimal transport algorithms. |
Tasks | |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07416v2 |
PDF | http://arxiv.org/pdf/1805.07416v2.pdf |
PWC | https://paperswithcode.com/paper/computing-kantorovich-wasserstein-distances-1 |
Repo | https://github.com/stegua/dpartion-nips2018 |
Framework | none |
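For context, the quantity the paper computes exactly is the standard discrete Kantorovich-Wasserstein problem below; the contribution is an equivalent uncapacitated min-cost flow formulation on a $(d+1)$-partite graph when the ground cost separates along the $d$ coordinate directions. The notation here is the usual one for discrete optimal transport, not necessarily the paper's.

```latex
W_p(\mu, \nu)^p \;=\; \min_{\pi \ge 0} \sum_{i,j} \pi_{ij}\, \| x_i - y_j \|^p
\quad \text{s.t.} \quad
\sum_j \pi_{ij} = \mu_i, \qquad \sum_i \pi_{ij} = \nu_j
```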
Infrared and Visible Image Fusion using a Deep Learning Framework
Title | Infrared and Visible Image Fusion using a Deep Learning Framework |
Authors | Hui Li, Xiao-Jun Wu, Josef Kittler |
Abstract | In recent years, deep learning has become a very active research tool which is used in many image processing fields. In this paper, we propose an effective image fusion method using a deep learning framework to generate a single image which contains all the features from infrared and visible images. First, the source images are decomposed into base parts and detail content. Then the base parts are fused by weighted-averaging. For the detail content, we use a deep learning network to extract multi-layer features. Using these features, we use l_1-norm and weighted-average strategy to generate several candidates of the fused detail content. Once we get these candidates, the max selection strategy is used to get final fused detail content. Finally, the fused image will be reconstructed by combining the fused base part and detail content. The experimental results demonstrate that our proposed method achieves state-of-the-art performance in both objective assessment and visual quality. The Code of our fusion method is available at https://github.com/hli1221/imagefusion_deeplearning |
Tasks | Infrared And Visible Image Fusion |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.06992v4 |
PDF | http://arxiv.org/pdf/1804.06992v4.pdf |
PWC | https://paperswithcode.com/paper/infrared-and-visible-image-fusion-using-a |
Repo | https://github.com/hli1221/imagefusion_deeplearning |
Framework | pytorch |
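A minimal sketch of the base/detail pipeline the abstract describes, written in PyTorch since that is the listed framework. The box-blur decomposition, equal base weights, VGG-19 as the deep feature extractor, and the chosen layer indices are all assumptions for illustration; the authors' code at the Repo link is the reference implementation.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights  # torchvision >= 0.13

def fuse(ir, vis, layers=(3, 8), eps=1e-8):
    """ir, vis: grayscale source tensors of shape (1, 1, H, W) in [0, 1]."""
    def blur(x, k=31):
        return F.avg_pool2d(F.pad(x, (k // 2,) * 4, mode='reflect'), k, stride=1)

    # 1) Decompose each source into a base part and detail content.
    base_ir, base_vis = blur(ir), blur(vis)
    det_ir, det_vis = ir - base_ir, vis - base_vis

    # 2) Fuse the base parts by (here, equal) weighted averaging.
    base_fused = 0.5 * base_ir + 0.5 * base_vis

    # 3) Deep features of the detail content -> channel-wise l1 activity maps ->
    #    soft fusion weights; one fused-detail candidate per layer, then max selection.
    feat_net = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
    candidates = []
    for L in layers:
        sub = feat_net[:L + 1]
        with torch.no_grad():
            a_ir = sub(det_ir.repeat(1, 3, 1, 1)).abs().sum(1, keepdim=True)
            a_vis = sub(det_vis.repeat(1, 3, 1, 1)).abs().sum(1, keepdim=True)
        a_ir = F.interpolate(a_ir, size=ir.shape[-2:], mode='bilinear', align_corners=False)
        a_vis = F.interpolate(a_vis, size=ir.shape[-2:], mode='bilinear', align_corners=False)
        w_ir = a_ir / (a_ir + a_vis + eps)
        candidates.append(w_ir * det_ir + (1 - w_ir) * det_vis)
    det_fused = torch.stack(candidates).max(dim=0).values

    # 4) Reconstruct the fused image from fused base and fused detail.
    return base_fused + det_fused
```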
Remember and Forget for Experience Replay
Title | Remember and Forget for Experience Replay |
Authors | Guido Novati, Petros Koumoutsakos |
Abstract | Experience replay (ER) is a fundamental component of off-policy deep reinforcement learning (RL). ER recalls experiences from past iterations to compute gradient estimates for the current policy, increasing data-efficiency. However, the accuracy of such updates may deteriorate when the policy diverges from past behaviors and can undermine the performance of ER. Many algorithms mitigate this issue by tuning hyper-parameters to slow down policy changes. An alternative is to actively enforce the similarity between policy and the experiences in the replay memory. We introduce Remember and Forget Experience Replay (ReF-ER), a novel method that can enhance RL algorithms with parameterized policies. ReF-ER (1) skips gradients computed from experiences that are too unlikely with the current policy and (2) regulates policy changes within a trust region of the replayed behaviors. We couple ReF-ER with Q-learning, deterministic policy gradient and off-policy gradient methods. We find that ReF-ER consistently improves the performance of continuous-action, off-policy RL on fully observable benchmarks and partially observable flow control problems. |
Tasks | Policy Gradient Methods, Q-Learning |
Published | 2018-07-16 |
URL | https://arxiv.org/abs/1807.05827v4 |
PDF | https://arxiv.org/pdf/1807.05827v4.pdf |
PWC | https://paperswithcode.com/paper/remember-and-forget-for-experience-replay |
Repo | https://github.com/cselab/smarties |
Framework | none |
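A hedged sketch of the two ReF-ER rules as summarized in the abstract: skip gradients from replayed experiences that are too unlikely under the current policy, and penalize policy drift away from the replayed behaviors. The ratio bound `c_max`, the coefficient `beta`, and the sample-based KL estimate are illustrative choices, not the paper's exact formulation.

```python
import torch

def refer_mask_and_penalty(logp_new, logp_behavior, c_max=4.0, beta=0.2):
    """logp_new:      log pi(a|s) of the current policy for replayed state-action pairs.
    logp_behavior: log mu(a|s) recorded when each experience was stored."""
    rho = torch.exp(logp_new - logp_behavior)            # importance ratio pi/mu
    near_policy = (rho > 1.0 / c_max) & (rho < c_max)    # rule (1): keep only "near-policy" samples

    # Rule (2): pull the policy back toward the replayed behaviors via a simple
    # sample-based estimate of KL(mu || pi) (the replayed actions were drawn from mu).
    kl_to_behavior = (logp_behavior - logp_new).mean()
    return near_policy.float(), beta * kl_to_behavior

# Schematic use inside an off-policy policy-gradient update:
# mask, penalty = refer_mask_and_penalty(logp_new, logp_behavior)
# loss = -(mask * advantages * logp_new).mean() + penalty
```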
A mathematical theory of semantic development in deep neural networks
Title | A mathematical theory of semantic development in deep neural networks |
Authors | Andrew M. Saxe, James L. McClelland, Surya Ganguli |
Abstract | An extensive body of empirical research has revealed remarkable regularities in the acquisition, organization, deployment, and neural representation of human semantic knowledge, thereby raising a fundamental conceptual question: what are the theoretical principles governing the ability of neural networks to acquire, organize, and deploy abstract knowledge by integrating across many individual experiences? We address this question by mathematically analyzing the nonlinear dynamics of learning in deep linear networks. We find exact solutions to this learning dynamics that yield a conceptual explanation for the prevalence of many disparate phenomena in semantic cognition, including the hierarchical differentiation of concepts through rapid developmental transitions, the ubiquity of semantic illusions between such transitions, the emergence of item typicality and category coherence as factors controlling the speed of semantic processing, changing patterns of inductive projection over development, and the conservation of semantic similarity in neural representations across species. Thus, surprisingly, our simple neural model qualitatively recapitulates many diverse regularities underlying semantic development, while providing analytic insight into how the statistical structure of an environment can interact with nonlinear deep learning dynamics to give rise to these regularities. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.10531v1 |
PDF | http://arxiv.org/pdf/1810.10531v1.pdf |
PWC | https://paperswithcode.com/paper/a-mathematical-theory-of-semantic-development |
Repo | https://github.com/bhoov/ngd-vis |
Framework | none |
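A toy reproduction of the setting the paper analyzes: a two-layer *linear* network trained by gradient descent on an input-output correlation matrix, whose singular-value modes are learned one after another in rapid, stage-like transitions (the hierarchical differentiation the abstract refers to). The data, step size, and dimensions below are illustrative choices, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 8, 16, 8
sigma = np.diag([4.0, 2.0, 1.0, 0.5] + [0.0] * 4)   # input-output correlations with a few strong modes

W1 = 1e-3 * rng.standard_normal((n_hidden, n_in))
W2 = 1e-3 * rng.standard_normal((n_out, n_hidden))
lr, steps = 0.02, 4000

for t in range(steps):
    # Gradient descent on the squared error with whitened inputs:
    # dW1 ∝ W2^T (Sigma - W2 W1),  dW2 ∝ (Sigma - W2 W1) W1^T
    err = sigma - W2 @ W1
    W1 += lr * W2.T @ err
    W2 += lr * err @ W1.T
    if t % 800 == 0:
        modes = np.linalg.svd(W2 @ W1, compute_uv=False)[:4]
        print(t, np.round(modes, 2))   # modes switch on sequentially, strongest first
```

With a small initialization, the product W2 W1 picks up the strongest correlation mode first and the weaker ones in order, which is the qualitative behaviour the exact solutions describe.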
Bayesian Metabolic Flux Analysis reveals intracellular flux couplings
Title | Bayesian Metabolic Flux Analysis reveals intracellular flux couplings |
Authors | Markus Heinonen, Maria Osmala, Henrik Mannerström, Janne Wallenius, Samuel Kaski, Juho Rousu, Harri Lähdesmäki |
Abstract | Metabolic flux balance analyses are a standard tool in analysing metabolic reaction rates compatible with measurements, steady-state and the metabolic reaction network stoichiometry. Flux analysis methods commonly place unrealistic assumptions on fluxes due to the convenience of formulating the problem as a linear programming model, and most methods ignore the notable uncertainty in flux estimates. We introduce a novel paradigm of Bayesian metabolic flux analysis that models the reactions of the whole genome-scale cellular system in probabilistic terms, and can infer the full flux vector distribution of genome-scale metabolic systems based on exchange and intracellular (e.g. 13C) flux measurements, steady-state assumptions, and target function assumptions. The Bayesian model couples all fluxes jointly together in a simple truncated multivariate posterior distribution, which reveals informative flux couplings. Our model is a plug-in replacement to conventional metabolic balance methods, such as flux balance analysis (FBA). Our experiments indicate that we can characterise the genome-scale flux covariances, reveal flux couplings, and determine more intracellular unobserved fluxes in C. acetobutylicum from 13C data than flux variability analysis. The COBRA compatible software is available at github.com/markusheinonen/bamfa |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06673v1 |
PDF | http://arxiv.org/pdf/1804.06673v1.pdf |
PWC | https://paperswithcode.com/paper/bayesian-metabolic-flux-analysis-reveals |
Repo | https://github.com/markusheinonen/bamfa |
Framework | none |
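A hedged sketch of the kind of probabilistic model the abstract describes: fluxes $v$ of a network with stoichiometry $S$ get a prior, some fluxes are measured through a mapping $A$, and the steady state $Sv = 0$ is imposed softly, so the posterior over $v$ is a truncated multivariate Gaussian that couples all fluxes. The symbols $\mu_0, \Sigma_0, \epsilon, \sigma, A, l, u$ are illustrative, not the paper's notation.

```latex
v \sim \mathcal{N}(\mu_0, \Sigma_0) \ \text{truncated to } l \le v \le u,
\qquad S v \sim \mathcal{N}(0, \epsilon^2 I),
\qquad y \mid v \sim \mathcal{N}(A v, \sigma^2 I)
```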
Cellular automata as convolutional neural networks
Title | Cellular automata as convolutional neural networks |
Authors | William Gilpin |
Abstract | Deep learning techniques have recently demonstrated broad success in predicting complex dynamical systems ranging from turbulence to human speech, motivating broader questions about how neural networks encode and represent dynamical rules. We explore this problem in the context of cellular automata (CA), simple dynamical systems that are intrinsically discrete and thus difficult to analyze using standard tools from dynamical systems theory. We show that any CA may readily be represented using a convolutional neural network with a network-in-network architecture. This motivates our development of a general convolutional multilayer perceptron architecture, which we find can learn the dynamical rules for arbitrary CA when given videos of the CA as training data. In the limit of large network widths, we find that training dynamics are nearly identical across replicates, and that common patterns emerge in the structure of networks trained on different CA rulesets. We train ensembles of networks on randomly-sampled CA, and we probe how the trained networks internally represent the CA rules using an information-theoretic technique based on distributions of layer activation patterns. We find that CA with simpler rule tables produce trained networks with hierarchical structure and layer specialization, while more complex CA produce shallower representations—illustrating how the underlying complexity of the CA’s rules influences the specificity of these internal representations. Our results suggest how the entropy of a physical process can affect its representation when learned by neural networks. |
Tasks | |
Published | 2018-09-09 |
URL | https://arxiv.org/abs/1809.02942v2 |
PDF | https://arxiv.org/pdf/1809.02942v2.pdf |
PWC | https://paperswithcode.com/paper/cellular-automata-as-convolutional-neural |
Repo | https://github.com/Ahmore/isz_project |
Framework | tf |
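An illustration of the abstract's claim that a CA step can be written as a convolution followed by per-cell (network-in-network style) logic, using Conway's Game of Life as the example rule set; this is a generic demonstration, not the paper's architecture or training setup.

```python
import numpy as np
from scipy.signal import convolve2d

KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])          # 3x3 neighbour count: the "conv layer"

def life_step(grid):
    neighbours = convolve2d(grid, KERNEL, mode="same", boundary="wrap")
    # Per-cell rule applied to (state, neighbour count): the pointwise 1x1 part.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(np.uint8)

grid = (np.random.default_rng(0).random((32, 32)) < 0.3).astype(np.uint8)
for _ in range(10):
    grid = life_step(grid)
print(grid.sum(), "live cells after 10 steps")
```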
NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention
Title | NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention |
Authors | Christos Baziotis, Nikos Athanasiou, Georgios Paraskevopoulos, Nikolaos Ellinas, Athanasia Kolovou, Alexandros Potamianos |
Abstract | In this paper we present a deep-learning model that competed at SemEval-2018 Task 2 “Multilingual Emoji Prediction”. We participated in subtask A, in which we are called to predict the most likely associated emoji in English tweets. The proposed architecture relies on a Long Short-Term Memory network, augmented with an attention mechanism, that conditions the weight of each word, on a “context vector” which is taken as the aggregation of a tweet’s meaning. Moreover, we initialize the embedding layer of our model, with word2vec word embeddings, pretrained on a dataset of 550 million English tweets. Finally, our model does not rely on hand-crafted features or lexicons and is trained end-to-end with back-propagation. We ranked 2nd out of 48 teams. |
Tasks | Word Embeddings |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06657v1 |
PDF | http://arxiv.org/pdf/1804.06657v1.pdf |
PWC | https://paperswithcode.com/paper/ntua-slp-at-semeval-2018-task-2-predicting |
Repo | https://github.com/cbaziotis/ntua-slp-semeval2018 |
Framework | pytorch |
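A minimal PyTorch sketch of an LSTM classifier with context-aware attention of the kind the abstract describes: a learned context vector scores each word's hidden state and the tweet representation is the attention-weighted aggregation. Layer sizes, the exact attention form, and the class count are assumptions; the authors' repo is the reference implementation.

```python
import torch
import torch.nn as nn

class AttentiveLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden=150, n_classes=20):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)          # could be initialized with word2vec
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.att_proj = nn.Linear(2 * hidden, 2 * hidden)
        self.context = nn.Parameter(torch.randn(2 * hidden))  # learned "context vector"
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, tokens):                   # tokens: (batch, seq_len) int64
        h, _ = self.lstm(self.emb(tokens))       # (batch, seq_len, 2*hidden)
        scores = torch.tanh(self.att_proj(h)) @ self.context    # (batch, seq_len)
        alpha = torch.softmax(scores, dim=1).unsqueeze(-1)
        tweet_repr = (alpha * h).sum(dim=1)      # attention-weighted aggregation
        return self.out(tweet_repr)

logits = AttentiveLSTMClassifier(vocab_size=5000)(torch.randint(0, 5000, (4, 30)))
```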
Unsupervised Domain Adaptive Re-Identification: Theory and Practice
Title | Unsupervised Domain Adaptive Re-Identification: Theory and Practice |
Authors | Liangchen Song, Cheng Wang, Lefei Zhang, Bo Du, Qian Zhang, Chang Huang, Xinggang Wang |
Abstract | We study the problem of unsupervised domain adaptive re-identification (re-ID) which is an active topic in computer vision but lacks a theoretical foundation. We first extend existing unsupervised domain adaptive classification theories to re-ID tasks. Concretely, we introduce some assumptions on the extracted feature space and then derive several loss functions guided by these assumptions. To optimize them, a novel self-training scheme for unsupervised domain adaptive re-ID tasks is proposed. It iteratively makes guesses for unlabeled target data based on an encoder and trains the encoder based on the guessed labels. Extensive experiments on unsupervised domain adaptive person re-ID and vehicle re-ID tasks with comparisons to the state-of-the-arts confirm the effectiveness of the proposed theories and self-training framework. Our code is available at https://github.com/LcDog/DomainAdaptiveReID. |
Tasks | |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11334v1 |
PDF | http://arxiv.org/pdf/1807.11334v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptive-re |
Repo | https://github.com/FlyingRoastDuck/ACT_AAAI20 |
Framework | pytorch |
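A schematic sketch of the iterative self-training loop the abstract describes: embed unlabeled target data, guess labels, retrain the encoder on those guesses, and repeat. K-means is used here only as a stand-in label-guessing step, and `encoder_fit` / `encoder_embed` are assumed callables; the paper's derived loss functions and label-guessing procedure differ in detail.

```python
from sklearn.cluster import KMeans

def self_train(encoder_fit, encoder_embed, target_images, rounds=5, n_pseudo_ids=500):
    """encoder_embed(images) -> (N, d) feature array; encoder_fit(images, labels) trains the encoder."""
    for _ in range(rounds):
        feats = encoder_embed(target_images)                              # embed unlabeled target data
        pseudo = KMeans(n_clusters=n_pseudo_ids, n_init=10).fit_predict(feats)  # guess identities
        encoder_fit(target_images, pseudo)                                # train on the guessed labels
    return encoder_embed
```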
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
Title | Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees |
Authors | Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma |
Abstract | Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL. However, the theoretical understanding of such methods has been rather limited. This paper introduces a novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees. We design a meta-algorithm with a theoretical guarantee of monotone improvement to a local maximum of the expected reward. The meta-algorithm iteratively builds a lower bound of the expected reward based on the estimated dynamical model and sample trajectories, and then maximizes the lower bound jointly over the policy and the model. The framework extends the optimism-in-face-of-uncertainty principle to non-linear dynamical models in a way that requires *no explicit* uncertainty quantification. Instantiating our framework with simplification gives a variant of model-based RL algorithms, Stochastic Lower Bounds Optimization (SLBO). Experiments demonstrate that SLBO achieves state-of-the-art performance when only one million or fewer samples are permitted on a range of continuous control benchmark tasks. |
Tasks | Continuous Control |
Published | 2018-07-10 |
URL | http://arxiv.org/abs/1807.03858v4 |
PDF | http://arxiv.org/pdf/1807.03858v4.pdf |
PWC | https://paperswithcode.com/paper/algorithmic-framework-for-model-based-deep |
Repo | https://github.com/roosephu/slbo |
Framework | tf |
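A hedged sketch of the meta-algorithm's structure as described in the abstract: at iteration $k$, the return under the true dynamics $M^{\star}$ is lower-bounded by the return under the estimated model $\widehat{M}$ minus a discrepancy term $D$ measured on trajectories from the current policy $\pi_k$, and the bound is then maximized jointly over policy and model. The notation is illustrative, not the paper's exact bound.

```latex
V^{\pi}(M^{\star}) \;\ge\; V^{\pi}(\widehat{M}) - D_{\pi_k}\!\big(\widehat{M}, \pi\big),
\qquad
(\pi_{k+1}, \widehat{M}_{k+1}) \;=\; \arg\max_{\pi,\, \widehat{M}}
\Big[\, V^{\pi}(\widehat{M}) - D_{\pi_k}\!\big(\widehat{M}, \pi\big) \Big]
```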
CollaboNet: collaboration of deep neural networks for biomedical named entity recognition
Title | CollaboNet: collaboration of deep neural networks for biomedical named entity recognition |
Authors | Wonjin Yoon, Chan Ho So, Jinhyuk Lee, Jaewoo Kang |
Abstract | Background: Finding biomedical named entities is one of the most essential tasks in biomedical text mining. Recently, deep learning-based approaches have been applied to biomedical named entity recognition (BioNER) and showed promising results. However, as deep learning approaches need an abundant amount of training data, a lack of data can hinder performance. BioNER datasets are scarce resources and each dataset covers only a small subset of entity types. Furthermore, many bio entities are polysemous, which is one of the major obstacles in named entity recognition. Results: To address the lack of data and the entity type misclassification problem, we propose CollaboNet which utilizes a combination of multiple NER models. In CollaboNet, models trained on a different dataset are connected to each other so that a target model obtains information from other collaborator models to reduce false positives. Every model is an expert on their target entity type and takes turns serving as a target and a collaborator model during training time. The experimental results show that CollaboNet can be used to greatly reduce the number of false positives and misclassified entities including polysemous words. CollaboNet achieved state-of-the-art performance in terms of precision, recall and F1 score. Conclusions: We demonstrated the benefits of combining multiple models for BioNER. Our model has successfully reduced the number of misclassified entities and improved the performance by leveraging multiple datasets annotated for different entity types. Given the state-of-the-art performance of our model, we believe that CollaboNet can improve the accuracy of downstream biomedical text mining applications such as bio-entity relation extraction. |
Tasks | Named Entity Recognition, Relation Extraction |
Published | 2018-09-21 |
URL | https://arxiv.org/abs/1809.07950v2 |
PDF | https://arxiv.org/pdf/1809.07950v2.pdf |
PWC | https://paperswithcode.com/paper/collabonet-collaboration-of-deep-neural |
Repo | https://github.com/ncbi-nlp/BLUE_Benchmark |
Framework | none |
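A heavily hedged sketch of the idea in the abstract: the target NER model receives, for every token, the predictions of collaborator models trained on other entity types as extra input features. The concatenation scheme, BiLSTM backbone, and all sizes below are assumptions for illustration; the paper's exact wiring between target and collaborator models may differ.

```python
import torch
import torch.nn as nn

class TargetWithCollaborators(nn.Module):
    def __init__(self, emb_dim=100, hidden=100, n_labels=5, n_collab=3, collab_labels=5):
        super().__init__()
        in_dim = emb_dim + n_collab * collab_labels
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, token_emb, collab_probs):
        # token_emb: (B, T, emb_dim); collab_probs: list of (B, T, collab_labels) per collaborator
        x = torch.cat([token_emb, *collab_probs], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)                       # per-token label scores for the target entity type

# Toy usage with random tensors standing in for embeddings and collaborator outputs.
B, T = 2, 12
model = TargetWithCollaborators()
emb = torch.randn(B, T, 100)
collab = [torch.softmax(torch.randn(B, T, 5), dim=-1) for _ in range(3)]
scores = model(emb, collab)                      # (2, 12, 5)
```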
$\rho$-hot Lexicon Embedding-based Two-level LSTM for Sentiment Analysis
Title | $\rho$-hot Lexicon Embedding-based Two-level LSTM for Sentiment Analysis |
Authors | Ou Wu, Tao Yang, Mengyang Li, Ming Li |
Abstract | Sentiment analysis is a key component in various text mining applications. Numerous sentiment classification techniques, including conventional and deep learning-based methods, have been proposed in the literature. In most existing methods, a high-quality training set is assumed to be given. Nevertheless, constructing a high-quality training set that consists of highly accurate labels is challenging in real applications. This difficulty stems from the fact that text samples usually contain complex sentiment representations, and their annotation is subjective. We address this challenge in this study by leveraging a new labeling strategy and utilizing a two-level long short-term memory network to construct a sentiment classifier. Lexical cues are useful for sentiment analysis, and they have been utilized in conventional studies. For example, polar and privative words play important roles in sentiment analysis. A new encoding strategy, that is, $\rho$-hot encoding, is proposed to alleviate the drawbacks of one-hot encoding and thus effectively incorporate useful lexical cues. We compile three Chinese data sets on the basis of our label strategy and proposed methodology. Experiments on the three data sets demonstrate that the proposed method outperforms state-of-the-art algorithms. |
Tasks | Sentiment Analysis |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07771v1 |
PDF | http://arxiv.org/pdf/1803.07771v1.pdf |
PWC | https://paperswithcode.com/paper/-hot-lexicon-embedding-based-two-level-lstm |
Repo | https://github.com/Tju-AI/two-stage-labeling-for-the-sentiment-orientations |
Framework | none |
AutoSense Model for Word Sense Induction
Title | AutoSense Model for Word Sense Induction |
Authors | Reinald Kim Amplayo, Seung-won Hwang, Min Song |
Abstract | Word sense induction (WSI), or the task of automatically discovering multiple senses or meanings of a word, has three main challenges: domain adaptability, novel sense detection, and sense granularity flexibility. While current latent variable models are known to solve the first two challenges, they are not flexible to different word sense granularities, which differ very much among words, from aardvark with one sense, to play with over 50 senses. Current models either require hyperparameter tuning or nonparametric induction of the number of senses, which we find both to be ineffective. Thus, we aim to eliminate these requirements and solve the sense granularity problem by proposing AutoSense, a latent variable model based on two observations: (1) senses are represented as a distribution over topics, and (2) senses generate pairings between the target word and its neighboring word. These observations alleviate the problem by (a) throwing garbage senses and (b) additionally inducing fine-grained word senses. Results show great improvements over the state-of-the-art models on popular WSI datasets. We also show that AutoSense is able to learn the appropriate sense granularity of a word. Finally, we apply AutoSense to the unsupervised author name disambiguation task where the sense granularity problem is more evident and show that AutoSense is evidently better than competing models. We share our data and code here: https://github.com/rktamplayo/AutoSense. |
Tasks | Latent Variable Models, Word Sense Induction |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.09242v1 |
PDF | http://arxiv.org/pdf/1811.09242v1.pdf |
PWC | https://paperswithcode.com/paper/autosense-model-for-word-sense-induction |
Repo | https://github.com/rktamplayo/AutoSense |
Framework | none |
Unsupervised Training for 3D Morphable Model Regression
Title | Unsupervised Training for 3D Morphable Model Regression |
Authors | Kyle Genova, Forrester Cole, Aaron Maschinot, Aaron Sarna, Daniel Vlasic, William T. Freeman |
Abstract | We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results. |
Tasks | 3D Face Reconstruction |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.06098v1 |
PDF | http://arxiv.org/pdf/1806.06098v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-training-for-3d-morphable-model |
Repo | https://github.com/google/tf_mesh_renderer |
Framework | tf |
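Hedged sketches of the three training objectives named in the abstract, written as stand-alone tensor functions. The morphable-model prior is assumed here to be a standard normal over coefficients, and the feature tensors stand in for the outputs of a face-recognition network and a differentiable renderer, which are not shown; the exact loss definitions belong to the paper.

```python
import torch
import torch.nn.functional as F

def batch_distribution_loss(coeffs):
    """Encourage the batch of predicted coefficients to match an assumed N(0, I) prior."""
    return coeffs.mean(0).pow(2).mean() + (coeffs.var(0, unbiased=False) - 1).pow(2).mean()

def loopback_loss(coeffs, coeffs_from_render):
    """The network, re-applied to its own rendered output, should recover the same coefficients."""
    return F.mse_loss(coeffs_from_render, coeffs)

def multiview_identity_loss(photo_feat, render_feats):
    """Rendered views of the predicted face should share identity features with the input photo."""
    return torch.stack([1 - F.cosine_similarity(photo_feat, f, dim=-1).mean()
                        for f in render_feats]).mean()

# Schematic total loss (weights are illustrative):
# loss = batch_distribution_loss(c) + loopback_loss(c, c_loop) + multiview_identity_loss(f_photo, f_views)
```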