July 27, 2019

3344 words 16 mins read

Paper Group ANR 666

A dynamic connectome supports the emergence of stable computational function of neural circuits through reward-based learning. Generating Reflectance Curves from sRGB Triplets. Two-temperature logistic regression based on the Tsallis divergence. NSML: A Machine Learning Platform That Enables You to Focus on Your Models. Exchangeable choice function …

A dynamic connectome supports the emergence of stable computational function of neural circuits through reward-based learning

Title A dynamic connectome supports the emergence of stable computational function of neural circuits through reward-based learning
Authors David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh, Wolfgang Maass
Abstract Synaptic connections between neurons in the brain are dynamic because of continuously ongoing spine dynamics, axonal sprouting, and other processes. In fact, it was recently shown that the spontaneous, synapse-autonomous component of spine dynamics is at least as large as the component that depends on the history of pre- and postsynaptic neural activity. These data are inconsistent with common models for network plasticity and raise the questions of how neural circuits can maintain a stable computational function in spite of these continuously ongoing processes, and what functional uses these ongoing processes might have. Here, we present a rigorous theoretical framework for these seemingly stochastic spine dynamics and rewiring processes in the context of reward-based learning tasks. We show that spontaneous synapse-autonomous processes, in combination with reward signals such as dopamine, can explain the capability of networks of neurons in the brain to configure themselves for specific computational tasks, and to compensate automatically for later changes in the network or task. Furthermore, we show theoretically and through computer simulations that stable computational performance is compatible with continuously ongoing synapse-autonomous changes. After good computational performance has been reached, these changes cause primarily a slow drift of network architecture and dynamics in task-irrelevant dimensions, as observed for neural activity in motor cortex and other areas. On the more abstract level of reinforcement learning, the resulting model gives rise to an understanding of reward-driven network plasticity as continuous sampling of network configurations.
Tasks
Published 2017-04-13
URL http://arxiv.org/abs/1704.04238v4
PDF http://arxiv.org/pdf/1704.04238v4.pdf
PWC https://paperswithcode.com/paper/a-dynamic-connectome-supports-the-emergence
Repo
Framework
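The synapse-sampling idea lends itself to a compact simulation. Below is a minimal, hedged sketch (not the paper's exact equations): synaptic parameters follow Langevin dynamics combining a prior-gradient drift, a stand-in reward-gradient drift, and a diffusion term, and a synapse is functional only while its parameter is positive, so rewiring emerges from the random walk. All constants and the `reward_grad` signal are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_syn = 200        # number of potential synapses
beta = 0.01        # drift/learning-rate scale (assumed)
T = 0.1            # sampling temperature (assumed)
dt = 1.0           # Euler-Maruyama time step
theta = rng.normal(0.0, 1.0, n_syn)        # synaptic parameters

def prior_grad(theta, mu=-0.5, sigma=1.0):
    # Gradient of a log-Gaussian prior over synaptic parameters.
    return -(theta - mu) / sigma**2

task_direction = rng.choice([-1.0, 1.0], n_syn)

def reward_grad(theta):
    # Placeholder for a reward-modulated (e.g., dopamine-gated)
    # policy-gradient estimate; here a toy signal pulling a random
    # subset of synapses toward being active.
    return 0.5 * task_direction

for step in range(10_000):
    drift = beta * (prior_grad(theta) + reward_grad(theta))
    theta += drift * dt + np.sqrt(2 * beta * T * dt) * rng.normal(size=n_syn)

# A synapse is functional only while theta > 0; despite the permanent
# stochastic turnover, the *set* of strong synapses stays task-aligned.
active = theta > 0
print(f"{active.sum()} of {n_syn} potential synapses are active")
```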

Generating Reflectance Curves from sRGB Triplets

Title Generating Reflectance Curves from sRGB Triplets
Authors Scott Allen Burns
Abstract The color sensation evoked by an object depends on both the spectral power distribution of the illumination and the reflectance properties of the object being illuminated. The color sensation can be characterized by three color-space values, such as XYZ, RGB, HSV, L*a*b*, etc. It is straightforward to compute the three values given the illuminant and reflectance curves. The converse process of computing a reflectance curve given the color-space values and the illuminant is complicated by the fact that an infinite number of different reflectance curves can give rise to a single set of color-space values (metamerism). This paper presents five algorithms for generating a reflectance curve from a specified sRGB triplet, written for a general audience. The algorithms are designed to generate reflectance curves that are similar to those found with naturally occurring colored objects. The computed reflectance curves are compared to a database of thousands of reflectance curves measured from paints and pigments available both commercially and in nature, and the similarity is quantified. One particularly useful application of these algorithms is in the field of computer graphics, where modeling color transformations sometimes requires wavelength-specific information, such as when modeling subtractive color mixture.
Tasks Metamerism
Published 2017-10-11
URL https://arxiv.org/abs/1710.05732v5
PDF https://arxiv.org/pdf/1710.05732v5.pdf
PWC https://paperswithcode.com/paper/generating-reflectance-curves-from-srgb
Repo
Framework
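As a concrete illustration of the inverse problem, here is a hedged Python sketch in the least-squares spirit of the paper: invert the sRGB companding, then pick the smoothest reflectance curve whose projection matches the linear RGB values. The Gaussian `bump` sensitivities are toy stand-ins for the real illuminant-weighted CIE color matching functions, which real use would take from tabulated CIE data.

```python
import numpy as np

def srgb_to_linear(c):
    # Invert the sRGB companding (gamma) curve; input in 0..255.
    c = np.asarray(c, dtype=float) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

# Toy stand-ins for the illuminant-weighted color matching functions;
# real use would take tabulated CIE data instead of these Gaussians.
wl = np.linspace(380, 730, 36)
def bump(center, width):
    return np.exp(-0.5 * ((wl - center) / width) ** 2)
A = np.stack([bump(600, 50), bump(550, 45), bump(450, 40)])  # (3, 36)
A /= A.sum(axis=1, keepdims=True)

def reflectance_from_srgb(srgb):
    """Smoothest curve rho with A @ rho equal to the linear RGB values.

    Solves: minimize ||D rho||^2  subject to  A rho = rgb
    via the KKT linear system (the least-squares flavor of the problem).
    """
    rgb = srgb_to_linear(srgb)
    n = len(wl)
    D = np.diff(np.eye(n), axis=0)              # first-difference operator
    K = np.block([[2 * D.T @ D, A.T],
                  [A, np.zeros((3, 3))]])
    sol = np.linalg.solve(K, np.concatenate([np.zeros(n), rgb]))
    # Clipping may break the exact match; the paper's constrained
    # methods enforce the physical [0, 1] range properly.
    return np.clip(sol[:n], 0.0, 1.0)

rho = reflectance_from_srgb([200, 130, 60])     # a warm orange
```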

Two-temperature logistic regression based on the Tsallis divergence

Title Two-temperature logistic regression based on the Tsallis divergence
Authors Ehsan Amid, Manfred K. Warmuth, Sriram Srinivasan
Abstract We develop a variant of multiclass logistic regression that is significantly more robust to noise. The algorithm has one weight vector per class and the surrogate loss is a function of the linear activations (one per class). The surrogate loss of an example with linear activation vector $\mathbf{a}$ and class $c$ has the form $-\log_{t_1} \exp_{t_2} (a_c - G_{t_2}(\mathbf{a}))$ where the two temperatures $t_1$ and $t_2$ “temper” the $\log$ and $\exp$, respectively, and $G_{t_2}(\mathbf{a})$ is a scalar value that generalizes the log-partition function. We motivate this loss using the Tsallis divergence. Our method allows transitioning between non-convex and convex losses by the choice of the temperature parameters. As the temperature $t_1$ of the logarithm becomes smaller than the temperature $t_2$ of the exponential, the surrogate loss becomes “quasi-convex”. Various tunings of the temperatures recover previous methods, and tuning the degree of non-convexity is crucial in the experiments. In particular, quasi-convexity and boundedness of the loss provide significant robustness to outliers. We explain this by showing that $t_1 < 1$ caps the surrogate loss and $t_2 > 1$ makes the predictive distribution have a heavy tail. We show that the surrogate loss is Bayes-consistent, even in the non-convex case. Additionally, we provide an efficient iterative algorithm for calculating the log-partition value in only a few iterations. Our compelling experimental results on large real-world datasets show the advantage of using the two-temperature variant in the noisy as well as the noise-free case.
Tasks
Published 2017-05-19
URL https://arxiv.org/abs/1705.07210v2
PDF https://arxiv.org/pdf/1705.07210v2.pdf
PWC https://paperswithcode.com/paper/two-temperature-logistic-regression-based-on
Repo
Framework
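To make the tempered loss concrete, here is a small numpy sketch. `exp_t` and `log_t` are the standard tempered exponential and logarithm; the normalization $G_{t_2}(\mathbf{a})$ is found by bisection, a simple correct stand-in for the faster iterative scheme the paper provides.

```python
import numpy as np

def exp_t(x, t):
    # Tempered exponential; reduces to exp(x) as t -> 1.
    if t == 1.0:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

def log_t(x, t):
    # Tempered logarithm; reduces to log(x) as t -> 1.
    if t == 1.0:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def normalization(a, t2, iters=60):
    # Scalar G with sum_i exp_t2(a_i - G) = 1, found by bisection.
    # At G = max(a) the sum is >= 1; at the upper bound every term
    # is <= 1/k, so the sum is <= 1. Bisection then converges.
    k = len(a)
    lo = a.max()
    hi = a.max() - log_t(1.0 / k, t2)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if exp_t(a - mid, t2).sum() > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bi_tempered_loss(a, c, t1, t2):
    """-log_t1 exp_t2(a_c - G_t2(a)) for activations a and true class c."""
    return -log_t(exp_t(a[c] - normalization(a, t2), t2), t1)

a = np.array([2.0, 0.5, -1.0])   # linear activations, one per class
print(bi_tempered_loss(a, c=0, t1=0.8, t2=1.2))   # t1 < 1 < t2: robust regime
```

With $t_1 = t_2 = 1$ this reduces exactly to ordinary softmax cross-entropy, which is a handy sanity check.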

NSML: A Machine Learning Platform That Enables You to Focus on Your Models

Title NSML: A Machine Learning Platform That Enables You to Focus on Your Models
Authors Nako Sung, Minkyu Kim, Hyunwoo Jo, Youngil Yang, Jingwoong Kim, Leonard Lausen, Youngkwan Kim, Gayoung Lee, Donghyun Kwak, Jung-Woo Ha, Sunghun Kim
Abstract Machine learning libraries such as TensorFlow and PyTorch simplify model implementation. However, researchers are still required to perform a non-trivial amount of manual tasks such as GPU allocation, training status tracking, and comparison of models with different hyperparameter settings. We propose a system to handle these tasks and help researchers focus on models. We present the requirements of the system based on a collection of discussions from an online study group comprising 25k members. These include automatic GPU allocation, learning status visualization, handling model parameter snapshots as well as hyperparameter modification during learning, and comparison of performance metrics between models via a leaderboard. We describe the system architecture that fulfills these requirements and present a proof-of-concept implementation, NAVER Smart Machine Learning (NSML). We test the system and confirm substantial efficiency improvements for model development.
Tasks
Published 2017-12-16
URL http://arxiv.org/abs/1712.05902v1
PDF http://arxiv.org/pdf/1712.05902v1.pdf
PWC https://paperswithcode.com/paper/nsml-a-machine-learning-platform-that-enables
Repo
Framework

Exchangeable choice functions

Title Exchangeable choice functions
Authors Arthur Van Camp, Gert de Cooman
Abstract We investigate how to model exchangeability with choice functions. Exchangeability is a structural assessment on a sequence of uncertain variables. We show that such assessments are a special kind of indifference assessment, and how this leads to a counterpart of de Finetti’s Representation Theorem, in both a finite and a countable context.
Tasks
Published 2017-03-06
URL http://arxiv.org/abs/1703.01924v1
PDF http://arxiv.org/pdf/1703.01924v1.pdf
PWC https://paperswithcode.com/paper/exchangeable-choice-functions
Repo
Framework

Entropic selection of concepts unveils hidden topics in documents corpora

Title Entropic selection of concepts unveils hidden topics in documents corpora
Authors Andrea Martini, Alessio Cardillo, Paolo De Los Rios
Abstract The organization and evolution of science have recently become objects of quantitative scientific investigation, thanks to the wealth of information that can be extracted from scientific documents, such as citations between papers and co-authorship between researchers. However, only a few studies have focused on the concepts that characterize full documents and that can be extracted and analyzed, revealing the deeper organization of scientific knowledge. Unfortunately, some concepts are so common across documents that they hinder the emergence of the underlying topical structure of the document corpus, because they give rise to a large number of spurious and trivial relations among documents. To identify and remove common concepts, we introduce a method to gauge their relevance according to an objective information-theoretic measure related to the statistics of their occurrence across the document corpus. After progressively removing concepts that, according to this metric, can be considered generic, we find that the topic organization displays a correspondingly more refined structure.
Tasks
Published 2017-05-18
URL http://arxiv.org/abs/1705.06510v2
PDF http://arxiv.org/pdf/1705.06510v2.pdf
PWC https://paperswithcode.com/paper/entropic-selection-of-concepts-unveils-hidden
Repo
Framework
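The paper's relevance measure is information-theoretic but specific to its formulation; as a hedged stand-in, the sketch below scores each concept by the normalized Shannon entropy of its occurrence distribution across documents and drops the highest-entropy (most generic) ones. The occurrence matrix here is synthetic.

```python
import numpy as np

def concept_entropies(counts):
    """counts: (n_concepts, n_docs) occurrence matrix.

    Returns each concept's Shannon entropy over documents, normalized
    to [0, 1]. Concepts spread evenly across the whole corpus score
    near 1 and are the 'generic' candidates for removal."""
    p = counts / counts.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        h = -np.sum(np.where(p > 0, p * np.log(p), 0.0), axis=1)
    return h / np.log(counts.shape[1])

rng = np.random.default_rng(1)
counts = rng.poisson(0.3, size=(500, 200))      # synthetic concept counts
counts = counts[counts.sum(axis=1) > 0]         # drop concepts never observed
h = concept_entropies(counts)
keep = h < np.quantile(h, 0.9)                  # progressively drop generic ones
print(f"kept {keep.sum()} of {len(h)} concepts")
```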

Mesh-to-raster based non-rigid registration of multi-modal images

Title Mesh-to-raster based non-rigid registration of multi-modal images
Authors Rosalia Tatano, Benjamin Berkels, Thomas M. Deserno
Abstract Region of interest (ROI) alignment in medical images plays a crucial role in diagnostics, procedure planning, treatment, and follow-up. Frequently, a model is represented as a triangulated mesh, while the patient data is provided by CT scanners as pixel or voxel data. Previously, we presented a 2D method for curve-to-pixel registration. This paper contributes (i) a general mesh-to-raster (M2R) framework to register ROIs in multi-modal images; (ii) a 3D surface-to-voxel application; and (iii) a comprehensive quantitative evaluation in 2D using ground truth provided by the simultaneous truth and performance level estimation (STAPLE) method. The registration is formulated as a minimization problem whose objective consists of a data term, which involves the signed distance function of the ROI from the reference image, and a higher-order elastic regularizer for the deformation. The evaluation is based on quantitative light-induced fluorescence (QLF) and digital photography (DP) of decalcified teeth. STAPLE is computed on 150 image pairs from 32 subjects, each showing one corresponding tooth in both modalities. The ROI in each image is manually marked by three experts (900 curves in total). In the QLF-DP setting, our approach significantly outperforms the mutual information-based registration algorithm implemented in the Insight Segmentation and Registration Toolkit (ITK) and Elastix.
Tasks
Published 2017-03-06
URL http://arxiv.org/abs/1703.01972v2
PDF http://arxiv.org/pdf/1703.01972v2.pdf
PWC https://paperswithcode.com/paper/mesh-to-raster-based-non-rigid-registration
Repo
Framework
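A much-simplified 2D toy of the variational setup: sample points on the model contour, interpolate the target's signed distance function, and descend on a data term plus a discrete elastic penalty on the displacement field. The paper's framework uses a higher-order regularizer and a proper multi-modal pipeline; names and constants below are assumptions.

```python
import numpy as np
from scipy import ndimage

def register_curve_to_sdf(curve, sdf, n_iters=400, lr=0.05, alpha=0.1):
    """Toy 2D curve-to-raster registration.

    curve: (n, 2) points (row, col) on the model ROI contour.
    sdf:   signed distance function of the ROI in the target image.
    Descends on sum_i sdf(p_i)^2 plus a discrete elastic penalty on the
    per-point displacement field."""
    gy, gx = np.gradient(sdf)                    # row and column gradients
    disp = np.zeros_like(curve)
    for _ in range(n_iters):
        p = (curve + disp).T                     # (2, n) for map_coordinates
        d = ndimage.map_coordinates(sdf, p, order=1)
        dgy = ndimage.map_coordinates(gy, p, order=1)
        dgx = ndimage.map_coordinates(gx, p, order=1)
        data_grad = 2 * d[:, None] * np.stack([dgy, dgx], axis=1)
        reg_grad = np.zeros_like(disp)  # gradient of sum ||d_{i+1} - d_i||^2
        reg_grad[1:-1] = 2 * alpha * (2 * disp[1:-1] - disp[:-2] - disp[2:])
        disp -= lr * (data_grad + reg_grad)
    return curve + disp

# Target: a circle of radius 30; model: the same circle sampled off-center.
yy, xx = np.mgrid[0:128, 0:128]
sdf = np.hypot(yy - 64, xx - 64) - 30
t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
curve = np.stack([58 + 30 * np.sin(t), 70 + 30 * np.cos(t)], axis=1)
registered = register_curve_to_sdf(curve, sdf)
print(np.abs(np.hypot(*(registered - 64).T) - 30).max())  # ~0: on the target
```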

On the Hardness of Inventory Management with Censored Demand Data

Title On the Hardness of Inventory Management with Censored Demand Data
Authors Gábor Lugosi, Mihalis G. Markakis, Gergely Neu
Abstract We consider a repeated newsvendor problem where the inventory manager has no prior information about the demand, and can access only censored/sales data. In analogy to multi-armed bandit problems, the manager needs to simultaneously “explore” and “exploit” with her inventory decisions, in order to minimize the cumulative cost. We make no probabilistic assumptions (importantly, not even independence or time stationarity) regarding the mechanism that creates the demand sequence. Our goal is to shed light on the hardness of the problem, and to develop policies that perform well with respect to the regret criterion, that is, the difference between the cumulative cost of a policy and that of the best fixed action/static inventory decision in hindsight, uniformly over all feasible demand sequences. We show that a simple randomized policy, termed the Exponentially Weighted Forecaster, combined with a carefully designed cost estimator, achieves optimal scaling of the expected regret (up to logarithmic factors) with respect to all three key primitives: the number of time periods, the number of inventory decisions available, and the demand support. Through this result, we derive an important insight: the benefit from “information stalking” as well as the cost of censoring are both negligible in this dynamic learning problem, at least with respect to the regret criterion. Furthermore, we modify the proposed policy in order to perform well in terms of the tracking regret, that is, using as benchmark the best sequence of inventory decisions that switches a limited number of times. Numerical experiments suggest that the proposed approach outperforms existing ones (that are tailored to, or facilitated by, time stationarity) on nonstationary demand models. Finally, we extend the proposed approach and its analysis to a “combinatorial” version of the repeated newsvendor problem.
Tasks
Published 2017-10-16
URL http://arxiv.org/abs/1710.05739v1
PDF http://arxiv.org/pdf/1710.05739v1.pdf
PWC https://paperswithcode.com/paper/on-the-hardness-of-inventory-management-with
Repo
Framework
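The exponential-weights skeleton is easy to sketch; the hard part of the paper is the cost estimator under censoring. In the hedged toy below, a round with uncensored sales reveals the demand exactly, while a censored round is scored with the identified lower bound `demand >= stock`, a crude stand-in for the paper's carefully designed estimator. Costs and the demand process are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

levels = np.arange(0, 51)       # candidate inventory levels (actions)
h, b = 1.0, 2.0                 # unit holding / shortage costs (assumed)
eta = 0.01                      # exponential-weights learning rate
cum = np.zeros(len(levels))     # cumulative estimated cost per action

def cost(stock, demand):
    return h * max(stock - demand, 0) + b * max(demand - stock, 0)

total = 0.0
for t in range(5000):
    w = np.exp(-eta * (cum - cum.min()))      # stabilized exponential weights
    stock = rng.choice(levels, p=w / w.sum())
    demand = max(0, int(30 + 10 * np.sin(t / 300) + rng.normal(0, 5)))
    total += cost(stock, demand)
    sales = min(stock, demand)                # all the manager observes
    if sales < stock:
        # Uncensored round: demand is revealed, every action is scored.
        cum += np.array([cost(s, demand) for s in levels])
    else:
        # Censored round: score with the identified bound demand >= stock.
        cum += np.array([cost(s, stock) for s in levels])

print("average per-round cost:", total / 5000)
```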

Semidefinite tests for latent causal structures

Title Semidefinite tests for latent causal structures
Authors Aditya Kela, Kai von Prillwitz, Johan Aberg, Rafael Chaves, David Gross
Abstract Testing whether a probability distribution is compatible with a given Bayesian network is a fundamental task in the field of causal inference, where Bayesian networks model causal relations. Here we consider the class of causal structures where all correlations between observed quantities are solely due to the influence from latent variables. We show that each model of this type imposes a certain signature on the observable covariance matrix in terms of a particular decomposition into positive semidefinite components. This signature, and thus the underlying hypothetical latent structure, can be tested in a computationally efficient manner via semidefinite programming. This stands in stark contrast with the algebraic geometric tools required if the full observable probability distribution is taken into account. The semidefinite test is compared with tests based on entropic inequalities.
Tasks Causal Inference
Published 2017-01-03
URL http://arxiv.org/abs/1701.00652v1
PDF http://arxiv.org/pdf/1701.00652v1.pdf
PWC https://paperswithcode.com/paper/semidefinite-tests-for-latent-causal
Repo
Framework
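A hedged reading of the test as a feasibility problem in cvxpy: given a hypothesized latent structure, ask whether the observed covariance splits into one PSD component per latent, each supported on the variables that latent influences, plus diagonal noise. The support encoding and the toy model below are assumptions for illustration.

```python
import cvxpy as cp
import numpy as np

def latent_structure_feasible(Sigma, supports):
    """Semidefinite feasibility test (hedged reading of the paper):
    can Sigma be written as a sum of PSD components, one per hypothesized
    latent variable and supported only on the observed variables that
    latent influences, plus a diagonal (independent-noise) term?"""
    n = Sigma.shape[0]
    comps, constraints = [], []
    for sup in supports:
        T = cp.Variable((n, n), PSD=True)
        comps.append(T)
        for i in range(n):
            for j in range(n):
                if i not in sup or j not in sup:
                    constraints.append(T[i, j] == 0)   # support constraint
    d = cp.Variable(n, nonneg=True)                    # diagonal noise
    constraints.append(sum(comps) + cp.diag(d) == Sigma)
    prob = cp.Problem(cp.Minimize(0), constraints)
    prob.solve()
    return prob.status == cp.OPTIMAL

# Toy model: two latents with scopes {0,1} and {1,2}; the covariance they
# induce must pass the test for its own structure.
A = np.array([[1.0, 0.0],
              [0.8, 0.7],
              [0.0, 1.2]])
Sigma = A @ A.T + 0.1 * np.eye(3)
print(latent_structure_feasible(Sigma, supports=[[0, 1], [1, 2]]))  # True
```

The efficiency claim in the abstract is visible here: the whole test is one small semidefinite program, with no algebraic-geometry machinery involved.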

LesionSeg: Semantic segmentation of skin lesions using Deep Convolutional Neural Network

Title LesionSeg: Semantic segmentation of skin lesions using Deep Convolutional Neural Network
Authors Dhanesh Ramachandram, Terrance DeVries
Abstract We present a method for skin lesion segmentation for the ISIC 2017 Skin Lesion Segmentation Challenge. Our approach is based on a Fully Convolutional Network architecture which is trained end to end, from scratch, on a limited dataset. Our semantic segmentation architecture utilizes several recent innovations, in particular the combined use of (i) atrous convolutions to increase the effective receptive field of the network without increasing the number of parameters, (ii) network-in-network $1\times1$ convolution layers to add capacity to the network, and (iii) state-of-the-art super-resolution upsampling of predictions using subpixel CNN layers. We report a mean IOU score of 0.642 on the validation set provided by the organisers.
Tasks Lesion Segmentation, Semantic Segmentation, Super-Resolution
Published 2017-03-09
URL http://arxiv.org/abs/1703.03372v3
PDF http://arxiv.org/pdf/1703.03372v3.pdf
PWC https://paperswithcode.com/paper/lesionseg-semantic-segmentation-of-skin
Repo
Framework
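A miniature PyTorch module illustrating the three ingredients named in the abstract: dilated (atrous) convolutions, $1\times1$ network-in-network layers, and subpixel (PixelShuffle) upsampling. Channel counts and depth are placeholders, not the architecture from the paper.

```python
import torch
import torch.nn as nn

class TinyLesionSeg(nn.Module):
    """Illustrative mini-network, not the paper's model."""

    def __init__(self, in_ch=3, n_classes=2, up=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            # Atrous convolutions widen the receptive field at no extra
            # parameter cost.
            nn.Conv2d(64, 64, 3, padding=2, dilation=2), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=4, dilation=4), nn.ReLU(inplace=True),
            # 1x1 network-in-network layer adds capacity cheaply.
            nn.Conv2d(64, 64, 1), nn.ReLU(inplace=True),
        )
        # Subpixel upsampling: predict up*up*n_classes channels, then
        # rearrange them into an (up x up)-larger prediction map.
        self.classifier = nn.Conv2d(64, n_classes * up * up, 1)
        self.shuffle = nn.PixelShuffle(up)

    def forward(self, x):
        return self.shuffle(self.classifier(self.features(x)))

logits = TinyLesionSeg()(torch.randn(1, 3, 192, 192))
print(logits.shape)    # (1, 2, 192, 192): per-pixel class scores
```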

Constructing multi-modality and multi-classifier radiomics predictive models through reliable classifier fusion

Title Constructing multi-modality and multi-classifier radiomics predictive models through reliable classifier fusion
Authors Zhiguo Zhou, Zhi-Jie Zhou, Hongxia Hao, Shulong Li, Xi Chen, You Zhang, Michael Folkert, Jing Wang
Abstract Radiomics aims to extract and analyze large numbers of quantitative features from medical images and is highly promising in staging, diagnosing, and predicting outcomes of cancer treatments. Nevertheless, several challenges need to be addressed to construct an optimal radiomics predictive model. First, the predictive performance of the model may be reduced when features extracted from an individual imaging modality are blindly combined into a single predictive model. Second, because many different types of classifiers are available to construct a predictive model, selecting an optimal classifier for a particular application is still challenging. In this work, we developed multi-modality and multi-classifier radiomics predictive models that address the aforementioned issues in currently available models. Specifically, a new reliable classifier fusion strategy was proposed to optimally combine output from different modalities and classifiers. In this strategy, modality-specific classifiers were first trained, and an analytic evidential reasoning (ER) rule was developed to fuse the output score from each modality to construct an optimal predictive model. One public data set and two clinical case studies were used to validate model performance. The experimental results indicated that the proposed ER-rule-based radiomics models outperformed traditional models that rely on a single classifier or simply use combined features from different modalities.
Tasks
Published 2017-10-04
URL http://arxiv.org/abs/1710.01614v2
PDF http://arxiv.org/pdf/1710.01614v2.pdf
PWC https://paperswithcode.com/paper/constructing-multi-modality-and-multi
Repo
Framework
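The analytic ER rule has a specific belief-mass formulation; the sketch below is deliberately not that rule, only a simplified reliability-weighted fusion that conveys the idea of discounting each modality's score by its estimated reliability before combining. All numbers are illustrative.

```python
import numpy as np

def fuse_scores(probs, reliabilities):
    """Reliability-weighted fusion of per-modality classifier outputs.

    probs:         (n_modalities, n_classes) class probabilities.
    reliabilities: (n_modalities,) weights in [0, 1], e.g. from each
                   modality-specific classifier's cross-validated accuracy.

    NOTE: a simplified stand-in, not the paper's analytic ER rule, which
    combines belief masses with explicit residual ignorance."""
    probs = np.asarray(probs, dtype=float)
    r = np.asarray(reliabilities, dtype=float)[:, None]
    k = probs.shape[1]
    discounted = r * probs + (1 - r) / k  # shrink unreliable sources to uniform
    fused = np.prod(discounted, axis=0)   # combine multiplicatively
    return fused / fused.sum()

ct = [0.70, 0.20, 0.10]      # e.g. CT-based classifier output
pet = [0.50, 0.40, 0.10]     # e.g. PET-based classifier output
print(fuse_scores([ct, pet], reliabilities=[0.9, 0.6]))
```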

Optimal Algorithms for Distributed Optimization

Title Optimal Algorithms for Distributed Optimization
Authors César A. Uribe, Soomin Lee, Alexander Gasnikov, Angelia Nedić
Abstract In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely, where the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, either strongly convex or smooth, or just convex. Our results show that Nesterov’s accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
Tasks Distributed Optimization
Published 2017-12-01
URL http://arxiv.org/abs/1712.00232v3
PDF http://arxiv.org/pdf/1712.00232v3.pdf
PWC https://paperswithcode.com/paper/optimal-algorithms-for-distributed
Repo
Framework
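A self-contained numpy sketch of the core mechanism for the simplest case: quadratic local objectives and a consensus constraint encoded through the graph Laplacian $W$, with Nesterov's accelerated gradient run on the (concave) dual. Every multiplication by $W$ only mixes neighboring nodes, which is what makes the method distributed. The graph, step sizes, and objectives are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
# Ring-graph Laplacian: multiplying by W mixes only neighboring nodes.
W = 2 * np.eye(n) - np.roll(np.eye(n), 1, axis=0) - np.roll(np.eye(n), -1, axis=0)
a = rng.normal(size=n)        # node i privately holds f_i(x) = 0.5 (x - a_i)^2

# Dual of: min sum_i f_i(x_i) s.t. W x = 0 (consensus). For quadratics the
# inner minimizer is x*(lam) = a - W lam and the dual gradient is W x*(lam).
lam = np.zeros(n)
lam_prev = lam.copy()
L = np.linalg.eigvalsh(W @ W).max()      # smoothness constant of the dual
for k in range(1, 5000):
    mom = lam + (k - 1) / (k + 2) * (lam - lam_prev)   # Nesterov momentum
    x = a - W @ mom                                    # local primal step
    lam_prev = lam
    lam = mom + (1.0 / L) * (W @ x)                    # ascent on concave dual
x = a - W @ lam
print("consensus spread:", x.max() - x.min(), "| average:", a.mean())
```

At the dual optimum $Wx = 0$, so all nodes agree, and since $W\mathbf{1} = 0$ the consensus value is the average of the $a_i$, the centralized minimizer.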

Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models

Title Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models
Authors Pierre Lison, Serge Bibauw
Abstract Neural conversational models require substantial amounts of dialogue data for their parameter estimation and are therefore usually learned on large corpora such as chat forums or movie subtitles. These corpora are, however, often challenging to work with, notably due to their frequent lack of turn segmentation and the presence of multiple references external to the dialogue itself. This paper shows that these challenges can be mitigated by adding a weighting model into the architecture. The weighting model, which is itself estimated from dialogue data, associates each training example to a numerical weight that reflects its intrinsic quality for dialogue modelling. At training time, these sample weights are included into the empirical loss to be minimised. Evaluation results on retrieval-based models trained on movie and TV subtitles demonstrate that the inclusion of such a weighting model improves the model performance on unsupervised metrics.
Tasks
Published 2017-04-28
URL http://arxiv.org/abs/1704.08966v2
PDF http://arxiv.org/pdf/1704.08966v2.pdf
PWC https://paperswithcode.com/paper/not-all-dialogues-are-created-equal-instance
Repo
Framework
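The core architectural change is a one-liner at training time: multiply the per-example loss by the weight that the weighting model assigns to each dialogue. A hedged PyTorch sketch, with the weights simply given as a tensor rather than produced by a trained weighting model:

```python
import torch
import torch.nn.functional as F

def weighted_loss(scores, targets, weights):
    # Per-example cross-entropy, scaled by each dialogue's quality weight
    # and normalized by the total weight.
    per_example = F.cross_entropy(scores, targets, reduction="none")
    return (weights * per_example).sum() / weights.sum()

scores = torch.randn(8, 100, requires_grad=True)  # 8 examples, 100 candidates
targets = torch.randint(0, 100, (8,))             # index of the true response
weights = torch.rand(8)     # would come from the trained weighting model
loss = weighted_loss(scores, targets, weights)
loss.backward()
```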

SPP-Net: Deep Absolute Pose Regression with Synthetic Views

Title SPP-Net: Deep Absolute Pose Regression with Synthetic Views
Authors Pulak Purkait, Cheng Zhao, Christopher Zach
Abstract Image-based localization is one of the important problems in computer vision due to its wide applicability in robotics, augmented reality, and autonomous systems. The literature describes a rich set of methods for geometrically registering a 2D image w.r.t. a 3D model. Recently, methods based on deep convolutional feedforward networks (CNNs) have become popular for pose regression. However, these CNN-based methods are still less accurate than geometry-based methods, despite being fast and memory efficient. In this work we design a deep neural network architecture based on sparse feature descriptors to estimate the absolute pose of an image. Our choice of using sparse feature descriptors has two major advantages: first, our network is significantly smaller than the CNNs proposed in the literature for this task, making our approach more efficient and scalable. Second, and more importantly, the use of sparse features allows us to augment the training data with synthetic viewpoints, which leads to substantial improvements in generalization performance to unseen poses. Our proposed method thus aims to combine the best of the two worlds, feature-based localization and CNN-based pose regression, to achieve state-of-the-art performance in absolute pose estimation. A detailed analysis of the proposed architecture and a rigorous evaluation on existing datasets are provided to support our method.
Tasks Image-Based Localization, Pose Estimation
Published 2017-12-09
URL http://arxiv.org/abs/1712.03452v1
PDF http://arxiv.org/pdf/1712.03452v1.pdf
PWC https://paperswithcode.com/paper/spp-net-deep-absolute-pose-regression-with
Repo
Framework
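A hedged sketch of the input modality: embed each (keypoint, descriptor) pair with a shared MLP, pool order-invariantly, and regress translation plus a unit quaternion. Because the inputs are keypoints and descriptors rather than pixels, training pairs can be synthesized for novel viewpoints, which is the augmentation the abstract highlights. The paper's actual network, including its pyramid pooling over sparse features, differs; everything below is an illustrative assumption.

```python
import torch
import torch.nn as nn

class SparsePoseNet(nn.Module):
    """Absolute pose regression from sparse features (illustrative only)."""

    def __init__(self, desc_dim=128, hidden=256):
        super().__init__()
        self.point_mlp = nn.Sequential(       # shared across keypoints
            nn.Linear(desc_dim + 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 7),             # 3D translation + quaternion
        )

    def forward(self, kps, descs):
        # kps: (B, N, 2) normalized keypoints; descs: (B, N, desc_dim).
        f = self.point_mlp(torch.cat([kps, descs], dim=-1))
        g = f.max(dim=1).values               # order-invariant pooling
        out = self.head(g)
        t, q = out[:, :3], out[:, 3:]
        return t, q / q.norm(dim=1, keepdim=True)   # unit quaternion

t, q = SparsePoseNet()(torch.rand(4, 200, 2), torch.rand(4, 200, 128))
```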

Interpretable Structure-Evolving LSTM

Title Interpretable Structure-Evolving LSTM
Authors Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing
Abstract This paper develops a general framework for learning interpretable data representations via Long Short-Term Memory (LSTM) recurrent neural networks over hierarchical graph structures. Instead of learning LSTM models over pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization. We thus call this model the structure-evolving LSTM. In particular, starting with an initial element-level graph representation where each node is a small data element, the structure-evolving LSTM gradually evolves the multi-level graph representations by stochastically merging graph nodes with high compatibilities along the stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two connected nodes from their corresponding LSTM gate outputs, which is used to generate a merging probability. Candidate graph structures are accordingly generated, where the nodes are grouped into cliques according to their merging probabilities. We then produce the new graph structure with a Metropolis-Hastings algorithm, which alleviates the risk of getting stuck in local optima by stochastic sampling with an acceptance probability. Once a graph structure is accepted, a higher-level graph is then constructed by taking the partitioned cliques as its nodes. During the evolving process, the representation becomes more abstract at higher levels, where redundant information is filtered out, allowing more efficient propagation of long-range data dependencies. We evaluate the effectiveness of the structure-evolving LSTM in the application of semantic object parsing and demonstrate its advantage over state-of-the-art LSTM models on standard benchmarks.
Tasks
Published 2017-03-08
URL http://arxiv.org/abs/1703.03055v1
PDF http://arxiv.org/pdf/1703.03055v1.pdf
PWC https://paperswithcode.com/paper/interpretable-structure-evolving-lstm
Repo
Framework
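One evolving step can be sketched as: propose merges edge by edge according to merging probabilities, then accept or reject the resulting coarser partition with a Metropolis-Hastings test. In the hedged toy below, `compat` stands in for the gate-derived merging probabilities and `energy` for the partition score the paper samples against; both are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve_step(edges, compat, energy, temperature=1.0):
    """Merge neighbors stochastically, then apply a Metropolis-Hastings test.

    compat[i, j] stands in for the merging probability the paper derives
    from LSTM gate outputs; energy scores a partition (lower is better)."""
    n = compat.shape[0]
    parent = np.arange(n)

    def find(i):                       # union-find root with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:                 # propose merges edge by edge
        if rng.random() < compat[i, j]:
            parent[find(i)] = find(j)
    proposal = np.array([find(i) for i in range(n)])
    current = np.arange(n)             # the un-merged partition

    # Accept the coarser graph with an MH probability, which keeps the
    # search from committing to poor merges deterministically.
    accept = min(1.0, np.exp((energy(current) - energy(proposal)) / temperature))
    return proposal if rng.random() < accept else current

edges = [(i, i + 1) for i in range(5)]           # 6 nodes on a path
compat = rng.random((6, 6))                      # toy merging probabilities
labels = evolve_step(edges, compat,
                     energy=lambda lab: abs(len(np.unique(lab)) - 3))
print(labels)                                    # cluster label per node
```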