Paper Group ANR 590
The Causal Frame Problem: An Algorithmic Perspective
Title | The Causal Frame Problem: An Algorithmic Perspective |
Authors | Ardavan Salehi Nobandegani, Ioannis N. Psaromiligkos |
Abstract | The Frame Problem (FP) is a puzzle in philosophy of mind and epistemology, articulated by the Stanford Encyclopedia of Philosophy as follows: “How do we account for our apparent ability to make decisions on the basis only of what is relevant to an ongoing situation without having explicitly to consider all that is not relevant?” In this work, we focus on the causal variant of the FP, the Causal Frame Problem (CFP). Assuming that a reasoner’s mental causal model can be (implicitly) represented by a causal Bayes net, we first introduce a notion called Potential Level (PL). PL, in essence, encodes the relative position of a node with respect to its neighbors in a causal Bayes net. Drawing on the psychological literature on causal judgment, we substantiate the claim that PL may bear on how time is encoded in the mind. Using PL, we propose an inference framework, called the PL-based Inference Framework (PLIF), which permits a boundedly-rational approach to the CFP to be formally articulated at Marr’s algorithmic level of analysis. We show that our proposed framework, PLIF, is consistent with a wide range of findings in causal judgment literature, and that PL and PLIF make a number of predictions, some of which are already supported by existing findings. |
Tasks | |
Published | 2017-01-26 |
URL | http://arxiv.org/abs/1701.08100v1 |
http://arxiv.org/pdf/1701.08100v1.pdf | |
PWC | https://paperswithcode.com/paper/the-causal-frame-problem-an-algorithmic |
Repo | |
Framework | |
Overlapping Cover Local Regression Machines
Title | Overlapping Cover Local Regression Machines |
Authors | Mohamed Elhoseiny, Ahmed Elgammal |
Abstract | We present the Overlapping Domain Cover (ODC) notion for kernel machines: a set of overlapping subsets of the data that covers the entire training set and is optimized to be as spatially cohesive as possible. We show how this notion benefits the speed of local kernel machines for regression while minimizing the prediction error. We propose an efficient ODC framework, which is applicable to various regression models and in particular reduces the complexity of Twin Gaussian Processes (TGP) regression from cubic to quadratic. Our notion is also applicable to several other kernel methods (e.g., Gaussian Process Regression (GPR) and IWTGP regression, as shown in our experiments). We also theoretically justify how the overlapping cover improves local prediction. We validate and analyze our method on three benchmark human pose estimation datasets and discuss interesting findings. |
Tasks | Gaussian Processes, Pose Estimation |
Published | 2017-01-05 |
URL | http://arxiv.org/abs/1701.01218v1 |
http://arxiv.org/pdf/1701.01218v1.pdf | |
PWC | https://paperswithcode.com/paper/overlapping-cover-local-regression-machines |
Repo | |
Framework | |
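As a rough illustration of the overlapping-cover idea, the sketch below (a toy, not the authors' ODC framework) covers a 1-D input range with overlapping intervals, fits one local least-squares line per interval, and routes each query to the model whose cover centre is nearest. The interval layout, the `overlap` fraction, and the use of plain linear regression in place of TGP/GPR are all illustrative assumptions.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = a*x + b on a local subset.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) or 1e-12
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def train_odc(xs, ys, n_parts=4, overlap=0.25):
    # Cover the input range with overlapping intervals; each interval is
    # widened by `overlap` so neighbouring subsets share training points,
    # and one local regressor is fitted per interval.
    lo, hi = min(xs), max(xs)
    w = (hi - lo) / n_parts
    models = []
    for i in range(n_parts):
        a, b = lo + i * w - overlap * w, lo + (i + 1) * w + overlap * w
        sub = [(x, y) for x, y in zip(xs, ys) if a <= x <= b]
        sx, sy = zip(*sub)
        models.append(((a + b) / 2, fit_line(sx, sy)))
    return models

def predict_odc(models, x):
    # Route the query to the local model with the nearest cover centre.
    centre, (slope, icept) = min(models, key=lambda m: abs(m[0] - x))
    return slope * x + icept
```

Because each local fit only sees its own cover, training cost drops from one big solve to several small ones, which is the source of the speed-up the abstract describes.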
What Would a Graph Look Like in This Layout? A Machine Learning Approach to Large Graph Visualization
Title | What Would a Graph Look Like in This Layout? A Machine Learning Approach to Large Graph Visualization |
Authors | Oh-Hyun Kwon, Tarik Crnovrsanin, Kwan-Liu Ma |
Abstract | Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information. Selecting a “good” layout method is thus important for visualizing a graph. The selection can be highly subjective and dependent on the given task. A common approach to selecting a good layout is to use aesthetic criteria and visual inspection. However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive. In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels. For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics. An important contribution of our work is the development of a new framework to design graph kernels. Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics. Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy. In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users. |
Tasks | |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.04328v1 |
http://arxiv.org/pdf/1710.04328v1.pdf | |
PWC | https://paperswithcode.com/paper/what-would-a-graph-look-like-in-this-layout-a |
Repo | |
Framework | |
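The paper's graph kernels are custom-designed; as a generic stand-in, the sketch below computes a Weisfeiler-Lehman subtree kernel, a standard way to score topological similarity between graphs. The degree-based initial labels and two refinement iterations are arbitrary choices here, not the authors' design.

```python
from collections import Counter

def wl_histogram(adj, iters=2):
    # adj: {node: [neighbours]}.  Refine each node's label by combining
    # it with the sorted labels of its neighbours, and collect label
    # counts over all refinement rounds.
    labels = {v: str(len(adj[v])) for v in adj}  # initialise with degrees
    hist = Counter(labels.values())
    for _ in range(iters):
        labels = {v: labels[v] + "|" + ",".join(sorted(labels[u] for u in adj[v]))
                  for v in adj}
        hist.update(labels.values())
    return hist

def wl_kernel(adj1, adj2, iters=2):
    # Kernel value = dot product of the two label histograms.
    h1, h2 = wl_histogram(adj1, iters), wl_histogram(adj2, iters)
    return sum(h1[k] * h2[k] for k in h1)
```

A graph scores highest against itself and lower against structurally different graphs, which is the property a layout-prediction pipeline would exploit.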
Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics
Title | Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics |
Authors | Hyungjun Kim, Taesu Kim, Jinseok Kim, Jae-Joon Kim |
Abstract | Artificial Neural Network computation relies on intensive vector-matrix multiplications. Recently, emerging nonvolatile memory (NVM) crossbar arrays have demonstrated the feasibility of implementing such operations with high energy efficiency, prompting many works on efficiently utilizing NVM crossbar arrays as analog vector-matrix multipliers. However, their nonlinear I-V characteristics restrain critical design parameters, such as the read voltage and weight range, resulting in substantial accuracy loss. In this paper, instead of optimizing hardware parameters for a given neural network, we propose a methodology for reconstructing a neural network itself so that it is optimized for resistive memory crossbar arrays. To verify the validity of the proposed method, we simulated various neural networks on the MNIST and CIFAR-10 datasets using two specific Resistive Random Access Memory (RRAM) models. Simulation results show that our proposed neural networks produce significantly higher inference accuracies than conventional neural networks when the synapse devices have nonlinear I-V characteristics. |
Tasks | |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10642v1 |
http://arxiv.org/pdf/1703.10642v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-network-optimized-to-resistive |
Repo | |
Framework | |
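To see how a nonlinear I-V curve distorts an analog vector-matrix multiply, the toy model below assumes a hypothetical sinh-shaped device characteristic (the paper uses measured RRAM models, not this formula) and compares the crossbar read-out against the ideal dot product.

```python
import math

def ideal_dot(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def crossbar_dot(weights, inputs, v_read=0.5, alpha=3.0):
    # Toy crossbar read-out: each weight is stored as a conductance and
    # each input scales the read voltage on its row.  A hypothetical
    # sinh-shaped I-V curve (alpha sets the nonlinearity) bends the
    # per-cell current away from the ideal linear product; for small
    # alpha*v the expression reduces to w * x.
    total = 0.0
    for w, x in zip(weights, inputs):
        v = v_read * x
        total += w * math.sinh(alpha * v) / (alpha * v_read)  # nonlinear device
    return total
```

Shrinking the read voltage pushes the device back into its near-linear region, which is exactly why the read voltage becomes a restrained design parameter in the abstract.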
Fully Distributed and Asynchronized Stochastic Gradient Descent for Networked Systems
Title | Fully Distributed and Asynchronized Stochastic Gradient Descent for Networked Systems |
Authors | Ying Zhang |
Abstract | This paper considers a general data-fitting problem over a networked system, in which many computing nodes are connected by an undirected graph. This kind of problem has many real-world applications and has been studied extensively in the literature. However, existing solutions either need a central controller for information sharing or require slot synchronization among different nodes, which increases the difficulty of practical implementations, especially for very large and heterogeneous systems. In contrast, in this paper, we treat the data-fitting problem over the network as a stochastic programming problem with many constraints. By adapting the results in a recent paper, we design a fully distributed and asynchronized stochastic gradient descent (SGD) algorithm. We show that our algorithm can achieve global optimality and consensus asymptotically using only local computations and communications. Additionally, we provide a sharp lower bound for the convergence speed in the regular graph case. This result fits the intuition and provides guidance for designing a "good" network topology to speed up convergence. The merit of our design is also validated by experiments on both synthetic and real-world datasets. |
Tasks | |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.03992v1 |
http://arxiv.org/pdf/1704.03992v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-distributed-and-asynchronized |
Repo | |
Framework | |
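A minimal sketch of the fully distributed, asynchronous flavour of SGD described above, under simplifying assumptions not taken from the paper: a ring graph, private quadratic losses, and pairwise gossip averaging in place of the paper's constrained stochastic-programming formulation.

```python
import random

def async_gossip_sgd(targets, steps=20000, lr=0.01, seed=0):
    # Each node i minimises a private quadratic (x - targets[i])^2 on a
    # ring graph.  At every tick one random node wakes up, takes a local
    # gradient step, then averages its iterate with one random ring
    # neighbour -- no central controller and no synchronised slots.
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n
    for _ in range(steps):
        i = rng.randrange(n)                  # node that wakes up
        x[i] -= lr * 2 * (x[i] - targets[i])  # local SGD step
        j = (i + rng.choice([-1, 1])) % n     # one ring neighbour
        x[i] = x[j] = (x[i] + x[j]) / 2       # pairwise gossip average
    return x
```

The global optimum of the summed losses is the mean of the targets, so consensus near that mean is the behaviour the asymptotic result in the abstract describes.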
Residual Connections Encourage Iterative Inference
Title | Residual Connections Encourage Iterative Inference |
Authors | Stanisław Jastrzębski, Devansh Arpit, Nicolas Ballas, Vikas Verma, Tong Che, Yoshua Bengio |
Abstract | Residual networks (Resnets) have become a prominent architecture in deep learning. However, a comprehensive understanding of Resnets is still a topic of ongoing research. A recent view argues that Resnets perform iterative refinement of features. We attempt to further expose properties of this aspect. To this end, we study Resnets both analytically and empirically. We formalize the notion of iterative refinement in Resnets by showing that residual connections naturally encourage features of residual blocks to move along the negative gradient of the loss as we go from one block to the next. In addition, our empirical analysis suggests that Resnets are able to perform both representation learning and iterative refinement. In general, a Resnet tends to concentrate representation learning behavior in the first few layers, while higher layers perform iterative refinement of features. Finally, we observe that sharing residual layers naively leads to representation explosion and, counterintuitively, overfitting, and we show that simple existing strategies can help alleviate this problem. |
Tasks | Representation Learning |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.04773v2 |
http://arxiv.org/pdf/1710.04773v2.pdf | |
PWC | https://paperswithcode.com/paper/residual-connections-encourage-iterative |
Repo | |
Framework | |
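The iterative-refinement claim can be made concrete with a toy in which each residual block's output is, by construction, a step along the negative gradient of the loss; this illustrates the formalized notion, not the paper's learned networks.

```python
def residual_refine(h, target, blocks=10, step=0.3):
    # Iterative-refinement view of a Resnet: each block adds a small
    # correction f_k(h_k) that here points along the negative gradient
    # of the loss L(h) = (h - target)^2, so the distance to the target
    # shrinks block by block.
    trace = [abs(h - target)]
    for _ in range(blocks):
        residual = -step * 2 * (h - target)  # f_k(h_k)
        h = h + residual                     # h_{k+1} = h_k + f_k(h_k)
        trace.append(abs(h - target))
    return h, trace
```

Every additional block multiplies the remaining error by a constant factor, so stacking depth buys refinement rather than a new representation.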
Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima
Title | Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima |
Authors | Yaodong Yu, Pan Xu, Quanquan Gu |
Abstract | We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting the third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm only needs $\tilde{O}(\epsilon^{-10/3})$ stochastic gradient evaluations to converge to an approximate local minimum $\mathbf{x}$, which satisfies $\|\nabla f(\mathbf{x})\|_2\leq\epsilon$ and $\lambda_{\min}(\nabla^2 f(\mathbf{x}))\geq -\sqrt{\epsilon}$ in the general stochastic optimization setting, where $\tilde{O}(\cdot)$ hides polylogarithmic factors and constants. This improves upon the $\tilde{O}(\epsilon^{-7/2})$ gradient complexity achieved by the state-of-the-art stochastic local minima finding algorithms by a factor of $\tilde{O}(\epsilon^{-1/6})$. For nonconvex finite-sum optimization, our algorithm also outperforms the best known algorithms in a certain regime. |
Tasks | Stochastic Optimization |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06585v1 |
http://arxiv.org/pdf/1712.06585v1.pdf | |
PWC | https://paperswithcode.com/paper/third-order-smoothness-helps-even-faster |
Repo | |
Framework | |
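The claimed speed-up factor follows from simple exponent arithmetic on the two complexity bounds, which can be checked exactly with rationals:

```python
from fractions import Fraction

# Exponents of 1/epsilon in the two gradient-complexity bounds.
prior = Fraction(7, 2)    # state of the art: O~(eps^(-7/2))
ours = Fraction(10, 3)    # proposed:         O~(eps^(-10/3))

# Dividing eps^(-7/2) by eps^(-10/3) leaves eps^(-(7/2 - 10/3)),
# i.e. the improvement factor O~(eps^(-1/6)) quoted in the abstract.
gain = prior - ours
```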
An Efficient Algorithm for Bayesian Nearest Neighbours
Title | An Efficient Algorithm for Bayesian Nearest Neighbours |
Authors | Giuseppe Nuti |
Abstract | K-Nearest Neighbours (k-NN) is a popular classification and regression algorithm, yet one of its main limitations is the difficulty in choosing the number of neighbours. We present a Bayesian algorithm to compute the posterior probability distribution for k given a target point within a data-set, efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or simulation, alongside an exact solution for distributions within the exponential family. The central idea is that data points around our target are generated by the same probability distribution, extending outwards over the appropriate, though unknown, number of neighbours. Once the data is projected onto a distance metric of choice, we can transform the choice of k into a change-point detection problem, for which there is an efficient solution: we recursively compute the probability of the last change-point as we move towards our target, and thus de facto compute the posterior probability distribution over k. Applying this approach to both classification and regression UCI data-sets, we compare favourably and, most importantly, by removing the need for simulation, we are able to compute the posterior probability of k exactly and rapidly. As an example, the computational time for the Ripley data-set is a few milliseconds, compared to a few hours when using an MCMC approach. |
Tasks | Change Point Detection |
Published | 2017-05-26 |
URL | http://arxiv.org/abs/1705.09407v2 |
http://arxiv.org/pdf/1705.09407v2.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-algorithm-for-bayesian-nearest |
Repo | |
Framework | |
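A stripped-down sketch of the change-point view of choosing k, under assumptions not taken from the paper: binary class labels ordered by distance from the query, a single change point, and Beta(1,1)-Bernoulli segments with closed-form marginal likelihoods.

```python
from math import exp, lgamma

def seg_loglik(ones, n):
    # Beta(1,1)-Bernoulli marginal log-likelihood of a segment with
    # `ones` successes out of n trials: ones! * (n-ones)! / (n+1)!.
    return lgamma(ones + 1) + lgamma(n - ones + 1) - lgamma(n + 2)

def posterior_over_k(labels):
    # labels: 0/1 class indicators ordered by increasing distance from
    # the query.  Assume the first k labels share one Bernoulli rate and
    # the rest another; the posterior over the change point is then a
    # posterior over k, computed exactly in one pass -- no simulation.
    n = len(labels)
    prefix = [0]
    for y in labels:
        prefix.append(prefix[-1] + y)
    logp = [seg_loglik(prefix[k], k) + seg_loglik(prefix[n] - prefix[k], n - k)
            for k in range(1, n)]            # at least one point per side
    m = max(logp)
    w = [exp(v - m) for v in logp]
    z = sum(w)
    return {k: w[k - 1] / z for k in range(1, n)}
```

With five same-class points nearest the query followed by five of the other class, the posterior mass concentrates on k = 5, mirroring the exact, MCMC-free computation the abstract claims.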
Nearly second-order asymptotic optimality of sequential change-point detection with one-sample updates
Title | Nearly second-order asymptotic optimality of sequential change-point detection with one-sample updates |
Authors | Yang Cao, Liyan Xie, Yao Xie, Huan Xu |
Abstract | Sequential change-point detection when the distribution parameters are unknown is a fundamental problem in statistics and machine learning. When the post-change parameters are unknown, we consider a set of detection procedures based on sequential likelihood ratios with non-anticipating estimators constructed using online convex optimization algorithms such as online mirror descent, which provides a more versatile approach to tackle complex situations where recursive maximum likelihood estimators cannot be found. When the underlying distributions belong to an exponential family and the estimators satisfy the logarithmic regret property, we show that this approach is nearly second-order asymptotically optimal. This means that the upper bound for the false alarm rate of the algorithm (measured by the average run length) meets the lower bound asymptotically up to a log-log factor when the threshold tends to infinity. Our proof is achieved by making a connection between sequential change-point detection and online convex optimization and leveraging the logarithmic regret property of the online mirror descent algorithm. Numerical and real data examples validate our theory. |
Tasks | Change Point Detection |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.06995v4 |
http://arxiv.org/pdf/1705.06995v4.pdf | |
PWC | https://paperswithcode.com/paper/nearly-second-order-asymptotic-optimality-of |
Repo | |
Framework | |
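A toy version of a sequential likelihood-ratio detector with a non-anticipating plug-in estimator: the running-average estimator with an ad-hoc floor below stands in for the paper's online mirror descent, and the unit-variance Gaussian mean-shift setting and threshold are illustrative choices.

```python
def detect_change(stream, threshold=10.0):
    # CUSUM-style sequential likelihood-ratio test for an upward mean
    # shift in a unit-variance Gaussian stream (pre-change mean 0,
    # post-change mean unknown).  The unknown mean is plugged in via a
    # running average computed *before* each new sample is used, so the
    # estimator is non-anticipating, updated with one sample at a time.
    w, mu_hat, count, total = 0.0, 1.0, 0, 0.0
    for t, x in enumerate(stream):
        # log-likelihood-ratio increment with the current estimate
        w = max(0.0, w + mu_hat * x - mu_hat * mu_hat / 2)
        if w == 0.0:
            count, total = 0, 0.0          # restart estimator with statistic
        total += x
        count += 1
        mu_hat = max(0.5, total / count)   # crude one-sample update, floored
        if w > threshold:
            return t                       # alarm time
    return None
```

The statistic drifts down before the change and climbs steadily after it, so the alarm fires shortly after the true change point with high probability.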
Widening siamese architectures for stereo matching
Title | Widening siamese architectures for stereo matching |
Authors | Patrick Brandao, Evangelos Mazomenos, Danail Stoyanov |
Abstract | Computational stereo is one of the classical problems in computer vision. Numerous algorithms and solutions have been reported in recent years, focusing on developing methods for computing similarity, aggregating it to obtain spatial support, and finally optimizing an energy function to find the final disparity. In this paper, we focus on the feature extraction component of the stereo matching architecture and show that standard CNN operations can be used to improve the quality of the features used to find point correspondences. Furthermore, we propose a simple spatial aggregation that greatly simplifies the correlation learning problem. Our results on benchmark data are compelling and show promising potential even without refining the solution. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2017-11-01 |
URL | http://arxiv.org/abs/1711.00499v1 |
http://arxiv.org/pdf/1711.00499v1.pdf | |
PWC | https://paperswithcode.com/paper/widening-siamese-architectures-for-stereo |
Repo | |
Framework | |
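Stripped of learning, the matching step reduces to comparing a patch on one epipolar line against shifted patches on the other. The sketch below uses raw intensities and a sum-of-absolute-differences cost where a siamese architecture would substitute shared-weight CNN features; everything about it is an illustrative simplification.

```python
def best_disparity(left, right, x, patch=2, max_disp=5):
    # Match a patch around column x of the left scanline against
    # patches of the right scanline shifted by each candidate disparity
    # d; the shift with the lowest sum of absolute differences wins.
    # In a siamese network the raw intensities below would be replaced
    # by learned features computed with shared weights on both views.
    def cost(d):
        return sum(abs(left[x + k] - right[x - d + k])
                   for k in range(-patch, patch + 1))
    return min(range(0, max_disp + 1), key=cost)
```

On a right scanline that is the left one shifted by two pixels, the minimum-cost disparity is recovered exactly.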
Virtual Blood Vessels in Complex Background using Stereo X-ray Images
Title | Virtual Blood Vessels in Complex Background using Stereo X-ray Images |
Authors | Qiuyu Chen, Ryoma Bise, Lin Gu, Yinqiang Zheng, Imari Sato, Jenq-Neng Hwang, Nobuaki Imanishi, Sadakazu Aiso |
Abstract | We propose a fully automatic system to reconstruct and visualize 3D blood vessels in an Augmented Reality (AR) system from stereo X-ray images with bones and body fat. Currently, typical 3D imaging technologies are expensive and carry the risk of radiation exposure. To reduce the potential harm, we only need to take two X-ray images before visualizing the vessels. Our system reconstructs and visualizes vessels in the following steps. We first conduct initial segmentation using a Markov Random Field and then refine the segmentation in an entropy-based post-process. We parse the segmented vessels by extracting their centerlines and generating trees. We propose a coarse-to-fine scheme for stereo matching, including initial matching using an affine transform and dense matching using the Hungarian algorithm guided by Gaussian regression. Finally, we render and visualize the reconstructed model in a HoloLens-based AR system, which can essentially change the way medical data is visualized. We evaluated its performance using synthetic and real stereo X-ray images, and achieved satisfactory quantitative and qualitative results. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2017-09-22 |
URL | http://arxiv.org/abs/1709.07551v1 |
http://arxiv.org/pdf/1709.07551v1.pdf | |
PWC | https://paperswithcode.com/paper/virtual-blood-vessels-in-complex-background |
Repo | |
Framework | |
Probabilistic Reduced-Order Modeling for Stochastic Partial Differential Equations
Title | Probabilistic Reduced-Order Modeling for Stochastic Partial Differential Equations |
Authors | Constantin Grigo, Phaedon-Stelios Koutsourelakis |
Abstract | We discuss a Bayesian formulation of coarse-graining (CG) of PDEs where the coefficients (e.g. material parameters) exhibit random, fine-scale variability. The direct solution to such problems requires grids that are small enough to resolve this fine-scale variability, which unavoidably requires the repeated solution of very large systems of algebraic equations. We establish a physically inspired, data-driven coarse-grained model which learns a low-dimensional set of microstructural features that are predictive of the fine-grained model (FG) response. Once learned, those features provide a sharp distribution over the coarse-scale effective coefficients of the PDE that are most suitable for prediction of the fine-scale model output. This ultimately allows us to replace the computationally expensive FG with a generative probabilistic model based on evaluating the much cheaper CG several times. Sparsity-enforcing priors further increase predictive efficiency and reveal microstructural features that are important in predicting the FG response. Moreover, the model yields probabilistic rather than single-point predictions, which enables the quantification of the unavoidable epistemic uncertainty that is present due to the information loss that occurs during the coarse-graining process. |
Tasks | |
Published | 2017-03-06 |
URL | http://arxiv.org/abs/1703.01962v1 |
http://arxiv.org/pdf/1703.01962v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-reduced-order-modeling-for |
Repo | |
Framework | |
Neural and Statistical Methods for Leveraging Meta-information in Machine Translation
Title | Neural and Statistical Methods for Leveraging Meta-information in Machine Translation |
Authors | Shahram Khadivi, Patrick Wilken, Leonard Dahlmann, Evgeny Matusov |
Abstract | In this paper, we discuss different methods which use meta information and richer context that may accompany source language input to improve machine translation quality. We focus on category information of input text as meta information, but the proposed methods can be extended to all textual and non-textual meta information that might be available for the input text or automatically predicted using the text content. The main novelty of this work is to use state-of-the-art neural network methods to tackle this problem within a statistical machine translation (SMT) framework. We observe translation quality improvements up to 3% in terms of BLEU score in some text categories. |
Tasks | Machine Translation |
Published | 2017-08-10 |
URL | http://arxiv.org/abs/1708.03186v1 |
http://arxiv.org/pdf/1708.03186v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-and-statistical-methods-for-leveraging |
Repo | |
Framework | |
HyperPower: Power- and Memory-Constrained Hyper-Parameter Optimization for Neural Networks
Title | HyperPower: Power- and Memory-Constrained Hyper-Parameter Optimization for Neural Networks |
Authors | Dimitrios Stamoulis, Ermao Cai, Da-Cheng Juan, Diana Marculescu |
Abstract | While selecting the hyper-parameters of Neural Networks (NNs) has so far been treated as an art, the emergence of more complex, deeper architectures poses increasingly more challenges to designers and Machine Learning (ML) practitioners, especially when power and memory constraints need to be considered. In this work, we propose HyperPower, a framework that enables efficient Bayesian optimization and random search in the context of power- and memory-constrained hyper-parameter optimization for NNs running on a given hardware platform. HyperPower is the first work (i) to show that power consumption can be used as a low-cost, a priori known constraint, and (ii) to propose predictive models for the power and memory of NNs executing on GPUs. Thanks to HyperPower, the number of function evaluations and the best test error achieved by a constraint-unaware method are reached up to 112.99x and 30.12x faster, respectively, while invalid configurations are never considered. HyperPower significantly speeds up the hyper-parameter optimization, achieving up to 57.20x more function evaluations compared to constraint-unaware methods for a given time interval, effectively yielding significant accuracy improvements by up to 67.6%. |
Tasks | |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02446v1 |
http://arxiv.org/pdf/1712.02446v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperpower-power-and-memory-constrained-hyper |
Repo | |
Framework | |
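The constraint-aware search idea can be sketched as random search that prunes configurations using cheap a-priori power and memory predictors before any training happens. The linear predictors, budgets, and synthetic error surface below are invented stand-ins, not HyperPower's GPU models.

```python
import random

def constrained_random_search(trials=200, power_budget=150.0, mem_budget=4.0, seed=1):
    # Random hyper-parameter search that rejects configurations *before*
    # training whenever the predicted power or memory exceeds the
    # budget -- invalid points are never evaluated, so they cost
    # nothing.  Predictors and the "error" landscape are made up.
    rng = random.Random(seed)
    best = (float("inf"), None)
    for _ in range(trials):
        width = rng.randrange(16, 513)            # neurons per layer
        depth = rng.randrange(1, 9)               # number of layers
        power = 20.0 + 0.2 * width + 5.0 * depth  # predicted watts
        memory = 0.005 * width * depth            # predicted GB
        if power > power_budget or memory > mem_budget:
            continue                              # pruned, never trained
        err = 1.0 / (width * depth) + 0.0001 * depth  # fake test error
        best = min(best, (err, (width, depth)))
    return best
```

Every configuration the search returns is guaranteed to satisfy the budgets, which is the "never considering invalid configurations" property the abstract highlights.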
Transitioning between Convolutional and Fully Connected Layers in Neural Networks
Title | Transitioning between Convolutional and Fully Connected Layers in Neural Networks |
Authors | Shazia Akbar, Mohammad Peikari, Sherine Salama, Sharon Nofech-Mozes, Anne Martel |
Abstract | Digital pathology has advanced substantially over the last decade; however, tumor localization continues to be a challenging problem due to highly complex patterns and textures in the underlying tissue bed. The use of convolutional neural networks (CNNs) to analyze such complex images has been well adopted in digital pathology. In recent years, however, CNN architectures have evolved with the introduction of inception modules, which have shown great promise for classification tasks. In this paper, we propose a modified “transition” module which learns global average pooling layers from filters of varying sizes to encourage class-specific filters at multiple spatial resolutions. We demonstrate the performance of the transition module in AlexNet and ZFNet for classifying breast tumors in two independent datasets of scanned histology sections, in which the transition module proved superior. |
Tasks | |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05743v1 |
http://arxiv.org/pdf/1707.05743v1.pdf | |
PWC | https://paperswithcode.com/paper/transitioning-between-convolutional-and-fully |
Repo | |
Framework | |
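Global average pooling, the building block the transition module applies per filter, collapses each feature map to a single scalar. A minimal sketch, with plain Python lists standing in for tensors:

```python
def global_average_pool(feature_maps):
    # Collapse each HxW feature map to its mean, yielding one value per
    # filter -- the per-filter summary a transition module would feed,
    # across filters of several sizes, into the classifier.
    pooled = []
    for fmap in feature_maps:           # fmap: list of rows
        vals = [v for row in fmap for v in row]
        pooled.append(sum(vals) / len(vals))
    return pooled
```

Because each filter contributes exactly one pooled value, the pooled vector's entries can specialize to classes, which is the class-specific-filter effect the abstract describes.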