Paper Group ANR 128
Capacity limitations of visual search in deep convolutional neural network. Exploiting Convolutional Representations for Multiscale Human Settlement Detection. Convolutional Neural Networks Via Node-Varying Graph Filters. Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary. Learning Rank Reduced Interpolation with Principal Component Analysis. Augmented Ensemble MCMC sampling in Factorial Hidden Markov Models. Stock Trading Using PE ratio: A Dynamic Bayesian Network Modeling on Behavioral Finance and Fundamental Investment. Anisotropic EM Segmentation by 3D Affinity Learning and Agglomeration. Noisy Tensor Completion for Tensors with a Sparse Canonical Polyadic Factor. Block-Diagonal and LT Codes for Distributed Computing With Straggling Servers. Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization. A Reverse Hex Solver. Nonnegative HMM for Babble Noise Derived from Speech HMM: Application to Speech Enhancement. Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification. Weakly Supervised Image Annotation and Segmentation with Objects and Attributes.
Capacity limitations of visual search in deep convolutional neural network
Title | Capacity limitations of visual search in deep convolutional neural network |
Authors | Endel Poder |
Abstract | Deep convolutional neural networks roughly follow the architecture of biological visual systems and have shown performance comparable to that of human observers in object recognition tasks. In this study, I test a pre-trained deep neural network in some classic visual search tasks. The results reveal a qualitative difference from human performance. It appears that there is no difference between searches for simple features that pop out in experiments with humans and searches for feature configurations that exhibit strict capacity limitations in human vision. Both types of stimuli reveal moderate capacity limitations in the neural network tested here. |
Tasks | Object Recognition |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1707.09775v1 |
PDF | http://arxiv.org/pdf/1707.09775v1.pdf |
PWC | https://paperswithcode.com/paper/capacity-limitations-of-visual-search-in-deep |
Repo | |
Framework | |
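The core analysis behind this abstract, how accuracy changes with the number of display items, is easy to illustrate. The sketch below is not the paper's code; it is a textbook signal-detection "max rule" observer showing how target-present/absent accuracy declines with set size, the kind of curve against which capacity limitations are diagnosed. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def search_accuracy(set_size, dprime=2.0, trials=20000):
    """Percent correct for target-present vs. target-absent displays
    under a max-rule signal-detection observer (one noisy response
    per display item, decision based on the maximum)."""
    present = rng.normal(size=(trials, set_size))
    present[:, 0] += dprime                  # one item carries the target signal
    absent = rng.normal(size=(trials, set_size))
    crit = dprime / 2                        # fixed decision criterion
    hits = (present.max(axis=1) > crit).mean()
    correct_rej = (absent.max(axis=1) <= crit).mean()
    return (hits + correct_rej) / 2

for n in (2, 4, 8, 16):
    print(f"set size {n:2d}: {search_accuracy(n):.3f}")
```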
Exploiting Convolutional Representations for Multiscale Human Settlement Detection
Title | Exploiting Convolutional Representations for Multiscale Human Settlement Detection |
Authors | Dalton Lunga, Dilip Patlolla, Lexie Yang, Jeanette Weaver, Budhendra Bhadhuri |
Abstract | We test this premise and explore representation spaces from a single deep convolutional network and their visualization to argue for a novel unified feature extraction framework. The objective is to utilize and re-purpose trained feature extractors without the need for network retraining on three remote sensing tasks, i.e., superpixel mapping, pixel-level segmentation, and semantic-based image visualization. By leveraging the same convolutional feature extractors and viewing them as visual information extractors that encode different image representation spaces, we demonstrate a preliminary inductive transfer learning potential on multiscale experiments that incorporate edge-level details up to semantic-level information. |
Tasks | Transfer Learning |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05683v1 |
PDF | http://arxiv.org/pdf/1707.05683v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-convolutional-representations-for |
Repo | |
Framework | |
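A minimal sketch of the re-purposing idea in the abstract above: tap a single pre-trained CNN at several depths, so that one forward pass yields edge-level through semantic-level representations. The choice of VGG-16 and the layer indices are illustrative assumptions, not the authors' configuration; assumes a recent torchvision with bundled weights.

```python
import torch
from torchvision import models

# pre-trained backbone used purely as a fixed feature extractor
cnn = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()

# taps after selected pooling stages: shallow taps keep edge-level
# detail, deep ones carry semantic-level information
taps = {4: "stage1", 16: "stage3", 30: "stage5"}

def extract(image):                      # image: (1, 3, H, W), normalized
    feats, x = {}, image
    with torch.no_grad():
        for i, layer in enumerate(cnn):
            x = layer(x)
            if i in taps:
                feats[taps[i]] = x
    return feats

feats = extract(torch.randn(1, 3, 224, 224))
for name, f in feats.items():
    print(name, tuple(f.shape))
```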
Convolutional Neural Networks Via Node-Varying Graph Filters
Title | Convolutional Neural Networks Via Node-Varying Graph Filters |
Authors | Fernando Gama, Geert Leus, Antonio G. Marques, Alejandro Ribeiro |
Abstract | Convolutional neural networks (CNNs) are being applied to an increasing number of problems and fields due to their superior performance in classification and regression tasks. Since two of the key operations that CNNs implement are convolution and pooling, this type of network is implicitly designed to act on data described by regular structures such as images. Motivated by the recent interest in processing signals defined on irregular domains, we advocate a CNN architecture that operates on signals supported on graphs. The proposed design replaces the classical convolution not with a node-invariant graph filter (GF), which is the natural generalization of convolution to graph domains, but with a node-varying GF. This filter extracts different local features without increasing the output dimension of each layer and, as a result, bypasses the need for a pooling stage while involving only local operations. A second contribution is to replace the node-varying GF with a hybrid node-varying GF, which is a new type of GF introduced in this paper. While the alternative architecture can still be run locally without requiring a pooling stage, the number of trainable parameters is smaller and can be rendered independent of the data dimension. Tests are run on a synthetic source localization problem and on the 20NEWS dataset. |
Tasks | |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10355v2 |
PDF | http://arxiv.org/pdf/1710.10355v2.pdf |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-via-node |
Repo | |
Framework | |
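The node-varying graph filter at the heart of this paper has a compact closed form, y = sum_k diag(h_k) S^k x, with per-node taps instead of a single shared tap per order. A minimal numpy sketch; the random graph, taps, and the choice of the adjacency matrix as shift operator are stand-ins:

```python
import numpy as np

def node_varying_gf(S, x, H):
    """Node-varying graph filter: y = sum_k diag(H[:, k]) @ S^k @ x.
    S: (N, N) graph shift operator, x: (N,) graph signal,
    H: (N, K) matrix of per-node filter taps."""
    N, K = H.shape
    y = np.zeros(N)
    xk = x.copy()                 # S^0 x
    for k in range(K):
        y += H[:, k] * xk         # per-node tap: diag(H[:, k]) S^k x
        xk = S @ xk               # shift once more: S^(k+1) x
    return y

rng = np.random.default_rng(1)
N, K = 20, 4
A = (rng.random((N, N)) < 0.2).astype(float)
S = np.triu(A, 1); S = S + S.T    # symmetric adjacency as shift operator
x = rng.normal(size=N)
H = rng.normal(size=(N, K))
print(node_varying_gf(S, x, H)[:5])
```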
Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary
Title | Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary |
Authors | Krishnaram Kenthapadi, Stuart Ambler, Liang Zhang, Deepak Agarwal |
Abstract | The recently launched LinkedIn Salary product has been designed with the goal of providing compensation insights to the world's professionals and thereby helping them optimize their earning potential. We describe the overall design and architecture of the statistical modeling system underlying this product. We focus on the unique data mining challenges encountered while designing and implementing the system, and describe modeling components, such as Bayesian hierarchical smoothing, that help compute and present robust compensation insights to users. We report on extensive evaluation with nearly one year of de-identified compensation data collected from over one million LinkedIn users, thereby demonstrating the efficacy of the statistical models. We also highlight the lessons learned through the deployment of our system at LinkedIn. |
Tasks | |
Published | 2017-03-29 |
URL | http://arxiv.org/abs/1703.09845v3 |
PDF | http://arxiv.org/pdf/1703.09845v3.pdf |
PWC | https://paperswithcode.com/paper/bringing-salary-transparency-to-the-world |
Repo | |
Framework | |
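The paper's modeling pipeline is proprietary, but the Bayesian hierarchical smoothing it names is a standard idea: shrink a sparse cohort's estimate toward its parent cohort in proportion to how little data it has. A hedged normal-normal sketch on log-salaries; the cohort structure, tau2 value, and numbers below are assumptions, not LinkedIn's model.

```python
import numpy as np

def smoothed_mean(cohort, parent_mean, tau2=0.05):
    """Shrink a cohort's mean log-salary toward its parent cohort
    (normal-normal empirical Bayes): small cohorts borrow strength."""
    cohort = np.asarray(cohort, dtype=float)
    n = len(cohort)
    s2 = cohort.var(ddof=1) if n > 1 else tau2   # fallback for tiny cohorts
    w = n / (n + s2 / tau2)                      # more data -> less shrinkage
    return w * cohort.mean() + (1 - w) * parent_mean

# e.g. few reports for a (title, region) cohort, many for the title overall
title_mean = np.log(72000)
cohort = np.log([81000, 90000, 86000])
print(round(float(np.exp(smoothed_mean(cohort, title_mean)))))
```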
Learning Rank Reduced Interpolation with Principal Component Analysis
Title | Learning Rank Reduced Interpolation with Principal Component Analysis |
Authors | Matthias Ochs, Henry Bradler, Rudolf Mester |
Abstract | In computer vision, most iterative optimization algorithms, both sparse and dense, rely on a coarse and reliable dense initialization to bootstrap their optimization procedure. For example, dense optical flow algorithms profit massively in speed and robustness if they are initialized well within the basin of convergence of the loss function used. The same holds true for methods such as sparse feature tracking, where initial flow or depth information is needed for new features at arbitrary positions. This makes it extremely important to have techniques at hand that can obtain, from only very few available measurements, a dense but still approximate sketch of a desired 2D structure (e.g., depth maps, optical flow, or disparity maps). The 2D map is regarded as a sample from a 2D random process. The method presented here exploits the complete information given by the principal component analysis (PCA) of that process: the principal basis and its prior distribution. The method is able to determine a dense reconstruction from sparse measurements. When facing situations with only very sparse measurements, the number of principal components is typically reduced further, which results in a loss of expressiveness of the basis. We overcome this problem by injecting prior knowledge in a maximum a posteriori (MAP) approach. We test our approach on the KITTI and Virtual KITTI datasets, focusing on the interpolation of depth maps for driving scenes. The results show good agreement with the ground truth and are clearly better than interpolation by the nearest-neighbor method, which disregards statistical information. |
Tasks | Optical Flow Estimation |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05061v1 |
PDF | http://arxiv.org/pdf/1703.05061v1.pdf |
PWC | https://paperswithcode.com/paper/learning-rank-reduced-interpolation-with |
Repo | |
Framework | |
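The reconstruction step described above reduces to a small linear solve: with principal basis U, mean mu, and prior variances lam from the PCA, the MAP coefficients given sparse samples come from a regularized normal equation. A sketch under those definitions; dimensions, the noise level, and the toy data are illustrative:

```python
import numpy as np

def pca_map_interpolate(U, mu, lam, idx, y, sigma2=1e-2):
    """MAP coefficients c for x = mu + U c, given sparse samples
    y = x[idx] + noise and Gaussian prior c ~ N(0, diag(lam))."""
    A = U[idx]                                   # sampled rows of the basis
    M = A.T @ A / sigma2 + np.diag(1.0 / lam)    # prior acts as regularizer
    c = np.linalg.solve(M, A.T @ (y - mu[idx]) / sigma2)
    return mu + U @ c                            # dense reconstruction

# toy check: data drawn from the PCA model itself
rng = np.random.default_rng(1)
D, r, m = 400, 10, 25                            # pixels, components, samples
U = np.linalg.qr(rng.normal(size=(D, r)))[0]
lam = 1.0 / np.arange(1, r + 1)                  # decaying prior variances
mu = np.zeros(D)
x = mu + U @ (np.sqrt(lam) * rng.normal(size=r))
idx = rng.choice(D, size=m, replace=False)
x_hat = pca_map_interpolate(U, mu, lam, idx, x[idx] + 0.01 * rng.normal(size=m))
print(np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```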
Augmented Ensemble MCMC sampling in Factorial Hidden Markov Models
Title | Augmented Ensemble MCMC sampling in Factorial Hidden Markov Models |
Authors | Kaspar Märtens, Michalis K Titsias, Christopher Yau |
Abstract | Bayesian inference for factorial hidden Markov models is challenging due to the exponentially sized latent variable space. Standard Monte Carlo samplers can have difficulties effectively exploring the posterior landscape and are often restricted to exploration around localised regions that depend on initialisation. We introduce a general purpose ensemble Markov Chain Monte Carlo (MCMC) technique to improve on existing poorly mixing samplers. This is achieved by combining parallel tempering and an auxiliary variable scheme to exchange information between the chains in an efficient way. The latter exploits a genetic algorithm within an augmented Gibbs sampler. We compare our technique with various existing samplers in a simulation study as well as in a cancer genomics application, demonstrating the improvements obtained by our augmented ensemble approach. |
Tasks | Bayesian Inference |
Published | 2017-03-24 |
URL | http://arxiv.org/abs/1703.08520v2 |
PDF | http://arxiv.org/pdf/1703.08520v2.pdf |
PWC | https://paperswithcode.com/paper/augmented-ensemble-mcmc-sampling-in-factorial |
Repo | |
Framework | |
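The exchange mechanism that underpins such ensemble samplers is easy to show in isolation. Below is plain parallel tempering on a toy bimodal target, not the paper's genetic-algorithm-augmented Gibbs scheme for factorial HMMs, just the adjacent-chain swap move it builds on; the target, ladder, and proposal scale are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
log_p = lambda x: np.logaddexp(-0.5 * (x + 4)**2, -0.5 * (x - 4)**2)  # bimodal

betas = np.array([1.0, 0.5, 0.25, 0.1])   # inverse-temperature ladder
x = rng.normal(size=len(betas))
samples = []
for it in range(20000):
    # within-chain random-walk Metropolis at each temperature
    for i, b in enumerate(betas):
        prop = x[i] + rng.normal(0, 1.5)
        if np.log(rng.random()) < b * (log_p(prop) - log_p(x[i])):
            x[i] = prop
    # exchange move between a random pair of adjacent chains
    i = rng.integers(len(betas) - 1)
    dlog = (betas[i] - betas[i + 1]) * (log_p(x[i + 1]) - log_p(x[i]))
    if np.log(rng.random()) < dlog:
        x[i], x[i + 1] = x[i + 1], x[i]
    samples.append(x[0])                   # keep the cold chain

print(np.mean(np.array(samples) > 0))      # ~0.5 when both modes are visited
```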
Stock Trading Using PE ratio: A Dynamic Bayesian Network Modeling on Behavioral Finance and Fundamental Investment
Title | Stock Trading Using PE ratio: A Dynamic Bayesian Network Modeling on Behavioral Finance and Fundamental Investment |
Authors | Haizhen Wang, Ratthachat Chatpatanasiri, Pairote Sattayatham |
Abstract | In daily investment decisions in a security market, the price-earnings (PE) ratio is one of the most widely used firm-valuation tools among investment experts. Unfortunately, recent academic developments in financial econometrics and machine learning rarely look at this tool. In practice, fundamental PE ratios are often estimated only by subjective expert opinions. The purpose of this research is to formalize the process of fundamental PE estimation by employing advanced dynamic Bayesian network (DBN) methodology. The estimated PE ratio from our model can be used either as decision support for an expert making investment decisions, or in an automatic trading system, as illustrated in our experiments. Forward-backward inference and EM parameter estimation algorithms are derived with respect to the proposed DBN structure. Unlike existing works in the literature, the economic interpretation of our DBN model is well justified by behavioral finance evidence on volatility. A simple but practical trading strategy is devised based on the results of Bayesian inference. Extensive experiments show that our trading strategy, equipped with the inferred PE ratios, consistently outperforms standard investment benchmarks. |
Tasks | Bayesian Inference |
Published | 2017-05-25 |
URL | http://arxiv.org/abs/1706.02985v1 |
PDF | http://arxiv.org/pdf/1706.02985v1.pdf |
PWC | https://paperswithcode.com/paper/stock-trading-using-pe-ratio-a-dynamic |
Repo | |
Framework | |
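The abstract mentions deriving forward-backward inference for the proposed DBN; the generic discrete-state version it specializes is short. A sketch with precomputed per-frame observation likelihoods (the paper's actual DBN structure and emission model differ; the numbers below are toy values):

```python
import numpy as np

def forward_backward(pi, A, B):
    """Standard HMM smoothing. pi: (S,) initial distribution,
    A: (S, S) transition matrix, B: (S, T) observation likelihoods
    B[s, t] = p(obs_t | state s). Returns P(state_t | all obs)."""
    S, T = B.shape
    alpha, beta = np.zeros((S, T)), np.ones((S, T))
    alpha[:, 0] = pi * B[:, 0]
    alpha[:, 0] /= alpha[:, 0].sum()             # rescale for stability
    for t in range(1, T):
        alpha[:, t] = B[:, t] * (A.T @ alpha[:, t - 1])
        alpha[:, t] /= alpha[:, t].sum()
    for t in range(T - 2, -1, -1):
        beta[:, t] = A @ (B[:, t + 1] * beta[:, t + 1])
        beta[:, t] /= beta[:, t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=0)

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.7, 0.1, 0.2],              # p(obs_t | state 0)
              [0.2, 0.3, 0.9, 0.8]])             # p(obs_t | state 1)
print(forward_backward(pi, A, B).round(2))
```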
Anisotropic EM Segmentation by 3D Affinity Learning and Agglomeration
Title | Anisotropic EM Segmentation by 3D Affinity Learning and Agglomeration |
Authors | Toufiq Parag, Fabian Tschopp, William Grisaitis, Srinivas C Turaga, Xuewen Zhang, Brian Matejek, Lee Kamentsky, Jeff W. Lichtman, Hanspeter Pfister |
Abstract | The field of connectomics has recently produced neuron wiring diagrams from relatively large brain regions of multiple animals. Most of these neural reconstructions were computed from isotropic (e.g., FIBSEM) or near-isotropic (e.g., SBEM) data. In spite of the remarkable progress on algorithms in recent years, automatic dense reconstruction from anisotropic data remains a challenge for the connectomics community. One significant hurdle in the segmentation of anisotropic data is the difficulty in generating a suitable initial over-segmentation. In this study, we present a segmentation method for anisotropic EM data that agglomerates a 3D over-segmentation computed from 3D affinity predictions. A 3D U-net is trained to predict 3D affinities using the MALIS approach. Experiments on multiple datasets demonstrate the strength and robustness of the proposed method for anisotropic EM segmentation. |
Tasks | |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08935v2 |
PDF | http://arxiv.org/pdf/1707.08935v2.pdf |
PWC | https://paperswithcode.com/paper/anisotropic-em-segmentation-by-3d-affinity |
Repo | |
Framework | |
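A toy version of the first stage described above, turning predicted 3D affinities into an initial over-segmentation, can be had by thresholding affinities and taking connected components. The paper uses a watershed-type scheme plus learned agglomeration; this only sketches the affinity-to-segments step, on random affinities.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def oversegment(aff, thr=0.9):
    """Over-segmentation from 3D affinities: keep edges whose predicted
    affinity exceeds thr, then label connected components.
    aff: (3, Z, Y, X), affinity of each voxel to its +z/+y/+x neighbour."""
    ids = np.arange(aff[0].size).reshape(aff.shape[1:])
    src, dst = [], []
    for axis in range(3):
        lo = [slice(None)] * 3; lo[axis] = slice(None, -1)
        hi = [slice(None)] * 3; hi[axis] = slice(1, None)
        keep = aff[axis][tuple(lo)] > thr        # high-affinity edges survive
        src.append(ids[tuple(lo)][keep])
        dst.append(ids[tuple(hi)][keep])
    src, dst = np.concatenate(src), np.concatenate(dst)
    g = coo_matrix((np.ones(len(src)), (src, dst)), shape=(ids.size,) * 2)
    _, seg = connected_components(g, directed=False)
    return seg.reshape(aff.shape[1:])

seg = oversegment(np.random.rand(3, 8, 32, 32), thr=0.95)
print(seg.max() + 1, "segments")
```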
Noisy Tensor Completion for Tensors with a Sparse Canonical Polyadic Factor
Title | Noisy Tensor Completion for Tensors with a Sparse Canonical Polyadic Factor |
Authors | Swayambhoo Jain, Alexander Gutierrez, Jarvis Haupt |
Abstract | In this paper we study the problem of noisy tensor completion for tensors that admit a canonical polyadic or CANDECOMP/PARAFAC (CP) decomposition in which one of the factors is sparse. We present general theoretical error bounds for an estimate obtained by using a complexity-regularized maximum likelihood principle and then instantiate these bounds for the case of additive white Gaussian noise. We also provide an ADMM-type algorithm for solving the complexity-regularized maximum likelihood problem and validate the theoretical findings via experiments on a synthetic data set. |
Tasks | |
Published | 2017-04-08 |
URL | http://arxiv.org/abs/1704.02534v1 |
PDF | http://arxiv.org/pdf/1704.02534v1.pdf |
PWC | https://paperswithcode.com/paper/noisy-tensor-completion-for-tensors-with-a |
Repo | |
Framework | |
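The estimator above is a complexity-regularized ML problem solved by an ADMM-type algorithm; as a lighter stand-in, the sketch below runs proximal gradient (ISTA) on the masked CP objective with an l1 penalty encouraging sparsity in the first factor. Dimensions, rank, step size, and penalty weight are all illustrative, not the paper's algorithm.

```python
import numpy as np

def cp(A, B, C):                              # rank-R CP tensor from factors
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def soft(x, t):                               # proximal step for the l1 penalty
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(3)
I, J, K, R = 20, 20, 20, 3
A0 = soft(rng.normal(size=(I, R)), 1.0)       # ground-truth sparse CP factor
B0, C0 = rng.normal(size=(J, R)), rng.normal(size=(K, R))
X = cp(A0, B0, C0) + 0.05 * rng.normal(size=(I, J, K))
M = rng.random((I, J, K)) < 0.3               # 30% of entries observed

A = 0.3 * rng.normal(size=(I, R))
B = 0.3 * rng.normal(size=(J, R))
C = 0.3 * rng.normal(size=(K, R))
lr, lam = 1e-3, 0.1
for it in range(2000):
    Rres = M * (cp(A, B, C) - X)              # residual on observed entries
    A = soft(A - lr * np.einsum('ijk,jr,kr->ir', Rres, B, C), lr * lam)
    B -= lr * np.einsum('ijk,ir,kr->jr', Rres, A, C)
    C -= lr * np.einsum('ijk,ir,jr->kr', Rres, A, B)
print(np.linalg.norm(M * (cp(A, B, C) - X)) / np.linalg.norm(M * X))
```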
Block-Diagonal and LT Codes for Distributed Computing With Straggling Servers
Title | Block-Diagonal and LT Codes for Distributed Computing With Straggling Servers |
Authors | Albin Severinson, Alexandre Graell i Amat, Eirik Rosnes |
Abstract | We propose two coded schemes for the distributed computing problem of multiplying a matrix by a set of vectors. The first scheme is based on partitioning the matrix into submatrices and applying maximum distance separable (MDS) codes to each submatrix. For this scheme, we prove that up to a given number of partitions the communication load and the computational delay (not including the encoding and decoding delay) are identical to those of the scheme recently proposed by Li et al., based on a single, long MDS code. However, due to the use of shorter MDS codes, our scheme yields a significantly lower overall computational delay when the delay incurred by encoding and decoding is also considered. We further propose a second coded scheme based on Luby Transform (LT) codes under inactivation decoding. Interestingly, LT codes may reduce the delay over the partitioned scheme at the expense of an increased communication load. We also consider distributed computing under a deadline and show numerically that the proposed schemes outperform other schemes in the literature, with the LT code-based scheme yielding the best performance for the scenarios considered. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.08230v3 |
PDF | http://arxiv.org/pdf/1712.08230v3.pdf |
PWC | https://paperswithcode.com/paper/block-diagonal-and-lt-codes-for-distributed |
Repo | |
Framework | |
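The partitioned-MDS idea is small enough to demo end to end: encode the k row-blocks of A with an n-by-k real Vandermonde generator, let each server multiply one coded block, and decode from any k finished servers. A sketch; the sizes and the straggler pattern are made up, and a real deployment would code over a finite field rather than the reals.

```python
import numpy as np

rng = np.random.default_rng(4)
k, n = 3, 5                     # any k of n servers suffice (MDS property)
A = rng.normal(size=(6, 4))     # matrix to multiply, split into k row blocks
x = rng.normal(size=4)

blocks = np.split(A, k)         # k submatrices of equal size
G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)
coded = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]

# each server i computes coded[i] @ x; suppose servers 0 and 3 straggle
done = [1, 2, 4]
results = np.array([coded[i] @ x for i in done])

# decode: invert the k x k generator submatrix of the finished servers
dec = np.linalg.solve(G[done], results)       # recovers each block's product
print(np.allclose(dec.reshape(-1), A @ x))    # True: full product recovered
```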
Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization
Title | Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization |
Authors | Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara |
Abstract | This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech. A standard approach to speech enhancement is to train a deep neural network (DNN) to take noisy speech as input and output clean speech. This supervised approach, however, requires a very large amount of paired data for training and is not robust against unknown environments. Another approach is to use non-negative matrix factorization (NMF) based on basis spectra trained on clean speech in advance and those adapted to noise on the fly. This semi-supervised approach, however, causes considerable signal distortion in enhanced speech due to the unrealistic assumption that speech spectrograms are linear combinations of the basis spectra. Replacing the poor linear generative model of clean speech in NMF with a VAE, a powerful nonlinear deep generative model trained on clean speech, we formulate a unified probabilistic generative model of noisy speech. Given noisy speech as observed data, we can sample clean speech from its posterior distribution. The proposed method outperformed the conventional DNN-based method in unseen noisy environments. |
Tasks | Speech Enhancement |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11439v4 |
PDF | http://arxiv.org/pdf/1710.11439v4.pdf |
PWC | https://paperswithcode.com/paper/statistical-speech-enhancement-based-on |
Repo | |
Framework | |
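A sketch of the unified generative model described above: clean-speech power comes from a VAE decoder applied to Gaussian latents, noise power from an NMF model, and each noisy STFT bin is modeled with variance equal to their sum, which directly yields a Wiener-style posterior gain. The decoder below is a random-weight placeholder for a trained network, and the dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
F, T, K, D = 64, 50, 8, 16          # freq bins, frames, NMF rank, latent dim

# stand-in for a trained VAE decoder mapping latents to positive
# clean-speech power spectra (a real system uses the trained network)
W_dec = np.abs(rng.normal(size=(F, D)))
decode = lambda Z: W_dec @ np.exp(Z)          # (F, T), strictly positive

Z = rng.normal(size=(D, T))                   # speech latents, z_t ~ N(0, I)
W_noise = np.abs(rng.normal(size=(F, K)))     # NMF noise bases
H_noise = np.abs(rng.normal(size=(K, T)))     # NMF noise activations

speech_pow, noise_pow = decode(Z), W_noise @ H_noise
# local Gaussian model: each noisy STFT bin is complex normal with
# variance = speech power + noise power
noisy_pow = speech_pow + noise_pow
wiener = speech_pow / noisy_pow               # posterior-mean gain per bin
print(wiener.shape, float(wiener.min()) >= 0, float(wiener.max()) <= 1)
```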
A Reverse Hex Solver
Title | A Reverse Hex Solver |
Authors | Kenny Young, Ryan B. Hayward |
Abstract | We present Solrex, an automated solver for the game of Reverse Hex. Reverse Hex, also known as Rex or Misère Hex, is the variant of the game of Hex in which the player who joins her two sides loses the game. Solrex performs a mini-max search of the state space using Scalable Parallel Depth First Proof Number Search, enhanced by the pruning of inferior moves and the early detection of certain winning strategies. Solrex is implemented on the same code base as the Hex program Solver, and can solve arbitrary positions on board sizes up to 6x6, with the hardest position taking less than four hours on four threads. |
Tasks | |
Published | 2017-04-26 |
URL | http://arxiv.org/abs/1707.00627v1 |
PDF | http://arxiv.org/pdf/1707.00627v1.pdf |
PWC | https://paperswithcode.com/paper/a-reverse-hex-solver |
Repo | |
Framework | |
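Solrex's actual machinery (parallel depth-first proof-number search with pruning) is beyond a snippet, but the underlying idea, exhaustive minimax over a misère game with memoised positions, fits in a few lines. Misère Nim stands in for Rex here purely to stay self-contained; this is not the paper's algorithm.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def wins(piles):
    """True iff the player to move wins misere Nim from `piles`
    (taking the last stone LOSES, mirroring the misere rule of Rex)."""
    piles = tuple(sorted(p for p in piles if p))   # canonical position
    if not piles:
        return True        # opponent just took the last stone and lost
    # win if any move leaves the opponent in a losing position
    return any(not wins(piles[:i] + (piles[i] - t,) + piles[i + 1:])
               for i in range(len(piles))
               for t in range(1, piles[i] + 1))

print(wins((1, 2, 3)), wins((1,)), wins((2, 2)))
```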
Nonnegative HMM for Babble Noise Derived from Speech HMM: Application to Speech Enhancement
Title | Nonnegative HMM for Babble Noise Derived from Speech HMM: Application to Speech Enhancement |
Authors | Nasser Mohammadiha, Arne Leijon |
Abstract | Deriving a good model for multitalker babble noise can facilitate different speech processing algorithms, e.g., noise reduction, to reduce the so-called cocktail party problem. In the available systems, the fact that the babble waveform is generated as a sum of N different speech waveforms is not exploited explicitly. In this paper, we first develop a gamma hidden Markov model for power spectra of the speech signal, and then formulate it as a sparse nonnegative matrix factorization (NMF). Second, the sparse NMF is extended by relaxing the sparsity constraint, and a novel model for babble noise (gamma nonnegative HMM) is proposed in which the babble basis matrix is the same as the speech basis matrix, and only the activation factors (weights) of the basis vectors differ between the two signals over time. Finally, a noise reduction algorithm is proposed using the derived speech and babble models. All of the stationary model parameters are estimated using the expectation-maximization (EM) algorithm, whereas the time-varying parameters, i.e., the gain parameters of the speech and babble signals, are estimated using a recursive EM algorithm. Objective and subjective listening evaluations show that the proposed babble model and the final noise reduction algorithm significantly outperform conventional methods. |
Tasks | Speech Enhancement |
Published | 2017-09-16 |
URL | http://arxiv.org/abs/1709.05559v1 |
PDF | http://arxiv.org/pdf/1709.05559v1.pdf |
PWC | https://paperswithcode.com/paper/nonnegative-hmm-for-babble-noise-derived-from |
Repo | |
Framework | |
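The central modeling move above, babble sharing the speech basis and differing only in its activations, can be sketched with plain NMF: hold a speech-trained basis W fixed and fit only the activations H to a babble spectrogram. The paper's actual model is a gamma nonnegative HMM fitted with (recursive) EM; the multiplicative update below is the generic Euclidean-NMF version on random stand-in data.

```python
import numpy as np

rng = np.random.default_rng(7)
F, T, K = 64, 100, 10
W = np.abs(rng.normal(size=(F, K)))      # basis spectra trained on SPEECH
V = np.abs(rng.normal(size=(F, T)))      # babble power spectrogram

# fit only the activations H with the speech basis held fixed,
# using the standard multiplicative update for Euclidean NMF
H = np.abs(rng.normal(size=(K, T)))
for _ in range(200):
    H *= (W.T @ V) / (W.T @ (W @ H) + 1e-12)

print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```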
Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification
Title | Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification |
Authors | Daniel Michelsanti, Zheng-Hua Tan |
Abstract | Improving speech system performance in noisy environments remains a challenging task, and speech enhancement (SE) is one of the effective techniques to solve the problem. Motivated by the promising results of generative adversarial networks (GANs) in a variety of image processing tasks, we explore the potential of conditional GANs (cGANs) for SE, and in particular, we make use of the image processing framework proposed by Isola et al. [1] to learn a mapping from the spectrogram of noisy speech to an enhanced counterpart. The SE cGAN consists of two networks, trained in an adversarial manner: a generator that tries to enhance the input noisy spectrogram, and a discriminator that tries to distinguish between enhanced spectrograms provided by the generator and clean ones from the database using the noisy spectrogram as a condition. We evaluate the performance of the cGAN method in terms of perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI), and equal error rate (EER) of speaker verification (an example application). Experimental results show that the cGAN method overall outperforms the classical short-time spectral amplitude minimum mean square error (STSA-MMSE) SE algorithm, and is comparable to a deep neural network-based SE approach (DNN-SE). |
Tasks | Speaker Verification, Speech Enhancement |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.01703v2 |
PDF | http://arxiv.org/pdf/1709.01703v2.pdf |
PWC | https://paperswithcode.com/paper/conditional-generative-adversarial-networks-1 |
Repo | |
Framework | |
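A stripped-down cGAN training step in the spirit of the abstract above: the discriminator scores (candidate, noisy-condition) pairs, and the generator combines the adversarial loss with an L1 term, as in Isola et al. The frame-wise MLPs and random tensors below are stand-ins for the paper's convolutional networks and real spectrograms; assumes PyTorch.

```python
import torch
import torch.nn as nn

F = 129                                  # frequency bins per frame
G = nn.Sequential(nn.Linear(F, 256), nn.ReLU(), nn.Linear(256, F))
D = nn.Sequential(nn.Linear(2 * F, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def step(noisy, clean):                  # (batch, F) spectrogram frames
    fake = G(noisy)
    ones = torch.ones(len(noisy), 1)
    zeros = torch.zeros(len(noisy), 1)
    # discriminator sees (candidate, condition) pairs
    d_loss = bce(D(torch.cat([clean, noisy], 1)), ones) \
           + bce(D(torch.cat([fake.detach(), noisy], 1)), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # generator tries to fool D, plus an L1 reconstruction term
    g_loss = bce(D(torch.cat([fake, noisy], 1)), ones) \
           + 100 * (fake - clean).abs().mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

print(step(torch.randn(8, F), torch.randn(8, F)))
```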
Weakly Supervised Image Annotation and Segmentation with Objects and Attributes
Title | Weakly Supervised Image Annotation and Segmentation with Objects and Attributes |
Authors | Zhiyuan Shi, Yongxin Yang, Timothy M. Hospedales, Tao Xiang |
Abstract | We propose to model complex visual scenes using a non-parametric Bayesian model learned from weakly labelled images abundant on media sharing sites such as Flickr. Given weak image-level annotations of objects and attributes without locations or associations between them, our model aims to learn the appearance of object and attribute classes as well as their association on each object instance. Once learned, given an image, our model can be deployed to tackle a number of vision problems in a joint and coherent manner, including recognising objects in the scene (automatic object annotation), describing objects using their attributes (attribute prediction and association), and localising and delineating the objects (object detection and semantic segmentation). This is achieved by developing a novel Weakly Supervised Markov Random Field Stacked Indian Buffet Process (WS-MRF-SIBP) that models objects and attributes as latent factors and explicitly captures their correlations within and across superpixels. Extensive experiments on benchmark datasets demonstrate that our weakly supervised model significantly outperforms weakly supervised alternatives and is often comparable with existing strongly supervised models on a variety of tasks including semantic segmentation, automatic image annotation and retrieval based on object-attribute associations. |
Tasks | Object Detection, Semantic Segmentation |
Published | 2017-08-08 |
URL | http://arxiv.org/abs/1708.02459v1 |
PDF | http://arxiv.org/pdf/1708.02459v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-image-annotation-and |
Repo | |
Framework | |
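The nonparametric prior inside WS-MRF-SIBP is the Indian Buffet Process, which lets each image region activate an unbounded set of latent object/attribute factors. Below is a standard sampler for that prior alone; alpha and the sizes are arbitrary, and the paper's full model adds MRF coupling across superpixels and weak supervision on top.

```python
import numpy as np

def sample_ibp(N, alpha, rng=None):
    """Sample a binary object-by-feature matrix Z from the Indian
    Buffet Process: customer i takes existing dish k with probability
    m_k / i, then tries Poisson(alpha / i) new dishes."""
    rng = rng or np.random.default_rng(0)
    dish_counts = []                      # popularity of each dish so far
    rows = []
    for i in range(N):                    # customer i+1 enters the buffet
        take = [rng.random() < c / (i + 1) for c in dish_counts]
        for j, t in enumerate(take):
            dish_counts[j] += t
        new = rng.poisson(alpha / (i + 1))
        dish_counts += [1] * new
        rows.append([int(t) for t in take] + [1] * new)
    Z = np.zeros((N, len(dish_counts)), dtype=int)
    for i, r in enumerate(rows):
        Z[i, :len(r)] = r
    return Z

Z = sample_ibp(10, alpha=2.0)
print(Z.shape, Z.sum(axis=0))             # feature count grows with N
```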