Paper Group ANR 680
Investigation of Using VAE for i-Vector Speaker Verification
Title | Investigation of Using VAE for i-Vector Speaker Verification |
Authors | Timur Pekhovsky, Maxim Korenevsky |
Abstract | A new system for i-vector speaker recognition based on a variational autoencoder (VAE) is investigated. VAE is a promising approach for developing accurate deep nonlinear generative models of complex data. Experiments show that VAE provides speaker embedding and can be effectively trained in an unsupervised manner. An LLR estimate for VAE is developed, and experiments on NIST SRE 2010 data demonstrate its correctness. Additionally, we show that the performance of the VAE-based system in the i-vector space is close to that of diagonal PLDA. Several interesting results are also observed in the experiments with $\beta$-VAE. In particular, we found that for $\beta\ll 1$, VAE can be trained to capture the features of complex input data distributions in an effective way, which is hard to obtain in the standard VAE ($\beta=1$). |
Tasks | Speaker Recognition, Speaker Verification |
Published | 2017-05-25 |
URL | http://arxiv.org/abs/1705.09185v1, http://arxiv.org/pdf/1705.09185v1.pdf |
PWC | https://paperswithcode.com/paper/investigation-of-using-vae-for-i-vector |
Repo | |
Framework | |
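For reference, the $\beta$-VAE objective that the abstract alludes to is commonly written as below. This is the standard formulation from the $\beta$-VAE literature and not necessarily the exact form used by the authors; the symbols are assumptions introduced here: $x$ is an input i-vector, $z$ the latent variable, $q_\phi$ the encoder, $p_\theta$ the decoder and $p(z)$ the prior.

```latex
\mathcal{L}_\beta(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - \beta \, D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

Setting $\beta = 1$ recovers the usual evidence lower bound, while $\beta \ll 1$ down-weights the KL term so that the reconstruction term dominates training.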
On the Local Structure of Stable Clustering Instances
Title | On the Local Structure of Stable Clustering Instances |
Authors | Vincent Cohen-Addad, Chris Schwiegelshohn |
Abstract | We study the classic $k$-median and $k$-means clustering objectives in the beyond-worst-case scenario. We consider three well-studied notions of structured data that aim at characterizing real-world inputs: Distribution Stability (introduced by Awasthi, Blum, and Sheffet, FOCS 2010), Spectral Separability (introduced by Kumar and Kannan, FOCS 2010), and Perturbation Resilience (introduced by Bilu and Linial, ICS 2010). We prove structural results showing that inputs satisfying at least one of the conditions are inherently “local”. Namely, for any such input, any local optimum is close both in terms of structure and in terms of objective value to the global optima. As a corollary we obtain that the widely-used Local Search algorithm has strong performance guarantees for both the tasks of recovering the underlying optimal clustering and obtaining a clustering of small cost. This is a significant step toward understanding the success of local search heuristics in clustering applications. |
Tasks | |
Published | 2017-01-29 |
URL | http://arxiv.org/abs/1701.08423v3, http://arxiv.org/pdf/1701.08423v3.pdf |
PWC | https://paperswithcode.com/paper/on-the-local-structure-of-stable-clustering |
Repo | |
Framework | |
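To make the Local Search heuristic named in the abstract concrete, here is a minimal single-swap sketch for $k$-median. It is an illustration under stated assumptions rather than the algorithm analysed in the paper: points are distinct one-dimensional numbers, the initial centers are simply the first $k$ points, and the function names `cost` and `local_search_k_median` are hypothetical.

```python
import itertools


def cost(points, centers):
    """k-median cost: sum of distances from each point to its nearest center."""
    return sum(min(abs(p - c) for c in centers) for p in points)


def local_search_k_median(points, k):
    """Single-swap local search (assumes distinct 1-D points).

    Starts from the first k points and repeatedly swaps one center for one
    non-center point whenever the swap strictly lowers the cost.
    """
    centers = list(points[:k])
    improved = True
    while improved:
        improved = False
        for i, candidate in itertools.product(range(k), points):
            if candidate in centers:
                continue
            trial = centers[:i] + [candidate] + centers[i + 1:]
            if cost(points, trial) < cost(points, centers):
                centers = trial
                improved = True
                break  # restart the scan from the new solution
    return sorted(centers)


if __name__ == "__main__":
    print(local_search_k_median([0, 1, 2, 10, 11, 12], 2))
```

The loop terminates because every accepted swap strictly decreases the cost and there are finitely many center sets. The result is a local optimum, which the abstract argues is close to the global optimum on stable instances.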
A Spectral Approach for the Design of Experiments: Design, Analysis and Algorithms
Title | A Spectral Approach for the Design of Experiments: Design, Analysis and Algorithms |
Authors | Bhavya Kailkhura, Jayaraman J. Thiagarajan, Charvi Rastogi, Pramod K. Varshney, Peer-Timo Bremer |
Abstract | This paper proposes a new approach to construct high quality space-filling sample designs. First, we propose a novel technique to quantify the space-filling property and optimally trade off uniformity and randomness in sample designs in arbitrary dimensions. Second, we connect the proposed metric (defined in the spatial domain) to the objective measure of the design performance (defined in the spectral domain). This connection serves as an analytic framework for evaluating the qualitative properties of space-filling designs in general. Using the theoretical insights provided by this spatial-spectral analysis, we derive the notion of optimal space-filling designs, which we refer to as space-filling spectral designs. Third, we propose an efficient estimator to evaluate the space-filling properties of sample designs in arbitrary dimensions and use it to develop an optimization framework to generate high quality space-filling designs. Finally, we carry out a detailed performance comparison on two different applications in 2 to 6 dimensions: a) image reconstruction and b) surrogate modeling on several benchmark optimization functions and an inertial confinement fusion (ICF) simulation code. We demonstrate that the proposed spectral designs significantly outperform existing approaches, especially in high dimensions. |
Tasks | Image Reconstruction |
Published | 2017-12-16 |
URL | http://arxiv.org/abs/1712.06028v1, http://arxiv.org/pdf/1712.06028v1.pdf |
PWC | https://paperswithcode.com/paper/a-spectral-approach-for-the-design-of |
Repo | |
Framework | |
Diversified Texture Synthesis with Feed-forward Networks
Title | Diversified Texture Synthesis with Feed-forward Networks |
Authors | Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang |
Abstract | Recent progress in deep discriminative and generative modeling has shown promising results on texture synthesis. However, existing feed-forward methods trade off generality for efficiency and suffer from many issues, such as shortage of generality (i.e., building one network per texture), lack of diversity (i.e., always producing visually identical output) and suboptimality (i.e., generating less satisfying visual effects). In this work, we focus on solving these issues for improved texture synthesis. We propose a deep generative feed-forward network which enables efficient synthesis of multiple textures within one single network and meaningful interpolation between them. Meanwhile, a suite of important techniques is introduced to achieve better convergence and diversity. With extensive experiments, we demonstrate the effectiveness of the proposed model and techniques for synthesizing a large number of textures and show its applications to stylization. |
Tasks | Texture Synthesis |
Published | 2017-03-05 |
URL | http://arxiv.org/abs/1703.01664v1, http://arxiv.org/pdf/1703.01664v1.pdf |
PWC | https://paperswithcode.com/paper/diversified-texture-synthesis-with-feed |
Repo | |
Framework | |
Nonnegative Restricted Boltzmann Machines for Parts-based Representations Discovery and Predictive Model Stabilization
Title | Nonnegative Restricted Boltzmann Machines for Parts-based Representations Discovery and Predictive Model Stabilization |
Authors | Tu Dinh Nguyen, Truyen Tran, Dinh Phung, Svetha Venkatesh |
Abstract | The success of any machine learning system depends critically on effective representations of data. In many cases, it is desirable that a representation scheme uncovers the parts-based, additive nature of the data. Of current representation learning schemes, restricted Boltzmann machines (RBMs) have proved to be highly effective in unsupervised settings. However, when it comes to parts-based discovery, RBMs do not usually produce satisfactory results. We enhance this capacity of RBMs by introducing nonnegativity into the model weights, resulting in a variant called the nonnegative restricted Boltzmann machine (NRBM). The NRBM not only produces a controllable decomposition of data into interpretable parts but also offers a way to estimate the intrinsic nonlinear dimensionality of data, and helps to stabilize linear predictive models. We demonstrate the capacity of our model on applications such as handwritten digit recognition, face recognition, document classification and patient readmission prognosis. The decomposition quality on images is comparable with or better than that produced by nonnegative matrix factorization (NMF), and the thematic features uncovered from text are qualitatively interpretable in a similar manner to those of latent Dirichlet allocation (LDA). The stability performance of feature selection on medical data is better than RBM and competitive with NMF. The learned features, when used for classification, are more discriminative than those discovered by both NMF and LDA and comparable with those by RBM. |
Tasks | Document Classification, Face Recognition, Feature Selection, Handwritten Digit Recognition, Representation Learning |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.05603v1, http://arxiv.org/pdf/1708.05603v1.pdf |
PWC | https://paperswithcode.com/paper/nonnegative-restricted-boltzmann-machines-for |
Repo | |
Framework | |
Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution
Title | Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution |
Authors | Koichiro Yoshino, Shinsuke Mori, Satoshi Nakamura |
Abstract | This paper investigates and analyzes the effect of dependency information on predicate-argument structure analysis (PASA) and zero anaphora resolution (ZAR) for Japanese, and shows that a straightforward approach to PASA and ZAR works effectively even if dependency information is not available. We constructed an analyzer that directly predicts relationships of predicates and arguments with their semantic roles from a POS-tagged corpus. The features of the system are designed to compensate for the absence of syntactic information by using features used in dependency parsing as a reference. We also constructed analyzers that use the oracle dependency and the real dependency parsing results, and compared them with the system that does not use any syntactic information to verify that the improvement provided by dependencies is not crucial. |
Tasks | Dependency Parsing |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1705.10962v1, http://arxiv.org/pdf/1705.10962v1.pdf |
PWC | https://paperswithcode.com/paper/analysis-of-the-effect-of-dependency |
Repo | |
Framework | |
Neuron Pruning for Compressing Deep Networks using Maxout Architectures
Title | Neuron Pruning for Compressing Deep Networks using Maxout Architectures |
Authors | Fernando Moya Rueda, Rene Grzeszick, Gernot A. Fink |
Abstract | This paper presents an efficient and robust approach for reducing the size of deep neural networks by pruning entire neurons. It exploits maxout units for combining neurons into more complex convex functions and it makes use of a local relevance measurement that ranks neurons according to their activation on the training set for pruning them. Additionally, a parameter reduction comparison between neuron and weight pruning is shown. It will be empirically shown that the proposed neuron pruning reduces the number of parameters dramatically. The evaluation is performed on two tasks, the MNIST handwritten digit recognition and the LFW face verification, using a LeNet-5 and a VGG16 network architecture. The network size is reduced by up to $74\%$ and $61\%$, respectively, without affecting the network’s performance. The main advantage of neuron pruning is its direct influence on the size of the network architecture. Furthermore, it will be shown that neuron pruning can be combined with subsequent weight pruning, reducing the size of the LeNet-5 and VGG16 by up to $92\%$ and $80\%$, respectively. |
Tasks | Face Verification, Handwritten Digit Recognition |
Published | 2017-07-21 |
URL | http://arxiv.org/abs/1707.06838v1, http://arxiv.org/pdf/1707.06838v1.pdf |
PWC | https://paperswithcode.com/paper/neuron-pruning-for-compressing-deep-networks |
Repo | |
Framework | |
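The abstract describes ranking neurons by their activation on the training set before pruning. The sketch below shows one plausible reading of that idea for a single maxout unit: count how often each neuron supplies the unit's maximum and keep the most frequent winners. The relevance measure, the function names and the toy data are all assumptions made for illustration; the paper's actual criterion may differ.

```python
def maxout_win_counts(activations):
    """Count how often each neuron provides the maximum of its maxout unit.

    activations: list of samples, each a list with one pre-activation value
    per neuron in the unit.
    """
    counts = [0] * len(activations[0])
    for sample in activations:
        counts[sample.index(max(sample))] += 1
    return counts


def neurons_to_keep(activations, keep):
    """Return indices of the `keep` neurons that win the max most often."""
    counts = maxout_win_counts(activations)
    ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    # Hypothetical pre-activations of a 4-neuron maxout unit on 4 samples.
    acts = [
        [0.1, 0.9, 0.3, 0.2],
        [0.2, 0.8, 0.1, 0.7],
        [0.5, 0.1, 0.0, 0.9],
        [0.0, 0.6, 0.2, 0.1],
    ]
    print(maxout_win_counts(acts), neurons_to_keep(acts, keep=2))
```

Neurons that never win the maximum do not influence the unit's output on the training set, which is why removing them first is a natural choice in this sketch.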
Developing All-Skyrmion Spiking Neural Network
Title | Developing All-Skyrmion Spiking Neural Network |
Authors | Zhezhi He, Deliang Fan |
Abstract | In this work, we have proposed a revolutionary neuromorphic computing methodology to implement an All-Skyrmion Spiking Neural Network (AS-SNN). The proposed methodology is based on our finding that a skyrmion is a topologically stable spin texture and that its spatiotemporal motion along the magnetic nano-track intuitively interprets the pulse signal transmission between two interconnected neurons. In such a design, a spike train in an SNN can be encoded as a particle-like skyrmion train and further processed by the proposed skyrmion-synapse and skyrmion-neuron within the same magnetic nano-track to generate an output skyrmion as the post-spike. Both pre-neuron spikes and post-neuron spikes are thus encoded as particle-like skyrmions without conversion between charge and spin signals, which fundamentally differentiates our proposed design from other hybrid Spin-CMOS designs. The system-level simulation shows 87.1% inference accuracy for a handwritten digit recognition task, while the energy dissipation is ~1 fJ per spike, which is 3 orders of magnitude smaller than that of the CMOS-based IBM TrueNorth system. |
Tasks | Handwritten Digit Recognition |
Published | 2017-05-08 |
URL | http://arxiv.org/abs/1705.02995v1, http://arxiv.org/pdf/1705.02995v1.pdf |
PWC | https://paperswithcode.com/paper/developing-all-skyrmion-spiking-neural |
Repo | |
Framework | |
Handwritten Arabic Numeral Recognition using Deep Learning Neural Networks
Title | Handwritten Arabic Numeral Recognition using Deep Learning Neural Networks |
Authors | Akm Ashiquzzaman, Abdul Kawsar Tushar |
Abstract | Handwritten character recognition is an active area of research with applications in numerous fields. Past and recent works in this field have concentrated on various languages. Arabic is one language where the scope of research is still widespread, with it being one of the most popular languages in the world and being syntactically different from other major languages. Das et al. \cite{DBLP:journals/corr/abs-1003-1891} have pioneered the research on handwritten digit recognition in Arabic. In this paper, we propose a novel algorithm based on deep learning neural networks using an appropriate activation function and regularization layer, which shows significantly improved accuracy compared to the existing Arabic numeral recognition methods. The proposed model gives 97.4 percent accuracy, which is the highest recorded accuracy on the dataset used in the experiment. We also propose a modification of the method described in \cite{DBLP:journals/corr/abs-1003-1891}, where our method achieves the same accuracy as that of \cite{DBLP:journals/corr/abs-1003-1891}, namely 93.8 percent. |
Tasks | Handwritten Digit Recognition |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04663v1, http://arxiv.org/pdf/1702.04663v1.pdf |
PWC | https://paperswithcode.com/paper/handwritten-arabic-numeral-recognition-using |
Repo | |
Framework | |
On Self-Adaptive Mutation Restarts for Evolutionary Robotics with Real Rotorcraft
Title | On Self-Adaptive Mutation Restarts for Evolutionary Robotics with Real Rotorcraft |
Authors | Gerard David Howard |
Abstract | Self-adaptive parameters are increasingly used in the field of Evolutionary Robotics, as they allow key evolutionary rates to vary autonomously in a context-sensitive manner throughout the optimisation process. A significant limitation to self-adaptive mutation is that rates can be set unfavourably, which hinders convergence. Rate restarts are typically employed to remedy this, but thus far have only been applied in Evolutionary Robotics for mutation-only algorithms. This paper focuses on the level at which evolutionary rate restarts are applied in population-based algorithms with more than 1 evolutionary operator. After testing on a real hexacopter hovering task, we conclude that individual-level restarting results in higher fitness solutions without fitness stagnation, and population restarts provide a more stable rate evolution. Without restarts, experiments can become stuck in suboptimal controller/rate combinations which can be difficult to escape from. |
Tasks | |
Published | 2017-03-31 |
URL | http://arxiv.org/abs/1703.10754v2, http://arxiv.org/pdf/1703.10754v2.pdf |
PWC | https://paperswithcode.com/paper/on-self-adaptive-mutation-restarts-for |
Repo | |
Framework | |
Climbing the Kaggle Leaderboard by Exploiting the Log-Loss Oracle
Title | Climbing the Kaggle Leaderboard by Exploiting the Log-Loss Oracle |
Authors | Jacob Whitehill |
Abstract | In the context of data-mining competitions (e.g., Kaggle, KDDCup, ILSVRC Challenge), we show how access to an oracle that reports a contestant’s log-loss score on the test set can be exploited to deduce the ground-truth of some of the test examples. By applying this technique iteratively to batches of $m$ examples (for small $m$), all of the test labels can eventually be inferred. In this paper, (1) We demonstrate this attack on the first stage of a recent Kaggle competition (Intel & MobileODT Cancer Screening) and use it to achieve a log-loss of $0.00000$ (and thus attain a rank of #4 out of 848 contestants), without ever training a classifier to solve the actual task. (2) We prove an upper bound on the batch size $m$ as a function of the floating-point resolution of the probability estimates that the contestant submits for the labels. (3) We derive, and demonstrate in simulation, a more flexible attack that can be used even when the oracle reports the accuracy on an unknown (but fixed) subset of the test set’s labels. These results underline the importance of evaluating contestants based only on test data that the oracle does not examine. |
Tasks | |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01825v1, http://arxiv.org/pdf/1707.01825v1.pdf |
PWC | https://paperswithcode.com/paper/climbing-the-kaggle-leaderboard-by-exploiting |
Repo | |
Framework | |
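The core idea in the abstract, recovering test labels from a reported log-loss, can be sketched for the binary case as follows. This is an illustrative encoding under stated assumptions and not necessarily the paper's scheme: the oracle is simulated locally with full floating-point precision, labels are binary, and the names `log_loss`, `infer_batch` and the constant `c` are hypothetical. Every example outside the probed batch receives probability 0.5, and the $j$-th probed example receives a probability whose logit is $c \cdot 2^j$, so each combination of labels shifts the total loss by a distinct amount.

```python
import math


def log_loss(y_true, p_pred):
    """Mean binary log-loss, as a simulated leaderboard oracle would report it."""
    return -sum(math.log(p if y == 1 else 1.0 - p)
                for y, p in zip(y_true, p_pred)) / len(y_true)


def infer_batch(oracle, n, batch, c=0.1):
    """Recover the labels of the indices in `batch` from a single oracle query."""
    probs = [0.5] * n
    for j, idx in enumerate(batch):
        # logit(p_j) = c * 2**j, so every label subset yields a distinct loss
        probs[idx] = 1.0 / (1.0 + math.exp(-c * 2 ** j))
    loss = oracle(probs)
    # Total loss if every probed label were 0
    baseline = (n - len(batch)) * math.log(2.0) - sum(
        math.log(1.0 - probs[idx]) for idx in batch)
    # Each label equal to 1 lowers the total loss by c * 2**j
    code = round((baseline - n * loss) / c)
    return [(code >> j) & 1 for j in range(len(batch))]


if __name__ == "__main__":
    hidden = [1, 0, 1, 1, 0, 0, 1, 0]  # ground truth known only to the oracle
    oracle = lambda p: log_loss(hidden, p)
    print(infer_batch(oracle, len(hidden), [0, 1, 2, 3]))
    print(infer_batch(oracle, len(hidden), [4, 5, 6, 7]))
```

A real leaderboard rounds the reported score, which limits how many labels one query can encode; this matches the abstract's point that the batch size $m$ is bounded by the floating-point resolution.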
Hyperspectral Light Field Stereo Matching
Title | Hyperspectral Light Field Stereo Matching |
Authors | Kang Zhu, Yujia Xue, Qiang Fu, Sing Bing Kang, Xilin Chen, Jingyi Yu |
Abstract | In this paper, we describe how scene depth can be extracted using a hyperspectral light field capture (H-LF) system. Our H-LF system consists of a 5 x 6 array of cameras, with each camera sampling a different narrow band in the visible spectrum. There are two parts to extracting scene depth. The first part is our novel cross-spectral pairwise matching technique, which involves a new spectral-invariant feature descriptor and its companion matching metric we call bidirectional weighted normalized cross correlation (BWNCC). The second part, namely, H-LF stereo matching, uses a combination of spectral-dependent correspondence and defocus cues that rely on BWNCC. These two new cost terms are integrated into a Markov Random Field (MRF) for disparity estimation. Experiments on synthetic and real H-LF data show that our approach can produce high-quality disparity maps. We also show that these results can be used to produce the complete plenoptic cube in addition to synthesizing all-focus and defocused color images under different sensor spectral responses. |
Tasks | Disparity Estimation, Stereo Matching, Stereo Matching Hand |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.00835v1, http://arxiv.org/pdf/1709.00835v1.pdf |
PWC | https://paperswithcode.com/paper/hyperspectral-light-field-stereo-matching |
Repo | |
Framework | |
Semi-supervised Learning with Deep Generative Models for Asset Failure Prediction
Title | Semi-supervised Learning with Deep Generative Models for Asset Failure Prediction |
Authors | Andre S. Yoon, Taehoon Lee, Yongsub Lim, Deokwoo Jung, Philgyun Kang, Dongwon Kim, Keuntae Park, Yongjin Choi |
Abstract | This work presents a novel semi-supervised learning approach for data-driven modeling of asset failures when health status is only partially known in historical data. We combine a generative model parameterized by deep neural networks with a non-linear embedding technique. This allows us to build prognostic models with a limited amount of health status information for the precise prediction of future asset reliability. The proposed method is evaluated on a publicly available dataset for remaining useful life (RUL) estimation and shows significant improvement even when the fraction of the data with known health status is as sparse as 1% of the total. Our study suggests that the non-linear embedding based on a deep generative model can efficiently regularize a complex model with deep architectures while achieving high prediction accuracy that is far less sensitive to the availability of health status information. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.00845v1, http://arxiv.org/pdf/1709.00845v1.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-with-deep-generative |
Repo | |
Framework | |
PuRe: Robust pupil detection for real-time pervasive eye tracking
Title | PuRe: Robust pupil detection for real-time pervasive eye tracking |
Authors | Thiago Santini, Wolfgang Fuhl, Enkelejda Kasneci |
Abstract | Real-time, accurate, and robust pupil detection is an essential prerequisite to enable pervasive eye-tracking and its applications – e.g., gaze-based human computer interaction, health monitoring, foveated rendering, and advanced driver assistance. However, automated pupil detection has proved to be an intricate task in real-world scenarios due to a large mixture of challenges such as quickly changing illumination and occlusions. In this paper, we introduce the Pupil Reconstructor (PuRe), a method for pupil detection in pervasive scenarios based on novel edge segment selection and conditional segment combination schemes; the method also includes a confidence measure for the detected pupil. The proposed method was evaluated on over 316,000 images acquired with four distinct head-mounted eye tracking devices. Results show a pupil detection rate improvement of over 10 percentage points w.r.t. state-of-the-art algorithms in the two most challenging data sets (6.46 for all data sets), further pushing the envelope for pupil detection. Moreover, we advance the evaluation protocol of pupil detection algorithms by also considering eye images in which pupils are not present. In this aspect, PuRe improved precision and specificity w.r.t. state-of-the-art algorithms by 25.05 and 10.94 percentage points, respectively, demonstrating the meaningfulness of PuRe’s confidence measure. PuRe operates in real-time for modern eye trackers (at 120 fps). |
Tasks | Eye Tracking |
Published | 2017-12-24 |
URL | http://arxiv.org/abs/1712.08900v1, http://arxiv.org/pdf/1712.08900v1.pdf |
PWC | https://paperswithcode.com/paper/pure-robust-pupil-detection-for-real-time |
Repo | |
Framework | |
Multi-fidelity Bayesian Optimisation with Continuous Approximations
Title | Multi-fidelity Bayesian Optimisation with Continuous Approximations |
Authors | Kirthevasan Kandasamy, Gautam Dasarathy, Jeff Schneider, Barnabas Poczos |
Abstract | Bandit methods for black-box optimisation, such as Bayesian optimisation, are used in a variety of applications including hyper-parameter tuning and experiment design. Recently, multi-fidelity methods have garnered considerable attention since function evaluations have become increasingly expensive in such applications. Multi-fidelity methods use cheap approximations to the function of interest to speed up the overall optimisation process. However, most multi-fidelity methods assume only a finite number of approximations. In many practical applications, however, a continuous spectrum of approximations might be available. For instance, when tuning an expensive neural network, one might choose to approximate the cross validation performance using less data $N$ and/or fewer training iterations $T$. Here, the approximations are best viewed as arising out of a continuous two dimensional space $(N,T)$. In this work, we develop a Bayesian optimisation method, BOCA, for this setting. We characterise its theoretical properties and show that it achieves better regret than strategies which ignore the approximations. BOCA outperforms several other baselines in synthetic and real experiments. |
Tasks | Bayesian Optimisation |
Published | 2017-03-18 |
URL | http://arxiv.org/abs/1703.06240v1, http://arxiv.org/pdf/1703.06240v1.pdf |
PWC | https://paperswithcode.com/paper/multi-fidelity-bayesian-optimisation-with |
Repo | |
Framework | |