Paper Group AWR 248
Multiobjective Optimization Training of PLDA for Speaker Verification
Title | Multiobjective Optimization Training of PLDA for Speaker Verification |
Authors | Liang He, Xianhong Chen, Can Xu, Jia Liu |
Abstract | Most current state-of-the-art text-independent speaker verification systems take probabilistic linear discriminant analysis (PLDA) as their backend classifiers. The parameters of PLDA are often estimated by maximizing an objective function that focuses on increasing the value of the log-likelihood function while ignoring the distinction between speakers. In order to better distinguish speakers, we propose a multi-objective optimization training for PLDA. Experimental results show that the proposed method yields more than 10% relative performance improvement in both EER and MinDCF on the NIST SRE14 i-vector challenge dataset, and about 20% relative performance improvement in EER on the MCE18 dataset. |
Tasks | Multiobjective Optimization, Speaker Verification, Text-Independent Speaker Verification |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.08344v2 |
PDF | http://arxiv.org/pdf/1808.08344v2.pdf |
PWC | https://paperswithcode.com/paper/multiobjective-optimization-training-of-plda |
Repo | https://github.com/sanphiee/MOT-sGPLDA-MCE18 |
Framework | none |
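The abstract does not spell out the combined objective; as a hedged illustration only, the sketch below mixes a PLDA log-likelihood term with a discriminative term computed on verification trial scores. The function name, the cross-entropy choice and the weighting factor `alpha` are assumptions, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def multiobjective_plda_loss(scores, labels, log_likelihood, alpha=0.5):
    """Toy two-term objective: keep the generative PLDA criterion (maximize the
    log-likelihood) while also separating same-speaker and different-speaker
    trial scores with a discriminative cross-entropy term."""
    discriminative = F.binary_cross_entropy_with_logits(scores, labels.float())
    return -log_likelihood + alpha * discriminative

# Toy usage with random verification scores and 0/1 trial labels.
scores = torch.randn(16, requires_grad=True)
labels = torch.randint(0, 2, (16,))
loss = multiobjective_plda_loss(scores, labels, log_likelihood=torch.tensor(-42.0))
loss.backward()
print(loss.item())
```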
Deep Compressive Autoencoder for Action Potential Compression in Large-Scale Neural Recording
Title | Deep Compressive Autoencoder for Action Potential Compression in Large-Scale Neural Recording |
Authors | Tong Wu, Wenfeng Zhao, Edward Keefer, Zhi Yang |
Abstract | Understanding the coordinated activity underlying brain computations requires large-scale, simultaneous recordings from distributed neuronal structures at cellular-level resolution. One major hurdle in designing high-bandwidth, high-precision, large-scale neural interfaces lies in the formidable data streams that are generated by the recorder chip and need to be transferred online to a remote computer. The data rates can require hundreds to thousands of I/O pads on the recorder chip and power consumption on the order of Watts for data streaming alone. We developed a deep learning-based compression model to reduce the data rate of multichannel action potentials. The proposed model is built upon a deep compressive autoencoder (CAE) with discrete latent embeddings. The encoder is equipped with residual transformations to extract representative features from spikes, which are mapped into the latent embedding space and updated via vector quantization (VQ). The decoder network reconstructs spike waveforms from the quantized latent embeddings. Experimental results show that the proposed model consistently outperforms conventional methods, achieving much higher compression ratios (20-500x) with better or comparable reconstruction accuracy. Testing results also indicate that the CAE is robust against a diverse range of imperfections, such as waveform variation and spike misalignment, and has only a minor influence on spike sorting accuracy. Furthermore, we have estimated the hardware cost and real-time performance of the CAE and shown that it could support thousands of recording channels simultaneously without excessive power/heat dissipation. The proposed model can reduce the required data transmission bandwidth in large-scale recording experiments while maintaining good signal quality. The code of this work is available at https://github.com/tong-wu-umn/spike-compression-autoencoder |
Tasks | Quantization |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05522v2 |
PDF | http://arxiv.org/pdf/1809.05522v2.pdf |
PWC | https://paperswithcode.com/paper/deep-compressive-autoencoder-for-action |
Repo | https://github.com/tong-wu-umn/spike-compression-autoencoder |
Framework | pytorch |
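The core of the CAE is the vector-quantized latent space. Below is a minimal sketch of a VQ lookup with a straight-through gradient, assuming a flat `(batch, dim)` latent; it illustrates the general VQ mechanism rather than the authors' exact layer.

```python
import torch

def vector_quantize(z, codebook):
    """Replace each latent vector with its nearest codebook embedding.
    z: (batch, dim) encoder outputs; codebook: (K, dim) embedding table."""
    distances = (z.unsqueeze(1) - codebook.unsqueeze(0)).pow(2).sum(-1)  # (batch, K)
    indices = distances.argmin(dim=1)
    z_q = codebook[indices]
    # Straight-through estimator: the forward pass uses the quantized vectors,
    # gradients flow back to the encoder as if quantization were the identity.
    return z + (z_q - z).detach(), indices

codebook = torch.randn(64, 16)                 # 64 embeddings of dimension 16
z = torch.randn(8, 16, requires_grad=True)     # a batch of spike latents
z_q, indices = vector_quantize(z, codebook)
print(z_q.shape, indices.shape)
```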
PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach
Title | PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach |
Authors | Luca Pappalardo, Paolo Cintia, Paolo Ferragina, Emanuele Massucco, Dino Pedreschi, Fosca Giannotti |
Abstract | The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this paper, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework on a massive dataset of soccer-logs consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of player evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. Finally, we explore some applications of PlayeRank, i.e., player search and player versatility, showing the flexibility and efficiency that make it suitable for the design of a scalable platform for soccer analytics. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04987v3 |
PDF | http://arxiv.org/pdf/1802.04987v3.pdf |
PWC | https://paperswithcode.com/paper/playerank-data-driven-performance-evaluation |
Repo | https://github.com/madsemildahlgaard/football-project |
Framework | none |
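One plausible, highly simplified reading of the pipeline: learn how strongly each event type relates to winning, then rate a player's performance as the weighted sum of their events. The synthetic data, the logistic-regression choice and the two-step split are assumptions for illustration, not the published algorithm.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic team-level event counts (e.g. passes, shots, tackles, ...) and match outcomes.
X_team = rng.normal(size=(200, 5))
y_won = (X_team @ np.array([0.8, 0.5, 0.3, -0.2, 0.1]) + rng.normal(size=200) > 0).astype(int)

# Step 1: learn how much each event type contributes to winning a match.
clf = LogisticRegression().fit(X_team, y_won)
event_weights = clf.coef_.ravel()

# Step 2: rate individual performances as the weighted sum of a player's events.
player_events = rng.normal(size=(30, 5))       # 30 performances by one player
ratings = player_events @ event_weights
print("mean rating:", ratings.mean())
```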
Compact Factorization of Matrices Using Generalized Round-Rank
Title | Compact Factorization of Matrices Using Generalized Round-Rank |
Authors | Pouya Pezeshkpour, Carlos Guestrin, Sameer Singh |
Abstract | Matrix factorization is a well-studied task in machine learning for compactly representing large, noisy data. In our approach, instead of using the traditional concept of matrix rank, we define a new notion of link-rank based on a non-linear link function used within factorization. In particular, by applying the round function to a factorization to obtain ordinal-valued matrices, we introduce generalized round-rank (GRR). We show not only that there are many full-rank matrices with low GRR, but further that these matrices cannot be approximated well by low-rank linear factorization. We provide uniqueness conditions for this formulation and present gradient-descent-based algorithms. Finally, we present experiments on real-world datasets to demonstrate that GRR-based factorization is significantly more accurate than linear factorization, while converging faster and using lower-rank representations. |
Tasks | |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00184v1 |
PDF | http://arxiv.org/pdf/1805.00184v1.pdf |
PWC | https://paperswithcode.com/paper/compact-factorization-of-matrices-using |
Repo | https://github.com/pouyapez/GRR-Matrix-Factorization |
Framework | none |
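As a toy illustration of factorization under a rounding link, the sketch below fits a binary matrix as round(UVᵀ) using a sigmoid as a smooth surrogate during optimization. The surrogate, the loss and the rank are illustrative choices, not the gradient-descent algorithms proposed in the paper.

```python
import torch

torch.manual_seed(0)
M = torch.randint(0, 2, (20, 15)).float()        # binary target matrix
rank = 2
U = torch.randn(20, rank, requires_grad=True)
V = torch.randn(15, rank, requires_grad=True)
optimizer = torch.optim.Adam([U, V], lr=0.1)

for _ in range(500):
    optimizer.zero_grad()
    pred = torch.sigmoid(U @ V.T)                # smooth surrogate of the round link
    loss = torch.nn.functional.binary_cross_entropy(pred, M)
    loss.backward()
    optimizer.step()

reconstruction = (U @ V.T > 0).float()           # apply the hard (rounding) link at the end
print("reconstruction accuracy:", (reconstruction == M).float().mean().item())
```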
ECG Segmentation by Neural Networks: Errors and Correction
Title | ECG Segmentation by Neural Networks: Errors and Correction |
Authors | Iana Sereda, Sergey Alekseev, Aleksandra Koneva, Roman Kataev, Grigory Osipov |
Abstract | In this study we examine how error correction occurs in an ensemble of deep convolutional networks trained for an important applied problem: segmentation of electrocardiograms (ECG). We also explore the possibility of using information about ensemble errors to evaluate the quality of the data representation built by the network. This possibility arises from the outlier-distillation effect, which is demonstrated for the ensemble described in this paper. |
Tasks | Electrocardiography (ECG) |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10386v1 |
PDF | http://arxiv.org/pdf/1812.10386v1.pdf |
PWC | https://paperswithcode.com/paper/ecg-segmentation-by-neural-networks-errors |
Repo | https://github.com/Namenaro/ecg_segmentation |
Framework | none |
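A minimal sketch of per-sample ensemble aggregation for segmentation labels (plain majority vote). The paper studies how such an ensemble corrects errors; the voting rule here is a generic stand-in, not the authors' exact correction scheme.

```python
import numpy as np

def ensemble_vote(predictions):
    """Majority vote over an ensemble of segmentations.
    predictions: (n_models, n_samples) integer labels, e.g. background/P/QRS/T."""
    n_classes = predictions.max() + 1
    votes = np.stack([(predictions == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)

preds = np.array([[0, 1, 1, 2, 3],
                  [0, 1, 2, 2, 3],
                  [0, 1, 1, 2, 0]])
print(ensemble_vote(preds))   # -> [0 1 1 2 3]
```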
Context-adaptive neural network based prediction for image compression
Title | Context-adaptive neural network based prediction for image compression |
Authors | Thierry Dumas, Aline Roumy, Christine Guillemot |
Abstract | This paper describes a set of neural network architectures, called the Prediction Neural Networks Set (PNNS), based on both fully-connected and convolutional neural networks, for intra image prediction. The choice of neural network for predicting a given image block depends on the block size, hence does not need to be signalled to the decoder. It is shown that, while fully-connected neural networks give good performance for small block sizes, convolutional neural networks provide better predictions in large blocks with complex textures. Thanks to the use of masks of random sizes during training, the neural networks of PNNS adapt well to the available context, which may vary depending on the position of the image block to be predicted. When integrating PNNS into an H.265 codec, PSNR-rate performance gains ranging from 1.46% to 5.20% are obtained. These gains are on average 0.99% larger than those of prior neural network based methods. Unlike the H.265 intra prediction modes, each of which is specialized in predicting a specific texture, the proposed PNNS can model a large set of complex textures. |
Tasks | Image Compression |
Published | 2018-07-17 |
URL | https://arxiv.org/abs/1807.06244v2 |
PDF | https://arxiv.org/pdf/1807.06244v2.pdf |
PWC | https://paperswithcode.com/paper/context-adaptive-neural-network-based |
Repo | https://github.com/thierrydumas/context_adaptive_neural_network_based_prediction |
Framework | tf |
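The key idea that the predictor is chosen purely from the block size can be sketched as a small dispatcher: fully-connected networks for small blocks, convolutional ones for large blocks. The layer sizes are arbitrary, and the real PNNS predicts a block from a masked causal context rather than from the block itself.

```python
import torch
import torch.nn as nn

class FCPredictor(nn.Module):
    """Fully-connected predictor for small blocks (illustrative sizes)."""
    def __init__(self, size):
        super().__init__()
        self.size = size
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(size * size, 256),
                                 nn.ReLU(), nn.Linear(256, size * size))
    def forward(self, x):
        return self.net(x).view(-1, 1, self.size, self.size)

class ConvPredictor(nn.Module):
    """Convolutional predictor for larger blocks with complex textures."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1))
    def forward(self, x):
        return self.net(x)

# The block size alone selects the network, so the choice needs no signalling.
predictors = {8: FCPredictor(8), 16: ConvPredictor(), 32: ConvPredictor()}

def predict_block(block):
    return predictors[block.shape[-1]](block)

print(predict_block(torch.randn(1, 1, 16, 16)).shape)
```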
Tune: A Research Platform for Distributed Model Selection and Training
Title | Tune: A Research Platform for Distributed Model Selection and Training |
Authors | Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, Ion Stoica |
Abstract | Modern machine learning algorithms are increasingly computationally demanding, requiring specialized hardware and distributed computation to achieve high performance in a reasonable time frame. Many hyperparameter search algorithms have been proposed for improving the efficiency of model selection; however, their adaptation to the distributed compute environment is often ad hoc. We propose Tune, a unified framework for model selection and training that provides a narrow-waist interface between training scripts and search algorithms. We show that this interface meets the requirements for a broad range of hyperparameter search algorithms, allows straightforward scaling of search to large clusters, and simplifies algorithm implementation. We demonstrate the implementation of several state-of-the-art hyperparameter search algorithms in Tune. Tune is available at http://ray.readthedocs.io/en/latest/tune.html. |
Tasks | Hyperparameter Optimization, Model Selection |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.05118v1 |
PDF | http://arxiv.org/pdf/1807.05118v1.pdf |
PWC | https://paperswithcode.com/paper/tune-a-research-platform-for-distributed |
Repo | https://github.com/ray-project/ray |
Framework | tf |
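A minimal Ray Tune usage sketch in the spirit of the narrow-waist interface described above: the training script only reports metrics, while the search algorithm and scaling live outside it. Exact function names and signatures have changed across Ray versions, so treat this as an approximation and check the linked documentation.

```python
from ray import tune

def train_fn(config):
    # Stand-in training loop: report a fake metric that depends on the
    # hyperparameter chosen by the search algorithm.
    for step in range(10):
        tune.report(mean_loss=1.0 / (1.0 + config["lr"] * step))

analysis = tune.run(
    train_fn,
    config={"lr": tune.grid_search([0.01, 0.1, 1.0])},
)
print(analysis.get_best_config(metric="mean_loss", mode="min"))
```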
Defense-VAE: A Fast and Accurate Defense against Adversarial Attacks
Title | Defense-VAE: A Fast and Accurate Defense against Adversarial Attacks |
Authors | Xiang Li, Shihao Ji |
Abstract | Deep neural networks (DNNs) have been enormously successful across a variety of prediction tasks. However, recent research shows that DNNs are particularly vulnerable to adversarial attacks, which poses a serious threat to their applications in security-sensitive systems. In this paper, we propose a simple yet effective defense algorithm, Defense-VAE, that uses a variational autoencoder (VAE) to purge adversarial perturbations from contaminated images. The proposed method is generic and can defend against white-box and black-box attacks without retraining the original CNN classifiers, and can further strengthen the defense by retraining the CNN or fine-tuning the whole pipeline end to end. In addition, the proposed method is very efficient compared to optimization-based alternatives, such as Defense-GAN, since no iterative optimization is needed for online prediction. Extensive experiments on MNIST, Fashion-MNIST, CelebA and CIFAR-10 demonstrate the superior defense accuracy of Defense-VAE compared to Defense-GAN, while being 50x faster than the latter. This makes Defense-VAE widely deployable in real-time security-sensitive systems. Our source code can be found at https://github.com/lxuniverse/defense-vae. |
Tasks | |
Published | 2018-12-17 |
URL | https://arxiv.org/abs/1812.06570v3 |
PDF | https://arxiv.org/pdf/1812.06570v3.pdf |
PWC | https://paperswithcode.com/paper/defense-vae-a-fast-and-accurate-defense |
Repo | https://github.com/lxuniverse/defense-vae |
Framework | tf |
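The inference-time idea is simple: reconstruct the (possibly adversarial) input through the VAE and classify the reconstruction instead of the raw image. The sketch below uses stand-in fully-connected modules; the real Defense-VAE and classifier are convolutional.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Stand-in autoencoder used only to illustrate the purification step."""
    def __init__(self, dim=784, latent=32):
        super().__init__()
        self.enc = nn.Linear(dim, latent)
        self.dec = nn.Linear(latent, dim)
    def forward(self, x):
        return torch.sigmoid(self.dec(torch.relu(self.enc(x))))

def purify_and_classify(x_adv, vae, classifier):
    # No iterative optimization at test time: one forward pass through the VAE
    # purges perturbations, then the unchanged classifier sees the cleaned image.
    with torch.no_grad():
        return classifier(vae(x_adv))

vae, classifier = TinyVAE(), nn.Linear(784, 10)
print(purify_and_classify(torch.rand(4, 784), vae, classifier).shape)
```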
Pooling Pyramid Network for Object Detection
Title | Pooling Pyramid Network for Object Detection |
Authors | Pengchong Jin, Vivek Rathod, Xiangxin Zhu |
Abstract | We'd like to share a simple tweak of the Single Shot Multibox Detector (SSD) family of detectors that is effective in reducing model size while maintaining the same quality. We share box predictors across all scales, and replace the convolution between scales with max pooling. This has two advantages over vanilla SSD: (1) it avoids score miscalibration across scales; (2) the shared predictor sees the training data over all scales. Since we reduce the number of predictors to one and trim all convolutions between them, the model size is significantly smaller. We empirically show that these changes do not hurt model quality compared to vanilla SSD. |
Tasks | Object Detection |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03284v1 |
PDF | http://arxiv.org/pdf/1807.03284v1.pdf |
PWC | https://paperswithcode.com/paper/pooling-pyramid-network-for-object-detection |
Repo | https://github.com/tensorflow/models/tree/master/research/object_detection |
Framework | tf |
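A rough sketch of the two changes described above: one shared predictor applied at every scale, and max pooling (instead of extra convolutions) to build the next pyramid level. The channel count, anchor count and class count are placeholder values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One predictor shared across all pyramid levels (placeholder: 6 anchors,
# 4 box coordinates + 21 class scores per anchor).
shared_predictor = nn.Conv2d(256, 6 * (4 + 21), kernel_size=3, padding=1)

def ppn_head(base_feature, num_levels=5):
    outputs, feat = [], base_feature
    for _ in range(num_levels):
        outputs.append(shared_predictor(feat))     # same weights at every scale
        feat = F.max_pool2d(feat, kernel_size=2)   # next pyramid level, no extra convs
    return outputs

levels = ppn_head(torch.randn(1, 256, 32, 32))
print([o.shape[-1] for o in levels])               # spatial sizes 32, 16, 8, 4, 2
```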
Temporally Identity-Aware SSD with Attentional LSTM
Title | Temporally Identity-Aware SSD with Attentional LSTM |
Authors | Xingyu Chen, Junzhi Yu, Zhengxing Wu |
Abstract | Temporal object detection has attracted significant attention, but most popular detection methods cannot leverage the rich temporal information in videos. Very recently, many algorithms have been developed for the video detection task, yet very few approaches can achieve real-time online object detection in videos. In this paper, based on an attention mechanism and convolutional long short-term memory (ConvLSTM), we propose a temporal single-shot detector (TSSD) for real-world detection. Distinct from previous methods, we aim at temporally integrating the pyramidal feature hierarchy using ConvLSTM, and design a novel structure including a low-level temporal unit as well as a high-level one (LH-TU) for multi-scale feature maps. Moreover, we develop a creative temporal analysis unit, namely, attentional ConvLSTM (AC-LSTM), in which a temporal attention mechanism is specially tailored for background suppression and scale suppression while a ConvLSTM integrates attention-aware features across time. An association loss and a multi-step training scheme are designed for temporal coherence. Besides, online tubelet analysis (OTA) is exploited for identification. Our framework is evaluated on the ImageNet VID dataset and the 2DMOT15 dataset. Extensive comparisons of detection and tracking capabilities validate the superiority of the proposed approach. Consequently, the developed TSSD-OTA achieves fast speed and overall competitive performance in detection and tracking. Finally, a real-world maneuver is conducted for underwater object grasping. The source code is publicly available at https://github.com/SeanChenxy/TSSD-OTA. |
Tasks | Object Detection |
Published | 2018-03-01 |
URL | https://arxiv.org/abs/1803.00197v4 |
PDF | https://arxiv.org/pdf/1803.00197v4.pdf |
PWC | https://paperswithcode.com/paper/temporally-identity-aware-ssd-with |
Repo | https://github.com/SeanChenxy/TSSD-OTA |
Framework | pytorch |
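A minimal sketch of the attention part of AC-LSTM: a small convolutional branch produces a spatial attention map that suppresses background before features are handed to the recurrent unit. The ConvLSTM itself and the multi-scale wiring are omitted, and the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.att = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(),
            nn.Conv2d(channels // 4, 1, 1), nn.Sigmoid())
    def forward(self, feat):
        mask = self.att(feat)          # (B, 1, H, W) attention map in [0, 1]
        return feat * mask, mask       # attention-aware features for the ConvLSTM

gate = AttentionGate(256)
attended, mask = gate(torch.randn(1, 256, 38, 38))
print(attended.shape, mask.shape)
```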
Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information
Title | Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information |
Authors | Henry Charlesworth |
Abstract | We introduce a new virtual environment for simulating a card game known as “Big 2”. This is a four-player game of imperfect information with a relatively complicated action space (players may play combinations of 1, 2, 3, 4 or 5 cards from an initial starting hand of 13 cards). As such, it poses a challenge for many current reinforcement learning methods. We then use the recently proposed “Proximal Policy Optimization” algorithm to train a deep neural network to play the game, learning purely via self-play, and find that it is able to reach a level that outperforms amateur human players after only a relatively short amount of training time, without needing to search a tree of future game states. |
Tasks | Card Games |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10442v1 |
PDF | http://arxiv.org/pdf/1808.10442v1.pdf |
PWC | https://paperswithcode.com/paper/application-of-self-play-reinforcement |
Repo | https://github.com/henrycharlesworth/big2_PPOalgorithm |
Framework | tf |
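For reference, the clipped surrogate objective of Proximal Policy Optimization that the agent is trained with, in a minimal single-batch form (entropy and value-function terms omitted).

```python
import torch

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped PPO surrogate, negated so it can be minimized with an optimizer."""
    ratio = torch.exp(new_logp - old_logp)                       # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

loss = ppo_clip_loss(torch.randn(8), torch.randn(8), torch.randn(8))
print(loss.item())
```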
Exploring Weight Symmetry in Deep Neural Networks
Title | Exploring Weight Symmetry in Deep Neural Networks |
Authors | Xu Shell Hu, Sergey Zagoruyko, Nikos Komodakis |
Abstract | We propose to impose symmetry on neural network parameters to improve parameter usage and make use of dedicated convolution and matrix multiplication routines. Due to the significant reduction in the number of parameters as a result of the symmetry constraints, one would expect a dramatic drop in accuracy. Surprisingly, we show that this is not the case, and, depending on network size, symmetry can have little or no negative effect on network accuracy, especially in deep overparameterized networks. We propose several ways to impose local symmetry in recurrent and convolutional neural networks, and show that our symmetry parameterizations satisfy the universal approximation property for single hidden layer networks. We extensively evaluate these parameterizations on CIFAR, ImageNet and language modeling datasets, showing significant benefits from the use of symmetry. For instance, our ResNet-101 with channel-wise symmetry has almost 25% fewer parameters and only a 0.2% accuracy loss on ImageNet. Code for our experiments is available at https://github.com/hushell/deep-symmetry |
Tasks | Language Modelling |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.11027v2 |
PDF | http://arxiv.org/pdf/1812.11027v2.pdf |
PWC | https://paperswithcode.com/paper/exploring-weight-symmetry-in-deep-neural |
Repo | https://github.com/hushell/deep-symmetry |
Framework | pytorch |
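A toy parameterization in the spirit of the paper: constrain a square weight matrix to be symmetric so only its upper triangle carries free parameters. This is one simple way to impose symmetry, not necessarily the channel-wise scheme used for ResNet-101.

```python
import torch
import torch.nn as nn

class SymmetricLinear(nn.Module):
    """Linear layer whose square weight matrix is forced to be symmetric:
    only the upper triangle (incl. diagonal) is used, the rest is mirrored."""
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim, dim) * 0.01)
        self.bias = nn.Parameter(torch.zeros(dim))
    def forward(self, x):
        upper = torch.triu(self.weight)
        w = upper + torch.triu(self.weight, diagonal=1).T   # symmetric matrix
        return x @ w + self.bias

layer = SymmetricLinear(64)
print(layer(torch.randn(2, 64)).shape)
```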
Deep Multiple Description Coding by Learning Scalar Quantization
Title | Deep Multiple Description Coding by Learning Scalar Quantization |
Authors | Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao |
Abstract | In this paper, we propose a deep multiple description coding framework whose quantizers are adaptively learned via the minimization of a multiple description compressive loss. Firstly, our framework is built upon auto-encoder networks, comprising a multiple-description multi-scale dilated encoder network and multiple-description decoder networks. Secondly, two entropy estimation networks are learned to estimate the amount of information in the quantized tensors, which further supervises the learning of the multiple-description encoder network so that it represents the input image delicately. Thirdly, a pair of scalar quantizers accompanied by two importance-indicator maps is automatically learned in an end-to-end self-supervised way. Finally, a multiple description structural dissimilarity distance loss is imposed on the multiple description decoded images in the pixel domain, rather than on feature tensors in the feature domain, for diversified multiple description generation, in addition to the multiple description reconstruction loss. Testing on two commonly used datasets verifies that our method outperforms several state-of-the-art multiple description coding approaches in terms of coding efficiency. |
Tasks | Quantization |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01504v3 |
PDF | http://arxiv.org/pdf/1811.01504v3.pdf |
PWC | https://paperswithcode.com/paper/deep-multiple-description-coding-by-learning |
Repo | https://github.com/mdcnn/mdcnn.github.io |
Framework | none |
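The learnable scalar quantizer can be sketched with straight-through rounding plus an importance-indicator map that rescales channels before quantization. Tensor shapes and the exact rescaling are illustrative assumptions, not the paper's precise formulation.

```python
import torch

def learned_scalar_quantize(tensor, step, importance):
    """Scalar quantization with a learnable step size and an importance map,
    trainable end to end via the straight-through estimator."""
    scaled = tensor * importance / step
    rounded = torch.round(scaled)
    # The forward pass uses the rounded values; gradients bypass the rounding.
    quantized = scaled + (rounded - scaled).detach()
    return quantized * step / importance

x = torch.randn(1, 8, 4, 4, requires_grad=True)      # feature tensor from the encoder
step = torch.tensor(0.5, requires_grad=True)          # learnable quantization step
importance = torch.rand(1, 8, 1, 1) + 0.5             # per-channel importance map
print(learned_scalar_quantize(x, step, importance).shape)
```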
Game of Sketches: Deep Recurrent Models of Pictionary-style Word Guessing
Title | Game of Sketches: Deep Recurrent Models of Pictionary-style Word Guessing |
Authors | Ravi Kiran Sarvadevabhatla, Shiv Surya, Trisha Mittal, Venkatesh Babu Radhakrishnan |
Abstract | The ability of intelligent agents to play games in human-like fashion is popularly considered a benchmark of progress in Artificial Intelligence. Similarly, performance on multi-disciplinary tasks such as Visual Question Answering (VQA) is considered a marker for gauging progress in Computer Vision. In our work, we bring games and VQA together. Specifically, we introduce the first computational model aimed at Pictionary, the popular word-guessing social game. We first introduce Sketch-QA, an elementary version of the Visual Question Answering task. Styled after Pictionary, Sketch-QA uses incrementally accumulated sketch stroke sequences as visual data. Notably, Sketch-QA involves asking a fixed question ("What object is being drawn?") and gathering open-ended guess-words from human guessers. We analyze the resulting dataset and present many interesting findings therein. To mimic Pictionary-style guessing, we subsequently propose a deep neural model which generates guess-words in response to temporally evolving human-drawn sketches. Our model even makes human-like mistakes while guessing, thus amplifying the human mimicry factor. We evaluate our model on the large-scale guess-word dataset generated via the Sketch-QA task and compare with various baselines. We also conduct a Visual Turing Test to obtain human impressions of the guess-words generated by humans and our model. Experimental results demonstrate the promise of our approach for Pictionary and similarly themed games. |
Tasks | Question Answering, Visual Question Answering |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09356v1 |
PDF | http://arxiv.org/pdf/1801.09356v1.pdf |
PWC | https://paperswithcode.com/paper/game-of-sketches-deep-recurrent-models-of |
Repo | https://github.com/val-iisc/sketchguess |
Framework | none |
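A bare-bones sketch of the guessing model: a recurrent network consumes incrementally accumulated stroke features and emits a guess-word distribution after every stroke. Feature extraction from raw sketches, attention and the actual vocabulary are all omitted; the inputs below are random stand-ins.

```python
import torch
import torch.nn as nn

class SketchGuesser(nn.Module):
    def __init__(self, stroke_dim=64, hidden=128, vocab=1000):
        super().__init__()
        self.rnn = nn.GRU(stroke_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)
    def forward(self, strokes):            # (batch, n_strokes, stroke_dim)
        h, _ = self.rnn(strokes)
        return self.out(h)                 # guess-word logits after every stroke

model = SketchGuesser()
logits = model(torch.randn(2, 10, 64))     # 10 strokes accumulated so far
print(logits.shape)                        # (2, 10, 1000)
```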
Automatic, fast and robust characterization of noise distributions for diffusion MRI
Title | Automatic, fast and robust characterization of noise distributions for diffusion MRI |
Authors | Samuel St-Jean, Alberto De Luca, Max A. Viergever, Alexander Leemans |
Abstract | Knowledge of the noise distribution in magnitude diffusion MRI images is the centerpiece for quantifying uncertainties arising from the acquisition process. The use of parallel imaging methods, the number of receiver coils and the imaging filters applied by the scanner, amongst other factors, dictate the resulting signal distribution. Accurate estimation beyond textbook Rician or noncentral chi distributions often requires information about the acquisition process (e.g. coil sensitivity maps or reconstruction coefficients), which is not usually available. We introduce a new method where a change of variable naturally gives rise to a particular form of the gamma distribution for background signals. The first moments and maximum likelihood estimators of this gamma distribution explicitly depend on the number of coils, making it possible to estimate all unknown parameters using only the magnitude data. A rejection step is used to make the method automatic and robust to artifacts. Experiments on synthetic datasets show that the proposed method can reliably estimate both the degrees of freedom and the standard deviation. The worst-case errors range from below 2% (spatially uniform noise) to approximately 10% (spatially variable noise). Repeated acquisitions of in vivo datasets show that the estimated parameters are stable and have lower variances than those of competing methods. |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.12071v2 |
PDF | http://arxiv.org/pdf/1805.12071v2.pdf |
PWC | https://paperswithcode.com/paper/automatic-fast-and-robust-characterization-of |
Repo | https://github.com/samuelstjean/nlsam |
Framework | none |
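To make the gamma idea concrete: if background magnitudes come from a noncentral chi distribution with zero signal, then m²/(2σ²) follows Gamma(N, 1), so the first two moments of m² determine both the effective number of coils N and σ. The sketch below is a bare method-of-moments estimator on synthetic data; the published method additionally uses the change of variable, maximum likelihood estimation and a rejection step.

```python
import numpy as np

def estimate_background_noise(magnitudes):
    """Method-of-moments estimate of (N, sigma) from background magnitude data,
    assuming m^2 / (2 * sigma^2) ~ Gamma(N, 1)."""
    a = np.mean(magnitudes ** 2)          # E[m^2] = 2 * sigma^2 * N
    b = np.mean(magnitudes ** 4)          # E[m^4] = 4 * sigma^4 * N * (N + 1)
    N = a ** 2 / (b - a ** 2)
    sigma = np.sqrt((b - a ** 2) / (2 * a))
    return N, sigma

rng = np.random.default_rng(0)
true_sigma, true_N = 5.0, 4
gaussians = rng.normal(scale=true_sigma, size=(100000, 2 * true_N))
m = np.sqrt((gaussians ** 2).sum(axis=1))  # synthetic background magnitudes
print(estimate_background_noise(m))        # close to (4, 5.0)
```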