October 18, 2019

3386 words 16 mins read

Paper Group ANR 433

Prediction of ESG Compliance using a Heterogeneous Information Network

Title Prediction of ESG Compliance using a Heterogeneous Information Network
Authors Ryohei Hisano, Didier Sornette, Takayuki Mizuno
Abstract Negative screening is one method to avoid interactions with inappropriate entities. For example, financial institutions keep investment exclusion lists of inappropriate firms that have environmental, social, and governance (ESG) problems. They create their investment exclusion lists by gathering information from various news sources to keep their portfolios profitable as well as green. International organizations also maintain smart sanctions lists that are used to prohibit trade with entities that are involved in illegal activities. In the present paper, we focus on the prediction of investment exclusion lists in the finance domain. We construct a vast heterogeneous information network that covers the necessary information surrounding each firm, assembled using seven professionally curated datasets and two open datasets, resulting in approximately 50 million nodes and 400 million edges in total. Exploiting these vast datasets and motivated by how professional investigators and journalists undertake their daily investigations, we propose a model that can learn to predict firms that are more likely to be added to an investment exclusion list in the near future. Our approach is tested using the negative news investment exclusion list data of more than 35,000 firms worldwide from January 2012 to May 2018. Compared with state-of-the-art methods with and without the network, we show that predictive accuracy is substantially improved when using the vast information stored in the heterogeneous information network. This work suggests new ways to consolidate the diffuse information contained in big data to monitor dominant firms on a global scale for better risk management and more socially responsible investment.
Tasks
Published 2018-11-09
URL https://arxiv.org/abs/1811.12166v3
PDF https://arxiv.org/pdf/1811.12166v3.pdf
PWC https://paperswithcode.com/paper/social-blacklist-prediction-using-a
Repo
Framework
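
The graph representation behind this approach can be illustrated with a toy sketch: a minimal heterogeneous network of typed edge triples, from which simple per-firm neighbourhood features are derived, such as how many of a firm's one-hop neighbours already appear on an exclusion list. All firm names and edge types below are invented for illustration; this is not the authors' model.

```python
# Toy sketch (not the paper's model): a heterogeneous information network
# as typed (source, edge_type, target) triples, plus a simple per-firm
# feature: number of 1-hop neighbours already on the exclusion list.
from collections import defaultdict

# Miniature stand-in for the ~400M-edge network described in the paper.
edges = [
    ("FirmA", "supplies", "FirmB"),
    ("FirmC", "supplies", "FirmB"),
    ("FirmA", "owned_by", "HoldingX"),
    ("FirmC", "owned_by", "HoldingX"),
]
excluded = {"FirmA"}  # firms already on the exclusion list

neighbours = defaultdict(set)
for src, _etype, dst in edges:
    neighbours[src].add(dst)
    neighbours[dst].add(src)

def excluded_neighbour_count(firm):
    """Count 1-hop neighbours that are already excluded."""
    return sum(1 for n in neighbours[firm] if n in excluded)

features = {f: excluded_neighbour_count(f) for f in ["FirmB", "FirmC"]}
```

In the paper, features of this flavour are learned over roughly 50 million nodes rather than hand-counted, but the typed-graph representation is the same idea.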

Importance mixing: Improving sample reuse in evolutionary policy search methods

Title Importance mixing: Improving sample reuse in evolutionary policy search methods
Authors Aloïs Pourchot, Nicolas Perrin, Olivier Sigaud
Abstract Deep neuroevolution, that is, evolutionary policy search methods based on deep neural networks, has recently emerged as a competitor to deep reinforcement learning algorithms due to its better parallelization capabilities. However, these methods still suffer from far worse sample efficiency. In this paper we investigate whether a mechanism known as “importance mixing” can significantly improve their sample efficiency. We provide a didactic presentation of importance mixing and explain how it can be extended to reuse more samples. Then, through an empirical comparison on a simple benchmark, we show that although importance mixing does improve sample efficiency and stability, the resulting methods remain far from the sample efficiency of deep reinforcement learning.
Tasks
Published 2018-08-17
URL http://arxiv.org/abs/1808.05832v1
PDF http://arxiv.org/pdf/1808.05832v1.pdf
PWC https://paperswithcode.com/paper/importance-mixing-improving-sample-reuse-in
Repo
Framework
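
For readers unfamiliar with the mechanism, here is a minimal 1-D sketch of importance mixing as we understand it (not the authors' code): samples from the previous Gaussian search distribution are kept with probability min(1, p_new(x)/p_old(x)), and the population is then topped up with fresh draws from the new distribution, each accepted with probability max(0, 1 - p_old(x)/p_new(x)).

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def importance_mixing(old_samples, mu_old, mu_new, sigma, pop_size):
    # Phase 1: probabilistically reuse samples from the old distribution.
    ratios = gauss_pdf(old_samples, mu_new, sigma) / gauss_pdf(old_samples, mu_old, sigma)
    keep = rng.random(len(old_samples)) < np.minimum(1.0, ratios)
    reused = old_samples[keep]
    # Phase 2: fill the remainder with rejection-filtered fresh samples.
    fresh = []
    while len(reused) + len(fresh) < pop_size:
        x = rng.normal(mu_new, sigma)
        p_accept = max(0.0, 1.0 - gauss_pdf(x, mu_old, sigma) / gauss_pdf(x, mu_new, sigma))
        if rng.random() < p_accept:
            fresh.append(x)
    return np.concatenate([reused, np.array(fresh)]), len(reused)

old = rng.normal(0.0, 1.0, size=256)
population, n_reused = importance_mixing(old, mu_old=0.0, mu_new=0.2, sigma=1.0, pop_size=256)
```

When the search distribution moves only slightly between generations, most of the old samples survive Phase 1, which is exactly where the sample-efficiency gain comes from.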

Neyman-Pearson classification: parametrics and sample size requirement

Title Neyman-Pearson classification: parametrics and sample size requirement
Authors Xin Tong, Lucy Xia, Jiacheng Wang, Yang Feng
Abstract The Neyman-Pearson (NP) paradigm in binary classification seeks classifiers that achieve a minimal type II error while enforcing the prioritized type I error controlled under some user-specified level $\alpha$. This paradigm serves naturally in applications such as severe disease diagnosis and spam detection, where people have clear priorities among the two error types. Recently, Tong, Feng and Li (2018) proposed a nonparametric umbrella algorithm that adapts all scoring-type classification methods (e.g., logistic regression, support vector machines, random forest) to respect the given type I error upper bound $\alpha$ with high probability, without specific distributional assumptions on the features and the responses. Universal as the umbrella algorithm is, it demands an explicit minimum sample size requirement on class $0$, which is often the scarcer class, as in rare disease diagnosis applications. In this work, we employ the parametric linear discriminant analysis (LDA) model and propose a new parametric thresholding algorithm, which does not need the minimum sample size requirement on class $0$ observations and thus is suitable for small-sample applications such as rare disease diagnosis. Leveraging both the existing nonparametric and the newly proposed parametric thresholding rules, we propose four LDA-based NP classifiers, for both low- and high-dimensional settings. On the theoretical front, we prove NP oracle inequalities for one proposed classifier, where the rate for excess type II error benefits from the explicit parametric model assumption. Furthermore, as NP classifiers involve a sample splitting step of class $0$ observations, we construct a new adaptive sample splitting scheme that can be applied universally to NP classifiers, and this adaptive strategy reduces the type II error of these classifiers.
Tasks
Published 2018-02-07
URL https://arxiv.org/abs/1802.02557v5
PDF https://arxiv.org/pdf/1802.02557v5.pdf
PWC https://paperswithcode.com/paper/neyman-pearson-classification-parametrics-and
Repo
Framework
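
The umbrella algorithm's thresholding step, which drives the class-0 sample size requirement, can be sketched as follows (our reconstruction from the abstract, not the authors' code): take the k-th order statistic of held-out class-0 scores as the threshold, where k is the smallest rank whose type I error violation probability, a binomial tail, is at most delta.

```python
from math import comb

def np_threshold(class0_scores, alpha=0.05, delta=0.05):
    """Pick a score threshold controlling type I error at alpha with prob >= 1 - delta."""
    scores = sorted(class0_scores)
    n = len(scores)
    for k in range(1, n + 1):
        # P(type I error > alpha) when thresholding at the k-th order statistic.
        violation = sum(
            comb(n, j) * (1 - alpha) ** j * alpha ** (n - j) for j in range(k, n + 1)
        )
        if violation <= delta:
            return scores[k - 1], k
    raise ValueError("n too small: no order statistic meets the (alpha, delta) target")

threshold, k = np_threshold(list(range(200)), alpha=0.05, delta=0.05)
```

The ValueError branch is exactly the minimum sample size issue the paper addresses: when the class-0 sample is too small, even the largest order statistic cannot guarantee the (alpha, delta) control, which is what motivates the parametric alternative.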

ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems

Title ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems
Authors Eunwoo Song, Kyungguen Byun, Hong-Goo Kang
Abstract This paper proposes a WaveNet-based neural excitation model (ExcitNet) for statistical parametric speech synthesis systems. Conventional WaveNet-based neural vocoding systems significantly improve the perceptual quality of synthesized speech by statistically generating a time sequence of speech waveforms through an auto-regressive framework. However, they often suffer from noisy outputs because of the difficulties in capturing the complicated time-varying nature of speech signals. To improve modeling efficiency, the proposed ExcitNet vocoder employs an adaptive inverse filter to decouple spectral components from the speech signal. The residual component, i.e. excitation signal, is then trained and generated within the WaveNet framework. In this way, the quality of the synthesized speech signal can be further improved since the spectral component is well represented by a deep learning framework and, moreover, the residual component is efficiently generated by the WaveNet framework. Experimental results show that the proposed ExcitNet vocoder, trained both speaker-dependently and speaker-independently, outperforms traditional linear prediction vocoders and similarly configured conventional WaveNet vocoders.
Tasks Speech Synthesis
Published 2018-11-09
URL https://arxiv.org/abs/1811.04769v3
PDF https://arxiv.org/pdf/1811.04769v3.pdf
PWC https://paperswithcode.com/paper/excitnet-vocoder-a-neural-excitation-model
Repo
Framework
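
The "adaptive inverse filter to decouple spectral components" is, at its core, linear-prediction inverse filtering. A hedged numpy sketch of that step (classical LPC on a synthetic AR signal, not the ExcitNet pipeline itself):

```python
import numpy as np

def lpc(x, order):
    """Autocorrelation-method LPC: solve the Yule-Walker normal equations."""
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])          # predictor coefficients
    return np.concatenate([[1.0], -a])     # inverse filter A(z) = 1 - sum a_k z^-k

def inverse_filter(x, A):
    """Excitation (residual) e[n] = sum_k A[k] * x[n-k]."""
    return np.convolve(x, A)[: len(x)]

rng = np.random.default_rng(1)
# Synthetic "speech": white excitation coloured by an AR(2) resonance.
e_true = rng.normal(size=4000)
x = np.zeros_like(e_true)
x[0] = e_true[0]
x[1] = e_true[1] + 1.3 * x[0]
for n in range(2, len(x)):
    x[n] = e_true[n] + 1.3 * x[n - 1] - 0.7 * x[n - 2]

A = lpc(x, order=2)
residual = inverse_filter(x, A)
```

Inverse filtering flattens the spectral envelope, so the residual is much closer to white noise than the signal itself; ExcitNet models this residual with WaveNet instead of the raw waveform.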

Multi-Layer Competitive-Cooperative Framework for Performance Enhancement of Differential Evolution

Title Multi-Layer Competitive-Cooperative Framework for Performance Enhancement of Differential Evolution
Authors Sheng Xin Zhang, Li Ming Zheng, Kit Sang Tang, Shao Yong Zheng, Wing Shing Chan
Abstract Differential Evolution (DE) is recognized as one of the most powerful optimizers in the evolutionary algorithm (EA) family. Many DE variants have been proposed in recent years, but significant differences in performance between them are hardly observed. Therefore, this paper suggests a multi-layer competitive-cooperative (MLCC) framework to facilitate the competition and cooperation of multiple DEs, which, in turn, achieves a significant performance improvement. Unlike other multi-method strategies, which adopt a multi-population structure with individuals only evolving in their corresponding subpopulations, MLCC implements a parallel structure in which the entire population is simultaneously monitored by multiple DEs assigned to their corresponding layers. An individual can store, utilize and update its evolution information in different layers based on an individual-preference-based layer selecting (IPLS) mechanism and a computational resource allocation bias (RAB) mechanism. In IPLS, an individual connects to only one favored layer, while in RAB, high-quality solutions are evolved by considering all the layers. The DEs associated with the layers thus work in a competitive and cooperative manner. The proposed MLCC framework has been implemented on several highly competitive DEs. Experimental studies show that the MLCC variants significantly outperform the baseline DEs as well as several state-of-the-art and up-to-date DEs on the CEC benchmark functions.
Tasks
Published 2018-01-31
URL http://arxiv.org/abs/1801.10546v3
PDF http://arxiv.org/pdf/1801.10546v3.pdf
PWC https://paperswithcode.com/paper/multi-layer-competitive-cooperative-framework
Repo
Framework
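
As background, the kind of single baseline DE that MLCC layers on top of is the classic DE/rand/1/bin loop. A compact, purely illustrative numpy version, minimising the sphere function:

```python
import numpy as np

rng = np.random.default_rng(2)

def de(fobj, bounds, pop_size=20, F=0.8, CR=0.9, iters=200):
    """Classic DE/rand/1/bin with greedy selection."""
    dim = len(bounds)
    lo, hi = np.array(bounds, dtype=float).T
    pop = lo + rng.random((pop_size, dim)) * (hi - lo)
    fit = np.array([fobj(ind) for ind in pop])
    for _ in range(iters):
        for i in range(pop_size):
            # Mutation: v = a + F * (b - c), with a, b, c distinct from i.
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)]
            v = np.clip(a + F * (b - c), lo, hi)
            # Binomial crossover with at least one mutated gene.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, v, pop[i])
            f = fobj(trial)
            if f <= fit[i]:          # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    return pop[fit.argmin()], float(fit.min())

best, best_f = de(lambda x: float(np.sum(x ** 2)), bounds=[(-5, 5)] * 5)
```

MLCC's contribution is not this loop itself but running several such DEs in parallel layers over one shared population, with the IPLS and RAB mechanisms deciding which layer each individual uses.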

Highly Accelerated Multishot EPI through Synergistic Machine Learning and Joint Reconstruction

Title Highly Accelerated Multishot EPI through Synergistic Machine Learning and Joint Reconstruction
Authors Berkin Bilgic, Itthi Chatnuntawech, Mary Kate Manhard, Qiyuan Tian, Congyu Liao, Stephen F. Cauley, Susie Y. Huang, Jonathan R. Polimeni, Lawrence L. Wald, Kawin Setsompop
Abstract Purpose: To introduce a combined machine learning (ML) and physics-based image reconstruction framework that enables navigator-free, highly accelerated multishot echo planar imaging (msEPI), and demonstrate its application in high-resolution structural and diffusion imaging. Methods: Singleshot EPI is an efficient encoding technique, but does not lend itself well to high-resolution imaging due to severe distortion artifacts and blurring. While msEPI can mitigate these artifacts, high-quality msEPI has been elusive because of phase mismatch arising from shot-to-shot variations which preclude the combination of the multiple-shot data into a single image. We employ deep learning to obtain an interim image with minimal artifacts, which permits estimation of image phase variations due to shot-to-shot changes. These variations are then included in a Joint Virtual Coil Sensitivity Encoding (JVC-SENSE) reconstruction to utilize data from all shots and improve upon the ML solution. Results: Our combined ML + physics approach enabled R_inplane × MultiBand (MB) = 8×2-fold acceleration using 2 EPI shots for multi-echo imaging, so that whole-brain T2 and T2* parameter maps could be derived from an 8.3 s acquisition at 1×1×3 mm³ resolution. It also allowed high-resolution diffusion imaging with high geometric fidelity using 5 shots at R_inplane × MB = 9×2-fold acceleration. To make these possible, we extended the state-of-the-art MUSSELS reconstruction technique to Simultaneous MultiSlice (SMS) encoding and used it as an input to our ML network. Conclusion: The combination of ML and JVC-SENSE enabled navigator-free msEPI at higher accelerations than previously possible while using fewer shots, with reduced vulnerability to the poor generalizability and poor acceptance of end-to-end ML approaches.
Tasks Image Reconstruction
Published 2018-08-08
URL http://arxiv.org/abs/1808.02814v3
PDF http://arxiv.org/pdf/1808.02814v3.pdf
PWC https://paperswithcode.com/paper/highly-accelerated-multishot-epi-through
Repo
Framework

A Joint Model of Conversational Discourse and Latent Topics on Microblogs

Title A Joint Model of Conversational Discourse and Latent Topics on Microblogs
Authors Jing Li, Yan Song, Zhongyu Wei, Kam-Fai Wong
Abstract Conventional topic models are ineffective for topic extraction from microblog messages, because the data sparseness exhibited in short messages lacking structure and contexts results in poor message-level word co-occurrence patterns. To address this issue, we organize microblog messages as conversation trees based on their reposting and replying relations, and propose an unsupervised model that jointly learns word distributions to represent: 1) different roles of conversational discourse, 2) various latent topics in reflecting content information. By explicitly distinguishing the probabilities of messages with varying discourse roles in containing topical words, our model is able to discover clusters of discourse words that are indicative of topical content. In an automatic evaluation on large-scale microblog corpora, our joint model yields topics with better coherence scores than competitive topic models from previous studies. Qualitative analysis on model outputs indicates that our model induces meaningful representations for both discourse and topics. We further present an empirical study on microblog summarization based on the outputs of our joint model. The results show that the jointly modeled discourse and topic representations can effectively indicate summary-worthy content in microblog conversations.
Tasks Topic Models
Published 2018-09-11
URL http://arxiv.org/abs/1809.03690v1
PDF http://arxiv.org/pdf/1809.03690v1.pdf
PWC https://paperswithcode.com/paper/a-joint-model-of-conversational-discourse-and
Repo
Framework

Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks

Title Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks
Authors Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku
Abstract The state-of-the-art in text-to-speech synthesis has recently improved considerably due to novel neural waveform generation methods, such as WaveNet. However, these methods suffer from their slow sequential inference process, while their parallel versions are difficult to train and even more expensive computationally. Meanwhile, generative adversarial networks (GANs) have achieved impressive results in image generation and are making their way into audio applications; parallel inference is among their lucrative properties. By adopting recent advances in GAN training techniques, this investigation studies waveform generation for TTS in two domains (speech signal and glottal excitation). Listening test results show that while direct waveform generation with GAN is still far behind WaveNet, a GAN-based glottal excitation model can achieve quality and voice similarity on par with a WaveNet vocoder.
Tasks Image Generation, Speech Synthesis, Text-To-Speech Synthesis
Published 2018-10-30
URL http://arxiv.org/abs/1810.12598v1
PDF http://arxiv.org/pdf/1810.12598v1.pdf
PWC https://paperswithcode.com/paper/waveform-generation-for-text-to-speech
Repo
Framework

Universal features of price formation in financial markets: perspectives from Deep Learning

Title Universal features of price formation in financial markets: perspectives from Deep Learning
Authors Justin Sirignano, Rama Cont
Abstract Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price. We assess the model by testing its out-of-sample predictions for the direction of price moves given the history of price and order flow, across a wide range of stocks and time periods. The universal price formation model is shown to exhibit a remarkably stable out-of-sample prediction accuracy across time, for a wide range of stocks from different sectors. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific. The universal model — trained on data from all stocks — outperforms, in terms of out-of-sample prediction accuracy, asset-specific linear and nonlinear models trained on time series of any given stock, showing that the universal nature of price formation weighs in favour of pooling together financial data from various stocks, rather than designing asset- or sector-specific models as commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations is shown to improve forecasting performance, showing evidence of path-dependence in price dynamics.
Tasks Time Series
Published 2018-03-19
URL http://arxiv.org/abs/1803.06917v1
PDF http://arxiv.org/pdf/1803.06917v1.pdf
PWC https://paperswithcode.com/paper/universal-features-of-price-formation-in
Repo
Framework
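
A toy illustration of the supervised setup the paper studies at scale (this is emphatically not the paper's deep network): fit a tiny logistic model mapping order-book imbalance to the direction of the next price move, on synthetic data where imbalance genuinely drives the move.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
imbalance = rng.uniform(-1, 1, size=n)      # signed (bid - ask) depth imbalance
# Synthetic ground truth: probability of an up-move increases with imbalance.
up = (rng.random(n) < 1 / (1 + np.exp(-4 * imbalance))).astype(float)

w, b, lr = 0.0, 0.0, 0.1
for _ in range(300):                        # plain batch gradient descent
    p = 1 / (1 + np.exp(-(w * imbalance + b)))
    w -= lr * np.mean((p - up) * imbalance)
    b -= lr * np.mean(p - up)

pred_up = 1 / (1 + np.exp(-(w * imbalance + b))) > 0.5
accuracy = float(np.mean(pred_up == (up == 1)))
```

The paper's finding is that a single deep model of this flavour, trained across all stocks on billions of order-book events, generalizes even to stocks outside the training sample; the toy above only shows the basic supply/demand-to-direction formulation.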

Spinal Cord Gray Matter-White Matter Segmentation on Magnetic Resonance AMIRA Images with MD-GRU

Title Spinal Cord Gray Matter-White Matter Segmentation on Magnetic Resonance AMIRA Images with MD-GRU
Authors Antal Horvath, Charidimos Tsagkas, Simon Andermatt, Simon Pezold, Katrin Parmar, Philippe Cattin
Abstract The small butterfly-shaped structure of spinal cord (SC) gray matter (GM) is challenging to image and to delineate from its surrounding white matter (WM). Segmenting GM is, to some extent, a trade-off between accuracy and precision. We propose a new pipeline for GM-WM magnetic resonance (MR) image acquisition and segmentation. We report superior results compared to those recently reported in the SC GM segmentation challenge and show even better results using the averaged magnetization inversion recovery acquisitions (AMIRA) sequence. Scan-rescan experiments with the AMIRA sequence show high reproducibility in terms of Dice coefficient, Hausdorff distance and relative standard deviation. We use a recurrent neural network (RNN) with multi-dimensional gated recurrent units (MD-GRU) to train segmentation models on the AMIRA dataset of 855 slices. We added a generalized Dice loss to the cross entropy loss that MD-GRU uses and were able to improve the results.
Tasks
Published 2018-08-07
URL http://arxiv.org/abs/1808.02408v1
PDF http://arxiv.org/pdf/1808.02408v1.pdf
PWC https://paperswithcode.com/paper/spinal-cord-gray-matter-white-matter
Repo
Framework
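
The loss modification in the last sentence can be sketched in numpy (our reconstruction; the actual MD-GRU code may weight or normalize the terms differently): a generalized Dice loss with inverse-squared-volume class weights, added to the usual pixel-wise cross entropy.

```python
import numpy as np

def generalized_dice_loss(probs, onehot, eps=1e-6):
    # probs, onehot: (num_pixels, num_classes)
    w = 1.0 / (onehot.sum(axis=0) ** 2 + eps)        # up-weight rare classes
    intersect = (w * (probs * onehot).sum(axis=0)).sum()
    union = (w * (probs + onehot).sum(axis=0)).sum()
    return 1.0 - 2.0 * intersect / (union + eps)

def cross_entropy(probs, onehot, eps=1e-12):
    return float(-(onehot * np.log(probs + eps)).sum(axis=1).mean())

def combined_loss(probs, onehot):
    return generalized_dice_loss(probs, onehot) + cross_entropy(probs, onehot)

# Near-perfect predictions should drive both terms towards zero.
onehot = np.eye(3)[np.array([0, 1, 2, 1, 0])]
perfect = onehot * 0.999 + (1 - onehot) * 0.0005
```

The inverse-squared-volume weights are what make the Dice term useful here: the thin GM structure contributes as much to the loss as the much larger WM and background classes.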

Nonlinear Distributional Gradient Temporal-Difference Learning

Title Nonlinear Distributional Gradient Temporal-Difference Learning
Authors Chao Qu, Shie Mannor, Huan Xu
Abstract We devise a distributional variant of gradient temporal-difference (TD) learning. Distributional reinforcement learning has been demonstrated to outperform the regular approach in the recent study \citep{bellemare2017distributional}. In the policy evaluation setting, we design two new algorithms, called distributional GTD2 and distributional TDC, using the Cramér distance on the distributional version of the Bellman error objective function, which inherits the advantages of both the nonlinear gradient TD algorithms and the distributional RL approach. In the control setting, we propose distributional Greedy-GQ using a similar derivation. We prove the asymptotic almost-sure convergence of distributional GTD2 and TDC to a local optimal solution for general smooth function approximators, which include the neural networks widely used in recent studies to solve real-life RL problems. In each step, the computational complexity of the above three algorithms is linear w.r.t. the number of parameters of the function approximator, so they can be implemented efficiently for neural networks.
Tasks Distributional Reinforcement Learning
Published 2018-05-20
URL http://arxiv.org/abs/1805.07732v3
PDF http://arxiv.org/pdf/1805.07732v3.pdf
PWC https://paperswithcode.com/paper/nonlinear-distributional-gradient-temporal
Repo
Framework
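
The Cramér distance at the heart of the objective is just the l2 distance between CDFs; for the categorical value distributions common in distributional RL it reduces to a few numpy lines (an illustration, not the paper's implementation):

```python
import numpy as np

def cramer_distance(p, q, support):
    """l2 distance between the CDFs of two distributions on a shared support."""
    dz = np.diff(support).mean()          # assumes uniformly spaced atoms
    Fp, Fq = np.cumsum(p), np.cumsum(q)
    return float(np.sqrt(np.sum((Fp - Fq) ** 2) * dz))

support = np.linspace(0.0, 1.0, 11)
p = np.full(11, 1 / 11)                    # uniform value distribution
q = np.zeros(11); q[5] = 1.0               # point mass at 0.5
d = cramer_distance(p, q, support)
```

Unlike the KL divergence, this distance stays finite when the two distributions have disjoint support, which is one reason it is attractive for gradient-based distributional TD.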

A Knowledge Graph Based Solution for Entity Discovery and Linking in Open-Domain Questions

Title A Knowledge Graph Based Solution for Entity Discovery and Linking in Open-Domain Questions
Authors Kai Lei, Bing Zhang, Yong Liu, Yang Deng, Dongyu Zhang, Ying Shen
Abstract Named entity discovery and linking is a fundamental and core component of question answering. In the Question Entity Discovery and Linking (QEDL) problem, traditional methods are challenged because multiple entities in one short question are difficult to discover entirely, and the incomplete information in short text makes entity linking hard to implement. To overcome these difficulties, we propose a knowledge graph based solution for QEDL and develop a system consisting of a Question Entity Discovery (QED) module and an Entity Linking (EL) module. The QED module is a tradeoff and ensemble of two methods: one based on knowledge graph retrieval, which extracts more entities in questions and guarantees the recall rate, and one based on Conditional Random Fields (CRF), which improves the precision rate. The EL module is treated as a ranking problem, and a Learning to Rank (LTR) method with features such as semantic similarity, text similarity and entity popularity is utilized to extract and make full use of the information in short texts. On the official dataset of a shared QEDL evaluation task, our approach obtains a 64.44% F1 score for QED and 64.86% accuracy for EL, ranking 2nd place and indicating its practical use for the QEDL problem.
Tasks Entity Linking, Learning-To-Rank, Question Answering, Semantic Similarity, Semantic Textual Similarity
Published 2018-12-05
URL http://arxiv.org/abs/1812.01889v1
PDF http://arxiv.org/pdf/1812.01889v1.pdf
PWC https://paperswithcode.com/paper/a-knowledge-graph-based-solution-for-entity
Repo
Framework

Crystal Loss and Quality Pooling for Unconstrained Face Verification and Recognition

Title Crystal Loss and Quality Pooling for Unconstrained Face Verification and Recognition
Authors Rajeev Ranjan, Ankan Bansal, Hongyu Xu, Swami Sankaranarayanan, Jun-Cheng Chen, Carlos D. Castillo, Rama Chellappa
Abstract In recent years, the performance of face verification and recognition systems based on deep convolutional neural networks (DCNNs) has significantly improved. A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of face images or videos. The softmax loss function does not optimize the features to have higher similarity score for positive pairs and lower similarity score for negative pairs, which leads to a performance gap. In this paper, we propose a new loss function, called Crystal Loss, that restricts the features to lie on a hypersphere of a fixed radius. The loss can be easily implemented using existing deep learning frameworks. We show that integrating this simple step in the training pipeline significantly improves the performance of face verification and recognition systems. We achieve state-of-the-art performance for face verification and recognition on the challenging LFW, IJB-A, IJB-B and IJB-C datasets over a large range of false alarm rates (10^-1 to 10^-7).
Tasks Face Verification
Published 2018-04-03
URL http://arxiv.org/abs/1804.01159v2
PDF http://arxiv.org/pdf/1804.01159v2.pdf
PWC https://paperswithcode.com/paper/crystal-loss-and-quality-pooling-for
Repo
Framework
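
The core constraint behind Crystal Loss, features restricted to a hypersphere of fixed radius, is essentially a one-line operation before the softmax classifier. A numpy sketch (the radius value here is arbitrary, not taken from the paper):

```python
import numpy as np

def crystal_normalize(features, radius=16.0, eps=1e-12):
    """l2-normalize each feature row, then rescale to the fixed radius."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return radius * features / (norms + eps)

feats = np.array([[3.0, 4.0], [0.5, 0.0]])
scaled = crystal_normalize(feats, radius=16.0)
```

After this step every feature has the same norm, so the subsequent softmax can only separate classes by angle, which aligns training with the cosine-similarity scoring used at verification time.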

Exploration by Distributional Reinforcement Learning

Title Exploration by Distributional Reinforcement Learning
Authors Yunhao Tang, Shipra Agrawal
Abstract We propose a framework based on distributional reinforcement learning and recent attempts to combine Bayesian parameter updates with deep reinforcement learning. We show that our proposed framework conceptually unifies multiple previous methods in exploration. We also derive a practical algorithm that achieves efficient exploration on challenging control tasks.
Tasks Distributional Reinforcement Learning, Efficient Exploration
Published 2018-05-04
URL http://arxiv.org/abs/1805.01907v2
PDF http://arxiv.org/pdf/1805.01907v2.pdf
PWC https://paperswithcode.com/paper/exploration-by-distributional-reinforcement
Repo
Framework

SHADE: Information Based Regularization for Deep Learning

Title SHADE: Information Based Regularization for Deep Learning
Authors Michael Blot, Thomas Robert, Nicolas Thome, Matthieu Cord
Abstract Regularization is a big issue for training deep neural networks. In this paper, we propose a new information-theory-based regularization scheme named SHADE for SHAnnon DEcay. The originality of the approach is to define a prior based on conditional entropy, which explicitly decouples the learning of invariant representations in the regularizer and the learning of correlations between inputs and labels in the data fitting term. Our second contribution is to derive a stochastic version of the regularizer compatible with deep learning, resulting in a tractable training scheme. We empirically validate the efficiency of our approach to improve classification performances compared to common regularization schemes on several standard architectures.
Tasks
Published 2018-04-29
URL http://arxiv.org/abs/1804.10988v4
PDF http://arxiv.org/pdf/1804.10988v4.pdf
PWC https://paperswithcode.com/paper/shade-information-based-regularization-for-1
Repo
Framework