Paper Group ANR 507
Differential Evolution for Quantum Robust Control: Algorithm, Applications and Experiments
Title | Differential Evolution for Quantum Robust Control: Algorithm, Applications and Experiments |
Authors | Daoyi Dong, Xi Xing, Hailan Ma, Chunlin Chen, Zhixin Liu, Herschel Rabitz |
Abstract | Robust control design for quantum systems has been recognized as a key task in quantum information technology, molecular chemistry and atomic physics. In this paper, an improved differential evolution algorithm, msMS_DE, is proposed to search for robust fields for various quantum control problems. In msMS_DE, multiple samples are used for fitness evaluation and a mixed strategy is employed for the mutation operation. In particular, the msMS_DE algorithm is applied to the control problem of open inhomogeneous quantum ensembles and the consensus problem of a quantum network with uncertainties. Numerical results are presented to demonstrate the excellent performance of the improved DE algorithm for these two classes of quantum robust control problems. Furthermore, msMS_DE is experimentally implemented on femtosecond laser control systems to generate good signals of two-photon absorption and to control fragmentation of the halomethane molecule CH2BrI. Experimental results demonstrate the excellent performance of msMS_DE in searching for effective femtosecond laser pulses for various tasks. |
Tasks | |
Published | 2017-02-13 |
URL | http://arxiv.org/abs/1702.03946v1 |
http://arxiv.org/pdf/1702.03946v1.pdf | |
PWC | https://paperswithcode.com/paper/differential-evolution-for-quantum-robust |
Repo | |
Framework | |
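The two ingredients named in the abstract above (fitness averaged over multiple samples, and a mixed mutation strategy) can be illustrated with a minimal sketch. This is not the authors' msMS_DE: the toy objective, the sample values, the 50/50 choice between DE/rand/1 and DE/best/1, and every function and parameter name are assumptions made for illustration.

```python
import random


def sampled_fitness(x, samples):
    # Average a toy objective over several values of an uncertain parameter
    # (a hypothetical stand-in for averaging performance over ensemble members).
    return sum(-sum((xi - s) ** 2 for xi in x) for s in samples) / len(samples)


def mixed_strategy_de(dim=3, pop_size=20, generations=200, f=0.5, cr=0.9, seed=0):
    rng = random.Random(seed)
    samples = [0.9, 1.0, 1.1]  # assumed samples of the uncertain parameter
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    fit = [sampled_fitness(p, samples) for p in pop]
    for _ in range(generations):
        best = pop[max(range(pop_size), key=fit.__getitem__)]
        for i in range(pop_size):
            others = [p for j, p in enumerate(pop) if j != i]
            a, b, c = rng.sample(others, 3)
            # Mixed strategy (assumed): DE/rand/1 or DE/best/1 with equal probability.
            base = a if rng.random() < 0.5 else best
            mutant = [base[k] + f * (b[k] - c[k]) for k in range(dim)]
            # Binomial crossover; j_rand guarantees one component from the mutant.
            j_rand = rng.randrange(dim)
            trial = [
                mutant[k] if (rng.random() < cr or k == j_rand) else pop[i][k]
                for k in range(dim)
            ]
            trial_fit = sampled_fitness(trial, samples)
            # Greedy selection: keep the trial only if it is at least as fit.
            if trial_fit >= fit[i]:
                pop[i], fit[i] = trial, trial_fit
    i_best = max(range(pop_size), key=fit.__getitem__)
    return pop[i_best], fit[i_best]
```

Calling `mixed_strategy_de()` returns the best vector found and its averaged fitness; for this toy objective the maximizer is the mean of the sample values in every coordinate.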
Variational Inference for Logical Inference
Title | Variational Inference for Logical Inference |
Authors | Guy Emerson, Ann Copestake |
Abstract | Functional Distributional Semantics is a framework that aims to learn, from text, semantic representations which can be interpreted in terms of truth. Here we make two contributions to this framework. The first is to show how a type of logical inference can be performed by evaluating conditional probabilities. The second is to make these calculations tractable by means of a variational approximation. This approximation also enables faster convergence during training, allowing us to close the gap with state-of-the-art vector space models when evaluating on semantic similarity. We demonstrate promising performance on two tasks. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2017-09-01 |
URL | http://arxiv.org/abs/1709.00224v1 |
http://arxiv.org/pdf/1709.00224v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-inference-for-logical-inference |
Repo | |
Framework | |
MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU
Title | MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU |
Authors | Qingqing Cao, Niranjan Balasubramanian, Aruna Balasubramanian |
Abstract | In this paper, we explore optimizations to run Recurrent Neural Network (RNN) models locally on mobile devices. RNN models are widely used for Natural Language Processing, Machine Translation, and other tasks. However, existing mobile applications that use RNN models do so on the cloud. To address privacy and efficiency concerns, we show how RNN models can be run locally on mobile devices. Existing work on porting deep learning models to mobile devices focuses on Convolutional Neural Networks (CNNs) and cannot be applied directly to RNN models. In response, we present MobiRNN, a mobile-specific optimization framework that implements GPU offloading specifically for mobile GPUs. Evaluations using an RNN model for activity recognition show that MobiRNN significantly decreases the latency of running RNN models on phones. |
Tasks | Activity Recognition, Machine Translation |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00878v1 |
http://arxiv.org/pdf/1706.00878v1.pdf | |
PWC | https://paperswithcode.com/paper/mobirnn-efficient-recurrent-neural-network |
Repo | |
Framework | |
Differentially Private Bayesian Learning on Distributed Data
Title | Differentially Private Bayesian Learning on Distributed Data |
Authors | Mikko Heikkilä, Eemil Lagerspetz, Samuel Kaski, Kana Shimizu, Sasu Tarkoma, Antti Honkela |
Abstract | Many applications of machine learning, for example in health care, would benefit from methods that can guarantee privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. The standard DP algorithms require a single trusted party to have access to the entire data, which is a clear weakness. We consider DP Bayesian learning in a distributed setting, where each party only holds a single sample or a few samples of the data. We propose a learning strategy based on a secure multi-party sum function for aggregating summaries from data holders and the Gaussian mechanism for DP. Our method builds on an asymptotically optimal and practically efficient DP Bayesian inference with rapidly diminishing extra cost. |
Tasks | Bayesian Inference |
Published | 2017-03-03 |
URL | http://arxiv.org/abs/1703.01106v2 |
http://arxiv.org/pdf/1703.01106v2.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-bayesian-learning-on |
Repo | |
Framework | |
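A minimal sketch of the idea in the abstract above, assuming the simplest possible setting: each party clips its value, adds its own share of Gaussian noise, and only the total is revealed. The secure multi-party sum is replaced here by a plain `sum`, all parties are assumed honest, and the function names are hypothetical. The noise scale uses the classic Gaussian-mechanism calibration, which holds for 0 < epsilon < 1; it is not taken from the paper.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (valid for 0 < epsilon < 1)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def dp_distributed_sum(values, epsilon, delta, bound, rng):
    n = len(values)
    # Replacing one record moves the clipped sum by at most 2 * bound.
    sigma = gaussian_sigma(epsilon, delta, sensitivity=2.0 * bound)
    messages = []
    for v in values:
        clipped = max(-bound, min(bound, v))
        # Each party adds N(0, sigma^2 / n); the n independent shares sum to N(0, sigma^2).
        messages.append(clipped + rng.gauss(0.0, sigma / math.sqrt(n)))
    # Stand-in for the secure sum: in a real protocol only this total is revealed.
    return sum(messages), sigma
```

Splitting the noise evenly keeps the total variance at `sigma ** 2`, but it offers less protection if some parties collude and subtract their own shares, which is why the honest-party assumption matters in this sketch.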
Lower Bounds on the Bayes Risk of the Bayesian BTL Model with Applications to Comparison Graphs
Title | Lower Bounds on the Bayes Risk of the Bayesian BTL Model with Applications to Comparison Graphs |
Authors | Mine Alsan, Ranjitha Prasad, Vincent Y. F. Tan |
Abstract | We consider the problem of aggregating pairwise comparisons to obtain a consensus ranking order over a collection of objects. We use the popular Bradley-Terry-Luce (BTL) model which allows us to probabilistically describe pairwise comparisons between objects. In particular, we employ the Bayesian BTL model which allows for meaningful prior assumptions and copes with situations where the number of objects is large and the number of comparisons between some objects is small or even zero. For the conventional Bayesian BTL model, we derive information-theoretic lower bounds on the Bayes risk of estimators for norm-based distortion functions. We compare the information-theoretic lower bound with the Bayesian Cramér-Rao lower bound we derive for the case when the Bayes risk is the mean squared error. We illustrate the utility of the bounds through simulations by comparing them with the error performance of an expectation-maximization based inference algorithm proposed for the Bayesian BTL model. We draw parallels between pairwise comparisons in the BTL model and inter-player games represented as edges in a comparison graph and analyze the effect of various graph structures on the lower bounds. We also extend the information-theoretic and Bayesian Cramér-Rao lower bounds to the more general Bayesian BTL model which takes into account home-field advantage. |
Tasks | |
Published | 2017-09-27 |
URL | http://arxiv.org/abs/1709.09676v4 |
http://arxiv.org/pdf/1709.09676v4.pdf | |
PWC | https://paperswithcode.com/paper/lower-bounds-on-the-bayes-risk-of-the |
Repo | |
Framework | |
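For readers unfamiliar with the BTL model mentioned in the abstract above, the following sketch shows the standard pairwise win probability in its log-skill (logistic) form, with an optional additive home-field term. The function name and the log-skill parameterization are choices made for illustration; the paper's own notation and priors are not reproduced here.

```python
import math


def btl_win_probability(skill_i, skill_j, home_advantage=0.0):
    """P(i beats j) under BTL with log-skills.

    Equivalent to w_i / (w_i + w_j) with w = exp(skill). A positive
    home_advantage is added to i's log-skill when i plays at home.
    """
    return 1.0 / (1.0 + math.exp(-(skill_i + home_advantage - skill_j)))
```

With equal skills the probability is 0.5, and a home advantage of `math.log(2)` makes the home player twice as likely to win as to lose.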
Convolution Aware Initialization
Title | Convolution Aware Initialization |
Authors | Armen Aghajanyan |
Abstract | Initialization of parameters in deep neural networks has been shown to have a big impact on the performance of the networks (Mishkin & Matas, 2015). The initialization scheme devised by He et al. allowed convolution activations to carry a constrained mean which allowed deep networks to be trained effectively (He et al., 2015a). Orthogonal initializations and more generally orthogonal matrices in standard recurrent networks have been proved to eradicate the vanishing and exploding gradient problem (Pascanu et al., 2012). The majority of current initialization schemes do not take fully into account the intrinsic structure of the convolution operator. Using the duality of the Fourier transform and the convolution operator, Convolution Aware Initialization builds orthogonal filters in the Fourier space, and using the inverse Fourier transform represents them in the standard space. With Convolution Aware Initialization we noticed not only higher accuracy and lower loss, but also faster convergence. We achieve new state of the art on the CIFAR10 dataset, and achieve close to state of the art on various other tasks. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06295v3 |
http://arxiv.org/pdf/1702.06295v3.pdf | |
PWC | https://paperswithcode.com/paper/convolution-aware-initialization |
Repo | |
Framework | |
The MATLAB Toolbox SciXMiner: User’s Manual and Programmer’s Guide
Title | The MATLAB Toolbox SciXMiner: User’s Manual and Programmer’s Guide |
Authors | Ralf Mikut, Andreas Bartschat, Wolfgang Doneit, Jorge Ángel González Ordiano, Benjamin Schott, Johannes Stegmaier, Simon Waczowicz, Markus Reischl |
Abstract | The Matlab toolbox SciXMiner is designed for the visualization and analysis of time series and features, with a special focus on classification problems. It was developed at the Institute of Applied Computer Science of the Karlsruhe Institute of Technology (KIT), a member of the Helmholtz Association of German Research Centres in Germany. The aim was to provide an open platform for the development and improvement of data mining methods and their applications to various medical and technical problems. SciXMiner is based on Matlab (tested with version 2017a). Many functions do not require additional standard toolboxes, but some parts of the Signal, Statistics and Wavelet toolboxes are used for special cases. A Matlab-based solution was chosen in order to use the wide mathematical functionality of this package provided by The Mathworks Inc. SciXMiner is controlled by a graphical user interface (GUI) with menu items and control elements like popup lists, checkboxes and edit elements. This makes it easier for inexperienced users to work with SciXMiner. Furthermore, automation and batch standardization of analyses is possible using macros. The standard Matlab style using the command line is also available. SciXMiner is open source software. The download page is http://sourceforge.net/projects/SciXMiner. It is licensed under the conditions of the GNU General Public License (GNU-GPL) of The Free Software Foundation. |
Tasks | Time Series |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03298v1 |
http://arxiv.org/pdf/1704.03298v1.pdf | |
PWC | https://paperswithcode.com/paper/the-matlab-toolbox-scixminer-users-manual-and |
Repo | |
Framework | |
CDS Rate Construction Methods by Machine Learning Techniques
Title | CDS Rate Construction Methods by Machine Learning Techniques |
Authors | Raymond Brummelhuis, Zhongmin Luo |
Abstract | Regulators require financial institutions to estimate counterparty default risks from liquid CDS quotes for the valuation and risk management of OTC derivatives. However, the vast majority of counterparties do not have liquid CDS quotes and need proxy CDS rates. Existing methods cannot account for counterparty-specific default risks; we propose to construct proxy CDS rates by associating each illiquid counterparty with a liquid CDS proxy using machine learning techniques. After testing 156 classifiers from the 8 most popular classifier families, we found that some classifiers achieve highly satisfactory accuracy rates. Furthermore, we have rank-ordered the performances and investigated performance variations amongst and within the 8 classifier families. This paper is, to the best of our knowledge, the first systematic study of CDS proxy construction by machine learning techniques, and the first systematic classifier comparison study based entirely on financial market data. Its findings both confirm and contrast with the existing classifier performance literature. Given the typically highly correlated nature of financial data, we investigated the impact of correlation on classifier performance. The techniques used in this paper should be of interest for financial institutions seeking a CDS proxy method, and can serve for proxy construction for other financial variables. Some directions for future research are indicated. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.06899v1 |
http://arxiv.org/pdf/1705.06899v1.pdf | |
PWC | https://paperswithcode.com/paper/cds-rate-construction-methods-by-machine |
Repo | |
Framework | |
Tensor-Based Classifiers for Hyperspectral Data Analysis
Title | Tensor-Based Classifiers for Hyperspectral Data Analysis |
Authors | Konstantinos Makantasis, Anastasios Doulamis, Nikolaos Doulamis, Antonis Nikitakis |
Abstract | In this work, we present tensor-based linear and nonlinear models for hyperspectral data classification and analysis. By exploiting principles of tensor algebra, we introduce new classification architectures, the weight parameters of which satisfy the rank-1 canonical decomposition property. Then, we introduce learning algorithms to train both the linear and the non-linear classifier so that i) the error over the training samples is minimized and ii) the weight coefficients satisfy the rank-1 canonical decomposition property. The advantages of the proposed classification model are that i) it reduces the number of parameters required and thus reduces the respective number of training samples required to properly train the model, ii) it provides a physical interpretation regarding the model coefficients on the classification output and iii) it retains the spatial and spectral coherency of the input samples. To address the low capacity of linear classification, which can only produce rules that are linear in the input space, we introduce non-linear classification models based on a modification of a feedforward neural network. We call the proposed architecture rank-1 Feedforward Neural Network (FNN), since its weights satisfy the rank-1 canonical decomposition property. Appropriate learning algorithms are also proposed to train the network. Experimental results and comparisons with state-of-the-art classification methods, both linear (e.g., SVM) and non-linear (e.g., deep learning), indicate that the proposed scheme outperforms them, especially in cases where a small number of training samples is available. Furthermore, the proposed tensor-based classifiers are evaluated on their dimensionality reduction capabilities. |
Tasks | Dimensionality Reduction |
Published | 2017-09-24 |
URL | http://arxiv.org/abs/1709.08164v2 |
http://arxiv.org/pdf/1709.08164v2.pdf | |
PWC | https://paperswithcode.com/paper/tensor-based-classifiers-for-hyperspectral |
Repo | |
Framework | |
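The parameter saving that the abstract above attributes to rank-1 weights can be seen in the two-dimensional case. If the weight matrix is an outer product W = a b^T, the inner product of W with an input matrix X equals a^T X b, so only len(a) + len(b) parameters are needed instead of len(a) * len(b). The sketch below checks this identity on a small example; it covers only the order-2 case, and the function names are hypothetical, not taken from the paper.

```python
def rank1_score(a, b, x):
    """Compute a^T X b without ever forming the full weight matrix."""
    return sum(
        a[i] * sum(x[i][j] * b[j] for j in range(len(b)))
        for i in range(len(a))
    )


def full_weight_score(a, b, x):
    """Same score via the explicit rank-1 weight matrix W = a b^T."""
    w = [[ai * bj for bj in b] for ai in a]
    return sum(w[i][j] * x[i][j] for i in range(len(a)) for j in range(len(b)))


a = [1, 2]            # 2 parameters along the first mode
b = [3, 4, 5]         # 3 parameters along the second mode
x = [[1, 0, 2],
     [0, 1, 1]]       # a 2 x 3 input patch
```

Here the factored form stores 5 numbers where the full matrix stores 6; the gap widens quickly for larger inputs and for tensors with more modes.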
Unsupervised Ensemble Ranking of Terms in Electronic Health Record Notes Based on Their Importance to Patients
Title | Unsupervised Ensemble Ranking of Terms in Electronic Health Record Notes Based on Their Importance to Patients |
Authors | Jinying Chen, Hong Yu |
Abstract | Background: Electronic health record (EHR) notes contain abundant medical jargon that can be difficult for patients to comprehend. One way to help patients is to reduce information overload and help them focus on medical terms that matter most to them. Objective: The aim of this work was to develop FIT (Finding Important Terms for patients), an unsupervised natural language processing (NLP) system that ranks medical terms in EHR notes based on their importance to patients. Methods: We built FIT on a new unsupervised ensemble ranking model derived from the biased random walk algorithm to combine heterogeneous information resources for ranking candidate terms from each EHR note. Specifically, FIT integrates four single views for term importance: patient use of medical concepts, document-level term salience, word-occurrence based term relatedness, and topic coherence. It also incorporates partial information of term importance as conveyed by terms’ unfamiliarity levels and semantic types. We evaluated FIT on 90 expert-annotated EHR notes and compared it with three benchmark unsupervised ensemble ranking methods. Results: FIT achieved 0.885 AUC-ROC for ranking candidate terms from EHR notes to identify important terms. When including term identification, the performance of FIT for identifying important terms from EHR notes was 0.813 AUC-ROC. It outperformed the three ensemble rankers for most metrics. Its performance is relatively insensitive to its parameter. Conclusions: FIT can automatically identify EHR terms important to patients and may help develop personalized interventions to improve quality of care. By using unsupervised learning as well as a robust and flexible framework for information fusion, FIT can be readily applied to other domains and applications. |
Tasks | |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00538v2 |
http://arxiv.org/pdf/1703.00538v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-ensemble-ranking-of-terms-in |
Repo | |
Framework | |
Improved Workflow for Unsupervised Multiphase Image Segmentation
Title | Improved Workflow for Unsupervised Multiphase Image Segmentation |
Authors | Brendan A. West, Taylor S. Hodgdon, Matthew D. Parno, Arnold J. Song |
Abstract | Quantitative image analysis often depends on accurate classification of pixels through a segmentation process. However, imaging artifacts such as the partial volume effect and sensor noise complicate the classification process. These effects increase the pixel intensity variance of each constituent class, causing intensities from one class to overlap with another. This increased variance makes threshold based segmentation methods insufficient due to ambiguous overlap regions in the pixel intensity distributions. The class ambiguity becomes even more complex for systems with more than two constituents, such as unsaturated moist granular media. In this paper, we propose an image processing workflow that improves segmentation accuracy for multiphase systems. First, the ambiguous transition regions between classes are identified and removed, which allows for global thresholding of single-class regions. Then the transition regions are classified using a distance function, and finally both segmentations are combined into one classified image. This workflow includes three methodologies for identifying transition pixels and we demonstrate on a variety of synthetic images that these approaches are able to accurately separate the ambiguous transition pixels from the single-class regions. For situations with typical amounts of image noise, misclassification errors and area differences calculated between each class of the synthetic images and the resultant segmented images range from 0.69-1.48% and 0.01-0.74%, respectively, showing the segmentation accuracy of this approach. We demonstrate that we are able to accurately segment x-ray microtomography images of moist granular media using these computationally efficient methodologies. |
Tasks | Semantic Segmentation |
Published | 2017-10-26 |
URL | http://arxiv.org/abs/1710.09671v1 |
http://arxiv.org/pdf/1710.09671v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-workflow-for-unsupervised-multiphase |
Repo | |
Framework | |
Attribute Recognition by Joint Recurrent Learning of Context and Correlation
Title | Attribute Recognition by Joint Recurrent Learning of Context and Correlation |
Authors | Jingya Wang, Xiatian Zhu, Shaogang Gong, Wei Li |
Abstract | Recognising semantic pedestrian attributes in surveillance images is a challenging task for computer vision, particularly when the imaging quality is poor with complex background clutter and uncontrolled viewing conditions, and the amount of labelled training data is small. In this work, we formulate a Joint Recurrent Learning (JRL) model for exploring attribute context and correlation in order to improve attribute recognition given small sized training data with poor quality images. The JRL model jointly learns pedestrian attribute correlations in a pedestrian image and in particular their sequential ordering dependencies (latent high-order correlation) in an end-to-end encoder/decoder recurrent network. We demonstrate the performance advantage and robustness of the JRL model over a wide range of state-of-the-art deep models for pedestrian attribute recognition, multi-label image classification, and multi-person image annotation on the two largest pedestrian attribute benchmarks, PETA and RAP. |
Tasks | Image Classification, Pedestrian Attribute Recognition |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08553v1 |
http://arxiv.org/pdf/1709.08553v1.pdf | |
PWC | https://paperswithcode.com/paper/attribute-recognition-by-joint-recurrent |
Repo | |
Framework | |
Dual Attention Network for Product Compatibility and Function Satisfiability Analysis
Title | Dual Attention Network for Product Compatibility and Function Satisfiability Analysis |
Authors | Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu |
Abstract | Product compatibility and functionality are of utmost importance to customers when they purchase products, and to sellers and manufacturers when they sell products. Due to the huge number of products available online, it is infeasible to enumerate and test the compatibility and functionality of every product. In this paper, we address two closely related problems: product compatibility analysis and function satisfiability analysis, where the second problem is a generalization of the first problem (e.g., whether a product works with another product can be considered as a special function). We first identify a novel question-answering corpus that is up-to-date regarding product compatibility and functionality information. To allow automatic discovery of product compatibility and functionality, we then propose a deep learning model called Dual Attention Network (DAN). Given a QA pair for a to-be-purchased product, DAN learns to 1) discover complementary products (or functions), and 2) accurately predict the actual compatibility (or satisfiability) of the discovered products (or functions). The challenges addressed by the model include the briefness of QAs, linguistic patterns indicating compatibility, and the appropriate fusion of questions and answers. We conduct experiments to quantitatively and qualitatively show that the identified products and functions have both high coverage and accuracy, compared with a wide spectrum of baselines. |
Tasks | |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02016v1 |
http://arxiv.org/pdf/1712.02016v1.pdf | |
PWC | https://paperswithcode.com/paper/dual-attention-network-for-product |
Repo | |
Framework | |
Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation
Title | Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation |
Authors | Svetlana Kiritchenko, Saif M. Mohammad |
Abstract | Rating scales are a widely used method for data annotation; however, they present several challenges, such as difficulty in maintaining inter- and intra-annotator consistency. Best-worst scaling (BWS) is an alternative method of annotation that is claimed to produce high-quality annotations while keeping the required number of annotations similar to that of rating scales. However, the veracity of this claim has never been systematically established. Here for the first time, we set up an experiment that directly compares the rating scale method with BWS. We show that with the same total number of annotations, BWS produces significantly more reliable results than the rating scale. |
Tasks | |
Published | 2017-12-05 |
URL | http://arxiv.org/abs/1712.01765v1 |
http://arxiv.org/pdf/1712.01765v1.pdf | |
PWC | https://paperswithcode.com/paper/best-worst-scaling-more-reliable-than-rating |
Repo | |
Framework | |
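The abstract above does not say how BWS annotations are turned into scores. One common choice is simple counting: an item's score is the fraction of its appearances in which it was picked as best minus the fraction in which it was picked as worst. The sketch below implements that counting rule; the data layout (a tuple of items, the best item, the worst item) and the function name are assumptions for illustration.

```python
from collections import Counter


def bws_scores(annotations):
    """Score each item as (#best - #worst) / #appearances, in [-1, 1]."""
    appearances, best, worst = Counter(), Counter(), Counter()
    for items, best_item, worst_item in annotations:
        appearances.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / n for item, n in appearances.items()}


# Three hypothetical 4-tuples, each annotated with a best and a worst item.
example = [
    (("a", "b", "c", "d"), "a", "d"),
    (("a", "b", "c", "e"), "a", "c"),
    (("b", "c", "d", "e"), "b", "d"),
]
scores = bws_scores(example)
```

In this example "a" is always chosen as best and "d" always as worst, so they receive the extreme scores, while "e" is never chosen and stays at zero.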
Reinforced stochastic gradient descent for deep neural network learning
Title | Reinforced stochastic gradient descent for deep neural network learning |
Authors | Haiping Huang, Taro Toyoizumi |
Abstract | Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the high-dimensional parameter space. Therefore, it is highly desirable to design an efficient algorithm to escape from these saddle points and reach a parameter region of better generalization capabilities. Here, we propose a simple extension of SGD, namely reinforced SGD, which simply adds previous first-order gradients in a stochastic manner with a probability that increases with learning time. As verified in a simple synthetic dataset, this method significantly accelerates learning compared with the original SGD. Surprisingly, it dramatically reduces over-fitting effects, even compared with the state-of-the-art adaptive learning algorithm Adam. For a benchmark handwritten digits dataset, the learning performance is comparable to Adam, yet with an extra advantage of requiring one-fold less computer memory. The reinforced SGD is also compared with SGD with fixed or adaptive momentum parameter and Nesterov’s momentum, which shows that the proposed framework is able to reach a similar generalization accuracy with lower computational cost. Overall, our method introduces stochastic memory into gradients, which plays an important role in understanding how gradient-based training algorithms can work and their relationship with the generalization abilities of deep networks. |
Tasks | |
Published | 2017-01-27 |
URL | http://arxiv.org/abs/1701.07974v5 |
http://arxiv.org/pdf/1701.07974v5.pdf | |
PWC | https://paperswithcode.com/paper/reinforced-stochastic-gradient-descent-for |
Repo | |
Framework | |
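The abstract above describes reinforced SGD only in words: previous gradients are added stochastically, with a probability that grows with learning time. The sketch below is one possible reading of that sentence, not the authors' update rule. The probability schedule `p_max * (1 - exp(-gamma * t))`, the cap `p_max`, the reset of the memory whenever no reinforcement happens, and the deterministic toy gradient are all assumptions made for illustration.

```python
import math
import random


def reinforced_sgd(grad_fn, w0, lr=0.05, steps=500, gamma=0.01, p_max=0.5, seed=0):
    rng = random.Random(seed)
    w, memory = w0, 0.0
    for t in range(steps):
        g = grad_fn(w)
        # Assumed schedule: reinforcement probability rises from 0 towards p_max.
        p = p_max * (1.0 - math.exp(-gamma * t))
        if rng.random() < p:
            # Reinforce: add the previously used gradient to the current one.
            g = g + memory
        # Remember the gradient actually applied at this step.
        memory = g
        w -= lr * g
    return w


# Toy objective (w - 3)^2 with gradient 2 * (w - 3).
w_final = reinforced_sgd(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Capping the probability below 1 is a design choice of this sketch: if every step were reinforced, the update would behave like momentum with coefficient 1 and oscillate on a quadratic instead of settling.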