January 27, 2020

Paper Group ANR 1309

MAP Clustering under the Gaussian Mixture Model via Mixed Integer Nonlinear Optimization

Title MAP Clustering under the Gaussian Mixture Model via Mixed Integer Nonlinear Optimization
Authors Patrick Flaherty, Pitchaya Wiratchotisatian, Ji Ah Lee, Zhou Tang, Andrew C. Trapp
Abstract We present a global optimization approach for solving the maximum a-posteriori (MAP) clustering problem under the Gaussian mixture model. Our approach can accommodate side constraints and it preserves the combinatorial structure of the MAP clustering problem by formulating it as a mixed-integer nonlinear optimization problem (MINLP). We approximate the MINLP through a mixed-integer quadratic program (MIQP) transformation that improves computational aspects while guaranteeing $\epsilon$-global optimality. An important benefit of our approach is the explicit quantification of the degree of suboptimality, via the optimality gap, en route to finding the globally optimal MAP clustering. Numerical experiments comparing our method to other approaches show that our method finds a better solution than standard clustering methods. Finally, we cluster a real breast cancer gene expression data set incorporating intrinsic subtype information; the induced constraints substantially improve the computational performance and produce more coherent and biologically meaningful clusters.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.04285v2
PDF https://arxiv.org/pdf/1911.04285v2.pdf
PWC https://paperswithcode.com/paper/maximum-a-posteriori-estimation-for-the
Repo
Framework

Exploring the fitness landscape of a realistic turbofan rotor blade optimization

Title Exploring the fitness landscape of a realistic turbofan rotor blade optimization
Authors Jakub Kmec, Sebastian Schmitt
Abstract Aerodynamic shape optimization has established itself as a valuable tool in the engineering design process to achieve highly efficient results. A central aspect of such approaches is the mapping from the design parameters, which encode the geometry of the shape to be improved, to the quality criteria, which describe its performance. The choices to be made in the setup of the optimization process strongly influence this mapping and thus are expected to have a profound influence on the achievable result. In this work we explore the influence of such choices on the shape optimization of a turbofan rotor blade as it can be realized within an aircraft engine design process. The blade quality is assessed by realistic three-dimensional computational fluid dynamics (CFD) simulations. We investigate the outcomes of several optimization runs which differ in various configuration options, such as the optimization algorithm, the initialization, and the number of degrees of freedom for the parametrization. For all such variations, we generally find that the achievable improvement of the blade quality is comparable for most settings and thus rather insensitive to the details of the setup. On the other hand, even supposedly minor changes in the settings, such as using a different random seed for the initialization of the optimizer algorithm, lead to very different shapes. Optimized shapes which show comparable performance usually differ quite strongly in their geometries over the complete blade. Our analyses indicate that the fitness landscape for such a realistic turbofan rotor blade optimization is highly multi-modal with many local optima, where very different shapes show similar performance.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.07268v1
PDF https://arxiv.org/pdf/1910.07268v1.pdf
PWC https://paperswithcode.com/paper/exploring-the-fitness-landscape-of-a
Repo
Framework

MxML: Mixture of Meta-Learners for Few-Shot Classification

Title MxML: Mixture of Meta-Learners for Few-Shot Classification
Authors Minseop Park, Jungtaek Kim, Saehoon Kim, Yanbin Liu, Seungjin Choi
Abstract A meta-model is trained on a distribution of similar tasks such that it learns an algorithm that can quickly adapt to a novel task with only a handful of labeled examples. Most current meta-learning methods assume that the meta-training set consists of relevant tasks sampled from a single distribution. In practice, however, a new task is often out of the task distribution, yielding a performance degradation. One way to tackle this problem is to construct an ensemble of meta-learners such that each meta-learner is trained on a different task distribution. In this paper we present a method for constructing a mixture of meta-learners (MxML), where mixing parameters are determined by the weight prediction network (WPN) optimized to improve the few-shot classification performance. Experiments on various datasets demonstrate that MxML significantly outperforms state-of-the-art meta-learners, as well as their naive ensemble, on out-of-distribution as well as in-distribution tasks.
Tasks Meta-Learning
Published 2019-04-11
URL http://arxiv.org/abs/1904.05658v1
PDF http://arxiv.org/pdf/1904.05658v1.pdf
PWC https://paperswithcode.com/paper/mxml-mixture-of-meta-learners-for-few-shot
Repo
Framework

Markov Decision Process for Video Generation

Title Markov Decision Process for Video Generation
Authors Vladyslav Yushchenko, Nikita Araslanov, Stefan Roth
Abstract We identify two pathological cases of temporal inconsistencies in video generation: video freezing and video looping. To better quantify the temporal diversity, we propose a class of complementary metrics that are effective, easy to implement, data agnostic, and interpretable. Further, we observe that current state-of-the-art models are trained on video samples of fixed length thereby inhibiting long-term modeling. To address this, we reformulate the problem of video generation as a Markov Decision Process (MDP). The underlying idea is to represent motion as a stochastic process with an infinite forecast horizon to overcome the fixed length limitation and to mitigate the presence of temporal artifacts. We show that our formulation is easy to integrate into the state-of-the-art MoCoGAN framework. Our experiments on the Human Actions and UCF-101 datasets demonstrate that our MDP-based model is more memory efficient and improves the video quality both in terms of the new and established metrics.
Tasks Video Generation
Published 2019-09-26
URL https://arxiv.org/abs/1909.12400v1
PDF https://arxiv.org/pdf/1909.12400v1.pdf
PWC https://paperswithcode.com/paper/markov-decision-process-for-video-generation
Repo
Framework

Dual Adaptive Pyramid Network for Cross-Stain Histopathology Image Segmentation

Title Dual Adaptive Pyramid Network for Cross-Stain Histopathology Image Segmentation
Authors Xianxu Hou, Jingxin Liu, Bolei Xu, Bozhi Liu, Xin Chen, Mohammad Ilyas, Ian Ellis, Jon Garibaldi, Guoping Qiu
Abstract Supervised semantic segmentation normally assumes that the test data lie in a data domain similar to that of the training data. However, in practice, the domain mismatch between the training and unseen data could lead to a significant performance drop. Obtaining accurate pixel-wise labels for images in different domains is tedious and labor intensive, especially for histopathology images. In this paper, we propose a dual adaptive pyramid network (DAPNet) for histopathological gland segmentation adapting from one stain domain to another. We tackle the domain adaptation problem on two levels: 1) the image-level considers the differences of image color and style; 2) the feature-level addresses the spatial inconsistency between two domains. The two components are implemented as domain classifiers with adversarial training. We evaluate our new approach using two gland segmentation datasets with H&E and DAB-H stains respectively. The extensive experiments and ablation study demonstrate the effectiveness of our approach on the domain adaptive segmentation task. We show that the proposed approach performs favorably against other state-of-the-art methods.
Tasks Domain Adaptation, Semantic Segmentation
Published 2019-09-25
URL https://arxiv.org/abs/1909.11524v1
PDF https://arxiv.org/pdf/1909.11524v1.pdf
PWC https://paperswithcode.com/paper/dual-adaptive-pyramid-network-for-cross-stain
Repo
Framework

Comparison of Diverse Decoding Methods from Conditional Language Models

Title Comparison of Diverse Decoding Methods from Conditional Language Models
Authors Daphne Ippolito, Reno Kriz, Maria Kustikova, João Sedoc, Chris Callison-Burch
Abstract While conditional language models have greatly improved in their ability to output high-quality natural language, many NLP applications benefit from being able to generate a diverse set of candidate sequences. Diverse decoding strategies aim to, within a given-sized candidate list, cover as much of the space of high-quality outputs as possible, leading to improvements for tasks that re-rank and combine candidate outputs. Standard decoding methods, such as beam search, optimize for generating high likelihood sequences rather than diverse ones, though recent work has focused on increasing diversity in these methods. In this work, we perform an extensive survey of decoding-time strategies for generating diverse outputs from conditional language models. We also show how diversity can be improved without sacrificing quality by over-sampling additional candidates, then filtering to the desired number.
Tasks
Published 2019-06-14
URL https://arxiv.org/abs/1906.06362v1
PDF https://arxiv.org/pdf/1906.06362v1.pdf
PWC https://paperswithcode.com/paper/comparison-of-diverse-decoding-methods-from
Repo
Framework
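
The over-sample-then-filter idea from the abstract above can be illustrated with a minimal sketch. The abstract does not say how candidates are filtered, so everything here is an assumption made for illustration: the token-set Jaccard distance, the greedy farthest-first selection, the function names, and the toy candidate pool with made-up scores.

```python
def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B| over the token sets of two strings."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def filter_diverse(candidates, k):
    """Greedily keep k candidates from an over-sampled pool.

    candidates: list of (text, score) pairs; a higher score is better.
    The first pick is the top-scoring candidate. Each later pick is the
    candidate whose minimum Jaccard distance to the kept set is largest.
    """
    pool = sorted(candidates, key=lambda c: c[1], reverse=True)
    if k <= 0 or not pool:
        return []
    kept = [pool.pop(0)]
    while pool and len(kept) < k:
        best = max(
            pool,
            key=lambda c: min(jaccard_distance(c[0], s[0]) for s in kept),
        )
        kept.append(best)
        pool.remove(best)
    return kept


# Hypothetical over-sampled pool: (candidate text, log-probability).
pool = [
    ("the cat sat on the mat", -1.0),
    ("the cat sat on a mat", -1.1),
    ("a dog slept by the door", -1.5),
    ("the cat sat on the rug", -1.2),
]
selected = filter_diverse(pool, k=2)
for text, score in selected:
    print(score, text)
```

A filter of this kind trades some likelihood for coverage: the second pick is chosen purely for dissimilarity, so a practical variant would likely also bound how far the score may drop.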

Quantum Optical Experiments Modeled by Long Short-Term Memory

Title Quantum Optical Experiments Modeled by Long Short-Term Memory
Authors Thomas Adler, Manuel Erhard, Mario Krenn, Johannes Brandstetter, Johannes Kofler, Sepp Hochreiter
Abstract We demonstrate how machine learning is able to model experiments in quantum physics. Quantum entanglement is a cornerstone for upcoming quantum technologies such as quantum computation and quantum cryptography. Of particular interest are complex quantum states with more than two particles and a large number of entangled quantum levels. Given such a multiparticle high-dimensional quantum state, it is usually impossible to reconstruct an experimental setup that produces it. To search for interesting experiments, one thus has to randomly create millions of setups on a computer and calculate the respective output states. In this work, we show that machine learning models can provide significant improvement over random search. We demonstrate that a long short-term memory (LSTM) neural network can successfully learn to model quantum experiments by correctly predicting output state characteristics for given setups without the necessity of computing the states themselves. This approach not only allows for faster search but is also an essential step towards automated design of multiparticle high-dimensional quantum experiments using generative machine learning models.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1910.13804v1
PDF https://arxiv.org/pdf/1910.13804v1.pdf
PWC https://paperswithcode.com/paper/quantum-optical-experiments-modeled-by-long-1
Repo
Framework

On Analog Gradient Descent Learning over Multiple Access Fading Channels

Title On Analog Gradient Descent Learning over Multiple Access Fading Channels
Authors Tomer Sery, Kobi Cohen
Abstract We consider a distributed learning problem over a multiple access channel (MAC) using a large wireless network. The computation is made by the network edge and is based on received data from a large number of distributed nodes which transmit over a noisy fading MAC. The objective function is a sum of the nodes’ local loss functions. This problem has attracted a growing interest in distributed sensing systems, and more recently in federated learning. We develop a novel Gradient-Based Multiple Access (GBMA) algorithm to solve the distributed learning problem over the MAC. Specifically, the nodes transmit an analog function of the local gradient using common shaping waveforms and the network edge receives a superposition of the analog transmitted signals used for updating the estimate. GBMA does not require power control or beamforming to cancel the fading effect as in other algorithms, and operates directly with noisy distorted gradients. We analyze the performance of GBMA theoretically, and prove that it can approach the convergence rate of the centralized gradient descent (GD) algorithm in large networks. Specifically, we establish a finite-sample bound of the error for both convex and strongly convex loss functions with Lipschitz gradient. Furthermore, we provide energy scaling laws for approaching the centralized convergence rate as the number of nodes increases. Finally, experimental results support the theoretical findings, and demonstrate strong performance of GBMA using synthetic and real data.
Tasks
Published 2019-08-20
URL https://arxiv.org/abs/1908.07463v1
PDF https://arxiv.org/pdf/1908.07463v1.pdf
PWC https://paperswithcode.com/paper/on-analog-gradient-descent-learning-over
Repo
Framework
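
The mechanism described in the abstract above (nodes send their local gradients, the channel applies fading and adds noise, and the edge updates from the superposed signal) can be simulated with a toy sketch. This is not the paper's GBMA algorithm: the quadratic local losses, the uniform fading gains with mean 1, the noise level, the step size, and the function name `gbma_toy` are all assumptions chosen to keep the example small.

```python
import random


def gbma_toy(num_nodes=200, steps=300, step_size=0.2, noise_std=1.0, seed=0):
    """Toy over-the-air gradient descent on a scalar parameter.

    Assumptions (not from the paper): node n holds the local loss
    0.5 * (theta - a_n) ** 2, fading gains are i.i.d. uniform on [0.5, 1.5]
    (mean 1), and the receiver adds Gaussian noise once per step.
    """
    rng = random.Random(seed)
    targets = [rng.gauss(3.0, 1.0) for _ in range(num_nodes)]
    theta = 0.0
    for _ in range(steps):
        # Each node transmits its local gradient (theta - a_n). The channel
        # scales it by a random gain, and the edge observes only the sum.
        received = sum(rng.uniform(0.5, 1.5) * (theta - a) for a in targets)
        received += rng.gauss(0.0, noise_std)
        # The edge normalizes by the number of nodes and takes a descent step.
        theta -= step_size * received / num_nodes
    # The minimizer of the summed quadratic losses is the mean of the targets.
    return theta, sum(targets) / num_nodes


theta_hat, centralized_optimum = gbma_toy()
print(abs(theta_hat - centralized_optimum))
```

Because the gains average to 1 and the per-node distortions are independent, their effect on the normalized sum shrinks as the number of nodes grows, which is the intuition behind operating directly on distorted gradients.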

Seq-SetNet: Exploring Sequence Sets for Inferring Structures

Title Seq-SetNet: Exploring Sequence Sets for Inferring Structures
Authors Fusong Ju, Jianwei Zhu, Guozheng Wei, Qi Zhang, Shiwei Sun, Dongbo Bu
Abstract Sequence sets are a widely used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it. Almost all of the existing approaches exploit MSAs in an indirect fashion, i.e., they transform MSAs into position-specific scoring matrices (PSSM) that represent the distribution of amino acid types at each column. PSSM could capture column-wise characteristics of MSA; however, the column-wise characteristics embedded in each individual component sequence were nearly totally neglected. The drawback of PSSM is rooted in the fact that an MSA is essentially an unordered sequence set rather than a matrix. Specifically, the interchange of any two sequences will not affect the whole MSA. In contrast, the pixels in an image essentially form a matrix since any two rows of pixels cannot be interchanged. Therefore, the traditional deep neural networks designed for image processing cannot be directly applied to sequence sets. Here, we propose a novel deep neural network framework (called Seq-SetNet) for sequence set processing. By employing a symmetric function module to integrate features calculated from preceding layers, Seq-SetNet is immune to the order of sequences in the input MSA. This advantage enables us to directly and fully exploit MSAs by considering each component protein individually. We evaluated Seq-SetNet by using it to extract structural information from MSA for protein secondary structure prediction. Experimental results on popular benchmark sets suggest that Seq-SetNet outperforms the state-of-the-art approaches by 3.6% in precision. These results clearly suggest the advantages of Seq-SetNet in sequence set processing, and it can be readily used in a wide range of fields, such as natural language processing.
Tasks Protein Secondary Structure Prediction
Published 2019-06-06
URL https://arxiv.org/abs/1906.11196v1
PDF https://arxiv.org/pdf/1906.11196v1.pdf
PWC https://paperswithcode.com/paper/seq-setnet-exploring-sequence-sets-for
Repo
Framework
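
The order-invariance claimed for the symmetric function module in the abstract above can be checked with a small sketch. It does not reproduce Seq-SetNet: the integer per-residue encoder, the choice of max and sum as pooling operations, and the toy alignment are assumptions used only to show why a symmetric reduction ignores the order of the sequences.

```python
import random


def encode(sequence):
    """Hypothetical per-sequence encoder: one integer feature per column."""
    return [ord(residue) - ord("A") for residue in sequence]


def symmetric_pool(msa):
    """Combine per-sequence features with order-independent reductions.

    Returns, for every alignment column, the (max, sum) of the feature over
    all sequences. Both reductions are symmetric functions of their inputs,
    so permuting the sequences cannot change the result.
    """
    features = [encode(seq) for seq in msa]
    return [(max(col), sum(col)) for col in zip(*features)]


# Toy alignment of four equal-length sequences (made up for illustration).
msa = ["MKVLA", "MRVLG", "MKILA", "MKVMA"]
shuffled = msa[:]
random.Random(0).shuffle(shuffled)

print(symmetric_pool(msa) == symmetric_pool(shuffled))
```

Integer features keep the sum exact, so the comparison holds bit for bit. With floating-point features a sum could differ by rounding across orderings, whereas a max would still match exactly.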

DomainGAN: Generating Adversarial Examples to Attack Domain Generation Algorithm Classifiers

Title DomainGAN: Generating Adversarial Examples to Attack Domain Generation Algorithm Classifiers
Authors Isaac Corley, Jonathan Lwowski, Justin Hoffman
Abstract Domain Generation Algorithms (DGAs) are frequently used to generate numerous domains for use by botnets. These domains are often utilized as rendezvous points for servers that malware has command and control over. There are many algorithms that are used to generate domains; however, many of these algorithms are simplistic and easily detected by traditional machine learning techniques. In this paper, three variants of Generative Adversarial Networks (GANs) are optimized to generate domains which have characteristics similar to those of benign domains, resulting in domains which greatly evade several state-of-the-art deep learning based DGA classifiers. We additionally provide a detailed analysis into offensive usability for each variant with respect to repeated and existing domain collisions. Finally, we fine-tune the state-of-the-art DGA classifiers by adding GAN generated samples to their original training datasets and analyze the changes in performance. Our results show that GAN based DGAs are superior in evading DGA classifiers in comparison to traditional DGAs, and of the variants, the Wasserstein GAN with Gradient Penalty (WGANGP) is the highest performing DGA for uses both offensively and defensively.
Tasks
Published 2019-11-14
URL https://arxiv.org/abs/1911.06285v3
PDF https://arxiv.org/pdf/1911.06285v3.pdf
PWC https://paperswithcode.com/paper/domaingan-generating-adversarial-examples-to
Repo
Framework

Multi-Modal Recognition of Worker Activity for Human-Centered Intelligent Manufacturing

Title Multi-Modal Recognition of Worker Activity for Human-Centered Intelligent Manufacturing
Authors Wenjin Tao, Ming C. Leu, Zhaozheng Yin
Abstract In a human-centered intelligent manufacturing system, sensing and understanding of the worker’s activity are the primary tasks. In this paper, we propose a novel multi-modal approach for worker activity recognition by leveraging information from different sensors and in different modalities. Specifically, a smart armband and a visual camera are applied to capture Inertial Measurement Unit (IMU) signals and videos, respectively. For the IMU signals, we design two novel feature transform mechanisms, in both frequency and spatial domains, to assemble the captured IMU signals as images, which allow using convolutional neural networks to learn the most discriminative features. Along with the above two modalities, we propose two other modalities for the video data, at the video frame and video clip levels, respectively. Each of the four modalities returns a probability distribution on activity prediction. Then, these probability distributions are fused to output the worker activity classification result. A worker activity dataset is established, which at present contains 6 common activities in assembly tasks, i.e., grab a tool/part, hammer a nail, use a power-screwdriver, rest arms, turn a screwdriver, and use a wrench. The developed multi-modal approach is evaluated on this dataset and achieves recognition accuracies as high as 97% and 100% in the leave-one-out and half-half experiments, respectively.
Tasks Activity Prediction, Activity Recognition
Published 2019-08-20
URL https://arxiv.org/abs/1908.07519v1
PDF https://arxiv.org/pdf/1908.07519v1.pdf
PWC https://paperswithcode.com/paper/190807519
Repo
Framework
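
The fusion step in the abstract above, where four per-modality probability distributions are combined into one prediction, can be sketched as follows. The abstract does not state the fusion rule, so the unweighted average used here is an assumption, and the probability values are invented for illustration. Only the six activity names come from the abstract.

```python
ACTIVITIES = [
    "grab a tool/part",
    "hammer a nail",
    "use a power-screwdriver",
    "rest arms",
    "turn a screwdriver",
    "use a wrench",
]


def fuse(distributions):
    """Average several probability distributions and renormalize.

    Unweighted averaging is an assumed fusion rule, not the paper's.
    """
    count = len(distributions)
    fused = [sum(column) / count for column in zip(*distributions)]
    total = sum(fused)
    return [p / total for p in fused]


# Made-up outputs of the four modalities for one sample.
imu_frequency = [0.10, 0.60, 0.10, 0.05, 0.10, 0.05]
imu_spatial = [0.05, 0.50, 0.20, 0.05, 0.15, 0.05]
video_frame = [0.10, 0.40, 0.30, 0.05, 0.10, 0.05]
video_clip = [0.05, 0.70, 0.05, 0.05, 0.10, 0.05]

fused = fuse([imu_frequency, imu_spatial, video_frame, video_clip])
predicted = ACTIVITIES[fused.index(max(fused))]
print(predicted)
```

Other common choices, such as a weighted average or a product of the distributions, fit the same interface by changing only the body of `fuse`.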

Direct and indirect reinforcement learning

Title Direct and indirect reinforcement learning
Authors Yang Guan, Shengbo Eben Li, Jingliang Duan, Jie Li, Yangang Ren, Bo Cheng
Abstract Reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. In this paper, we classify RL into direct and indirect methods according to how they seek the optimal policy of the Markov Decision Process (MDP) problem. The former finds the optimal policy by directly maximizing an objective function using a gradient descent method, in which the objective function is usually the expectation of accumulative future rewards. The latter indirectly finds the optimal policy by solving the Bellman equation, which is the sufficient and necessary condition from Bellman’s principle of optimality. We take vanilla policy gradient and approximate policy iteration to study their internal relationship, and reveal that both direct and indirect methods can be unified in the actor-critic architecture and are equivalent if we always choose the stationary state distribution of the current policy as the initial state distribution of the MDP. Finally, we classify the current mainstream RL algorithms and compare the differences with other criteria, including value-based and policy-based, model-based and model-free.
Tasks Decision Making
Published 2019-12-23
URL https://arxiv.org/abs/1912.10600v1
PDF https://arxiv.org/pdf/1912.10600v1.pdf
PWC https://paperswithcode.com/paper/direct-and-indirect-reinforcement-learning
Repo
Framework
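
As an illustration of what the abstract above calls an indirect method, the sketch below finds an optimal policy by solving the Bellman optimality equation with value iteration. It is a textbook procedure rather than anything taken from the paper, and the two-state MDP, its rewards, and the discount factor are hypothetical.

```python
GAMMA = 0.9  # Assumed discount factor.

# Hypothetical two-state, two-action MDP:
# TRANSITIONS[state][action] = list of (probability, next_state, reward).
TRANSITIONS = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 1, 2.0)], 1: [(1.0, 0, 0.0)]},
}


def q_value(values, state, action):
    """Expected reward plus discounted next-state value for one action."""
    return sum(
        prob * (reward + GAMMA * values[next_state])
        for prob, next_state, reward in TRANSITIONS[state][action]
    )


def value_iteration(tol=1e-10):
    """Iterate the Bellman optimality backup until the values stop changing."""
    values = {state: 0.0 for state in TRANSITIONS}
    while True:
        new_values = {
            state: max(q_value(values, state, a) for a in TRANSITIONS[state])
            for state in TRANSITIONS
        }
        change = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        if change < tol:
            return new_values
        values = new_values


def greedy_policy(values):
    """Pick, in each state, the action with the highest Q-value."""
    return {
        state: max(TRANSITIONS[state], key=lambda a: q_value(values, state, a))
        for state in TRANSITIONS
    }


optimal_values = value_iteration()
policy = greedy_policy(optimal_values)
print(policy)
```

A direct method would instead parameterize the policy and ascend the gradient of the expected return. The policy here is read off from the solved values, which is the sense in which the route is indirect.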

Constructing Clustering Transformations

Title Constructing Clustering Transformations
Authors Steffen Borgwardt, Charles Viss
Abstract Clustering is one of the fundamental tasks in data analytics and machine learning. In many situations, different clusterings of the same data set become relevant. For example, different algorithms for the same clustering task may return dramatically different solutions. We are interested in applications in which one clustering has to be transformed into another; e.g., when a gradual transition from an old solution to a new one is required. In this paper, we devise methods for constructing such a transition based on linear programming and network theory. We use a so-called clustering-difference graph to model the desired transformation and provide methods for decomposing the graph into a sequence of elementary moves that accomplishes the transformation. These moves are equivalent to the edge directions, or circuits, of the underlying partition polytopes. Therefore, in addition to a conceptually new metric for measuring the distance between clusterings, we provide new bounds on the circuit diameter of these partition polytopes.
Tasks
Published 2019-04-10
URL http://arxiv.org/abs/1904.05406v1
PDF http://arxiv.org/pdf/1904.05406v1.pdf
PWC https://paperswithcode.com/paper/constructing-clustering-transformations
Repo
Framework

Multi-grained Attention Networks for Single Image Super-Resolution

Title Multi-grained Attention Networks for Single Image Super-Resolution
Authors Huapeng Wu, Zhengxia Zou, Jie Gui, Wen-Jun Zeng, Jieping Ye, Jun Zhang, Hongyi Liu, Zhihui Wei
Abstract Deep Convolutional Neural Networks (CNN) have drawn great attention in image super-resolution (SR). Recently, the visual attention mechanism, which exploits both feature importance and contextual cues, has been introduced to image SR and has proved effective in improving CNN-based SR performance. In this paper, we make a thorough investigation of the attention mechanisms in an SR model and shed light on how simple and effective improvements on these ideas improve the state of the art. We further propose a unified approach called “multi-grained attention networks (MGAN)” which fully exploits the advantages of multi-scale and attention mechanisms in SR tasks. In our method, the importance of each neuron is computed according to its surrounding regions in a multi-grained fashion and then is used to adaptively re-scale the feature responses. More importantly, the “channel attention” and “spatial attention” strategies in previous methods can be essentially considered as two special cases of our method. We also introduce multi-scale dense connections to extract the image features at multiple scales and capture the features of different layers through dense skip connections. Ablation studies on benchmark datasets demonstrate the effectiveness of our method. In comparison with other state-of-the-art SR methods, our method shows superiority in terms of both accuracy and model size.
Tasks Feature Importance, Image Super-Resolution, Super-Resolution
Published 2019-09-26
URL https://arxiv.org/abs/1909.11937v2
PDF https://arxiv.org/pdf/1909.11937v2.pdf
PWC https://paperswithcode.com/paper/multi-grained-attention-networks-for-single
Repo
Framework

retina-VAE: Variationally Decoding the Spectrum of Macular Disease

Title retina-VAE: Variationally Decoding the Spectrum of Macular Disease
Authors Stephen G. Odaibo
Abstract In this paper, we seek a clinically-relevant latent code for representing the spectrum of macular disease. Towards this end, we construct retina-VAE, a variational autoencoder-based model that accepts a patient profile vector (pVec) as input. The pVec components include clinical exam findings and demographic information. We evaluate the model on a subspectrum of the retinal maculopathies, in particular, exudative age-related macular degeneration, central serous chorioretinopathy, and polypoidal choroidal vasculopathy. For these three maculopathies, a database of 3000 6-dimensional pVecs (1000 each) was synthetically generated based on known disease statistics in the literature. The database was then used to train the VAE and generate latent vector representations. We found training performance to be best for a 3-dimensional latent vector architecture compared to 2- or 4-dimensional latents. Additionally, for the 3D latent architecture, we discovered that the resulting latent vectors were strongly clustered spontaneously into one of 14 clusters. K-means was then used only to identify members of each cluster and to inspect cluster properties. These clusters suggest underlying disease subtypes which may potentially respond better or worse to particular pharmaceutical treatments such as anti-vascular endothelial growth factor variants. The retina-VAE framework will potentially yield new fundamental insights into the mechanisms and manifestations of disease, and will potentially facilitate the development of personalized pharmaceuticals and gene therapies.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05195v1
PDF https://arxiv.org/pdf/1907.05195v1.pdf
PWC https://paperswithcode.com/paper/retina-vae-variationally-decoding-the
Repo
Framework