January 29, 2020

2750 words 13 mins read

Paper Group ANR 516

Exploring the 3D architectures of deep material network in data-driven multiscale mechanics. Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces. Classification-driven Single Image Dehazing. MC-ISTA-Net: Adaptive Measurement and Initialization and Channel Attention Optimization inspired Neural Network for Compressive Sensing. Ensem …

Exploring the 3D architectures of deep material network in data-driven multiscale mechanics

Title Exploring the 3D architectures of deep material network in data-driven multiscale mechanics
Authors Zeliang Liu, C. T. Wu
Abstract This paper extends the deep material network (DMN) proposed by Liu et al. (2019) to tackle general 3-dimensional (3D) problems with arbitrary material and geometric nonlinearities. It discovers a new way of describing multiscale heterogeneous materials by a multi-layer network structure and mechanistic building blocks. The data-driven framework of DMN is discussed in detail about the offline training and online extrapolation stages. Analytical solutions of the 3D building block with a two-layer structure in both small- and finite-strain formulations are derived based on interfacial equilibrium conditions and kinematic constraints. With linear elastic data generated by direct numerical simulations on a representative volume element (RVE), the network can be effectively trained in the offline stage using stochastic gradient descent and advanced model compression algorithms. Efficiency and accuracy of DMN on addressing the long-standing 3D RVE challenges with complex morphologies and material laws are validated through numerical experiments, including 1) hyperelastic particle-reinforced rubber composite with Mullins effect; 2) polycrystalline materials with rate-dependent crystal plasticity; 3) carbon fiber reinforced polymer (CFRP) composites with fiber anisotropic elasticity and matrix plasticity. In particular, we demonstrate a three-scale homogenization procedure of CFRP system by concatenating the microscale and mesoscale material networks. The complete learning and extrapolation procedures of DMN establish a reliable data-driven framework for multiscale material modeling and design.
Tasks Model Compression
Published 2019-01-02
URL http://arxiv.org/abs/1901.04832v2
PDF http://arxiv.org/pdf/1901.04832v2.pdf
PWC https://paperswithcode.com/paper/exploring-the-3d-architectures-of-deep
Repo
Framework
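
To make the idea of a mechanistic two-layer building block concrete, here is a deliberately simplified 1D analogue: a two-phase laminate whose effective modulus is the harmonic (Reuss) or arithmetic (Voigt) mixture depending on the loading direction. The paper derives full 3D analytical solutions with interfacial equilibrium and rotations; the sketch below rests on our own simplifying assumptions and is not the authors' formulation.

```python
# Illustrative sketch only: a 1D two-phase laminate analogue of the two-layer
# building block described above. Phase moduli and fractions are invented.
import numpy as np

def laminate_modulus(E1, E2, f1, loading="series"):
    """Effective Young's modulus of a two-layer laminate.

    E1, E2 : layer moduli; f1 : volume fraction of layer 1.
    'series'   -> layers stacked along the loading axis (Reuss / harmonic mean)
    'parallel' -> layers side by side along the loading axis (Voigt / arithmetic mean)
    """
    f2 = 1.0 - f1
    if loading == "series":
        return 1.0 / (f1 / E1 + f2 / E2)
    return f1 * E1 + f2 * E2

# A stiff fiber-like phase embedded in a compliant matrix (moduli in MPa).
print(laminate_modulus(230e3, 3e3, 0.6, "series"))    # lower bound
print(laminate_modulus(230e3, 3e3, 0.6, "parallel"))  # upper bound
```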

Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces

Title Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces
Authors Bryan Seybold, Emily Fertig, Alex Alemi, Ian Fischer
Abstract Variational autoencoders learn unsupervised data representations, but these models frequently converge to minima that fail to preserve meaningful semantic information. For example, variational autoencoders with autoregressive decoders often collapse into autodecoders, where they learn to ignore the encoder input. In this work, we demonstrate that adding an auxiliary decoder to regularize the latent space can prevent this collapse, but successful auxiliary decoding tasks are domain dependent. Auxiliary decoders can increase the amount of semantic information encoded in the latent space and visible in the reconstructions. The semantic information in the variational autoencoder’s representation is only weakly correlated with its rate, distortion, or evidence lower bound. Compared to other popular strategies that modify the training objective, our regularization of the latent space generally increased the semantic information content.
Tasks
Published 2019-05-17
URL https://arxiv.org/abs/1905.07478v1
PDF https://arxiv.org/pdf/1905.07478v1.pdf
PWC https://paperswithcode.com/paper/dueling-decoders-regularizing-variational
Repo
Framework
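
A minimal PyTorch sketch of the auxiliary-decoder idea from the abstract: a second decoder reconstructs from the same latent code and its loss is added to the usual ELBO terms, so the encoder cannot be ignored. The layer sizes, the auxiliary target (a downsampled image), and the loss weights are assumptions for illustration, not the authors' exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DuelingVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, aux_dim=49):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, 2 * z_dim))
        self.dec_main = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))
        self.dec_aux = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, aux_dim))

    def forward(self, x, x_aux, beta=1.0, aux_weight=1.0):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)          # reparameterization
        rec = F.mse_loss(self.dec_main(z), x, reduction="sum") / x.size(0)
        aux = F.mse_loss(self.dec_aux(z), x_aux, reduction="sum") / x.size(0)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
        return rec + beta * kl + aux_weight * aux                         # auxiliary decoder regularizes z

x = torch.rand(8, 784)
x_aux = F.adaptive_avg_pool2d(x.view(8, 1, 28, 28), 7).flatten(1)         # 7x7 thumbnail as aux target
loss = DuelingVAE()(x, x_aux)
loss.backward()
```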

Classification-driven Single Image Dehazing

Title Classification-driven Single Image Dehazing
Authors Yanting Pei, Yaping Huang, Xingyuan Zhang
Abstract Most existing dehazing algorithms use hand-crafted features or Convolutional Neural Network (CNN)-based methods to generate clear images with a pixel-level Mean Square Error (MSE) loss. The generated images generally have better visual appeal, but do not always perform better on high-level vision tasks, e.g. image classification. In this paper, we investigate a new point of view for addressing this problem. Instead of focusing only on achieving good quantitative performance on pixel-based metrics such as Peak Signal-to-Noise Ratio (PSNR), we also ensure that the dehazed image itself does not degrade the performance of high-level vision tasks such as image classification. To this end, we present a unified CNN architecture with three parts: a dehazing sub-network (DNet), a classification-driven Conditional Generative Adversarial Network sub-network (CCGAN), and a classification sub-network (CNet), which achieves better performance both in visual appeal and in image classification. We conduct comprehensive experiments on two challenging benchmark datasets for fine-grained and object classification: CUB-200-2011 and Caltech-256. Experimental results demonstrate that the proposed method outperforms many recent state-of-the-art single image dehazing methods in terms of both image dehazing metrics and classification accuracy.
Tasks Image Classification, Image Dehazing, Object Classification, Single Image Dehazing
Published 2019-11-21
URL https://arxiv.org/abs/1911.09389v1
PDF https://arxiv.org/pdf/1911.09389v1.pdf
PWC https://paperswithcode.com/paper/classification-driven-single-image-dehazing
Repo
Framework
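
The abstract names three training signals: a pixel-level dehazing loss (DNet), an adversarial loss from a conditional discriminator (CCGAN), and a classification loss (CNet). The sketch below combines them into one generator objective; the loss weights and the toy tensors standing in for network outputs are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

def generator_loss(dehazed, clear, disc_logits, cls_logits, labels,
                   w_adv=0.01, w_cls=0.1):
    l_pix = F.mse_loss(dehazed, clear)                        # DNet: pixel-level MSE
    l_adv = F.binary_cross_entropy_with_logits(                # CCGAN: fool the discriminator
        disc_logits, torch.ones_like(disc_logits))
    l_cls = F.cross_entropy(cls_logits, labels)                # CNet: keep images classifiable
    return l_pix + w_adv * l_adv + w_cls * l_cls

# Toy tensors standing in for sub-network outputs on a batch of 4 images.
dehazed, clear = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)
disc_logits, cls_logits = torch.randn(4, 1), torch.randn(4, 200)
labels = torch.randint(0, 200, (4,))
print(generator_loss(dehazed, clear, disc_logits, cls_logits, labels))
```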

MC-ISTA-Net: Adaptive Measurement and Initialization and Channel Attention Optimization inspired Neural Network for Compressive Sensing

Title MC-ISTA-Net: Adaptive Measurement and Initialization and Channel Attention Optimization inspired Neural Network for Compressive Sensing
Authors Nanyu Li, Cuiyin Liu
Abstract Optimization-inspired networks such as ISTA-Net+ bridge convex optimization and neural networks in Compressive Sensing (CS) reconstruction of natural images by mapping the iterative shrinkage-thresholding algorithm (ISTA) into a network. However, the measurement matrix and input initialization remain hand-crafted, and the multi-channel feature maps contain information at different frequencies that is treated equally across channels, which hinders the reconstruction ability of optimization-inspired networks. To solve these problems, we propose MC-ISTA-Net.
Tasks Compressive Sensing
Published 2019-02-26
URL https://arxiv.org/abs/1902.09878v3
PDF https://arxiv.org/pdf/1902.09878v3.pdf
PWC https://paperswithcode.com/paper/mc-ista-net-adaptive-measurement-and
Repo
Framework
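
A rough sketch of one unrolled ISTA stage with a squeeze-and-excitation style channel-attention block, approximating the ingredients named in the abstract (learned step size, soft-thresholding, per-channel reweighting). Layer sizes and the exact placement of the attention block are our assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c), nn.Sigmoid())
    def forward(self, x):                          # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))            # global average pool -> channel weights
        return x * w[:, :, None, None]

class ISTAStage(nn.Module):
    def __init__(self, c=32):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))      # learned gradient step size
        self.theta = nn.Parameter(torch.tensor(0.01))    # learned soft-threshold
        self.analysis = nn.Conv2d(1, c, 3, padding=1)
        self.synthesis = nn.Conv2d(c, 1, 3, padding=1)
        self.attn = ChannelAttention(c)

    def forward(self, x, y, Phi):
        # Gradient step on the data-fidelity term ||Phi x - y||^2 (images flattened to vectors).
        r = x.flatten(1) - self.step * (x.flatten(1) @ Phi.t() - y) @ Phi
        z = self.attn(self.analysis(r.view_as(x)))              # transform + channel attention
        z = torch.sign(z) * torch.relu(z.abs() - self.theta)    # soft-thresholding
        return self.synthesis(z)

x = torch.rand(2, 1, 16, 16)
Phi = torch.randn(64, 256) / 16.0             # 25% sampling of a 256-pixel image
y = x.flatten(1) @ Phi.t()
print(ISTAStage()(x, y, Phi).shape)
```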

Ensembles of feedforward-designed convolutional neural networks

Title Ensembles of feedforward-designed convolutional neural networks
Authors Yueru Chen, Yijing Yang, Wei Wang, C. -C. Jay Kuo
Abstract An ensemble method that fuses the output decision vectors of multiple feedforward-designed convolutional neural networks (FF-CNNs) to solve the image classification problem is proposed in this work. To enhance the performance of the ensemble system, it is critical to increase the diversity of FF-CNN models. To achieve this objective, we introduce diversity by adopting three strategies: 1) different parameter settings in convolutional layers, 2) flexible feature subsets fed into the fully-connected (FC) layers, and 3) multiple image embeddings of the same input source. Furthermore, we partition input samples into easy and hard ones based on their decision confidence scores. As a result, we can develop a new ensemble system tailored to hard samples to further boost classification accuracy. Experiments are conducted on the MNIST and CIFAR-10 datasets to demonstrate the effectiveness of the ensemble method.
Tasks Image Classification
Published 2019-01-08
URL http://arxiv.org/abs/1901.02154v1
PDF http://arxiv.org/pdf/1901.02154v1.pdf
PWC https://paperswithcode.com/paper/ensembles-of-feedforward-designed
Repo
Framework
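
A small numpy sketch of the two ensemble ideas in the abstract: averaging the decision vectors of several base models, and splitting samples into "easy" and "hard" by decision confidence so a second ensemble can focus on the hard ones. The confidence measure (top-1 probability) and the threshold are assumptions.

```python
import numpy as np

def fuse(decision_vectors):
    """decision_vectors: list of (N, n_classes) softmax outputs from FF-CNN variants."""
    return np.mean(decision_vectors, axis=0)

def split_easy_hard(fused, threshold=0.8):
    confidence = fused.max(axis=1)          # top-1 probability as confidence score
    return confidence >= threshold          # True -> easy, False -> route to hard-sample ensemble

rng = np.random.default_rng(0)
outputs = [rng.dirichlet(np.ones(10), size=100) for _ in range(3)]   # 3 base models, 100 samples
fused = fuse(outputs)
easy = split_easy_hard(fused)
print(fused.shape, easy.sum(), "easy samples")
```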

An Algorithm for Multi-Attribute Diverse Matching

Title An Algorithm for Multi-Attribute Diverse Matching
Authors Saba Ahmadi, Faez Ahmed, John P. Dickerson, Mark Fuge, Samir Khuller
Abstract Bipartite b-matching, where agents on one side of a market are matched to one or more agents or items on the other, is a classical model that is used in myriad application areas such as healthcare, advertising, education, and general resource allocation. Traditionally, the primary goal of such models is to maximize a linear function of the constituent matches (e.g., linear social welfare maximization) subject to some constraints. Recent work has studied a new goal of balancing whole-match diversity and economic efficiency, where the objective is instead a monotone submodular function over the matching. Basic versions of this problem are solvable in polynomial time. In this work, we prove that the problem of simultaneously maximizing diversity along several features (e.g., country of citizenship, gender, skills) is NP-hard. To address this problem, we develop the first combinatorial algorithm that constructs provably-optimal diverse b-matchings in pseudo-polynomial time. We also provide a Mixed-Integer Quadratic formulation for the same problem and show that our method guarantees optimal solutions and takes less computation time for a reviewer assignment application.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.03350v3
PDF https://arxiv.org/pdf/1909.03350v3.pdf
PWC https://paperswithcode.com/paper/algorithms-for-optimal-diverse-matching
Repo
Framework
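
An illustrative sketch of the kind of objective described in the abstract: a bipartite b-matching scored by a linear efficiency term plus a monotone submodular diversity term over attribute classes (here, the square root of per-class counts, a standard submodular choice). The weighting and the greedy heuristic below are our own stand-ins, not the paper's pseudo-polynomial optimal algorithm or MIQP formulation.

```python
import numpy as np

def objective(matching, weights, attrs, n_classes, lam=1.0):
    """matching: list of (agent, item) pairs; attrs[item]: attribute class of the item."""
    efficiency = sum(weights[a, i] for a, i in matching)
    counts = np.bincount([attrs[i] for _, i in matching], minlength=n_classes)
    diversity = np.sqrt(counts).sum()        # concave over counts => monotone submodular
    return efficiency + lam * diversity

rng = np.random.default_rng(0)
weights = rng.uniform(size=(4, 6))           # 4 agents, 6 items
attrs = rng.integers(0, 3, size=6)           # each item carries one of 3 attribute classes

# Greedy stand-in: each agent picks the unmatched item that most improves the objective (b = 1).
matching, taken = [], set()
for a in range(4):
    best = max((i for i in range(6) if i not in taken),
               key=lambda i: objective(matching + [(a, i)], weights, attrs, 3))
    matching.append((a, best)); taken.add(best)
print(matching, round(objective(matching, weights, attrs, 3), 3))
```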

Neural Markov Logic Networks

Title Neural Markov Logic Networks
Authors Giuseppe Marra, Ondřej Kuželka
Abstract We introduce Neural Markov Logic Networks (NMLNs), a statistical relational learning system that borrows ideas from Markov logic. Like Markov Logic Networks (MLNs), NMLNs are an exponential-family model for modelling distributions over possible worlds, but unlike MLNs, they do not rely on explicitly specified first-order logic rules. Instead, NMLNs learn an implicit representation of such rules as a neural network that acts as a potential function on fragments of the relational structure. Interestingly, any MLN can be represented as an NMLN. Similarly to the recently proposed Neural Theorem Provers (NTPs) [Rocktäschel and Riedel, 2017], NMLNs can exploit embeddings of constants but, unlike NTPs, they also work well in their absence. This is extremely important for predicting in settings other than the transductive one. We showcase the potential of NMLNs on knowledge-base completion tasks and on the generation of molecular (graph) data.
Tasks Knowledge Base Completion, Relational Reasoning
Published 2019-05-31
URL https://arxiv.org/abs/1905.13462v2
PDF https://arxiv.org/pdf/1905.13462v2.pdf
PWC https://paperswithcode.com/paper/neural-markov-logic-networks
Repo
Framework
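
A hedged toy of the core idea stated in the abstract: the unnormalized log-probability of a relational structure is a sum (here an average) of a learned neural potential evaluated on small fragments of that structure. The fragment size, the sampling scheme, and the MLP potential are illustrative choices only.

```python
import torch
import torch.nn as nn

class FragmentPotential(nn.Module):
    def __init__(self, k=3):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(k * k, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, adj, n_samples=50):
        n = adj.shape[0]
        scores = []
        for _ in range(n_samples):
            idx = torch.randperm(n)[: self.k]         # sample a k-node fragment
            frag = adj[idx][:, idx].reshape(1, -1)    # its induced adjacency pattern
            scores.append(self.mlp(frag))
        return torch.cat(scores).mean()               # unnormalized log-potential of the world

adj = (torch.rand(10, 10) < 0.3).float()              # a possible world: one binary relation
adj = torch.triu(adj, 1); adj = adj + adj.t()         # symmetric, no self-loops
print(FragmentPotential()(adj))
```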

Grids versus Graphs: Partitioning Space for Improved Taxi Demand-Supply Forecasts

Title Grids versus Graphs: Partitioning Space for Improved Taxi Demand-Supply Forecasts
Authors Neema Davis, Gaurav Raina, Krishna Jagannathan
Abstract Accurate taxi demand-supply forecasting is a challenging application of ITS (Intelligent Transportation Systems), due to complex spatial and temporal patterns. We investigate the impact of different spatial partitioning techniques on the prediction performance of an LSTM (Long Short-Term Memory) network, in the context of taxi demand-supply forecasting. We consider two tessellation schemes: (i) the variable-sized Voronoi tessellation, and (ii) the fixed-sized Geohash tessellation. While the widely employed ConvLSTM (Convolutional LSTM) can model fixed-sized Geohash partitions, standard convolutional filters cannot be applied to the variable-sized Voronoi partitions. To explore the Voronoi tessellation scheme, we propose the use of GraphLSTM (Graph-based LSTM), representing the Voronoi spatial partitions as nodes on an arbitrarily structured graph. The GraphLSTM offers competitive performance against ConvLSTM, at lower computational complexity, across three real-world large-scale taxi demand-supply data sets, with different performance metrics. To ensure superior performance across diverse settings, a HEDGE-based ensemble learning algorithm is applied over the ConvLSTM and GraphLSTM networks.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06515v1
PDF http://arxiv.org/pdf/1902.06515v1.pdf
PWC https://paperswithcode.com/paper/grids-versus-graphs-partitioning-space-for
Repo
Framework
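
A toy sketch contrasting the two partitioning schemes from the abstract: fixed-size grid cells (a stand-in for Geohash) versus variable-size Voronoi cells defined by nearest demand centers. The coordinates, cell size, and centers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.uniform([12.90, 77.50], [13.10, 77.70], size=(1000, 2))   # (lat, lon) pickups
centers = rng.uniform([12.90, 77.50], [13.10, 77.70], size=(20, 2))    # Voronoi seeds

# Fixed grid: quantize coordinates to a regular lattice (Geohash-like buckets).
cell_size = 0.02
grid_ids = np.floor((points - points.min(axis=0)) / cell_size).astype(int)
grid_keys = grid_ids[:, 0] * 1000 + grid_ids[:, 1]

# Voronoi: each point belongs to its nearest center; cell sizes vary with seed placement.
dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
voronoi_keys = dists.argmin(axis=1)

print(len(np.unique(grid_keys)), "grid cells;", len(np.unique(voronoi_keys)), "Voronoi cells")
```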

Learning to Tune XGBoost with XGBoost

Title Learning to Tune XGBoost with XGBoost
Authors Johanna Sommer, Dimitrios Sarigiannis, Thomas Parnell
Abstract In this short paper we investigate whether meta-learning techniques can be used to more effectively tune the hyperparameters of machine learning models using successive halving (SH). We propose a novel variant of the SH algorithm (MeSH), that uses meta-regressors to determine which candidate configurations should be eliminated at each round. We apply MeSH to the problem of tuning the hyperparameters of a gradient-boosted decision tree model. By training and tuning our meta-regressors using existing tuning jobs from 95 datasets, we demonstrate that MeSH can often find a superior solution to both SH and random search.
Tasks Meta-Learning
Published 2019-09-16
URL https://arxiv.org/abs/1909.07218v4
PDF https://arxiv.org/pdf/1909.07218v4.pdf
PWC https://paperswithcode.com/paper/learning-to-tune-xgboost-with-xgboost
Repo
Framework
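
A compact sketch of successive halving where, following the MeSH idea described above, a meta-regressor (rather than the raw intermediate score) decides which configurations survive each round. The `train_for` routine, the meta-regressor, and the budget schedule are placeholders for illustration.

```python
import numpy as np

def successive_halving(configs, train_for, meta_predict, budget=1, eta=2, rounds=3):
    survivors = list(configs)
    histories = {id(c): [] for c in survivors}
    for _ in range(rounds):
        for c in survivors:
            histories[id(c)].append(train_for(c, budget))       # validation score at this budget
        # Meta-regressor predicts the *final* score from the partial learning curve.
        predicted = [meta_predict(c, histories[id(c)]) for c in survivors]
        keep = max(1, len(survivors) // eta)
        order = np.argsort(predicted)[::-1][:keep]
        survivors = [survivors[i] for i in order]
        budget *= eta
    return survivors[0]

rng = np.random.default_rng(0)
configs = [{"learning_rate": lr} for lr in rng.uniform(0.01, 0.3, size=8)]
train_for = lambda c, b: 1 - np.exp(-b * c["learning_rate"]) + rng.normal(0, 0.01)
meta_predict = lambda c, hist: hist[-1] + 0.5 * (hist[-1] - hist[0]) if len(hist) > 1 else hist[-1]
print(successive_halving(configs, train_for, meta_predict))
```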

A witness function based construction of discriminative models using Hermite polynomials

Title A witness function based construction of discriminative models using Hermite polynomials
Authors H. N. Mhaskar, A. Cloninger, X. Cheng
Abstract In machine learning, we are given a dataset of the form $\{(\mathbf{x}_j, y_j)\}_{j=1}^M$, drawn as i.i.d. samples from an unknown probability distribution $\mu$; the marginal distribution for the $\mathbf{x}_j$'s being $\mu^*$. We propose that rather than using a positive kernel such as the Gaussian for estimation of these measures, using a non-positive kernel that preserves a large number of moments of these measures yields an optimal approximation. We use multi-variate Hermite polynomials for this purpose, and prove optimal and local approximation results in a supremum norm in a probabilistic sense. Together with a permutation test developed with the same kernel, we prove that the kernel estimator serves as a 'witness function' in classification problems. Thus, if the value of this estimator at a point $\mathbf{x}$ exceeds a certain threshold, then the point is reliably in a certain class. This approach can be used to modify pretrained algorithms, such as neural networks or nonlinear dimension reduction techniques, to identify in-class vs out-of-class regions for the purposes of generative models, classification uncertainty, or finding robust centroids. This fact is demonstrated in a number of real world data sets including MNIST, CIFAR10, Science News documents, and LaLonde data sets.
Tasks Dimensionality Reduction
Published 2019-01-10
URL http://arxiv.org/abs/1901.02975v1
PDF http://arxiv.org/pdf/1901.02975v1.pdf
PWC https://paperswithcode.com/paper/a-witness-function-based-construction-of
Repo
Framework
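
A hedged illustration of the witness-function idea: estimate class-conditional kernel sums and call a point "reliably in class 1" only when their difference exceeds a threshold. For simplicity this uses a Gaussian kernel; the paper's construction uses a (possibly non-positive) Hermite-polynomial kernel with optimal localization, which is not reproduced here.

```python
import numpy as np

def kernel_estimate(x, samples, h=0.5):
    d = x[:, None, :] - samples[None, :, :]
    return np.exp(-0.5 * np.sum(d**2, axis=2) / h**2).mean(axis=1)

def witness(x, class1, class0, h=0.5):
    return kernel_estimate(x, class1, h) - kernel_estimate(x, class0, h)

rng = np.random.default_rng(0)
class1 = rng.normal(+1.0, 0.7, size=(200, 2))
class0 = rng.normal(-1.0, 0.7, size=(200, 2))
query = np.array([[1.2, 0.9], [0.0, 0.0], [-1.1, -0.8]])
w = witness(query, class1, class0)
print(np.where(w > 0.2, "class 1", np.where(w < -0.2, "class 0", "uncertain")))
```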

Revisiting clustering as matrix factorisation on the Stiefel manifold

Title Revisiting clustering as matrix factorisation on the Stiefel manifold
Authors Stéphane Chrétien, Benjamin Guedj
Abstract This paper studies clustering for possibly high dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well known Burer-Monteiro factorisation strategy from large scale optimisation, in the context of low rank estimation. Moreover, our Burer-Monteiro factors are shown to lie on a Stiefel manifold. We propose a new generalized Bayesian estimator for this problem and prove novel prediction bounds for clustering. We also devise a componentwise Langevin sampler on the Stiefel manifold to compute this estimator.
Tasks Time Series
Published 2019-03-11
URL http://arxiv.org/abs/1903.04479v1
PDF http://arxiv.org/pdf/1903.04479v1.pdf
PWC https://paperswithcode.com/paper/revisiting-clustering-as-matrix-factorisation
Repo
Framework
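
A minimal sketch of the geometry mentioned in the abstract: a low rank factor with orthonormal columns (a point on the Stiefel manifold) fitted to a similarity matrix by gradient ascent on trace(UᵀMU) with a QR retraction after each step. This is only an illustration of the Stiefel constraint; the paper's estimator is PAC-Bayesian and is computed with a Langevin sampler, neither of which is reproduced here.

```python
import numpy as np

def dominant_subspace_stiefel(M, k, steps=100, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((M.shape[0], k)))   # start on the Stiefel manifold
    for _ in range(steps):
        U, _ = np.linalg.qr(U + lr * 2 * M @ U)   # ascent on trace(U^T M U), then QR retraction
    return U

# Block-structured similarity matrix with two clusters of 5 points each.
M = np.kron(np.eye(2), np.ones((5, 5)))
U = dominant_subspace_stiefel(M, k=2)
print(np.trace(U.T @ M @ U))   # ~10, the sum of the two leading eigenvalues (5 + 5)
```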

Supervised Hierarchical Clustering with Exponential Linkage

Title Supervised Hierarchical Clustering with Exponential Linkage
Authors Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum
Abstract In supervised clustering, standard techniques for learning a pairwise dissimilarity function often suffer from a discrepancy between the training and clustering objectives, leading to poor cluster quality. Rectifying this discrepancy necessitates matching the procedure for training the dissimilarity function to the clustering algorithm. In this paper, we introduce a method for training the dissimilarity function in a way that is tightly coupled with hierarchical clustering, in particular single linkage. However, the appropriate clustering algorithm for a given dataset is often unknown. Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function. We accomplish this with a novel Exponential Linkage function that has a learnable parameter that controls the interpolation. In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.07859v1
PDF https://arxiv.org/pdf/1906.07859v1.pdf
PWC https://paperswithcode.com/paper/supervised-hierarchical-clustering-with
Repo
Framework
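
A small numpy sketch of an exponential-style linkage: a softmax-weighted average of the pairwise dissimilarities between two clusters, whose temperature alpha interpolates between single linkage (alpha → −∞, the minimum), average linkage (alpha = 0), and complete linkage (alpha → +∞, the maximum). This follows the interpolation behaviour described in the abstract; the exact parameterization in the paper may differ.

```python
import numpy as np

def exp_linkage(d_pairs, alpha):
    """d_pairs: all pairwise dissimilarities between points of two clusters."""
    d = np.asarray(d_pairs, dtype=float)
    w = np.exp(alpha * d - (alpha * d).max())     # numerically stable softmax weights
    return float((w * d).sum() / w.sum())

pair_dists = [0.2, 0.5, 0.9, 1.4]
print(exp_linkage(pair_dists, alpha=-50))   # ~0.2  (single linkage)
print(exp_linkage(pair_dists, alpha=0))     # 0.75  (average linkage)
print(exp_linkage(pair_dists, alpha=+50))   # ~1.4  (complete linkage)
```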

Orthogonal Estimation of Wasserstein Distances

Title Orthogonal Estimation of Wasserstein Distances
Authors Mark Rowland, Jiri Hron, Yunhao Tang, Krzysztof Choromanski, Tamas Sarlos, Adrian Weller
Abstract Wasserstein distances are increasingly used in a wide variety of applications in machine learning. Sliced Wasserstein distances form an important subclass which may be estimated efficiently through one-dimensional sorting operations. In this paper, we propose a new variant of sliced Wasserstein distance, study the use of orthogonal coupling in Monte Carlo estimation of Wasserstein distances and draw connections with stratified sampling, and evaluate our approaches experimentally in a range of large-scale experiments in generative modelling and reinforcement learning.
Tasks
Published 2019-03-09
URL http://arxiv.org/abs/1903.03784v2
PDF http://arxiv.org/pdf/1903.03784v2.pdf
PWC https://paperswithcode.com/paper/orthogonal-estimation-of-wasserstein
Repo
Framework
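
A minimal numpy sketch of sliced Wasserstein estimation: project both samples onto random directions, sort, and average the one-dimensional Wasserstein-1 distances. The "orthogonal" variant draws the directions as orthonormal blocks (via QR) instead of i.i.d. Gaussians, mirroring the orthogonal-coupling idea in the abstract; the paper's estimators are more general than this sketch.

```python
import numpy as np

def sliced_w1(X, Y, n_proj=64, orthogonal=True, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    if orthogonal:
        # Directions drawn as orthonormal blocks of size <= d (QR of a Gaussian block).
        blocks = [np.linalg.qr(rng.standard_normal((d, min(d, n_proj - i))))[0]
                  for i in range(0, n_proj, d)]
        G = np.hstack(blocks)
    else:
        G = rng.standard_normal((d, n_proj))
        G /= np.linalg.norm(G, axis=0, keepdims=True)
    px, py = np.sort(X @ G, axis=0), np.sort(Y @ G, axis=0)   # 1D optimal transport = sorting
    return np.abs(px - py).mean()

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(500, 8))
Y = rng.normal(0.5, 1.0, size=(500, 8))
print(sliced_w1(X, Y, orthogonal=True), sliced_w1(X, Y, orthogonal=False))
```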

2nd Place Solution to the GQA Challenge 2019

Title 2nd Place Solution to the GQA Challenge 2019
Authors Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas
Abstract We present a simple method that achieves unexpectedly superior performance for Visual Question Answering involving complex reasoning. Our solution collects statistical features from the high-frequency words of all the questions asked about an image and uses them as accurate knowledge for answering further questions about the same image. We are fully aware that this setting is not ubiquitously applicable, and in a more common setting one should assume the questions are asked separately and cannot be gathered to obtain a knowledge base. Nonetheless, we use this method as evidence for our observation that the bottleneck effect is more severe in the feature extraction part than in the knowledge reasoning part. We show significant gaps when using the same reasoning model with 1) ground-truth features; 2) statistical features; 3) detected features from completely learned detectors, and analyze what these gaps mean for research on visual reasoning topics. Our model with the statistical features achieves 2nd place in the GQA Challenge 2019.
Tasks Question Answering, Visual Question Answering, Visual Reasoning
Published 2019-07-16
URL https://arxiv.org/abs/1907.06794v2
PDF https://arxiv.org/pdf/1907.06794v2.pdf
PWC https://paperswithcode.com/paper/2nd-place-solution-to-the-gqa-challenge-2019
Repo
Framework
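
A toy sketch of the statistical-feature idea from the abstract: pool all questions asked about one image, count the high-frequency content words, and use the counts as a crude "knowledge" vector when answering further questions about that image. The tokenization, stop-word list, and example questions are invented here.

```python
from collections import Counter

STOP = {"is", "the", "a", "what", "on", "of", "there", "in", "color"}

def image_knowledge(questions, top_k=5):
    words = [w for q in questions for w in q.lower().replace("?", "").split() if w not in STOP]
    return Counter(words).most_common(top_k)

questions = [
    "What color is the dog on the couch?",
    "Is there a dog in the picture?",
    "Is the couch next to the window?",
]
print(image_knowledge(questions))   # e.g. [('dog', 2), ('couch', 2), ...]
```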

A New Malware Detection System Using a High Performance-ELM method

Title A New Malware Detection System Using a High Performance-ELM method
Authors Shahab Shamshirband, Anthony T. Chronopoulos
Abstract A vital element of cyberspace infrastructure is cybersecurity. Many protocols have been proposed to address security issues, yet anomalies still affect the related cyberspace infrastructure. Machine learning (ML) methods are used to mitigate anomalous behavior in mobile devices. This paper applies a High Performance Extreme Learning Machine (HP-ELM) to detect possible anomalies in two malware datasets. Two widely used datasets (CTU-13 and Malware) are used to test the effectiveness of HP-ELM, and extensive comparisons are carried out to validate the effectiveness of the HP-ELM learning method. The experimental results demonstrate that HP-ELM achieved the highest accuracy of 0.9592 using the top 3 features with one activation function.
Tasks Malware Detection
Published 2019-06-27
URL https://arxiv.org/abs/1906.12198v1
PDF https://arxiv.org/pdf/1906.12198v1.pdf
PWC https://paperswithcode.com/paper/a-new-malware-detection-system-using-a-high
Repo
Framework
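
A bare-bones extreme learning machine (ELM) sketch for a binary anomaly/malware label: random, untrained hidden-layer weights plus a least-squares solve for the output weights. The feature dimension, hidden size, and activation are illustrative, and the HP-ELM toolbox used in the paper adds many refinements not shown here.

```python
import numpy as np

class ELM:
    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)          # fixed random projection + activation

    def fit(self, X, y):
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # only the output layer is "trained"
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta > 0.5).astype(int)

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 3))                    # e.g. the top-3 selected flow features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)      # synthetic benign/malicious label
model = ELM().fit(X, y)
print((model.predict(X) == y).mean())
```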