Paper Group ANR 516
Exploring the 3D architectures of deep material network in data-driven multiscale mechanics. Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces. Classification-driven Single Image Dehazing. MC-ISTA-Net: Adaptive Measurement and Initialization and Channel Attention Optimization inspired Neural Network for Compressive Sensing. Ensem …
Exploring the 3D architectures of deep material network in data-driven multiscale mechanics
Title | Exploring the 3D architectures of deep material network in data-driven multiscale mechanics |
Authors | Zeliang Liu, C. T. Wu |
Abstract | This paper extends the deep material network (DMN) proposed by Liu et al. (2019) to tackle general 3-dimensional (3D) problems with arbitrary material and geometric nonlinearities. It discovers a new way of describing multiscale heterogeneous materials by a multi-layer network structure and mechanistic building blocks. The data-driven framework of DMN is discussed in detail, covering the offline training and online extrapolation stages. Analytical solutions of the 3D building block with a two-layer structure in both small- and finite-strain formulations are derived based on interfacial equilibrium conditions and kinematic constraints. With linear elastic data generated by direct numerical simulations on a representative volume element (RVE), the network can be effectively trained in the offline stage using stochastic gradient descent and advanced model compression algorithms. The efficiency and accuracy of DMN in addressing the long-standing 3D RVE challenges with complex morphologies and material laws are validated through numerical experiments, including 1) hyperelastic particle-reinforced rubber composite with Mullins effect; 2) polycrystalline materials with rate-dependent crystal plasticity; 3) carbon fiber reinforced polymer (CFRP) composites with fiber anisotropic elasticity and matrix plasticity. In particular, we demonstrate a three-scale homogenization procedure for the CFRP system by concatenating the microscale and mesoscale material networks. The complete learning and extrapolation procedures of DMN establish a reliable data-driven framework for multiscale material modeling and design. |
Tasks | Model Compression |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.04832v2 |
http://arxiv.org/pdf/1901.04832v2.pdf | |
PWC | https://paperswithcode.com/paper/exploring-the-3d-architectures-of-deep |
Repo | |
Framework | |
Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces
Title | Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces |
Authors | Bryan Seybold, Emily Fertig, Alex Alemi, Ian Fischer |
Abstract | Variational autoencoders learn unsupervised data representations, but these models frequently converge to minima that fail to preserve meaningful semantic information. For example, variational autoencoders with autoregressive decoders often collapse into autodecoders, where they learn to ignore the encoder input. In this work, we demonstrate that adding an auxiliary decoder to regularize the latent space can prevent this collapse, but successful auxiliary decoding tasks are domain dependent. Auxiliary decoders can increase the amount of semantic information encoded in the latent space and visible in the reconstructions. The semantic information in the variational autoencoder’s representation is only weakly correlated with its rate, distortion, or evidence lower bound. Compared to other popular strategies that modify the training objective, our regularization of the latent space generally increased the semantic information content. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07478v1 |
https://arxiv.org/pdf/1905.07478v1.pdf | |
PWC | https://paperswithcode.com/paper/dueling-decoders-regularizing-variational |
Repo | |
Framework | |
Classification-driven Single Image Dehazing
Title | Classification-driven Single Image Dehazing |
Authors | Yanting Pei, Yaping Huang, Xingyuan Zhang |
Abstract | Most existing dehazing algorithms use hand-crafted features or Convolutional Neural Networks (CNN)-based methods to generate clear images using a pixel-level Mean Square Error (MSE) loss. The generated images generally have better visual appeal, but do not always perform better on high-level vision tasks, e.g., image classification. In this paper, we investigate a new point of view in addressing this problem. Instead of focusing only on achieving good quantitative performance on pixel-based metrics such as Peak Signal to Noise Ratio (PSNR), we also ensure that the dehazed image itself does not degrade the performance of high-level vision tasks such as image classification. To this end, we present a unified CNN architecture that includes three parts: a dehazing sub-network (DNet), a classification-driven Conditional Generative Adversarial Networks sub-network (CCGAN) and a classification sub-network (CNet) related to image classification, which achieves better performance in both visual appeal and image classification. We conduct comprehensive experiments on two challenging benchmark datasets for fine-grained and object classification: CUB-200-2011 and Caltech-256. Experimental results demonstrate that the proposed method outperforms many recent state-of-the-art single image dehazing methods in terms of image dehazing metrics and classification accuracy. |
Tasks | Image Classification, Image Dehazing, Object Classification, Single Image Dehazing |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09389v1 |
https://arxiv.org/pdf/1911.09389v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-driven-single-image-dehazing |
Repo | |
Framework | |
MC-ISTA-Net: Adaptive Measurement and Initialization and Channel Attention Optimization inspired Neural Network for Compressive Sensing
Title | MC-ISTA-Net: Adaptive Measurement and Initialization and Channel Attention Optimization inspired Neural Network for Compressive Sensing |
Authors | Nanyu Li, Cuiyin Liu |
Abstract | Optimization-inspired networks such as ISTA-Net+, which unrolls the iterative shrinkage-thresholding algorithm (ISTA) into a network, can bridge convex optimization and neural networks in Compressive Sensing (CS) reconstruction of natural images. However, the measurement matrix and input initialization are still hand-crafted, and the multi-channel feature maps contain information at different frequencies that is treated equally across channels, hindering the CS reconstruction ability of optimization-inspired networks. To solve these problems, we propose MC-ISTA-Net. |
Tasks | Compressive Sensing |
Published | 2019-02-26 |
URL | https://arxiv.org/abs/1902.09878v3 |
https://arxiv.org/pdf/1902.09878v3.pdf | |
PWC | https://paperswithcode.com/paper/mc-ista-net-adaptive-measurement-and |
Repo | |
Framework | |
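For reference, the plain ISTA iteration that ISTA-Net-style models unroll — a gradient step on the data term followed by soft-thresholding — can be sketched as below. This is a textbook sketch of ISTA on a toy sparse-recovery problem, not the MC-ISTA-Net architecture itself; the problem sizes and the regularization weight `lam` are illustrative choices.

```python
import numpy as np

def soft_threshold(x, tau):
    # Proximal operator of the l1 norm (the "shrinkage-thresholding" step).
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, y, lam=0.01, n_iter=2000):
    # Minimize 0.5*||Ax - y||^2 + lam*||x||_1 by alternating a gradient
    # step on the data term with soft-thresholding.
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Recover a 3-sparse vector from 50 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100)) / np.sqrt(50)
x_true = np.zeros(100)
x_true[[3, 20, 77]] = [1.0, -2.0, 1.5]
x_hat = ista(A, A @ x_true)
```

ISTA-Net+ learns the transform and thresholds of this iteration end-to-end; MC-ISTA-Net additionally learns the measurement matrix, the initialization, and channel-attention weights.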
Ensembles of feedforward-designed convolutional neural networks
Title | Ensembles of feedforward-designed convolutional neural networks |
Authors | Yueru Chen, Yijing Yang, Wei Wang, C. -C. Jay Kuo |
Abstract | An ensemble method that fuses the output decision vectors of multiple feedforward-designed convolutional neural networks (FF-CNNs) to solve the image classification problem is proposed in this work. To enhance the performance of the ensemble system, it is critical to increase the diversity of FF-CNN models. To achieve this objective, we introduce diversities by adopting three strategies: 1) different parameter settings in convolutional layers, 2) flexible feature subsets fed into the Fully-connected (FC) layers, and 3) multiple image embeddings of the same input source. Furthermore, we partition input samples into easy and hard ones based on their decision confidence scores. As a result, we can develop a new ensemble system tailored to hard samples to further boost classification accuracy. Experiments are conducted on the MNIST and CIFAR-10 datasets to demonstrate the effectiveness of the ensemble method. |
Tasks | Image Classification |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02154v1 |
http://arxiv.org/pdf/1901.02154v1.pdf | |
PWC | https://paperswithcode.com/paper/ensembles-of-feedforward-designed |
Repo | |
Framework | |
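A minimal sketch of the decision-vector fusion step: plain averaging of per-model probability vectors followed by an argmax. The paper's actual fusion and its confidence-based easy/hard partitioning are more involved; the toy arrays below are illustrative stand-ins, not FF-CNN outputs.

```python
import numpy as np

def fuse_decisions(decision_vectors):
    # Average the decision (probability) vectors of the ensemble members,
    # then pick the class with the highest fused score.
    fused = np.mean(decision_vectors, axis=0)
    return fused.argmax(axis=1)

# Three toy models: they agree on sample 0 and disagree on sample 1,
# where the averaged vote settles the prediction.
m1 = np.array([[0.9, 0.1], [0.4, 0.6]])
m2 = np.array([[0.8, 0.2], [0.7, 0.3]])
m3 = np.array([[0.7, 0.3], [0.2, 0.8]])
labels = fuse_decisions([m1, m2, m3])
```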
An Algorithm for Multi-Attribute Diverse Matching
Title | An Algorithm for Multi-Attribute Diverse Matching |
Authors | Saba Ahmadi, Faez Ahmed, John P. Dickerson, Mark Fuge, Samir Khuller |
Abstract | Bipartite b-matching, where agents on one side of a market are matched to one or more agents or items on the other, is a classical model that is used in myriad application areas such as healthcare, advertising, education, and general resource allocation. Traditionally, the primary goal of such models is to maximize a linear function of the constituent matches (e.g., linear social welfare maximization) subject to some constraints. Recent work has studied a new goal of balancing whole-match diversity and economic efficiency, where the objective is instead a monotone submodular function over the matching. Basic versions of this problem are solvable in polynomial time. In this work, we prove that the problem of simultaneously maximizing diversity along several features (e.g., country of citizenship, gender, skills) is NP-hard. To address this problem, we develop the first combinatorial algorithm that constructs provably-optimal diverse b-matchings in pseudo-polynomial time. We also provide a Mixed-Integer Quadratic formulation for the same problem and show that our method guarantees optimal solutions and takes less computation time for a reviewer assignment application. |
Tasks | |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03350v3 |
https://arxiv.org/pdf/1909.03350v3.pdf | |
PWC | https://paperswithcode.com/paper/algorithms-for-optimal-diverse-matching |
Repo | |
Framework | |
Neural Markov Logic Networks
Title | Neural Markov Logic Networks |
Authors | Giuseppe Marra, Ondřej Kuželka |
Abstract | We introduce Neural Markov Logic Networks (NMLNs), a statistical relational learning system that borrows ideas from Markov logic. Like Markov Logic Networks (MLNs), NMLNs are an exponential-family model for modelling distributions over possible worlds, but unlike MLNs, they do not rely on explicitly specified first-order logic rules. Instead, NMLNs learn an implicit representation of such rules as a neural network that acts as a potential function on fragments of the relational structure. Interestingly, any MLN can be represented as an NMLN. Similarly to recently proposed Neural theorem provers (NTPs) [Rocktäschel and Riedel, 2017], NMLNs can exploit embeddings of constants but, unlike NTPs, NMLNs work well also in their absence. This is extremely important for predicting in settings other than the transductive one. We showcase the potential of NMLNs on knowledge-base completion tasks and on generation of molecular (graph) data. |
Tasks | Knowledge Base Completion, Relational Reasoning |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13462v2 |
https://arxiv.org/pdf/1905.13462v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-markov-logic-networks |
Repo | |
Framework | |
Grids versus Graphs: Partitioning Space for Improved Taxi Demand-Supply Forecasts
Title | Grids versus Graphs: Partitioning Space for Improved Taxi Demand-Supply Forecasts |
Authors | Neema Davis, Gaurav Raina, Krishna Jagannathan |
Abstract | Accurate taxi demand-supply forecasting is a challenging application of ITS (Intelligent Transportation Systems), due to the complex spatial and temporal patterns. We investigate the impact of different spatial partitioning techniques on the prediction performance of an LSTM (Long Short-Term Memory) network, in the context of taxi demand-supply forecasting. We consider two tessellation schemes: (i) the variable-sized Voronoi tessellation, and (ii) the fixed-sized Geohash tessellation. While the widely employed ConvLSTM (Convolutional LSTM) can model fixed-sized Geohash partitions, the standard convolutional filters cannot be applied on the variable-sized Voronoi partitions. To explore the Voronoi tessellation scheme, we propose the use of GraphLSTM (Graph-based LSTM), by representing the Voronoi spatial partitions as nodes on an arbitrarily structured graph. The GraphLSTM offers competitive performance against ConvLSTM, at lower computational complexity, across three real-world large-scale taxi demand-supply data sets, with different performance metrics. To ensure superior performance across diverse settings, a HEDGE-based ensemble learning algorithm is applied over the ConvLSTM and the GraphLSTM networks. |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06515v1 |
http://arxiv.org/pdf/1902.06515v1.pdf | |
PWC | https://paperswithcode.com/paper/grids-versus-graphs-partitioning-space-for |
Repo | |
Framework | |
Learning to Tune XGBoost with XGBoost
Title | Learning to Tune XGBoost with XGBoost |
Authors | Johanna Sommer, Dimitrios Sarigiannis, Thomas Parnell |
Abstract | In this short paper we investigate whether meta-learning techniques can be used to more effectively tune the hyperparameters of machine learning models using successive halving (SH). We propose a novel variant of the SH algorithm (MeSH) that uses meta-regressors to determine which candidate configurations should be eliminated at each round. We apply MeSH to the problem of tuning the hyperparameters of a gradient-boosted decision tree model. By training and tuning our meta-regressors using existing tuning jobs from 95 datasets, we demonstrate that MeSH can often find a superior solution to both SH and random search. |
Tasks | Meta-Learning |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07218v4 |
https://arxiv.org/pdf/1909.07218v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-tune-xgboost-with-xgboost |
Repo | |
Framework | |
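The baseline successive-halving loop that MeSH modifies can be sketched as follows. The `evaluate` objective and the budget schedule here are toy stand-ins; MeSH would replace the score-based ranking with meta-regressor predictions of final performance.

```python
def successive_halving(configs, evaluate, budgets=(1, 3, 9), keep=0.5):
    # Evaluate every surviving configuration at the current budget, keep the
    # best-scoring fraction, and repeat with a larger budget.
    survivors = list(configs)
    for budget in budgets:
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = scored[:max(1, int(len(scored) * keep))]
    return survivors[0]

# Toy objective (lower is better): optimum at 0.3, with a budget-dependent
# offset that shrinks as the budget grows.
def evaluate(config, budget):
    return abs(config - 0.3) + 1.0 / budget

best = successive_halving([i / 10 for i in range(10)], evaluate)
```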
A witness function based construction of discriminative models using Hermite polynomials
Title | A witness function based construction of discriminative models using Hermite polynomials |
Authors | H. N. Mhaskar, A. Cloninger, X. Cheng |
Abstract | In machine learning, we are given a dataset of the form $\{(\mathbf{x}_j, y_j)\}_{j=1}^M$, drawn as i.i.d. samples from an unknown probability distribution $\mu$; the marginal distribution for the $\mathbf{x}_j$'s being $\mu^*$. We propose that rather than using a positive kernel such as the Gaussian for estimation of these measures, using a non-positive kernel that preserves a large number of moments of these measures yields an optimal approximation. We use multi-variate Hermite polynomials for this purpose, and prove optimal and local approximation results in a supremum norm in a probabilistic sense. Together with a permutation test developed with the same kernel, we prove that the kernel estimator serves as a "witness function" in classification problems. Thus, if the value of this estimator at a point $\mathbf{x}$ exceeds a certain threshold, then the point is reliably in a certain class. This approach can be used to modify pretrained algorithms, such as neural networks or nonlinear dimension reduction techniques, to identify in-class vs out-of-class regions for the purposes of generative models, classification uncertainty, or finding robust centroids. This fact is demonstrated in a number of real world data sets including MNIST, CIFAR10, Science News documents, and LaLonde data sets. |
Tasks | Dimensionality Reduction |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.02975v1 |
http://arxiv.org/pdf/1901.02975v1.pdf | |
PWC | https://paperswithcode.com/paper/a-witness-function-based-construction-of |
Repo | |
Framework | |
Revisiting clustering as matrix factorisation on the Stiefel manifold
Title | Revisiting clustering as matrix factorisation on the Stiefel manifold |
Authors | Stéphane Chrétien, Benjamin Guedj |
Abstract | This paper studies clustering for possibly high dimensional data (e.g., images, time series, gene expression data, and many other settings), and rephrases it as low rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well known Burer-Monteiro factorisation strategy from large scale optimisation, in the context of low rank estimation. Moreover, our Burer-Monteiro factors are shown to lie on a Stiefel manifold. We propose a new generalized Bayesian estimator for this problem and prove novel prediction bounds for clustering. We also devise a componentwise Langevin sampler on the Stiefel manifold to compute this estimator. |
Tasks | Time Series |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04479v1 |
http://arxiv.org/pdf/1903.04479v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-clustering-as-matrix-factorisation |
Repo | |
Framework | |
Supervised Hierarchical Clustering with Exponential Linkage
Title | Supervised Hierarchical Clustering with Exponential Linkage |
Authors | Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum |
Abstract | In supervised clustering, standard techniques for learning a pairwise dissimilarity function often suffer from a discrepancy between the training and clustering objectives, leading to poor cluster quality. Rectifying this discrepancy necessitates matching the procedure for training the dissimilarity function to the clustering algorithm. In this paper, we introduce a method for training the dissimilarity function in a way that is tightly coupled with hierarchical clustering, in particular single linkage. However, the appropriate clustering algorithm for a given dataset is often unknown. Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function. We accomplish this with a novel Exponential Linkage function that has a learnable parameter that controls the interpolation. In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07859v1 |
https://arxiv.org/pdf/1906.07859v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-hierarchical-clustering-with |
Repo | |
Framework | |
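One plausible reading of the interpolating linkage is a softmax-weighted mean of the pairwise dissimilarities between two clusters, with the learnable parameter controlling the weighting. The exact parameterization in the paper may differ, so treat this as an illustrative sketch:

```python
import numpy as np

def exp_linkage(pair_dists, alpha):
    # Softmax-weighted mean of the pairwise dissimilarities between two
    # clusters. alpha -> -inf recovers single linkage (min), alpha = 0
    # gives average linkage, and alpha -> +inf gives complete linkage (max).
    d = np.asarray(pair_dists, dtype=float)
    s = alpha * d
    w = np.exp(s - s.max())        # shift exponents for numerical stability
    return float((w * d).sum() / w.sum())

d = [1.0, 2.0, 4.0]
single   = exp_linkage(d, -50.0)   # close to min(d)
average  = exp_linkage(d, 0.0)     # plain mean of d
complete = exp_linkage(d, 50.0)    # close to max(d)
```

Making `alpha` a trainable parameter lets gradient-based training pick the point on the single-average-complete spectrum jointly with the dissimilarity function.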
Orthogonal Estimation of Wasserstein Distances
Title | Orthogonal Estimation of Wasserstein Distances |
Authors | Mark Rowland, Jiri Hron, Yunhao Tang, Krzysztof Choromanski, Tamas Sarlos, Adrian Weller |
Abstract | Wasserstein distances are increasingly used in a wide variety of applications in machine learning. Sliced Wasserstein distances form an important subclass which may be estimated efficiently through one-dimensional sorting operations. In this paper, we propose a new variant of sliced Wasserstein distance, study the use of orthogonal coupling in Monte Carlo estimation of Wasserstein distances and draw connections with stratified sampling, and evaluate our approaches experimentally in a range of large-scale experiments in generative modelling and reinforcement learning. |
Tasks | |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03784v2 |
http://arxiv.org/pdf/1903.03784v2.pdf | |
PWC | https://paperswithcode.com/paper/orthogonal-estimation-of-wasserstein |
Repo | |
Framework | |
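The one-dimensional-sorting estimator the abstract refers to can be sketched as below. This is plain Monte Carlo with i.i.d. random directions, not the paper's orthogonally coupled variant, and it assumes two equal-sized point clouds with uniform weights.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=200, seed=0):
    # Monte Carlo estimate of the sliced 2-Wasserstein distance between two
    # equal-sized point clouds: project onto random unit directions and
    # average the squared 1D W2 distances, which reduce to sorting.
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((n_projections, X.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    total = 0.0
    for u in dirs:
        px, py = np.sort(X @ u), np.sort(Y @ u)
        total += np.mean((px - py) ** 2)    # squared 1D W2 via sorted samples
    return float(np.sqrt(total / n_projections))

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
Y = rng.standard_normal((500, 2)) + np.array([3.0, 0.0])   # shifted cloud
sw = sliced_wasserstein(X, Y)
```

Orthogonal coupling replaces the i.i.d. directions with an orthogonal set, which reduces the variance of this estimator.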
2nd Place Solution to the GQA Challenge 2019
Title | 2nd Place Solution to the GQA Challenge 2019 |
Authors | Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas |
Abstract | We present a simple method that achieves unexpectedly superior performance for Complex Reasoning involved Visual Question Answering. Our solution collects statistical features from high-frequency words of all the questions asked about an image and uses them as accurate knowledge for answering further questions about the same image. We are fully aware that this setting is not ubiquitously applicable, and in a more common setting one should assume the questions are asked separately and they cannot be gathered to obtain a knowledge base. Nonetheless, we use this method as evidence to demonstrate our observation that the bottleneck effect is more severe on the feature extraction part than it is on the knowledge reasoning part. We show significant gaps when using the same reasoning model with 1) ground-truth features; 2) statistical features; 3) detected features from completely learned detectors, and analyze what these gaps mean to research on visual reasoning. Our model with the statistical features achieves the 2nd place in the GQA Challenge 2019. |
Tasks | Question Answering, Visual Question Answering, Visual Reasoning |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06794v2 |
https://arxiv.org/pdf/1907.06794v2.pdf | |
PWC | https://paperswithcode.com/paper/2nd-place-solution-to-the-gqa-challenge-2019 |
Repo | |
Framework | |
A New Malware Detection System Using a High Performance-ELM method
Title | A New Malware Detection System Using a High Performance-ELM method |
Authors | Shahab Shamshirband, Anthony T. Chronopoulos |
Abstract | Cybersecurity is a vital element of cyberspace infrastructure. Many protocols have been proposed to address security issues, yet anomalies still affect the related infrastructure of cyberspace. Machine learning (ML) methods are used to mitigate anomalous behavior on mobile devices. This paper aims to apply a High Performance Extreme Learning Machine (HP-ELM) to detect possible anomalies in two malware datasets. Two widely used datasets (CTU-13 and Malware) are used to test the effectiveness of HP-ELM. Extensive comparisons are carried out in order to validate the effectiveness of the HP-ELM learning method. The experimental results demonstrate that the HP-ELM achieved the highest accuracy of 0.9592 for the top 3 features with one activation function. |
Tasks | Malware Detection |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.12198v1 |
https://arxiv.org/pdf/1906.12198v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-malware-detection-system-using-a-high |
Repo | |
Framework | |
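A basic Extreme Learning Machine — random, frozen hidden-layer weights with a closed-form least-squares output layer — can be sketched as follows. The HP-ELM used in the paper is an optimized implementation of this same idea; the Gaussian-blob dataset below is a toy stand-in, not the malware data.

```python
import numpy as np

def train_elm(X, Y_onehot, n_hidden=50, seed=0):
    # Extreme Learning Machine: random (frozen) hidden-layer weights; the
    # output weights are solved in closed form -- no backpropagation.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                  # random nonlinear features
    beta = np.linalg.pinv(H) @ Y_onehot     # least-squares output layer
    return W, b, beta

def predict_elm(X, W, b, beta):
    return (np.tanh(X @ W + b) @ beta).argmax(axis=1)

# Two separable Gaussian blobs as a stand-in dataset.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)),
               rng.normal(2.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)
W, b, beta = train_elm(X, np.eye(2)[y])
acc = (predict_elm(X, W, b, beta) == y).mean()
```

Because training is a single pseudo-inverse, ELMs train orders of magnitude faster than backprop-trained networks of the same size, which is what makes them attractive for high-throughput anomaly detection.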