Paper Group ANR 102
Understanding Car-Speak: Replacing Humans in Dealerships. A Scalable Evolution Strategy with Directional Gaussian Smoothing for Blackbox Optimization. Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing. Curriculum By Texture. On the use of recurrent neural networks for predictions of turbulent flows …
Understanding Car-Speak: Replacing Humans in Dealerships
Title | Understanding Car-Speak: Replacing Humans in Dealerships |
Authors | Habeeb Hooshmand, James Caverlee |
Abstract | A large portion of the car-buying experience in the United States involves interactions at a car dealership. At the dealership, the car-buyer relays their needs to a sales representative. However, most car-buyers have only an abstract description of the vehicle they need. Therefore, they are only able to describe their ideal car in “car-speak”. Car-speak is abstract language that pertains to a car’s physical attributes. In this paper, we define car-speak. We also aim to curate a reasonable data set of car-speak language. Finally, we train several classifiers in order to classify car-speak. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02070v1 |
https://arxiv.org/pdf/2002.02070v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-car-speak-replacing-humans-in |
Repo | |
Framework | |
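The abstract does not name the data set or the classifiers, so the snippet below is only a generic text-classification baseline for the task it describes: mapping free-text car-speak queries to vehicle labels with TF-IDF features and a linear model. All queries and labels here are illustrative.

```python
# Hypothetical baseline for classifying "car-speak" queries; the paper's actual
# data set and classifiers are not specified in the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

queries = ["a safe car for my kids", "something fast and sporty", "good mileage for commuting"]
labels = ["minivan", "coupe", "sedan"]          # illustrative vehicle labels only

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(queries, labels)
print(clf.predict(["roomy and safe for a family"]))
```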
A Scalable Evolution Strategy with Directional Gaussian Smoothing for Blackbox Optimization
Title | A Scalable Evolution Strategy with Directional Gaussian Smoothing for Blackbox Optimization |
Authors | Jiaxin Zhang, Hoang Tran, Dan Lu, Guannan Zhang |
Abstract | We developed a new scalable evolution strategy with directional Gaussian smoothing (DGS-ES) for high-dimensional blackbox optimization. Standard ES methods have been shown to suffer from the curse of dimensionality, due to the random directional search and low accuracy of Monte Carlo estimation. The key idea of this work is to develop a Gaussian smoothing approach that only averages the original objective function along $d$ orthogonal directions. In this way, the partial derivatives of the smoothed function along those directions can be represented by one-dimensional integrals, instead of $d$-dimensional integrals in the standard ES methods. As such, the averaged partial derivatives can be approximated using the Gauss-Hermite quadrature rule, as opposed to MC, which significantly improves the accuracy of the averaged gradients. Moreover, the smoothing technique reduces the barrier of local minima, such that global minima become easier to achieve. We provide three sets of examples to demonstrate the performance of our method, including benchmark functions for global optimization, and a rocket shell design problem. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.03001v1 |
https://arxiv.org/pdf/2002.03001v1.pdf | |
PWC | https://paperswithcode.com/paper/a-scalable-evolution-strategy-with |
Repo | |
Framework | |
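The abstract describes the core computation precisely enough to sketch: the derivative of the Gaussian-smoothed objective along each of $d$ orthogonal directions is a one-dimensional integral, approximated with Gauss-Hermite quadrature. Below is a minimal NumPy sketch of that gradient estimator, assuming the standard change of variables; the full DGS-ES update (direction adaptation, step sizes) is not reproduced.

```python
# Sketch of a directional-Gaussian-smoothing (DGS) gradient estimate using
# Gauss-Hermite quadrature, following the idea in the abstract. The exact
# DGS-ES update rule is assumed, not taken verbatim from the paper.
import numpy as np

def dgs_gradient(f, x, sigma=0.1, num_nodes=7, directions=None):
    """Estimate the gradient of the Gaussian-smoothed f at x.

    For each orthogonal direction xi_i, the directional derivative of the
    smoothed objective is a 1-D integral, approximated here with
    Gauss-Hermite nodes t_m and weights w_m (weight function exp(-t^2)).
    """
    d = x.size
    if directions is None:
        directions = np.eye(d)          # any orthonormal basis works
    t, w = np.polynomial.hermite.hermgauss(num_nodes)
    grad_dirs = np.empty(d)
    for i, xi in enumerate(directions):
        vals = np.array([f(x + np.sqrt(2.0) * sigma * tm * xi) for tm in t])
        # d/ds E_v[f(x + (s + sigma v) xi)] at s = 0, via the substitution v = sqrt(2) t
        grad_dirs[i] = np.sqrt(2.0) / (sigma * np.sqrt(np.pi)) * np.sum(w * t * vals)
    return directions.T @ grad_dirs     # back to the original coordinates

# Toy check on a quadratic, where the smoothed gradient equals the true gradient.
f = lambda z: np.sum(z ** 2)
x = np.array([1.0, -2.0, 0.5])
print(dgs_gradient(f, x))               # approx. [2, -4, 1]
```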
Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing
Title | Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing |
Authors | Ajay K. Gogineni, S. Swayamjyoti, Devadatta Sahoo, Kisor K. Sahu, Raj kishore |
Abstract | Vulnerability detection and safety of smart contracts are of paramount importance because of their immutable nature. Symbolic tools like OYENTE and MAIAN are typically used for vulnerability prediction in smart contracts. As these tools are computationally expensive, they are typically used to detect vulnerabilities until some predefined invocation depth. These tools require more search time as the invocation depth increases. Since the number of smart contracts is increasing exponentially, it is difficult to analyze the contracts using these traditional tools. Recently a machine learning technique called Long Short Term Memory (LSTM) has been used for binary classification, i.e., to predict whether a smart contract is vulnerable or not. This technique requires nearly constant search time as the invocation depth increases. In the present article, we present a multi-class classification, where we classify a smart contract into the Suicidal, Prodigal, Greedy, or Normal categories. We used Average Stochastic Gradient Descent Weight-Dropped LSTM (AWD-LSTM), which is a variant of LSTM, to perform classification. We reduced the class imbalance (a large number of normal contracts as compared to other categories) by considering only the distinct opcode combinations for normal contracts. We have achieved a weighted average F-beta score of 90.0%. Hence, such techniques can be used to analyze a large number of smart contracts and help to improve the security of these contracts. |
Tasks | Vulnerability Detection |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2004.00362v1 |
https://arxiv.org/pdf/2004.00362v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-class-classification-of-vulnerabilities |
Repo | |
Framework | |
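Two concrete steps in the abstract can be illustrated: reducing class imbalance by keeping only distinct opcode sequences for Normal contracts, and reporting a weighted-average F-beta score. The snippet below is a toy sketch of those steps; the column names, the tiny example data, and beta = 1 are assumptions, and the AWD-LSTM classifier itself is not reproduced.

```python
# Toy sketch: deduplicate Normal-contract opcode sequences, then score with a
# weighted-average F-beta. Data and beta are placeholders.
import pandas as pd
from sklearn.metrics import fbeta_score

df = pd.DataFrame({
    "opcodes": ["PUSH1 PUSH1 ADD", "PUSH1 PUSH1 ADD", "CALL SSTORE", "SELFDESTRUCT"],
    "label":   ["Normal",          "Normal",          "Greedy",      "Suicidal"],
})

# Keep every vulnerable contract, but only one copy of each distinct Normal opcode sequence.
normal = df[df.label == "Normal"].drop_duplicates(subset="opcodes")
balanced = pd.concat([normal, df[df.label != "Normal"]], ignore_index=True)

# Weighted-average F-beta over the classes (toy predictions shown here).
y_true = balanced.label
y_pred = ["Normal", "Greedy", "Suicidal"]
print(fbeta_score(y_true, y_pred, beta=1.0, average="weighted"))
```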
Curriculum By Texture
Title | Curriculum By Texture |
Authors | Samarth Sinha, Animesh Garg, Hugo Larochelle |
Abstract | Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification and segmentation. One factor for the success of CNNs is that they have an inductive bias that assumes a certain type of spatial structure is present in the data. Recent work by Geirhos et al. (2018) shows how learning in CNNs causes the learned CNN models to be biased towards high-frequency textural information, compared to low-frequency shape information in images. Many tasks generally require both shape and textural information. Hence, we propose a simple curriculum-based scheme which improves the ability of CNNs to be less biased towards textural information, and at the same time, be able to represent both the shape and textural information. We propose to augment the training of CNNs by controlling the amount of textural information that is available to the CNNs during the training process, by convolving the output of a CNN layer with a low-pass filter, or simply a Gaussian kernel. By reducing the standard deviation of the Gaussian kernel, we are able to gradually increase the amount of textural information available as training progresses, and hence reduce the texture bias. Such an augmented training scheme significantly improves the performance of CNNs on various image classification tasks, while adding no additional trainable parameters or auxiliary regularization objectives. We also observe significant improvements when using the trained CNNs to perform transfer learning on a different dataset, and transferring to a different task, which shows how the CNNs learned using the proposed method act as better feature extractors. |
Tasks | Image Classification, Transfer Learning |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01367v1 |
https://arxiv.org/pdf/2003.01367v1.pdf | |
PWC | https://paperswithcode.com/paper/curriculum-by-texture |
Repo | |
Framework | |
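The mechanism in the abstract, convolving a CNN layer's output with a Gaussian kernel whose standard deviation shrinks during training, can be sketched directly. The kernel size and the linear annealing schedule below are assumptions.

```python
# Minimal sketch of the curriculum idea: blur intermediate CNN feature maps with
# a Gaussian kernel whose standard deviation shrinks as training progresses,
# gradually re-admitting high-frequency texture. Schedule and kernel size assumed.
import torch
import torch.nn.functional as F

def gaussian_kernel(sigma, size=5):
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords**2 / (2 * sigma**2))
    g = g / g.sum()
    return torch.outer(g, g)                    # separable 2-D kernel

def blur_feature_map(x, sigma):
    """Depthwise-convolve a (N, C, H, W) feature map with a Gaussian kernel."""
    if sigma <= 0:
        return x                                 # no blur: full texture available
    c = x.shape[1]
    k = gaussian_kernel(sigma).to(x.device, x.dtype)
    size = k.shape[-1]
    weight = k.view(1, 1, size, size).repeat(c, 1, 1, 1)   # one kernel per channel
    return F.conv2d(x, weight, padding=size // 2, groups=c)

# Example annealing schedule over epochs (assumed): start very blurry, end sharp.
feats = torch.randn(8, 64, 32, 32)
for epoch in range(10):
    sigma = max(2.0 * (1 - epoch / 9), 0.0)      # 2.0 -> 0.0
    out = blur_feature_map(feats, sigma)
```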
On the use of recurrent neural networks for predictions of turbulent flows
Title | On the use of recurrent neural networks for predictions of turbulent flows |
Authors | Luca Guastoni, Prem A. Srinivasan, Hossein Azizpour, Philipp Schlatter, Ricardo Vinuesa |
Abstract | In this paper, the prediction capabilities of recurrent neural networks are assessed in the low-order model of near-wall turbulence by Moehlis {\it et al.} (New J. Phys. {\bf 6}, 56, 2004). Our results show that it is possible to obtain excellent predictions of the turbulence statistics and the dynamic behavior of the flow with properly trained long short-term memory (LSTM) networks, leading to relative errors in the mean and the fluctuations below $1\%$. We also observe that using a loss function based only on the instantaneous predictions of the flow may not lead to the best predictions in terms of turbulence statistics, and it is necessary to define a stopping criterion based on the computed statistics. Furthermore, more sophisticated loss functions, including not only the instantaneous predictions but also the averaged behavior of the flow, may lead to much faster neural network training. |
Tasks | |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01222v1 |
https://arxiv.org/pdf/2002.01222v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-use-of-recurrent-neural-networks-for |
Repo | |
Framework | |
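As a rough illustration of the setup in the abstract, the sketch below defines an LSTM that maps a window of the nine mode amplitudes of the Moehlis et al. (2004) model to the next instantaneous state, trained with an instantaneous-prediction loss. The window length and layer sizes are assumptions.

```python
# Rough sketch: an LSTM predicts the next set of the nine mode amplitudes from a
# window of previous amplitudes. Window length and layer sizes are assumed.
import torch
import torch.nn as nn

class AmplitudePredictor(nn.Module):
    def __init__(self, n_modes=9, hidden=90):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_modes, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_modes)

    def forward(self, window):                   # window: (batch, time, n_modes)
        out, _ = self.lstm(window)
        return self.head(out[:, -1])             # predict the next instantaneous state

model = AmplitudePredictor()
window = torch.randn(32, 100, 9)                 # toy batch of amplitude histories
next_state = model(window)                       # (32, 9)
loss = nn.functional.mse_loss(next_state, torch.randn(32, 9))  # instantaneous loss only
```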
SGD with Hardness Weighted Sampling for Distributionally Robust Deep Learning
Title | SGD with Hardness Weighted Sampling for Distributionally Robust Deep Learning |
Authors | Lucas Fidon, Sebastien Ourselin, Tom Vercauteren |
Abstract | Distributionally Robust Optimization (DRO) has been proposed as an alternative to Empirical Risk Minimization (ERM) in order to account for potential biases in the training data distribution. However, its use in deep learning has been severely restricted due to the relative inefficiency of the optimizers available for DRO in comparison to the widespread Stochastic Gradient Descent (SGD) based optimizers for deep learning with ERM. We propose SGD with hardness weighted sampling, a principled and efficient optimization method for DRO in machine learning that is particularly suited to deep learning. We show that our optimization method can be interpreted as a principled Hard Example Mining strategy. Similar to an online hard example mining strategy in essence and in practice, the proposed algorithm is straightforward to implement and computationally as efficient as SGD-based optimizers used for deep learning. It only requires adding a softmax layer and maintaining a history of the loss values for each training example to compute adaptive sampling probabilities. In contrast to typical ad hoc hard mining approaches, and exploiting recent theoretical results in deep learning optimization, we also prove the convergence of our DRO algorithm for over-parameterized deep learning networks with ReLU activation and a finite number of layers and parameters. Preliminary results demonstrate the feasibility and usefulness of our approach. |
Tasks | |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02658v1 |
https://arxiv.org/pdf/2001.02658v1.pdf | |
PWC | https://paperswithcode.com/paper/sgd-with-hardness-weighted-sampling-for-1 |
Repo | |
Framework | |
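The abstract spells out the mechanism: keep a history of the loss for each training example and turn it into adaptive sampling probabilities through a softmax. Below is a minimal sketch of such a sampler; the softmax temperature, the uniform loss initialisation, and sampling without replacement are assumptions, not taken from the paper.

```python
# Sketch of hardness-weighted sampling: per-example loss history -> softmax ->
# adaptive batch-sampling probabilities. Hyperparameters are assumed.
import numpy as np

class HardnessWeightedSampler:
    def __init__(self, num_examples, beta=0.1, init_loss=1.0):
        self.losses = np.full(num_examples, init_loss)   # running per-example losses
        self.beta = beta                                  # softmax temperature (assumed)

    def probabilities(self):
        z = self.beta * self.losses
        z = z - z.max()                                   # numerical stability
        p = np.exp(z)
        return p / p.sum()

    def sample_batch(self, batch_size, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        return rng.choice(len(self.losses), size=batch_size,
                          p=self.probabilities(), replace=False)

    def update(self, indices, batch_losses):
        self.losses[indices] = batch_losses               # refresh the loss history

sampler = HardnessWeightedSampler(num_examples=1000)
idx = sampler.sample_batch(32)
# ... forward pass on the selected examples, then:
sampler.update(idx, np.random.rand(32))                   # stand-in for real per-example losses
```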
Markovian Score Climbing: Variational Inference with KL(p||q)
Title | Markovian Score Climbing: Variational Inference with KL(p||q) |
Authors | Christian A. Naesseth, Fredrik Lindsten, David Blei |
Abstract | Modern variational inference (VI) uses stochastic gradients to avoid intractable expectations, enabling large-scale probabilistic inference in complex models. VI posits a family of approximating distributions $q$ and then finds the member of that family that is closest to the exact posterior $p$. Traditionally, VI algorithms minimize the “exclusive KL” KL$(q\|p)$, often for computational convenience. Recent research, however, has also focused on the “inclusive KL” KL$(p\|q)$, which has good statistical properties that make it more appropriate for certain inference problems. This paper develops a simple algorithm for reliably minimizing the inclusive KL. Consider a valid MCMC method, a Markov chain whose stationary distribution is $p$. The algorithm we develop iteratively samples the chain $z[k]$, and then uses those samples to follow the score function of the variational approximation, $\nabla \log q(z[k])$, with a Robbins-Monro step-size schedule. This method, which we call Markovian score climbing (MSC), converges to a local optimum of the inclusive KL. It does not suffer from the systematic errors inherent in existing methods, such as Reweighted Wake-Sleep and Neural Adaptive Sequential Monte Carlo, which lead to bias in their final estimates. In a variant that ties the variational approximation directly to the Markov chain, MSC further provides a new algorithm that melds VI and MCMC. We illustrate convergence on a toy model and demonstrate the utility of MSC on Bayesian probit regression for classification as well as a stochastic volatility model for financial data. |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10374v1 |
https://arxiv.org/pdf/2003.10374v1.pdf | |
PWC | https://paperswithcode.com/paper/markovian-score-climbing-variational |
Repo | |
Framework | |
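The algorithm in the abstract is simple enough to sketch end to end: run a Markov chain whose stationary distribution is $p$, and after each step follow the score $\nabla \log q(z[k])$ of the variational approximation with Robbins-Monro step sizes. The toy target, the Gaussian variational family, and the $1/k$ schedule below are assumptions.

```python
# Minimal sketch of Markovian score climbing for a 1-D toy target: one
# Metropolis-Hastings step per iteration keeps the chain targeting p, and the
# sample is used to follow the score of a Gaussian q with Robbins-Monro steps.
import numpy as np

rng = np.random.default_rng(0)
log_p = lambda z: -0.5 * (z - 3.0) ** 2            # unnormalised toy target: N(3, 1)

def mh_step(z, step=1.0):
    prop = z + step * rng.standard_normal()
    if np.log(rng.random()) < log_p(prop) - log_p(z):
        return prop
    return z

mu, log_sigma = 0.0, 0.0                            # q = N(mu, exp(log_sigma)^2)
z = 0.0
for k in range(1, 5001):
    z = mh_step(z)                                  # z[k] from a chain with stationary dist p
    sigma = np.exp(log_sigma)
    # score of q at z[k] with respect to (mu, log_sigma)
    g_mu = (z - mu) / sigma**2
    g_log_sigma = ((z - mu) ** 2 / sigma**2) - 1.0
    lr = 1.0 / k                                    # Robbins-Monro step size (assumed)
    mu += lr * g_mu
    log_sigma += lr * g_log_sigma

print(mu, np.exp(log_sigma))                        # should drift towards 3 and 1
```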
Distributed and Democratized Learning: Philosophy and Research Challenges
Title | Distributed and Democratized Learning: Philosophy and Research Challenges |
Authors | Minh N. H. Nguyen, Shashi Raj Pandey, Kyi Thar, Nguyen H. Tran, Mingzhe Chen, Walid Saad, Choong Seon Hong |
Abstract | Due to the availability of huge amounts of data and processing abilities, current artificial intelligence (AI) systems are effective at solving complex tasks. However, despite the success of AI in different areas, the problem of designing AI systems that can truly mimic human cognitive capabilities, such as artificial general intelligence, remains largely open. Consequently, many emerging cross-device AI applications will require a transition from traditional centralized learning systems towards large-scale distributed AI systems that can collaboratively perform multiple complex learning tasks. In this paper, we propose a novel design philosophy called democratized learning (Dem-AI) whose goal is to build large-scale distributed learning systems that rely on the self-organization of distributed learning agents that are well-connected, but limited in learning capabilities. Correspondingly, inspired by the societal groups of humans, the specialized groups of learning agents in the proposed Dem-AI system are self-organized in a hierarchical structure to collectively perform learning tasks more efficiently. As such, the Dem-AI learning system can evolve and regulate itself based on the underlying duality of two processes that we call specialized and generalized processes. In this regard, we present a reference design as a guideline to realize future Dem-AI systems, inspired by various interdisciplinary fields. Accordingly, we introduce four underlying mechanisms in the design, namely the plasticity-stability transition mechanism, self-organizing hierarchical structuring, specialized learning, and generalization. Finally, we establish possible extensions and new challenges for the existing learning approaches to provide more scalable, flexible, and powerful learning systems with the new setting of Dem-AI. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.09301v1 |
https://arxiv.org/pdf/2003.09301v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-and-democratized-learning |
Repo | |
Framework | |
Comparison of Syntactic and Semantic Representations of Programs in Neural Embeddings
Title | Comparison of Syntactic and Semantic Representations of Programs in Neural Embeddings |
Authors | Austin P. Wright, Herbert Wiklicky |
Abstract | Neural approaches to program synthesis and understanding have proliferated widely in the last few years; at the same time, graph-based neural networks have become a promising new tool. This work aims to be the first empirical study comparing the effectiveness of natural language models and static-analysis graph-based models in representing programs in deep learning systems. It compares graph convolutional networks using different graph representations in the task of program embedding. It shows that the sparsity of control flow graphs and the implicit aggregation of graph convolutional networks cause these models to perform worse than naive models. It therefore concludes that simply augmenting purely linguistic or statistical models with formal information does not perform well, because the nuanced nature of formal properties introduces more noise than structure for graph convolutional networks. |
Tasks | Program Synthesis |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.09201v1 |
https://arxiv.org/pdf/2001.09201v1.pdf | |
PWC | https://paperswithcode.com/paper/comparison-of-syntactic-and-semantic |
Repo | |
Framework | |
Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation
Title | Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation |
Authors | Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan Yuille |
Abstract | The ability to detect failures and anomalies is a fundamental requirement for building reliable systems for computer vision applications, especially safety-critical applications of semantic segmentation, such as autonomous driving and medical image analysis. In this paper, we systematically study failure and anomaly detection for semantic segmentation and propose a unified framework, consisting of two modules, to address these two related problems. The first module is an image synthesis module, which generates a synthesized image from a segmentation layout map, and the second is a comparison module, which computes the difference between the synthesized image and the input image. We validate our framework on three challenging datasets and improve the state of the art by large margins, i.e., 6% AUPR-Error on Cityscapes, 10% DSC correlation on pancreatic tumor segmentation in MSD and 20% AUPR on StreetHazards anomaly segmentation. |
Tasks | Anomaly Detection, Autonomous Driving, Image Generation, Semantic Segmentation |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08440v1 |
https://arxiv.org/pdf/2003.08440v1.pdf | |
PWC | https://paperswithcode.com/paper/synthesize-then-compare-detecting-failures |
Repo | |
Framework | |
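The two-module structure in the abstract can be sketched at the interface level: a segmentation network produces a layout, a synthesis module re-generates an image from it, and a comparison module scores the difference against the input. Both networks below are untrained stand-ins, and the per-pixel L1 difference is only a placeholder for the paper's learned comparison module.

```python
# Structural sketch of the synthesize-then-compare pipeline; all components here
# are placeholders, not the paper's trained modules.
import torch
import torch.nn as nn

segmenter = nn.Conv2d(3, 19, kernel_size=1)            # stand-in segmentation network
synthesizer = nn.Conv2d(19, 3, kernel_size=1)           # stand-in layout-to-image generator

def failure_map(image):
    with torch.no_grad():
        layout = segmenter(image).softmax(dim=1)         # predicted segmentation layout
        reconstruction = synthesizer(layout)             # synthesized image from the layout
        return (image - reconstruction).abs().mean(dim=1)  # per-pixel difference score

image = torch.randn(1, 3, 128, 256)
scores = failure_map(image)                              # high values flag likely failures
```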
DP-CGAN: Differentially Private Synthetic Data and Label Generation
Title | DP-CGAN: Differentially Private Synthetic Data and Label Generation |
Authors | Reihaneh Torkzadehmahani, Peter Kairouz, Benedict Paten |
Abstract | Generative Adversarial Networks (GANs) are one of the well-known models to generate synthetic data including images, especially for research communities that cannot use original sensitive datasets because they are not publicly accessible. One of the main challenges in this area is to preserve the privacy of individuals who participate in the training of the GAN models. To address this challenge, we introduce a Differentially Private Conditional GAN (DP-CGAN) training framework based on a new clipping and perturbation strategy, which improves the performance of the model while preserving privacy of the training dataset. DP-CGAN generates both synthetic data and corresponding labels and leverages the recently introduced Renyi differential privacy accountant to track the spent privacy budget. The experimental results show that DP-CGAN can generate visually and empirically promising results on the MNIST dataset with a single-digit epsilon parameter in differential privacy. |
Tasks | |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09700v1 |
https://arxiv.org/pdf/2001.09700v1.pdf | |
PWC | https://paperswithcode.com/paper/dp-cgan-differentially-private-synthetic-data |
Repo | |
Framework | |
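As a hedged sketch of the differentially private training step, the snippet below applies DP-SGD-style per-microbatch gradient clipping and Gaussian noise to a discriminator update. The clip norm, noise multiplier, and microbatch size are assumptions, and the paper's specific clipping/perturbation strategy and the Rényi DP accountant are not reproduced.

```python
# Sketch of a differentially private discriminator update: clip each microbatch
# gradient to a fixed norm, add Gaussian noise, then average. Hyperparameters
# and the training loop around this step are assumed.
import torch

def dp_discriminator_step(discriminator, optimizer, loss_fn, batch, labels,
                          clip_norm=1.0, noise_multiplier=1.1, microbatch_size=1):
    params = [p for p in discriminator.parameters() if p.requires_grad]
    accum = [torch.zeros_like(p) for p in params]
    n_micro = 0
    for i in range(0, len(batch), microbatch_size):
        xb, yb = batch[i:i + microbatch_size], labels[i:i + microbatch_size]
        optimizer.zero_grad()
        loss_fn(discriminator(xb), yb).backward()
        grads = [p.grad.detach().clone() for p in params]
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)   # clip to clip_norm
        for a, g in zip(accum, grads):
            a.add_(g * scale)
        n_micro += 1
    optimizer.zero_grad()
    for p, a in zip(params, accum):
        noise = torch.randn_like(a) * noise_multiplier * clip_norm       # Gaussian perturbation
        p.grad = (a + noise) / n_micro
    optimizer.step()

# Example usage with a toy discriminator on flattened 28x28 inputs.
disc = torch.nn.Sequential(torch.nn.Linear(784, 1))
opt = torch.optim.SGD(disc.parameters(), lr=0.05)
x, y = torch.randn(32, 784), torch.rand(32, 1)                            # stand-in data/targets
dp_discriminator_step(disc, opt, torch.nn.functional.binary_cross_entropy_with_logits, x, y)
```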
Deep learning for prediction of population health costs
Title | Deep learning for prediction of population health costs |
Authors | Philipp Drewe-Boss, Dirk Enders, Jochen Walker, Uwe Ohler |
Abstract | Accurate prediction of healthcare costs is important for optimally managing health costs. However, methods leveraging the medical richness from data such as health insurance claims or electronic health records are missing. Here, we developed a deep neural network to predict future cost from health insurance claims records. We applied the deep network and a ridge regression model to a sample of 1.4 million German insurants to predict total one-year health care costs. Both methods were compared to Morbi-RSA models with various performance measures and were also used to predict patients with a change in costs and to identify relevant codes for this prediction. We showed that the neural network outperformed the ridge regression as well as all Morbi-RSA models for cost prediction. Further, the neural network was superior to ridge regression in predicting patients with cost change and identified more specific codes. In summary, we showed that our deep neural network can leverage the full complexity of the patient records and outperforms standard approaches. We suggest that the better performance is due to the ability to incorporate complex interactions in the model and that the model might also be used for predicting other health phenotypes. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03466v1 |
https://arxiv.org/pdf/2003.03466v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-prediction-of-population |
Repo | |
Framework | |
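A toy version of the comparison described in the abstract, a linear ridge regression against a small feed-forward network on claims-code features, is sketched below. The synthetic data, feature construction, and network size are assumptions; the Morbi-RSA baselines and the 1.4-million-insurant sample are not reproduced.

```python
# Toy comparison in the spirit of the abstract: ridge regression vs. a small
# feed-forward network predicting next-year cost from (here, random) claims codes.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 300)).astype(float)                   # binary diagnosis/drug codes
y = X @ rng.gamma(1.0, 50.0, size=300) + rng.normal(0, 100, size=2000)   # synthetic costs

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (Ridge(alpha=1.0), MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, model.score(X_te, y_te))                 # R^2 on held-out insurants
```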
Assigning credit to scientific datasets using article citation networks
Title | Assigning credit to scientific datasets using article citation networks |
Authors | Tong Zeng, Longfeng Wu, Sarah Bratt, Daniel E. Acuna |
Abstract | A citation is a well-established mechanism for connecting scientific artifacts. Citation networks are used by citation analysis for a variety of reasons, prominently to give credit to scientists’ work. However, because of current citation practices, scientists tend to cite only publications, leaving out other types of artifacts such as datasets. Datasets then do not get appropriate credit even though they are increasingly reused and experimented with. We develop a network flow measure, called DataRank, aimed at closing this gap. DataRank assigns a relative value to each node in the network based on how citations flow through the graph, differentiating publication and dataset flow rates. We evaluate the quality of DataRank by estimating its accuracy at predicting the usage of real datasets: web visits to GenBank and downloads of Figshare datasets. We show that DataRank is better at predicting this usage compared to alternatives while offering additional interpretable outcomes. We discuss improvements to citation behavior and algorithms to properly track and assign credit to datasets. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05917v1 |
https://arxiv.org/pdf/2001.05917v1.pdf | |
PWC | https://paperswithcode.com/paper/assigning-credit-to-scientific-datasets-using |
Repo | |
Framework | |
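The abstract does not give the DataRank update rule, so the sketch below is only an illustration of the stated idea: a PageRank-style power iteration over a citation graph in which publications and datasets pass credit along at different flow rates. The graph, the per-type rates, and the update formula are all assumptions.

```python
# Illustrative PageRank-style iteration with type-dependent flow rates; the
# actual DataRank formula is not specified in the abstract.
import numpy as np

# adjacency[i, j] = 1 if node i cites node j; node_type marks datasets vs papers
adjacency = np.array([[0, 1, 1, 0],
                      [0, 0, 1, 1],
                      [0, 0, 0, 1],
                      [0, 0, 0, 0]], dtype=float)
node_type = np.array(["paper", "paper", "paper", "dataset"])
flow_rate = np.where(node_type == "dataset", 0.9, 0.85)     # assumed per-type flow rates
n = len(node_type)

out_deg = adjacency.sum(axis=1, keepdims=True)
transition = np.divide(adjacency, out_deg, out=np.zeros_like(adjacency), where=out_deg > 0)

rank = np.full(n, 1.0 / n)
for _ in range(100):
    # each node passes on a type-dependent fraction of its credit along its citations
    rank = (1.0 - flow_rate) / n + transition.T @ (flow_rate * rank)
    rank = rank / rank.sum()
print(dict(zip(["p1", "p2", "p3", "d1"], rank.round(3))))
```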
Explicitly Trained Spiking Sparsity in Spiking Neural Networks with Backpropagation
Title | Explicitly Trained Spiking Sparsity in Spiking Neural Networks with Backpropagation |
Authors | Jason M. Allred, Steven J. Spencer, Gopalakrishnan Srinivasan, Kaushik Roy |
Abstract | Spiking Neural Networks (SNNs) are being explored for their potential energy efficiency resulting from sparse, event-driven computations. Many recent works have demonstrated effective backpropagation for deep Spiking Neural Networks (SNNs) by approximating gradients over discontinuous neuron spikes or firing events. A beneficial side-effect of these surrogate gradient spiking backpropagation algorithms is that the spikes, which trigger additional computations, may now themselves be directly considered in the gradient calculations. We propose an explicit inclusion of spike counts in the loss function, along with a traditional error loss, causing the backpropagation learning algorithms to optimize weight parameters for both accuracy and spiking sparsity. As supported by existing theory of over-parameterized neural networks, there are many solution states with effectively equivalent accuracy. As such, appropriate weighting of the two loss goals during training in this multi-objective optimization process can yield an improvement in spiking sparsity without a significant loss of accuracy. We additionally explore a simulated annealing-inspired loss weighting technique to increase the weighting for sparsity as training time increases. Our preliminary results on the CIFAR-10 dataset show up to 70.1% reduction in spiking activity with iso-accuracy compared to an equivalent SNN trained only for accuracy and up to 73.3% reduction in spiking activity if allowed a trade-off of 1% reduction in classification accuracy. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.01250v1 |
https://arxiv.org/pdf/2003.01250v1.pdf | |
PWC | https://paperswithcode.com/paper/explicitly-trained-spiking-sparsity-in |
Repo | |
Framework | |
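The objective described in the abstract, a standard task loss plus a spike-count penalty whose weight grows as training proceeds, can be sketched directly. The linear ramp on the sparsity weight, the maximum weight, and the placeholder spike-count tensor are assumptions.

```python
# Sketch of the composite objective: cross-entropy plus a spike-count penalty
# whose weight increases over training (annealing-style schedule, assumed linear).
import torch
import torch.nn.functional as F

def spiking_loss(logits, targets, spike_counts, epoch, total_epochs, lambda_max=1e-4):
    task_loss = F.cross_entropy(logits, targets)
    lam = lambda_max * epoch / total_epochs      # sparsity weight grows with training time
    sparsity_loss = spike_counts.sum()           # total spikes triggered in the forward pass
    return task_loss + lam * sparsity_loss

logits = torch.randn(16, 10, requires_grad=True)
targets = torch.randint(0, 10, (16,))
spike_counts = torch.rand(16, 1000)              # per-example spike activity (placeholder)
loss = spiking_loss(logits, targets, spike_counts, epoch=30, total_epochs=100)
loss.backward()
```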
Using Fractal Neural Networks to Play SimCity 1 and Conway’s Game of Life at Variable Scales
Title | Using Fractal Neural Networks to Play SimCity 1 and Conway’s Game of Life at Variable Scales |
Authors | Sam Earle |
Abstract | We introduce gym-city, a Reinforcement Learning environment that uses SimCity 1’s game engine to simulate an urban environment, wherein agents might seek to optimize one or a combination of any number of city-wide metrics, on gameboards of various sizes. We focus on population, and analyze our agents’ ability to generalize to larger map-sizes than those seen during training. The environment is interactive, allowing a human player to build alongside agents during training and inference, potentially influencing the course of their learning, or manually probing and evaluating their performance. To test our agents’ ability to capture distance-agnostic relationships between elements of the gameboard, we design a minigame within the environment which is, by design, unsolvable at large enough scales given strictly local strategies. Given the game engine’s extensive use of Cellular Automata, we also train our agents to “play” Conway’s Game of Life – again optimizing for population – and examine their behaviour at multiple scales. To make our models compatible with variable-scale gameplay, we use Neural Networks with recursive weights and structure – fractals to be truncated at different depths, dependent upon the size of the gameboard. |
Tasks | |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2002.03896v1 |
https://arxiv.org/pdf/2002.03896v1.pdf | |
PWC | https://paperswithcode.com/paper/using-fractal-neural-networks-to-play-simcity |
Repo | |
Framework | |