April 1, 2020

Paper Group ANR 483

Geometry and Topology of Deep Neural Networks’ Decision Boundaries. Performance Analysis of Combine Harvester using Hybrid Model of Artificial Neural Networks Particle Swarm Optimization. Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization. Generalized sampling with functional principal components for high-resolution rando …

Geometry and Topology of Deep Neural Networks’ Decision Boundaries


Title	Geometry and Topology of Deep Neural Networks’ Decision Boundaries
Authors	Bo Liu
Abstract	Geometry and topology of decision regions are closely related to classification performance and robustness against adversarial attacks. In this paper, we use differential geometry and topology to explore theoretically the geometrical and topological properties of decision regions produced by deep neural networks (DNNs). The goals are to obtain some geometrical and topological properties of decision regions for given DNN models, and to provide some principled guidance for designing and regularizing DNNs. First, we give the curvatures of decision boundaries in terms of network weights. Based on the rotation index theorem and the Gauss-Bonnet-Chern theorem, we then propose methods to identify the closeness and connectivity of given decision boundaries, and obtain the Euler characteristics of closed ones, all without the need to solve decision boundaries explicitly. Finally, we give necessary conditions on network architectures in order to produce closed decision boundaries, and sufficient conditions on network weights for producing zero curvature (flat or developable) decision boundaries.
Tasks
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03687v1
PDF	https://arxiv.org/pdf/2003.03687v1.pdf
PWC	https://paperswithcode.com/paper/geometry-and-topology-of-deep-neural-networks
Repo
Framework

Performance Analysis of Combine Harvester using Hybrid Model of Artificial Neural Networks Particle Swarm Optimization


Title	Performance Analysis of Combine Harvester using Hybrid Model of Artificial Neural Networks Particle Swarm Optimization
Authors	Laszlo Nadai, Felde Imre, Sina Ardabili, Tarahom Mesri Gundoshmian, Pinter Gergo, Amir Mosavi
Abstract	Novel applications of artificial intelligence for tuning the parameters of industrial machines for optimal performance are emerging at a fast pace. Tuning combine harvesters and improving machine performance can dramatically reduce waste during harvesting, and it also benefits machine maintenance. The literature includes several soft computing, machine learning and optimization methods that have been used to model the function of harvesters of various crops. Due to the complexity of the problem, machine learning methods have recently been proposed to predict the optimal performance, with promising results. In this paper, by proposing a novel hybrid machine learning model based on artificial neural networks integrated with particle swarm optimization (ANN-PSO), the performance analysis of a common combine harvester is presented. The hybridization of machine learning methods with soft computing techniques has recently shown promising results in improving the performance of combine harvesters. This research aims at improving the results further by providing more stable models with higher accuracy.
Tasks
Published	2020-02-22
URL	https://arxiv.org/abs/2002.11041v1
PDF	https://arxiv.org/pdf/2002.11041v1.pdf
PWC	https://paperswithcode.com/paper/performance-analysis-of-combine-harvester
Repo
Framework
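
The abstract above names particle swarm optimization (PSO) as one half of the hybrid model but does not describe the update rule. As a rough illustration only, here is a minimal sketch of a generic global-best PSO minimizing a toy objective. It is not the authors' ANN-PSO model: the function name `pso_minimize`, the coefficient values, the bounds, and the `sphere` test objective are all assumptions made for this example.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic global-best PSO (illustrative; all defaults are assumptions)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy objective with its minimum at (1, 1, ...); purely hypothetical."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_f)
```

In a hybrid such as the one the abstract describes, the objective would presumably be a network's training or validation error as a function of its weights or hyperparameters, but that coupling is not specified in the abstract and is not shown here.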

Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization


Title	Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization
Authors	Nan Jiang, Jiawei Huang
Abstract	We study minimax methods for off-policy evaluation (OPE) using value-functions and marginalized importance weights. Although they hold promise for overcoming the exponential variance in traditional importance sampling, several key problems remain: (1) They require function approximation and are generally biased. For the sake of trustworthy OPE, is there any way to quantify the biases? (2) They are split into two styles (“weight-learning” vs “value-learning”). Can we unify them? In this paper we answer both questions positively. By slightly altering the derivation of previous methods (one from each style; Uehara et al., 2019), we unify them into a single confidence interval (CI) that automatically comes with a special type of double robustness: when either the value-function or importance weight class is well-specified, the CI is valid and its length quantifies the misspecification of the other class. We can also tell which class is misspecified, which provides useful diagnostic information for the design of function approximation. Our CI also provides a unified view of and new insights into some recent methods: for example, one side of the CI recovers a version of AlgaeDICE (Nachum et al., 2019b), and we show that the two sides need to be used together and either alone may incur doubled approximation error as a point estimate. We further examine the potential of applying these bounds to two long-standing problems: off-policy policy optimization with poor data coverage (i.e., exploitation), and systematic exploration. With a well-specified value-function class, we show that optimizing the lower and the upper bounds leads to effective exploitation and exploration, respectively. Our results also suggest an interesting asymmetry between exploration and exploitation, that the former might require substantially weaker realizability assumptions than the latter.
Tasks	Efficient Exploration
Published	2020-02-06
URL	https://arxiv.org/abs/2002.02081v2
PDF	https://arxiv.org/pdf/2002.02081v2.pdf
PWC	https://paperswithcode.com/paper/minimax-confidence-interval-for-off-policy
Repo
Framework
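
For context on the "traditional importance sampling" baseline the abstract contrasts itself with, the following is a minimal sketch of the ordinary trajectory-wise importance sampling estimator. It is not the paper's minimax confidence interval. The function name, the trajectory format (lists of `(state, action, reward)` tuples), and the two toy policies are assumptions introduced for illustration.

```python
def trajectory_is_estimate(trajectories, target_policy, behavior_policy, gamma=1.0):
    """Ordinary importance sampling estimate of the target policy's value.

    Each trajectory is a list of (state, action, reward) tuples collected
    under behavior_policy. Both policies map (state, action) to a probability.
    """
    total = 0.0
    for traj in trajectories:
        weight = 1.0      # product of per-step probability ratios
        ret = 0.0         # discounted return of this trajectory
        discount = 1.0
        for state, action, reward in traj:
            weight *= target_policy(state, action) / behavior_policy(state, action)
            ret += discount * reward
            discount *= gamma
        total += weight * ret
    return total / len(trajectories)


# Hypothetical one-step problem: a single state with two actions.
def behavior(state, action):
    return 0.5                              # uniform over both actions


def target(state, action):
    return 0.9 if action == 1 else 0.1      # prefers action 1


# Two logged one-step trajectories; action 1 yields reward 1, action 0 yields 0.
logged = [[(0, 0, 0.0)], [(0, 1, 1.0)]]
estimate = trajectory_is_estimate(logged, target, behavior)
print(estimate)
```

Because the weight is a product of one ratio per time step, its variance can grow exponentially with the horizon, which is the limitation that marginalized importance weights are meant to address.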

Generalized sampling with functional principal components for high-resolution random field estimation


Title	Generalized sampling with functional principal components for high-resolution random field estimation
Authors	Milana Gataric
Abstract	In this paper, we take a statistical approach to the problem of recovering a function from low-resolution measurements taken with respect to an arbitrary basis, by regarding the function of interest as a realization of a random field. We introduce an infinite-dimensional framework for high-resolution estimation of a random field from its low-resolution indirect measurements as well as the high-resolution measurements of training observations by merging the existing frameworks of generalized sampling and functional principal component analysis. We study the statistical performance of the resulting estimation procedure and show that high-resolution recovery is indeed possible provided appropriate low-rank and angle conditions hold and provided the training set is sufficiently large relative to the desired resolution. We also consider sparse representations of the principal components, which can reduce the required size of the training set. Furthermore, the effectiveness of the proposed procedure is investigated in various numerical examples.
Tasks
Published	2020-02-20
URL	https://arxiv.org/abs/2002.08724v1
PDF	https://arxiv.org/pdf/2002.08724v1.pdf
PWC	https://paperswithcode.com/paper/generalized-sampling-with-functional
Repo
Framework

Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based Object Re-Identification


Title	Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based Object Re-Identification
Authors	Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
Abstract	Object re-identification (re-id) aims to identify a specific object across times or camera views, with the person re-id and vehicle re-id as the most widely studied applications. Re-id is challenging because of the variations in viewpoints, (human) poses, and occlusions. Multi-shots of the same object can cover diverse viewpoints/poses and thus provide more comprehensive information. In this paper, we propose exploiting the multi-shots of the same identity to guide the feature learning of each individual image. Specifically, we design an Uncertainty-aware Multi-shot Teacher-Student (UMTS) Network. It consists of a teacher network (T-net) that learns the comprehensive features from multiple images of the same object, and a student network (S-net) that takes a single image as input. In particular, we take into account the data dependent heteroscedastic uncertainty for effectively transferring the knowledge from the T-net to S-net. To the best of our knowledge, we are the first to make use of multi-shots of an object in a teacher-student learning manner for effectively boosting the single image based re-id. We validate the effectiveness of our approach on the popular vehicle re-id and person re-id datasets. In inference, the S-net alone significantly outperforms the baselines and achieves the state-of-the-art performance.
Tasks
Published	2020-01-15
URL	https://arxiv.org/abs/2001.05197v2
PDF	https://arxiv.org/pdf/2001.05197v2.pdf
PWC	https://paperswithcode.com/paper/uncertainty-aware-multi-shot-knowledge
Repo
Framework

Boosting Ridge Regression for High Dimensional Data Classification


Title	Boosting Ridge Regression for High Dimensional Data Classification
Authors	Jakramate Bootkrajang
Abstract	Ridge regression is a well-established regression estimator which can conveniently be adapted for classification problems. One compelling reason is probably the fact that ridge regression admits a closed-form solution, thereby facilitating the training phase. However, in the case of high-dimensional problems, the closed-form solution, which involves inverting the regularised covariance matrix, is rather expensive to compute. The high computational demand of this operation also makes it difficult to construct an ensemble of ridge regressors. In this paper, we consider learning an ensemble of ridge regressors where each regressor is trained in its own randomly projected subspace. Subspace regressors are later combined via adaptive boosting methodology. Experiments based on five high-dimensional classification problems demonstrated the effectiveness of the proposed method in terms of learning time, and in some cases improved predictive performance can be observed.
Tasks
Published	2020-03-25
URL	https://arxiv.org/abs/2003.11283v1
PDF	https://arxiv.org/pdf/2003.11283v1.pdf
PWC	https://paperswithcode.com/paper/boosting-ridge-regression-for-high
Repo
Framework

Optimal Disturbance Attenuation Approach with Measurement Feedback to Missile Guidance


Title	Optimal Disturbance Attenuation Approach with Measurement Feedback to Missile Guidance
Authors	Barak Or, Joseph Z. Ben-Asher, Isaac Yaesh
Abstract	Pursuit-evasion differential games using the Disturbance Attenuation approach are revisited. Under this approach, the pursuer actions are considered to be control actions, whereas all external actions, such as target maneuvers, measurement errors and initial position uncertainties, are considered to be disturbances. Two open issues have been addressed, namely the effect of noise on the control gains, and the effect of trajectory shaping on the solution. These issues are closely related to the question of the best choice for the disturbance attenuation ratio. Detailed analyses are performed for two simple pursuit-evasion cases: a Simple Boat Guidance Problem and Missile Guidance Engagement.
Tasks
Published	2020-01-10
URL	https://arxiv.org/abs/2001.04308v1
PDF	https://arxiv.org/pdf/2001.04308v1.pdf
PWC	https://paperswithcode.com/paper/optimal-disturbance-attenuation-approach-with
Repo
Framework

Deep Learning for Learning Graph Representations


Title	Deep Learning for Learning Graph Representations
Authors	Wenwu Zhu, Xin Wang, Peng Cui
Abstract	Mining graph data has become a popular research topic in computer science and has been widely studied in both academia and industry given the increasing amount of network data in recent years. However, the huge amount of network data has posed great challenges for efficient analysis. This motivates the advent of graph representation, which maps the graph into a low-dimensional vector space, keeping the original graph structure and supporting graph inference. The investigation of efficient graph representations has profound theoretical significance and important practical meaning. We therefore introduce some basic ideas in graph representation/network embedding as well as some representative models in this chapter.
Tasks	Network Embedding
Published	2020-01-02
URL	https://arxiv.org/abs/2001.00293v1
PDF	https://arxiv.org/pdf/2001.00293v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-learning-graph
Repo
Framework

TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation


Title	TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation
Authors	Wen Wang, Xiaojiang Peng, Yanzhou Su, Yu Qiao, Jian Cheng
Abstract	Video action anticipation aims to predict future action categories from observed frames. Current state-of-the-art approaches mainly resort to recurrent neural networks to encode history information into hidden states, and predict future actions from the hidden representations. It is well known that the recurrent pipeline is inefficient in capturing long-term information, which may limit its performance in prediction tasks. To address this problem, this paper proposes a simple yet efficient Temporal Transformer with Progressive Prediction (TTPP) framework, which repurposes a Transformer-style architecture to aggregate observed features, and then leverages a light-weight network to progressively predict future features and actions. Specifically, predicted features along with predicted probabilities are accumulated into the inputs of subsequent prediction. We evaluate our approach on three action datasets, namely TVSeries, THUMOS-14, and TV-Human-Interaction. We also conduct a comprehensive study of several popular aggregation and prediction strategies. Extensive results show that TTPP not only outperforms the state-of-the-art methods but is also more efficient.
Tasks
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03530v1
PDF	https://arxiv.org/pdf/2003.03530v1.pdf
PWC	https://paperswithcode.com/paper/ttpp-temporal-transformer-with-progressive
Repo
Framework

Problems with Shapley-value-based explanations as feature importance measures


Title	Problems with Shapley-value-based explanations as feature importance measures
Authors	I. Elizabeth Kumar, Suresh Venkatasubramanian, Carlos Scheidegger, Sorelle Friedler
Abstract	Game-theoretic formulations of feature importance have become popular as a way to “explain” machine learning models. These methods define a cooperative game between the features of a model and distribute influence among these input elements using some form of the game’s unique Shapley values. Justification for these methods rests on two pillars: their desirable mathematical properties, and their applicability to specific motivations for explanations. We show that mathematical problems arise when Shapley values are used for feature importance and that the solutions to mitigate these necessarily induce further complexity, such as the need for causal reasoning. We also draw on additional literature to argue that Shapley values do not provide explanations which suit human-centric goals of explainability.
Tasks	Feature Importance
Published	2020-02-25
URL	https://arxiv.org/abs/2002.11097v1
PDF	https://arxiv.org/pdf/2002.11097v1.pdf
PWC	https://paperswithcode.com/paper/problems-with-shapley-value-based
Repo
Framework
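
To make the cooperative-game framing in the abstract above concrete, here is a minimal sketch that computes exact Shapley values by averaging each player's marginal contribution over all orderings. The three "features" and the coalition values in `toy_values` are entirely hypothetical and do not come from the paper or from any real model. Exact enumeration is factorial in the number of players, so this is only practical for very small games.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every ordering of the players.

    `value` maps a frozenset of players to the worth of that coalition.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical game: feature "c" never changes the worth of any coalition.
toy_values = {
    frozenset(): 0.0,
    frozenset("a"): 10.0,
    frozenset("b"): 20.0,
    frozenset("c"): 0.0,
    frozenset("ab"): 50.0,
    frozenset("ac"): 10.0,
    frozenset("bc"): 20.0,
    frozenset("abc"): 50.0,
}

phi = shapley_values(["a", "b", "c"], toy_values.__getitem__)
print(phi)
```

The attributions sum to the worth of the full coalition, and the feature that never contributes receives zero. How the coalition worth is defined for a machine learning model (for example, what happens to features left out of a coalition) is exactly the kind of modelling choice the abstract says leads to mathematical problems; this sketch sidesteps it by using a lookup table.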

Probabilistic K-means Clustering via Nonlinear Programming


Title	Probabilistic K-means Clustering via Nonlinear Programming
Authors	Yujian Li, Bowen Liu, Zhaoying Liu, Ting Zhang
Abstract	K-means is a classical clustering algorithm with wide applications. However, soft K-means, or fuzzy c-means at m=1, has remained unsolved since 1981. To address this challenging open problem, we propose a novel clustering model, i.e. Probabilistic K-Means (PKM), which is also a nonlinear programming model constrained on linear equalities and linear inequalities. In theory, we can solve the model by active gradient projection, albeit inefficiently. Thus, we further propose maximum-step active gradient projection and fast maximum-step active gradient projection to solve it more efficiently. Through experiments, we evaluate the performance of PKM and how well the proposed methods solve it in five aspects: initialization robustness, clustering performance, descending stability, iteration number, and convergence speed.
Tasks
Published	2020-01-10
URL	https://arxiv.org/abs/2001.03286v1
PDF	https://arxiv.org/pdf/2001.03286v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-k-means-clustering-via
Repo
Framework

Across-scale Process Similarity based Interpolation for Image Super-Resolution


Title	Across-scale Process Similarity based Interpolation for Image Super-Resolution
Authors	Sobhan Kanti Dhara, Debashis Sen
Abstract	A pivotal step in image super-resolution techniques is interpolation, which aims at generating high resolution images without introducing artifacts such as blurring and ringing. In this paper, we propose a technique that performs interpolation through an infusion of high frequency signal components computed by exploiting ‘process similarity’. By ‘process similarity’, we refer to the resemblance between a decomposition of the image at one resolution and the decomposition of the image at another resolution. In our approach, the decompositions generating image details and approximations are obtained through the discrete wavelet (DWT) and stationary wavelet (SWT) transforms. The complementary nature of DWT and SWT is leveraged to get the structural relation between the input image and its low resolution approximation. The structural relation is represented by optimal model parameters obtained through particle swarm optimization (PSO). Owing to process similarity, these parameters are used to generate the high resolution output image from the input image. The proposed approach is compared with six existing techniques qualitatively and in terms of PSNR, SSIM, and FSIM measures, along with computation time (CPU time). It is found that our approach is the fastest in terms of CPU time and produces comparable results.
Tasks	Image Super-Resolution, Super-Resolution
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09182v1
PDF	https://arxiv.org/pdf/2003.09182v1.pdf
PWC	https://paperswithcode.com/paper/across-scale-process-similarity-based
Repo
Framework

Knowledge Graph Alignment using String Edit Distance


Title	Knowledge Graph Alignment using String Edit Distance
Authors	Navdeep Kaur, Gautam Kunapuli, Sriraam Natarajan
Abstract	In this work, we propose a novel knowledge graph alignment technique based upon string edit distance that exploits the type information between entities and can find similarity between relations of any arity.
Tasks
Published	2020-03-13
URL	https://arxiv.org/abs/2003.12145v2
PDF	https://arxiv.org/pdf/2003.12145v2.pdf
PWC	https://paperswithcode.com/paper/knowledge-graph-alignment-using-string-edit
Repo
Framework
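
The abstract above builds on string edit distance without defining it. As background, the following is a minimal sketch of the classic Levenshtein distance (unit-cost insertions, deletions, and substitutions) between two plain strings. It is not the paper's alignment technique, which according to the abstract operates on type information between entities; the function name and the example strings are assumptions chosen for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b, using two DP rows."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))   # 3
print(edit_distance("born_in", "bornIn"))   # 2
```

The function only compares elements for equality, so it also accepts lists of tokens or type labels in place of strings, which is one plausible way such a distance could be applied to typed relations.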

MIM-Based Generative Adversarial Networks and Its Application on Anomaly Detection


Title	MIM-Based Generative Adversarial Networks and Its Application on Anomaly Detection
Authors	Rui She, Pingyi Fan
Abstract	In Generative Adversarial Networks (GANs), the information metric used to discriminate between generated data and real data is key to generation efficiency, which plays an important role in GAN-based applications, especially in anomaly detection. In the original GAN, the information metric based on Kullback-Leibler (KL) divergence has limitations in rare-event generation and in training performance for adversarial networks. Therefore, it is important to investigate the metrics used in GANs to improve the generation ability as well as bring gains in the training process. In this paper, we adopt the exponential form, derived from the Message Importance Measure (MIM), to replace the logarithm form of the original GAN. This approach, named MIM-based GAN, has dominant performance in the training process and in rare-event generation. Specifically, we first discuss the characteristics of the training process in this approach. Moreover, we also analyze its advantages in generating rare events in theory. In addition, we run simulations on the MNIST and ODDS datasets and find that the MIM-based GAN achieves state-of-the-art performance on anomaly detection compared with some classical GANs.
Tasks	Anomaly Detection
Published	2020-03-25
URL	https://arxiv.org/abs/2003.11285v1
PDF	https://arxiv.org/pdf/2003.11285v1.pdf
PWC	https://paperswithcode.com/paper/mim-based-generative-adversarial-networks-and
Repo
Framework

Modeling Musical Onset Probabilities via Neural Distribution Learning


Title	Modeling Musical Onset Probabilities via Neural Distribution Learning
Authors	Jaesung Huh, Egil Martinsson, Adrian Kim, Jung-Woo Ha
Abstract	Musical onset detection can be formulated as a time-to-event (TTE) or time-since-event (TSE) prediction task by defining music as a sequence of onset events. Here we propose a novel method to model the probability of onsets by introducing a sequential density prediction model. The proposed model estimates TTE & TSE distributions from mel-spectrograms using convolutional neural networks (CNNs) as a density predictor. We evaluate our model on the Bock dataset, showing comparable results to previous deep-learning models.
Tasks
Published	2020-02-10
URL	https://arxiv.org/abs/2002.03559v1
PDF	https://arxiv.org/pdf/2002.03559v1.pdf
PWC	https://paperswithcode.com/paper/modeling-musical-onset-probabilities-via
Repo
Framework