April 3, 2020

Paper Group ANR 4

Action Quality Assessment using Siamese Network-Based Deep Metric Learning

Title Action Quality Assessment using Siamese Network-Based Deep Metric Learning
Authors Hiteshi Jain, Gaurav Harit, Avinash Sharma
Abstract Automated vision-based score estimation models can be used as an alternate opinion to avoid judgment bias. In past work, score estimation models were learned by regressing the video representations to the ground truth score provided by the judges. However, such regression-based solutions lack interpretability in terms of giving reasons for the awarded score. One solution to make the scores more explicable is to compare the given action video with a reference video. This would capture the temporal variations w.r.t. the reference video and map those variations to the final score. In this work, we propose a new action scoring system as a two-phase system: (1) A Deep Metric Learning Module that learns similarity between any two action videos based on their ground truth scores given by the judges; (2) A Score Estimation Module that uses the first module to find the resemblance of a video to a reference video in order to give the assessment score. The proposed scoring model has been tested for Olympics Diving and Gymnastic vaults and the model outperforms the existing state-of-the-art scoring models.
Tasks Metric Learning
Published 2020-02-27
URL https://arxiv.org/abs/2002.12096v1
PDF https://arxiv.org/pdf/2002.12096v1.pdf
PWC https://paperswithcode.com/paper/action-quality-assessment-using-siamese

Machines Learn Appearance Bias in Face Recognition

Title Machines Learn Appearance Bias in Face Recognition
Authors Ryan Steed, Aylin Caliskan
Abstract We seek to determine whether state-of-the-art, black box face recognition techniques can learn first-impression appearance bias from human annotations. With FaceNet, a popular face recognition architecture, we train a transfer learning model on human subjects’ first impressions of personality traits in other faces. We measure the extent to which this appearance bias is embedded and benchmark learning performance for six different perceived traits. In particular, we find that our model is better at judging a person’s dominance based on their face than other traits like trustworthiness or likeability, even for emotionally neutral faces. We also find that our model tends to predict emotions for deliberately manipulated faces with higher accuracy than for randomly generated faces, just like a human subject. Our results lend insight into the manner in which appearance biases may be propagated by standard face recognition models.
Tasks Face Recognition, Transfer Learning
Published 2020-02-13
URL https://arxiv.org/abs/2002.05636v1
PDF https://arxiv.org/pdf/2002.05636v1.pdf
PWC https://paperswithcode.com/paper/machines-learn-appearance-bias-in-face

Kinetic Theory for Residual Neural Networks

Title Kinetic Theory for Residual Neural Networks
Authors M. Herty, T. Trimborn, G. Visconti
Abstract Deep residual neural networks (ResNet) are performing very well for many data science applications. We use kinetic theory to improve understanding and existing methods. A microscopic simplified residual neural network (SimResNet) model is studied as the limit of infinitely many inputs. This leads to kinetic formulations of the SimResNet and we analyze those with respect to sensitivities and steady states. Aggregation phenomena in the case of a linear activation function are also studied. In addition, the analysis is validated by numerics. In particular, results on a clustering and regression problem are presented.
Published 2020-01-07
URL https://arxiv.org/abs/2001.04294v1
PDF https://arxiv.org/pdf/2001.04294v1.pdf
PWC https://paperswithcode.com/paper/kinetic-theory-for-residual-neural-networks

Understanding the Power and Limitations of Teaching with Imperfect Knowledge

Title Understanding the Power and Limitations of Teaching with Imperfect Knowledge
Authors Rati Devidze, Farnam Mansouri, Luis Haug, Yuxin Chen, Adish Singla
Abstract Machine teaching studies the interaction between a teacher and a student/learner where the teacher selects training examples for the learner to learn a specific task. The typical assumption is that the teacher has perfect knowledge of the task; this knowledge comprises knowing the desired learning target, having the exact task representation used by the learner, and knowing the parameters capturing the learning dynamics of the learner. Inspired by real-world applications of machine teaching in education, we consider the setting where the teacher’s knowledge is limited and noisy, and the key research question we study is the following: When does a teacher succeed or fail in effectively teaching a learner using its imperfect knowledge? We answer this question by showing connections to how imperfect knowledge affects the teacher’s solution of the corresponding machine teaching problem when constructing optimal teaching sets. Our results have important implications for designing robust teaching algorithms for real-world applications.
Published 2020-03-21
URL https://arxiv.org/abs/2003.09712v1
PDF https://arxiv.org/pdf/2003.09712v1.pdf
PWC https://paperswithcode.com/paper/understanding-the-power-and-limitations-of

Local intrinsic dimensionality estimators based on concentration of measure

Title Local intrinsic dimensionality estimators based on concentration of measure
Authors Jonathan Bac, Andrei Zinovyev
Abstract Intrinsic dimensionality (ID) is one of the most fundamental characteristics of multi-dimensional data point clouds. Knowing ID is crucial to choose the appropriate machine learning approach as well as to understand its behavior and validate it. ID can be computed globally for the whole data distribution, or computed locally in different regions of the dataset. In this paper, we introduce new local estimators of ID based on linear separability of multi-dimensional data point clouds, which is one of the manifestations of concentration of measure. We empirically study the properties of these estimators and compare them with other recently introduced ID estimators exploiting various effects of measure concentration. Observed differences between estimators can be used to anticipate their behaviour in practical applications.
Published 2020-01-31
URL https://arxiv.org/abs/2001.11739v2
PDF https://arxiv.org/pdf/2001.11739v2.pdf
PWC https://paperswithcode.com/paper/local-intrinsic-dimensionality-estimators

On Coresets for Support Vector Machines

Title On Coresets for Support Vector Machines
Authors Murad Tukan, Cenk Baykal, Dan Feldman, Daniela Rus
Abstract We present an efficient coreset construction algorithm for large-scale Support Vector Machine (SVM) training in Big Data and streaming applications. A coreset is a small, representative subset of the original data points such that models trained on the coreset are provably competitive with those trained on the original data set. Since the size of the coreset is generally much smaller than the original set, our preprocess-then-train scheme has the potential to lead to significant speedups when training SVM models. We prove lower and upper bounds on the size of the coreset required to obtain small data summaries for the SVM problem. As a corollary, we show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings. We evaluate the performance of our algorithm on real-world and synthetic data sets. Our experimental results reaffirm the favorable theoretical properties of our algorithm and demonstrate its practical effectiveness in accelerating SVM training.
Published 2020-02-15
URL https://arxiv.org/abs/2002.06469v1
PDF https://arxiv.org/pdf/2002.06469v1.pdf
PWC https://paperswithcode.com/paper/on-coresets-for-support-vector-machines

Molecule Property Prediction and Classification with Graph Hypernetworks

Title Molecule Property Prediction and Classification with Graph Hypernetworks
Authors Eliya Nachmani, Lior Wolf
Abstract Graph neural networks are currently leading the performance charts in learning-based molecule property prediction and classification. Computational chemistry has, therefore, become a prominent testbed for generic graph neural networks, as well as for specialized message passing methods. In this work, we demonstrate that the replacement of the underlying networks with hypernetworks leads to a boost in performance, obtaining state-of-the-art results in various benchmarks. A major difficulty in the application of hypernetworks is their lack of stability. We tackle this by combining the current message and the first message. A recent work has tackled the training instability of hypernetworks in the context of error correcting codes, by replacing the activation function of the message passing network with a low-order Taylor approximation of it. We demonstrate that our generic solution can replace this domain-specific solution.
Published 2020-02-01
URL https://arxiv.org/abs/2002.00240v1
PDF https://arxiv.org/pdf/2002.00240v1.pdf
PWC https://paperswithcode.com/paper/molecule-property-prediction-and-1

Weight mechanism adding a constant in concatenation of series connect

Title Weight mechanism adding a constant in concatenation of series connect
Authors Xiaojie Qi, Yindi Zhao
Abstract It is a consensus that feature maps in the shallow layer are more related to image attributes such as texture and shape, whereas abstract semantic representation exists in the deep layer. Meanwhile, some image information will be lost in the process of the convolution operation. Naturally, the direct method is to combine them to regain the lost detailed information through concatenation or addition. In fact, the image representation carried into feature fusion cannot match the semantic representation completely, and the semantic deviation between different layers also hampers information purification, which leads to useless information being mixed into the fusion layers. Therefore, it is crucial to narrow the gap among the fused layers and reduce the impact of noise during fusion. In this paper, we propose a method named weight mechanism to reduce the gap between feature maps in concatenation of series connection, and we obtain a 0.80% mIoU improvement on the Massachusetts building dataset by changing the weight of the concatenation of series connection in residual U-Net. Specifically, we design a new architecture named fused U-Net to test the weight mechanism, and it also gains a 0.12% mIoU improvement.
Published 2020-03-07
URL https://arxiv.org/abs/2003.03500v1
PDF https://arxiv.org/pdf/2003.03500v1.pdf
PWC https://paperswithcode.com/paper/weight-mechanism-adding-a-constant-in

A new hybrid approach for crude oil price forecasting: Evidence from multi-scale data

Title A new hybrid approach for crude oil price forecasting: Evidence from multi-scale data
Authors Yang Yifan, Guo Ju’e, Sun Shaolong, Li Yixin
Abstract With growing research into the factors that influence crude oil price fluctuations, and following the accelerated development of Internet technology, accessible data such as the Google search volume index (GSVI) are increasingly quantified and incorporated into forecasting approaches. In this paper, we apply multi-scale data, including both GSVI data and traditional economic data related to crude oil price, as independent variables and propose a new hybrid approach for monthly crude oil price forecasting. This hybrid approach, based on a divide and conquer strategy, consists of the K-means method, kernel principal component analysis (KPCA) and kernel extreme learning machine (KELM), where the K-means method is adopted to divide input data into certain clusters, KPCA is applied to reduce dimension, and KELM is employed for final crude oil price forecasting. The empirical result can be analyzed at the data and method levels. At the data level, GSVI data perform better than economic data in level forecasting accuracy but worse in directional forecasting accuracy because of herd behavior, while hybrid data combine their advantages and obtain the best forecasting performance in both level and directional accuracy. At the method level, the approaches with K-means perform better than those without K-means, which demonstrates that the divide and conquer strategy can effectively improve the forecasting performance.
Published 2020-02-22
URL https://arxiv.org/abs/2002.09656v1
PDF https://arxiv.org/pdf/2002.09656v1.pdf
PWC https://paperswithcode.com/paper/a-new-hybrid-approach-for-crude-oil-price

Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture

Title Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Authors Haoran Miao, Gaofeng Cheng, Changfeng Gao, Pengyuan Zhang, Yonghong Yan
Abstract Recently, Transformer has gained success in the automatic speech recognition (ASR) field. However, it is challenging to deploy a Transformer-based end-to-end (E2E) model for online speech recognition. In this paper, we propose the Transformer-based online CTC/attention E2E ASR architecture, which contains the chunk self-attention encoder (chunk-SAE) and the monotonic truncated attention (MTA) based self-attention decoder (SAD). Firstly, the chunk-SAE splits the speech into isolated chunks. To reduce the computational cost and improve the performance, we propose the state reuse chunk-SAE. Secondly, the MTA based SAD truncates the speech features monotonically and performs attention on the truncated features. To support online recognition, we integrate the state reuse chunk-SAE and the MTA based SAD into the online CTC/attention architecture. We evaluate the proposed online models on the HKUST Mandarin ASR benchmark and achieve a 23.66% character error rate (CER) with a 320 ms latency. Our online model yields as little as 0.19% absolute CER degradation compared with the offline baseline, and achieves significant improvement over our prior work on Long Short-Term Memory (LSTM) based online E2E models.
Tasks End-To-End Speech Recognition, Speech Recognition
Published 2020-01-15
URL https://arxiv.org/abs/2001.08290v2
PDF https://arxiv.org/pdf/2001.08290v2.pdf
PWC https://paperswithcode.com/paper/transformer-based-online-ctcattention-end-to
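
The abstract above reports results as character error rate (CER). As a reading aid, here is a minimal sketch of how CER is commonly computed: the Levenshtein edit distance between reference and hypothesis, counted over characters and divided by the reference length. The function names are hypothetical and this is not the authors' evaluation code; real toolkits may also normalize text before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute cost 1)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER = character-level edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `character_error_rate("abcd", "abxd")` is one substitution over four reference characters.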

Second-order Conditional Gradients

Title Second-order Conditional Gradients
Authors Alejandro Carderera, Sebastian Pokutta
Abstract Constrained second-order convex optimization algorithms are the method of choice when a high accuracy solution to a problem is needed, due to the quadratic convergence rates these methods enjoy when close to the optimum. These algorithms require the solution of a constrained quadratic subproblem at every iteration. In the case where the feasible region can only be accessed efficiently through a linear optimization oracle, and computing first-order information about the function, although possible, is costly, the coupling of constrained second-order and conditional gradient algorithms leads to competitive algorithms with solid theoretical guarantees and good numerical performance.
Published 2020-02-20
URL https://arxiv.org/abs/2002.08907v1
PDF https://arxiv.org/pdf/2002.08907v1.pdf
PWC https://paperswithcode.com/paper/second-order-conditional-gradients
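
For readers unfamiliar with conditional gradient methods, the sketch below shows the classical first-order Frank-Wolfe iteration over the probability simplex, where the feasible region is touched only through a linear optimization oracle. It is not the second-order algorithm of the paper; the function name, the choice of the simplex, and the step size 2/(t+2) are illustrative assumptions.

```python
def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Classical Frank-Wolfe on the probability simplex.

    grad: callable returning the gradient of a smooth convex function at x.
    x0:   starting point inside the simplex (non-negative, sums to 1).
    """
    x = list(x0)
    for t in range(num_iters):
        g = grad(x)
        # Linear optimization oracle: over the simplex, the minimizer of
        # <g, v> is the vertex at the smallest gradient coordinate.
        i = min(range(len(x)), key=lambda k: g[k])
        gamma = 2.0 / (t + 2.0)  # standard open-loop step size
        # Convex combination of the current iterate and vertex e_i.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i] += gamma
    return x


# Usage: minimize 0.5 * ||x - b||^2 for a target b that lies in the simplex.
b = [0.2, 0.3, 0.5]
solution = frank_wolfe_simplex(lambda x: [xi - bi for xi, bi in zip(x, b)],
                               [1.0, 0.0, 0.0], num_iters=2000)
```

Every iterate stays feasible because it is a convex combination of simplex points, so no projection step is needed.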

Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index

Title Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index
Authors Camila Fernandez, Chung Shue Chen, Pierre Gaillard, Alonso Silva
Abstract In this paper, we make an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen’s additive regression model), parametric (Weibull AFT model), and machine learning models (Random Survival Forest, Gradient Boosting with Cox Proportional Hazards Loss, DeepSurv) through the concordance index on two different datasets (PBC and GBCSG2). We present two comparisons: one with the default hyper-parameters of these models and one with the best hyper-parameters found by randomized search.
Published 2020-03-13
URL https://arxiv.org/abs/2003.08820v1
PDF https://arxiv.org/pdf/2003.08820v1.pdf
PWC https://paperswithcode.com/paper/experimental-comparison-of-semi-parametric
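
The comparison above is made through the concordance index. A minimal sketch of Harrell's version follows, assuming higher risk scores should correspond to shorter survival times and that tied risk scores count as half concordant. The function name and argument layout are hypothetical, and library implementations may treat tied times differently.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    times:       observed time for each subject
    events:      1 (or True) if the event was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when the subject with the shorter
            # time actually experienced the event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means the risk ordering agrees with every comparable pair, 0.5 corresponds to random ordering, and 0.0 means the ordering is fully reversed.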

Evolutionary Image Transition and Painting Using Random Walks

Title Evolutionary Image Transition and Painting Using Random Walks
Authors Aneta Neumann, Bradley Alexander, Frank Neumann
Abstract We present a study demonstrating how random walk algorithms can be used for evolutionary image transition. We design different mutation operators based on uniform and biased random walks and study how their combination with a baseline mutation operator can lead to interesting image transition processes in terms of visual effects and artistic features. Using feature-based analysis we investigate the evolutionary image transition behaviour with respect to different features and evaluate the images constructed during the image transition process. Afterwards, we investigate how modifications of our biased random walk approaches can be used for evolutionary image painting. We introduce an evolutionary image painting approach whose underlying biased random walk can be controlled by a parameter influencing the bias of the random walk and thereby creating different artistic painting effects.
Published 2020-03-02
URL https://arxiv.org/abs/2003.01517v1
PDF https://arxiv.org/pdf/2003.01517v1.pdf
PWC https://paperswithcode.com/paper/evolutionary-image-transition-and-painting

Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities

Title Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities
Authors Jelena Diakonikolas
Abstract We leverage the connections between nonexpansive maps, monotone Lipschitz operators, and proximal mappings to obtain near-optimal (i.e., optimal up to poly-log factors in terms of iteration complexity) and parameter-free methods for solving monotone inclusion problems. These results immediately translate into near-optimal guarantees for approximating strong solutions to variational inequality problems, approximating convex-concave min-max optimization problems, and minimizing the norm of the gradient in min-max optimization problems. Our analysis is based on a novel and simple potential-based proof of convergence of Halpern iteration, a classical iteration for finding fixed points of nonexpansive maps. Additionally, we provide a series of algorithmic reductions that highlight connections between different problem classes and lead to lower bounds that certify near-optimality of the studied methods.
Published 2020-02-20
URL https://arxiv.org/abs/2002.08872v2
PDF https://arxiv.org/pdf/2002.08872v2.pdf
PWC https://paperswithcode.com/paper/halpern-iteration-for-near-optimal-and
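
As an illustration of the classical Halpern iteration mentioned in the abstract, the sketch below anchors every step to the starting point: x_{k+1} = λ x_0 + (1 - λ) T(x_k). The step weights λ = 1/(k+2) and the 90-degree rotation used as the nonexpansive map T are assumptions chosen for the demo, not details taken from the paper.

```python
import math


def halpern_iteration(T, x0, num_iters):
    """Halpern iteration for a nonexpansive map T, anchored at x0."""
    x = list(x0)
    for k in range(num_iters):
        lam = 1.0 / (k + 2)  # anchoring weight, decays toward zero
        tx = T(x)
        x = [lam * a + (1.0 - lam) * b for a, b in zip(x0, tx)]
    return x


def rotate_90(x):
    """Rotation of the plane by 90 degrees: nonexpansive, fixed point at the origin."""
    return [-x[1], x[0]]


# Repeatedly applying rotate_90 alone just cycles around the unit circle,
# while the anchored iteration is pulled toward the fixed point.
x = halpern_iteration(rotate_90, [1.0, 0.0], 1000)
distance_to_fixed_point = math.hypot(x[0], x[1])
```

The rotation is a useful test case because the plain fixed-point iteration x_{k+1} = T(x_k) never converges for it, which makes the effect of the anchoring term visible.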

Prediction of Bayesian Intervals for Tropical Storms

Title Prediction of Bayesian Intervals for Tropical Storms
Authors Max Chiswick, Sam Ganzfried
Abstract Building on recent research for prediction of hurricane trajectories using recurrent neural networks (RNNs), we have developed improved methods and generalized the approach to predict Bayesian intervals in addition to simple point estimates. Tropical storms are capable of causing severe damage, so accurately predicting their trajectories can bring significant benefits to cities and lives, especially as they grow more intense due to climate change effects. By implementing the Bayesian interval using dropout in an RNN, we improve the actionability of the predictions, for example by estimating the areas to evacuate in the landfall region. We used an RNN to predict the trajectory of the storms at 6-hour intervals. We used latitude, longitude, windspeed, and pressure features from a Statistical Hurricane Intensity Prediction Scheme (SHIPS) dataset of about 500 tropical storms in the Atlantic Ocean. Our results show how neural network dropout values affect predictions and intervals.
Published 2020-03-10
URL https://arxiv.org/abs/2003.05024v1
PDF https://arxiv.org/pdf/2003.05024v1.pdf
PWC https://paperswithcode.com/paper/prediction-of-bayesian-intervals-for-tropical
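
The idea of obtaining an interval from dropout can be sketched by keeping dropout active at prediction time, sampling many stochastic predictions, and taking empirical quantiles. The toy linear model below stands in for the RNN and is purely an assumption for illustration; the function names, weights, and features are hypothetical and this is not the authors' implementation.

```python
import math
import random


def dropout_predict(weights, features, drop_prob, rng):
    """One stochastic forward pass of a toy linear model with inverted dropout.

    drop_prob must be strictly less than 1.
    """
    keep_prob = 1.0 - drop_prob
    total = 0.0
    for w, x in zip(weights, features):
        if rng.random() < keep_prob:      # keep this input with prob. keep_prob
            total += w * x / keep_prob    # rescale so the expectation is unchanged
    return total


def monte_carlo_interval(predict_once, num_samples=1000, level=0.9):
    """Empirical central interval from repeated stochastic predictions."""
    samples = sorted(predict_once() for _ in range(num_samples))
    tail = (1.0 - level) / 2.0
    lower = samples[math.floor(tail * (num_samples - 1))]
    upper = samples[math.ceil((1.0 - tail) * (num_samples - 1))]
    return lower, upper


# Usage with hypothetical weights and features.
rng = random.Random(0)
weights, features = [0.5, -1.0, 2.0], [1.0, 2.0, 3.0]
lower, upper = monte_carlo_interval(
    lambda: dropout_predict(weights, features, 0.5, rng), num_samples=500)
```

With `drop_prob` set to 0 every pass is identical and the interval collapses to a point, which is one way to see how the dropout value controls the width of the interval.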