January 30, 2020

Paper Group ANR 377

Influences in Forecast Errors for Wind and Photovoltaic Power: A Study on Machine Learning Models

Title Influences in Forecast Errors for Wind and Photovoltaic Power: A Study on Machine Learning Models
Authors Jens Schreiber, Artjom Buschin, Bernhard Sick
Abstract Despite the increasing importance of forecasts of renewable energy, current planning studies only address a general estimate of the forecast quality to be expected and selected forecast horizons. However, these estimates allow only a limited and highly uncertain use in the planning of electric power distribution. More reliable planning processes require considerably more information about future forecast quality. In this article, we present an in-depth analysis and comparison of influencing factors regarding uncertainty in wind and photovoltaic power forecasts, based on four different machine learning (ML) models. In our analysis, we found substantial differences in uncertainty depending on ML models, data coverage, and seasonal patterns that have to be considered in future planning studies.
Tasks
Published 2019-05-31
URL https://arxiv.org/abs/1905.13668v1
PDF https://arxiv.org/pdf/1905.13668v1.pdf
PWC https://paperswithcode.com/paper/influences-in-forecast-errors-for-wind-and
Repo
Framework

Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection

Title Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection
Authors Hongkai Zhang, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
Abstract Recent research attempts to improve detection performance by adopting the idea of cascade for single-stage detectors. In this paper, we analyze and discover that inconsistency is the major factor limiting the performance. The refined anchors are associated with the feature extracted from the previous location and the classifier is confused by misaligned classification and localization. Further, we point out two main design rules for cascade detectors: improving consistency between classification confidence and localization performance, and maintaining feature consistency between different stages. A multistage object detector named Cas-RetinaNet is then proposed for reducing the misalignments. It consists of sequential stages trained with increasing IoU thresholds for improving the correlation, and a novel Feature Consistency Module for mitigating the feature inconsistency. Experiments show that our proposed Cas-RetinaNet achieves stable performance gains across different models and input scales. Specifically, our method improves RetinaNet from 39.1 AP to 41.1 AP on the challenging MS COCO dataset without any bells or whistles.
Tasks Object Detection
Published 2019-07-16
URL https://arxiv.org/abs/1907.06881v1
PDF https://arxiv.org/pdf/1907.06881v1.pdf
PWC https://paperswithcode.com/paper/cascade-retinanet-maintaining-consistency-for
Repo
Framework
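
The idea of training sequential stages with increasing IoU thresholds can be illustrated with a small sketch. This is not the authors' implementation: the box format, the threshold values (0.5, 0.6), and the helper names iou and assign_labels are assumptions chosen for illustration. It only shows how the same anchors receive stricter positive/negative labels at a later stage.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(anchors, gt_box, threshold):
    """Label an anchor positive (1) when its IoU with the ground truth
    reaches the stage's threshold, otherwise negative (0)."""
    return [1 if iou(a, gt_box) >= threshold else 0 for a in anchors]


# Hypothetical thresholds: a later stage demands better localization.
STAGE_THRESHOLDS = (0.5, 0.6)

gt = (0.0, 0.0, 10.0, 10.0)
anchors = [
    (0.0, 0.0, 10.0, 10.0),   # perfect overlap
    (3.0, 0.0, 13.0, 10.0),   # moderate overlap
    (5.0, 0.0, 15.0, 10.0),   # poor overlap
]
labels_per_stage = [assign_labels(anchors, gt, t) for t in STAGE_THRESHOLDS]
```

In this toy setup the moderately overlapping anchor counts as a positive at the first stage but as a negative at the second, which is the mechanism that ties classification confidence more closely to localization quality.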

Diversity in Fashion Recommendation using Semantic Parsing

Title Diversity in Fashion Recommendation using Semantic Parsing
Authors Sagar Verma, Sukhad Anand, Chetan Arora, Atul Rai
Abstract Developing a recommendation system for fashion images is challenging due to the inherent ambiguity associated with what criterion a user is looking at. Suggesting multiple images where each output image is similar to the query image on the basis of a different feature or part is one way to mitigate the problem. Existing works for fashion recommendation have used Siamese or Triplet networks to learn features between a similar pair and a similar-dissimilar triplet respectively. However, these methods do not provide basic information such as how two clothing images are similar, or which parts present in the two images make them similar. In this paper, we propose to recommend images by explicitly learning and exploiting part-based similarity. We propose a novel approach of learning discriminative features from weakly-supervised data by using visual attention over the parts and a texture encoding network. We show that the learned features surpass the state of the art in the retrieval task on the DeepFashion dataset. We then use the proposed model to recommend fashion images having an explicit variation with respect to similarity of any of the parts.
Tasks Semantic Parsing
Published 2019-10-18
URL https://arxiv.org/abs/1910.08292v1
PDF https://arxiv.org/pdf/1910.08292v1.pdf
PWC https://paperswithcode.com/paper/diversity-in-fashion-recommendation-using
Repo
Framework

Anti-Confusing: Region-Aware Network for Human Pose Estimation

Title Anti-Confusing: Region-Aware Network for Human Pose Estimation
Authors Xuan Cao, Yanhao Ge, Ying Tai, Wei Zhang, Jian Li, Chengjie Wang, Jilin Li, Feiyue Huang
Abstract In this work, we propose a novel framework named Region-Aware Network (RANet), which learns to resist confusion in cases of heavy occlusion, nearby persons and symmetric appearance, for human pose estimation. Specifically, the proposed method addresses three key aspects: data augmentation, feature learning and prediction fusion. First, we propose Parsing-based Data Augmentation (PDA) to generate abundant data that synthesizes confusing textures. Second, we not only propose a Feature Pyramid Stem (FPS) to learn stronger low-level features in the lower stage, but also incorporate an Effective Region Extraction (ERE) module to excavate better target-specific features. Third, we introduce Cascade Voting Fusion (CVF) to explicitly exclude the inferior predictions and fuse the remaining effective predictions for the final pose estimation. Extensive experimental results on two popular benchmarks, i.e. MPII and LSP, demonstrate the effectiveness of our method against the state-of-the-art competitors. Especially on easily-confusable joints, our method achieves significant improvement.
Tasks Data Augmentation, Pose Estimation
Published 2019-05-03
URL https://arxiv.org/abs/1905.00996v2
PDF https://arxiv.org/pdf/1905.00996v2.pdf
PWC https://paperswithcode.com/paper/anti-confusing-region-aware-network-for-human
Repo
Framework

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Title Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Authors Masatoshi Uehara, Jiawei Huang, Nan Jiang
Abstract We provide theoretical investigations into off-policy evaluation in reinforcement learning using function approximators for (marginalized) importance weights and value functions. Our contributions include: (1) A new estimator, MWL, that directly estimates importance ratios over the state-action distributions, removing the reliance on knowledge of the behavior policy as in prior work (Liu et al., 2018). (2) Another new estimator, MQL, obtained by swapping the roles of importance weights and value-functions in MWL. MQL has an intuitive interpretation of minimizing average Bellman errors and can be combined with MWL in a doubly robust manner. (3) Several additional results that offer further insights into these methods, including the sample complexity analyses of MWL and MQL, their asymptotic optimality in the tabular setting, how the learned importance weights depend on the choice of the discriminator class, and how our methods provide a unified view of some old and new algorithms in RL.
Tasks
Published 2019-10-28
URL https://arxiv.org/abs/1910.12809v3
PDF https://arxiv.org/pdf/1910.12809v3.pdf
PWC https://paperswithcode.com/paper/minimax-weight-and-q-function-learning-for
Repo
Framework

Improving Transformer Models by Reordering their Sublayers

Title Improving Transformer Models by Reordering their Sublayers
Authors Ofir Press, Noah A. Smith, Omer Levy
Abstract Multilayer transformer networks consist of interleaved self-attention and feedforward sublayers. Could ordering the sublayers in a different pattern achieve better performance? We generate randomly ordered transformers and train them with the language modeling objective. We observe that some of these models are able to achieve better performance than the interleaved baseline, and that those successful variants tend to have more self-attention at the bottom and more feedforward sublayers at the top. We propose a new transformer design pattern that adheres to this property, the sandwich transformer, and show that it improves perplexity on the WikiText-103 language modeling benchmark, at no cost in parameters, memory, or training time.
Tasks Language Modelling
Published 2019-11-10
URL https://arxiv.org/abs/1911.03864v1
PDF https://arxiv.org/pdf/1911.03864v1.pdf
PWC https://paperswithcode.com/paper/improving-transformer-models-by-reordering
Repo
Framework
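
The reordering idea can be made concrete with a short sketch that writes a model as a string of sublayers, with "s" for self-attention and "f" for feedforward. The parameterization below (n interleaved pairs, of which k are pulled apart so that k extra self-attention sublayers sit at the bottom and k extra feedforward sublayers at the top) is an assumption that is consistent with the abstract; the function names are hypothetical.

```python
def interleaved(n):
    """Baseline ordering: n (self-attention, feedforward) pairs."""
    return "sf" * n


def sandwich(n, k):
    """Sandwich ordering with the same sublayer counts as interleaved(n):
    k self-attention sublayers first, n - k interleaved pairs in the
    middle, and k feedforward sublayers last."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return "s" * k + "sf" * (n - k) + "f" * k


baseline = interleaved(4)
reordered = sandwich(4, 2)
```

Because the number of "s" and "f" sublayers is unchanged, the reordered model keeps the same parameter count as the baseline, which matches the claim of no cost in parameters.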

Controllable Sentence Simplification: Employing Syntactic and Lexical Constraints

Title Controllable Sentence Simplification: Employing Syntactic and Lexical Constraints
Authors Jonathan Mallinson, Mirella Lapata
Abstract Sentence simplification aims to make sentences easier to read and understand. Recent approaches have shown promising results with sequence-to-sequence models which have been developed assuming homogeneous target audiences. In this paper we argue that different users have different simplification needs (e.g. dyslexics vs. non-native speakers), and propose CROSS, a ContROllable Sentence Simplification model, which allows control over both the level of simplicity and the type of simplification. We achieve this by enriching a Transformer-based architecture with syntactic and lexical constraints (which can be set or learned from data). Empirical results on two benchmark datasets show that constraints are key to successful simplification, offering flexible generation output.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04387v1
PDF https://arxiv.org/pdf/1910.04387v1.pdf
PWC https://paperswithcode.com/paper/controllable-sentence-simplification-1
Repo
Framework

Bayesian Strategies for Likelihood Ratio Computation in Forensic Voice Comparison with Automatic Systems

Title Bayesian Strategies for Likelihood Ratio Computation in Forensic Voice Comparison with Automatic Systems
Authors Daniel Ramos, Juan Maroñas, Alicia Lozano-Diez
Abstract This paper explores several strategies for Forensic Voice Comparison (FVC), aimed at improving the performance of the LRs when using generative Gaussian score-to-LR models. First, different anchoring strategies are proposed, with the objective of adapting the LR computation process to the case at hand, always respecting the propositions defined for the particular case. Second, a fully-Bayesian Gaussian model is used to tackle the sparsity in the training scores that is often present when the proposed anchoring strategies are used. Experiments are performed using the 2014 i-Vector challenge set-up, which presents high variability in a telephone speech context. The results show that the proposed fully-Bayesian model clearly outperforms a more common Maximum-Likelihood approach, leading to high robustness when the scores to train the model become sparse.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08315v1
PDF https://arxiv.org/pdf/1909.08315v1.pdf
PWC https://paperswithcode.com/paper/bayesian-strategies-for-likelihood-ratio
Repo
Framework
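
For readers unfamiliar with generative Gaussian score-to-LR models, the sketch below shows the maximum-likelihood baseline mentioned in the abstract, not the fully Bayesian model or the anchoring strategies. It fits one Gaussian to same-speaker scores and one to different-speaker scores and takes the ratio of the two densities at a new score. The function names and the toy scores are made up for illustration.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(scores):
    """Maximum-likelihood mean and variance (variance divides by N)."""
    mu = mean(scores)
    return mu, pvariance(scores, mu)


def log_gaussian(x, mu, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def log_lr(score, same_scores, diff_scores):
    """Log likelihood ratio: same-speaker vs. different-speaker model."""
    mu_s, var_s = fit_gaussian(same_scores)
    mu_d, var_d = fit_gaussian(diff_scores)
    return log_gaussian(score, mu_s, var_s) - log_gaussian(score, mu_d, var_d)


# Toy training scores (illustrative values only).
same = [2.0, 3.0, 4.0]
diff = [-1.0, 0.0, 1.0]
high = log_lr(3.0, same, diff)   # supports the same-speaker proposition
low = log_lr(0.0, same, diff)    # supports the different-speaker proposition
```

With only a handful of training scores, as happens when anchoring restricts the data, these point estimates become unreliable, which is the sparsity problem the fully Bayesian model is meant to address.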

Mexican Hat Wavelet Kernel ELM for Multiclass Classification

Title Mexican Hat Wavelet Kernel ELM for Multiclass Classification
Authors Jie Wang, Yi-Fan Song, Tian-Lei Ma
Abstract Kernel extreme learning machine (KELM) is a novel feedforward neural network, which is widely used in classification problems. To some extent, it solves the existing problems of the invalid nodes and the large computational complexity in ELM. However, the traditional KELM classifier usually has a low test accuracy when it faces multiclass classification problems. In order to solve the above problem, a new classifier, Mexican Hat wavelet KELM classifier, is proposed in this paper. The proposed classifier successfully improves the training accuracy and reduces the training time in the multiclass classification problems. Moreover, the validity of the Mexican Hat wavelet as a kernel function of ELM is rigorously proved. Experimental results on different data sets show that the performance of the proposed classifier is significantly superior to the compared classifiers.
Tasks
Published 2019-02-20
URL http://arxiv.org/abs/1902.07422v1
PDF http://arxiv.org/pdf/1902.07422v1.pdf
PWC https://paperswithcode.com/paper/mexican-hat-wavelet-kernel-elm-for-multiclass
Repo
Framework
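
As a rough illustration of what a Mexican Hat wavelet kernel looks like, the sketch below uses the common translation-invariant product form built from the mother wavelet (1 - u^2) exp(-u^2 / 2), with the normalization constant dropped. The exact kernel and hyperparameters used in the paper may differ; the dilation parameter a and the function names are assumptions.

```python
import math


def mexican_hat_kernel(x, y, a=1.0):
    """Product over dimensions of the Mexican Hat wavelet evaluated at
    the scaled coordinate difference u = (x_i - y_i) / a."""
    k = 1.0
    for xi, yi in zip(x, y):
        u = (xi - yi) / a
        k *= (1.0 - u * u) * math.exp(-0.5 * u * u)
    return k


def kernel_matrix(samples, a=1.0):
    """Gram matrix that a kernel ELM would use in place of a random
    hidden layer."""
    return [[mexican_hat_kernel(s, t, a) for t in samples] for s in samples]


points = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
gram = kernel_matrix(points)
```

The kernel equals 1 for identical inputs and crosses zero when a single coordinate differs by exactly a, so a controls how quickly similarity falls off.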

Fast geodesic shooting for landmark matching using CUDA

Title Fast geodesic shooting for landmark matching using CUDA
Authors Jiancong Wang
Abstract Landmark matching via geodesic shooting is a prerequisite task for numerous registration-based applications in biomedicine. Geodesic shooting has been developed as one solution approach and formulates the diffeomorphic registration as an optimal control problem under the Hamiltonian framework. In this framework, with landmark positions q0 fixed, the problem solely depends on the initial momentum p0 and evolves through time steps according to a set of constraint equations. Given an initial p0, the algorithm flows q and p forward through time steps, calculates a loss based on point-set mismatch and kinetic energy, back-propagates through time to calculate the gradient with respect to p0, and updates it. In the forward and backward pass, a pair-wise kernel on landmark points K and additional intermediate terms have to be calculated and marginalized, leading to O(N^2) computational complexity, N being the number of points to be registered. For medical image applications, N may be in the range of thousands, rendering this operation computationally expensive. In this work we propose a CUDA implementation based on shared memory reduction. Our implementation achieves a speedup of nearly two orders of magnitude compared to a naive CPU-based implementation, in addition to improved numerical accuracy as well as better registration results.
Tasks
Published 2019-07-10
URL https://arxiv.org/abs/1907.04839v1
PDF https://arxiv.org/pdf/1907.04839v1.pdf
PWC https://paperswithcode.com/paper/fast-geodesic-shooting-for-landmark-matching
Repo
Framework
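
The forward pass described in the abstract can be sketched on the CPU in a few lines, which also makes the O(N^2) cost visible as a double loop over landmarks. This is a naive illustration and not the paper's CUDA implementation. It assumes a Gaussian kernel with width sigma, the Hamiltonian H = 1/2 * sum_ij (p_i . p_j) K(q_i, q_j), and forward Euler time stepping; all function names are hypothetical.

```python
import math


def gaussian_kernel(qi, qj, sigma):
    """Pairwise kernel K(q_i, q_j) = exp(-|q_i - q_j|^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(qi, qj))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def euler_step(q, p, sigma, dt):
    """One forward Euler step of Hamilton's equations:
    dq_i/dt = sum_j K_ij p_j
    dp_i/dt = sum_j (p_i . p_j) K_ij (q_i - q_j) / sigma^2
    """
    n, dim = len(q), len(q[0])
    dq = [[0.0] * dim for _ in range(n)]
    dp = [[0.0] * dim for _ in range(n)]
    for i in range(n):          # the O(N^2) pairwise loop
        for j in range(n):
            k = gaussian_kernel(q[i], q[j], sigma)
            pp = sum(a * b for a, b in zip(p[i], p[j]))
            for d in range(dim):
                dq[i][d] += k * p[j][d]
                dp[i][d] += pp * k * (q[i][d] - q[j][d]) / sigma ** 2
    q_new = [[q[i][d] + dt * dq[i][d] for d in range(dim)] for i in range(n)]
    p_new = [[p[i][d] + dt * dp[i][d] for d in range(dim)] for i in range(n)]
    return q_new, p_new


def shoot(q0, p0, sigma=1.0, steps=10):
    """Flow landmarks q and momenta p from t = 0 to t = 1."""
    q, p = [list(x) for x in q0], [list(x) for x in p0]
    dt = 1.0 / steps
    for _ in range(steps):
        q, p = euler_step(q, p, sigma, dt)
    return q, p


q1, p1 = shoot([[0.0, 0.0], [1.0, 0.0]], [[1.0, 0.5], [-0.5, 1.0]])
```

The inner sums over j are reductions across all landmarks, which is the part a shared-memory reduction on the GPU would parallelize.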

Conditional WGANs with Adaptive Gradient Balancing for Sparse MRI Reconstruction

Title Conditional WGANs with Adaptive Gradient Balancing for Sparse MRI Reconstruction
Authors Itzik Malkiel, Sangtae Ahn, Valentina Taviani, Anne Menini, Lior Wolf, Christopher J. Hardy
Abstract Recent sparse MRI reconstruction models have used Deep Neural Networks (DNNs) to reconstruct relatively high-quality images from highly undersampled k-space data, enabling much faster MRI scanning. However, these techniques sometimes struggle to reconstruct sharp images that preserve fine detail while maintaining a natural appearance. In this work, we enhance the image quality by using a Conditional Wasserstein Generative Adversarial Network combined with a novel Adaptive Gradient Balancing technique that stabilizes the training and minimizes the degree of artifacts, while maintaining a high-quality reconstruction that produces sharper images than other techniques.
Tasks
Published 2019-05-02
URL https://arxiv.org/abs/1905.00985v1
PDF https://arxiv.org/pdf/1905.00985v1.pdf
PWC https://paperswithcode.com/paper/conditional-wgans-with-adaptive-gradient
Repo
Framework

Effectiveness of Adversarial Examples and Defenses for Malware Classification

Title Effectiveness of Adversarial Examples and Defenses for Malware Classification
Authors Robert Podschwadt, Hassan Takabi
Abstract Artificial neural networks have been successfully used for many different classification tasks including malware detection and distinguishing between malicious and non-malicious programs. Although artificial neural networks perform very well on these tasks, they are also vulnerable to adversarial examples. An adversarial example is a sample that has minor modifications made to it so that the neural network misclassifies it. Many techniques have been proposed, both for crafting adversarial examples and for hardening neural networks against them. Most previous work has been done in the image domain. Some of the attacks have been adapted to work in the malware domain, which typically deals with binary feature vectors. In order to better understand the space of adversarial examples in malware classification, we study different approaches of crafting adversarial examples and defense techniques in the malware domain and compare their effectiveness on multiple datasets.
Tasks Malware Classification, Malware Detection
Published 2019-09-10
URL https://arxiv.org/abs/1909.04778v1
PDF https://arxiv.org/pdf/1909.04778v1.pdf
PWC https://paperswithcode.com/paper/effectiveness-of-adversarial-examples-and
Repo
Framework
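
To make the notion of an adversarial example on binary feature vectors concrete, here is a toy sketch. It is not one of the attacks evaluated in the paper. It assumes a linear classifier that labels a sample malicious when its score is positive, and an attacker who may only switch features from 0 to 1 so that existing functionality is kept; both the classifier and the greedy strategy are illustrative assumptions, as are the weights.

```python
def score(weights, bias, x):
    """Linear decision score; positive means 'malicious' in this toy model."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def greedy_add_features(weights, bias, x, budget):
    """Set up to `budget` absent features to 1, most negative weight
    first, stopping as soon as the sample is no longer flagged."""
    x_adv = list(x)
    candidates = sorted(
        (i for i in range(len(x)) if x[i] == 0 and weights[i] < 0),
        key=lambda i: weights[i],
    )
    for i in candidates[:budget]:
        if score(weights, bias, x_adv) <= 0:
            break
        x_adv[i] = 1
    return x_adv


w = [2.0, -1.5, 1.0, -1.0]   # hypothetical model weights
b = -1.0
sample = [1, 0, 1, 0]        # flagged as malicious: score is 2.0
adversarial = greedy_add_features(w, b, sample, budget=2)
```

Two added features are enough to push the score of this sample below zero while every originally present feature stays set.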

RNNbow: Visualizing Learning via Backpropagation Gradients in Recurrent Neural Networks

Title RNNbow: Visualizing Learning via Backpropagation Gradients in Recurrent Neural Networks
Authors Dylan Cashman, Genevieve Patterson, Abigail Mosca, Nathan Watts, Shannon Robinson, Remco Chang
Abstract We present RNNbow, an interactive tool for visualizing the gradient flow during backpropagation training in recurrent neural networks. RNNbow is a web application that displays the relative gradient contributions from Recurrent Neural Network (RNN) cells in a neighborhood of an element of a sequence. We describe the calculation of backpropagation through time (BPTT) that keeps track of itemized gradients, or gradient contributions from one element of a sequence to previous elements of a sequence. By visualizing the gradient, as opposed to activations, RNNbow offers insight into how the network is learning. We use it to explore the learning of an RNN that is trained to generate code in the C programming language. We show how it uncovers insights into the vanishing gradient as well as the evolution of training as the RNN works its way through a corpus.
Tasks
Published 2019-07-29
URL https://arxiv.org/abs/1907.12545v1
PDF https://arxiv.org/pdf/1907.12545v1.pdf
PWC https://paperswithcode.com/paper/rnnbow-visualizing-learning-via
Repo
Framework

Mixture separability loss in a deep convolutional network for image classification

Title Mixture separability loss in a deep convolutional network for image classification
Authors Trung Dung Do, Cheng-Bin Jin, Hakil Kim, Van Huan Nguyen
Abstract In machine learning, the cost function is crucial because it measures how good or bad a system is. In image classification, well-known networks only consider modifying the network structures and applying cross-entropy loss at the end of the network. However, using only cross-entropy loss causes a network to stop updating weights when all training images are correctly classified. This is the problem of early saturation. This paper proposes a novel cost function, called mixture separability loss (MSL), which updates the weights of the network even when most of the training images are accurately predicted. MSL consists of between-class and within-class loss. Between-class loss maximizes the differences between inter-class images, whereas within-class loss minimizes the similarities between intra-class images. We designed the proposed loss function to attach to different convolutional layers in the network in order to utilize intermediate feature maps. Experiments show that a network with MSL deepens the learning process and obtains promising results with some public datasets, such as Street View House Number (SVHN), Canadian Institute for Advanced Research (CIFAR), and our self-collected Inha Computer Vision Lab (ICVL) gender dataset.
Tasks Image Classification
Published 2019-06-16
URL https://arxiv.org/abs/1906.06633v1
PDF https://arxiv.org/pdf/1906.06633v1.pdf
PWC https://paperswithcode.com/paper/mixture-separability-loss-in-a-deep
Repo
Framework

Multi-domain CT metal artifacts reduction using partial convolution based inpainting

Title Multi-domain CT metal artifacts reduction using partial convolution based inpainting
Authors Artem Pimkin, Alexander Samoylenko, Natalia Antipina, Anna Ovechkina, Andrey Golanov, Alexandra Dalechina, Mikhail Belyaev
Abstract Recent CT Metal Artifacts Reduction (MAR) methods are often based on image-to-image convolutional neural networks for adjustment of corrupted sinograms or images themselves. In this paper, we explore the capabilities of a multi-domain method which consists of both sinogram correction (projection-domain step) and restored image correction (image-domain step). Moreover, we propose a formulation of the first-step problem as sinogram inpainting, which allows us to use methods of this specific field such as partial convolutions. The proposed method achieves a state-of-the-art improvement (-75% MSE) over a classic benchmark, Li-MAR.
Tasks
Published 2019-11-13
URL https://arxiv.org/abs/1911.05530v1
PDF https://arxiv.org/pdf/1911.05530v1.pdf
PWC https://paperswithcode.com/paper/multi-domain-ct-metal-artifacts-reduction
Repo
Framework
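
Partial convolution, the inpainting building block named in the abstract, can be sketched in one dimension. The version below follows the commonly used formulation: only valid (unmasked) inputs contribute, the result is rescaled by window size over the number of valid inputs, and the mask is updated so that any window with at least one valid input becomes valid. It is a simplified illustration, not the network used in the paper; in the sinogram setting the mask would mark detector readings not corrupted by metal, and the function name is hypothetical.

```python
def partial_conv1d(signal, mask, weights, bias=0.0):
    """1-D partial convolution with stride 1 and no padding.

    mask[i] is 1 for a valid sample and 0 for a hole. Returns the
    output signal and the updated mask."""
    k = len(weights)
    out, new_mask = [], []
    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        m = mask[start:start + k]
        valid = sum(m)
        if valid > 0:
            acc = sum(w * x * mi for w, x, mi in zip(weights, window, m))
            out.append(acc * (k / valid) + bias)  # renormalize by valid count
            new_mask.append(1)
        else:
            out.append(0.0)                       # nothing valid to use
            new_mask.append(0)
    return out, new_mask


values = [2.0, 4.0, 6.0, 8.0]
holes = [1, 0, 0, 1]             # the two middle samples are corrupted
filtered, updated = partial_conv1d(values, holes, [0.5, 0.5])
```

With an averaging filter, each output that sees one valid sample reproduces that sample instead of being dragged toward the corrupted values, and stacking such layers shrinks the hole in the mask step by step.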