January 31, 2020

3627 words 18 mins read

Paper Group ANR 13

Translationese in Machine Translation Evaluation. A Bayesian/Information Theoretic Model of Bias Learning. Straight to Shapes++: Real-time Instance Segmentation Made More Accurate. Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects. SNAP: Finding Approximate Second-Order Stationary Solutions Efficien …

Translationese in Machine Translation Evaluation

Title Translationese in Machine Translation Evaluation
Authors Yvette Graham, Barry Haddow, Philipp Koehn
Abstract The term translationese has been used to describe the presence of unusual features in translated text. In this paper, we provide a detailed analysis of the adverse effects of translationese on machine translation evaluation results. Our analysis shows evidence of differences between text originally written in a given language and text translated into it, and that these differences can negatively impact the accuracy of machine translation evaluations. For this reason we recommend that reverse-created test data be omitted from future machine translation test sets. In addition, we provide a re-evaluation of a past high-profile machine translation evaluation claiming human parity of MT, as well as an analysis of the re-evaluations of it that have since appeared. We find potential ways of improving the reliability of all three past evaluations. One important issue not previously considered is the statistical power of the significance tests applied in past evaluations that aim to investigate human parity of MT. Since the very aim of such evaluations is to reveal legitimate ties between human and MT systems, power analysis is of particular importance: low power could result in claims of human parity that in fact simply correspond to Type II error. We therefore provide a detailed power analysis of the tests used in such evaluations to indicate a suitable minimum sample size of translations for such studies. Subsequently, since no past evaluation aiming to investigate claims of human parity ticks all boxes in terms of accuracy and reliability, we rerun the evaluation of the systems claiming human parity. Finally, we provide a comprehensive checklist for future machine translation evaluation.
Tasks Machine Translation
Published 2019-06-24
URL https://arxiv.org/abs/1906.09833v1
PDF https://arxiv.org/pdf/1906.09833v1.pdf
PWC https://paperswithcode.com/paper/translationese-in-machine-translation
Repo
Framework
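The power analysis the abstract calls for can be illustrated with the standard normal-approximation sample-size formula for a two-sample test. This is a generic sketch, not the paper's procedure; the function name and default values are illustrative.

```python
from statistics import NormalDist

def min_sample_size(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided two-sample z-test.

    effect_size is Cohen's d: the difference in mean human-evaluation
    scores between two systems, divided by the pooled standard deviation.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = nd.inv_cdf(power)           # quantile matching the target power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return int(n) + 1                    # round up to a whole sample size

# Near human parity the true effect is small, so large samples are needed.
print(min_sample_size(0.2))  # 393 translations per system
print(min_sample_size(0.5))  # 63
```

The formula makes the paper's concern concrete: halving the detectable effect size roughly quadruples the number of translations a human-parity study must score.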

A Bayesian/Information Theoretic Model of Bias Learning

Title A Bayesian/Information Theoretic Model of Bias Learning
Authors Jonathan Baxter
Abstract In this paper the problem of learning appropriate bias for an environment of related tasks is examined from a Bayesian perspective. The environment of related tasks is shown to be naturally modelled by the concept of an {\em objective} prior distribution. Sampling from the objective prior corresponds to sampling different learning tasks from the environment. It is argued that for many common machine learning problems, although we don’t know the true (objective) prior for the problem, we do have some idea of a set of possible priors to which the true prior belongs. It is shown that under these circumstances a learner can use Bayesian inference to learn the true prior by sampling from the objective prior. Bounds are given on the amount of information required to learn a task when it is simultaneously learnt with several other tasks. The bounds show that if the learner has little knowledge of the true prior, and the dimensionality of the true prior is small, then sampling multiple tasks is highly advantageous.
Tasks Bayesian Inference
Published 2019-11-14
URL https://arxiv.org/abs/1911.06129v1
PDF https://arxiv.org/pdf/1911.06129v1.pdf
PWC https://paperswithcode.com/paper/a-bayesianinformation-theoretic-model-of-bias
Repo
Framework
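The setup in the abstract, a learner holding a set of possible priors and updating beliefs as tasks are sampled from the objective prior, can be sketched with a discrete Bayesian update. The candidate set and task distribution below are invented for illustration; they are not from the paper.

```python
import numpy as np

def posterior_over_priors(candidates, observed_tasks):
    """Posterior over a finite set of candidate priors, given sampled tasks.

    candidates: list of arrays, each a distribution over task types.
    observed_tasks: indices of tasks drawn from the true (objective) prior.
    """
    log_lik = np.array([np.log(c)[observed_tasks].sum() for c in candidates])
    log_lik -= log_lik.max()          # stabilize before exponentiating
    post = np.exp(log_lik)
    return post / post.sum()

true_prior = np.array([0.7, 0.2, 0.1])
candidates = [true_prior, np.full(3, 1 / 3), np.array([0.1, 0.2, 0.7])]
tasks = np.random.default_rng(0).choice(3, size=100, p=true_prior)
post = posterior_over_priors(candidates, tasks)
# The posterior concentrates on the true prior as more tasks are observed,
# mirroring the paper's claim that sampling multiple tasks identifies the bias.
```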

Straight to Shapes++: Real-time Instance Segmentation Made More Accurate

Title Straight to Shapes++: Real-time Instance Segmentation Made More Accurate
Authors Laurynas Miksys, Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr
Abstract Instance segmentation is an important problem in computer vision, with applications in autonomous driving, drone navigation and robotic manipulation. However, most existing methods are not real-time, complicating their deployment in time-sensitive contexts. In this work, we extend an existing approach to real-time instance segmentation, called 'Straight to Shapes' (STS), which makes use of low-dimensional shape embedding spaces to directly regress to object shape masks. The STS model can run at 35 FPS on a high-end desktop, but its accuracy is significantly worse than that of offline state-of-the-art methods. We leverage recent advances in the design and training of deep instance segmentation models to improve the accuracy of the STS model whilst keeping its real-time capabilities intact. In particular, we find that parameter sharing, more aggressive data augmentation and the use of structured loss for shape mask prediction all provide a useful boost to the network performance. Our proposed approach, 'Straight to Shapes++', achieves a remarkable 19.7 point improvement in mAP (at IOU of 0.5) over the original method as evaluated on the PASCAL VOC dataset, thus redefining the accuracy frontier at real-time speeds. Since the accuracy of instance segmentation is closely tied to that of object bounding box prediction, we also study the error profile of the latter and examine the failure modes of our method for future improvements.
Tasks Autonomous Driving, Data Augmentation, Drone navigation, Instance Segmentation, Real-time Instance Segmentation, Semantic Segmentation
Published 2019-05-27
URL https://arxiv.org/abs/1905.11358v2
PDF https://arxiv.org/pdf/1905.11358v2.pdf
PWC https://paperswithcode.com/paper/straight-to-shapes-real-time-instance
Repo
Framework

Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects

Title Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects
Authors Gabriel Grand, Yonatan Belinkov
Abstract Visual question answering (VQA) models have been shown to over-rely on linguistic biases in VQA datasets, answering questions “blindly” without considering visual context. Adversarial regularization (AdvReg) aims to address this issue via an adversary sub-network that encourages the main model to learn a bias-free representation of the question. In this work, we investigate the strengths and shortcomings of AdvReg with the goal of better understanding how it affects inference in VQA models. Despite achieving a new state-of-the-art on VQA-CP, we find that AdvReg yields several undesirable side-effects, including unstable gradients and sharply reduced performance on in-domain examples. We demonstrate that gradual introduction of regularization during training helps to alleviate, but not completely solve, these issues. Through error analyses, we observe that AdvReg improves generalization to binary questions, but impairs performance on questions with heterogeneous answer distributions. Qualitatively, we also find that regularized models tend to over-rely on visual features, while ignoring important linguistic cues in the question. Our results suggest that AdvReg requires further refinement before it can be considered a viable bias mitigation technique for VQA.
Tasks Question Answering, Visual Question Answering
Published 2019-06-20
URL https://arxiv.org/abs/1906.08430v1
PDF https://arxiv.org/pdf/1906.08430v1.pdf
PWC https://paperswithcode.com/paper/adversarial-regularization-for-visual
Repo
Framework

SNAP: Finding Approximate Second-Order Stationary Solutions Efficiently for Non-convex Linearly Constrained Problems

Title SNAP: Finding Approximate Second-Order Stationary Solutions Efficiently for Non-convex Linearly Constrained Problems
Authors Songtao Lu, Meisam Razaviyayn, Bo Yang, Kejun Huang, Mingyi Hong
Abstract This paper proposes low-complexity algorithms for finding approximate second-order stationary points (SOSPs) of problems with smooth non-convex objectives and linear constraints. While finding (approximate) SOSPs is computationally intractable in general, we first show that generic instances of the problem can be solved efficiently. More specifically, for a generic problem instance, a certain strict complementarity (SC) condition holds for all Karush-Kuhn-Tucker (KKT) solutions (with probability one). The SC condition is then used to establish an equivalence relationship between two different notions of SOSPs, one of which is computationally easy to verify. Based on this particular notion of SOSP, we design an algorithm named the Successive Negative-curvature grAdient Projection (SNAP), which successively performs either conventional gradient projection or negative-curvature-based projection steps to find SOSPs. SNAP and its first-order extension SNAP$^+$ require $\mathcal{O}(1/\epsilon^{2.5})$ iterations to compute an $(\epsilon, \sqrt{\epsilon})$-SOSP, and their per-iteration computational complexities are polynomial in the number of constraints and the problem dimension. To our knowledge, this is the first time that first-order algorithms with polynomial per-iteration complexity and a global sublinear rate have been designed to find SOSPs of the important class of non-convex problems with linear constraints.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.04450v1
PDF https://arxiv.org/pdf/1907.04450v1.pdf
PWC https://paperswithcode.com/paper/snap-finding-approximate-second-order
Repo
Framework
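The core alternation described in the abstract, projected gradient steps interleaved with negative-curvature steps taken from the Hessian restricted to the feasible subspace, can be sketched on a toy problem. This is a schematic illustration only, not the authors' algorithm: step sizes, stopping rules, and the constraint handling are all simplified.

```python
import numpy as np

def snap_sketch(grad, hess, A, x0, steps=60, lr=0.1, tol=1e-8):
    """Toy SNAP-style iteration on the affine set {x : A x = A x0}."""
    _, _, Vt = np.linalg.svd(A)
    N = Vt[A.shape[0]:].T            # columns span the null space of A
    P = N @ N.T                      # orthogonal projector onto null(A)
    x = np.asarray(x0, float).copy()
    for _ in range(steps):
        g = P @ grad(x)
        if np.linalg.norm(g) > tol:
            x = x - lr * g           # projected gradient step
            continue
        H = N.T @ hess(x) @ N        # Hessian on the feasible subspace
        w, V = np.linalg.eigh(H)
        if w[0] >= -tol:             # approximate second-order stationarity
            break
        d = N @ V[:, 0]              # negative-curvature escape direction
        if d @ grad(x) > 0:
            d = -d                   # make it a descent direction
        x = x + lr * d
    return x

# Escaping a saddle of f(x) = x0^2 - x1^2 on the plane x0 + x1 + x2 = 0:
# the origin is first-order stationary but not second-order stationary.
A = np.array([[1.0, 1.0, 1.0]])
f = lambda x: x[0] ** 2 - x[1] ** 2
grad = lambda x: np.array([2 * x[0], -2 * x[1], 0.0])
hess = lambda x: np.diag([2.0, -2.0, 0.0])
x = snap_sketch(grad, hess, A, np.zeros(3))
```

At the origin the projected gradient vanishes, so only the negative-curvature step makes progress; the iterates stay feasible because every step lies in the null space of A.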

Comparison of Artificial Intelligence Techniques for Project Conceptual Cost Prediction

Title Comparison of Artificial Intelligence Techniques for Project Conceptual Cost Prediction
Authors Haytham H. Elmousalami
Abstract Developing a reliable parametric cost model at the conceptual stage of a project is crucial for project managers and decision-makers. Existing methods, such as probabilistic and statistical algorithms, have been developed for project cost prediction. However, these methods are unable to produce accurate results for conceptual cost prediction due to small and unstable data samples. Artificial intelligence (AI) and machine learning (ML) offer numerous models and algorithms for supervised regression applications. A comparative analysis of AI models is therefore required to guide practitioners to the appropriate model. This study investigates twenty AI techniques applied to cost modeling, such as the fuzzy logic (FL) model, artificial neural networks (ANNs), multiple regression analysis (MRA), case-based reasoning (CBR), hybrid models, and ensemble methods such as scalable boosting trees (XGBoost). Field canals improvement projects (FCIPs) are used as a real case study to analyze the performance of the applied ML models. Out of the 20 AI techniques, the results show that the most accurate and suitable method is XGBoost, with a Mean Absolute Percentage Error (MAPE) of 9.091% and an adjusted R² of 0.929. Nonlinear adaptability, handling of missing values and outliers, model interpretation, and uncertainty are discussed for the twenty developed AI models. Keywords: Artificial intelligence, Machine learning, ensemble methods, XGBoost, evolutionary fuzzy rules generation, Conceptual cost, and parametric cost model.
Tasks
Published 2019-08-08
URL https://arxiv.org/abs/1909.11637v1
PDF https://arxiv.org/pdf/1909.11637v1.pdf
PWC https://paperswithcode.com/paper/comparison-of-artificial-intelligence
Repo
Framework
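The two metrics the study reports are straightforward to compute; a minimal numpy sketch follows (the 9.091% and 0.929 figures come from the paper's own FCIP data, which is not reproduced here):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (y_true must be nonzero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def adjusted_r2(y_true, y_pred, n_features):
    """R^2 penalized for the number of predictors (needs n > n_features + 1)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    n = len(y_true)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
```

MAPE is scale-free, which is why it is a common choice for comparing cost models across projects of very different sizes; adjusted R² additionally penalizes models that use many predictors.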

A Quantum Field Theory of Representation Learning

Title A Quantum Field Theory of Representation Learning
Authors Robert Bamler, Stephan Mandt
Abstract Continuous symmetries and their breaking play a prominent role in contemporary physics. Effective low-energy field theories around symmetry breaking states explain diverse phenomena such as superconductivity, magnetism, and the mass of nucleons. We show that such field theories can also be a useful tool in machine learning, in particular for loss functions with continuous symmetries that are spontaneously broken by random initializations. In this paper, we illuminate our earlier published work (Bamler & Mandt, 2018) on this topic more from the perspective of theoretical physics. We show that the analogies between superconductivity and symmetry breaking in temporal representation learning are rather deep, allowing us to formulate a gauge theory of 'charged' embedding vectors in time series models. We show that making the loss function gauge invariant speeds up convergence in such models.
Tasks Representation Learning, Time Series
Published 2019-07-04
URL https://arxiv.org/abs/1907.02163v1
PDF https://arxiv.org/pdf/1907.02163v1.pdf
PWC https://paperswithcode.com/paper/a-quantum-field-theory-of-representation
Repo
Framework

Uniform error estimates for artificial neural network approximations for heat equations

Title Uniform error estimates for artificial neural network approximations for heat equations
Authors Lukas Gonon, Philipp Grohs, Arnulf Jentzen, David Kofler, David Šiška
Abstract Recently, artificial neural networks (ANNs) in conjunction with stochastic gradient descent optimization methods have been employed to approximately compute solutions of possibly rather high-dimensional partial differential equations (PDEs). Very recently, there have also been a number of rigorous mathematical results in the scientific literature which examine the approximation capabilities of such deep learning based approximation algorithms for PDEs. These mathematical results from the scientific literature prove in part that algorithms based on ANNs are capable of overcoming the curse of dimensionality in the numerical approximation of high-dimensional PDEs. In these mathematical results from the scientific literature usually the error between the solution of the PDE and the approximating ANN is measured in the $L^p$-sense with respect to some $p \in [1,\infty)$ and some probability measure. In many applications it is, however, also important to control the error in a uniform $L^\infty$-sense. The key contribution of the main result of this article is to develop the techniques to obtain error estimates between solutions of PDEs and approximating ANNs in the uniform $L^\infty$-sense. In particular, we prove that the number of parameters of an ANN to uniformly approximate the classical solution of the heat equation in a region $ [a,b]^d $ for a fixed time point $ T \in (0,\infty) $ grows at most polynomially in the dimension $ d \in \mathbb{N} $ and the reciprocal of the approximation precision $ \varepsilon > 0 $. This shows that ANNs can overcome the curse of dimensionality in the numerical approximation of the heat equation when the error is measured in the uniform $L^\infty$-norm.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.09647v2
PDF https://arxiv.org/pdf/1911.09647v2.pdf
PWC https://paperswithcode.com/paper/uniform-error-estimates-for-artificial-neural
Repo
Framework

Improving Visual Question Answering by Referring to Generated Paragraph Captions

Title Improving Visual Question Answering by Referring to Generated Paragraph Captions
Authors Hyounghun Kim, Mohit Bansal
Abstract Paragraph-style image captions describe diverse aspects of an image as opposed to the more common single-sentence captions that only provide an abstract description of the image. These paragraph captions can hence contain substantial information of the image for tasks such as visual question answering. Moreover, this textual information is complementary with visual information present in the image because it can discuss both more abstract concepts and more explicit, intermediate symbolic information about objects, events, and scenes that can directly be matched with the textual question and copied into the textual answer (i.e., via easier modality match). Hence, we propose a combined Visual and Textual Question Answering (VTQA) model which takes as input a paragraph caption as well as the corresponding image, and answers the given question based on both inputs. In our model, the inputs are fused to extract related information by cross-attention (early fusion), then fused again in the form of consensus (late fusion), and finally expected answers are given an extra score to enhance the chance of selection (later fusion). Empirical results show that paragraph captions, even when automatically generated (via an RL-based encoder-decoder model), help correctly answer more visual questions. Overall, our joint model, when trained on the Visual Genome dataset, significantly improves the VQA performance over a strong baseline model.
Tasks Image Captioning, Question Answering, Visual Question Answering
Published 2019-06-14
URL https://arxiv.org/abs/1906.06216v1
PDF https://arxiv.org/pdf/1906.06216v1.pdf
PWC https://paperswithcode.com/paper/improving-visual-question-answering-by
Repo
Framework

Gain with no Pain: Efficient Kernel-PCA by Nyström Sampling

Title Gain with no Pain: Efficient Kernel-PCA by Nyström Sampling
Authors Nicholas Sterge, Bharath Sriperumbudur, Lorenzo Rosasco, Alessandro Rudi
Abstract In this paper, we propose and study a Nyström-based approach to efficient large scale kernel principal component analysis (PCA). The latter is a natural nonlinear extension of classical PCA based on considering a nonlinear feature map or the corresponding kernel. Like other kernel approaches, kernel PCA enjoys good mathematical and statistical properties but, numerically, it scales poorly with the sample size. Our analysis shows that Nyström sampling greatly improves computational efficiency without incurring any loss of statistical accuracy. While similar effects have been observed in supervised learning, this is the first such result for PCA. Our theoretical findings, which are also illustrated by numerical results, are based on a combination of analytic and concentration of measure techniques. Our study is more broadly motivated by the question of understanding the interplay between statistical and computational requirements for learning.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05226v1
PDF https://arxiv.org/pdf/1907.05226v1.pdf
PWC https://paperswithcode.com/paper/gain-with-no-pain-efficient-kernel-pca-by
Repo
Framework
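The computational gain is easy to see in code: only an n × m kernel block against m landmark points is formed instead of the full n × n matrix. The numpy sketch below is a generic Nyström KPCA, not the authors' exact estimator; feature-space centering and the statistical analysis are omitted, and all names and parameters are illustrative.

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_kpca(X, n_components=2, n_landmarks=20, gamma=1.0, seed=0):
    """Approximate kernel PCA scores via Nyström sampling (sketch).

    Cost drops from O(n^2) kernel evaluations to O(n m) for m landmarks.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_landmarks, replace=False)
    K_nm = rbf(X, X[idx], gamma)           # n x m kernel block
    K_mm = rbf(X[idx], X[idx], gamma)      # m x m landmark kernel
    w, V = np.linalg.eigh(K_mm)
    w = np.clip(w, 1e-12, None)            # guard tiny/negative eigenvalues
    # Approximate feature map: phi(x) ≈ K(x, landmarks) V diag(w^{-1/2})
    Phi = K_nm @ V @ np.diag(1.0 / np.sqrt(w))
    Phi -= Phi.mean(0)                     # ordinary PCA on the features
    _, _, Vt = np.linalg.svd(Phi, full_matrices=False)
    return Phi @ Vt[:n_components].T
```

The paper's claim is that, for a suitable number of landmarks, the scores produced this way are statistically as accurate as those of exact kernel PCA.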

Memetic EDA-Based Approaches to Comprehensive Quality-Aware Automated Semantic Web Service Composition

Title Memetic EDA-Based Approaches to Comprehensive Quality-Aware Automated Semantic Web Service Composition
Authors Chen Wang, Hui Ma, Gang Chen, Sven Hartmann
Abstract Comprehensive quality-aware automated semantic web service composition is an NP-hard problem, where service composition workflows are unknown and comprehensive quality, i.e., quality of service (QoS) and quality of semantic matchmaking (QoSM), is simultaneously optimized. The objective of this problem is to find a solution with optimized or near-optimized overall QoS and QoSM for a service request within polynomial time. In this paper, we propose novel memetic EDA-based approaches to tackle this problem. The proposed method investigates the effectiveness of several neighborhood structures of composite services by introducing domain-dependent local search operators. Apart from that, a joint strategy for the local search procedure is proposed and integrated with a modified EDA to reduce the overall computation time of our memetic approach. To better demonstrate the effectiveness and scalability of our approach, we create a more challenging, augmented version of the service composition benchmark based on WSC-08 and WSC-09. Experimental results on this benchmark show that one of our proposed memetic EDA-based approaches (MEEDA-LOP) significantly outperforms existing state-of-the-art algorithms.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.07900v1
PDF https://arxiv.org/pdf/1906.07900v1.pdf
PWC https://paperswithcode.com/paper/memetic-eda-based-approaches-to-comprehensive
Repo
Framework
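The EDA backbone of such approaches, sample a population from a probabilistic model, select elites, shift the model toward them, can be sketched on a toy problem. OneMax stands in for the much harder composition objective, and the domain-dependent local search that makes the paper's method memetic is omitted; every name and parameter below is illustrative.

```python
import numpy as np

def eda_onemax(n_bits=20, pop=50, elite=10, iters=60, lr=0.3, seed=0):
    """Minimal univariate EDA (PBIL-style), shown on the OneMax toy problem."""
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                            # distribution model
    best = None
    for _ in range(iters):
        X = (rng.random((pop, n_bits)) < p).astype(int)  # sample population
        fitness = X.sum(1)                               # OneMax: count ones
        order = np.argsort(fitness)[::-1]
        elites = X[order[:elite]]
        p = (1 - lr) * p + lr * elites.mean(0)           # pull model to elites
        if best is None or fitness[order[0]] > best[1]:
            best = (X[order[0]], fitness[order[0]])
    return best

sol, fit = eda_onemax()
```

A memetic variant in the paper's spirit would apply local search operators to the elite individuals before the model update, trading extra per-iteration cost for fewer iterations.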

Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents

Title Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents
Authors Christian Rupprecht, Cyril Ibrahim, Christopher J. Pal
Abstract As deep reinforcement learning driven by visual perception becomes more widely used, there is a growing need to better understand and probe the learned agents. Understanding the decision-making process and its relationship to visual inputs can be very valuable for identifying problems in learned behavior. However, this topic has been relatively under-explored in the research community. In this work we present a method for synthesizing visual inputs of interest for a trained agent. Such inputs or states could be situations in which specific actions are necessary. Further, critical states, in which a very high or a very low reward can be achieved, are often interesting for understanding the situational awareness of the system, as they can correspond to risky situations. To this end, we learn a generative model over the state space of the environment and use its latent space to optimize a target function for the state of interest. In our experiments we show that this method can generate insights for a variety of environments and reinforcement learning methods. We explore results in the standard Atari benchmark games as well as in an autonomous driving simulator. Based on the efficiency with which we have been able to identify behavioural weaknesses with this technique, we believe this general approach could serve as an important tool for AI safety applications.
Tasks Autonomous Driving, Decision Making
Published 2019-04-02
URL http://arxiv.org/abs/1904.01318v1
PDF http://arxiv.org/pdf/1904.01318v1.pdf
PWC https://paperswithcode.com/paper/finding-and-visualizing-weaknesses-of-deep
Repo
Framework

Riemannian Motion Policy Fusion through Learnable Lyapunov Function Reshaping

Title Riemannian Motion Policy Fusion through Learnable Lyapunov Function Reshaping
Authors Mustafa Mukadam, Ching-An Cheng, Dieter Fox, Byron Boots, Nathan Ratliff
Abstract RMPflow is a recently proposed policy-fusion framework based on differential geometry. While RMPflow has demonstrated promising performance, it requires the user to provide sensible subtask policies as Riemannian motion policies (RMPs: a motion policy and an importance matrix function), which can be a difficult design problem in its own right. We propose RMPfusion, a variation of RMPflow, to address this issue. RMPfusion supplements RMPflow with weight functions that can hierarchically reshape the Lyapunov functions of the subtask RMPs according to the current configuration of the robot and environment. This extra flexibility can remedy imperfect subtask RMPs provided by the user, improving the combined policy’s performance. These weight functions can be learned by back-propagation. Moreover, we prove that, under mild restrictions on the weight functions, RMPfusion always yields a globally Lyapunov-stable motion policy. This implies that we can treat RMPfusion as a structured policy class in policy optimization that is guaranteed to generate stable policies, even during the immature phase of learning. We demonstrate these properties of RMPfusion in imitation learning experiments both in simulation and on a real-world robot.
Tasks Imitation Learning
Published 2019-10-07
URL https://arxiv.org/abs/1910.02646v2
PDF https://arxiv.org/pdf/1910.02646v2.pdf
PWC https://paperswithcode.com/paper/riemannian-motion-policy-fusion-through
Repo
Framework

Comparison of the P300 detection accuracy related to the BCI speller and image recognition scenarios

Title Comparison of the P300 detection accuracy related to the BCI speller and image recognition scenarios
Authors S. A. Karimi, A. M. Mijani, M. T. Talebian, S. Mirzakuchaki
Abstract There are several protocols in electroencephalography (EEG) recording scenarios that produce various types of event-related potentials (ERPs). The P300 pattern is a well-known ERP produced by auditory and visual oddball paradigms and by BCI speller systems. In this study, P300 and non-P300 separability are investigated in two scenarios: an image recognition paradigm and a BCI speller. The image recognition scenario is an experiment that examines participants' knowledge of an image shown to them beforehand, by analyzing the EEG signal recorded while they observe that image as a visual stimulus. To do this, three well-known classifiers (SVM, Bayesian LDA, and sparse logistic regression) were used to classify EEG recordings in a six-class problem. Filtered and down-sampled EEG recordings (temporal samples) were used as features for P300 classification. In addition, different sets of EEG channels (4, 8, and 16) and different numbers of trials were used to cover various situations in the comparison. Accuracy increased with the number of trials and channels. The results show that better accuracy is obtained in the image recognition scenario across the different channel sets and trial counts, so it can be concluded that the P300 pattern produced in the image recognition paradigm is more separable than that of the BCI (matrix) speller.
Tasks EEG
Published 2019-12-24
URL https://arxiv.org/abs/1912.11371v1
PDF https://arxiv.org/pdf/1912.11371v1.pdf
PWC https://paperswithcode.com/paper/comparison-of-the-p300-detection-accuracy
Repo
Framework
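The feature pipeline the abstract describes (filter, downsample, concatenate channels) followed by a shrinkage LDA classifier can be sketched as follows. The data are synthetic (a noisy bump standing in for the P300 deflection), and the moving-average filter and all parameter values are illustrative simplifications of the study's setup.

```python
import numpy as np

def extract_features(epochs, factor=4):
    """Smooth (moving average) and downsample each channel, then flatten.

    epochs: array of shape (n_trials, n_channels, n_samples).
    """
    k = np.ones(factor) / factor
    smoothed = np.apply_along_axis(lambda s: np.convolve(s, k, "same"), -1, epochs)
    return smoothed[..., ::factor].reshape(len(epochs), -1)

class ShrinkageLDA:
    """Two-class Fisher LDA with covariance shrunk toward the identity."""

    def fit(self, X, y, shrink=0.1):
        X0, X1 = X[y == 0], X[y == 1]
        cov = (1 - shrink) * np.cov(X.T) + shrink * np.eye(X.shape[1])
        self.w = np.linalg.solve(cov, X1.mean(0) - X0.mean(0))
        self.b = -0.5 * self.w @ (X0.mean(0) + X1.mean(0))
        return self

    def predict(self, X):
        return (X @ self.w + self.b > 0).astype(int)

# Synthetic epochs: 4 channels, 64 samples; targets carry a P300-like bump.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
epochs = rng.normal(size=(40, 4, 64))
epochs[y == 1, :, 20:40] += 3.0          # evoked deflection on target trials
features = extract_features(epochs)
clf = ShrinkageLDA().fit(features, y)
acc = (clf.predict(features) == y).mean()
```

Shrinkage matters here for the same reason it does in real P300 work: with few trials and many channel-by-time features, the sample covariance is singular and must be regularized before inversion.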

Benefiting from Multitask Learning to Improve Single Image Super-Resolution

Title Benefiting from Multitask Learning to Improve Single Image Super-Resolution
Authors Mohammad Saeed Rad, Behzad Bozorgtabar, Claudiu Musat, Urs-Viktor Marti, Max Basler, Hazim Kemal Ekenel, Jean-Philippe Thiran
Abstract Despite significant progress toward super-resolving more realistic images with deeper convolutional neural networks (CNNs), reconstructing fine and natural textures remains a challenging problem. Recent works on single image super-resolution (SISR) are mostly based on optimizing pixel- and content-wise similarity between the recovered and high-resolution (HR) images and do not benefit from the recognizability of semantic classes. In this paper, we introduce a novel approach that uses categorical information to tackle the SISR problem: we present a decoder architecture able to extract and use semantic information to super-resolve a given image via multitask learning, performed simultaneously for image super-resolution and semantic segmentation. To exploit categorical information during training, the proposed decoder employs a single shared deep network with two task-specific output layers. At run-time, only the layers producing the HR image are used, and no segmentation label is required. Extensive perceptual experiments and a user study on images randomly selected from the COCO-Stuff dataset demonstrate the effectiveness of our proposed method, which outperforms state-of-the-art methods.
Tasks Image Super-Resolution, Semantic Segmentation, Super-Resolution
Published 2019-07-29
URL https://arxiv.org/abs/1907.12488v1
PDF https://arxiv.org/pdf/1907.12488v1.pdf
PWC https://paperswithcode.com/paper/benefiting-from-multitask-learning-to-improve
Repo
Framework