July 30, 2019

2808 words 14 mins read

Paper Group AWR 58


Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow

Title Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow
Authors Francesco Giannini, Vincenzo Laveglia, Alessandro Rossi, Dario Zanca, Andrea Zugarini
Abstract This report provides an introduction to some Machine Learning tools within the most common development environments. It mainly focuses on practical problems, skipping any theoretical introduction. It is oriented to both students trying to approach Machine Learning and experts looking for new frameworks.
Tasks
Published 2017-03-10
URL http://arxiv.org/abs/1703.05298v2
PDF http://arxiv.org/pdf/1703.05298v2.pdf
PWC https://paperswithcode.com/paper/neural-networks-for-beginners-a-fast
Repo https://github.com/AILabUSiena/NeuralNetworksForBeginners
Framework tf
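
The report is a practical quick-start, so here is a comparably minimal feed-forward network in modern TensorFlow/Keras. This is a sketch in the report's spirit, not its code (the original targets older Matlab, Torch, and TensorFlow APIs), and the parity toy task, layer sizes, and epoch count are illustrative choices.

```python
# A minimal "first neural network": a small MLP learning 4-bit parity.
import itertools
import numpy as np
import tensorflow as tf

# Toy data: all 4-bit inputs, labeled by their parity.
X = np.array(list(itertools.product([0, 1], repeat=4)), dtype=np.float32)
y = X.sum(axis=1) % 2

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2000, verbose=0)
print(model.evaluate(X, y, verbose=0))  # accuracy should approach 1.0
```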

Machine Learning by Unitary Tensor Network of Hierarchical Tree Structure

Title Machine Learning by Unitary Tensor Network of Hierarchical Tree Structure
Authors Ding Liu, Shi-Ju Ran, Peter Wittek, Cheng Peng, Raul Blázquez García, Gang Su, Maciej Lewenstein
Abstract The resemblance between the methods used in quantum many-body physics and in machine learning has drawn considerable attention. In particular, tensor networks (TNs) and deep learning architectures bear striking similarities, to the extent that TNs can be used for machine learning. Previous results used one-dimensional TNs in image recognition, showing limited scalability and flexibility. In this work, we train two-dimensional hierarchical TNs to solve image recognition problems, using a training algorithm derived from the multi-scale entanglement renormalization ansatz. This approach introduces mathematical connections among quantum many-body physics, quantum information theory, and machine learning. While keeping the TN unitary in the training phase, TN states are defined, which encode classes of images into quantum many-body states. We study the quantum features of the TN states, including quantum entanglement and fidelity. We find that these quantities could serve as properties that characterize the image classes, as well as the machine learning tasks.
Tasks Tensor Networks
Published 2017-10-13
URL http://arxiv.org/abs/1710.04833v4
PDF http://arxiv.org/pdf/1710.04833v4.pdf
PWC https://paperswithcode.com/paper/machine-learning-by-two-dimensional
Repo https://github.com/RaulBz/Master_Thesis
Framework none
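
To make the data flow concrete, below is a toy contraction of a two-layer binary-tree tensor network acting as a classifier. The feature map, bond dimensions, and random (untrained) tensors are illustrative assumptions; the paper's actual contribution, the MERA-derived training that keeps the tensors unitary, is not reproduced here.

```python
import numpy as np

def feature_map(x):
    # Map each pixel value in [0, 1] to a 2-dim local "quantum" feature vector.
    return np.array([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)])

pixels = np.array([0.1, 0.8, 0.4, 0.6])       # a toy "4-pixel image"
leaves = [feature_map(p) for p in pixels]     # four 2-dim vectors

rng = np.random.default_rng(0)
T1 = rng.standard_normal((2, 2, 4))           # first-layer tensors:
T2 = rng.standard_normal((2, 2, 4))           # (left, right) -> bond dim 4
top = rng.standard_normal((4, 4, 3))          # top tensor -> 3 class scores

# Contract the tree bottom-up: pairs of leaves feed each first-layer tensor,
# whose outputs feed the top tensor.
v1 = np.einsum("i,j,ijk->k", leaves[0], leaves[1], T1)
v2 = np.einsum("i,j,ijk->k", leaves[2], leaves[3], T2)
scores = np.einsum("a,b,abc->c", v1, v2, top)
print(scores)  # unnormalized scores for the 3 classes
```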

Revisiting Unsupervised Learning for Defect Prediction

Title Revisiting Unsupervised Learning for Defect Prediction
Authors Wei Fu, Tim Menzies
Abstract Collecting quality data from software projects can be time-consuming and expensive. Hence, some researchers explore “unsupervised” approaches to quality prediction that do not require labelled data. An alternative technique is to use “supervised” approaches that learn models from project data labelled with, say, “defective” or “not-defective”. Most researchers use these supervised models since, it is argued, they can exploit more knowledge of the projects. At FSE’16, Yang et al. reported startling results in which unsupervised defect predictors outperformed supervised predictors for effort-aware just-in-time defect prediction. If confirmed, these results would lead to a dramatic simplification of a seemingly complex task (data mining) that is widely explored in the software engineering literature. This paper repeats and refutes those results as follows. (1) There is much variability in the efficacy of the Yang et al. predictors, so even with their approach, some supervised data is required to prune weaker predictors away. (2) Their findings were grouped across $N$ projects. When we repeat their analysis on a project-by-project basis, supervised predictors are seen to work better. Even though this paper rejects the specific conclusions of Yang et al., we still endorse their general goal. In our experiments, supervised predictors did not perform outstandingly better than unsupervised ones for effort-aware just-in-time defect prediction. Hence, there may indeed be some combination of unsupervised learners that achieves performance comparable to supervised ones. We therefore encourage others to work in this promising area.
Tasks
Published 2017-03-01
URL http://arxiv.org/abs/1703.00132v2
PDF http://arxiv.org/pdf/1703.00132v2.pdf
PWC https://paperswithcode.com/paper/revisiting-unsupervised-learning-for-defect
Repo https://github.com/WeiFoo/RevisitUnsupervised
Framework none
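
As a sketch of the experimental contrast at issue, the snippet below compares, on a single project, a Yang-et-al.-style unsupervised predictor (rank modules by one metric, here ascending LOC, using no labels) against a supervised logistic-regression ranker under an effort-aware evaluation. The column names, 20% effort cutoff, and train/test split are illustrative assumptions rather than the paper's exact protocol.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def recall_at_effort(order, loc, defective, budget=0.2):
    # Fraction of defects found when inspecting modules in `order`
    # until 20% of their total LOC has been reviewed (effort-aware).
    cum_loc = np.cumsum(loc[order])
    inspected = order[cum_loc <= budget * loc[order].sum()]
    return defective[inspected].sum() / max(defective.sum(), 1)

def compare_on_project(df):
    feats = df.drop(columns=["defective"]).values
    y = df["defective"].values
    loc = df["loc"].values
    idx_tr, idx_te = train_test_split(np.arange(len(df)),
                                      test_size=0.5, random_state=0)

    # Unsupervised: smallest modules first (no labels used at all).
    unsup_order = idx_te[np.argsort(loc[idx_te])]

    # Supervised: rank test modules by predicted defect probability.
    clf = LogisticRegression(max_iter=1000).fit(feats[idx_tr], y[idx_tr])
    sup_order = idx_te[np.argsort(-clf.predict_proba(feats[idx_te])[:, 1])]

    return (recall_at_effort(unsup_order, loc, y),
            recall_at_effort(sup_order, loc, y))

rng = np.random.default_rng(0)
df = pd.DataFrame({"loc": rng.integers(10, 1000, 200),
                   "churn": rng.integers(0, 100, 200)})
df["defective"] = (rng.random(200) < 0.2).astype(int)
print(compare_on_project(df))  # (unsupervised recall, supervised recall)
```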

Learning Chinese Word Representations From Glyphs Of Characters

Title Learning Chinese Word Representations From Glyphs Of Characters
Authors Tzu-Ray Su, Hung-Yi Lee
Abstract In this paper, we propose new methods to learn Chinese word representations. Chinese characters are composed of graphical components, which carry rich semantics. It is common for a Chinese learner to comprehend the meaning of a word from these graphical components. As a result, we propose models that enhance word representations with character glyphs. The character glyph features are learned directly from the bitmaps of characters by a convolutional auto-encoder (convAE), and the glyph features improve Chinese word representations that are already enhanced by character embeddings. Another contribution of this paper is that we created several evaluation datasets in traditional Chinese and made them public.
Tasks
Published 2017-08-16
URL http://arxiv.org/abs/1708.04755v1
PDF http://arxiv.org/pdf/1708.04755v1.pdf
PWC https://paperswithcode.com/paper/learning-chinese-word-representations-from
Repo https://github.com/ray1007/GWE
Framework tf
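
A minimal convolutional auto-encoder over character bitmaps, sketching the glyph-feature extractor described above; the 60x60 bitmap size, filter counts, and 128-dim bottleneck are illustrative assumptions, not the paper's settings.

```python
import tensorflow as tf

encoder = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(60, 60, 1)),   # grayscale glyph bitmap
    tf.keras.layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128),                 # the glyph feature vector
])
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(15 * 15 * 32, activation="relu"),
    tf.keras.layers.Reshape((15, 15, 32)),
    tf.keras.layers.Conv2DTranspose(16, 3, strides=2, padding="same",
                                    activation="relu"),
    tf.keras.layers.Conv2DTranspose(1, 3, strides=2, padding="same",
                                    activation="sigmoid"),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
# After training on rendered glyph bitmaps, `encoder` yields the per-character
# glyph features that are combined with character embeddings downstream.
```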

Query-Based Abstractive Summarization Using Neural Networks

Title Query-Based Abstractive Summarization Using Neural Networks
Authors Johan Hasselqvist, Niklas Helmertz, Mikael Kågebäck
Abstract In this paper, we present a model for generating summaries of text documents with respect to a query. This is known as query-based summarization. We adapt an existing dataset of news article summaries for the task and train a pointer-generator model using this dataset. The generated summaries are evaluated by measuring similarity to reference summaries. Our results show that a neural network summarization model, similar to existing neural network models for abstractive summarization, can be constructed to make use of queries to produce targeted summaries.
Tasks Abstractive Text Summarization
Published 2017-12-17
URL http://arxiv.org/abs/1712.06100v1
PDF http://arxiv.org/pdf/1712.06100v1.pdf
PWC https://paperswithcode.com/paper/query-based-abstractive-summarization-using
Repo https://github.com/helmertz/querysum
Framework tf
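
The copy mechanism at the heart of a pointer-generator decoder can be written out in a few lines: the output distribution blends generation from a fixed vocabulary with copying source tokens via the attention weights. The sketch below shows only that mixture; in the actual model, p_gen, the vocabulary distribution, and the attention are all produced by the (query-conditioned) decoder at each step.

```python
import numpy as np

def output_distribution(p_gen, p_vocab, attention, src_token_ids):
    """p_gen: scalar in (0, 1); p_vocab: (vocab_size,) softmax over vocabulary;
    attention: (src_len,) weights over source positions;
    src_token_ids: (src_len,) vocabulary id of each source token."""
    dist = p_gen * p_vocab
    # Scatter-add copy probability mass onto the ids of the source tokens
    # (np.add.at accumulates correctly over repeated ids).
    np.add.at(dist, src_token_ids, (1.0 - p_gen) * attention)
    return dist

p_vocab = np.full(10, 0.1)            # uniform toy vocabulary distribution
attn = np.array([0.7, 0.2, 0.1])      # attends mostly to source position 0
dist = output_distribution(0.6, p_vocab, attn, np.array([4, 4, 2]))
print(dist.sum())                     # 1.0: still a valid distribution
```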

Active Learning for Visual Question Answering: An Empirical Study

Title Active Learning for Visual Question Answering: An Empirical Study
Authors Xiao Lin, Devi Parikh
Abstract We present an empirical study of active learning for Visual Question Answering, where a deep VQA model selects informative question-image pairs from a pool and queries an oracle for answers to maximally improve its performance under a limited query budget. Drawing analogies from human learning, we explore cramming (entropy), curiosity-driven (expected model change), and goal-driven (expected error reduction) active learning approaches, and propose a fast and effective goal-driven active learning scoring function to pick question-image pairs for deep VQA models under the Bayesian Neural Network framework. We find that deep VQA models need large amounts of training data before they can start asking informative questions. But once they do, all three approaches outperform the random selection baseline and achieve significant query savings. For the scenario where the model is allowed to ask generic questions about images but is evaluated only on specific questions (e.g., questions whose answer is either yes or no), our proposed goal-driven scoring function performs the best.
Tasks Active Learning, Visual Question Answering
Published 2017-11-06
URL http://arxiv.org/abs/1711.01732v1
PDF http://arxiv.org/pdf/1711.01732v1.pdf
PWC https://paperswithcode.com/paper/active-learning-for-visual-question-answering
Repo https://github.com/frkl/active-learning
Framework torch
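
The simplest of the three acquisition scores discussed above, the "cramming" (entropy) criterion, reduces to a few lines: score each pool item by the entropy of the model's predicted answer distribution and query the most uncertain ones. The sketch below uses random stand-in predictions; the paper's goal-driven score additionally involves expected error reduction under a Bayesian treatment of the VQA model.

```python
import numpy as np

def entropy_scores(answer_probs):
    # answer_probs: (pool_size, num_answers) softmax outputs of the VQA model.
    p = np.clip(answer_probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_batch(answer_probs, k):
    # Query the oracle for the k most uncertain question-image pairs.
    return np.argsort(-entropy_scores(answer_probs))[:k]

pool = np.random.dirichlet(np.ones(1000), size=5000)  # toy predictions
print(select_batch(pool, k=32))
```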

Learning to Bid Without Knowing your Value

Title Learning to Bid Without Knowing your Value
Authors Zhe Feng, Chara Podimata, Vasilis Syrgkanis
Abstract We address online learning in complex auction settings, such as sponsored search auctions, where the value of the bidder is unknown to her, evolves in an arbitrary manner, and is observed only if the bidder wins an allocation. We leverage the structure of the bidder’s utility and the partial feedback that bidders typically receive in auctions in order to provide algorithms whose regret against the best fixed bid in hindsight converges exponentially faster, in terms of the dependence on the action space, than what would be derived by applying a generic bandit algorithm, and is almost equivalent to what would be achieved in the full-information setting. Our results are enabled by analyzing a new online learning setting with outcome-based feedback, which generalizes learning with feedback graphs. We provide an online learning algorithm for this setting, of independent interest, with regret that grows only logarithmically with the number of actions and linearly only in the number of potential outcomes (the latter being very small in most auction settings). Last but not least, we show experimentally that our algorithm outperforms the bandit approach and that this performance is robust to dropping some of our theoretical assumptions or introducing noise in the feedback that the bidder receives.
Tasks
Published 2017-11-03
URL http://arxiv.org/abs/1711.01333v5
PDF http://arxiv.org/pdf/1711.01333v5.pdf
PWC https://paperswithcode.com/paper/learning-to-bid-without-knowing-your-value
Repo https://github.com/zfengharvard/bandit-sponsored-search
Framework none
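
To illustrate the outcome-based feedback structure in the simplest case (a second-price auction), the sketch below shows how a single win reveals the counterfactual utility of every bid on a grid: all bids above the observed price would also have won at that price, and all bids at or below it would have lost. The epsilon-greedy learner wrapped around it is a stand-in, not the paper's algorithm, and the value and price distributions are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 21)              # candidate bids
totals = np.zeros_like(grid)              # cumulative estimated utility per bid

for t in range(10_000):
    b = rng.choice(grid) if rng.random() < 0.1 else grid[np.argmax(totals)]
    d = rng.uniform(0, 1)                 # highest competing bid (unobserved)
    if b > d:                             # we win: observe value and price
        v = 0.6 + 0.1 * rng.standard_normal()  # value revealed only on a win
        # One observation updates every grid point: bids above d would also
        # have won and paid d; bids at or below d would have lost (utility 0).
        totals += np.where(grid > d, v - d, 0.0)

print(grid[np.argmax(totals)])            # concentrates near profitable bids
```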

Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German

Title Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Authors Pierre-Edouard Honnet, Andrei Popescu-Belis, Claudiu Musat, Michael Baeriswyl
Abstract The goal of this work is to design a machine translation (MT) system for a low-resource family of dialects, collectively known as Swiss German, which are widely spoken in Switzerland but seldom written. We collected a significant number of parallel written resources to start with, up to a total of about 60k words. Moreover, we identified several other promising data sources for Swiss German. Then, we designed and compared three strategies for normalizing Swiss German input in order to address the regional diversity. We found that character-based neural MT was the best solution for text normalization. In combination with phrase-based statistical MT, our solution reached a 36% BLEU score when translating from the Bernese dialect. This value, however, decreases as the testing data becomes more remote from the training data, geographically and topically. These resources and normalization techniques are a first step towards full MT of Swiss German dialects.
Tasks Machine Translation
Published 2017-10-30
URL http://arxiv.org/abs/1710.11035v2
PDF http://arxiv.org/pdf/1710.11035v2.pdf
PWC https://paperswithcode.com/paper/machine-translation-of-low-resource-spoken
Repo https://github.com/Kyubyong/quasi-rnn
Framework tf

Fast Genetic Algorithms

Title Fast Genetic Algorithms
Authors Benjamin Doerr, Huu Phuoc Le, Régis Makhmara, Ta Duy Nguyen
Abstract For genetic algorithms using a bit-string representation of length $n$, the general recommendation is to take $1/n$ as mutation rate. In this work, we discuss whether this is really justified for multimodal functions. Taking jump functions and the $(1+1)$ evolutionary algorithm as the simplest example, we observe that larger mutation rates give significantly better runtimes. For the $\mathrm{Jump}_{m,n}$ function, any mutation rate between $2/n$ and $m/n$ leads to a speed-up at least exponential in $m$ compared to the standard choice. The asymptotically best runtime, obtained from using the mutation rate $m/n$ and leading to a speed-up super-exponential in $m$, is very sensitive to small changes of the mutation rate. Any deviation by a small $(1 \pm \epsilon)$ factor leads to a slow-down exponential in $m$. Consequently, any fixed mutation rate gives strongly sub-optimal results for most jump functions. Building on this observation, we propose to use a random mutation rate $\alpha/n$, where $\alpha$ is chosen from a power-law distribution. We prove that the $(1+1)$ EA with this heavy-tailed mutation rate optimizes any $\mathrm{Jump}_{m,n}$ function in a time that is only a small polynomial (in $m$) factor above the one stemming from the optimal rate for this $m$. Our heavy-tailed mutation operator yields similar speed-ups (over the best known performance guarantees) for the vertex cover problem in bipartite graphs and the matching problem in general graphs. Following the example of fast simulated annealing, fast evolution strategies, and fast evolutionary programming, we propose to call genetic algorithms using a heavy-tailed mutation operator “fast genetic algorithms”.
Tasks
Published 2017-03-09
URL http://arxiv.org/abs/1703.03334v2
PDF http://arxiv.org/pdf/1703.03334v2.pdf
PWC https://paperswithcode.com/paper/fast-genetic-algorithms
Repo https://github.com/rafalpronko/tsp-kaggle
Framework none
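
A runnable sketch of the proposed heavy-tailed (1+1) EA follows: each generation draws $\alpha$ from a power-law distribution and flips each bit independently with probability $\alpha/n$. The power-law exponent $\beta = 1.5$ and the instance size $n = 30$, $m = 3$ are illustrative choices within the regime the paper analyzes.

```python
import numpy as np

rng = np.random.default_rng(1)

def jump(x, m):
    # Jump_{m,n}: all-ones is optimal, but a fitness valley of width m
    # must be crossed by flipping roughly m bits at once.
    n, ones = len(x), x.sum()
    return m + ones if (ones <= n - m or ones == n) else n - ones

def power_law_alpha(n, beta=1.5):
    ks = np.arange(1, n // 2 + 1)
    probs = ks ** (-beta)
    return rng.choice(ks, p=probs / probs.sum())

n, m = 30, 3
x = rng.integers(0, 2, n)
for step in range(1, 2_000_000):
    alpha = power_law_alpha(n)
    mask = rng.random(n) < alpha / n      # flip each bit w.p. alpha/n
    y = np.where(mask, 1 - x, x)
    if jump(y, m) >= jump(x, m):          # elitist (1+1) selection
        x = y
    if x.sum() == n:                      # global optimum reached
        print("optimum found at step", step)
        break
```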

Fruit recognition from images using deep learning

Title Fruit recognition from images using deep learning
Authors Horea Mureşan, Mihai Oltean
Abstract In this paper we introduce a new, high-quality dataset of images containing fruits. We also present the results of some numerical experiments on training a neural network to detect fruits. We discuss why we chose to use fruits in this project by proposing a few applications that could use this kind of neural network.
Tasks
Published 2017-12-02
URL http://arxiv.org/abs/1712.00580v9
PDF http://arxiv.org/pdf/1712.00580v9.pdf
PWC https://paperswithcode.com/paper/fruit-recognition-from-images-using-deep
Repo https://github.com/applecrazy/FruitClassifier
Framework tf
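
A small CNN of the kind one might train on such a dataset is sketched below. The 100x100 RGB input matches the dataset's image size; the layer sizes and the class count are illustrative assumptions (the number of classes depends on the dataset version).

```python
import tensorflow as tf

num_classes = 60  # hypothetical; depends on the dataset version used

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100, 100, 3)),
    tf.keras.layers.Conv2D(16, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```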

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

Title Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
Authors Debidatta Dwibedi, Ishan Misra, Martial Hebert
Abstract A major impediment to rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance datasets with minimal effort. Our key insight is that ensuring only patch-level realism provides enough training signal for current object detector models. We automatically “cut” object instances and “paste” them on random backgrounds. A naive way of doing this produces pixel artifacts which lead to poor performance for trained models. We show how to make detectors ignore these artifacts during training and generate data that gives competitive performance on real data. Our method outperforms existing synthesis approaches and, when combined with real images, improves relative performance by more than 21% on benchmark datasets. In a cross-domain setting, our synthetic data combined with just 10% real data outperforms models trained on all real data.
Tasks Object Detection
Published 2017-08-04
URL http://arxiv.org/abs/1708.01642v1
PDF http://arxiv.org/pdf/1708.01642v1.pdf
PWC https://paperswithcode.com/paper/cut-paste-and-learn-surprisingly-easy
Repo https://github.com/debidatta/syndata-generation
Framework none
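
A bare-bones version of the cut-and-paste step is sketched below: composite a masked object crop onto a random background location and record its bounding box as a free annotation. The paper's key addition, not shown here, is blending (e.g., Gaussian or Poisson blending) at the paste boundary so that detectors do not latch onto pixel artifacts.

```python
import numpy as np

def paste(background, obj, mask, rng):
    """background: (H, W, 3); obj/mask: (h, w, 3)/(h, w) cut-out instance."""
    H, W, _ = background.shape
    h, w, _ = obj.shape
    y = rng.integers(0, H - h + 1)
    x = rng.integers(0, W - w + 1)
    out = background.copy()
    region = out[y:y + h, x:x + w]
    m = mask[..., None].astype(np.float32)
    out[y:y + h, x:x + w] = (m * obj + (1 - m) * region).astype(out.dtype)
    return out, (x, y, x + w, y + h)      # image + bounding-box annotation

rng = np.random.default_rng(0)
bg = rng.integers(0, 255, (480, 640, 3), dtype=np.uint8)
obj = rng.integers(0, 255, (64, 64, 3), dtype=np.uint8)
mask = np.ones((64, 64), dtype=np.uint8)  # binary instance mask
img, box = paste(bg, obj, mask, rng)
print(box)
```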

Recurrent Additive Networks

Title Recurrent Additive Networks
Authors Kenton Lee, Omer Levy, Luke Zettlemoyer
Abstract We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
Tasks Language Modelling
Published 2017-05-21
URL http://arxiv.org/abs/1705.07393v2
PDF http://arxiv.org/pdf/1705.07393v2.pdf
PWC https://paperswithcode.com/paper/recurrent-additive-networks
Repo https://github.com/kentonl/ran
Framework tf
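
The update rule in the abstract is short enough to write out directly. The sketch below implements the RAN recurrence with an identity output function (the paper also considers a tanh output): a linear content layer, two sigmoid gates, and a purely additive state update with no non-linearity inside the recurrence. Dimensions and initialization are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RANCell:
    def __init__(self, d_in, d_hid, rng):
        s = 0.1
        self.W_cx = s * rng.standard_normal((d_hid, d_in))   # content layer
        self.W_ix = s * rng.standard_normal((d_hid, d_in))
        self.W_fx = s * rng.standard_normal((d_hid, d_in))
        self.W_ih = s * rng.standard_normal((d_hid, d_hid))
        self.W_fh = s * rng.standard_normal((d_hid, d_hid))

    def step(self, x, h_prev):
        c = self.W_cx @ x                                    # no non-linearity
        i = sigmoid(self.W_ix @ x + self.W_ih @ h_prev)      # input gate
        f = sigmoid(self.W_fx @ x + self.W_fh @ h_prev)      # forget gate
        return i * c + f * h_prev                            # additive update

rng = np.random.default_rng(0)
cell = RANCell(8, 16, rng)
h = np.zeros(16)
for x in rng.standard_normal((5, 8)):
    h = cell.step(x, h)   # h is a weighted sum of past content vectors
```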

Quasi-Oracle Estimation of Heterogeneous Treatment Effects

Title Quasi-Oracle Estimation of Heterogeneous Treatment Effects
Authors Xinkun Nie, Stefan Wager
Abstract Flexible estimation of heterogeneous treatment effects lies at the heart of many statistical challenges, such as personalized medicine and optimal resource allocation. In this paper, we develop a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies. We first estimate marginal effects and treatment propensities in order to form an objective function that isolates the causal component of the signal. Then, we optimize this data-adaptive objective function. Our approach has several advantages over existing methods. From a practical perspective, our method is flexible and easy to use: In both steps, we can use any loss-minimization method, e.g., penalized regression, deep neural networks, or boosting; moreover, these methods can be fine-tuned by cross-validation. Meanwhile, in the case of penalized kernel regression, we show that our method has a quasi-oracle property: Even if the pilot estimates for marginal effects and treatment propensities are not particularly accurate, we achieve the same error bounds as an oracle who has a priori knowledge of these two nuisance components. We implement variants of our approach based on both penalized regression and boosting in a variety of simulation setups, and find promising performance relative to existing baselines.
Tasks
Published 2017-12-13
URL http://arxiv.org/abs/1712.04912v3
PDF http://arxiv.org/pdf/1712.04912v3.pdf
PWC https://paperswithcode.com/paper/quasi-oracle-estimation-of-heterogeneous
Repo https://github.com/xnie/rlearner
Framework none
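
A compact sketch of the two-step procedure (often referred to as the R-learner) follows: cross-fitted nuisance estimates for the outcome and the propensity, then a weighted regression on the induced pseudo-outcomes, which is algebraically equivalent to minimizing the paper's objective. The gradient-boosting estimators and the simulation are illustrative stand-ins.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 4000
X = rng.standard_normal((n, 3))
e = 1 / (1 + np.exp(-X[:, 0]))                # true propensity
W = rng.random(n) < e                         # treatment assignment
tau = 1 + X[:, 1]                             # true heterogeneous effect
Y = X[:, 2] + W * tau + rng.standard_normal(n)

# Step 1: cross-fitted nuisances m(x) = E[Y | X] and e(x) = P(W = 1 | X).
m_hat = cross_val_predict(GradientBoostingRegressor(), X, Y, cv=5)
e_hat = cross_val_predict(GradientBoostingClassifier(), X, W, cv=5,
                          method="predict_proba")[:, 1]
e_hat = np.clip(e_hat, 0.01, 0.99)

# Step 2: minimizing sum_i ((Y - m) - (W - e) * tau(X))^2 is equivalent to
# regressing the pseudo-outcome (Y - m)/(W - e) with weights (W - e)^2.
resid_w = W - e_hat
pseudo = (Y - m_hat) / resid_w
tau_model = GradientBoostingRegressor().fit(X, pseudo, sample_weight=resid_w ** 2)
print(np.corrcoef(tau_model.predict(X), tau)[0, 1])  # should be high
```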

Question Dependent Recurrent Entity Network for Question Answering

Title Question Dependent Recurrent Entity Network for Question Answering
Authors Andrea Madotto, Giuseppe Attardi
Abstract Question answering is a task which requires building models capable of providing answers to questions expressed in human language. Full question answering involves some form of reasoning ability. We introduce a neural network architecture for this task, a form of Memory Network, that recognizes entities and their relations to answers through a focus attention mechanism. Our model, named Question Dependent Recurrent Entity Network, extends the Recurrent Entity Network by exploiting aspects of the question during the memorization process. We validate the model on both synthetic and real datasets: the bAbI question answering dataset and the CNN & Daily News reading comprehension dataset. In our experiments, the model achieved state-of-the-art results on the former and competitive results on the latter.
Tasks Question Answering, Reading Comprehension
Published 2017-07-25
URL http://arxiv.org/abs/1707.07922v2
PDF http://arxiv.org/pdf/1707.07922v2.pdf
PWC https://paperswithcode.com/paper/question-dependent-recurrent-entity-network
Repo https://github.com/andreamad8/QDREN
Framework tf
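
The question-dependent gate that distinguishes this model from the original Recurrent Entity Network can be sketched in a few lines: each memory slot's gate scores the current sentence encoding against the slot's content, its key, and the question encoding. The numpy sketch below is a best-effort reading of that mechanism; the dimensions, the plain ReLU, and the initialization are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

d, n_slots = 16, 5
rng = np.random.default_rng(0)
U, V, Wm = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
keys = rng.standard_normal((n_slots, d))      # w_j: one key per entity slot
mem = rng.standard_normal((n_slots, d))       # h_j: slot contents

def update(s_t, q, keys, mem):
    """s_t: sentence encoding; q: question encoding, both (d,)."""
    for j in range(len(mem)):
        # Question-dependent gate: content match + key match + question match.
        g = sigmoid(s_t @ mem[j] + s_t @ keys[j] + s_t @ q)
        cand = relu(U @ mem[j] + V @ keys[j] + Wm @ s_t)  # candidate memory
        mem[j] = mem[j] + g * cand
        mem[j] /= np.linalg.norm(mem[j])      # keep slots on the unit sphere
    return mem

mem = update(rng.standard_normal(d), rng.standard_normal(d), keys, mem)
```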

Indirect Image Registration with Large Diffeomorphic Deformations

Title Indirect Image Registration with Large Diffeomorphic Deformations
Authors Chong Chen, Ozan Öktem
Abstract The paper adapts the large deformation diffeomorphic metric mapping framework for image registration to the indirect setting, where a template is registered against a target that is given only through indirect noisy observations. The registration uses diffeomorphisms that transform the template through a (group) action. These diffeomorphisms are generated by solving a flow equation that is defined by a velocity field with certain regularity. The theoretical analysis includes a proof that indirect image registration has solutions (existence) that are stable and converge as the data error tends to zero, so it becomes a well-defined regularization method. The paper concludes with examples of indirect image registration in 2D tomography with very sparse and/or highly noisy data.
Tasks Image Registration
Published 2017-06-13
URL http://arxiv.org/abs/1706.04048v3
PDF http://arxiv.org/pdf/1706.04048v3.pdf
PWC https://paperswithcode.com/paper/indirect-image-registration-with-large
Repo https://github.com/bgris/odl
Framework none